
  • Originally posted by bossanova352 View Post
    How silly of me! Well I changed the formatting, but unfortunately I'm still not getting any output from Ray. This is what the file looks like now (all sequences on one line):

    Again, it looks like this step has some clues as to what is going on:

    Fixed! It was another formatting issue, (^M characters were showing up after the one-line formatting). Thanks, Seb! I appreciate the help.
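For anyone hitting the same two formatting problems (wrapped sequences and ^M carriage returns from Windows line endings), here is a minimal Python sketch of a cleanup step. The function name `normalize_fasta` is made up for illustration and is not part of Ray; the sketch assumes plain FASTA input small enough to hold in memory.

```python
def normalize_fasta(text):
    """Return FASTA text with LF line endings and each sequence on one line."""
    records = []
    header = None
    chunks = []
    # str.splitlines() splits on \n, \r\n and bare \r, so ^M characters vanish here.
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line
            chunks = []  # any sequence text before the first header is discarded
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return "".join(f"{h}\n{s}\n" for h, s in records)


# Example: Windows line endings plus a sequence wrapped over two lines.
print(normalize_fasta(">read1\r\nACGT\r\nTTGA\r\n>read2\r\nGGCC\r\n"), end="")
```

Writing the result back with `open(path, "w", newline="\n")` keeps Python from reintroducing platform-specific line endings.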
    Short seeds mean that they are not connecting with one another.

    Can you provide a couple of lines from CoverageDistribution.txt (head) ?


    • Originally posted by seb567 View Post
      Short seeds mean that they are not connecting with one another.

      Can you provide a couple of lines from CoverageDistribution.txt (head) ?
      Sure! It does seem to be working now, I'm getting contigs and scaffolds in my output files.

      # KmerCoverage Frequency
      # Any frequency is a even number because of odd k-mer length
      2 158870850
      3 43942818
      4 18999600
      5 10198722
      6 6257290
      7 4165874
      8 2937460
      9 2155282
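
A minimal Python sketch for reading this kind of two-column output, assuming the layout shown in the excerpt above (comment lines starting with `#`, then coverage and frequency separated by whitespace). The helper names are hypothetical. The second function looks for the first point where the frequency starts rising again, which is commonly where the error-dominated low-coverage tail ends; in the eight lines posted here the counts only decrease, so it finds nothing.

```python
def parse_coverage_distribution(text):
    """Parse 'coverage frequency' rows, skipping blank and '#' comment lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        coverage, frequency = line.split()
        rows.append((int(coverage), int(frequency)))
    return rows


def first_local_minimum(rows):
    """Return the coverage just before the frequency first increases, or None."""
    for (cov_prev, freq_prev), (_, freq_next) in zip(rows, rows[1:]):
        if freq_next > freq_prev:
            return cov_prev
    return None


excerpt = """# KmerCoverage Frequency
2 158870850
3 43942818
4 18999600
5 10198722
6 6257290
7 4165874
8 2937460
9 2155282
"""
rows = parse_coverage_distribution(excerpt)
print(len(rows), "rows; first local minimum:", first_local_minimum(rows))
```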


      • Ray 2.3.1


        Ray 2.3.1 is now available on

        Significant changes:

        * This version includes "Surveyor" to compute similarity (or distance) matrices
        for hundreds or possibly thousands of samples.
        * fix compilation error on Apple OS X Mavericks
        * fix infinite loop when running on 2 CPU cores
        * fix a bug when the number of ranks is a prime number

        All changes in Ray:

        Rob Egan (1):
        fix compilation on NERSC's edison machine using PrgEnv-intel

        Sébastien Boisvert (30):
        SequencesLoader: fix bad automatic pairing of sequence files
        SequencesLoader: fix compilation warnings
        Surveyor: verify buffer size before getting producer
        Surveyor: add a variable to store the period
        Surveyor: run in actor-model-only mode
        spawn actors with spawn instead of spawnActor
        Documentation: add some documentation for Surveyor
        SeedExtender: add some assertions
        Searcher: disable verbose outputs
        Surveyor: skip invalid files
        coloring: added comments for coloring subsystem
        update release procedure
        next release will be 2.3.1
        fix infinite loop when running on 2 CPU cores
        fix a bug when the number of ranks is a prime number
        print number of payloads
        add some code to test directed surveys with Surveyor
        fix reproducibility issue for similarity and distance matrices
        Surveyor: support nucleotides in lower case
        report invalid edges as warnings instead of errors
        documentation: add license in README
        Surveyor: report 0 hits when necessary
        SeedingData: provide prototypes for friend functions
        Surveyor: fix compilation issue without debug code
        seeds: add a parameter -minimum-seed-length (default 100)
        add option -graph-only to stop after graph building
        fix compilation error on Apple OS X Mavericks
        use CONFIG_ASSERT instead of ASSERT for optional code
        version 2.3.1
        update releases

        Changes in RayPlatform:

        Rob Egan (1):
        fix compilation on NERSC's edison machine using PrgEnv-intel

        Sébastien Boisvert (15):
        communication: relay buffer bytes instead of buffer 64-bit integers
        core: add a actor-model-only mode
        actors: add playground status with -debug
        core: add buffer statistics with -debug
        actor model: change the method name from spawnActor to spawn
        fix the code for testing message integrity
        fix a regression introduced in a01f97eae41bcd759bfc521d84053552cf38d521
        files: add method to check if a file is valid
        add mini-rank information in the message metadata
        fix mini-rank runtime engine
        print registered message tags in debug mode
        documentation: add LGPLv3 info in README
        communication: some routes don't require routing
        use CONFIG_ASSERT instead of ASSERT for optional code
        fix compilation warning


        • Hello,

          When I look through the outputs generated from the AMOS file after assembly, many of the contigs are assigned 0 reads (I used bank2contig with default settings after seeing that many contigs were missing from the generated SAM file). Obviously, this does not make much sense, so I was wondering whether anyone else has come across this. I was trying to avoid mapping by using the AMOS file, and now I just want to confirm that the contigs I am getting are 'real'.

          At first I thought this might be due to read recycling, but reads still show up under multiple contigs. Does anyone have other ideas about what is causing this issue, or how to correct it during assembly?



          • I am trying to assemble 275 paired-end Illumina reads that I have interleaved together. Previously I successfully ran the interleaved files with a k-mer value of 137. I compiled the latest Ray version with a maximum k-mer size of 600 (technically 599).

            That code was:

            mpiexec -n 30 Ray -k 137 -i interleaved.fastq -o Ray_K137

            Now, if I try a smaller k-mer value, I run into a strange CHUNK_SIZE error.

            I have tried:

            mpiexec -n 10 Ray -k 51 -i interleaved.fastq -o Ray_K51_try3
            mpiexec -n 30 Ray -k 51 -i interleaved.fastq -o Ray_K51_try3
            All of these caused the same CHUNK_SIZE error. I even tried it without mpiexec, and I still got the error below.

            Rank 0 : VirtualCommunicator (service provided by VirtualCommunicator): 2957916 virtual messages generated 115295 real messages (3.89785%)
            Rank 0 freed 549453824 bytes from the path memory pool (chunks: 131)
            Rank 0: gossiping generated 0 messages (gossips: 0 ---> 0)
            Critical exception: The length of the requested memory exceeds the CHUNK_SIZE: 36423920 > 33554432
            Ray: RayPlatform/memory/MyAllocator.cpp:97: void* MyAllocator::allocate(int): Assertion `false' failed.
            [BioLinux301:05209] *** Process received signal ***
            [BioLinux301:05209] Signal: Aborted (6)
            [BioLinux301:05209] Signal code:  (-6)
            [BioLinux301:05209] [ 0] /lib/x86_64-linux-gnu/ [0x7f4a34d27340]
            [BioLinux301:05209] [ 1] /lib/x86_64-linux-gnu/ [0x7f4a34987bb9]
            [BioLinux301:05209] [ 2] /lib/x86_64-linux-gnu/ [0x7f4a3498afc8]
            [BioLinux301:05209] [ 3] /lib/x86_64-linux-gnu/ [0x7f4a34980a76]
            [BioLinux301:05209] [ 4] /lib/x86_64-linux-gnu/ [0x7f4a34980b22]
            [BioLinux301:05209] [ 5] Ray() [0x533b50]
            [BioLinux301:05209] [ 6] Ray() [0x4f7552]
            [BioLinux301:05209] [ 7] Ray() [0x551768]
            [BioLinux301:05209] [ 8] Ray() [0x5550ab]
            [BioLinux301:05209] [ 9] Ray() [0x5562ea]
            [BioLinux301:05209] [10] Ray() [0x413379]
            [BioLinux301:05209] [11] Ray() [0x40c5bf]
            [BioLinux301:05209] [12] /lib/x86_64-linux-gnu/ [0x7f4a34972ec5]
            [BioLinux301:05209] [13] Ray() [0x40e0cf]
            [BioLinux301:05209] *** End of error message ***
            zsh: abort      Ray -k 51 -i interleaved.fastq -o Ray_K51_try3
            I am running this on a BioLinux 8 workstation with 32 threads. The specs are: Intel Xeon E5-2640v2 at 2 GHz with 128 GB of RAM.

            I would really appreciate advice on how to proceed.
            Last edited by Zapages; 05-08-2015, 04:22 AM.
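
To put the two numbers in the exception message in perspective, here is a small Python sketch that only decodes them. Both values are copied from the log above; nothing here says why Ray requested a block of that size or how to avoid it.

```python
# Values copied from the "Critical exception" line in the log above.
CHUNK_SIZE = 33554432   # allocator limit reported by Ray
REQUESTED = 36423920    # size of the allocation that was refused

MIB = 1024 * 1024

# The limit is exactly 2**25 bytes, i.e. 32 MiB.
assert CHUNK_SIZE == 2 ** 25

excess = REQUESTED - CHUNK_SIZE
print(f"limit:     {CHUNK_SIZE / MIB:.2f} MiB")
print(f"requested: {REQUESTED / MIB:.2f} MiB")
print(f"over by:   {excess} bytes ({100 * excess / CHUNK_SIZE:.1f}% above the limit)")
```

So a single request was a few percent larger than the 32 MiB chunk, which matches the failed assertion in MyAllocator::allocate.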


            • Maybe this question was already asked somewhere, but I can not find it:

              Is there a way to set the maximum insert size for paired end assembly with Ray? If not, what is the maximum insert size considered?

              I have an assembly that uses both normal insert size Illumina reads (~250 bp) and some with a longer insert size (~500 bp). Adding this second library does not improve the results, which I find suspicious. Any ideas?


              • Contigs, or scaffolds?
                Have you tried giving the assembler a possible distance for the reads?


                • Hi Folks,
                  I have a serious problem with Ray and Open MPI.
                  I am using a cluster with 4 nodes, each with 8 cores. Surprisingly, when I run Ray on a single node with mpirun -np 8, it takes less time than when I use two nodes, and so on. For example, one node takes 5 min, two nodes (mpirun -np 16) take 8 min, and three nodes (mpirun -np 24) take 12 min. Can anybody please help me find the problem?
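
The timings reported above (5, 8 and 12 minutes) can be turned into speedup and parallel efficiency figures with a few lines of Python. This is only a sketch of the arithmetic: it assumes the three runs used 8, 16 and 24 processes, and `scaling_table` is a made-up helper name.

```python
def scaling_table(timings, baseline):
    """Return (processes, speedup, efficiency) rows relative to the baseline run.

    timings maps process count -> wall time (any unit, as long as it is consistent).
    """
    base_time = timings[baseline]
    table = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]          # > 1 means faster than baseline
        efficiency = speedup / (procs / baseline)     # 1.0 would be ideal scaling
        table.append((procs, speedup, efficiency))
    return table


# Wall times in minutes, as reported in the post (process counts are assumed).
for procs, speedup, efficiency in scaling_table({8: 5, 16: 8, 24: 12}, baseline=8):
    print(f"{procs:>2} processes: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

A speedup below 1 for every multi-node run means the extra processes cost more than they contribute. That usually points at communication between nodes, though the numbers alone cannot confirm it.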


                  • Can you explain Ray Surveyor in a bit more detail? I'm having a hard time understanding the documentation but I think this could be of use to me.
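
As a rough illustration of what a similarity matrix between samples can look like, here is a toy Python sketch that counts the distinct k-mers shared by each pair of samples. This is an assumption-laden illustration, not Ray Surveyor's implementation or output format: the function names are invented, everything is held in memory, and reverse complements are ignored.

```python
def kmers(sequence, k):
    """Return the set of k-mers in a sequence (upper-cased so case does not matter)."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def similarity_matrix(samples, k):
    """Count distinct k-mers shared by every pair of samples.

    samples maps a sample name to a list of sequences.
    Returns (names, matrix) where matrix[i][j] is the number of shared k-mers.
    """
    names = sorted(samples)
    kmer_sets = {
        name: set().union(*(kmers(seq, k) for seq in samples[name]))
        for name in names
    }
    matrix = [[len(kmer_sets[a] & kmer_sets[b]) for b in names] for a in names]
    return names, matrix


names, matrix = similarity_matrix({"A": ["ACGT"], "B": ["cgta"]}, k=3)
print(names)
for row in matrix:
    print(row)
```

The diagonal holds the number of distinct k-mers in each sample, and the off-diagonal entries show how much two samples overlap.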

