
  • Large genome assembly using paired-end and mate-pair reads

    Which assembler should I use to assemble a genome of 1-1.5 Gbp? I have Illumina paired-end and mate-pair reads, each 101 bp long. How can I use the mate-pair reads for scaffolding?

  • #2
    I'd try Allpaths-LG (http://www.broadinstitute.org/softwa...paths-lg/blog/) if your paired end reads mostly overlap (i.e. fragment size of ~180 bp). It will use your mate pairs for scaffolding.


    • #3
      Thank you sarvidsson,

      I have access to a machine with 96 GB of memory and 24 cores. Is that sufficient to run Allpaths-LG?


      • #4
        96 GB can be a bit tight, 24 cores should be fine - I'd expect the assembly to run for up to 3 days. If you error correct and normalize the paired end reads prior to assembly (with e.g. BBNorm http://seqanswers.com/forums/showthread.php?t=49763) you typically reduce memory usage for the assembly.
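
To illustrate why normalization can reduce memory usage, here is a toy sketch of k-mer-based read normalization in Python. It is not BBNorm's actual algorithm or interface; the function names, the tiny `k`, and the `target` depth are all assumptions chosen for illustration. The idea is that reads from regions already covered to the target depth add little new information, so they can be dropped before assembly.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_reads(reads, k=5, target=3):
    """Toy digital normalization (hypothetical helper, not BBNorm).

    A read is kept only if the median count of its k-mers, among the
    reads kept so far, is still below `target`.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k, nothing to count
        if median(counts[km] for km in read_kmers) < target:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Ten copies of a high-coverage read plus one read from a rare region.
reads = ["ACGTTGCAAG"] * 10 + ["TTTTGGGGCC"]
kept = normalize_reads(reads, k=5, target=3)
print(len(reads), "reads in,", len(kept), "reads kept")
```

With real data you would use a dedicated tool such as BBNorm, with a much larger k and a memory-efficient counting structure instead of an exact in-memory counter.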


        • #5
          We have 100-200 Mbp fungal assemblies that run out of memory (with AllPaths-LG) on 128 GB nodes, but complete on 256 GB nodes. I'm guessing memory may be a serious problem; you are probably going to need more.

          Megahit is fast and seems to have a relatively low memory consumption, and Minia was designed for low memory consumption, so if AllPaths fails you might try those. Or, buy more memory, which will be essential if you plan to routinely assemble large genomes.


          • #6
            With that amount of memory I'd recommend SGA. Minia is great too, but it has no scaffolding option.


            • #7
              Allpaths-LG is a good option if you have enough RAM and CPUs. I also wonder whether one of your PE libraries is overlapping. From the Allpaths-LG docs: "average separation size must be slightly less than twice the read size, such that the reads from a pair will likely overlap".
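
As a rough check of the overlap requirement quoted above, the expected overlap of a read pair can be estimated from the read length and the fragment size. This is only a back-of-the-envelope sketch in Python: the function names and the `min_overlap` threshold are assumptions, not part of Allpaths-LG, and real libraries have a distribution of fragment sizes rather than a single value.

```python
def expected_overlap(read_length, fragment_size):
    """Estimated bases shared by the two reads of a pair.

    For fragments at least one read long this is 2 * read_length - fragment_size;
    a negative value means there is a gap between the reads. For fragments
    shorter than one read, the overlap cannot exceed the fragment itself.
    """
    return min(2 * read_length - fragment_size, fragment_size)


def pairs_likely_overlap(read_length, fragment_size, min_overlap=10):
    """True if the estimated overlap reaches `min_overlap` (an assumed threshold)."""
    return expected_overlap(read_length, fragment_size) >= min_overlap


# 101 bp reads with the ~180 bp fragments suggested earlier in the thread.
print(expected_overlap(101, 180))      # 2 * 101 - 180 = 22
print(pairs_likely_overlap(101, 180))  # True
print(pairs_likely_overlap(101, 300))  # False: the reads leave a gap
```

If your library's mean fragment size gives a negative or near-zero value here, the reads will mostly not overlap and the library does not meet the requirement quoted above.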


              • #8
                Thank you all.

                I think I have to go with SGA or Minia due to lack of memory. Is it a good option to assemble with the paired-end reads first and then scaffold with the mate-pair data?
                Which tool would be suitable for scaffolding with 101 bp mate-pair data?


                • #9
                  Originally posted by Pinal
                  Thank you all.

                  I think I have to go with SGA or Minia due to lack of memory. Is it a good option to assemble with the paired-end reads first and then scaffold with the mate-pair data?
                  Which tool would be suitable for scaffolding with 101 bp mate-pair data?

                  You can start with SSPACE.
