  • jstjohn
    Member
    • Jun 2010
    • 35

    merging scaffolds from several SOAPdenovo assemblies into a single consensus assembly

    Does anyone have experience with merging scaffolds from a few assemblies into a single consensus assembly? When I run SOAPdenovo with different values of k, the resulting scaffolds have different characteristics. Higher k values tend to give me a few longer scaffolds and a better N50, but a lower mean scaffold size and more short scaffolds than smaller values of k. It seems like a good idea to merge assemblies generated with a few of these k-mer values into a single consensus assembly, right?
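To make the N50 versus mean scaffold size trade-off concrete, here is a minimal Python sketch that computes both from a list of scaffold lengths. The function names and the two length lists are made up for illustration; they are not SOAPdenovo output.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50: the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a non-empty list")


def summarize(lengths: List[int]) -> Dict[str, float]:
    """Collect the basic statistics discussed in the post."""
    return {
        "count": len(lengths),
        "total": sum(lengths),
        "mean": sum(lengths) / len(lengths),
        "n50": n50(lengths),
    }


# Hypothetical scaffold lengths (bp), invented to mimic the pattern described:
# the high-k assembly has a few long scaffolds plus many short ones.
high_k = [9000, 8000, 500, 400, 300, 200, 100]
low_k = [5000, 4500, 4000, 3500, 1500]

for name, lengths in (("high k", high_k), ("low k", low_k)):
    print(name, summarize(lengths))
```

With these invented numbers both assemblies have the same total length, but the high-k one has the larger N50 and the smaller mean, which is the pattern described above.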

    I have looked at a few tools to do this. One of them is called Reconciliator http://www.genome.umd.edu/reconcilia...structions.htm but it would require me to convert several SOAPdenovo assembly output files into "Sanger/WashU" format, which looks like it could involve quite a bit of work. I am not sure that generating all of these required input files would be worth the effort.

    I also found minimus2, which looks like it is primarily designed to merge two assemblies at a time, and I have 4 that I would like to merge. I could merge pairwise, each time merging the next assembly into the previous consensus, but that seems like it could lead to problems... I am also finding some complaints on forums about how the program deals with N's.
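The "pairwise merging with the previous consensus" idea can be sketched as a simple fold over the assemblies. This is only an outline of the control flow: `toy_merge_pair` is a hypothetical stand-in, not minimus2 or any real merger, and the k values and sequences are invented.

```python
from functools import reduce
from typing import Callable, Dict, List

Assembly = List[str]  # an assembly as a plain list of scaffold sequences


def toy_merge_pair(master: Assembly, other: Assembly) -> Assembly:
    """Hypothetical stand-in for a real pairwise merger: keep the master's
    scaffolds and append any scaffold from `other` not already present verbatim."""
    seen = set(master)
    merged = list(master)
    for seq in other:
        if seq not in seen:
            seen.add(seq)
            merged.append(seq)
    return merged


def progressive_merge(
    assemblies: Dict[str, Assembly],
    order: List[str],
    merge_pair: Callable[[Assembly, Assembly], Assembly],
) -> Assembly:
    """Merge assemblies one at a time, feeding each consensus into the next merge."""
    if not order:
        raise ValueError("order must name at least one assembly")
    first, rest = order[0], order[1:]
    return reduce(merge_pair, (assemblies[name] for name in rest), list(assemblies[first]))


# Invented toy data: four assemblies keyed by a made-up k value.
assemblies = {
    "k31": ["ACGTACGT", "TTGGCC"],
    "k41": ["ACGTACGT", "GGGAAA"],
    "k51": ["TTGGCC", "CCCTTT"],
    "k61": ["GGGAAA"],
}

consensus = progressive_merge(assemblies, ["k61", "k51", "k41", "k31"], toy_merge_pair)
print(consensus)
```

With this toy merger the choice of starting assembly only changes the order of the output. A real merger treats the first assembly as the master, so the merge order could well change the result, which may be one source of the problems worried about above.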

    The last program I found is called MAIA, but it is distributed as a MATLAB package with dependencies on a few MATLAB toolboxes (yuck), and it looks like it also requires a "closely related reference genome", which I definitely do not have.

    Thanks for any suggestions or experiences with this.

    -John
  • natstreet
    Member
    • Nov 2009
    • 83

    #2
    Another option is the recently-released Zorro (which is based on minimus2 but makes using NGS data friendlier). However, it is also for pairwise merging.


    • boetsie
      Senior Member
      • Feb 2010
      • 245

      #3
      Maybe have a look at older threads:

      Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

      Zorro also looks like an interesting tool!
      Boetsie


      • oben
        Junior Member
        • Apr 2011
        • 1

        #4
        Hi~

        Maybe TGICL would be helpful.



        Oben


        • boetsie
          Senior Member
          • Feb 2010
          • 245

          #5
          Another tool is GAM; I haven't tried it yet, though.


          • scalabrin
            Member
            • Jul 2009
            • 22

            #6
            Originally posted by boetsie
            Hi,
            GAM was Sanger-based and currently supports assemblies from Arachne and PCAP only. We are working on an NGS version that supports BAM files.


            • luc
              Senior Member
              • Dec 2010
              • 469

              #7
              GAM-NGS was published a few weeks ago:

              Genomic Assemblies Merger for NGS (GitHub: vice87/gam-ngs)

              It looks very promising, though I needed help from a very good admin to get it installed. For my large datasets, however, it repeatedly stopped about 20 minutes into one of the last steps.
              Last edited by luc; 07-27-2013, 10:09 AM.


              • scalabrin
                Member
                • Jul 2009
                • 22

                #8
                Originally posted by luc
                GAM-NGS was published a few weeks ago:

                Genomic Assemblies Merger for NGS (GitHub: vice87/gam-ngs)

                It looks very promising, though I needed help from a very good admin to get it installed. For my large datasets, however, it repeatedly stopped about 20 minutes into one of the last steps.

                Please contact the corresponding author; he will surely help you!

                As for installation, that is strange: I was able to install it easily myself, even in my home directory, without needing any system paths.


                • mberacochea
                  Junior Member
                  • Aug 2014
                  • 5

                  #9
                  Hi,

                  I have a related question. I have a draft genome (sequenced with PGM) and I want to get the most out of the data. Is it useful to merge assemblies produced by different software?

                  Thanks!


                  • scalabrin
                    Member
                    • Jul 2009
                    • 22

                    #10
                    Originally posted by mberacochea
                    Hi,

                    I have a related question. I have a draft genome (sequenced with PGM) and I want to get the most out of the data. Is it useful to merge assemblies produced by different software?

                    Thanks!

                    It depends on the assemblies you want to merge. If they are very similar to each other, there is no point in merging them. For example, if you run ABySS at different k-mer lengths, the short k-mer assemblies are usually subsets of the longer k-mer assemblies. If, on the other hand, you are comparing an assembly made with AllPaths-LG and one made with ABySS, you might get different results, and merging them would be very useful.
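One rough way to check whether one assembly is largely a subset of another is to count how many of its contigs occur verbatim, on either strand, inside contigs of the other. The sketch below assumes exact substring matching, which is a strong simplification (a real comparison would use an aligner and tolerate mismatches); the function names and sequences are invented for illustration.

```python
from typing import List

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contained_fraction(query: List[str], target: List[str]) -> float:
    """Fraction of query contigs found verbatim, on either strand,
    inside at least one target contig."""
    if not query:
        raise ValueError("query assembly is empty")
    hits = 0
    for contig in query:
        rc = reverse_complement(contig)
        if any(contig in t or rc in t for t in target):
            hits += 1
    return hits / len(query)


# Invented toy contigs: every short-k contig sits inside a long-k contig,
# the last one only as its reverse complement.
short_k = ["ACGTAC", "GGATCC", "TTTAAACC"]
long_k = ["ACGTACGGATCCAA", "GGTTTAAATT"]

print(contained_fraction(short_k, long_k))
print(contained_fraction(long_k, short_k))
```

A fraction close to 1 in one direction suggests that merging would add little; assemblies from different assemblers would be expected to score lower in both directions.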


                    • mberacochea
                      Junior Member
                      • Aug 2014
                      • 5

                      #11
                      I thought so. I haven't assembled the genome with other tools yet; I will do that and check.

                      Thanks!
