  • gardiea
    Junior Member
    • Sep 2010
    • 3

    Minimus2

    Hi,

    I have a question about minimus2. I am using it to join two velvet assemblies of the same Illumina data, produced with different hash lengths. Each assembly is approximately 32 Mb. The resulting minimus2 merged assembly is twice the size of the input data.

    Has anyone observed a similar problem? What is the source of such a size increase, and how can it be solved? I tried changing the overlap value, but it did not change much.

    Thanks for your advice!
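A quick way to put numbers on the size increase is to compare contig count, total bases and N50 for each input assembly and for the merged output. The following is a minimal sketch, not part of minimus2 or AMOS; the function names are made up for illustration and it assumes plain FASTA input. If the merged total is close to the sum of the inputs, very few contigs were actually joined.

```python
def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Keep only the first word of the header as the record name.
            name, length = (line[1:].split() or [""])[0], 0
        elif line and name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def assembly_summary(lines):
    """Return contig count, total bases and N50 for one assembly."""
    lengths = sorted((n for _, n in fasta_lengths(lines)), reverse=True)
    total = sum(lengths)
    running, n50 = 0, 0
    for n in lengths:
        running += n
        if running * 2 >= total:  # first contig that reaches half the total
            n50 = n
            break
    return {"contigs": len(lengths), "total_bp": total, "n50": n50}


if __name__ == "__main__":
    import sys
    # Usage (file names are up to you): python summary.py k31.fa k61.fa merged.fa
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, assembly_summary(handle))
```

Running it on both velvet outputs and on the merged file gives three rows that can be compared directly.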
  • nickloman
    Senior Member
    • Jul 2009
    • 355

    #2
    I would presume (assuming you have fed the contigs to Minimus correctly) the two assemblies are sufficiently different that Minimus struggles to find overlapping regions to join together.

    You could use something like MUMMER to check this.

    In any case, I'm not sure that mixing the results of two assemblies with different k-mer lengths is likely to give a better result.


    • Torst
      Senior Member
      • Apr 2008
      • 275

      #3
      I agree with Nick overall in that joining two assemblies using k1 and k2 will probably not gain much UNLESS you had trimmed your reads to variable length, and a stack of your reads were shorter than one of the k values, and hence couldn't be used.

      Minimus2 couldn't join them due to lack of overlap I guess, or maybe you didn't run it correctly. It is a bit confusing - I use a Perl script wrapper which I have attached (it needs BioPerl installed).

      • gardiea
        Junior Member
        • Sep 2010
        • 3

        #4
        Thanks a lot for the advice and thoughts. The idea of merging assemblies of k1 and k2 (for instance kmer 31 and 61) was to get a more continuous consensus assembly. But I discovered a few problems. Minimus can't efficiently deal with Ns, and splitting contigs at Ns contradicts the whole idea of getting longer contigs. Short contigs (abundant in velvet assemblies) are not always merged the way you would expect. Finally, velvet assemblies produced for different kmers do seem to differ a lot (worrying).

        I think I ran minimus2 correctly, since I tested it on the sample dataset and it worked. In any case, thanks for the script, it is very helpful.
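For anyone who wants to see how much contiguity is lost by the N-splitting workaround mentioned above, here is a minimal sketch of cutting a contig at runs of N. It is an illustration only, not something minimus2 does itself; the function name, the `min_run` and `min_len` parameters and the naming scheme for the pieces are all assumptions.

```python
import re


def split_at_n_runs(name, seq, min_run=1, min_len=1):
    """Split a contig at runs of at least `min_run` Ns.

    Returns a list of (piece_name, piece_sequence) tuples. Pieces shorter
    than `min_len` are dropped. The piece name records the 0-based offset
    of the piece in the original contig so it can be traced back later.
    """
    gap = re.compile(r"[Nn]{%d,}" % max(min_run, 1))
    pieces = []
    start = 0
    for match in gap.finditer(seq):
        if match.start() - start >= min_len:
            pieces.append((start, seq[start:match.start()]))
        start = match.end()
    if len(seq) - start >= min_len:
        pieces.append((start, seq[start:]))
    return [
        ("%s_part%d_pos%d" % (name, i, offset), piece)
        for i, (offset, piece) in enumerate(pieces, 1)
    ]
```

Setting `min_run` above 1 leaves isolated single Ns in place and only cuts at longer gaps, which may keep more of the long contigs intact.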


        • Adjuvant
          Member
          • Sep 2010
          • 13

          #5
          Did you try changing the program call from make-consensus to make-consensus_poly within the runAmos script? I outlined the change in this thread:



          It seemed to do a better job of handling N's and other ambiguity codes for me.

          This seems to be the only place this program is referenced:


          • ikim
            Member
            • Mar 2010
            • 13

            #6
            I would agree that the multiple kmer approach has significantly increased the number of full length contigs in our Illumina assemblies, and it makes much more sense than testing for a single optimal kmer. I've been using either cd-hit to cluster the separate runs or cap3 to assemble them. My recent trial of minimus2 gave yields similar to our cd-hit results, i.e. it reduced the dataset by ~1/4. Have you considered using velvet -long for your final assembly?


            • gardiea
              Junior Member
              • Sep 2010
              • 3

              #7
              Thanks a lot for the minimus2 thread!

              We tried to use the velvet -long option but ran into memory problems on our system.

              This might also be a useful tip: we discovered many overlapping contigs within a single velvet assembly that have an overlap shorter than a kmer and are therefore not merged by velvet. Currently, we are trying to merge such contigs...
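A rough way to look for such contig pairs is to test every contig end against every contig start for an exact overlap shorter than k. The sketch below is a hypothetical illustration and not the method used in this thread: it only checks exact matches on the same strand (no reverse complement, no mismatches) and it is quadratic in the number of contigs, so it is only practical on a subset. The function names and the `min_len` cutoff are assumptions.

```python
def suffix_prefix_overlap(a, b, min_len, max_len):
    """Length of the longest exact match between the end of `a` and the
    start of `b`, restricted to lengths in [min_len, max_len]; 0 if none."""
    upper = min(max_len, len(a), len(b))
    for n in range(upper, max(min_len, 1) - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def short_overlaps(contigs, k, min_len=10):
    """List (left_name, right_name, overlap) for contig pairs whose
    end-to-start overlap is at least `min_len` but shorter than `k`.

    `contigs` is a dict mapping contig name to sequence.
    """
    hits = []
    for name_a, seq_a in contigs.items():
        for name_b, seq_b in contigs.items():
            if name_a == name_b:
                continue
            n = suffix_prefix_overlap(seq_a, seq_b, min_len, k - 1)
            if n:
                hits.append((name_a, name_b, n))
    return hits
```

A small `min_len` will report many chance overlaps, so the cutoff needs to be chosen with the genome size and repeat content in mind.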
