Minimus2


  • Minimus2

    Hi,

    I have a question about minimus2. I am using it to join two Velvet assemblies of the same Illumina data, produced with different hash lengths. Each assembly is approximately 32 Mb. The resulting minimus2 merged assembly is twice the size of the input data.

    Has anyone observed a similar problem? What is the source of this size increase, and how can it be solved? I tried changing the overlap value, but it did not change much.

    Thanks for your advice!

  • #2
    I would presume (assuming you have fed the contigs to Minimus correctly) that the two assemblies are sufficiently different that Minimus struggles to find overlapping regions to join together.

    You could use something like MUMmer to check this.

    In any case, I'm not sure that mixing the results of two assemblies with different k-mer lengths is likely to give a better result.
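One quick sanity check before running an aligner is to compare total contig lengths: if the merged FASTA is close to the sum of both inputs, very little was actually joined. The following is a minimal sketch, not part of minimus2 or AMOS; the function names `contig_lengths`, `n50` and `merged_fraction` are made up for illustration, and it assumes plain FASTA input.

```python
def contig_lengths(handle):
    """Return the length of every sequence in a FASTA file handle."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Smallest contig length such that contigs at least that long cover half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def merged_fraction(input_totals, merged_total):
    """Merged size as a fraction of the summed input sizes (near 1.0 means little merging)."""
    return merged_total / sum(input_totals)
```

Usage would be along the lines of `contig_lengths(open("k31_contigs.fa"))` for each input and for the merged output (the file name here is hypothetical), then comparing the sums with `merged_fraction`.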


    • #3
      I agree with Nick overall in that joining two assemblies using k1 and k2 will probably not gain much UNLESS you had trimmed your reads to variable length, and a stack of your reads were shorter than one of the k values, and hence couldn't be used.

      Minimus2 couldn't join them due to lack of overlap I guess, or maybe you didn't run it correctly. It is a bit confusing - I use a Perl script wrapper which I have attached (it needs BioPerl installed).
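To see whether the variable-length trimming case applies, one can count how many reads are shorter than each k value. This is only a sketch under the assumption that read lengths are already available as a list of integers; `reads_shorter_than_k` is a hypothetical helper, not a Velvet or minimus2 function.

```python
def reads_shorter_than_k(read_lengths, k):
    """Count reads that are strictly shorter than k and so contain no complete k-mer."""
    return sum(1 for length in read_lengths if length < k)


def unusable_by_k(read_lengths, k_values):
    """Map each k value to the number of reads too short to contribute a k-mer."""
    return {k: reads_shorter_than_k(read_lengths, k) for k in k_values}
```

If the counts for k1 and k2 differ substantially, the two assemblies were effectively built from different subsets of the reads, which is the situation where combining them might add information.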

      • #4
        Thanks a lot for the advice and thoughts. The idea of merging assemblies made with k1 and k2 (for instance, k-mers 31 and 61) was to get a more contiguous consensus assembly. But I discovered a few problems. Minimus can't deal efficiently with Ns, and splitting contigs at Ns contradicts the whole idea of getting longer contigs. Short contigs (abundant in Velvet assemblies) are not always merged the way you would expect. Finally, Velvet assemblies produced with different k-mers do seem to differ a lot, which is worrying.

        I think I ran minimus2 correctly, since I tested it on the sample dataset and it worked. In any case, thanks for the script; it is very helpful.
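For reference, the N-splitting workaround mentioned above can be sketched as follows. This is an illustrative helper only (the name `split_at_n_runs` and its parameters are assumptions, not part of AMOS); it cuts a contig wherever a run of at least `min_run` N characters occurs and drops pieces shorter than `min_length`.

```python
import re


def split_at_n_runs(name, sequence, min_run=1, min_length=1):
    """Split a contig at runs of N and return (new_name, piece) tuples."""
    # Match runs of at least min_run N/n characters; shorter runs stay inside a piece.
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    pieces = []
    for piece in pattern.split(sequence):
        if len(piece) >= min_length:
            pieces.append(("%s_part%d" % (name, len(pieces) + 1), piece))
    return pieces
```

Raising `min_run` keeps isolated ambiguous bases inside contigs and only breaks at longer gaps, which limits how much contiguity is lost.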


        • #5
          Did you try changing the program call from make-consensus to make-consensus_poly within the runAmos script? I outlined the change in this thread:

          http://seqanswers.com/forums/showthread.php?t=6367

          It seemed to do a better job of handling N's and other ambiguity codes for me.

          This seems to be the only place this program is referenced:
          http://sourceforge.net/project/shown...ease_id=405988


          • #6
            I would agree that the multiple k-mer approach has significantly increased the number of full-length contigs in our Illumina assemblies, and it makes much more sense than testing for a single optimal k-mer. I've been using either cd-hit to cluster the separate runs or cap3 to assemble them. My recent trial of minimus2 gave yields similar to our cd-hit results, i.e. it reduced the dataset by ~1/4. Have you considered using velvet -long for your final assembly?


            • #7
              Thanks a lot for the minimus2 thread!

              We tried to use the velvet -long option but ran into memory problems on our system.

              This might also be a useful tip: we discovered many overlapping contigs within a single Velvet assembly whose overlap is shorter than the k-mer and which are therefore not merged by Velvet. Currently, we are trying to merge such contigs...
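A rough way to look for such short overlaps is an exact suffix-prefix comparison between contig ends. The sketch below is an assumption-laden illustration, not what Velvet or minimus2 does internally: it only finds exact matches on the forward strand (no mismatches, no reverse complement), and the function names are made up.

```python
def suffix_prefix_overlap(left, right, min_overlap=1):
    """Length of the longest exact overlap between the end of left and the start of right."""
    shortest = max(min_overlap, 1)  # an overlap of zero bases is not meaningful
    longest = min(len(left), len(right))
    for size in range(longest, shortest - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_if_short_overlap(left, right, k, min_overlap=1):
    """Join two contigs when they overlap exactly by fewer than k bases, else return None."""
    size = suffix_prefix_overlap(left, right, min_overlap)
    if 0 < size < k:
        return left + right[size:]
    return None
```

Very short exact overlaps occur by chance, so `min_overlap` would need to be set high enough for the genome in question before trusting any merge.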
