  • Merging Velvet Assemblies

    Hi,

    This is my first post, and I look forward to being part of the community.

    I've been prepping, sequencing, and assembling pools of ~10 BAC clones using PE100 reads on an Illumina. The average clone size is ~150 kb, but they can range from 50-250 kb. During the preps, I pooled an equal weight of DNA for each BAC clone, so I expect different levels of coverage from each BAC. I know that Velvet produces optimal assemblies with a k-mer coverage of 20-30X, and k-mers of ~55 give me an average coverage of that level. However, large BACs will have <20X coverage and small BACs will have >30X with this k-mer. To deal with this, I've been running Velvet with a series of k-mers (31, 41, ..., 81), and my plan is to merge the contigs from the series of assemblies.
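A rough sketch of why a single k-mer cannot suit every clone in the pool. It assumes the relation given in the Velvet manual, Ck = C * (L - k + 1) / L, and that pooling equal DNA mass per clone gives each clone an equal share of the sequenced bases (vector sequence is ignored). The clone sizes and the total yield below are made-up numbers for illustration, not values from the post.

```python
def kmer_coverage(base_coverage, read_length, k):
    """k-mer coverage from base coverage: Ck = C * (L - k + 1) / L."""
    if k > read_length:
        raise ValueError("k must not exceed the read length")
    return base_coverage * (read_length - k + 1) / read_length


def per_clone_base_coverage(total_bases, clone_sizes):
    """Base coverage per clone when every clone gets an equal share of bases.

    Equal DNA mass per clone is modelled as an equal number of sequenced
    bases per clone, so coverage is inversely proportional to clone size.
    """
    share = total_bases / len(clone_sizes)
    return [share / size for size in clone_sizes]


if __name__ == "__main__":
    read_length = 100                         # PE100 reads
    clone_sizes = [50_000, 150_000, 250_000]  # bp; hypothetical pool
    total_bases = 30_000_000                  # hypothetical yield for the pool
    base_covs = per_clone_base_coverage(total_bases, clone_sizes)
    for k in (31, 41, 51, 61, 71, 81):
        covs = [kmer_coverage(c, read_length, k) for c in base_covs]
        print(k, [round(c, 1) for c in covs])
```

Printing the table for a real pool shows which k-mer puts each clone size closest to the 20-30X window, which is the motivation for running the k-mer series.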

    Initially, I just used MUMmer to align the contigs from each assembly against the other assemblies, and I wrote a script to parse the MUMmer output and discard contigs that are nested within larger contigs. This works OK, but I'm looking for something more sophisticated. Does anybody have suggestions for the best software for this merging? What I want to do is quite simple, but I'm just not sure of the best software to use.
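The nested-contig filter described above might look something like the sketch below. This is not Mike's script. It assumes tab-delimited alignment rows in the column order S1, E1, S2, E2, LEN1, LEN2, %IDY, LENR, LENQ, COVR, COVQ, reference name, query name (the layout to expect from `show-coords -T -l -c -H`; check it against your own output), and it assumes contig names are unique across assemblies. The 99% thresholds and the function name `nested_contigs` are arbitrary choices for illustration.

```python
def nested_contigs(rows, min_cov=99.0, min_idy=99.0):
    """Return names of query contigs contained within another contig.

    A query counts as nested when a single alignment covers at least
    min_cov percent of it at min_idy percent identity or better, and the
    reference contig is longer. Equal-length duplicates are resolved by
    name so that exactly one copy survives.
    """
    nested = set()
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 13:
            continue  # skip blank or malformed rows
        identity = float(fields[6])
        len_ref = int(fields[7])
        len_qry = int(fields[8])
        cov_qry = float(fields[10])
        ref, qry = fields[11], fields[12]
        if ref == qry:
            continue  # ignore self-alignments
        if identity < min_idy or cov_qry < min_cov:
            continue
        if len_ref > len_qry or (len_ref == len_qry and ref < qry):
            nested.add(qry)
    return nested
```

The returned set can then be used to drop those records from the pooled FASTA. Note that this only handles full containment; partially overlapping contigs are left untouched, which is where a real merging tool comes in.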

    Thanks,
    Mike

  • #2
    I found some tools that can do this job, such as CAP3, Phrap, CA, and MAIA.
    But I couldn't actually get any of them to work well.
    I hope you can try them and share your results.



    • #3
      Hello Mike,

      I'm also trying to do this. I think CAP3 might be the best tool, but I am still exploring this. There is a guy in our department who has written a program that uses CAP3 to merge Velvet and ABySS assemblies. It might be of some use. Let me know if you've found any other solutions. You are also in the great state of Oregon...where are you located?



      • #4
        Hi kbushley,

        I've been using Minimus2 and am somewhat satisfied. I haven't tried CAP3 yet. I'm at the University of Oregon. Go Ducks!!!

        Best,
        Mike



        • #5
          Thanks, I was reading up on that one today. Would you be willing to share your script that parses MUMmer output? That sounds rather useful. Go Beaves.



          • #6
            Sure. Get me your email address and I'll send them.



            • #7
              I'd also be really interested in giving the scripts a try, if possible, as this is something I've been looking for a good solution to. Can I send my email address to get a copy?



              • #8
                I also ran Trans-ABySS and Velvet (with different k-mers), pooled all the resulting assemblies (from the various Velvet runs and from Trans-ABySS) into one FASTA file, and ran CAP3 on that file.
                What is the CAP3 script doing?
                Am I missing an important step?
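One step that is easy to miss when pooling assemblies into a single FASTA file: Velvet names its contigs NODE_<n>_length_..._cov_... in every run, so the same names can recur across k-mer runs. A minimal sketch that prefixes each header with a label for its source assembly is below. The function name `pool_fasta` and the labels are hypothetical, and this is not the CAP3 wrapper mentioned earlier in the thread.

```python
def pool_fasta(paths_by_label, out_path):
    """Concatenate FASTA files, prefixing each header with its label.

    paths_by_label maps a label such as "k31" to a FASTA path.
    Returns the number of records written.
    """
    n_records = 0
    with open(out_path, "w") as out:
        for label, path in paths_by_label.items():
            last = "\n"
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        # ">NODE_1_..." becomes ">k31_NODE_1_..."
                        out.write(f">{label}_{line[1:]}")
                        n_records += 1
                    else:
                        out.write(line)
                    last = line
            # Guard against inputs that lack a trailing newline.
            if not last.endswith("\n"):
                out.write("\n")
    return n_records
```

For example, calling it with a mapping like {"k31": "velvet_k31/contigs.fa", "k41": "velvet_k41/contigs.fa"} (paths assumed here) would give a pooled file with unambiguous names that can then be handed to CAP3 or another merger.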



                • #9
                  merge contigs

                  Hi Mike,

                  I sent you an email about merging contigs using MUMmer. I am not sure you got it, as I have had no reply since I sent the message. I hope to hear from you. Thanks.

                  Rongman



                  • #10
                    Hi Mike,

                    Have you tried Phrap? I've recently been assembling overlapping BACs, and I've tried CAP3, Minimus2, and Phrap to remove the redundancy in the merged contigs; Phrap works best.
                    But there is still redundancy in the final assembly. I'd like to try MUMmer next. Can you send me a copy of your script?

                    Thanks!
                    Seth



                    • #11
                      Originally posted by Seth
                      ....
                      But there is still redundancy in the final assembly.....
                      Seth
                      Hi Seth

                      How do you assess redundancy, and how do you determine when two contigs are redundant and should be merged rather than being too different from each other? I'm not sure what species you work with, but for our assemblies of highly heterozygous plants this is a huge issue. So far I've failed to find an option for achieving this that isn't horribly slow on large(ish) assemblies (400 Mbp +).

                      One thing I haven't yet tried is using PCAP as a replacement for CAP3. Has anyone tried it?

                      I would also be interested in a copy of the script if possible.



                      • #12
                        Originally posted by natstreet
                        Hi Seth

                        How do you assess redundancy, and how do you determine when two contigs are redundant and should be merged rather than being too different from each other? I'm not sure what species you work with, but for our assemblies of highly heterozygous plants this is a huge issue. So far I've failed to find an option for achieving this that isn't horribly slow on large(ish) assemblies (400 Mbp +).
                        Hi,
                        I used the total base count of the final assembly to assess redundancy. The maximum length of the target region can be estimated from the BAC insert length and the number of BACs. I'm not familiar with the algorithms used in those programs, but I think the main idea is to identify overlapping contigs and join them together.
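The base-count check described here can be sketched as follows. It assumes the upper bound on the target region is simply the number of BACs times the mean insert length (overlapping BACs make the true target smaller), so a ratio above 1 indicates leftover redundancy, while a ratio at or below 1 does not rule it out. The function names and the numbers in the usage note are illustrative only.

```python
def total_bases(fasta_lines):
    """Sum sequence lengths over the lines of a FASTA file."""
    return sum(
        len(line.strip())
        for line in fasta_lines
        if not line.startswith(">")
    )


def redundancy_ratio(assembled_bases, n_bacs, mean_insert_bp):
    """Assembled bases divided by the maximum expected target length.

    The maximum is taken as n_bacs * mean_insert_bp, i.e. no overlap
    between BACs at all.
    """
    expected_max = n_bacs * mean_insert_bp
    return assembled_bases / expected_max
```

For example, 1.8 Mb of merged contigs from 10 BACs with ~150 kb inserts gives a ratio of about 1.2, meaning at least a fifth of the expected maximum is present as extra, presumably redundant, sequence.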

                        Have you tried Hapsembler? It is designed for assembling highly heterozygous genomes, but it is also slow.



                        • #13
                          Hi Seth

                          Thanks for the pointer to Hapsembler; I hadn't come across it before. I'll test it out asap.



                          • #14
                            Hapsembler

                            Hi,

                            I am assembling the Hepatitis C virus hypervariable regions E1 and E2, which have lots of SNPs. I am using Hapsembler, but it is very slow. What has your experience with Hapsembler been?



                            • #15
                              I have a similar problem. I want to combine contigs/scaffolds assembled from different datasets, e.g. Sanger, 454, and Solexa. I wanted to combine them based on MUMmer alignments. However, it's so hard for me. The organism is 40M and the largest scaffold is 2M. Can I use this software to finish my job?
                              Last edited by ZhigangLi; 09-26-2011, 06:09 PM.
                              github:
                              https://github.com/Bioinformatics-and-Genomics
