  • geschickten
    Member
    • Jul 2009
    • 31

    VELVET or ABYSS for Transcriptome

    We are planning to use ABySS and Velvet for de novo assembly of transcriptome data. Can the group share its experience with either tool? Also, how do the two compare, and which is the best tool available for assembling transcriptome data? Thank you...
  • yvan.wenger
    Member
    • Aug 2009
    • 30

    #2
    De novo transcriptome assembly?

    Hello,

    I am wondering if you had any reply to your question about the best tool for transcriptome assembly... I am about to evaluate the tools, but it seems that our draft genome gives an advantage to assemblies with short contigs, because it consists of roughly 130,000 (genomic) contigs. As a consequence, the assembly with the best mapping to the genome is one with short contigs (otherwise, large assembled contigs would jump from one genomic contig to another, because the genomic contigs are quite short).

    As N50 does not seem to be a good metric for transcriptomes, I was wondering what other measures or procedures to use to rank the different assemblies. Also, I noted that both correct and wrong contigs can be found in all assemblies and that they are often different (for example, you can find a correct contig that is only represented in a rather "bad" assembly). Given this, I am wondering if somebody in this forum has seen data on alternative methods to obtain good contigs without a good genome? For instance, I just had another look at the ABySS paper (De novo Transcriptome Assembly with ABySS, Birol et al., Bioinformatics Advance Access, published June 15 2009) and saw that they still assess their transcriptome assembly using the human genome. As an alternative, I am thinking of merging several assemblies, comparing the contigs that merge together (if any), maybe keeping a contig only if it appears in at least two different assemblies, and so on... but everything still needs to be done.
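The "keep a contig only if it appears in at least two different assemblies" idea can be sketched as below. This is only an illustration under strong assumptions: contigs are compared by exact sequence identity (up to reverse complement), whereas a real comparison would need alignment or overlap detection. The function names and the `min_support` parameter are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def canonical(seq):
    """Strand-independent key: the smaller of the sequence and its reverse complement."""
    forward = seq.upper()
    return min(forward, reverse_complement(forward))


def contigs_in_multiple_assemblies(assemblies, min_support=2):
    """Keep contigs that occur (exactly) in at least `min_support` assemblies.

    `assemblies` is a list of assemblies, each a list of contig sequences.
    """
    support = Counter()
    for contigs in assemblies:
        # Count each contig once per assembly, even if it is duplicated there.
        for key in {canonical(c) for c in contigs}:
            support[key] += 1
    return sorted(key for key, n in support.items() if n >= min_support)


if __name__ == "__main__":
    # Toy sequences, not real data.
    toy = [
        ["ACGTTG", "GGGCCC"],
        ["CAACGT", "TTTAAA"],  # CAACGT is the reverse complement of ACGTTG
        ["TTTAAA"],
    ]
    print(contigs_in_multiple_assemblies(toy))
```

Exact matching is deliberately strict; contigs that differ by a single base or in length would be treated as unrelated, so this is a starting point rather than a ranking method.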

    Any thoughts on all that? Or otherwise, is there a forum dedicated to this topic?

    Best,

    Yvan


    • geschickten
      Member
      • Jul 2009
      • 31

      #3
      Yvan,

      Well, I haven't received any replies from the forum. I must admit I am new to the world of genomics, so I may not be able to comment on your observations.

      I have not come across any forum dedicated to this topic.

      Do let me know about your evaluation and if required we can even take this offline...


      • wjeck
        Member
        • Mar 2009
        • 39

        #4
        all,

        I believe that the short answer is: the proper tools are not publicly available yet. There is a wrong way to do this (assembling transcriptome data as if it were genomic) and a right way (yet to be determined). I'm looking for pretty much the same thing and I can't seem to find it. The primary problem with assembling transcriptome data as if it were genomic is that most transcriptome data sets contain some genomic contamination, and they contain alternatively spliced transcripts. Both of these facts run counter to the assumptions of the genome assemblers, which expect no alternative splicing (or at most two haplotype alternatives). If anyone is thinking about working on new assemblers for these new data sets, PM me; I'm very interested in exploring the topic and maybe sitting down to write one.

        Cheers,
        --Will


        • geschickten
          Member
          • Jul 2009
          • 31

          #5
          Will,

          I am willing to work on this, and if you are okay with it, we can work together to design and develop an assembler for transcriptome data!

          prahalad


          • NSTbioinformatics
            Member
            • Apr 2009
            • 24

            #6
            I have used Velvet and ABySS to assemble genomic sequences from Illumina reads. However, Velvet runs very slowly and cannot process 36,507,944 reads x 36 nt + 95,398,944 reads x 76 nt on a computer with 32 G of memory; it stopped because it ran out of memory. I don't know how to solve this.

            According to the paper De novo Transcriptome Assembly with ABySS (Birol et al.), ABySS can assemble shotgun and paired-end runs together. I am wondering how that works. The ABySS manual only shows how to assemble shotgun runs and paired-end runs separately.

            I would like to hear from others about these tools.
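For scale, the two read sets quoted above can be added up as follows. This is a minimal sketch that only multiplies the read counts and read lengths given in the post; it makes no claim about how much memory Velvet needs for them.

```python
# (number of reads, read length in nt), taken from the post above
READ_SETS = [
    (36_507_944, 36),
    (95_398_944, 76),
]


def total_bases(read_sets):
    """Sum of read_count * read_length over all read sets."""
    return sum(n_reads * length for n_reads, length in read_sets)


if __name__ == "__main__":
    print(f"{total_bases(READ_SETS):,} nt in total")
```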


            • geschickten
              Member
              • Jul 2009
              • 31

              #7
              NST, do you think you could share the paper "De novo Transcriptome Assembly with ABySS", Birol et al.?

              -p


              • jts
                Member
                • Feb 2009
                • 22

                #8
                Hi NSTbioinformatics,

                If you post the details of your problem on the abyss-users mailing list (http://www.bcgsc.ca/mailman/listinfo/abyss-users), Shaun Jackman or I can help you set up ABySS for your data set. You will be able to assemble both single-end and paired-end reads in the same run, but some care must be taken when choosing the assembly parameters.

                Regards,
                Jared Simpson


                • kmcarr
                  Senior Member
                  • May 2008
                  • 1181

                  #9
                  Here are the references for ABySS:

                  Simpson et al. ABySS: A parallel assembler for short read sequence data. Genome Res (2009) vol. 19 (6) pp. 1117-23

                  Birol et al. De novo Transcriptome Assembly with ABySS. Bioinformatics (2009) pp.


                  • jnfass
                    Member
                    • Aug 2008
                    • 88

                    #10
                    I've done velvet assemblies with > 100M reads (some paired-end) on a 512G machine ... yes, it does take a lot of memory ... but I'd be interested in hearing if ABySS is any better. My understanding is that these assemblers like to have the whole assembly graph in memory at once, and that's the roadblock to assembling in smaller RAM spaces (though, I've seen a few comments from people working on parallelizing one or the other program).

                    Before I had access to a large-memory machine, I ran the single-end assembly first, then added the resulting contigs as "long" reads to an assembly of the paired reads.

                    Velvet can definitely do single and paired reads together, and if you change a parameter before compiling, you can have an unlimited number of different paired read sets, each with different insert lengths.
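The two-stage workaround described above (assemble the single-end reads first, then feed those contigs as "long" reads into the paired-end assembly) might look roughly like this. The post does not give the actual commands, so this is a guess at their shape: the file names, directory names, k value and insert length are hypothetical, and the sketch only builds the command lines without running Velvet. It relies on velveth's `-short`, `-shortPaired` and `-long` read categories and velvetg's `-ins_length` option.

```python
import shlex


def two_stage_velvet_commands(single_fastq, paired_fastq, k, insert_length,
                              stage1_dir="stage1", stage2_dir="stage2"):
    """Build (but do not run) command lines for a two-stage Velvet assembly."""
    # velvetg writes its contigs to contigs.fa inside the working directory.
    stage1_contigs = f"{stage1_dir}/contigs.fa"
    return [
        # Stage 1: single-end reads only.
        ["velveth", stage1_dir, str(k), "-fastq", "-short", single_fastq],
        ["velvetg", stage1_dir],
        # Stage 2: paired reads plus the stage 1 contigs as "long" reads.
        ["velveth", stage2_dir, str(k),
         "-fastq", "-shortPaired", paired_fastq,
         "-fasta", "-long", stage1_contigs],
        ["velvetg", stage2_dir, "-ins_length", str(insert_length)],
    ]


if __name__ == "__main__":
    # Hypothetical inputs; paired reads are assumed to be interleaved in one file.
    for cmd in two_stage_velvet_commands("single.fq", "paired.fq", 31, 200):
        print(shlex.join(cmd))
```

Using more than the default number of paired read sets requires rebuilding Velvet; as far as I know the relevant compile-time setting is `CATEGORIES`, but check the manual for your version.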


                    • Zigster
                      Jeremy Leipzig
                      • May 2009
                      • 117

                      #11
                      I have had better luck running Velvet at longer kmers and, to a lesser extent, with higher coverage cutoffs. This seems counter-intuitive, given that there are 16x as many possible kmers of length 31 as of length 29, but velvetg is much more likely to hit the wall at the shorter kmers.

                      I recently did a de novo transcriptome assembly of 100,425,440 72 bp paired-end reads totaling over 7,034,311,658 bp on a 256G machine, but could not get below kmer 29 without crashing.

                      Fortunately velvet now accepts very large kmer lengths, so I would try those before giving up.
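The k-mer arithmetic in this post can be checked with a few lines. The sketch below uses the read length (72) and k values (29, 31) from the post; the function names are made up for illustration. It also shows that a longer k yields fewer k-mers per read, which is one plausible contributor to the observed behaviour, though the number of distinct k-mers actually held in memory depends on the data and its error rate.

```python
def possible_kmers(k):
    """Number of distinct DNA strings of length k (an upper bound, not what the data contains)."""
    return 4 ** k


def kmers_per_read(read_length, k):
    """A read of length L contains L - k + 1 overlapping k-mers (0 if the read is shorter than k)."""
    return max(read_length - k + 1, 0)


if __name__ == "__main__":
    ratio = possible_kmers(31) // possible_kmers(29)
    print(f"possible 31-mers / possible 29-mers = {ratio}")
    for k in (29, 31):
        print(f"k={k}: {kmers_per_read(72, k)} k-mers per 72 nt read")
```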
                      --
                      Jeremy Leipzig
                      Bioinformatics Programmer

                      • beelu
                        Junior Member
                        • Mar 2008
                        • 7

                        #12
                        Hi jnfass and Zigster, how did you build machines with 512G/256G of memory? How many CPUs do you have, and what's your RAM-to-core ratio? Thanks.

                        Beelu


                        • Zigster
                          Jeremy Leipzig
                          • May 2009
                          • 117

                          #13
                          We use a Dell PowerEdge something-or-other with four X7350 processors (16 cores total)

                          • geschickten
                            Member
                            • Jul 2009
                            • 31

                            #14
                            Hi jnfass,

                            Can you please share some information on who is working on parallelizing assemblers? Also, kindly point me to some good open-source parallel assemblers if you know of any. Thank you.


                            • geschickten
                              Member
                              • Jul 2009
                              • 31

                              #15
                              Hi Zigster,

                              Can you please share the exact configuration of the machine that you used for this run? Also, if somebody let you run this in the cloud, would you go for it?
