  • RNA-Seq: Full-length transcriptome assembly from RNA-Seq data without a reference gen

    Syndicated from PubMed RSS Feeds

    Full-length transcriptome assembly from RNA-Seq data without a reference genome.

    Nat Biotechnol. 2011 May 15;

    Authors: Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A

    Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
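As a rough illustration of the de Bruijn graph idea mentioned in the abstract, here is a toy sketch. It is not Trinity's implementation: the function names, the reads, and k=4 are made up for the example, and real assemblers also handle sequencing errors, coverage, reverse complements, and branching.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def extend_unambiguous(graph, start):
    """Walk forward from `start` while every node has exactly one successor."""
    path, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on a cycle instead of looping forever
            break
        path += nxt[-1]
        seen.add(nxt)
        node = nxt
    return path

# Hypothetical overlapping reads taken from one short "transcript".
reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]
graph = build_de_bruijn(reads, k=4)
contig = extend_unambiguous(graph, "ATG")
print(contig)  # ATGGCGTGCAA
```

The walk stops wherever a node has zero or several successors. Those branch points are where an assembler has to decide between alternatives such as isoforms or paralogs.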

    PMID: 21572440 [PubMed - as supplied by publisher]




  • #2
    Has anybody tried this method/software?



    • #3
      Yes, we tried the software extensively and found it to take much less RAM than Velvet+Oases.
      http://homolog.us



      • #4
        I am performing RNA-seq without a reference genome and just tried Trinity. A quick look at some of the assembly stats is very promising: the number of contigs above 1000 increased relative to my merged ABySS assemblies from multiple k-mers.
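For anyone wanting to compare assemblies the same way, here is a minimal sketch of this kind of contig-length summary. It assumes "above 1000" means contig length in bp, and the function names and toy FASTA are hypothetical. It reads FASTA text and reports contig count, longest contig, number of contigs above a threshold, and N50.

```python
def read_fasta_lengths(lines):
    """Yield the length of each sequence in FASTA-formatted lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length

def assembly_stats(lengths, min_len=1000):
    """Summarize contig lengths; N50 is the length at which half the bases are covered."""
    lengths = sorted(lengths, reverse=True)
    total = sum(lengths)
    running, n50 = 0, 0
    for n in lengths:
        running += n
        if running * 2 >= total:
            n50 = n
            break
    return {
        "contigs": len(lengths),
        "longest": lengths[0] if lengths else 0,
        "above_min_len": sum(1 for n in lengths if n > min_len),
        "n50": n50,
    }

# Toy FASTA (made-up sequences); a real run would pass an open file handle instead.
toy_fasta = ">c1\nACGTACGT\nACGT\n>c2\nACG\n>c3\nACGTA\n"
stats = assembly_stats(read_fasta_lengths(toy_fasta.splitlines()), min_len=4)
print(stats)
```

Running the same function on each assembler's FASTA output gives numbers that can be compared side by side.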

        Has anyone used their Trinity assemblies directly for gene expression analysis? If so, I would be interested to hear the approach people are taking. If Trinity preserves transcript diversity and the resulting assemblies include 'all' possible isoforms of a gene, is it possible to use these contigs directly as a reference to map the raw reads against for expression analysis?



        • #5
          I just did a comparison between Velvet/Oases, ABySS/Trans-ABySS and Trinity. While Trinity can't be run with multiple k values, it seems to produce fewer short transcripts.



          • #6
            Trinity has a default k-mer length of 25, but you can change it. I used the default k-mer and got much better transcript lengths than with ABySS. The longest contig in ABySS was 1157 bp; the longest in Trinity was 1455 bp.

            The program does not take as much memory as Velvet/Oases. However, if your reads are long (100 bp), it takes a while for Inchworm to complete. For my 41 million ~100 bp reads it took four days. I think this is because of the short k-mer length and my long reads. If your reads are shorter, I am sure it will complete faster.
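A back-of-the-envelope sketch of the reasoning above: a read of length L contains L - k + 1 k-mers, so longer reads (or a shorter k) mean more k-mers to process. This only counts k-mer occurrences and is not a model of Inchworm's actual runtime. The read count and k=25 come from the post; the other read lengths are arbitrary examples.

```python
def kmers_per_read(read_len, k):
    """Number of k-mers in one read of length read_len (0 if the read is shorter than k)."""
    return max(read_len - k + 1, 0)

N_READS = 41_000_000  # from the post above
K = 25                # Trinity default mentioned in the post

for read_len in (36, 50, 76, 100):
    per_read = kmers_per_read(read_len, K)
    total = N_READS * per_read
    print(f"{read_len} bp reads: {per_read} k-mers per read, {total:,} k-mer occurrences")

# For 100 bp reads: 76 k-mers per read, 3,116,000,000 occurrences in total.
```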

            For expression analysis, the authors of Trinity suggested using the Trinity contigs as a reference, aligning the reads with Bowtie, and then running Cufflinks for expression comparison.

            I am trying that now. Even though it should be possible to go directly from Bowtie to SAMtools to Cufflinks, I am getting errors, and no one has come up with an explanation.
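While the Cufflinks route is being sorted out, a simpler sanity check is to count aligned reads per contig straight from the aligner's SAM output. The sketch below is not what Cufflinks does: it ignores multi-mapping reads and isoform ambiguity, and the function names and toy records are hypothetical. It relies only on standard SAM fields: FLAG in column 2, reference name in column 3, and @SQ header lines for contig lengths.

```python
from collections import Counter

def parse_sam(sam_lines):
    """Return (reads per contig, contig lengths) from SAM text lines."""
    counts, lengths = Counter(), {}
    for line in sam_lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":  # header line carrying contig name and length
                tags = dict(f.split(":", 1) for f in fields[1:])
                lengths[tags["SN"]] = int(tags["LN"])
            continue
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":
            continue  # read is unmapped
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        counts[rname] += 1
    return counts, lengths

def reads_per_kb_per_million(counts, lengths):
    """Normalize counts by contig length (kb) and total mapped reads (millions)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / (lengths[name] / 1000) / (total / 1e6)
            for name, n in counts.items()}

def record(qname, flag, rname):
    """Build a minimal 11-column SAM record for the toy example."""
    return "\t".join([qname, str(flag), rname, "1", "255", "4M", "*", "0", "0", "ACGT", "IIII"])

toy_sam = [
    "@SQ\tSN:contig1\tLN:2000",
    "@SQ\tSN:contig2\tLN:500",
    record("r1", 0, "contig1"),
    record("r2", 16, "contig1"),
    record("r3", 0, "contig2"),
    record("r4", 4, "*"),
]
counts, lengths = parse_sam(toy_sam)
rpkm = reads_per_kb_per_million(counts, lengths)
print(dict(counts), rpkm)
```

For real data the list of lines would be replaced by an open SAM file. The normalized values are only a crude per-contig estimate and should not be read as isoform-level expression.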

            Also, I should mention that if you run into a zombie process problem during Butterfly, like I did, check the FAQ; the answer is there: http://trinityrnaseq.sourceforge.net/
            Last edited by jdjax; 08-07-2011, 08:55 AM. Reason: more info
            jdjax
            Ph.d. Student
            Åarhus University



            • #7
              Hi jdjax

              Originally posted by jdjax

              The program does not take as much memory as Velvet/Oases. However, if your reads are long (100 bp), it takes a while for Inchworm to complete. For my 41 million ~100 bp reads it took four days. I think this is because of the short k-mer length and my long reads. If your reads are shorter, I am sure it will complete faster.
              http://trinityrnaseq.sourceforge.net/
              What were your computer specs when you ran this analysis, and were your reads paired end?
              Thanks



              • #8
                Originally posted by Kennels
                Hi jdjax



                What were your computer specs when you ran this analysis, and were your reads paired end?
                Thanks
                I would recommend looking over this website for your answer:


                This website should give you the answer you are looking for. If it is not on the website, join the mailing list and ask the program's maintainer there.
                jdjax
                Ph.d. Student
                Åarhus University
