**westerman** · 08-05-2013, 01:00 PM

So ... what average contig length do you expect? And why? While the transcriptome projects that come through my hands do generate some transcripts in the 'few thousands' of bases, most of them are much shorter.

The average length *may* vary because the various programs keep different numbers of short transcripts. If one program keeps all transcripts while another throws away transcripts shorter than 200 bases, then your average will vary even if the longest transcripts do not. Really, you cannot say much about average lengths unless you also know the shortest and longest lengths and the distribution.
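To make the point about length cutoffs concrete, here is a minimal sketch in Python. The contig lengths are made up for illustration, and `length_summary` is a hypothetical helper, not part of any assembler; the only detail taken from the post is the 200-base cutoff.

```python
from statistics import mean, median


def length_summary(lengths, min_len=0):
    """Summarise contig lengths after dropping contigs shorter than min_len."""
    kept = sorted(n for n in lengths if n >= min_len)
    if not kept:
        raise ValueError("no contigs left after filtering")
    return {
        "count": len(kept),
        "min": kept[0],
        "max": kept[-1],
        "mean": mean(kept),
        "median": median(kept),
    }


# Hypothetical contig lengths (in bases) from a single assembly.
lengths = [120, 150, 180, 250, 400, 900, 3200]

all_kept = length_summary(lengths)               # program A: keeps everything
filtered = length_summary(lengths, min_len=200)  # program B: drops contigs < 200 bases

print(all_kept)
print(filtered)
```

The longest contig is the same in both summaries, but the mean shifts purely because of the cutoff, which is why the minimum, maximum, and distribution need to be reported alongside it.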

**westerman** · 08-05-2013, 01:03 PM

Also, if you are going to do de novo transcriptome assembly, then you really owe it to yourself to try out Trinity instead of using non-transcriptome assemblers.

**Wallysb01** · 08-05-2013, 01:12 PM

The length of your assemblies will be greatly affected by the expression level of the genes you're assembling. Even in very deeply sequenced samples, and with replicates, it's going to be very hard to assemble very many genes from TSS to polyA, or even from start to stop codon. Simple statistics like N50 or mean length just don't mean much for transcriptomes.

You need to do some sort of orthology assignment to get an idea of how complete your assembly is, or how one assembly compares to another. If you're going for simple statistics about your assembly, I'd much rather just look at the number of transcripts >1 kbp than at N50 or average length, because it's usually in the 500-1000 bp range that you start getting meaningful information for downstream analysis.
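For anyone who wants to compare those simple statistics side by side, here is a rough sketch in Python. The lengths are invented, and `n50` and `count_longer_than` are hypothetical helper names; N50 is computed with the usual definition (the length of the contig at which the running total, summed from longest to shortest, first reaches half of the total assembled bases).

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if 2 * running >= total:
            return n
    raise ValueError("no lengths given")


def count_longer_than(lengths, threshold=1000):
    """Number of transcripts strictly longer than threshold bases."""
    return sum(1 for n in lengths if n > threshold)


# Hypothetical transcript lengths (in bases) for one assembly.
lengths = [300, 450, 700, 1200, 1500, 2600]

print("N50:", n50(lengths))
print("transcripts > 1 kbp:", count_longer_than(lengths))
```

Neither number says anything about completeness, so treat this only as a quick way to compare two assemblies before doing the orthology assignment.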

And to try to answer your questions about how to improve assembly length, I would just say try trans-ABySS (which uses a multiple k-mer approach and might be the best assembler in terms of completeness) and Trinity (which does a nice job with length and ease of downstream analysis). From your use of Velvet Oases, it sounds like you're doing this on microbes, but you may still find success with those two.

Create longer contigs from transcriptome assembly