Hello everybody,
Has somebody done transcriptome assemblies with 454 Titanium sequences?
The goal of my experiment is to assemble a comprehensive "transcriptome space", not to do anything quantitative. Based on preliminary experiments, I expect roughly 15'000-20'000 different genes representing between 25x10⁶ and 50x10⁶ bases (25-50 Mb) of refseq-like transcripts (including all possible exons in each reference sequence).
I would be particularly interested if someone could advise/speculate on one or several of these points:
- PolyA-enriched samples vs normalized samples (loss of rare transcripts? efficiency?)
- Paired-end libraries (overlapping maybe? or particular insert sizes?) vs single-end libraries
- The number of reads (or plates, assuming 1x10⁶ reads per plate) required to detect a good fraction of all the expressed transcripts (e.g. 95-98%)
- The "average" (over the whole sequencing run) or "minimum" (for rare transcripts) coverage required to obtain good contiguity after de novo assembly (e.g. 80%+ full-length transcripts)
- The impact of the points mentioned above on the presence of rare transcripts, such as transcription factors, in the final assembly.
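To make the read and coverage questions concrete, here is the back-of-envelope arithmetic I have in mind, as a rough sketch only. It takes the transcriptome size as 25x10⁶ to 50x10⁶ bases and 1x10⁶ reads per plate as stated above; the mean read length of 400 bases is my own assumption, and the function names are just for illustration. It only gives the average coverage and ignores the very uneven expression levels, which is exactly why the rare transcripts worry me.

```python
import math

# Assumed values: reads per plate comes from the post, the mean read
# length (400 bases) is an assumption and should be replaced by real data.
READS_PER_PLATE = 1_000_000
MEAN_READ_LEN = 400


def average_coverage(n_plates, transcriptome_bases):
    """Average fold coverage if reads were spread evenly over all transcripts."""
    sequenced_bases = n_plates * READS_PER_PLATE * MEAN_READ_LEN
    return sequenced_bases / transcriptome_bases


def plates_needed(target_coverage, transcriptome_bases):
    """Smallest whole number of plates reaching the target average coverage."""
    bases_per_plate = READS_PER_PLATE * MEAN_READ_LEN
    return math.ceil(target_coverage * transcriptome_bases / bases_per_plate)


if __name__ == "__main__":
    for size in (25_000_000, 50_000_000):
        print(size, average_coverage(1, size), plates_needed(20, size))
```

The 20x target in the loop is only a placeholder; the coverage actually needed for good contiguity is one of the things I am asking about.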
Please don't hesitate to point to recent publications or other threads.
Cheers,
Yvan