@Xinwu, my segment length was the default, 25.
-
@plabaj (and anyone else),
I have read your comment with interest and some concern. I plan to generate SOLiD RNA-seq data (paired-end, 50 bp + 25 bp) and wonder whether TopHat is any better at handling paired-end SOLiD data. I was planning to use TopHat to help with new transcript/exon discovery. BTW, this is all with mouse, so I have an annotated genome to work with.
Thanks.
-
Hi All,
I am trying to align paired-end SOLiD whole-transcriptome reads (50+35 FR) using Bowtie 0.12.7.
Strangely, when mapping paired-end, no read pairs (other than a few in repeat regions) map to the genome. In addition, no read pairs map to the transcriptome (an Ensembl gene-based reference).
When mapping individual reads, about 30% of 50 bp reads and 20% of 35 bp reads map successfully.
I must be doing something wrong. Using --ff changes nothing (the reads are actually FR; the older SOLiD mate-pairs were FF). The read csfasta files are 100% pair-matched (with read name suffixes _F3 and _F5-BC).
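For anyone who wants to run the same pair-matching check on their own files, here is a minimal Python sketch. It assumes the usual csfasta layout (optional `#` header lines, then a `>name` line followed by a color-space sequence line) and the `_F3` / `_F5-BC` suffixes mentioned above. The function names are hypothetical, and the color-space strings in the example are made up purely for illustration.

```python
def read_ids(lines, suffix):
    """Return read IDs from csfasta lines, with the mate suffix removed."""
    ids = []
    for line in lines:
        line = line.strip()
        if not line.startswith(">"):
            continue  # skip '#' headers and color-space sequence lines
        name = line[1:]
        if not name.endswith(suffix):
            raise ValueError(f"unexpected read name: {name}")
        ids.append(name[: -len(suffix)])
    return ids


def pairs_match(f3_lines, f5_lines):
    """True if both files list the same read IDs in the same order."""
    return read_ids(f3_lines, "_F3") == read_ids(f5_lines, "_F5-BC")


# Tiny in-memory example; the sequences below are invented placeholders.
f3_example = ["# header line", ">17_213_1598_F3", "T0120312"]
f5_example = ["# header line", ">17_213_1598_F5-BC", "G3310221"]
f5_shifted = ["# header line", ">17_213_1599_F5-BC", "G3310221"]

print(pairs_match(f3_example, f5_example))
print(pairs_match(f3_example, f5_shifted))
```

On real data you would pass open file handles instead of lists, e.g. `pairs_match(open("F3.csfasta"), open("F5.csfasta"))`. Order matters here because paired-end aligners generally expect the two files to list mates in the same order.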
Examples (using only 1k reads, but the results hold for the full set, as well as with exon-only Ensembl genes as reference, one fasta element per gene):
PE:
bowtie -f -p 8 --fr -C -S --sam-nohead --sam-nosq -s 1000000 -u 1000 \
  --Q1 $dir/F3_QV.qual --Q2 $dir/F5_QV.qual \
  -1 $dir/F3.csfasta -2 $dir/F5.csfasta \
  ~/p2/indexes/bowtie/hg19_c align.sam
# reads processed: 1000
# reads with at least one reported alignment: 1 (0.10%)
# reads that failed to align: 999 (99.90%)
Reported 1 paired-end alignments to 1 output stream(s)
The aligned pair:
17_213_1598_F3 67 chr2 154876116 255 48M = 154876139 56 CACACACACACACACACACACACACACACACACACACACACACACACA LbRL\TMUVBH\TQ.8[[bOM```LG]^_LJ^\_SN]bcca_aaabbW XA:i:1 MD:Z:48 NM:i:0 CM:i:1
17_213_1598_F5-BC 131 chr2 154876140 255 33M = 154876115 -58 CACACACACACACACACACACACACACAACAGC @NcPIAE`c^E:S_^KI]`cPGZZSED)!E1!3 XA:i:0 MD:Z:29A3 NM:i:1 CM:i:5
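The second column of each SAM record above is the bitwise FLAG (67 and 131). As a reading aid, here is a small Python sketch that decodes those values using the bit definitions from the SAM specification. The helper name `decode_flag` and the short labels are my own, not part of any library.

```python
# Bit meanings as defined in the SAM specification (labels are shorthand).
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


for flag in (67, 131):
    print(flag, decode_flag(flag))
```

Both records decode as paired and properly paired, with 67 marking the first mate and 131 the second. Neither value has the 0x10 (reverse strand) bit set.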
F3-ends (50b) only
[markus@q34 run]$ bowtie -f -p 8 -C -S --sam-nohead --sam-nosq -s 1000000 -u 1000 \
  -Q $dir/F3_QV.qual ~/p2/indexes/bowtie/hg19_c $dir/F3.csfasta algn.sam
# reads processed: 1000
# reads with at least one reported alignment: 319 (31.90%)
# reads that failed to align: 681 (68.10%)
Reported 319 alignments to 1 output stream(s)
F5-ends (35b) only
[markus@q34 run]$ bowtie -f -p 8 -C -S --sam-nohead --sam-nosq -s 1000000 -u 1000 \
  -Q $dir/F5_QV.qual ~/p2/indexes/bowtie/hg19_c $dir/F5.csfasta algn.sam
# reads processed: 1000
# reads with at least one reported alignment: 287 (28.70%)
# reads that failed to align: 713 (71.30%)
Reported 287 alignments to 1 output stream(s)
Running TopHat gave a decent result, so plenty of reads should map in pairs.
I'd be grateful if anyone can help me out or provide any hint.
Edit: I got it working now; I may have forgotten the --fr option for the full data set.
Last edited by mackan; 04-20-2011, 03:46 AM.