Following Ben's suggestion, I mapped the two mates separately, but still only a small percentage of the reads get mapped.
===mate1=====
# reads processed: 130653415
# reads with at least one reported alignment: 47875358 (36.64%)
# reads that failed to align: 82659307 (63.27%)
# reads with alignments suppressed due to -m: 118750 (0.09%)
Reported 47875358 alignments to 1 output stream(s)
====mate2=====
# reads processed: 130653415
# reads with at least one reported alignment: 35773578 (27.38%)
# reads that failed to align: 94798249 (72.56%)
# reads with alignments suppressed due to -m: 81588 (0.06%)
Reported 35773578 alignments to 1 output stream(s)
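As a sanity check on the numbers above, here is a minimal Python sketch that parses the bowtie summary lines and recomputes the aligned percentage for each mate. The function names (`parse_summary`, `percent_aligned`) are made up for illustration; the only thing taken from bowtie is the wording of the summary lines as quoted in this post.

```python
import re

# Summary line labels, copied from the bowtie output quoted in this post.
PATTERNS = {
    "processed": r"reads processed:\s*(\d+)",
    "aligned": r"reads with at least one reported alignment:\s*(\d+)",
    "failed": r"reads that failed to align:\s*(\d+)",
    "suppressed": r"reads with alignments suppressed due to -m:\s*(\d+)",
}


def parse_summary(text):
    """Return a dict of read counts found in a bowtie summary."""
    counts = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            counts[key] = int(match.group(1))
    return counts


def percent_aligned(counts):
    """Percentage of processed reads with at least one reported alignment."""
    return 100.0 * counts["aligned"] / counts["processed"]


MATE1 = """# reads processed: 130653415
# reads with at least one reported alignment: 47875358 (36.64%)
# reads that failed to align: 82659307 (63.27%)
# reads with alignments suppressed due to -m: 118750 (0.09%)"""

MATE2 = """# reads processed: 130653415
# reads with at least one reported alignment: 35773578 (27.38%)
# reads that failed to align: 94798249 (72.56%)
# reads with alignments suppressed due to -m: 81588 (0.06%)"""

for label, text in (("mate1", MATE1), ("mate2", MATE2)):
    counts = parse_summary(text)
    # The three outcome categories should add up to the reads processed.
    total = counts["aligned"] + counts["failed"] + counts["suppressed"]
    assert total == counts["processed"]
    print(label, f"{percent_aligned(counts):.2f}% aligned")
```

The three categories do add up to the total for both mates, so the summaries are at least internally consistent.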
If bowtie is working correctly, then I see two explanations: either the human refseqs are far from complete, or the sample is contaminated with a lot of pre-mRNA or genomic DNA.
Any comments are welcome.
Iris
-
A more useful way to pose this question in the future is to find one example of a read or a pair of reads that you expected would map, but did not.
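One way to pull out such an example is sketched below in Python. It assumes bowtie's default tab-separated output, where the first column is the read name, and assumes the names match the first whitespace-delimited token of each FASTA header. The function names and the toy data are hypothetical. bowtie also has an `--un` option that writes unaligned reads to a file, which may be simpler if re-running the alignment is an option.

```python
def aligned_names(bowtie_lines):
    """Collect read names from the first column of bowtie's default output."""
    names = set()
    for line in bowtie_lines:
        line = line.rstrip("\n")
        if line:
            names.add(line.split("\t", 1)[0].split()[0])
    return names


def read_fasta(lines):
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, seq = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(seq)
            # Keep only the first token of the header as the read name.
            name, seq = (line[1:].split() or [""])[0], []
        elif line:
            seq.append(line)
    if name is not None:
        yield name, "".join(seq)


def unaligned_reads(fasta_lines, aligned, limit=1):
    """Return up to `limit` reads whose names are absent from `aligned`."""
    found = []
    for name, seq in read_fasta(fasta_lines):
        if name not in aligned:
            found.append((name, seq))
            if len(found) >= limit:
                break
    return found


# Toy data, made up for illustration only.
toy_fasta = [">r1", "ACGTACGT", ">r2 extra text", "TTTTGGGG", ">r3", "CCCC", "AAAA"]
toy_hits = ["r1\t+\tNM_000001\t10\tACGTACGT\tIIIIIIII\t0\t"]

examples = unaligned_reads(toy_fasta, aligned_names(toy_hits), limit=2)
for name, seq in examples:
    print(name, seq)
```

With real files, the lists can be replaced by open file handles. A read found this way can then be checked by hand, for example with BLAST against the genome, to see where it actually comes from.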
-
Originally posted by Ben Langmead:
Hi Iris,
Paired-end alignment against the transcriptome is tricky, because the distance between the mates in genome space is affected both by fragment length and by the "shape" of the transcriptome. E.g. when a fragment spans a long intron, the intron size is in some sense "added" to the fragment length. A better way of measuring sequencing quality is to break the mates apart and align both files in an unpaired manner. And an even better way is to use TopHat or another tool that attempts spliced mapping, so that alignments that span splice junctions can be found as well.
Hope that helps,
Ben
I see. I'll map the two mates separately.
Iris
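For reference, mapping the mates separately amounts to dropping `-1`/`-2` from the original command and giving bowtie one reads file per run. The Python sketch below only builds the command lines. It reuses the options and the `trx` index name from the original post; the `.bowtie` output file names are an assumption made for this example.

```python
import shlex


def unpaired_commands(index="trx", mates=("mate1.fa", "mate2.fa"),
                      threads=5, max_hits=10):
    """Build one unpaired bowtie command (as an argv list) per mate file."""
    commands = []
    for mate in mates:
        # Hypothetical output name: mate1.fa -> mate1.bowtie
        out = mate.rsplit(".", 1)[0] + ".bowtie"
        commands.append([
            "bowtie", "-p", str(threads), "-m", str(max_hits),
            "-f", index, mate, out,
        ])
    return commands


for argv in unpaired_commands():
    # Print a shell-ready version; pass argv to subprocess.run to execute.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each argv list can be passed to `subprocess.run(argv, check=True)` if bowtie is on the PATH.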
-
Hi Iris,
Paired-end alignment against the transcriptome is tricky, because the distance between the mates in genome space is affected both by fragment length and by the "shape" of the transcriptome. E.g. when a fragment spans a long intron, the intron size is in some sense "added" to the fragment length. A better way of measuring sequencing quality is to break the mates apart and align both files in an unpaired manner. And an even better way is to use TopHat or another tool that attempts spliced mapping, so that alignments that span splice junctions can be found as well.
Hope that helps,
Ben
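To make the intron point concrete, here is a small Python sketch that converts a fragment's position on a transcript into its span on the genome. The exon coordinates are invented for illustration, and the code assumes a forward-strand transcript with exons given as half-open (start, end) genomic intervals in transcript order.

```python
def transcript_to_genome(exons, offset):
    """Map a 0-based transcript offset to a 0-based genomic position."""
    remaining = offset
    for start, end in exons:
        length = end - start
        if remaining < length:
            return start + remaining
        remaining -= length
    raise ValueError("offset lies beyond the end of the transcript")


def genomic_span(exons, frag_start, frag_len):
    """Distance on the genome covered by a fragment of the transcript."""
    first = transcript_to_genome(exons, frag_start)
    last = transcript_to_genome(exons, frag_start + frag_len - 1)
    return last - first + 1


# Toy transcript: two 100-base exons separated by a 1000-base intron.
toy_exons = [(0, 100), (1100, 1200)]

within_exon = genomic_span(toy_exons, frag_start=10, frag_len=50)
across_intron = genomic_span(toy_exons, frag_start=20, frag_len=150)
print("fragment inside one exon:", within_exon)
print("fragment spanning the intron:", across_intron)
```

In this toy case a 150-base fragment that crosses the junction covers 1150 bases of genome, i.e. the fragment length plus the 1000-base intron, while a fragment contained in a single exon keeps its own length.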
-
transcriptome mapping with bowtie -- only 22% reads hit
To check how well the sequencing was done, I used bowtie to map all the reads from an RNA-seq run directly to human refseqs. Only 22% of the reads have at least one hit and 78% failed to align, which I think is impossible.
My command is:
"bowtie -p 5 -m 10 -f trx -1 mate1.fa -2 mate2.fa >output"
The output:
# reads processed: 130653415
# reads with at least one reported alignment: 28765542 (22.02%)
# reads that failed to align: 101882364 (77.98%)
# reads with alignments suppressed due to -m: 5509 (0.00%)
Reported 28765542 paired-end alignments to 1 output stream(s)
Any idea why so few reads got mapped? What did I do wrong?
Thank you very much,
Iris