Good to know, thanks!
-
Hi all,
I am having similar problems filtering my bowtie2 output, although not as severe as the originally reported numbers. I mapped reads back to a de novo assembled transcriptome with the following settings:
--all --end-to-end --score-min L,-0.1,-0.1 --no-discordant --no-mixed
44223325 reads; of these:
  44223325 (100.00%) were paired; of these:
    7691237 (17.39%) aligned concordantly 0 times
    25521175 (57.71%) aligned concordantly exactly 1 time
    11010913 (24.90%) aligned concordantly >1 times
82.61% overall alignment rate
I am aware that the --all setting leaves the MAPQ without much meaning, so using it to filter for uniquely mapped reads is not possible. I could redo the analysis to avoid this problem, but I would rather continue working with the same dataset to keep things consistent. I have not found any mention of this setting causing other problems, though.
Is there something I am missing, or did my settings do something unexpected?
Best,
Jan Philip
Comment
-
The presence of an XS auxiliary tag doesn't mean that an alignment isn't unique (n.b., "unique" isn't really a useful term, there's a reason for MAPQ scores). bowtie2 should only count an alignment as not unique if the XS and AS scores are the same.
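To see how this distinction plays out in a given file, here is a minimal sketch that tallies alignments by how XS compares to AS. The function name `count_as_xs` is made up for illustration, and it assumes that records without an AS tag (such as unaligned reads) can simply be skipped.

```shell
# Hypothetical helper: tally SAM records by the relation between AS:i and XS:i.
# Usage: count_as_xs my_filename.sam
count_as_xs() {
    awk -F '\t' '
        /^@/ { next }                        # skip header lines
        {
            ascore = ""; xscore = ""
            for (i = 12; i <= NF; i++) {     # optional tags start at column 12
                if ($i ~ /^AS:i:/) ascore = substr($i, 6)
                if ($i ~ /^XS:i:/) xscore = substr($i, 6)
            }
            if (ascore == "") next           # assumption: no AS tag means nothing to compare
            if (xscore == "")                     n["no_XS"]++
            else if (xscore + 0 == ascore + 0)    n["XS_equals_AS"]++
            else                                  n["XS_differs"]++
        }
        END {
            printf "no_XS\t%d\nXS_equals_AS\t%d\nXS_differs\t%d\n",
                   n["no_XS"], n["XS_equals_AS"], n["XS_differs"]
        }' "$1"
}
```

Only the XS_equals_AS group corresponds to the "not unique" case described above; records in XS_differs have a second-best hit with a different score.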
Note that you're typically best off simply filtering by MAPQ score.
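For anyone who wants to try the MAPQ route, a minimal sketch follows. The function name `filter_by_mapq` and the threshold of 10 in the usage comment are placeholders, not recommended values; the only thing it relies on is that MAPQ is column 5 of a SAM record.

```shell
# Hypothetical helper: keep header lines plus alignments with MAPQ >= threshold.
# Usage: filter_by_mapq 10 my_filename.sam > my_filename.mapq10.sam
filter_by_mapq() {
    # $1 = minimum MAPQ (pick a value that suits your data), $2 = SAM file
    awk -v q="$1" -F '\t' '/^@/ || $5 >= q' "$2"
}
```

If samtools is installed, `samtools view -h -q 10 my_filename.sam` should give the same result.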
Comment
-
Thank you for the swift reply. However, I do not fully understand how uniqueness is defined, if not by the fact that a given read maps to only one location (given the set score thresholds etc.). I understand that if the first location a read maps to is significantly better than the alternate locations (by MAPQ score, for example), it could probably also be considered unique in some respect.
However, if it is true that uniquely mapped reads can also have an XS tag, I would have expected the number of reads without it to be significantly lower, not higher, than the number reported by bowtie. So I am still pretty puzzled by my results.
I will try to have a look at your posts regarding the MAPQ scores and seriously consider redoing my analyses.
Comment
-
Therein lies the problem: there is no single definition of "uniqueness". There are multiple incompatible definitions. Further, if we relax the --score-min settings enough, then by some definitions there will never be any unique alignments. This is why MAPQ is generally a more useful concept, and you'd be better served by just forgetting about the term "unique" in this context.
Comment
-
Thanks for the help, I will try to figure out how to best solve the issue for my experiments.
I found this, which might be of interest to others trying to understand how bowtie2 assigns scores: link. There are also some interesting thoughts on uniqueness discussed in this and an older blog post.
Comment
-
Hi all,
This worked for me, but I don't know if it is a general solution. If you set the -k parameter in Bowtie2 to >=2, every read with more than one valid alignment should appear at least twice in your SAM file. You can use that to remove reads whose name appears more than once in the file my_filename.sam. This way you don't have to understand how Bowtie sets tags and flags.
prefix="my_filename"
# Collect read names that occur more than once among the alignment (non-header) lines
grep -v "^@" "$prefix.sam" | cut -f 1 | sort | uniq -d > "$prefix.toremove"
# Keep the header and every alignment whose read name is not in that list
grep -vwF -f "$prefix.toremove" "$prefix.sam" > "$prefix.unique.sam"
rm "$prefix.toremove"
Last edited by keo; 03-30-2017, 07:18 PM.
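A variant of the approach above, sketched under two assumptions: the reads are single-end (mates of a pair share a read name, so every pair would show up twice), and matching should happen on the read-name column only rather than anywhere in the line. The function name `keep_single_occurrence` is made up for illustration.

```shell
# Hypothetical helper: keep header lines plus reads whose name occurs exactly once.
# Usage: keep_single_occurrence my_filename.sam > my_filename.unique.sam
keep_single_occurrence() {
    # The file is read twice: the first pass counts names, the second pass filters.
    awk -F '\t' '
        NR == FNR { if ($0 !~ /^@/) seen[$1]++; next }   # pass 1: count read names
        /^@/ || seen[$1] == 1                            # pass 2: header or name seen once
    ' "$1" "$1"
}
```

This avoids the temporary file, at the cost of holding one counter per read name in memory.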
Comment
-
Originally posted by dpryan:
The presence of an XS auxiliary tag doesn't mean that an alignment isn't unique (n.b., "unique" isn't really a useful term, there's a reason for MAPQ scores). bowtie2 should only count an alignment as not unique if the XS and AS scores are the same.
Note that you're typically best off simply filtering by MAPQ score.
MAPQ=39 ... AS:i:0 XS:i:0
39 seems like an arbitrary value. In my case, the lines that don't have an XS tag have a MAPQ of 42.
Comment