NH:n:x in BAM output
-
Maybe my last sentence was not right after all. It seems that the NH:i:1 and MAPQ 50 counts stay constant when changing -g (except if -g, in which case all the reads are (erroneously) reported as NH:i:1 and MAPQ 50). However, the counts of reads with NH:i > 1 and MAPQ < 50 do not match the number of multiple-mapping reads given in align_summary. I would really like to understand how the align_summary report is computed.
-
Originally posted by giorgifm:
Dear all,
... Tophat2 reports the NH flawlessly, therefore there must be a way to run Bowtie(2) with an option to generate this report.
Federico Giorgi (CUMC) & Davide Scaglione (IGA)
From the TopHat2 manual:
"If there are more alignments with the same score than this number, TopHat will randomly report only this many alignments"
But not reporting them should not mean that they are not there, as align_summary.txt seems to understand. However, in my case (TopHat 2.0.9), all reads with more multiple mappings than -g are reported with MAPQ 50 and NH:i:1.
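To make the comparison with align_summary.txt concrete, here is a minimal sketch that tallies distinct reads by NH and MAPQ from SAM text (for example the output of samtools view -h). The function name tally_reads and the unique_mapq=50 threshold are my own assumptions for illustration; a record without an NH tag is treated as NH = 1.

```python
def tally_reads(sam_lines, unique_mapq=50):
    """Count distinct mapped reads, those with NH > 1, and those with low MAPQ.

    A "read" is identified by its name plus its mate bits (0x40 / 0x80),
    so the two mates of a pair are counted separately.
    """
    per_read = {}
    for line in sam_lines:
        if line.startswith("@"):        # header line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4:                  # unmapped record
            continue
        nh = 1                          # assumption: missing NH means one hit
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        key = (fields[0], flag & 0xC0)
        max_nh, min_mapq = per_read.get(key, (0, 255))
        per_read[key] = (max(max_nh, nh), min(min_mapq, mapq))
    return {
        "reads": len(per_read),
        "nh_gt_1": sum(1 for nh, _ in per_read.values() if nh > 1),
        "mapq_below_unique": sum(1 for _, q in per_read.values() if q < unique_mapq),
    }
```

If the two counts returned here agree with each other but not with align_summary.txt, the discrepancy is in how the summary counts reads rather than in the BAM tags.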
-
Originally posted by Gianza:
while XS:i the score for the second-best one.
Or rather, if the reported AS score is equal to the Q field of the BAM, then we are looking at (one of) the best hits? In this case the problem remains, and it would be simpler to just count the number of reads/fragments and check for their Q scores.
So what I believe now is that the only way to obtain the NH is to write an addNH script that processes the entire readname-sorted BAM and adds the NH field based on the Qs and the FLAGs contained within the BAM itself.
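A rough sketch of what such an addNH script could look like, working on SAM text (e.g. piped from samtools view -h on a name-sorted BAM). The name add_nh is hypothetical, and this version simply counts the mapped records present for each read and mate using the FLAG bits. It can therefore only see hits that the aligner actually wrote to the file, and it does not use the Q values.

```python
from collections import Counter
from itertools import groupby


def add_nh(sam_lines):
    """Yield SAM lines with NH:i:<n> appended to every mapped record.

    Assumes the input is grouped by read name (name-sorted).
    n = number of mapped, non-supplementary records for that read and mate.
    """
    lines = (line.rstrip("\n") for line in sam_lines)
    for name, group in groupby(lines, key=lambda l: l.split("\t", 1)[0]):
        group = list(group)
        if name.startswith("@"):            # header lines pass through untouched
            yield from group
            continue
        counts = Counter()
        for rec in group:
            flag = int(rec.split("\t")[1])
            if not flag & 0x4 and not flag & 0x800:
                counts[flag & 0xC0] += 1    # key on the mate bits 0x40 / 0x80
        for rec in group:
            fields = rec.split("\t")
            flag = int(fields[1])
            if flag & 0x4:                  # unmapped: leave as is
                yield rec
                continue
            # drop any existing NH tag before appending the recomputed one
            fields = fields[:11] + [t for t in fields[11:] if not t.startswith("NH:i:")]
            fields.append(f"NH:i:{counts[flag & 0xC0]}")
            yield "\t".join(fields)
```

One way to use it would be to read from sys.stdin and print each yielded line, then pipe the result back into samtools view -b.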
-
To remove multiple hit reads, I am also using
samtools view -q4 -F4 file.bam
This follows the explanation given in another forum post.
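For anyone unsure what the two options in that command do: -q4 skips records with MAPQ below 4, and -F4 skips records that have the unmapped bit (0x4) set. The same test written out in Python, as a sketch only (passes_filter is a name I made up, not part of samtools):

```python
def passes_filter(sam_line, min_mapq=4, exclude_flags=0x4):
    """Mimic `samtools view -q4 -F4` for one SAM record.

    Keeps the record only if none of the excluded FLAG bits are set
    and its MAPQ is at least min_mapq.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag, mapq = int(fields[1]), int(fields[4])
    return (flag & exclude_flags) == 0 and mapq >= min_mapq
```

Whether a MAPQ threshold of 4 really separates unique from multiple hits depends on how the aligner assigns MAPQ, so this is a heuristic and not a replacement for NH.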
I concur that the lack of an NH:i:n field in BWA, and especially in Bowtie, is unbelievable; I see no reason why the programmers omitted this field. If it weren't for "standards" and complaining reviewers, I would switch to other aligners like ERNE at once.
-
Reading the Bowtie2 manual, it seems that the AS:i tag reports the score for the best alignment, while XS:i the score for the second-best one. Thus one could compare these two values to figure out whether a read is mapping ambiguously or not.
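A small sketch of that comparison on a single SAM record. The function name and the four labels are my own. It relies on the Bowtie2 manual's statement that XS:i is only written when more than one alignment was found for the read; the manual also notes that for concordant pairs XS:i can be greater than AS:i, which is why the test below is >= and not ==.

```python
def classify_by_scores(sam_line):
    """Label one SAM record by comparing its AS:i and XS:i tags."""
    fields = sam_line.rstrip("\n").split("\t")
    if int(fields[1]) & 0x4:
        return "unaligned"
    # collect integer-typed optional tags, e.g. "AS:i:-5" -> {"AS": -5}
    tags = {f[:2]: int(f[5:]) for f in fields[11:] if f[2:5] == ":i:"}
    if "AS" not in tags:
        return "unknown"            # aligner did not write a score
    if "XS" not in tags:
        return "unique"             # no second alignment was found
    if tags["XS"] >= tags["AS"]:
        return "ambiguous"          # another hit scores at least as well
    return "best-of-several"        # other hits exist but score worse
```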
Still, I'm really frustrated at not being able to control the maximum number of mismatches and/or gaps, as TopHat2 easily does with two dedicated options.
Cheers
Davide
-
NH:n:x in BAM output
Dear all,
the optional tag for the number of hits in SAM output, NH, is amazingly missing from the two most popular aligners (namely BWA and Bowtie). However, Tophat2 reports the NH flawlessly, therefore there must be a way to run Bowtie(2) with an option to generate this report.
So far, too much of the community is resorting to the samtools view -q3 trick, which filters out alignments with a mapping quality below 3, i.e. those with less than roughly a 50% chance of being placed correctly (by definition, a multiple hitting read will have all its n equally good alignments with a probability of at most 1/n of being the right one).
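The arithmetic behind that threshold, as a sketch under the textbook definition MAPQ = -10 * log10(probability the placement is wrong). The helper name max_mapq is ours, and real aligners assign MAPQ with their own heuristics, so this only gives the theoretical upper bound for a read with n equally good hits.

```python
import math


def max_mapq(n_hits):
    """Upper bound on MAPQ for a read with n_hits equally good alignments.

    Each alignment is correct with probability at most 1/n_hits,
    so it is wrong with probability at least 1 - 1/n_hits.
    """
    p_wrong = 1.0 - 1.0 / n_hits
    if p_wrong == 0.0:              # a single hit: no bound from this argument
        return math.inf
    return -10.0 * math.log10(p_wrong)
```

For n = 2 the bound is about 3.01, and it drops further as n grows, which is why a cutoff around MAPQ 3 or 4 is used to separate multiple hits from unique ones.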
Thanks for any hints. I hope this post will become viral.
Federico Giorgi (CUMC) & Davide Scaglione (IGA)