Seqanswers Leaderboard Ad
Collapse
Announcement
Collapse
No announcement yet.
X
-
Thanks for the responses, that is very helpful. That certainly makes sense to give unmapped mates the mapped mate coordinates for sorting purposes, however it is confusing that BWA sets other fields when the read is unmapped. Thank you for explaining to ignore those fields, as I missed that in the SAM specification.
-
Originally posted by swbarnes2 View PostSecond, sam specs call for the unmapped mate of a mapped read to be given the mapping position of the mapped mate. This is so that the unmapped read will sort next to its mapped mate. But that's another source of reads that have the 4 flagged, and have mapping coordinates
Leave a comment:
-
To clairfy a bit, bwa concatantes reference sequences, so if a read hangs off of one sequence onto another, it will be both given the right mapping coordinates, and give then 4 flag.
Second, sam specs call for the unmapped mate of a mapped read to be given the mapping position of the mapped mate. This is so that the unmapped read will sort next to its mapped mate. But that's another source of reads that have the 4 flagged, and have mapping coordinates.
So this is why you have reads like that.
Leave a comment:
-
On further reflection, you could argue Picard's http://picard.sourceforge.net/explain-flags.html is broken as it should ignore these bits when 0x4 is set. Perhaps it would be more helpful to put "(irrelevant)" or something next to them as that would be more useful.
Leave a comment:
-
Originally posted by Tally View PostUsing this tool to interpret the samflag http://picard.sourceforge.net/explain-flags.html, it seems that this read is:
Paired
Proper pair
Read unmapped
Mate reverse strand
Second in pair
-how can the read be in a proper pair, yet unmapped?
-how can the read have a mapping location and a non-zero mapping quality if it is unmapped?
Thus BWA is within the specification, but nevertheless the output is confusing and arguably a bug.
Originally posted by Tally View PostOn a previous occasion with the same data, I encountered errors when adding RG headers with Picard:
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 2096393, Read name HWUSI-EAS091C_0049:1:29:16923:17016#0, MAPQ should be 0 for unmapped read.
Again, BWA is within the specification, but it would be cleaner to use MAPQ of zero which is what Picard checks for with its default stringency.
So again, BWA is within the specification, but nevertheless the output is confusing and arguably a bug.
Leave a comment:
-
BWA samflag inconsistencies
Dear bioinformatics community,
I have used BWA to align paired-end reads to a mammalian genome. I extracted orphan reads from the alignment with this command:
awk 'and ($2, 0x0004) && ! and ($2, 0x0008) && $3!~/Un/' alignment.sam > orphans.sam
I have noticed some apparent errors in the output, for example:
HWUSI-EAS091C_0049:8:10:13851:9893#0 167 27 39960072 29 11S64M 28 167 0 TGTGAATGTAGTGCCCATGAAGGTGGATGCAGAGATCCGGCAGTCGTTGCAGTGCACCCTGCTGTGTCCCCGTAT IIHIIIIIIIIGIIIIIIHIIIIFIIHIIIHHIGIIIIIFIIHHHIIIFHHIGIIIHHIGHGGEEHGGIHGGDEE XT:A:M NM:i:0 SM:i:29 AM:i:29 XM:i:0 XO:i:0 XG:i:0 MD:Z:64
Using this tool to interpret the samflag http://picard.sourceforge.net/explain-flags.html, it seems that this read is:
Paired
Proper pair
Read unmapped
Mate reverse strand
Second in pair
-how can the read be in a proper pair, yet unmapped?
-how can the read have a mapping location and a non-zero mapping quality if it is unmapped?
Another example:
HWUSI-EAS091C_0049:8:17:3987:3357#0 117 6 84719005 37 9M1D66M Un0684.1-12705 6795 0 ATGATGATGTGGTGATGATGATGATGATGATGATTATGATGATGATGATGATGATGATGGTGATGATGATGATGG IIDIIIGIIG<IHIHHHIEIIGIHGHIIIHDIHFIIIIIIGIIHIIHHIIDIIIIGIIIIIIGIHIIIIIIIIII XT:A:U NM:i:4 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:3 XO:i:1 XG:i:1 MD:Z:9^A25G6G32A0
Paired
Unmapped
Read reverse
Mate reverse
First in pair
-how can it be ‘mapped to reverse strand’ yet unmapped?
-how can it have a mapping location and a non-zero mapping quality if it unmapped?
On a previous occasion with the same data, I encountered errors when adding RG headers with Picard:
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Record 2096393, Read name HWUSI-EAS091C_0049:1:29:16923:17016#0, MAPQ should be 0 for unmapped read.
There were countless of these errors, and I was forced to add the flag "VALIDATION_STRINGENCY=LENIENT" to the command line (a helpful tip from this site) to complete task.
Why is BWA giving unmapped reads non-zero mapping qualities?
Any help here would be much appreciated, because other than this I am rather fond of the speed and accuracy of BWA compared to other aligners so it would be nice to be able to correct this issue.
Thanks in advance,
TallyTags: None
Latest Articles
Collapse
-
by seqadmin
This year’s Advances in Genome Biology and Technology (AGBT) General Meeting commemorated the 25th anniversary of the event at its original venue on Marco Island, Florida. While this year’s event didn’t include high-profile musical performances, the industry announcements and cutting-edge research still drew the attention of leading scientists.
The Headliner
The biggest announcement was Roche stepping back into the sequencing platform market. In the years since...-
Channel: Articles
03-03-2025, 01:39 PM -
-
by seqadmin
The human gut contains trillions of microorganisms that impact digestion, immune functions, and overall health1. Despite major breakthroughs, we’re only beginning to understand the full extent of the microbiome’s influence on health and disease. Advances in next-generation sequencing and spatial biology have opened new windows into this complex environment, yet many questions remain. This article highlights two recent studies exploring how diet influences microbial...-
Channel: Articles
02-24-2025, 06:31 AM -
ad_right_rmr
Collapse
News
Collapse
Topics | Statistics | Last Post | ||
---|---|---|---|---|
Started by seqadmin, 03-03-2025, 01:15 PM
|
0 responses
179 views
0 likes
|
Last Post
by seqadmin
03-03-2025, 01:15 PM
|
||
Started by seqadmin, 02-28-2025, 12:58 PM
|
0 responses
272 views
0 likes
|
Last Post
by seqadmin
02-28-2025, 12:58 PM
|
||
Started by seqadmin, 02-24-2025, 02:48 PM
|
0 responses
659 views
0 likes
|
Last Post
by seqadmin
02-24-2025, 02:48 PM
|
||
Started by seqadmin, 02-21-2025, 02:46 PM
|
0 responses
268 views
0 likes
|
Last Post
by seqadmin
02-21-2025, 02:46 PM
|
Leave a comment: