-
I'm also seeing this error with GATK (v1.4-5-g253a07f) during indel realignment. I had never encountered it until today: 24 of 28 files processed fine, but 4 fail prematurely with a 'malformed' BAM error on entries that are supposedly missing quality scores but have between 30 and 68 bases.
-
Thanks, I'll probably remove short reads like you suggest as they are likely to do more harm than good!
-
I'm not surprised that the "doesn't pass QC" flag is set on that read. A * by itself in the QUAL field like that would normally mean "no quality stored", which would indeed be a malformed line. However, a single * is ambiguous for a 1-base read, since * is also a valid Phred+33 quality character (ASCII 42, i.e. Q9, a poor base call).
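To make the ambiguity concrete, here is a minimal Python sketch of how a QUAL field could be interpreted. The function names `phred33` and `classify_qual` are made up for illustration and are not part of GATK or any library; the only assumption is the standard Phred+33 encoding.

```python
def phred33(qual):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in qual]


def classify_qual(seq, qual):
    """Label how the QUAL field relates to SEQ for one SAM record."""
    if qual == "*":
        # A lone '*' means "qualities not stored", unless the read is a
        # single base, where it could equally be one Q9 base call.
        return "ambiguous" if len(seq) == 1 else "missing"
    return "ok" if len(qual) == len(seq) else "mismatch"


print(phred33("*"))             # [9]
print(classify_qual("C", "*"))  # ambiguous
```

A parser that resolves the ambiguous case as "missing" would report 1 base and 0 quals, which matches the error message in the original post.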
Frankly, you'd be well off removing such short reads, since their mapping is going to be totally unreliable and they won't contribute anything meaningful to your results. Presumably whatever program you're using to do the adaptor trimming is capable of not returning reads below a certain size.
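If the trimming program can't enforce a minimum length itself, a filter on the SAM text could be sketched roughly as follows. This is only an illustration: `drop_short_reads` is a hypothetical helper, and the default cutoff of 20 bases is an arbitrary placeholder, not a recommendation.

```python
def drop_short_reads(sam_lines, min_len=20):
    """Yield header lines and alignments whose SEQ has at least min_len bases."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through untouched
            continue
        fields = line.rstrip("\n").split("\t")
        seq = fields[9]  # SEQ is the 10th mandatory SAM column
        if seq == "*" or len(seq) >= min_len:
            yield line  # keep: sequence not stored, or long enough
```

For paired-end data, dropping one mate after alignment leaves the other mate's flags inconsistent, so filtering before alignment is the cleaner option.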
-
The offending read:
T_SOLEXA-GA01_r:6:9:1538:8018 528 chr7 111016499 0 1M * 0 0 C * XT:A:R NM:i:0 XN:i:1 X0:f:1.36217e+08 XM:i:0 XO:i:0 XG:i:0 MD:A:1 RG:Z:NR_49w XI:Z:AACTCCG YI:Z:.--/-2/ ZQ:A:L
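For anyone reading along, the fields of that record can be pulled apart with a few lines of Python. The bit values follow the SAM specification; the short names in `FLAG_BITS` are my own labels, and the record string is truncated after the first two optional tags.

```python
# SAM FLAG bits (values from the SAM spec; names are informal shorthand).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# The forum rendering turned tabs into spaces, so split on any whitespace.
record = "T_SOLEXA-GA01_r:6:9:1538:8018 528 chr7 111016499 0 1M * 0 0 C * XT:A:R NM:i:0"
fields = record.split()
flag, cigar, seq, qual = int(fields[1]), fields[5], fields[9], fields[10]

print(decode_flag(flag))  # ['reverse_strand', 'qc_fail']
print(cigar, seq, qual)   # 1M C *
```

So the read is a single base (CIGAR 1M, SEQ C) on the reverse strand, flagged as failing QC, with * in the QUAL column.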
-
Along the same line of inquiry as ulz_peter, have a look in the SAM/BAM file you used as input to see if the original read is malformed or if this is being introduced along the way. It's odd for a read to be only 1 base long.
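One way to run that check, sketched in Python under the assumption that the BAM has been converted to SAM text first (for example by piping the output of `samtools view`). `find_suspect_reads` is a hypothetical helper, not part of any toolkit.

```python
def find_suspect_reads(sam_lines):
    """Yield (name, n_bases, n_quals) for records whose SEQ and QUAL disagree."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        name, seq, qual = fields[0], fields[9], fields[10]
        if seq == "*":
            continue  # no sequence stored, nothing to compare
        # Count a lone '*' as zero stored qualities, as the GATK message does.
        n_quals = 0 if qual == "*" else len(qual)
        if n_quals != len(seq):
            yield name, len(seq), n_quals
```

Running this over both the original input and the file that GATK rejects (for instance with `sys.stdin` as `sam_lines`) should show whether the offending read was already present or was introduced by an intermediate step.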
-
That looks pretty strange: GATK found a read with only one base and no associated quality. Do you do any kind of adaptor sequence removal or quality trimming? Anyway, I've never seen that error...
-
What's causing malformed reads?
Hello everyone,
My first post here so please excuse any etiquette mistakes. I'm working through a GATK pipeline for sequence data from multiple individuals. I have got to the local indel realignment phase and midway through the realignment process (target locator already run) I get an error message which kills the process:
ERROR MESSAGE: SAM/BAM file SAMFileReader{..file path} is malformed: BAM file has a read with mismatching number of bases and base qualities. Offender: T_SOLEXA-GA02:6:9:1538:8018 [1 bases] [0 quals]
I have found a way around this using -filterMBQ, which skips malformed reads, but I am curious about the underlying cause. Is it most likely that I did something wrong in the pipeline (for example in file formatting) that created a mismatch between bases and base qualities, or can such mismatches occur at low frequency as a normal part of the sequencing process? The fact that a malformed read filter exists makes me think these can just occur 'naturally', but I have no idea why.
If anyone has thoughts on or experience with this problem, I'd really appreciate hearing from you. I'm apprehensive about moving on with the pipeline without understanding the root of the problem.
Best,
Rubal7