
  • jfb
    replied
    I too am seeing this error using GATK (v1.4-5-g253a07f) during indel realignment. I've never encountered it until today: 24 out of 28 files processed fine, but 4 of them fail prematurely due to a 'malformed' bam error on entries that are supposedly missing the quality score but have between 30 and 68 bases.



  • Rubal7
    replied
    Thanks, I'll probably remove short reads like you suggest as they are likely to do more harm than good!



  • dpryan
    replied
    I'm not surprised that the "doesn't pass QC" flag is set on that read. A * by itself in the QUAL field like that normally would mean "no quality stored", which would indeed be a malformed line. However, a single * is ambiguous in this case, since it's also a possible QUAL+33 score (for a crappy base call).
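
To make the ambiguity concrete, here is a minimal Python sketch, assuming Phred+33 encoding as mentioned above. The function names `phred_scores` and `interpret_qual` are hypothetical helpers for illustration, not part of any tool discussed in this thread.

```python
PHRED_OFFSET = 33  # Phred+33 (Sanger) encoding, as assumed in the post above


def phred_scores(qual):
    """Decode a QUAL string into a list of Phred scores."""
    return [ord(c) - PHRED_OFFSET for c in qual]


def interpret_qual(seq, qual):
    """Classify a QUAL field as 'present', 'missing', or 'ambiguous'.

    A lone '*' conventionally means "no quality stored", but for a
    1-base read it could equally be a single encoded quality value.
    """
    if qual == "*":
        return "ambiguous" if len(seq) == 1 else "missing"
    return "present"


print(phred_scores("*"))         # [9]
print(interpret_qual("C", "*"))  # ambiguous
```

Read as a quality character, `*` is ASCII 42, which decodes to Phred 9 under this offset, so a parser cannot tell the two readings apart for a 1-base read.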

    Frankly, you'd be better off removing such short reads, since their mapping is going to be totally unreliable and they won't contribute anything meaningful to your results. Presumably whatever program you're using to do the adaptor trimming is capable of not returning reads below a certain size.
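
If the trimmer cannot enforce a minimum length, short reads could also be dropped from SAM text afterwards. The following is only a sketch under stated assumptions: `drop_short_reads` is a hypothetical helper, the input is tab-separated SAM text, and the cutoff of 20 bases is an arbitrary placeholder rather than a recommendation from this thread.

```python
def drop_short_reads(sam_lines, min_len=20):
    """Yield header lines plus alignments whose SEQ has at least min_len bases.

    min_len=20 is an arbitrary illustrative threshold (an assumption).
    """
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines pass through untouched.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        seq = fields[9]  # SEQ is the 10th mandatory SAM column
        if seq != "*" and len(seq) >= min_len:
            yield line
```

Note that this does nothing about mate information, so for paired-end data filtering at the trimming step is likely the cleaner option.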



  • Rubal7
    replied
    The offending read:
    T_SOLEXA-GA01_r:6:9:1538:8018 528 chr7 111016499 0 1M * 0 0 C * XT:A:R NM:i:0 XN:i:1 X0:f:1.36217e+08 XM:i:0 XO:i:0 XG:i:0 MD:A:1 RG:Z:NR_49w XI:Z:AACTCCG YI:Z:.--/-2/ ZQ:A:L
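
The mandatory columns of that record can be pulled apart with a few lines of Python. This is a sketch only: the flag names in `SAM_FLAGS` are informal labels chosen here, `decode_flag` is a hypothetical helper, and the record is split on whitespace because the forum paste lost its tabs (optional tags are truncated for brevity).

```python
# Bit values follow the SAM FLAG field; the names are informal labels.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


record = "T_SOLEXA-GA01_r:6:9:1538:8018 528 chr7 111016499 0 1M * 0 0 C * XT:A:R"
fields = record.split()
flag, cigar, seq, qual = int(fields[1]), fields[5], fields[9], fields[10]

print(decode_flag(flag))  # ['reverse_strand', 'qc_fail']
print(cigar, seq, qual)   # 1M C *
```

Flag 528 is 512 + 16, so the read is marked as failing QC and mapped to the reverse strand, with a single base `C` and a lone `*` in the QUAL column.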



  • Rubal7
    replied
    Thanks guys, checking both these things now



  • dpryan
    replied
    Along the same line of inquiry as ulz_peter, have a look in the SAM/BAM file you used as input to see if the original read is malformed or if this is being introduced along the way. It's odd for a read to be only 1 base long.
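
One way to run that check on the input file is to scan its SAM text for records whose base and quality counts differ. The sketch below is an assumption-laden illustration: `find_length_mismatches` is a hypothetical helper that mirrors the counts reported in the error message, not GATK's own validation logic, and it treats a lone `*` as "no qualities stored".

```python
def find_length_mismatches(sam_lines):
    """Return (read_name, n_bases, n_quals) for records where the counts differ."""
    offenders = []
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        name, seq, qual = fields[0], fields[9], fields[10]
        n_bases = 0 if seq == "*" else len(seq)
        # Assumption: a lone '*' means zero stored qualities.
        n_quals = 0 if qual == "*" else len(qual)
        if n_bases != n_quals:
            offenders.append((name, n_bases, n_quals))
    return offenders
```

Running it on both the original input and each intermediate file should show at which step the offending record first appears.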



  • ulz_peter
    replied
    Looks pretty strange: it found a read with only one base and no associated quality. Do you do any kind of adaptor sequence removal or quality trimming? Anyway, I've never seen that error...



  • Rubal7
    started a topic What's causing malformed reads

    What's causing malformed reads

    Hello everyone,

    My first post here so please excuse any etiquette mistakes. I'm working through a GATK pipeline for sequence data from multiple individuals. I have got to the local indel realignment phase and midway through the realignment process (target locator already run) I get an error message which kills the process:

    ERROR MESSAGE: SAM/BAM file SAMFileReader{..file path} is malformed: BAM file has a read with mismatching number of bases and base qualities. Offender: T_SOLEXA-GA02:6:9:1538:8018 [1 bases] [0 quals]

    I have found a way around this using -filterMBQ, which skips malformed reads, but I am curious about the underlying cause of the problem. Is it most likely that a file-formatting mistake I made somewhere in the pipeline has created a mismatch between bases and base qualities, or can these mismatches occur at low frequency as a normal part of the sequencing process? The fact that a malformed read filter exists makes me think they can just occur 'naturally', but I have no idea why.

    If anyone has thoughts on this or experience with the problem, I'd really appreciate hearing from you. I'm apprehensive about moving on with the pipeline without understanding the root of the problem.

    Best,

    Rubal7
