  • MAPQ must should be 0 for unmapped read

    Hi, I am not sure what to make of this.
    The SAM file came from bwa.
    Is it a bug in the bwa output?

    java -Xmx2g -jar /home/corona/bin/source/picard-tools-1.14/ViewSam.jar INPUT=sorted.sam ALIGNMENT_STATUS=Aligned > sortedaligned.sam


    Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. MAPQ must should be 0 for unmapped read.; File sorted.sam; Line 8910023
    Line: ./S2:747_1696_219 4 chr18 90771994 25 48M * 0 0 AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGTTTGGGGCTT ]]]]]]PLQ]]XW[]QA6H]]VI+3]M9FSIFG@QQ:!)]0+OJRL:7 XT:A:U CM:i:1 XN:i:10 X0:i:1 X1:i:0 XM:i:4 XO:i:0 XG:i:0 MD:Z:40C7
    at net.sf.samtools.SAMTextReader.reportErrorParsingLine(SAMTextReader.java:176)
    at net.sf.samtools.SAMTextReader.access$500(SAMTextReader.java:40)
    at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:385)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:232)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:196)
    at net.sf.picard.sam.ViewSam.doWork(ViewSam.java:68)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:143)
    at net.sf.picard.sam.ViewSam.main(ViewSam.java:58)
    http://kevin-gattaca.blogspot.com/
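For anyone who wants to see which records trip this check, here is a minimal Python sketch. It assumes the rule implied by the messages in this thread: a record with the unmapped bit (0x4) set in FLAG is expected to have MAPQ 0 and CIGAR `*`. The function name is made up for illustration and this is not Picard's actual code.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_record_problems(sam_line):
    """Return the complaints a strict validator would likely raise for one record.

    Assumption: the rule is "FLAG has 0x4 set, so MAPQ must be 0 and CIGAR
    must be '*'", as suggested by the error messages in this thread.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag, mapq, cigar = int(fields[1]), int(fields[4]), fields[5]
    problems = []
    if flag & UNMAPPED_FLAG:
        if mapq != 0:
            problems.append("MAPQ should be 0 for unmapped read")
        if cigar != "*":
            problems.append("CIGAR should have zero elements for unmapped read")
    return problems


# Leading columns of the record from the exception above, tab-separated.
# SEQ and QUAL are replaced with "*" and the optional tags dropped for brevity.
record = "\t".join(["./S2:747_1696_219", "4", "chr18", "90771994", "25", "48M",
                    "*", "0", "0", "*", "*"])
print(unmapped_record_problems(record))
```

The record in the stack trace has FLAG 4 together with MAPQ 25 and CIGAR 48M, so it would be reported for both reasons.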

  • #2
    I am getting a similar error with Picard ValidateSamFile.jar after running bwa 0.5.6...


    $ java -Xmx4g -jar picard-tools-1.14/ValidateSamFile.jar INPUT=IC201N.sam

    [Tue Mar 09 11:35:57 EST 2010] net.sf.picard.sam.ValidateSamFile INPUT=IC201N.sam MODE=VERBOSE MAX_OUTPUT=100 IGNORE_WARNINGS=false TMP_DIR=/var/folders/zK/zKWfvkbvHui1XRNLQFGiFrpceyw/-Tmp-/mard VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000
    ERROR: Read groups is empty
    ERROR: Record 115632, Read name HWUSI-EAS715_100113:3:4:917:341#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 115632, Read name HWUSI-EAS715_100113:3:4:917:341#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 179735, Read name HWUSI-EAS715_100113:3:6:472:1636#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 179735, Read name HWUSI-EAS715_100113:3:6:472:1636#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 1944399, Read name HWUSI-EAS715_100113:3:59:1626:1079#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 1944399, Read name HWUSI-EAS715_100113:3:59:1626:1079#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2044255, Read name HWUSI-EAS715_100113:3:63:461:1816#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2044255, Read name HWUSI-EAS715_100113:3:63:461:1816#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2173246, Read name HWUSI-EAS715_100113:3:67:1075:709#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2173246, Read name HWUSI-EAS715_100113:3:67:1075:709#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2174105, Read name HWUSI-EAS715_100113:3:67:1125:1723#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2174105, Read name HWUSI-EAS715_100113:3:67:1125:1723#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2663210, Read name HWUSI-EAS715_100113:3:83:1248:281#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2663210, Read name HWUSI-EAS715_100113:3:83:1248:281#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 3064036, Read name HWUSI-EAS715_100113:3:96:383:1101#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 3301642, Read name HWUSI-EAS715_100113:3:103:698:837#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 3301642, Read name HWUSI-EAS715_100113:3:103:698:837#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 3718533, Read name HWUSI-EAS715_100113:3:115:843:1180#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 3718533, Read name HWUSI-EAS715_100113:3:115:843:1180#0, CIGAR should have zero elements for unmapped read.
    [Tue Mar 09 11:36:26 EST 2010] net.sf.picard.sam.ValidateSamFile done.
    Runtime.totalMemory()=84475904


    Turns out that the first read from above maps off the end of the reference chromosome...

    Ref# CAGAGGCGGCGGCTCGGGGAGAAACCTCAGGCACGGCCGGGCCACCAGGAAAACACGGCCGCGGGATC <- end of chromosome
    Read CGGGGGCGGCGGCTCGGGGAGAAACCTCAGGCACGGCCGGGGCACCAGGAAAACACGGCCGCGGGATCCCA

    So is this a bug in bwa?
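To check whether other flagged reads also hang off the end of a reference sequence, something like the following Python sketch could be used. It assumes you already know the reference length (for example from the `LN` field of the matching `@SQ` header line); the function names and the numbers in the example are hypothetical.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
REF_CONSUMING = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Number of reference bases covered by an alignment with this CIGAR."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def overhangs_reference(pos, cigar, ref_length):
    """True if a 1-based alignment starting at pos runs past the last reference base."""
    end = pos + reference_span(cigar) - 1
    return end > ref_length


# Hypothetical numbers: a 48M alignment starting 10 bases before the end
# of a 1000 bp reference extends beyond it.
print(overhangs_reference(990, "48M", 1000))
```

Soft clips and insertions are not counted because they do not consume reference positions.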



    • #3
      Originally posted by mard
      [...]
      So is this a bug in bwa?
      It's complicated (best Top Gun quote). See the samtools mailing lists (help and devel) regarding these two issues. Feel free to voice your concerns on those lists, as work on a solution to the above is ongoing.



      • #4
        Indeed,




        Goose: [Extending his middle finger] You know, the finger!
        Charlie: Yes, I know the finger, Goose.
        Goose: Sorry. I hate when it does that.
        Charlie: [to Maverick] So you're the one.
        Maverick: Yes, ma'am.
        -drd



        • #5
          Thanks for the info.

          So it looks like, for the moment, the solution is to ignore these errors in Picard by adding either IGNORE={INVALID_MAPPING_QUALITY,INVALID_CIGAR} (for ValidateSamFile.jar) or VALIDATION_STRINGENCY=SILENT (for ViewSam.jar, SortSam.jar and MarkDuplicates.jar).

          Would that be correct?



          • #6
            Yes, it would seem so.
            I am afraid it may cause viewing problems downstream in IGV, though I can't confirm that.
            http://kevin-gattaca.blogspot.com/



            • #7
              Sorry for the bump, but I wonder what the consequences are when we start using VALIDATION_STRINGENCY=SILENT. We'd like Picard to ignore

              Code:
              ERROR: Record 29078883, Read name HWUSI-EAS536_0001:4:51:19663:20378#0, CIGAR should have zero elements for unmapped read.
              ERROR: Record 29183722, Read name HWUSI-EAS536_0001:4:14:8317:13044#0, MAPQ should be 0 for unmapped read.
              ERROR: Record 29183722, Read name HWUSI-EAS536_0001:4:14:8317:13044#0, CIGAR should have zero elements for unmapped read.
              which, in the dataset we're currently analysing, are the only 3 errors.

              But... won't VALIDATION_STRINGENCY=SILENT ignore other, more serious issues as well (in the next dataset)?

              Also, when we have Picard ignore these errors, what about later steps? We are currently using FixMates (even on single end data) to get rid of the error.

              For the record, we too are using GATK with BWA.

              Originally posted by KevinLam
              I am afraid it may cause viewing problems downstream in IGV, though I can't confirm that.
              Did anyone confirm this?
              Last edited by Bruins; 11-23-2010, 05:49 AM.



              • #8
                I would suggest not using SILENT at all. If you have to, be more specific about the errors you want to ignore. The validation is very granular.
                -drd



                • #9
                  Originally posted by drio
                  I would suggest not using SILENT at all. If you have to, [...]
                  Well... the problem is that after reading this thread and some googling, I get the feeling that SILENT (or IGNORE=) is the only solution, other than running FixMates.

                  That takes me back to my first question: what would be the consequence of using SILENT? Drio, would you like to comment on that some more?

                  I'm planning to run some tests, I'll report back later.



                  • #10
                    You can set VALIDATION_STRINGENCY=LENIENT. That tells Picard to report every error it sees but to continue processing. Afterwards, you can inspect the validation output and decide whether to bail out or continue running your pipeline.
                    -drd
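As a rough illustration of the "inspect the output and then decide" step, here is a small Python sketch that tallies the ERROR lines from a validation report by message. It assumes the line layout shown in the ValidateSamFile output earlier in this thread; the function names and the tolerance policy are hypothetical, so adapt them to whatever you consider acceptable.

```python
import re
from collections import Counter

# Matches per-record lines such as
# "ERROR: Record 115632, Read name X, MAPQ must should be 0 for unmapped read."
RECORD_ERROR = re.compile(r"^ERROR: Record \d+, Read name \S+, (?P<msg>.+?)\.?$")


def summarize_errors(report_lines):
    """Count ERROR lines by message text, ignoring everything else."""
    counts = Counter()
    for raw in report_lines:
        line = raw.strip()
        if not line.startswith("ERROR:"):
            continue
        match = RECORD_ERROR.match(line)
        if match:
            msg = match.group("msg")
        else:
            # File-level errors such as "ERROR: Read groups is empty".
            msg = line[len("ERROR:"):].strip().rstrip(".")
        counts[msg] += 1
    return counts


def safe_to_continue(counts):
    """Hypothetical policy: tolerate only the unmapped-read complaints."""
    return all(msg.endswith("for unmapped read") for msg in counts)
```

Feeding it the saved validation output line by line gives a per-message count, and `safe_to_continue` returns False as soon as any other kind of error shows up, which addresses the worry about more serious problems slipping through unnoticed.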



                    • #11
                      I know it's way later, but...

                      I noticed recently that Picard has CleanSam.jar whose only purpose at this point is to clip mappings that extend beyond the reference. That may be helpful in removing these errors.
                      Last edited by petriedish; 12-14-2011, 11:39 AM. Reason: Typo



                      • #12
                        CleanSam didn't seem to work for me, at least. It just ignored the same reads and didn't remove them.
                        Setting validation to LENIENT seemed to work, though!
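If the offending records survive other clean-up attempts, one possible workaround is to rewrite them yourself before handing the file to Picard. The Python sketch below is only an illustration under that assumption: it sets MAPQ to 0 and CIGAR to `*` on records whose FLAG has the unmapped bit (0x4) set and leaves everything else alone. The function name is made up, this is not a Picard feature, and it does change what bwa wrote, so keep the original file.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def normalize_unmapped(sam_line):
    """Return the line with MAPQ=0 and CIGAR='*' if the record is flagged unmapped.

    Header lines (starting with '@') and mapped records are returned unchanged.
    RNAME and POS are deliberately left as they are.
    """
    if sam_line.startswith("@"):
        return sam_line
    newline = "\n" if sam_line.endswith("\n") else ""
    fields = sam_line.rstrip("\n").split("\t")
    if int(fields[1]) & UNMAPPED_FLAG:
        fields[4] = "0"  # MAPQ
        fields[5] = "*"  # CIGAR
    return "\t".join(fields) + newline
```

Applied line by line while copying a text SAM file, this would make the two "for unmapped read" errors go away without silencing any other validation check.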



                        • #13
                          I am getting the same problem with MarkDuplicates.jar.
                          With LENIENT it also goes through.

                          Did anyone look into the original question: is it a bug in bwa?
                          I am trying to reproduce the pipeline published by Bowen et al. (2012) in Genetics, and they do not mention this problem with Picard.



                          • #14
                            Bowtie-generated alignments do not have this issue.



                            • #15
                              What is the conclusion of this thread? I am facing the same problem.
