  • MAPQ must should be 0 for unmapped read

    Hi, I am not sure what to make of this.
    The SAM file was produced by bwa.
    Is this a bug in bwa's output?

    java -Xmx2g -jar /home/corona/bin/source/picard-tools-1.14/ViewSam.jar INPUT=sorted.sam ALIGNMENT_STATUS=Aligned > sortedaligned.sam


    Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. MAPQ must should be 0 for unmapped read.; File sorted.sam; Line 8910023
    Line: ./S2:747_1696_219 4 chr18 90771994 25 48M * 0 0 AGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGTTTGGGGCTT ]]]]]]PLQ]]XW[]QA6H]]VI+3]M9FSIFG@QQ:!)]0+OJRL:7 XT:A:U CM:i:1 XN:i:10 X0:i:1 X1:i:0 XM:i:4 XO:i:0 XG:i:0 MD:Z:40C7
    at net.sf.samtools.SAMTextReader.reportErrorParsingLine(SAMTextReader.java:176)
    at net.sf.samtools.SAMTextReader.access$500(SAMTextReader.java:40)
    at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:385)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:232)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:196)
    at net.sf.picard.sam.ViewSam.doWork(ViewSam.java:68)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:143)
    at net.sf.picard.sam.ViewSam.main(ViewSam.java:58)
    http://kevin-gattaca.blogspot.com/
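
A minimal sketch (not part of Picard or bwa) of how one might list the offending records before deciding what to do with them. It assumes a tab-delimited text SAM file and reports any record whose FLAG has the unmapped bit (0x4) set while MAPQ is non-zero or CIGAR is not `*`, which is the combination Picard rejects in the exception above. The function name `find_inconsistent_unmapped` is hypothetical.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def find_inconsistent_unmapped(sam_lines):
    """Yield (line_number, read_name, problems) for unmapped records that
    still carry a non-zero MAPQ or a non-empty CIGAR."""
    for line_number, line in enumerate(sam_lines, start=1):
        if line.startswith("@") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record (11 mandatory fields)
        flag, mapq, cigar = int(fields[1]), int(fields[4]), fields[5]
        if not flag & UNMAPPED_FLAG:
            continue  # mapped reads are not affected by this check
        problems = []
        if mapq != 0:
            problems.append("MAPQ=%d" % mapq)
        if cigar != "*":
            problems.append("CIGAR=%s" % cigar)
        if problems:
            yield line_number, fields[0], problems
```

Passing an open file handle, e.g. `find_inconsistent_unmapped(open("sorted.sam"))`, walks the file lazily. For the record quoted in the exception (FLAG 4, MAPQ 25, CIGAR 48M) both problems would be reported.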

  • #2
    I am getting a similar error with Picard ValidateSamFile.jar after running bwa 0.5.6...


    $ java -Xmx4g -jar picard-tools-1.14/ValidateSamFile.jar INPUT=IC201N.sam

    [Tue Mar 09 11:35:57 EST 2010] net.sf.picard.sam.ValidateSamFile INPUT=IC201N.sam MODE=VERBOSE MAX_OUTPUT=100 IGNORE_WARNINGS=false TMP_DIR=/var/folders/zK/zKWfvkbvHui1XRNLQFGiFrpceyw/-Tmp-/mard VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000
    ERROR: Read groups is empty
    ERROR: Record 115632, Read name HWUSI-EAS715_100113:3:4:917:341#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 115632, Read name HWUSI-EAS715_100113:3:4:917:341#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 179735, Read name HWUSI-EAS715_100113:3:6:472:1636#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 179735, Read name HWUSI-EAS715_100113:3:6:472:1636#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 1944399, Read name HWUSI-EAS715_100113:3:59:1626:1079#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 1944399, Read name HWUSI-EAS715_100113:3:59:1626:1079#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2044255, Read name HWUSI-EAS715_100113:3:63:461:1816#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2044255, Read name HWUSI-EAS715_100113:3:63:461:1816#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2173246, Read name HWUSI-EAS715_100113:3:67:1075:709#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2173246, Read name HWUSI-EAS715_100113:3:67:1075:709#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2174105, Read name HWUSI-EAS715_100113:3:67:1125:1723#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2174105, Read name HWUSI-EAS715_100113:3:67:1125:1723#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 2663210, Read name HWUSI-EAS715_100113:3:83:1248:281#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 2663210, Read name HWUSI-EAS715_100113:3:83:1248:281#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 3064036, Read name HWUSI-EAS715_100113:3:96:383:1101#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 3301642, Read name HWUSI-EAS715_100113:3:103:698:837#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 3301642, Read name HWUSI-EAS715_100113:3:103:698:837#0, CIGAR should have zero elements for unmapped read.
    ERROR: Record 3718533, Read name HWUSI-EAS715_100113:3:115:843:1180#0, MAPQ must should be 0 for unmapped read.
    ERROR: Record 3718533, Read name HWUSI-EAS715_100113:3:115:843:1180#0, CIGAR should have zero elements for unmapped read.
    [Tue Mar 09 11:36:26 EST 2010] net.sf.picard.sam.ValidateSamFile done.
    Runtime.totalMemory()=84475904


    Turns out that the first read from above maps off the end of the reference chromosome...

    Ref# CAGAGGCGGCGGCTCGGGGAGAAACCTCAGGCACGGCCGGGCCACCAGGAAAACACGGCCGCGGGATC <- end of chromosome
    Read CGGGGGCGGCGGCTCGGGGAGAAACCTCAGGCACGGCCGGGGCACCAGGAAAACACGGCCGCGGGATCCCA

    So is this a bug in bwa?
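
A rough sketch of how the "maps off the end of the chromosome" observation could be checked for any record. It assumes the reference length is taken from the `LN` value of the matching `@SQ` header line, and it computes the alignment end from POS and the reference-consuming CIGAR operations. As far as I understand the bwa documentation, bwa concatenates the reference sequences internally, so a read aligned across the junction of two sequences is flagged unmapped while keeping its position and CIGAR. The helper names `reference_span` and `overhang` are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(cigar):
    """Number of reference bases covered by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def overhang(pos, cigar, ref_length):
    """Bases by which a 1-based alignment runs past the reference end (0 if none)."""
    end = pos + reference_span(cigar) - 1
    return max(0, end - ref_length)
```

A positive return value from `overhang(pos, cigar, ref_length)` for a read flagged as unmapped would be consistent with the situation shown above.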


    • #3
      Originally posted by mard
      So is this a bug in bwa?
      It's complicated (best Top Gun quote). See the samtools mailing lists (help and devel) regarding these two issues. Feel free to voice your concerns on those lists, as work on a solution is still ongoing.


      • #4
        Indeed,




        Goose: [Extending his middle finger] You know, the finger!
        Charlie: Yes, I know the finger, Goose.
        Goose: Sorry. I hate when it does that.
        Charlie: [to Maverick] So you're the one.
        Maverick: Yes, ma'am.
        -drd


        • #5
          Thanks for the info.

          So it looks like, for the moment, the solution is to have Picard ignore these errors by adding either IGNORE={INVALID_MAPPING_QUALITY,INVALID_CIGAR} (for ValidateSamFile.jar) or VALIDATION_STRINGENCY=SILENT (for ViewSam.jar, SortSam.jar and MarkDuplicates.jar).

          Would that be correct?


          • #6
            Yes it would seem so.
            I am afraid it may cause viewing problems in IGV downstream, though I can't confirm that.
            http://kevin-gattaca.blogspot.com/


            • #7
              Sorry for the bump, but I wonder what the consequences are when we start using VALIDATION_STRINGENCY=SILENT. We'd like Picard to ignore

              Code:
              ERROR: Record 29078883, Read name HWUSI-EAS536_0001:4:51:19663:20378#0, CIGAR should have zero elements for unmapped read.
              ERROR: Record 29183722, Read name HWUSI-EAS536_0001:4:14:8317:13044#0, MAPQ should be 0 for unmapped read.
              ERROR: Record 29183722, Read name HWUSI-EAS536_0001:4:14:8317:13044#0, CIGAR should have zero elements for unmapped read.
              which, in the dataset we're currently analysing, are the only 3 errors.

              But... won't VALIDATION_STRINGENCY=SILENT ignore other, more serious issues as well (in the next dataset)?

              Also, if we have Picard ignore these errors, what happens in later steps? We are currently running FixMates (even on single-end data) to get rid of the error.

              For the record, we too are using GATK with BWA.

              Originally posted by KevinLam
              I am afraid it may cause viewing problems in IGV downstream, though I can't confirm that.
              Did anyone confirm this?
              Last edited by Bruins; 11-23-2010, 05:49 AM.


              • #8
                I would suggest not using SILENT at all. If you have to, be more specific on the error you want to ignore. The validation is very granular.
                -drd


                • #9
                  Originally posted by drio
                  I would suggest not using SILENT at all. If you have to, [...]
                  Well... the problem is that after reading this thread and some googling, I get the feeling that SILENT (or IGNORE=) is the only solution, other than running FixMates.

                  That takes me back to my first question: what would be the consequences of using SILENT? Drio, would you care to comment on that some more?

                  I'm planning to run some tests, I'll report back later.


                  • #10
                    You can set VALIDATION_STRINGENCY=LENIENT, which tells Picard to report any error it sees but to continue processing. After that, you can inspect the validation output and decide whether to bail out or continue with your pipeline.
                    -drd
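
The "inspect the output, then decide" step could be sketched roughly as follows. This assumes the validation report uses the `ERROR: Record N, Read name X, message` line format shown earlier in this thread, and it assumes that the only messages a pipeline is willing to tolerate are the unmapped-read MAPQ and CIGAR ones. The names `summarize`, `should_continue` and `TOLERATED` are hypothetical and not part of Picard.

```python
import re
from collections import Counter

# Per-record line format as it appears in the ValidateSamFile output quoted in this thread
ERROR_LINE = re.compile(r"^ERROR: Record (\d+), Read name (\S+), (.+)$")

# Assumption for illustration: the messages a pipeline accepts and carries on with
TOLERATED = (
    "MAPQ must should be 0 for unmapped read.",  # wording printed by Picard 1.14 above
    "MAPQ should be 0 for unmapped read.",
    "CIGAR should have zero elements for unmapped read.",
)


def summarize(report_lines):
    """Count validation errors by message text."""
    counts = Counter()
    for line in report_lines:
        line = line.strip()
        if not line.startswith("ERROR:"):
            continue  # skip timestamps and other log lines
        match = ERROR_LINE.match(line)
        # File-level errors (e.g. "Read groups is empty") carry no record prefix
        message = match.group(3) if match else line[len("ERROR:"):].strip()
        counts[message] += 1
    return counts


def should_continue(counts):
    """True only if every error seen is one of the tolerated messages."""
    return all(message in TOLERATED for message in counts)
```

With this approach a new kind of error in the next dataset makes `should_continue` return False instead of being silently swallowed, which is the concern raised about SILENT above.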


                    • #11
                      I know it's way later, but...

                      I noticed recently that Picard has CleanSam.jar whose only purpose at this point is to clip mappings that extend beyond the reference. That may be helpful in removing these errors.
                      Last edited by petriedish; 12-14-2011, 11:39 AM. Reason: Typo


                      • #12
                        CleanSam didn't seem to work. It just skipped the same reads but didn't remove them, at least for me.
                        Setting to lenient validation seemed to work though!
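
For anyone who would rather repair the records than relax validation, a hypothetical workaround (this is not what CleanSam is described as doing in this thread) is to rewrite the two fields Picard complains about. The sketch below assumes a tab-delimited text SAM line and sets MAPQ to 0 and CIGAR to `*` whenever the unmapped bit (0x4) is set, leaving everything else, including RNAME and POS, untouched. Note that this discards the alignment bwa reported for those reads. The function name `normalize_unmapped` is made up for illustration.

```python
def normalize_unmapped(line):
    """Return the SAM line with MAPQ=0 and CIGAR='*' if FLAG bit 0x4 is set;
    header lines and mapped records are returned unchanged."""
    if line.startswith("@"):
        return line  # header line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11 or not int(fields[1]) & 0x4:
        return line  # incomplete record or mapped read: leave as-is
    fields[4] = "0"  # MAPQ
    fields[5] = "*"  # CIGAR
    return "\t".join(fields) + newline
```

Applied line by line to a SAM file, this yields output in which unmapped reads no longer carry a MAPQ or CIGAR. It would be sensible to re-run the validation on the result rather than take this sketch on trust.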


                        • #13
                          I am getting the same problem with MarkDuplicates.jar.
                          Using LENIENT also lets it run through.

                          Did someone look at the original question: is it a bug in bwa?
                          I am trying to reproduce the pipeline published by Bowen et al. (2012) in Genetics, and they do not mention this problem with Picard.


                          • #14
                            Bowtie generated alignments do not have this issue.


                            • #15
                              What is the conclusion of this thread? I am facing the same problem.
