  • chrisW
    Junior Member
    • Jun 2010
    • 7

    samtools: parse error in SAM to BAM conversion

    Greetings, I am a novice user with little experience running command line software. I am enjoying learning, though error messages leave me at a loss.

    Background: I used bwa to create my SAM file. When I attempt to use the "view" option to convert to BAM I receive the following:

    [samopen] SAM header is present: 100 sequences.
    Parse error at line 18750817: unmatched CIGAR operation
    Aborted

    I am unclear as to what the problem is. Any guidance?

    -Chris
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    It may be a bug in the program that produced the SAM file; you could post on that program's help mailing list. You could also print out line 18750817 to show us the offending line.
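Printing a single line from a very large file can be done with sed, roughly as below. This is only a sketch: `print_line` is a made-up helper name, and the file name in the usage comment is the one that appears later in this thread.

```shell
# Print one line of a large text file by its line number, then stop reading
# so sed does not scan the rest of the file.
print_line() {
    # $1 = line number, $2 = file (both supplied by the caller)
    sed -n "${1}{p;q;}" "$2"
}

# Usage (file name taken from this thread, adjust to your own):
#   print_line 18750817 WC.sam
```

The `q` after `p` matters for a file of this size; without it sed keeps reading to the end of the file after printing.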

    • chrisW
      Junior Member
      • Jun 2010
      • 7

      #3
      Thank you for getting back to me. I don't know how to get text from the terminal and paste it into these forum windows. I did, however, look at the offending line. It is missing the actual nucleotide and associated Illumina information. Is there an easy way to explain how I can get terminal screen printout into a form I can "cut/paste"? I could then show you exactly what I'm seeing.

      Based on your comments, it appears that bwa perhaps introduced an error in creating the SAM file ...or there may be something off in the datafiles I'm feeding into bwa?

      Again, I appreciate your willingness to help. There is a steep learning curve for me as I've never had any Unix training. I just learned the "sed" function to look at the offending line and see what was wrong.
      -Chris

      • lh3
        Senior Member
        • Feb 2008
        • 686

        #4
        Is it the last line in the output? I guess it is truncated.

        • chrisW
          Junior Member
          • Jun 2010
          • 7

          #5
          No, it's not the last line in output. I'm not sure of the line's exact position in the SAM file, but I printed out the lines prior and after the offending line and they are intact with a nucleotide string followed by what I understand to be Illumina information (perhaps pertaining to quality of the nucleotide call?)

          I would show you what I'm seeing but I don't know how to get output from my terminal window into a form that I can post, other than typing verbatim what I'm seeing into these chat windows. There must be an easier way.

          At this point I don't know if this offending line is the only one in my SAM file or if there are others. It just appears to be the first and thus aborts the SAM --> BAM conversion.

          Thanks for your help.
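To find out whether the line is the only bad one, the whole file can be scanned instead of stopping at the first error. The sketch below assumes the SAM file is tab-delimited, with 11 mandatory fields per record and the CIGAR in field 6; `find_bad_sam_lines` is a hypothetical helper, not part of samtools or bwa.

```shell
# Report every alignment line that has fewer than 11 tab-separated fields,
# or whose CIGAR is neither "*" nor a run of <length><operation> pairs.
find_bad_sam_lines() {
    # $1 = SAM file; prints "<line number>: <reason>" for each suspect line
    awk -F '\t' '
        /^@/ { next }                                      # skip header lines
        NF < 11 { print NR ": only " NF " fields"; next }  # truncated record
        $6 != "*" && $6 !~ /^([0-9]+[MIDNSHP=X])+$/ {
            print NR ": bad CIGAR \"" $6 "\""              # e.g. a bare number
        }
    ' "$1"
}

# Usage: find_bad_sam_lines WC.sam
```

A record whose CIGAR field holds only a number with no operation letter would be reported by the second rule, even when optional tags push its field count past 11.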

          • chrisW
            Junior Member
            • Jun 2010
            • 7

            #6
            Figured out my terminal to chat cut/paste. So, here are three lines from my .sam file. The middle line is the offending line that brings up the "unmatched CIGAR operation".

            output/

            chriswall@ubuntu:/host/Users/chris.wall/Desktop/Mastigo-genomics/bwa_cw$ sed '18750816,18750818!d' WC.sam
            ILLUMINA-2F52BD:6:50:1169:1446#0 163 NODE_37776_length_48465_cov_160.191483\par 35442 29 59M = 35600 217 TTTTATCGGTGTGTATCGGTGTGTATCGGTGGGCGAGTTTTCAACAAGATTATGGGGCA gggfggggfdae[_aee_eeZ_^]_`aaeS]\FY\`dWZ]_^`_`WcSZXg`dgfcbYV XT:A:U NM:i:2 SM:i:0 AM:i:0 X0:i:1 X1:i:2 XM:i:2 XO:i:0 XG:i:0 MD:Z:7T27G23

            ILLUMINA-2F52BD:6:50:1169:1947#0 83 NODE_51983_length_172857_cov_173.044144\par 119160 60 5 XT:A:U NM:i:1 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:39C19

            ILLUMINA-2F52BD:6:50:1178:417#0 83 NODE_43256_length_74978_cov_171.684830\par 68201 60 59M = 67821 -439 CAGAGAAAAGACAGGTGATTATCAGACAACCTACGGATTGATAGGAATTTTGAACCGCA bgdgbgfgf^dddddgdfgffggfbaggZefeae\ggggbfgggegfggggcg`feffe XT:A:U NM:i:0 SM:i:37 AM:i:37 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:59

            /output

            Can this offending line and perhaps any subsequent offending lines be deleted from the .sam file and not introduce a problem?
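If dropping the damaged records turns out to be acceptable, one cautious way to do it is to copy the file rather than edit it in place, and to set the rejected lines aside for inspection. This is a sketch under the same assumptions as the SAM format itself (tab-delimited, 11 mandatory fields, CIGAR in field 6); `split_sam` and the output file names are invented for illustration.

```shell
# Copy a SAM file, keeping header lines and well-formed records.
# Suspect records are written to a separate file so nothing is silently lost.
split_sam() {
    # $1 = input SAM, $2 = file for kept lines, $3 = file for rejected lines
    : > "$3"                                   # make sure the reject file exists
    awk -F '\t' -v bad="$3" '
        /^@/ { print; next }                   # always keep header lines
        NF >= 11 && ($6 == "*" || $6 ~ /^([0-9]+[MIDNSHP=X])+$/) { print; next }
        { print > bad }                        # everything else is set aside
    ' "$1" > "$2"
}

# Usage: split_sam WC.sam WC.clean.sam WC.rejected.sam
```

Note that for paired-end data the mate of a removed read stays in the output and is still flagged as paired, which some downstream tools may complain about.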

            • nilshomer
              Nils Homer
              • Nov 2008
              • 1283

              #7
              The sequence and qualities are also missing along with the CIGAR. What program generated the SAM file?

              • chrisW
                Junior Member
                • Jun 2010
                • 7

                #8
                bwa 0.5.5 package from Ubuntu

                • nilshomer
                  Nils Homer
                  • Nov 2008
                  • 1283

                  #9
                  Try the latest source code from the BWA SVN repository.

                  • lh3
                    Senior Member
                    • Feb 2008
                    • 686

                    #10
                    Actually, 0.5.5 is more stable than the latest release. 1KG has been using it to map >20TB of data and it never fails.

                    • nilshomer
                      Nils Homer
                      • Nov 2008
                      • 1283

                      #11
                      Any suggestions for the missing CIGAR?

                      • lh3
                        Senior Member
                        • Feb 2008
                        • 686

                        #12
                        I guess this is a disk error or input error. If you read the source code that prints alignments, you will see this is quite unlikely to happen. It has never happened before.
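One way to look into the "input error" possibility is to check the FASTQ files that were fed to bwa. The sketch below assumes plain, unwrapped FASTQ with exactly four lines per read; `check_fastq` is a hypothetical helper and the file name in the usage comment is a placeholder.

```shell
# Check that a 4-line-per-record FASTQ file is structurally intact.
check_fastq() {
    # $1 = FASTQ file; prints one message per problem found, nothing if clean
    awk '
        NR % 4 == 1 && !/^@/  { print "line " NR ": header does not start with @" }
        NR % 4 == 2           { seq = $0 }    # remember the bases for this read
        NR % 4 == 3 && !/^\+/ { print "line " NR ": separator does not start with +" }
        NR % 4 == 0 && length($0) != length(seq) {
            print "line " NR ": quality length differs from sequence length"
        }
        END { if (NR % 4 != 0) print "file ends mid-record after line " NR }
    ' "$1"
}

# Usage: check_fastq reads_1.fastq
```

A clean result here would not rule out a disk problem while the SAM file was being written, so rerunning the alignment and comparing the two outputs is another option.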

                        • DNASpeaks
                          Junior Member
                          • Oct 2012
                          • 5

                          #13
                          parse error using samtools view -bT, ValidateSamFile output: empty seq dictionary!

                          Hello Everyone,
                          I am having trouble converting SAM files to BAM using samtools. I used this command:

                          samtools view -bT hg19.fa s_chip2.sam > s_chip2.bam
                          I got this error:
                          Parse error at line 7707082: sequence and quality are inconsistent
                          Aborted



                          I ran ValidateSamFile.jar, a Picard tool, and got hundreds of messages like these:

                          WARNING: Read name HWI-ST798R:82: D18MUACXX:2:1101:2025:1987, A record is missing a read group
                          ERROR: Record 5, Read name HWI-ST798R:82: D18MUACXX:2:1101:8625:1992, Empty sequence dictionary.


                          I am not sure how to fix this, any help is appreciated.
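Both messages can be checked directly against the SAM text. The samtools error most likely means that the SEQ and QUAL fields (fields 10 and 11) of that record differ in length, and Picard's "Empty sequence dictionary" usually points to a header with no @SQ lines. The sketch below tests for each; both function names are invented here, and it assumes a tab-delimited file.

```shell
# Count @SQ header lines; 0 means the file carries no sequence dictionary.
count_sq_lines() {
    awk '/^@SQ\t/ { n++ } END { print n + 0 }' "$1"
}

# List records whose SEQ and QUAL fields differ in length.
find_seq_qual_mismatch() {
    # $1 = SAM file; prints "<line number>: ..." for each mismatching record
    awk -F '\t' '
        /^@/ { next }                                   # skip header lines
        NF >= 11 && $11 != "*" && length($10) != length($11) {
            print NR ": SEQ has " length($10) " bases, QUAL has " length($11)
        }
    ' "$1"
}

# Usage:
#   count_sq_lines s_chip2.sam
#   find_seq_qual_mismatch s_chip2.sam
```

If the count is 0, that would be consistent with needing `-T hg19.fa` for the conversion, since in that case the reference names are not coming from the SAM file itself.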
