Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • win804
    Member
    • Apr 2010
    • 18

    bitwise flag in SAM file

    Hi Everyone,
    Recently I have aligned my sequencing results using BWA and get the aligned result in SAM format.
    My sequence is a single end read and therefore I would expect the bitwise flag (second field) in the SAM format contains only 0, 4, 16 (correct me if i m wrong).

    However, I notice one aligned result with strange value:
    id1 20 chrX 154913749 25 36M * 0 0 TGCGGTCCAACCCTAACCCTAACCCTAACCCTAACC SVVVVV[[[[[[[[[[Z[Z[[[[ZZ[[[[[Z[[[[Z XT:A:U NM:i:2X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:2T4T28

    I cannot explain why there is a bitwise flag 20 in the second field. I expect it to come from 4+16, however, since it is mapped in reverse strand (give 16), then why there is 4 (which means unmapped) contribute to the bitwise flag?

    Thank you very much for the help.
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    Originally posted by win804 View Post
    Hi Everyone,
    Recently I have aligned my sequencing results using BWA and get the aligned result in SAM format.
    My sequence is a single end read and therefore I would expect the bitwise flag (second field) in the SAM format contains only 0, 4, 16 (correct me if i m wrong).

    However, I notice one aligned result with strange value:
    id1 20 chrX 154913749 25 36M * 0 0 TGCGGTCCAACCCTAACCCTAACCCTAACCCTAACC SVVVVV[[[[[[[[[[Z[Z[[[[ZZ[[[[[Z[[[[Z XT:A:U NM:i:2X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:2T4T28

    I cannot explain why there is a bitwise flag 20 in the second field. I expect it to come from 4+16, however, since it is mapped in reverse strand (give 16), then why there is 4 (which means unmapped) contribute to the bitwise flag?

    Thank you very much for the help.
    Probably a bug in BWA. Why is there a chr/position and non-zero mapping quality for this read? Also, I "blatted" the base sequence and it did not map to the X chromosome. Anyone else?

    Comment

    • win804
      Member
      • Apr 2010
      • 18

      #3
      I guess it's a bug also. I modified the sequence according to MD:Z:2T4T28 and blatted against hg18 and I got this hit:
      QUERY SCORE START END QSIZE IDENTITY CHRO STRAND START END SPAN
      id1 35 1 36 36 100.0% 2 + 181848646 181848793 148

      Thanks Nilshomer.
      I wonder whether anybody has encountered this kind of problem before.

      Comment

      • drio
        Senior Member
        • Oct 2008
        • 323

        #4
        Let see what Heng saids. If you can, share the reference and the read that is causing the issue so the bug can be reproduced. Actually, do you get the same results when aligning that single read against your genome?
        -drd

        Comment

        • lh3
          Senior Member
          • Feb 2008
          • 686

          #5
          Nothing is wrong. Flag 16 only means the read sequence is on the reverse strand. It may or may not be mapped at all. You see coordinate because it stands out of chrX. The BLAT result is not informative. Note the "148"bp span. Maybe you are mapping RNA-seq reads, but bwa is unable to map reads across exon juctions.

          Comment

          • yh253
            Member
            • Jul 2009
            • 16

            #6
            Hi Heng,

            Would you please help to clarify one point that if a read mapped to the junction of two chromosomes, I understand that its Flag will be 4 and I can still see the position, CIGAR and all other tags, but would the MAPQ be set at zero?

            I saw some chip-seq single end alignments from BWA in my data with Flag=20. I took it as unmapped, but the MAPQ is not equal zero. Is this legal in terms of sam format?

            Thanks in advance!

            Yuan

            Comment

            • win804
              Member
              • Apr 2010
              • 18

              #7
              My data is a ChIP-seq data.
              To test again, I have extracted the single read and aligned towards hg18.
              The result becomes:
              id1 4 chrX 154913749 25 36M * 0 0 TGCGGTCCAACCCTAACCCTAACCCTAACCCTAACC SVVVVV[[[[[[[[[[Z[Z[[[[ZZ[[[[[Z[[[[Z XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0
              XG:i:0 MD:Z:2T4T28

              So, this time it tells me that it is unmapped (with bitwise flag = 4). But strangely it still contains the coordinate. I noticed that the length of chrX is @SQ SN:chrX LN:154913754, which means the mapping coordinate may be wrong? Since the POS=154913749 and if I add the length of my sequence, it is more than chrX length.

              Comment

              • yh253
                Member
                • Jul 2009
                • 16

                #8
                If a read mapped to the junction of two chromosomes, it is shown as unmapped, but you can still see all the tags. Your read may mapped to the end of chrX and the beginning the following chromosome.

                Yuan

                Comment

                • win804
                  Member
                  • Apr 2010
                  • 18

                  #9
                  Hi Yuan,
                  Thanks for your explanation. Now I understand already.

                  Comment

                  • Xi Wang
                    Senior Member
                    • Oct 2009
                    • 317

                    #10
                    Code:
                    GA-C_0001:1:1:4471:3450#0 117 hg19_refGene_NR_033322 1558 0 * = 1558 0 NCCAATGTGCCCTGTGTGGTCCCGTTACGTGTCCCATCATGTTTTGTGC B````ba`bb`bbbbbbbbbbbb`bbbbbbbabbbbbbbbbbbbbbbbb  
                    GA-C_0001:1:1:4471:3450#0 157 hg19_refGene_NR_033322 1558 37 49M = 1558 0 CGCTGAGCATGATGGGGCATGTGCGGGAGCGCCAGGCGGGGCATGTAAC bbaaa^b_abcbbbbObbbbcbbbbbbbbbbbbaabbbbbbbbbbbbbb XT:A:U NM:i:1 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:1T47
                    Two lines for two paired-end reads' mapping results are shown here. It was done by BWA. Please note the flags of the two reads: 117 and 157. The third and the fourth bits (from right) of 117 and 157 are (0,1) and (1,1), respectively. The third bit indicates whether the current read is unmapped and the fourth bit indicates whether the mate is unmapped. So it is clear that the third and fourth bits (0,1) and (1,1) for the mate reads are conflicted with each other. Is this a bug of BWA? Any suggestions? Thanks in advance.
                    Xi Wang

                    Comment

                    • misko
                      Junior Member
                      • Jun 2010
                      • 7

                      #11
                      I am not sure if this is a bug, but it does not look like proper SAM format?
                      The 157 flag -> mate on positive, query on negative
                      The 117 flag -> mate on negative, query on negative
                      Not sure if this helps but i was having trouble parsing SAM output and checking it a while back so i made a small tool to help out with making the SAM output a bit more verbose. If you want i posted the source and binaries to,

                      Hope this helps?

                      Misko

                      Comment

                      • Xi Wang
                        Senior Member
                        • Oct 2009
                        • 317

                        #12
                        A nice tool. But how did you think it's not in a proper SAM format? It is the output of BWA.
                        Xi Wang

                        Comment

                        • lh3
                          Senior Member
                          • Feb 2008
                          • 686

                          #13
                          Sometimes bwa flags read1 as unmapped, but for read2, the mate-unmapped flag is not set. This is a bug.

                          Comment

                          • Xi Wang
                            Senior Member
                            • Oct 2009
                            • 317

                            #14
                            Originally posted by lh3 View Post
                            Sometimes bwa flags read1 as unmapped, but for read2, the mate-unmapped flag is not set. This is a bug.
                            Thanks, Heng. Hope next release will solve this:-)
                            Xi Wang

                            Comment

                            • Ting Zhang
                              Junior Member
                              • Jul 2010
                              • 2

                              #15
                              Questions about BWA

                              Originally posted by lh3 View Post
                              Sometimes bwa flags read1 as unmapped, but for read2, the mate-unmapped flag is not set. This is a bug.
                              BWA is quite a good software, but I met some problems when using.
                              My first question is: as Doc. Heng Li said, “Sometimes BWA flags read1 as unmapped, but for read2, the mate-unmapped flag is not set”, flag may make mistakes. So which field should be checked to see if the read has been mapped? Is that only “NM” or something else?
                              The second question is: when there are several best hits (X0: i: n, n>1), why are annotation fields given for only one best hit, but not for the rest hits in the “XA” record? How do you choose the fully annotated hit? And why are hits in the “XA” sometimes the same as that hit?
                              Waiting for the answers eagerly

                              Comment

                              Latest Articles

                              Collapse

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by SEQadmin2, Today, 06:09 AM
                              0 responses
                              15 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 06-09-2026, 11:58 AM
                              0 responses
                              34 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 06-05-2026, 10:09 AM
                              0 responses
                              39 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 06-04-2026, 08:59 AM
                              0 responses
                              47 views
                              0 reactions
                              Last Post SEQadmin2  
                              Working...