
  • Sam Flag 65 and 129 after BWA


    I am trying to run BWA on simulated paired-end test data (Illumina, 75 bp reads).

    Basically, I generated an artificial sample from a reference database, and I am trying to verify BWA's results.

    Unfortunately, I am not sure about the flags in the SAM file. I thought that reads mapped as a pair are shown with flag 2.

    But in my SAM file I only get the flags 65 and 129, which I read as:
    65: paired read and forward
    129: paired read and reverse

    But I thought that the flag for mapped (0x0002) would be set as well? So my question is whether something is wrong with the sample, and which flags I have to use to extract the mapped reads from paired-end and single-end data.

    This is the output from sampe:
    @SQ SN:gi|157704448|ref|AC_000133.1| LN:219475005
    @PG ID:bwa PN:bwa VN:0.5.9-r16
    testSample_0_1 65 gi|157704448|ref|AC_000133.1| 1 37 75M = 151 150 ATTGACAAGGGGAGGGAAAAGAGGAACAGAAATTCTTTTCTAT$
    testSample_0_2 129 gi|157704448|ref|AC_000133.1| 151 37 75M = 1 -150 ATAACTTGGAAGCTTCCTTTAAAAGGAACATCAGGAGGTGATT$

    Greetings and many thanks,

  • #2
    No, you are misreading the flags. 65 = 64 + 1, which means it's the first read, and it's paired. 129 = 128 + 1, meaning it's the second read, and it's paired. Neither has the reverse-strand bit (16) set, so both are in the forward direction. That's why they aren't properly paired. You can see that for yourself if you BLAT them.

    The magic numbers for flags for properly paired reads are 83, 99, 147 and 163.
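To check the arithmetic above by hand, here is a minimal Python sketch that splits a flag into its bits. The bit values come from the SAM specification; the names in `FLAG_BITS` and the helper `decode_flag` are only illustrative labels, not part of BWA or samtools.

```python
# SAM flag bits (values from the SAM specification; names are illustrative).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
}


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM flag."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


if __name__ == "__main__":
    # The two flags from the question, then the four "proper pair" flags.
    for flag in (65, 129, 83, 99, 147, 163):
        print(flag, decode_flag(flag))
```

Flags 65 and 129 decode to "paired" plus first/second in pair only, with no strand or proper-pair bit, while each of 83, 99, 147 and 163 carries the proper-pair bit and exactly one of the two reverse-strand bits.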


    • #3
      Thank you very much! This explains why it didn't work.

      Well, what does "the second read" mean? To be honest, I thought the second read meant the reverse one?!

      And do you also have magic numbers for mate reads? Because I want to extract reads where one read of the pair is mapped and the other is not.

      Thanks in advance.


      • #4
        You seem very confused.

        2 doesn't mean "mapped".
        It means "mapped in a proper pair". That means one forward, one reverse, with the distance between them being around the average insert size as compared to the other reads in the project.

        DNA fragments don't know which direction you are calling forward, and which you are calling reverse. The adaptors just go on whatever end they can. So you can't expect all of read one to run in the direction that we by convention call "forward".

        If you made your fastqs the way it looks like you did, with those two reads both in the same direction, and made them look paired by putting one in each fastq, then bwa did exactly what it's supposed to. The reason it looks strange is that it would be a very strange pair of reads in a real experiment. Rev-comp one of them, try again, and you will get better results.
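For the rev-comp step, a minimal sketch in plain Python, assuming the simulated reads only contain A, C, G, T and N. The function name `reverse_complement_read` is made up for illustration; the quality string is reversed along with the sequence so each score stays attached to its base.

```python
# Translation table for complementing bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_read(seq, qual):
    """Reverse-complement a read and reverse its quality string to match."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality string differ in length")
    return seq.translate(_COMPLEMENT)[::-1], qual[::-1]


if __name__ == "__main__":
    # Toy read, not taken from the data in this thread.
    print(reverse_complement_read("ATTGAC", "IIHHGG"))
```

Applied to every record of the second fastq only, this should turn two same-strand simulated reads into an inward-facing pair.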

        I'm not sure how bwa handles mate-pair reads; I've never tried it. It might fail to flag them as properly paired, because they point out, instead of in like paired ends. I suppose revcomping both fastqs (and reversing the quality strings) might allow bwa to flag them as properly paired.

        If you want all the reads where one end mapped and the other didn't, you want all the reads with bit 4 or bit 8 set, but not both. 4 means "did not map". 8 means "mate didn't map". samtools view can filter like that.
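The "4 or 8, but not both" rule is an exclusive-or on two flag bits. A minimal sketch of that test in Python follows; `one_end_mapped` is a hypothetical helper name, and the example flags 73, 133 and 77 are illustrative values, not taken from the SAM file above.

```python
def one_end_mapped(flag):
    """True if the read is paired and exactly one of read/mate is unmapped."""
    paired = bool(flag & 0x1)
    read_unmapped = bool(flag & 0x4)   # 4: this read did not map
    mate_unmapped = bool(flag & 0x8)   # 8: the mate did not map
    return paired and (read_unmapped != mate_unmapped)


if __name__ == "__main__":
    # 73 = 64+8+1, 133 = 128+4+1, 77 = 64+8+4+1, 65 = 64+1
    for flag in (73, 133, 77, 65):
        print(flag, one_end_mapped(flag))
```

With samtools itself, the same selection can probably be done in two passes, since `-f` requires bits and `-F` excludes them: `samtools view -f 4 -F 8` for unmapped reads whose mate mapped, and `samtools view -f 8 -F 4` for mapped reads whose mate did not.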

