
  • Sam Flag 65 and 129 after BWA

    Hi,

    I am trying to run BWA on simulated paired-end test data (Illumina, 75 bp reads).

    Basically, I generated an artificial sample from a reference database, and I am trying to verify the BWA results.

    Unfortunately, I am not sure about the flags in the SAM file. I thought that reads mapped as a pair would be shown with the flag 2.

    But in my SAM file I only get the flags 65 and 129, which I read as:
    65: paired read and forward
    129: paired read and reverse

    I thought the flag for mapped (0x0002) would be set as well. So my question is whether something is wrong with the sample, and which flags I have to use to extract the mapped reads from paired-end and from single-end data.

    This is the output from sampe:
    @SQ SN:gi|157704448|ref|AC_000133.1| LN:219475005
    @PG ID:bwa PN:bwa VN:0.5.9-r16
    testSample_0_1 65 gi|157704448|ref|AC_000133.1| 1 37 75M = 151 150 ATTGACAAGGGGAGGGAAAAGAGGAACAGAAATTCTTTTCTAT$
    testSample_0_2 129 gi|157704448|ref|AC_000133.1| 151 37 75M = 1 -150 ATAACTTGGAAGCTTCCTTTAAAAGGAACATCAGGAGGTGATT$


    Greetings and many thanks,
    TOmoi

  • #2
    No, you are misreading the flags. 65 = 64 + 1, which means it's the first read, and it's paired. 129 = 128 + 1, meaning it's the second read, and it's paired. Both are in the forward direction. That's why they aren't properly paired. You can see that for yourself if you BLAT them.

    The magic numbers for flags for properly paired reads are 83, 99, 147, and 163.
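
The arithmetic above (65 = 64 + 1, 129 = 128 + 1) can be checked by splitting a flag into its individual bits. The following is a minimal sketch, not something from the original posts: the bit values follow the FLAG table of the SAM specification, while the short names and the `decode_flag` helper are made up here for illustration.

```python
# Bit values from the SAM specification's FLAG table.
# The short names are labels chosen for this sketch only.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM flag."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]
```

Calling `decode_flag(65)` gives `['paired', 'first_in_pair']` and `decode_flag(129)` gives `['paired', 'second_in_pair']`: neither has the reverse bit (16) or the proper-pair bit (2). Each of 83, 99, 147, and 163 has both the paired and proper-pair bits set, plus exactly one of the two strand bits.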



    • #3
      Thank you very much! This explains why it didn't work.

      But what does "the second read" mean? To be honest, I thought the second read was meant to be the reverse one.

      Do you also have magic numbers for mate reads? I want to extract reads where one end is mapped and the other is not.

      Thanks in advance



      • #4
        You seem very confused.

        2 doesn't mean "mapped".
        It means "mapped in a proper pair". That means one forward, one reverse, with the distance between them being around the average insert size as compared to the other reads in the project.

        DNA fragments don't know which direction you are calling forward, and which you are calling reverse. The adaptors just go on whatever end they can. So you can't expect all of read one to run in the direction that we by convention call "forward".

        If you made your fastqs the way it looks like you did, with those two reads both in the same direction, and made them look paired by putting one in each fastq, then bwa did exactly what it's supposed to do. The result looks strange because that would be a very strange pair of reads in a real experiment. Rev-comp one of them, try again, and you will get better results.
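
The rev-comp step suggested above could look like the following minimal sketch. It assumes plain A/C/G/T/N bases, and the `revcomp_record` helper is a hypothetical name used only for this illustration. The quality string is reversed along with the sequence so that each score stays attached to its base.

```python
# Translation table for complementing bases; assumes only A/C/G/T/N
# (upper or lower case) occur in the simulated reads.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp_record(seq, qual):
    """Reverse-complement a read and reverse its quality string.

    Returns a (sequence, quality) tuple.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality must have the same length")
    return seq.translate(_COMPLEMENT)[::-1], qual[::-1]
```

Applying this to every record in one of the two fastq files (the second one, say) before running bwa should give read pairs that point towards each other, as they would in a real paired-end run.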

        I'm not sure how bwa handles mate reads; I've never tried it. It might fail to flag them as properly paired, because they point out, instead of in, like paired ends. I suppose revcomping both fastqs (and reversing the quality strings) might allow bwa to flag them as properly paired.

        If you want all the reads where one end mapped, and one end didn't, you want all the reads with a 4 or an 8, but not both. 4 means "Did not map". 8 means "mate didn't map". Samtools view can filter like that.
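
The "4 or 8, but not both" rule above is an exclusive-or on two flag bits. Here is a minimal sketch of that test on plain-text SAM lines; the helper names `one_end_mapped` and `filter_sam_lines` are hypothetical, and the code assumes tab-separated SAM records with the flag in the second column.

```python
UNMAPPED = 0x4       # this read did not map
MATE_UNMAPPED = 0x8  # the mate did not map


def one_end_mapped(flag):
    """True if exactly one of the 'unmapped' / 'mate unmapped' bits is set."""
    return bool(flag & UNMAPPED) != bool(flag & MATE_UNMAPPED)


def filter_sam_lines(lines):
    """Yield SAM alignment lines where one end mapped and the other did not.

    Header lines (starting with '@') and lines without a flag column
    are skipped.
    """
    for line in lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        if one_end_mapped(int(fields[1])):
            yield line
```

With samtools itself, `samtools view -f 4 -F 8` should select unmapped reads whose mate is mapped, and `samtools view -f 8 -F 4` the mapped reads whose mate is unmapped; together the two outputs cover the same set as the sketch.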
