  • Sam Flag 65 and 129 after BWA

    Hi,

    I am trying to run BWA on simulated paired-end test data (Illumina, 75 bp reads).

    Basically, I generated an artificial sample from a reference database, and I am trying to verify BWA's results.

    Unfortunately, I am not sure about the flags in the SAM file. I thought that reads mapped as a pair are shown with the flag 2.

    But in my SAM file I only get the flags 65 and 129, which I read as:
    65: paired read and forward
    129: paired read and reverse

    But I thought that the flag for mapped (0x0002) would be set as well? So my question is whether there is something wrong with the sample, and which flags I have to use to extract the mapped reads from paired-end and from single-end data.

    This is the output from sampe:
    @SQ SN:gi|157704448|ref|AC_000133.1| LN:219475005
    @PG ID:bwa PN:bwa VN:0.5.9-r16
    testSample_0_1 65 gi|157704448|ref|AC_000133.1| 1 37 75M = 151 150 ATTGACAAGGGGAGGGAAAAGAGGAACAGAAATTCTTTTCTAT$
    testSample_0_2 129 gi|157704448|ref|AC_000133.1| 151 37 75M = 1 -150 ATAACTTGGAAGCTTCCTTTAAAAGGAACATCAGGAGGTGATT$


    Greetings and many thanks,
    TOmoi

  • #2
    No, you are misreading the flags. 65 = 64 + 1, which means it's the first read, and it's paired. 129 = 128 + 1, meaning it's the second read, and it's paired. Neither has the reverse-strand bit (16) set, so both are in the forward direction. That's why they aren't properly paired. You can see that for yourself if you BLAT them.

    The magic numbers for flags for properly paired reads are 83, 99, 147, and 163.
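The arithmetic above can be checked with a small decoder. This is only a sketch: the bit values are the ones defined for the FLAG field in the SAM specification, but the short names and the `decode_flag` helper are made up for illustration and are not part of BWA or samtools.

```python
# Bit values of the SAM FLAG field; the short names are illustrative labels.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


# The two flags from the question, then the four "magic numbers".
for flag in (65, 129, 83, 99, 147, 163):
    print(flag, decode_flag(flag))
# 65  -> ['paired', 'first_in_pair']
# 129 -> ['paired', 'second_in_pair']
# 83  -> ['paired', 'proper_pair', 'reverse', 'first_in_pair']
# 99  -> ['paired', 'proper_pair', 'mate_reverse', 'first_in_pair']
# 147 -> ['paired', 'proper_pair', 'reverse', 'second_in_pair']
# 163 -> ['paired', 'proper_pair', 'mate_reverse', 'second_in_pair']
```

Note that each of the four properly paired flags has exactly one of the two strand bits (16 or 32) set, which is the "one forward, one reverse" pattern that 65 and 129 lack.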


    • #3
      Thank you very much! This explains why it didn't work.

      Well, what does "the second read" mean? To be honest, I thought the second read meant the reverse one.

      And do you also have magic numbers for mate reads? I want to extract reads where one read is mapped and the other is not.

      Thanks in advance.


      • #4
        You seem very confused.

        2 doesn't mean "mapped".
        It means "mapped in a proper pair". That means one forward, one reverse, with the distance between them being around the average insert size as compared to the other reads in the project.

        DNA fragments don't know which direction you are calling forward, and which you are calling reverse. The adaptors just go on whatever end they can. So you can't expect all of read one to run in the direction that we by convention call "forward".

        If you made your fastqs as it looks like you did, with those two reads both in the same direction, and made it look like they were paired by putting one in each of two fastq files, then bwa did exactly what it's supposed to. The reason it looks strange is that this would be a very strange pair of reads in a real experiment. Rev-comp one of them and try again, and you will get better results.

        I'm not sure how bwa handles mate reads, I've never tried it. It might fail to flag them as properly paired, because they point out, instead of in, like paired ends. I suppose revcomping both fastqs (and reversing the quality strings) might allow bwa to flag them as properly paired.
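As a rough sketch of the rev-comp step suggested above: the sequence is complemented and reversed, and the quality string is reversed so each score stays with its base. The `revcomp_record` helper and the example qualities are hypothetical, and the translation table assumes reads contain only A, C, G, T, and N.

```python
# Complement table for plain DNA; assumes no other IUPAC codes in the reads.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp_record(seq, qual):
    """Reverse-complement a read and reverse its quality string (hypothetical helper)."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality string must have the same length")
    return seq.translate(COMPLEMENT)[::-1], qual[::-1]


# First six bases of testSample_0_1 with made-up quality characters.
seq, qual = revcomp_record("ATTGAC", "IIHGF#")
print(seq, qual)  # GTCAAT #FGHII
```

Applying this to every record of one fastq file would give the simulated pair the inward-facing orientation of a real paired-end library.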

        If you want all the reads where one end mapped and one end didn't, you want all the reads with a 4 or an 8 set in the flag, but not both. 4 means "did not map". 8 means "mate didn't map". samtools view can filter like that.
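The "4 or 8, but not both" rule is an exclusive-or on two bits. A minimal sketch of it, assuming tab-separated SAM text with the flag in the second column; the function names and the example records are invented for illustration.

```python
def one_end_mapped(flag):
    """True for a paired read where exactly one of bits 4 and 8 is set."""
    unmapped = bool(flag & 0x4)       # this read did not map
    mate_unmapped = bool(flag & 0x8)  # its mate did not map
    return bool(flag & 0x1) and unmapped != mate_unmapped


def filter_sam_lines(lines):
    """Yield SAM records whose pair has one mapped and one unmapped end."""
    for line in lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if one_end_mapped(int(fields[1])):
            yield line


# Made-up records: only the first two columns matter for the filter.
example = [
    "@PG\tID:bwa",
    "readA_1\t73\tchr\t100",   # 1 + 8 + 64: mapped, mate unmapped
    "readA_2\t133\tchr\t100",  # 1 + 4 + 128: unmapped, mate mapped
    "readB_1\t77\t*\t0",       # 1 + 4 + 8 + 64: neither end mapped
    "readC_1\t65\tchr\t1",     # both ends mapped
]
print([line.split("\t")[0] for line in filter_sam_lines(example)])
# ['readA_1', 'readA_2']
```

With samtools, the same selection is usually done in two passes, since `-f` requires bits and `-F` excludes them: `samtools view -f 4 -F 8` for unmapped reads with a mapped mate, and `samtools view -f 8 -F 4` for mapped reads with an unmapped mate.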
