Unable to find flag in SAM with bowtie2 - but can with BWA

  • Unable to find flag in SAM with bowtie2 - but can with BWA

    I'm using Bowtie2 to map my paired-end reads to a small region of a genome. The aim is to work out where that region sits by collecting the unmapped mates of the reads that did map.

    I'm currently using samtools flags to get the reads I want:

    ---a--- ......... -------------
                    |------------------------gene----------------|
                                             -------..............----b-----
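    What we want is:
    a: retain unmapped reads whose mate is on the reverse strand (-f 36, i.e. 0x4 read unmapped + 0x20 mate reverse strand)
    b: retain unmapped reads (-f 4), excluding reads whose mate is on the reverse strand or unmapped (-F 40, i.e. 0x8 mate unmapped + 0x20 mate reverse strand)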

    samtools view -Sb -f 36 sam > NtermBam
    samtools view -Sb -f 4 -F 40 sam > CtermBam
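For anyone checking the arithmetic behind those two filters, here is a minimal Python sketch that decodes a FLAG value and mimics the `-f`/`-F` test. The bit values come from the SAM specification; the function names `decode_flag` and `passes` are made up for illustration and are not part of samtools.

```python
# Bit names from the FLAG field of the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of the bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def passes(flag, require=0, exclude=0):
    """Mimic samtools view -f/-F: every `require` bit set, no `exclude` bit set."""
    return (flag & require) == require and (flag & exclude) == 0


print(decode_flag(36))  # ['unmapped', 'mate_reverse']
print(decode_flag(40))  # ['mate_unmapped', 'mate_reverse']
```

Under these definitions, `passes(flag, require=36)` corresponds to the N-terminal filter and `passes(flag, require=4, exclude=40)` to the C-terminal one. The first filter only keeps a read if 0x20 is set on the unmapped read's own record. If an aligner left that bit unset, every such read would fall through to the second filter. Whether that is what is happening here has to be checked in the SAM itself.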

    Previously when mapping with BWA both of these commands would work and I would get an N terminus bam and a C terminus bam. However, when I map with Bowtie2 (because the mapping itself is better), I get no reads in the N terminus bam and all the reads go to the C terminus bam.

    Does anyone know why this might be? I'm pretty sure the samtools commands are right because they work with BWA. I'm thinking it has something to do with Bowtie2, maybe it isn't recording the flags the same way?

  • #2
    A couple of points. Firstly, what was the exact bowtie2 command that you specified? Depending on the options you used, the unmapped reads may not even be included in the output. Secondly, I hope you take the orientation of your genes into account when you start calling the files Nterm and Cterm. They'll be swapped if the gene is on the - strand.

    BTW, the most straightforward solution is to just find a couple of reads like this from the BWA alignment and use grep to see what bowtie2 did with them. Presuming they map the same, you'll then know if bowtie2 is doing something unexpected with the flags.
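If grepping read by read gets tedious, the same comparison can be roughly sketched in Python. This assumes both aligners were run on the same FASTQ files so the read names match. The helper names and the file names in the usage comment are hypothetical.

```python
def flags_by_read(sam_lines):
    """Map (read name, mate number) -> FLAG for each primary record in SAM text."""
    flags = {}
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        flags[(name, mate)] = flag
    return flags


def flag_differences(bwa_lines, bowtie2_lines):
    """Return {(name, mate): (bwa_flag, bowtie2_flag)} for reads whose flags differ."""
    bwa = flags_by_read(bwa_lines)
    bt2 = flags_by_read(bowtie2_lines)
    return {
        key: (bwa[key], bt2[key])
        for key in bwa.keys() & bt2.keys()
        if bwa[key] != bt2[key]
    }


# Hypothetical usage; "bwa.sam" and "bowtie2.sam" are placeholder file names.
# with open("bwa.sam") as a, open("bowtie2.sam") as b:
#     for key, (f_bwa, f_bt2) in sorted(flag_differences(a, b).items()):
#         print(key, f_bwa, f_bt2)
```

Only the flags are compared, so a read that maps to a different position in the two alignments will also show up here. It's worth looking at the full records for anything it reports.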

    • #3
      The bowtie2 commands I specified were these:

      bowtie2-build ref base_name
      bowtie2 --sensitive-local -x base_name -1 fq1 -2 fq2 -S sam

      I'm aware that Nterm and Cterm have particular meanings; I'm just not too worried about it at this point. Once I get past this section in my testing I'll make sure the orientation is taken into account.

      • #4
        Originally posted by dpryan:
        BTW, the most straightforward solution is to just find a couple of reads like this from the BWA alignment and use grep to see what bowtie2 did with them. Presuming they map the same, you'll then know if bowtie2 is doing something unexpected with the flags.
        I agree with dpryan: get a couple of reads that you like from BWA and see what Bowtie2 did with them. The SAM/BAM format is flexible enough to allow programs to have different opinions on what flags are appropriate. As an aside, BWA and Bowtie2 will report different template lengths (column 9) for the same mappings, which just shows how 'flexible' (or ill-defined) the SAM/BAM format is.
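One quick way to see which flags an aligner actually emits is to tally the FLAG values of all unmapped records in each SAM file and compare the two tallies. A minimal sketch, assuming plain-text SAM input; `tally_unmapped_flags` and the file name in the usage comment are illustrative names only.

```python
from collections import Counter


def tally_unmapped_flags(sam_lines):
    """Count FLAG values among records with the unmapped bit (0x4) set."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        flag = int(line.split("\t")[1])
        if flag & 0x4:
            counts[flag] += 1
    return counts


# Hypothetical usage; "bowtie2.sam" is a placeholder file name.
# with open("bowtie2.sam") as handle:
#     for flag, n in sorted(tally_unmapped_flags(handle).items()):
#         print(flag, n)
```

If no value in the Bowtie2 tally has bit 0x20 set, that would explain an empty `-f 36` result. If such values are present, the cause lies elsewhere.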
