Bowtie2 - getting a bam containing mapped reads only


  • Bowtie2 - getting a bam containing mapped reads only

    I'm mapping reads to a virus reference, starting from a large BAM that is mostly host sequence. Bowtie2 outputs ALL of the reads, not just those that aligned to the reference sequence.

    Is there a way to get Bowtie2 to *directly* output only mapped reads, or do I have to post-process the output with Samtools?


  • #2
    I don't know if Bowtie can do that, but BBMap can output only mapped reads if you use a command like this: bbmap.sh -Xmx8g in=reads.fq outm=mapped.sam ref=reference.fa

    "out" specifies a stream for all reads. "outm" specifies a stream for only mapped reads, and "outu" specifies a stream for only unmapped reads. All 3 of them can be used together. Additionally, you can use the flag "po=f" (default) or "po=t", which stands for "paired only", to decide whether mapped but unpaired reads go to "outu" or "outm". Also, "ambig=toss" will direct ambiguously-mapped reads to "outu" rather than "outm".

    Note that for paired reads in 2 files, you can use "in1" and "in2" instead of "in".
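
A minimal sketch of the paired-file case described above, assuming BBMap's `bbmap.sh` wrapper is on the PATH. The function name, the `BBMAP` variable, and the output file names are made up for illustration; only `in1`, `in2`, `outm`, `outu`, and `ref` come from the post.

```shell
# Path to the BBMap wrapper; override BBMAP if it lives elsewhere (assumption).
BBMAP="${BBMAP:-bbmap.sh}"

# Hypothetical helper: map paired reads and split them by mapping status.
#   $1 = reference FASTA, $2 = read 1 file, $3 = read 2 file
bbmap_split_paired() {
    local ref="$1" r1="$2" r2="$3"
    # outm receives mapped reads, outu receives unmapped reads.
    "$BBMAP" -Xmx8g ref="$ref" in1="$r1" in2="$r2" \
        outm=mapped.sam outu=unmapped.fq
}
```

It would be called as `bbmap_split_paired reference.fa reads_1.fq reads_2.fq`. Whether a pair with only one mapped mate lands in `mapped.sam` depends on the "po" setting mentioned above.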


    • #3

      Look into the output options; I think "--al-conc" will do the trick.

      You can also use "--un-conc" to keep only the read pairs that do not align concordantly to the reference (useful when you want to get rid of reads that map to expected contamination such as rRNA).
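
For reference, a rough sketch of how these two options might be used together on paired-end input. The function name and all file names are placeholders. Both options write reads (FASTQ here), not alignments, so the SAM output still contains every read.

```shell
# Hypothetical helper: split read pairs by concordant alignment.
#   $1 = Bowtie2 index basename, $2 = read 1 file, $3 = read 2 file
split_concordant() {
    local index="$1" r1="$2" r2="$3"
    # --al-conc: pairs that aligned concordantly at least once.
    # --un-conc: pairs that did not align concordantly.
    # Bowtie2 adds .1/.2 before the extension, e.g. aligned_pairs.1.fq.
    bowtie2 -x "$index" -1 "$r1" -2 "$r2" \
        --al-conc aligned_pairs.fq \
        --un-conc unaligned_pairs.fq \
        -S all_reads.sam
}
```

Example call: `split_concordant virus_index reads_1.fq reads_2.fq`.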


      • #4
        yueluo's suggestions will write aligned or unaligned reads to a new FASTQ file. The OP was asking about a way to get only reads with alignments written to the SAM file output by Bowtie2. To do that, simply add the option "--no-unal" to your bowtie2 command line. This suppresses writing of SAM lines for reads which do not align.
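
A small sketch of what that could look like when a BAM is wanted at the end, assuming both `bowtie2` and `samtools` are installed and the input is unpaired FASTQ. The function name and arguments are placeholders, not anything from the manual.

```shell
# Hypothetical helper: align reads and keep only aligned records in a BAM.
#   $1 = Bowtie2 index basename, $2 = FASTQ file, $3 = output BAM
map_aligned_only() {
    local index="$1" reads="$2" out_bam="$3"
    # --no-unal: do not write SAM records for reads that failed to align.
    # samtools view -b: convert the SAM stream on stdin ("-") to BAM.
    bowtie2 --no-unal -x "$index" -U "$reads" \
        | samtools view -b -o "$out_bam" -
}
```

Example call: `map_aligned_only virus_index reads.fq mapped_only.bam`. If the alignment has already been run without "--no-unal", `samtools view -b -F 4` on the existing file should give a similar result, since `-F 4` drops records flagged as unmapped. For paired-end input it is worth checking the manual for how "--no-unal" treats pairs where only one mate aligns.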

        BTW, the manual is your friend.