
BWA mem reporting poor alignments


  • BWA mem reporting poor alignments

    I've been using bwa mem to align Illumina reads to contaminant genomic databases (ribosomal RNAs, gbbct, etc.). I've noticed unusually high percentages of reads mapping to my contaminants (10% or more, when I expected relatively little contamination) and suspiciously high percentages of reads mapping to multiple contaminant databases (human as well as gbbct, for instance). I BLASTed a few of these "contaminant" bacterial reads online, and they showed relatively poor alignments to bacterial sequences (e.g., only 30-40% of the read aligning, with a few mismatches).

    Right now I'm using the default parameters, which obviously aren't working. Could someone give me some advice on how to tweak things so that only the better alignments are kept?

  • #2
    You could filter on mapping quality, e.g. by piping through samtools a la

    bwa mem ref fastq1 fastq2 | samtools view -S -q30 -
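
For anyone who would rather do the same cutoff in a script, here is a minimal sketch of what the -q filter does, assuming plain-text SAM lines as input. The function name filter_by_mapq and the example records are made up for illustration; MAPQ is the fifth tab-separated column of a SAM record.

```python
def filter_by_mapq(sam_lines, min_mapq=30):
    """Return alignment records whose MAPQ (SAM column 5) is >= min_mapq."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header lines are skipped; only alignment records are returned
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_mapq:
            kept.append(line)
    return kept


# Made-up records for illustration only (not real data).
example = [
    "@SQ\tSN:contig1\tLN:1000",
    "read1\t0\tcontig1\t100\t60\t50M\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50,
    "read2\t0\tcontig1\t200\t3\t20M30S\t*\t0\t0\t" + "C" * 50 + "\t" + "I" * 50,
]

for record in filter_by_mapq(example):
    print(record.split("\t")[0])
```

With the made-up records above, only read1 passes the default cutoff of 30; read2 (MAPQ 3) is dropped.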


    • #3
      That sounds like a good idea. Do you know the range of the mapping quality scores? What's a reasonable cutoff to ask for? Is -q 30 what you would suggest?


      • #4
        And one more thing ...
        I've read that BWA gives ambiguously mapped reads a quality of 0. Would I have to pull those reads out separately somehow?
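
In case it helps frame the question, this is a rough sketch of how the MAPQ 0 reads could be set aside, assuming plain-text SAM lines as input. The function name split_by_mapq_zero and the example records are hypothetical. Unmapped reads (FLAG bit 0x4) are skipped so they are not counted as ambiguous.

```python
def split_by_mapq_zero(sam_lines):
    """Split mapped reads into (MAPQ == 0, MAPQ > 0) lists of read names."""
    ambiguous, confident = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # FLAG bit 0x4 marks an unmapped read
        if int(fields[4]) == 0:
            ambiguous.append(fields[0])
        else:
            confident.append(fields[0])
    return ambiguous, confident


# Made-up records for illustration only (not real data).
records = [
    "readA\t0\tcontig1\t10\t0\t50M\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50,
    "readB\t16\tcontig2\t75\t60\t50M\t*\t0\t0\t" + "G" * 50 + "\t" + "I" * 50,
    "readC\t4\t*\t0\t0\t*\t*\t0\t0\t" + "T" * 50 + "\t" + "I" * 50,
]

ambiguous_reads, confident_reads = split_by_mapq_zero(records)
print("MAPQ 0:", ambiguous_reads)
print("MAPQ > 0:", confident_reads)
```

Any -q cutoff above 0 would already drop the first group, so a split like this only matters if those reads need to be inspected on their own.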