I am using BWA to align paired-end Illumina reads to a published reference in which each of our target species' chromosomes is represented as a large scaffold. However, one of these scaffolds contains a large collinear stretch that corresponds to about 70% of the chloroplast genome. Recovering the chloroplast is a major goal of my analysis, so I am thinking about masking this region of the scaffold in question: when I align without masking, I get very low chloroplast coverage after filtering out ambiguously mapped reads.
I know I can mask the reference using bedtools, and I think in this case I want hard masking (replacing the region with Ns), but I am wondering whether this will achieve the desired result, i.e. that BWA will not allow any reads to map to this region. Can anyone clarify how BWA treats masked regions of the reference?
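To make clear what I mean by hard masking, here is a minimal Python sketch of the operation on a toy sequence. The function name `hard_mask`, the toy scaffold, and the coordinates are all made up for illustration, and I am assuming BED-style 0-based, half-open intervals; in practice I would run `bedtools maskfasta` on the real reference rather than this.

```python
def hard_mask(seq: str, regions: list[tuple[int, int]]) -> str:
    """Replace each 0-based, half-open [start, end) interval with Ns.

    Hypothetical helper for illustration only; it is not part of
    bedtools or BWA.
    """
    masked = list(seq)
    for start, end in regions:
        # Clamp the interval to the sequence bounds.
        start = max(0, start)
        end = min(len(masked), end)
        for i in range(start, end):
            masked[i] = "N"
    return "".join(masked)


# Toy example: mask positions 4..7 of a 12 bp sequence.
scaffold = "ACGTACGTACGT"
masked_scaffold = hard_mask(scaffold, [(4, 8)])
print(masked_scaffold)  # ACGTNNNNACGT
```

The masked sequence keeps its original length, so coordinates elsewhere on the scaffold are unchanged.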
Thanks!