Hello all,
I have some Illumina HiSeq paired-end whole-genome human data in which 2-3% of read pairs have their two ends mapping to different chromosomes. I'm using bwa with mostly default parameters, such as
Code:
bwa aln -t 16 -q 10 reference.fa r1.fastq > r1.sai
...
bwa sampe reference.fa r1.sai r2.sai r1.fastq r2.fastq > r1.sam
I also tried mapping some of the cross-chromosomal reads with bowtie as single ends, and it placed them in the same locations.
The reads map uniquely with high quality (e.g. a mapping quality of 37 on the Phred scale) and seem to occur randomly, with an even distribution across chromosomes. Bwa reports X0=1 and X1=0 (one best hit, no suboptimal hits), indicating that there are no alternative mappings for either end, and even when I add flags to allow a larger edit distance and more gap opens it still doesn't find any. The cross-chromosomal read pairs always seem to be single cases at a given location, which makes me think it can't be a biological effect, since I would expect a real rearrangement to be supported by several pairs at the same place.
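In case it helps anyone reproduce this, below is a minimal Python sketch of how such reads could be pulled out of the SAM output. It is only an illustration, not necessarily how the figures above were obtained. It assumes plain-text SAM as written by bwa sampe (with the X0 and X1 tags present). The function names and the MAPQ cutoff of 30 are my own placeholders.

```python
def parse_tags(fields):
    """Turn optional SAM fields such as 'X0:i:1' into a {tag: value} dict."""
    tags = {}
    for field in fields:
        parts = field.split(":", 2)
        if len(parts) == 3:
            tag, typ, value = parts
            # Type 'i' is a signed integer in the SAM specification.
            tags[tag] = int(value) if typ == "i" else value
    return tags


def cross_chrom_reads(sam_lines, min_mapq=30):
    """Yield (qname, rname, rnext) for uniquely mapped reads whose mate
    is placed on a different reference sequence.

    min_mapq=30 is an arbitrary illustrative cutoff, not a recommendation.
    """
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:                # not a complete alignment record
            continue
        qname, flag, rname = cols[0], int(cols[1]), cols[2]
        mapq, rnext = int(cols[4]), cols[6]
        # 0x4 = read unmapped, 0x8 = mate unmapped
        if flag & 0x4 or flag & 0x8:
            continue
        # RNEXT '=' means same reference as the read; '*' means unavailable
        if rnext in ("=", "*") or rnext == rname:
            continue
        if mapq < min_mapq:
            continue
        tags = parse_tags(cols[11:])
        # X0 = number of best hits, X1 = number of suboptimal hits (bwa aln)
        if tags.get("X0") == 1 and tags.get("X1") == 0:
            yield qname, rname, rnext
```

Something like `sum(1 for _ in cross_chrom_reads(open("r1.sam")))` would then count the reads that pass. Each end of a pair is reported separately, so a pair in which both ends pass shows up twice.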
Does anybody know an explanation for this kind of read pair, and what the best way to treat such pairs is?
Many thanks!
Simon