Hi,
I have aligned several re-sequenced individuals (Illumina HiSeq, paired-end, insert size ~450 bp) to a draft assembly. Looking at the alignments in IGV, almost all reads appear to have their mate mapped to a different scaffold.
I extracted some read pairs from the BAM file and got this weird result:
HWI-ST0866:1861D42ACXX:2:1313:18722:3523 pR1 scaffold37985 44312 60 100M scaffold54822 4963 0 AATAAATCTACCCCAGCCAGCCTAAAGCCACTCCCTCCTGTCCCATCACTTCATGCCTTTTCACAAAGTCCCTTCTCAGCCTTTTTGCAGCAGCAATCAG @BCFDDFFHHHHHJJIIIFIIJIIJGIGDIIIJJJJIJI<BGHIFJIJJIIJIIJJJJHIJJCE;CDHCC?CCEHFEFFFFE3>@CDDDDDB@DBDACCD NM:i:0 AS:i:100 XS:i:0
HWI-ST0866:1861D42ACXX:2:1313:18722:3523 prR2 scaffold37985 44635 60 100M scaffold39009 30358 0 TGAAGGGTTAAAAACACATGGAGCCAGCATGTTCTGATCACTGAGAAACAACACTGTGAATCAGCACAGCAGCAAATGGGAGTGACAGGCACTGGAGATG CDDBACEECDECCBFFHFHHHIJIJIGIJIJJJJJJIHHIJJJJJIJIHDDHGIHIGIIGDJJIJJIIIIGIFJJJJJJJJJJJJJJHHHHGFFFFFCCC NM:i:1 AS:i:95 XS:i:0
The first read in the pair is located on scaffold37985 and says its mate is on scaffold54822. But the mate itself (second line) is also located on scaffold37985, and in turn reports its mate on scaffold39009!
To me this seems wrong. Why can't they just form a proper pair, when they obviously have near-perfect matches close to each other on the same scaffold? Am I missing something here?
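To make the inconsistency concrete, here is a minimal Python sketch that checks whether two SAM records point at each other through their mate fields (RNEXT/PNEXT versus the other record's RNAME/POS). The names Aln, parse_sam_line and mates_consistent are my own, not part of any library, and the two records are the ones above with SEQ and QUAL shortened to "*" for readability.

```python
from typing import NamedTuple


class Aln(NamedTuple):
    qname: str   # read name
    rname: str   # scaffold the read is mapped to
    pos: int     # 1-based leftmost mapping position
    rnext: str   # scaffold the mate is reported on
    pnext: int   # position the mate is reported at


def parse_sam_line(line: str) -> Aln:
    """Parse the mandatory SAM columns needed for the mate check."""
    f = line.split()
    rname, rnext = f[2], f[6]
    if rnext == "=":          # "=" in RNEXT means "same as RNAME"
        rnext = rname
    return Aln(f[0], rname, int(f[3]), rnext, int(f[7]))


def mates_consistent(a: Aln, b: Aln) -> bool:
    """True if both records share a name and each one's mate fields
    match the other's actual mapping location."""
    return (
        a.qname == b.qname
        and (a.rnext, a.pnext) == (b.rname, b.pos)
        and (b.rnext, b.pnext) == (a.rname, a.pos)
    )


# The two records from this post, SEQ/QUAL replaced by "*".
r1 = parse_sam_line(
    "HWI-ST0866:1861D42ACXX:2:1313:18722:3523 pR1 scaffold37985 44312 60 "
    "100M scaffold54822 4963 0 * *"
)
r2 = parse_sam_line(
    "HWI-ST0866:1861D42ACXX:2:1313:18722:3523 prR2 scaffold37985 44635 60 "
    "100M scaffold39009 30358 0 * *"
)

print(mates_consistent(r1, r2))  # False
```

For a consistent pair, the first record would report scaffold37985 / 44635 as its mate and the second would report scaffold37985 / 44312.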
I ran bwa mem on 8 cores with otherwise default settings:
bwa mem -t 8 reference readfile1.fastq readfile2.fastq >out.sam
Any input on this would be much appreciated!
Linnéa