Greetings. I am working with an Illumina library of whole genome sequence for a very AT-rich genome. When I initially did a FASTQC I saw this:
The bimodal distribution is indicative of contamination. I mapped the reads to a reference and looking at the unmapped reads, i see that they are indeed the source of the smaller bump. However, when I look at the MAPPED reads, I get this even stranger distribution (both actual and theoretical):
Normally, after the mapping the distribution with this organism looks like this:
The FASTQC analyses doesn't reveal any overrepresented sequences, and the library has only a 9.11% duplication level. What could possibly be going on?
The bimodal distribution is indicative of contamination. I mapped the reads to a reference and looking at the unmapped reads, i see that they are indeed the source of the smaller bump. However, when I look at the MAPPED reads, I get this even stranger distribution (both actual and theoretical):
Normally, after the mapping the distribution with this organism looks like this:
The FASTQC analyses doesn't reveal any overrepresented sequences, and the library has only a 9.11% duplication level. What could possibly be going on?
Comment