I've got some old Illumina RNA-Seq data that shows an unusual trend in k-mer content. I have 16 samples run across two chips, with 200 bp inserts and a read length of 78 bp.

The score for GGGGG grows roughly exponentially with position along the read, while TTTTT remains relatively high. This is common to almost all samples on the first chip, in both forward and reverse reads. On the second chip, the trend seems to be confined to the reverse reads.
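For anyone who wants to check the trend outside the QC report, here is a minimal Python sketch that tallies how often a given k-mer starts at each read position. It assumes plain, uncompressed four-line FASTQ records; the function names and the file name in the usage note are made up for illustration, and the raw counts it produces are not the same thing as the score in the report.

```python
from collections import Counter


def read_fastq_sequences(path):
    """Yield the sequence line of each record in a four-line FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every four-line record
                yield line.strip()


def kmer_position_counts(reads, kmer):
    """Map each 0-based start position to the number of reads in which
    `kmer` begins at that position (overlapping hits are counted)."""
    counts = Counter()
    for seq in reads:
        start = seq.find(kmer)
        while start != -1:
            counts[start] += 1
            start = seq.find(kmer, start + 1)
    return counts
```

Something like `kmer_position_counts(read_fastq_sequences("sample_R2.fastq"), "GGGGG")` (hypothetical file name) would give counts that can be plotted against position to see whether GGGGG really piles up towards the 3' end.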

The per-base quality isn't great, so my next step will be to trim. Given the unusual trend, though, I wondered if anyone had any theories as to why or how this could be occurring?
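To be concrete about what I mean by trimming, here is a rough Python sketch of a simple 3' quality trim. The threshold of 20 and the function name are arbitrary choices for illustration, not the method of any particular tool. The quality offset is also an assumption: older Illumina pipelines often wrote Phred+64 rather than Phred+33, so it needs checking against the actual files.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end until one meets the quality threshold.

    `offset` is the ASCII offset of the quality encoding (33 assumed here;
    some older Illumina data uses 64). Returns the trimmed sequence and
    quality strings.
    """
    end = len(seq)
    # Walk back from the 3' end while the last retained base is below min_q.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]
```

A dedicated trimmer would normally do this with a sliding window or a running-sum approach rather than base by base, so this is only meant to show the idea.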
