Dear all,
I am analyzing RNA-seq results from HeLa cells. The libraries were prepared with the Epicentre ScriptSeq kit and contain around 7,000,000 reads each. I mapped the reads with TopHat to both hg18 and hg19. Most transcripts look OK, but something odd happens with RNA genes (snRNAs, snoRNAs, etc.). Almost all reads covering these reference RNAs start at nucleotides +5 to +10 rather than at the annotated 5' end, even when the total coverage is very high. Out of more than 2000 reads mapped to U1 snRNA, only 2 or 3 have a 5' end identical to that of the gene. Has anyone experienced this? What could be the reason, and how should I deal with it?