Hi all,
I am working on a small RNA dataset generated using the Illumina TruSeq™ Small RNA protocol.
I have run the data through various tools specialised for miRNA analysis (such as miRDeep and miRanalyzer, for miRNA identification and novel miRNA prediction).
In addition to many reads hitting known mouse miRNAs, other ncRNAs and possible candidate miRNAs...
*****
I also ended up with many reads (a large percentage) that align to the mouse genome, in both transcribed and unannotated regions.
My first question is:
Why would my reads have perfect hits (no mismatches) against mRNAs? Could these be by-products of transcript degradation, with fragments that happen to fall in the accepted length range of around 25 nt? Any suggestions?
These hits are present throughout the genome, with similar read coverage and read numbers in most (if not all) places. So when I map the reads against the genome using bowtie, I see a very similar pattern in both samples.
Should I even consider checking for a possible difference in abundance of these reads between the two samples?
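To make that question concrete, here is a minimal sketch of what such a check might look like. It assumes per-region read counts have already been extracted from the bowtie alignments for each sample. The region names, the counts, the pseudocount, and the function names are all made up for illustration and are not from my data or from any particular tool.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts to reads per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {region: 1e6 * n / total for region, n in counts.items()}


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """Per-region log2(sample B / sample A) on RPM-normalised counts.

    The pseudocount (an arbitrary choice here) avoids division by zero
    for regions that are covered in only one of the two samples.
    """
    rpm_a = reads_per_million(counts_a)
    rpm_b = reads_per_million(counts_b)
    regions = sorted(set(rpm_a) | set(rpm_b))
    return {
        r: math.log2(
            (rpm_b.get(r, 0.0) + pseudocount) / (rpm_a.get(r, 0.0) + pseudocount)
        )
        for r in regions
    }


# Hypothetical counts: sample B is sequenced twice as deeply as sample A,
# but the relative distribution over regions is identical.
sample_a = {"intron_1": 500, "mRNA_1": 300, "miRNA_1": 200}
sample_b = {"intron_1": 1000, "mRNA_1": 600, "miRNA_1": 400}

for region, lfc in log2_fold_changes(sample_a, sample_b).items():
    print(f"{region}\t{lfc:+.3f}")
```

If the pattern really is the same in both samples, the normalised log2 ratios should sit close to zero everywhere. With real data, a dedicated count-based differential expression package and biological replicates would be more appropriate than a bare ratio like this.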
Secondly:
What are the reads that hit, in large numbers, unannotated (mostly intronic!) regions of the genome?
Any ideas are welcome.
cheers,
Nandan