Hello to All,
I am a newcomer to this area. We recently started a small RNA NGS project to study differential miRNA expression in a cancer. The small RNA sequencing was outsourced and used total RNA isolated with the mirVana kit from 6 cancer cell lines.

Libraries were prepared with the Illumina TruSeq Small RNA preparation kit v2 according to the manufacturer's protocol. The small RNA library was size-selected on a PAGE gel, and the band between 145 bp and 160 bp was taken forward for sequencing. The selected libraries were sequenced as a single-read 50-base run, using the TruSeq SR cluster kit v3-cBot-HS and the TruSeq SBS kit v3-HS, on a single lane of an Illumina HiSeq 1000. The base call files were converted to FASTQ and demultiplexed with CASAVA v1.8.2.

We obtained 31 to 43 million reads per sample, and about 96% of bases were >= Q30 in all samples. We then adapter-trimmed each data file and aligned the reads to the hg19 build (RefSeq and small RNA annotations). Between 80% and 83% of reads aligned. After genic region filtering, however, the miRNA fraction was very low: only 4 to 20% of reads (1.7 to 7 million reads) in these samples.

My questions:
1. What could be the reason for such a low miRNA count in our NGS data? Is it a technical problem or a data analysis (bioinformatics) issue?
2. What is the minimum percentage of miRNA reads in the total small RNA population needed for a valid downstream analysis?

I would be grateful for any help in solving this problem.
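In case it helps with diagnosis, here is a minimal sketch of how the length distribution of the adapter-trimmed reads could be tabulated. Mature miRNAs are roughly 22 nt long, so the share of reads in a window around that length gives a rough, annotation-independent check on the miRNA fraction. The function names and the 20 to 24 nt window are my own assumptions for illustration and are not part of the outsourced pipeline. The code also assumes standard 4-line FASTQ records.

```python
from collections import Counter
import gzip


def read_length_histogram(fastq_path):
    """Count reads by sequence length in a FASTQ file (plain or gzipped).

    Assumes standard 4-line records (header, sequence, '+', qualities).
    """
    opener = gzip.open if fastq_path.endswith(".gz") else open
    lengths = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                lengths[len(line.strip())] += 1
    return lengths


def fraction_in_range(lengths, low=20, high=24):
    """Fraction of reads whose length falls in [low, high].

    The 20-24 nt default is an assumed window around typical mature miRNA length.
    """
    total = sum(lengths.values())
    if total == 0:
        return 0.0
    in_range = sum(n for length, n in lengths.items() if low <= length <= high)
    return in_range / total
```

Calling `read_length_histogram` on a trimmed FASTQ file (the path is whatever your pipeline produced) and passing the result to `fraction_in_range` gives a single number per sample. A histogram dominated by much shorter or much longer reads would suggest degradation fragments or other small RNA classes rather than a mapping problem, though this is only a rough indicator.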