Hi!
I've seen a lot of threads on this, but I can't figure it out. I have 16-60 million single-end reads in each library. I used TopHat 2 with the UCSC GTF file for hg19.
This is my code:
samtools view accepted_hits.bam | \
htseq-count -m intersection-nonempty -s no -a 10 \
- UCSC/hg19/genes.gtf \
> Out.txt
Here is a typical result; the counts are proportional to the library size:
no_feature 7013689
ambiguous 269370
too_low_aQual 0
not_aligned 0
alignment_not_unique 6645341
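To express these counters as a share of everything htseq-count tallied, a minimal sketch along the following lines could be used. It assumes the output file has two whitespace-separated columns (name, count) and that the special counters appear either bare, as above, or with a `__` prefix as in newer HTSeq versions. The function name `summarise_counts` is made up for illustration.

```shell
# Print each special htseq-count counter with its share of all counted records.
# Assumption: input has two whitespace-separated columns (feature name, count).
summarise_counts() {
  awk '
    # Sum every line (genes and special counters) to get the denominator.
    { total += $2 }
    # Remember the special counters, with or without the "__" prefix.
    $1 ~ /^(__)?(no_feature|ambiguous|too_low_aQual|not_aligned|alignment_not_unique)$/ {
      special[$1] = $2
    }
    END {
      if (total == 0) exit
      for (k in special)
        printf "%s\t%d\t%.1f%%\n", k, special[k], 100 * special[k] / total
    }
  ' "$1" | sort
}
```

Calling `summarise_counts Out.txt` would print one line per special counter. The denominator is the sum of all lines in the file, so reads with several reported alignments may be counted more than once in it.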
Why do I get, on average, 25-50% of reads counted as "no_feature",
"ambiguous" or "alignment_not_unique"?
This is RNA-seq data. If I must inspect the alignments visually, how should I proceed?