Hello,
I have a trouble using htseq-count, I used TopHat to aling my data, but when I tried to count the number of reads all the genes present 0 read.
I checked the chromosome names in ./gtf file: cut -f 1 -d " " ~/HG19/genes.gtf | cut -f 1 | sort | uniq So all of them are started with "chr". The same result was accepted by bowtie2-inspect --names ~/HG19/genome. (in .gtf file, I've seen the numbers like "chrHSCHR9_2_CTG3" and don't know what it is).
1) programs/tophat2 -p 4 -r 50 -G ~/HG19/genes.gtf -o tst_output HG19/genome tst_fastq Result: accepted_hits.bam deletions.bed junctions.bed prep_reads.info align_summary.txt insertions.bed logs unmapped.bam
samtools view -H ./accepted_hits.bam:
less align_summary.txt:
2) samtools view -h tst_output/accepted_hits.bam | htseq-count - ~/HG19/genes.gtf > tst_count.txt tail tst_count.txt:
I have no idea what should I do, so could you please help me.
PS: after samtools sort -n -o temp-sorted.bam accepted_hits.bam I have the same result ("SO:queryname" in temp-sorted.bam file).
Thanks,
Maria.
I have a trouble using htseq-count, I used TopHat to aling my data, but when I tried to count the number of reads all the genes present 0 read.
I checked the chromosome names in ./gtf file: cut -f 1 -d " " ~/HG19/genes.gtf | cut -f 1 | sort | uniq So all of them are started with "chr". The same result was accepted by bowtie2-inspect --names ~/HG19/genome. (in .gtf file, I've seen the numbers like "chrHSCHR9_2_CTG3" and don't know what it is).
1) programs/tophat2 -p 4 -r 50 -G ~/HG19/genes.gtf -o tst_output HG19/genome tst_fastq Result: accepted_hits.bam deletions.bed junctions.bed prep_reads.info align_summary.txt insertions.bed logs unmapped.bam
samtools view -H ./accepted_hits.bam:
@HD VN:1.0 SO:coordinate @SQ SN:chr1 LN:249250621 @SQ SN:chr10 LN:135534747 @SQ SN:chr11 LN:135006516 @SQ SN:chr12 LN:133851895 @SQ SN:chr13 LN:115169878 @SQ SN:chr14 LN:107349540 @SQ SN:chr15 LN:102531392 @SQ SN:chr16 LN:90354753 @SQ SN:chr17 LN:81195210 @SQ SN:chr18 LN:78077248 @SQ SN:chr19 LN:59128983 @SQ SN:chr2 LN:243199373 @SQ SN:chr20 LN:63025520 @SQ SN:chr21 LN:48129895 @SQ SN:chr22 LN:51304566 @SQ SN:chr3 LN:198022430 @SQ SN:chr4 LN:191154276 @SQ SN:chr5 LN:180915260 @SQ SN:chr6 LN:171115067 @SQ SN:chr7 LN:159138663 @SQ SN:chr8 LN:146364022 @SQ SN:chr9 LN:141213431 @SQ SN:chrM LN:16571 @SQ SN:chrX LN:155270560 @SQ SN:chrY LN:59373566 @PG ID:TopHat VN:2.1.1 CLrograms/tophat -p 4 -r 50 -G ~/HG19/genes.gtf -o tst_output HG19/genome tst.fastq
Reads: Input : 99 Mapped : 76 (76.8% of input) of these: 1 ( 1.3%) have multiple alignments (0 have >20) 76.8% overall read mapping rate.
ENST00000610276 0 ENST00000610277 0 ENST00000610278 0 ENST00000610279 0 ENST00000610280 0 __no_feature 40 __ambiguous 27 __too_low_aQual 0 __not_aligned 0 __alignment_not_unique 15
PS: after samtools sort -n -o temp-sorted.bam accepted_hits.bam I have the same result ("SO:queryname" in temp-sorted.bam file).
Thanks,
Maria.
Comment