**GenoMax** · 05-30-2017, 04:33 AM

If you are just starting out, do not use TopHat. It is no longer the state of the art for RNA-seq data analysis.

You could use any other splice-aware aligner or, if you want to stay in the "family", HISAT2/StringTie is the currently recommended software from the same folks who developed TopHat.

**GenoMax** · 05-30-2017, 04:35 AM

If your reference, indexes and annotations do not match exactly (in terms of gene names), then you are not going to get the counting to work. For counting, also consider using featureCounts. It is much faster, can produce a count matrix from multiple BAM files, and can take unsorted BAMs.
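One quick way to check for a mismatch is to compare the sequence names used in the annotation with those in the reference. The Python sketch below is only an illustration. It assumes the mismatch shows up in sequence names (for example `chr1` versus `1`), which is one common case, and that a samtools-style `.fai` index of the genome FASTA is available. The function names, file names and sample data are made up for this example.

```python
# Hypothetical helper names; the inputs are any iterables of text lines,
# e.g. open("genes.gtf") and open("genome.fa.fai") (file names assumed).

def gtf_seqnames(gtf_lines):
    """Collect the first column (seqname) of every non-comment GTF line."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def fai_seqnames(fai_lines):
    """Collect the first column (sequence name) of a FASTA index (.fai)."""
    return {line.split("\t", 1)[0] for line in fai_lines if line.strip()}


def compare_seqnames(gtf_lines, fai_lines):
    """Report which sequence names are shared and which appear on one side only."""
    gtf = gtf_seqnames(gtf_lines)
    ref = fai_seqnames(fai_lines)
    return {
        "shared": gtf & ref,
        "gtf_only": gtf - ref,
        "reference_only": ref - gtf,
    }


# Made-up example: Ensembl-style names in the GTF, UCSC-style in the reference.
example_gtf = [
    "#!genome-build example\n",
    '1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "G1";\n',
    '2\tensembl\texon\t300\t400\t.\t-\t.\tgene_id "G2";\n',
]
example_fai = ["chr1\t1000\t6\t60\t61\n", "chr2\t900\t1030\t60\t61\n"]

report = compare_seqnames(example_gtf, example_fai)
# An empty "shared" set means no annotated feature can overlap an aligned read.
print(report["shared"])
```

If `shared` comes back empty or much smaller than expected, the annotation and the reference are probably from different sources or builds.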

**tirohia** · 05-31-2017, 05:34 PM

I'm not just starting out. I used the exact same pipeline to process a time course about a year-ish ago - one of the reasons I was thinking that the gtf/genome files, as you say, might not be matching. In an ideal world, there'd be an associated gtf file alongside the pre-generated bowtie indexes.

Thanks for pointing me towards HISAT2, will investigate/align. Though, just because it's no longer cutting edge doesn't mean that TopHat is now useless. The alignment should, at least, be reasonable. Moving to HISAT2 will leave me with the same conundrum of not being sure that the pre-built indexes are the same as the gtf file I get from Ensembl. Unless I go and build one myself, that is.

So now it just gets strange. I had found featureCounts and ran it yesterday. It's giving me a very small proportion of reads mapping to genes and a large proportion being multi-mapped. That leaves me with two possibilities: either a) there's something unexpected going on in my data, or b) given that it's using the same gtf that htseq-count used to produce zero counts, there's something odd going on in the gtf/genome file combo. I'm guessing a), but am curious as to why htseq-count wasn't/isn't working.
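To put numbers on "very small" and "large", the `.summary` file that featureCounts writes next to its count table can be turned into proportions. Here is a rough Python sketch, assuming the usual tab-separated layout: a `Status` header row followed by one row per category, such as `Assigned` or `Unassigned_MultiMapping`. Exact category names can vary between versions, the function name is hypothetical, and the counts below are invented for illustration.

```python
def summary_fractions(summary_lines):
    """Turn featureCounts summary rows into per-sample fractions.

    Assumes a header row ("Status", then one column per BAM file)
    followed by one row of integer counts per status category.
    """
    rows = [line.rstrip("\n").split("\t") for line in summary_lines if line.strip()]
    header, body = rows[0], rows[1:]
    result = {}
    for col, sample in enumerate(header[1:], start=1):
        counts = {row[0]: int(row[col]) for row in body}
        total = sum(counts.values())
        result[sample] = {
            status: (count / total if total else 0.0)
            for status, count in counts.items()
        }
    return result


# Invented counts, only to show the shape of the input and output.
example_summary = [
    "Status\tsample1.bam\n",
    "Assigned\t100\n",
    "Unassigned_MultiMapping\t700\n",
    "Unassigned_NoFeatures\t200\n",
]

fractions = summary_fractions(example_summary)
for status, frac in fractions["sample1.bam"].items():
    print(f"{status}\t{frac:.1%}")
```

A large `Unassigned_NoFeatures` share would be more consistent with an annotation problem, while a large `Unassigned_MultiMapping` share points more toward the alignments or the data themselves.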

**GenoMax** · 06-01-2017, 04:06 AM

Originally posted by tirohia:

> I'm not just starting out. I used the exact same pipeline to process a time course about a year-ish ago - one of the reasons I was thinking that the gtf/genome files, as you say, might not be matching. In an ideal world, there'd be an associated gtf file alongside the pre-generated bowtie indexes.

You can get those from the Illumina iGenomes site. The bundle contains matching sequence, annotations and indexes, the whole bit.

> Thanks for pointing me towards HISAT2, will investigate/align. Though, just because it's no longer cutting edge doesn't mean that TopHat is now useless.

Fair point. The authors of TopHat now have this note on their site:

> Please note that TopHat has entered a low maintenance, low support stage as it is now largely superseded by HISAT2 which provides the same core functionality (i.e. spliced alignment of RNA-Seq reads), in a more accurate and much more efficient way.

No counts from HTSeq
