We are using Bowtie to generate the BAM file. We intend to use TopHat, Cufflinks, and HTSeq to count the short reads, but we cannot find any GTF file for our species. Can we still use these tools? We can only find GFF3 files; is it possible to generate a GTF file ourselves from the GFF3 files?
-
I found a great script for converting GFF3 to GTF and also one for converting Cufflinks GTF to GFF3, both of which have saved me much hassle when using data from, and getting data into, GBrowse. By default the gff3togtf script creates gene_id entries in the attributes column, but Cufflinks will only work with gene_name. I've left the script in its original form here, but you should either change the script or post-process the GTF file it produces, e.g. with a sed command.
I've attached them both, and as soon as our server with my notes stored on it is back up again I will edit this reply to link to the originals to make sure credit is given to the right people.
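For readers who cannot get the attached scripts, here is a minimal sketch of what such a GFF3-to-GTF conversion does. It is not the attached gff3togtf script. It assumes a simple three-level GFF3 (gene, then mRNA or transcript, then exon/CDS linked through ID= and Parent= attributes), and the function names are made up for this illustration. Real GFF3 files often need more care (other feature types, multi-level parents, URL-escaped values).

```python
# Hypothetical helper names; a simplified sketch, not the attached script.
TRANSCRIPT_TYPES = {"mRNA", "transcript"}  # assumption: how transcripts are typed


def parse_gff3_attributes(field):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff3_to_gtf(lines):
    """Return GTF lines for the exon and CDS features found in GFF3 lines."""
    rows = []
    transcript_to_gene = {}
    # First pass: collect features and remember each transcript's parent gene.
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        attrs = parse_gff3_attributes(cols[8])
        rows.append((cols, attrs))
        if cols[2] in TRANSCRIPT_TYPES and "ID" in attrs:
            transcript_to_gene[attrs["ID"]] = attrs.get("Parent", attrs["ID"])
    # Second pass: write exon/CDS lines with GTF-style attributes.
    out = []
    for cols, attrs in rows:
        if cols[2] not in ("exon", "CDS") or "Parent" not in attrs:
            continue
        for transcript_id in attrs["Parent"].split(","):
            # Fall back to the transcript ID if no parent gene was seen.
            gene_id = transcript_to_gene.get(transcript_id, transcript_id)
            gtf_attrs = 'gene_id "%s"; transcript_id "%s";' % (gene_id, transcript_id)
            out.append("\t".join(cols[:8] + [gtf_attrs]))
    return out
```

To use it, pass an open GFF3 file (or any list of lines) to gff3_to_gtf and write the returned lines to a .gtf file, one per line.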
-
-
About HTSeq
Originally posted by Simon Anders:
Be sure to read the man page of htseq-count. There are options to specify what the gene ID attribute is called in your GFF file (Ensembl's standard is "gene_id", but as 'natstreet' just said, you also see 'gene_name', 'ID' or whatever).
Thanks for your advice. It seems that I can get HTSeq running; however, I only get the following output:
50972 GFF lines processed.
100000 reads processed.
200000 reads processed.
300000 reads processed.
400000 reads processed.
500000 reads processed.
600000 reads processed.
700000 reads processed.
727886 reads processed.
13101 229869
no_feature 498017
ambiguous 0
too low aQual 0
not aligned 4460065
But I cannot get the counts for each feature. Could you tell me what I should do to get the number of short reads for each gene or each exon?
Thanks!
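A note on reading this output: htseq-count writes its progress messages to stderr and the count table (one feature ID and one count per line) to stdout, so redirecting stdout to a file separates the two. In the output above there is only one feature line ("13101"), which may mean that the attribute used as the feature ID has the same value on every GFF line; that is a guess, and htseq-count's -i (--idattr) and -t (--type) options are the place to check. The sketch below splits such output into per-feature counts and summary counters. The function name and the set of summary names are assumptions for this illustration; newer HTSeq versions prefix the summary names with two underscores.

```python
# Summary counters printed after the per-feature counts (assumed set of names).
SUMMARY_NAMES = {
    "no_feature",
    "ambiguous",
    "too_low_aQual",
    "not_aligned",
    "alignment_not_unique",
}


def split_htseq_counts(lines):
    """Split htseq-count output into (feature_counts, summary_counts)."""
    features, summary = {}, {}
    for line in lines:
        parts = line.strip().rsplit(None, 1)  # name, then the trailing count
        if len(parts) != 2:
            continue
        name, count = parts
        if not count.isdigit():
            continue  # progress messages such as "100000 reads processed."
        # Normalise "__no_feature" and "too low aQual" style spellings.
        key = name.lstrip("_").replace(" ", "_")
        if key in SUMMARY_NAMES:
            summary[key] = int(count)
        else:
            features[name] = int(count)
    return features, summary
```

With one distinct ID per gene in the annotation, the features dictionary would hold one entry per gene rather than a single entry.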
-
-
So you can supply TopHat with a GTF file of annotated transcripts via the --GTF option. Reads are then mapped to those transcripts first, and afterwards to the whole genome, with or without novel junction discovery in this second stage. As I understand it, this is the behavior from TopHat 1.4 onward.
I'm curious to know how it was before 1.4. I think you could already give TopHat a GTF file, but it used it second. Am I right? If so, what is the difference between using the GTF file first and using it second, after the genome?
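To make the two modes concrete, here is a small sketch that only assembles the TopHat command line; it does not run anything and says nothing about pre-1.4 behavior. The -G (--GTF), --no-novel-juncs and -o options are TopHat options; the helper name, the default output directory, and the file names in the usage note are placeholders chosen for this example.

```python
def build_tophat_command(bowtie_index, reads, gtf=None,
                         novel_junctions=True, output_dir="tophat_out"):
    """Assemble a TopHat argument list (hypothetical helper, nothing is run)."""
    cmd = ["tophat", "-o", output_dir]
    if gtf is not None:
        # Supply the annotated transcripts.
        cmd += ["-G", gtf]
        if not novel_junctions:
            # Restrict junctions to those in the annotation.
            cmd.append("--no-novel-juncs")
    cmd.append(bowtie_index)
    cmd.append(",".join(reads))  # several read files are comma-separated
    return cmd
```

For example, build_tophat_command("genome_index", ["reads.fq"], gtf="genes.gtf", novel_junctions=False) gives the argument list for an annotation-only run, which could then be passed to subprocess.run.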