Retrieving sequences
I ran cuffcompare and got the files "cuffcmp.combined.gtf", "cuffcmp.loci", "cuffcmp.stats" and "cuffcmp.tracking". I am interested in unknown transcripts, so I ran grep on the .tracking file; the count came to 3940.
Now, how do I get the sequences of those 3940 transcripts?
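One possible way to get from the grep count to sequences, sketched in Python: collect the transfrag IDs with class code `u` from `cuffcmp.tracking`, then keep only those records from `cuffcmp.combined.gtf`. This assumes the tab-separated layout described in the Cufflinks manual (transfrag ID in column 1, class code in column 4) and that "unknown" here means class code `u`. The function names and the output file name are hypothetical.

```python
import re

# Matches the transcript_id attribute in the ninth GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def ids_with_class_code(tracking_lines, class_code="u"):
    """Return transfrag IDs (column 1) whose class code (column 4) matches."""
    ids = set()
    for line in tracking_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4 and fields[3] == class_code:
            ids.add(fields[0])
    return ids


def filter_gtf(gtf_lines, keep_ids):
    """Yield GTF records whose transcript_id attribute is in keep_ids."""
    for line in gtf_lines:
        match = TRANSCRIPT_ID.search(line)
        if match and match.group(1) in keep_ids:
            yield line


def write_unknown_gtf(tracking_path, gtf_path, out_path):
    """Write the records of class-code-u transcripts to out_path; return the ID count."""
    with open(tracking_path) as tracking:
        keep = ids_with_class_code(tracking)
    with open(gtf_path) as gtf, open(out_path, "w") as out:
        out.writelines(filter_gtf(gtf, keep))
    return len(keep)
```

Calling `write_unknown_gtf("cuffcmp.tracking", "cuffcmp.combined.gtf", "unknown.gtf")` would write the filtered annotation. That file can then be passed, together with the genome FASTA, to a sequence extractor such as `gffread -w unknown.fa -g genome.fa unknown.gtf` (the file names are placeholders).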
-
In addition to the above, can someone help me use the cuffcompare output I posted to answer these questions:
How do I determine the total number of assembled transcripts (known, partial and novel) that are compatible with the existing annotation?
How do I determine the total number of unannotated spliced isoforms of known genes?
How do I determine the number of transcripts found in intergenic regions within a certain distance, such as 1,000 bp, of known genes?
Thanks.
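A possible starting point for the first two questions is to tally the class codes in the `.tracking` file. Per the Cufflinks manual, `=` is a complete intron-chain match, `c` is contained in a reference transcript, `j` is a potentially novel isoform sharing at least one splice junction with a reference transcript, and `u` is an unknown, intergenic transcript. Mapping the questions onto these codes is an assumption, and the function name is hypothetical. The distance question is not covered by this sketch.

```python
from collections import Counter


def class_code_counts(tracking_lines):
    """Count transfrags per cuffcompare class code (column 4 of the tracking file)."""
    counts = Counter()
    for line in tracking_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4:
            counts[fields[3]] += 1
    return counts


def summarize(tracking_path):
    """Print one line per class code, most frequent first."""
    with open(tracking_path) as handle:
        for code, count in class_code_counts(handle).most_common():
            print(code, count)
```

Under the mapping above, the `j` count would be the number of unannotated spliced isoforms of known genes, and `=` plus `c` would approximate the transcripts compatible with the annotation.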
-
Can someone help me out with this cuffcompare output?
I'm imploring someone to please help or provide a link that will give me more information.
-
Cuffcompare output
Hi folks, I need your help with this issue. I ran cuffcompare and got the following output:
#= Summary for dataset: cufflinks_6/transcripts.gtf :
# Query mRNAs : 41274 in 33119 loci (24486 multi-exon transcripts)
# (6071 multi-transcript loci, ~1.2 transcripts per locus)
# Reference mRNAs : 26679 in 24525 loci (20258 multi-exon)
# Corresponding super-loci: 14030
#--------------------|   Sn   |   Sp   |  fSn   |  fSp
           Base level:  66.3     49.7      -        -
           Exon level:  48.9     60.9     51.2     63.8
         Intron level:  63.4     88.3     63.7     88.8
   Intron chain level:  31.1     25.7     39.1     32.3
     Transcript level:   0.0      0.0      0.4      0.2
          Locus level:  25.4     18.8     29.7     21.9
Matching intron chains: 6303
Matching loci: 6236
Missed exons: 69277/207203 ( 33.4%)
Novel exons: 24838/166289 ( 14.9%)
Missed introns: 59718/181523 ( 32.9%)
Novel introns: 6896/130363 ( 5.3%)
Missed loci: 9351/24525 ( 38.1%)
Novel loci: 14555/33119 ( 43.9%)
Total union super-loci across all input datasets: 35358
(12030 multi-transcript, ~3.8 transcripts per locus)
Can someone help me with the interpretation of this result? I searched through the manual but found no clue.
Thanks.
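One way to read the summary, checked against the numbers in this output: Sn (sensitivity) appears to be the number of matches divided by the reference total, and Sp (specificity) the same matches divided by the query total. The sketch below recomputes a few of the reported percentages from the counts posted above. The variable names are made up for illustration, and treating the multi-exon counts as the intron-chain denominators is an inference from the arithmetic, not something stated in the output.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts copied from the summary in the post.
matching_loci = 6236
reference_loci = 24525
query_loci = 33119
matching_intron_chains = 6303
reference_multi_exon = 20258
query_multi_exon = 24486

locus_sn = percent(matching_loci, reference_loci)   # reported as 25.4
locus_sp = percent(matching_loci, query_loci)       # reported as 18.8
chain_sn = percent(matching_intron_chains, reference_multi_exon)  # reported as 31.1
chain_sp = percent(matching_intron_chains, query_multi_exon)      # reported as 25.7

# "Missed" is relative to the reference, "Novel" relative to the query.
missed_loci = percent(9351, reference_loci)  # reported as 38.1
novel_loci = percent(14555, query_loci)      # reported as 43.9

print(locus_sn, locus_sp, chain_sn, chain_sp, missed_loci, novel_loci)
```

Read this way, a locus-level Sn of 25.4 means about a quarter of the reference loci were recovered, and the 43.9% novel loci are query loci with no counterpart in the reference.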
-
Hi,
Just to confirm, Ensembl does use 1-based coordinates for the genome (in gtf and other files).
-
Originally posted by Thomas Doktor:
You also need to convert the chromosome coordinates in the Ensembl GTF from the first base being 1 to first base being 0 (simply subtract 1 from the start coordinate).
I think Ensembl GTF is fine itself without subtracting 1 from the start coordinate, since GTF start is 1-based.
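To make the coordinate point concrete, here is a small sketch of the one case where subtracting 1 does apply: converting a GTF interval (1-based, end inclusive) to a BED-style interval (0-based, end exclusive). The function name is hypothetical. No such shift is needed when the GTF is passed to cuffcompare as is.

```python
def gtf_to_bed_interval(start, end):
    """Convert a 1-based inclusive GTF interval to a 0-based half-open interval."""
    if start < 1 or end < start:
        raise ValueError("GTF coordinates must satisfy 1 <= start <= end")
    # Only the start shifts; the inclusive end equals the exclusive end numerically.
    return start - 1, end


bed_start, bed_end = gtf_to_bed_interval(1, 100)
print(bed_start, bed_end, bed_end - bed_start)
```

The feature length is `end - start + 1` in GTF terms and `bed_end - bed_start` in BED terms, so both describe the same bases.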
-
As pointed out, the standard Ensembl GTF file contains chromosome numbers (1,2,3,..) instead of chromosome identifiers (chr1,chr2,chr3,...) so you need to convert these. You also need to convert the chromosome coordinates in the Ensembl GTF from the first base being 1 to first base being 0 (simply subtract 1 from the start coordinate).
After these two conversion steps you should be able to use the Cufflinks suite of programs.
EDIT
You do not need to edit the coordinates in the GTF file.
Last edited by Thomas Doktor; 12-06-2010, 11:25 AM. Reason: Users should not edit the feature coordinates in GTF files
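The chromosome renaming step could look roughly like the sketch below, which prefixes `chr` to the sequence name and leaves every coordinate untouched. Mapping `MT` to `chrM` assumes UCSC-style names in the alignment reference, and unplaced scaffolds may need their own mapping. The function names are hypothetical.

```python
def add_chr_prefix(gtf_line):
    """Rename the sequence column of one GTF line; coordinates are not modified."""
    if gtf_line.startswith("#") or "\t" not in gtf_line:
        return gtf_line  # comment, blank or malformed line: pass through unchanged
    seqname, rest = gtf_line.split("\t", 1)
    if seqname.startswith("chr"):
        return gtf_line  # already in chr-prefixed style
    # Assumption: Ensembl "MT" corresponds to "chrM" in UCSC-style naming.
    new_name = "chrM" if seqname == "MT" else "chr" + seqname
    return new_name + "\t" + rest


def convert_gtf(in_path, out_path):
    """Write a copy of the GTF at in_path with renamed sequence names."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(add_chr_prefix(line))
```

The converted file would then be the one passed to `cuffcompare -r`.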
-
cuffcompare output
Hi there,
I am running Cufflinks 'cuffcompare' on transcripts.gtf (produced by cufflinks) and comparing it with a .gtf file downloaded from Ensembl,
as such:
./cuffcompare -r /home/Homo_sapiens.GRCh37.55.gtf transcripts.gtf
and I seem to get no matches at all between the files?!
#= Summary for dataset: transcripts.gtf :
# Total mRNAs : 17735 in 17537 loci (17696 multi-exon)
# Reference mRNAs : 99330 in 43502 loci (82822 multi-exon)
# Corresponding super-loci: 0
#--------------------|   Sn   |   Sp   |  fSn   |  fSp
           Base level:   0.0      0.0      -        -
           Exon level:   0.0      0.0      0.0      0.0
         Intron level:   0.0      0.0      0.0      0.0
   Intron chain level:   0.0      0.0      0.0      0.0
     Transcript level:   0.0      0.0      0.0      0.0
          Locus level:   0.0      0.0      0.0      0.0
Missed exons: 353318/353318 (100.0%)
Wrong exons: 45831/45831 (100.0%)
Missed introns: 272474/272474 (100.0%)
Wrong introns: 28264/28264 (100.0%)
Missed loci: 0/43502 ( 0.0%)
Wrong loci: 0/17537 ( 0.0%)
Has anyone else tried this? Where did you get your reference .gtf from? I've used this file previously, when TopHat itself calculated RPKM values, and it worked fine.
Thanks
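A summary with zero corresponding super-loci is what one would expect if the two files share no sequence names at all. A quick check, sketched here with hypothetical function names, is to compare the first column of both GTF files.

```python
def seqnames(gtf_lines):
    """Collect the distinct sequence names (column 1) from GTF lines."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def shared_seqnames(reference_lines, query_lines):
    """Return the sequence names present in both annotations."""
    return seqnames(reference_lines) & seqnames(query_lines)


def report(reference_path, query_path):
    """Print the shared names, or a warning if there are none."""
    with open(reference_path) as ref, open(query_path) as qry:
        shared = shared_seqnames(ref, qry)
    print(sorted(shared) if shared else "no sequence names in common")
```

If `report("/home/Homo_sapiens.GRCh37.55.gtf", "transcripts.gtf")` finds nothing in common (for example `1` in one file and `chr1` in the other), the naming scheme is the likely cause rather than the coordinates.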