
**saskak** · 11-16-2015, 06:22 AM

Additional frame inconsistencies

Unfortunately, no one has suggested a reasonable explanation for my previous problems.

In addition, I found a few frame inconsistencies, i.e. in column 8 (counting from 1).

For gene FBgn0033313, transcript FBtr0305081, there is something not quite right with the frame (column 8) of the start_codon features.
The first few CDS lines in the gff for this gene and transcript read:

2R FlyBase CDS 8616078 8616078 . + 0 Parent=FBtr0305081
2R FlyBase CDS 8616327 8616516 . + 2 Parent=FBtr0310448,FBtr0310449,FBtr0305081
2R FlyBase CDS 8616700 8618171 . + 1 Parent=FBtr0290112,FBtr0301363,FBtr0310448,FBtr0310449,FBtr0305080,FBtr0305081,FBtr0305082
2R FlyBase CDS 8618234 8618461 . + 2 Parent=FBtr0290112,FBtr0301363,FBtr0310448,FBtr0310449,FBtr0305080,FBtr0305081,FBtr0305082

I parsed these to:

2R FlyBase start_codon 8616078 8616078 . + 0 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";
2R FlyBase start_codon 8616327 8616328 . + 2 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";
2R FlyBase CDS 8616078 8616078 . + 0 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";
2R FlyBase CDS 8616327 8616516 . + 2 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";

However, in FlyBase's own gtf the frame of the second start_codon is different:

2R FlyBase start_codon 8616078 8616078 . + 0 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";
2R FlyBase start_codon 8616327 8616328 . + 1 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";
2R FlyBase CDS 8616078 8616078 15 + 0 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";
2R FlyBase CDS 8616327 8616516 15 + 2 gene_id "FBgn0033313"; gene_symbol "Cirl"; transcript_id "FBtr0305081"; transcript_symbol "Cirl-RG";

Note that the frame is 1 for the start_codon at 8616327-8616328. This start_codon piece has two bases, the second and third bases of the codon whose first base is at 8616078, so according to the gtf2.2 guidelines the frame should be 2, i.e. the third base in the feature is the start of the next codon. This is not the only case of such mis-framing; I counted quite a few.
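The expected frames can be cross-checked with a few lines of Python. This is only a sketch: the helper name `expected_frames` is hypothetical, and it assumes the coding sequence begins with a complete codon (frame 0 at the 5' end) and that the frame is the number of bases to skip before the next whole codon.

```python
def expected_frames(segments, strand="+"):
    """Return [((start, end), frame), ...] for the pieces of one transcript.

    segments: 1-based inclusive (start, end) pairs, in any order.
    Assumption: the first coding base (5' end) opens a codon, so the
    first piece has frame 0.
    """
    # Walk the pieces 5' to 3': ascending on '+', descending on '-'.
    ordered = sorted(segments, reverse=(strand == "-"))
    result = []
    consumed = 0  # coding bases seen before the current piece
    for start, end in ordered:
        # Bases still needed to finish the codon left open so far.
        frame = (3 - consumed % 3) % 3
        result.append(((start, end), frame))
        consumed += end - start + 1
    return result


# CDS pieces of FBtr0305081 (Cirl-RG) as listed in the gff above.
cds = [
    (8616078, 8616078),
    (8616327, 8616516),
    (8616700, 8618171),
    (8618234, 8618461),
]
# The two pieces of the split start codon.
start_codon = [(8616078, 8616078), (8616327, 8616328)]

print([frame for _, frame in expected_frames(cds)])          # [0, 2, 1, 2]
print([frame for _, frame in expected_frames(start_codon)])  # [0, 2]
```

For this transcript the sketch gives 0, 2, 1, 2 for the four CDS lines and 0, 2 for the two start_codon pieces, which agrees with the frames in the gff rather than with the 1 in FlyBase's gtf.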

I checked this in Ensembl's gtf and this appears to be 2 as I parsed it. Do you think I should contact FlyBase to inquire about these?

Many thanks indeed for any help.

**westerman** · 11-16-2015, 01:43 PM

Originally posted by saskak

I checked this in Ensembl's gtf and this appears to be 2 as I parsed it. Do you think I should contact FlyBase to inquire about these?

Yes. They will know their dataset better than most of us on SeqAnswers. If there is a problem then they will appreciate knowing about it.

**saskak** · 11-27-2015, 09:17 AM

Solved

I contacted FlyBase, and it turned out they had one or more bugs in their annotation pipeline. This should be fixed in the 6.08 gtf file.

**dpryan** · 11-27-2015, 01:40 PM

Thanks for the follow-up and for getting this corrected!


Something wrong in FlyBase's gtf (gff to gtf conversion)
