  • Cufflinks annotating outputs incorrectly

    Hey,

    I've run the Cufflinks pipeline with both the FlyBase release 5 and UCSC release 3 Drosophila melanogaster annotations, and I'm getting strange annotations in the outputs. The example below uses the UCSC annotation, since it ended up working better with Cufflinks overall (the same problem also occurred with the FlyBase annotation).

    CuffMerge entries for "galectin" gene

    chr2L Cufflinks exon 21821 22941 . + . gene_id "XLOC_000002"; transcript_id "TCONS_00000005"; exon_number "1"; gene_name "galectin"; oId "CUFF.59.1"; nearest_ref "NM_001272859"; class_code "j"; tss_id "TSS2";
    chr2L Cufflinks exon 22998 23422 . + . gene_id "XLOC_000002"; transcript_id "TCONS_00000005"; exon_number "2"; gene_name "galectin"; oId "CUFF.59.1"; nearest_ref "NM_001272859"; class_code "j"; tss_id "TSS2";
    chr2L Cufflinks exon 74903 75018 . + . gene_id "XLOC_000002"; transcript_id "TCONS_00000005"; exon_number "3"; gene_name "galectin"; oId "CUFF.59.1"; nearest_ref "NM_001272859"; class_code "j"; tss_id "TSS2";
    chr2L Cufflinks exon 75078 76276 . + . gene_id "XLOC_000002"; transcript_id "TCONS_00000005"; exon_number "4"; gene_name "galectin"; oId "CUFF.59.1"; nearest_ref "NM_001272859"; class_code "j"; tss_id "TSS2";

    UCSC annotation entries for "galectin"

    chr2L unknown exon 71757 71804 . + . gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown exon 71950 72081 . + . gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown CDS 72013 72081 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown start_codon 72013 72015 . + . gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown exon 72387 72977 . + . gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown CDS 72603 72977 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown start_codon 72603 72605 . + . gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown exon 73485 73692 . + . gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown exon 73485 73692 . + . gene_id "galectin"; gene_name "galectin"; p_id "P9464"; transcript_id "NM_001272859"; tss_id "TSS3981";
    chr2L unknown CDS 73570 73692 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown start_codon 73570 73572 . + . gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown exon 73820 73897 . + . gene_id "galectin"; gene_name "galectin"; p_id "P9464"; transcript_id "NM_001272859"; tss_id "TSS3981";
    chr2L unknown exon 74129 74572 . + . gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown CDS 74501 74572 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown start_codon 74501 74503 . + . gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown CDS 74903 75018 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown CDS 74903 75018 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown CDS 74903 75018 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown CDS 74903 75018 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown exon 74903 75018 . + . gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown exon 74903 75018 . + . gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown exon 74903 75018 . + . gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown exon 74903 75018 . + . gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown exon 74903 75018 . + . gene_id "galectin"; gene_name "galectin"; p_id "P9464"; transcript_id "NM_001272859"; tss_id "TSS3981";
    chr2L unknown CDS 75078 76095 . + 1 gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown CDS 75078 76095 . + 1 gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown CDS 75078 76095 . + 1 gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown CDS 75078 76095 . + 1 gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown exon 75078 76211 . + . gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown exon 75078 76211 . + . gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown exon 75078 76211 . + . gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown exon 75078 76211 . + . gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown exon 75078 76211 . + . gene_id "galectin"; gene_name "galectin"; p_id "P9464"; transcript_id "NM_001272859"; tss_id "TSS3981";
    chr2L unknown CDS 75280 76095 . + 0 gene_id "galectin"; gene_name "galectin"; p_id "P9464"; transcript_id "NM_001272859"; tss_id "TSS3981";
    chr2L unknown start_codon 75280 75282 . + . gene_id "galectin"; gene_name "galectin"; p_id "P9464"; transcript_id "NM_001272859"; tss_id "TSS3981";
    chr2L unknown stop_codon 76096 76098 . + . gene_id "galectin"; gene_name "galectin"; p_id "P8803"; transcript_id "NM_001169367"; tss_id "TSS3981";
    chr2L unknown stop_codon 76096 76098 . + . gene_id "galectin"; gene_name "galectin"; p_id "P12530"; transcript_id "NM_134643"; tss_id "TSS12137";
    chr2L unknown stop_codon 76096 76098 . + . gene_id "galectin"; gene_name "galectin"; p_id "P18001"; transcript_id "NM_001258884"; tss_id "TSS6545";
    chr2L unknown stop_codon 76096 76098 . + . gene_id "galectin"; gene_name "galectin"; p_id "P7409"; transcript_id "NM_001169366"; tss_id "TSS12421";
    chr2L unknown stop_codon 76096 76098 . + . gene_id "galectin"; gene_name "galectin"; p_id "P9464"; transcript_id "NM_001272859"; tss_id "TSS3981";

    As you can see, in the annotation galectin starts at 71757 on chr2L and ends at 76098 (the end of the stop codon; the last annotated exon runs to 76211). However, Cufflinks has placed it starting at 21821 and ending at 76276.
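    For anyone who wants to check spans like these without reading the GTF by eye, here is a minimal Python sketch. It is not part of Cufflinks; the function name `gene_spans` and its parameters are hypothetical, and the sample lines are abbreviated copies of the records above (attributes trimmed). It splits on whitespace because the tabs were lost when pasting here, which assumes the source column contains no spaces. It reports the smallest start and largest end over the exon records for each gene name.

```python
import re

def gene_spans(gtf_lines, key="gene_name", feature="exon"):
    """Return {(seqname, attribute value): (min start, max end)} for one feature type.

    Hypothetical helper for illustration only; coordinates are taken
    as-is from GTF columns 4 and 5 (1-based, inclusive).
    """
    pattern = re.compile(r'\b%s "([^"]*)"' % re.escape(key))
    spans = {}
    for line in gtf_lines:
        # Columns 1-8 are single tokens; column 9 holds the attributes.
        fields = line.strip().split(None, 8)
        if len(fields) < 9 or fields[0].startswith("#") or fields[2] != feature:
            continue
        match = pattern.search(fields[8])
        if match is None:
            continue
        start, end = int(fields[3]), int(fields[4])
        name = (fields[0], match.group(1))
        lo, hi = spans.get(name, (start, end))
        spans[name] = (min(lo, start), max(hi, end))
    return spans

# Abbreviated copies of the first and last merged exons shown above.
merged_lines = [
    'chr2L Cufflinks exon 21821 22941 . + . gene_id "XLOC_000002"; '
    'transcript_id "TCONS_00000005"; exon_number "1"; gene_name "galectin";',
    'chr2L Cufflinks exon 75078 76276 . + . gene_id "XLOC_000002"; '
    'transcript_id "TCONS_00000005"; exon_number "4"; gene_name "galectin";',
]

# Abbreviated copies of three UCSC records shown above.
reference_lines = [
    'chr2L unknown exon 71757 71804 . + . gene_id "galectin"; '
    'gene_name "galectin"; transcript_id "NM_001258884";',
    'chr2L unknown exon 75078 76211 . + . gene_id "galectin"; '
    'gene_name "galectin"; transcript_id "NM_001272859";',
    'chr2L unknown stop_codon 76096 76098 . + . gene_id "galectin"; '
    'gene_name "galectin"; transcript_id "NM_001272859";',
]

merged_spans = gene_spans(merged_lines)
ref_spans = gene_spans(reference_lines)
print(merged_spans[("chr2L", "galectin")])
print(ref_spans[("chr2L", "galectin")])
```

    With `feature="exon"` the stop_codon record is skipped, so the annotated span here ends at the last exon rather than at the stop codon. On real files, passing an open file handle as `gtf_lines` should work the same way.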

    Any thoughts on this would be very much appreciated.
    Thanks,

    Gordon
