Seqanswers Leaderboard Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • RockChalkJayhawk
    Senior Member
    • Mar 2009
    • 192

    Cuffdiff multi-protein vs multi-promoter

    I was hoping someone could clarify. I don't quite understnd the difference between the x_y_tss and x_y_CDS. I know it has something to do with the tss_ID and p_IDs, and I've read the manual 10 times. Does the x_y_tss tests mean that transcripts from the same gene, but at same promoter site are expressed differently and the x_y_CDS mean alternative transcripts but same promoter? In other words tss is for exon skipping and CDS is for alternative promoters?

    If this is the case, what is the difference in these differential expression tests from the cds.diff, splicing.diff, and promoters.diff files?
  • Cole Trapnell
    Senior Member
    • Nov 2008
    • 213

    #2
    We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.

    In our manual and terminology, "splicing" refers only to the processing of a primary transcript, so alternative TSS doesn't strictly fall under "splicing". I realize that many people group alternative TSS under "alternative splicing".

    So within a given gene:

    X_Y_tss_group_exp has rows that are groups of transcripts that share a tss_id, and gives the total FPKM for each TSS group
    X_Y_gene_exp has rows that are groups of transcripts that share a gene_id, and gives the total FPKM for each gene
    X_Y_cds_exp has rows that are groups of transcripts that share a p_id, and gives the total FPKM for each CDS group

    X_Y_splicing has rows that are groups of transcripts that share a tss_id, and gives the change in relative abundance of transcripts that share a tss_id

    X_Y_promoters has rows that are groups of primary transcripts that share a gene_id. There is one primary transcript for each tss_id, and its expression is given in X_Y_tss_group_exp. X_Y_promoters gives the change in relative abundance of primary transcripts that share a gene_id, i.e. genes with promoter switching.

    X_Y_cds (not X_Y_cds_exp) is just like X_Y_promoters, except instead of primary transcripts (transcripts grouped by tss_id), we're working with groups of transcripts that code for the same protein (transcripts grouped by tss_id).
    Last edited by Cole Trapnell; 03-26-2010, 09:18 AM.

    Comment

    • RockChalkJayhawk
      Senior Member
      • Mar 2009
      • 192

      #3
      Originally posted by Cole Trapnell View Post
      We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.
      So transcripts sharing p_id means they have alternative UTRs (but same protein sequence) whereas those that have different p_id are involved in exon skipping?

      I may just have to wait for the picture. How soon do you think the Cufflinks paper will be out?

      Comment

      Latest Articles

      Collapse

      • seqadmin
        Pathogen Surveillance with Advanced Genomic Tools
        by seqadmin




        The COVID-19 pandemic highlighted the need for proactive pathogen surveillance systems. As ongoing threats like avian influenza and newly emerging infections continue to pose risks, researchers are working to improve how quickly and accurately pathogens can be identified and tracked. In a recent SEQanswers webinar, two experts discussed how next-generation sequencing (NGS) and machine learning are shaping efforts to monitor viral variation and trace the origins of infectious...
        03-24-2025, 11:48 AM
      • seqadmin
        New Genomics Tools and Methods Shared at AGBT 2025
        by seqadmin


        This year’s Advances in Genome Biology and Technology (AGBT) General Meeting commemorated the 25th anniversary of the event at its original venue on Marco Island, Florida. While this year’s event didn’t include high-profile musical performances, the industry announcements and cutting-edge research still drew the attention of leading scientists.

        The Headliner
        The biggest announcement was Roche stepping back into the sequencing platform market. In the years since...
        03-03-2025, 01:39 PM

      ad_right_rmr

      Collapse

      News

      Collapse

      Topics Statistics Last Post
      Started by seqadmin, 03-20-2025, 05:03 AM
      0 responses
      49 views
      0 reactions
      Last Post seqadmin  
      Started by seqadmin, 03-19-2025, 07:27 AM
      0 responses
      57 views
      0 reactions
      Last Post seqadmin  
      Started by seqadmin, 03-18-2025, 12:50 PM
      0 responses
      49 views
      0 reactions
      Last Post seqadmin  
      Started by seqadmin, 03-03-2025, 01:15 PM
      0 responses
      200 views
      0 reactions
      Last Post seqadmin  
      Working...