
  • Cuffdiff multi-protein vs multi-promoter

    I was hoping someone could clarify. I don't quite understand the difference between the x_y_tss and x_y_CDS files. I know it has something to do with the tss_ids and p_ids, and I've read the manual 10 times. Do the x_y_tss tests mean that transcripts from the same gene and the same promoter site are expressed differently, and do the x_y_CDS tests mean alternative transcripts but the same promoter? In other words, is tss for exon skipping and CDS for alternative promoters?

    If this is the case, how do these differential expression tests differ from those in the cds.diff, splicing.diff, and promoters.diff files?

  • #2
    We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.
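The two cases described above can be sketched in a few lines of Python. This is only an illustration, not Cufflinks code: the records, gene names, and the classify_genes helper are hypothetical, and only the tss_id / p_id semantics come from the post.

```python
from collections import defaultdict

# Hypothetical isoform records: (gene_id, transcript_id, tss_id, p_id)
isoforms = [
    ("geneA", "A.1", "TSS1", "P1"),
    ("geneA", "A.2", "TSS2", "P1"),  # different TSS, same protein (e.g. UTR-only difference)
    ("geneB", "B.1", "TSS3", "P2"),
    ("geneB", "B.2", "TSS4", "P3"),  # different TSS, different protein
]

def classify_genes(records):
    """Return {gene_id: (has_multiple_tss, has_multiple_proteins)}."""
    tss_by_gene = defaultdict(set)
    pid_by_gene = defaultdict(set)
    for gene_id, _transcript_id, tss_id, p_id in records:
        tss_by_gene[gene_id].add(tss_id)
        pid_by_gene[gene_id].add(p_id)
    return {
        gene_id: (len(tss_by_gene[gene_id]) > 1, len(pid_by_gene[gene_id]) > 1)
        for gene_id in tss_by_gene
    }

classification = classify_genes(isoforms)
for gene_id, (multi_tss, multi_protein) in classification.items():
    print(gene_id, "multiple TSS:", multi_tss, "multiple proteins:", multi_protein)
```

In this toy data, a promoter switch in geneA could not change the protein being produced, while a promoter switch in geneB could.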

    In our manual and terminology, "splicing" refers only to the processing of a primary transcript, so alternative TSS doesn't strictly fall under "splicing". I realize that many people group alternative TSS under "alternative splicing".

    So within a given gene:

    X_Y_tss_group_exp has rows that are groups of transcripts that share a tss_id, and gives the total FPKM for each TSS group
    X_Y_gene_exp has rows that are groups of transcripts that share a gene_id, and gives the total FPKM for each gene
    X_Y_cds_exp has rows that are groups of transcripts that share a p_id, and gives the total FPKM for each CDS group

    X_Y_splicing has rows that are groups of transcripts that share a tss_id, and gives the change in relative abundance of transcripts that share a tss_id

    X_Y_promoters has rows that are groups of primary transcripts that share a gene_id. There is one primary transcript for each tss_id, and its expression is given in X_Y_tss_group_exp. X_Y_promoters gives the change in relative abundance of primary transcripts that share a gene_id, i.e. genes with promoter switching.

    X_Y_cds (not X_Y_cds_exp) is just like X_Y_promoters, except instead of primary transcripts (transcripts grouped by tss_id), we're working with groups of transcripts that code for the same protein (transcripts grouped by p_id).
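As a rough sketch of the groupings listed above, the Python below sums FPKM by gene_id, tss_id, and p_id for one made-up gene, then computes each TSS group's share of the gene total. The data and helper names (sum_fpkm, relative_abundance) are assumptions for illustration; Cuffdiff's actual statistical tests for changes in relative abundance between conditions are not reproduced here.

```python
from collections import defaultdict

# Hypothetical transcript records for a single gene in one sample:
# (gene_id, tss_id, p_id, fpkm)
transcripts = [
    ("geneA", "TSS1", "P1", 10.0),
    ("geneA", "TSS1", "P2", 30.0),
    ("geneA", "TSS2", "P2", 60.0),
]

def sum_fpkm(records, key_index):
    """Total FPKM per group, where the group is the field at key_index."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[3]
    return dict(totals)

gene_fpkm = sum_fpkm(transcripts, 0)  # grouping used by X_Y_gene_exp
tss_fpkm = sum_fpkm(transcripts, 1)   # grouping used by X_Y_tss_group_exp
cds_fpkm = sum_fpkm(transcripts, 2)   # grouping used by X_Y_cds_exp

def relative_abundance(group_totals):
    """Each group's fraction of the summed FPKM across all groups."""
    total = sum(group_totals.values())
    return {group: value / total for group, value in group_totals.items()}

# Shares of the gene's expression per TSS group (the quantity whose change
# between samples X_Y_promoters is described as reporting) and per CDS group
# (the analogous quantity for X_Y_cds).
promoter_shares = relative_abundance(tss_fpkm)
cds_shares = relative_abundance(cds_fpkm)

print(gene_fpkm, tss_fpkm, cds_fpkm)
print(promoter_shares, cds_shares)
```

Comparing promoter_shares (or cds_shares) between two samples is the kind of shift the promoter and CDS tests look at, whereas the *_exp files compare the group totals themselves.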
    Last edited by Cole Trapnell; 03-26-2010, 09:18 AM.



    • #3
      Originally posted by Cole Trapnell
      We'll clarify the manual with a picture, which I think will explain this issue much better. In the short term: suppose you have a gene with two isoforms, each of which starts at a different TSS. They could actually code for the same protein (i.e. only differ in UTR length), or they could code for different proteins. In the first case, they would have the same p_id. In the second case, they'd have different p_ids. The reason we did it this way is that we are interested in cases where you have switching in promoter use, and we wanted to see in how many of these genes that switch might actually mean a switch in the dominant protein being produced.
      So transcripts sharing a p_id have alternative UTRs (but the same protein sequence), whereas those with different p_ids are involved in exon skipping?

      I may just have to wait for the picture. How soon do you think the Cufflinks paper will be out?
