  • Splice-form aware input for DESeq/edgeR/baySeq

    I am looking for a way to quantify and statistically evaluate spliceforms across a set of RNA-Seq experiments.

    My current understanding is that the input to DESeq/edgeR/baySeq should simply be counts of reads mapped to a gene locus. Since cufflinks assigns spliceform abundances one library at a time, any systematic errors inherent in that sample are confounded with the spliceform quantification problem. As I understand it, these systematic errors really should be corrected for with a generalized linear model, which takes as input all the samples of interest/relevance. (Cufflinks aficionados, please correct me if I am wrong!)

    Also, as I understand it, the developers of DESeq/edgeR/baySeq have concluded (along with many others) that RPKM/FPKM is not a sufficient correction for comparing different genes within the same library. There appear to be additional biases (beyond length) that affect the transformation from mRNA to RNA-Seq sequences. It has therefore been suggested that, for now, it is only reasonable to compare abundances of the same gene between different samples.
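For reference, a minimal sketch of the RPKM correction discussed above, using the standard definition (reads per kilobase of transcript per million mapped reads). The function name and the example numbers are hypothetical. The point is only that RPKM rescales by length and sequencing depth and nothing else, so any bias beyond length is left uncorrected.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical library with 10 million mapped reads.
library_size = 10_000_000

# Two hypothetical genes: the longer one collects twice the reads,
# so both end up with the same RPKM.
short_gene = rpkm(500, 1000, library_size)
long_gene = rpkm(1000, 2000, library_size)

print(short_gene, long_gene)
```

Both hypothetical genes come out at an RPKM of 50 even though their raw counts differ by a factor of two, which is why raw counts rather than RPKM values are the expected input to count-based models.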

    I find this solution somewhat unsatisfying, though. If I have two spliceforms which, by definition, originate from the same genetic locus, but have different lengths, and varying expression levels in the two (or more) conditions I am surveying, then the noise associated with those two expression levels is different. Moreover, given the current model for mean-variance relationships (negative binomial), noise, unlike expression level, is not linear. So I would not expect the noise from two genes with the same average expression level, but one containing many differently-regulated spliceforms, and the other containing a single spliceform, to follow the same distribution. Ideally, I would want a generalized linear model that can simultaneously correct for systematic (non-biological) errors in the sample collection process and estimate spliceform abundances as well.
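The concern above can be made concrete with a small sketch. It uses the usual negative binomial variance function, variance = mean + dispersion × mean², and assumes that the isoform counts are independent and share one dispersion value. Both assumptions, and all numbers, are hypothetical and not stated in the post.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count: mean + dispersion * mean**2."""
    return mean + dispersion * mean ** 2


dispersion = 0.1  # hypothetical common dispersion

# Gene A: a single spliceform with mean count 1000.
single_isoform_variance = nb_variance(1000, dispersion)

# Gene B: three spliceforms whose means also add up to 1000.
# Assumption: the isoform counts are independent, so variances add.
isoform_means = (500, 300, 200)
split_isoform_variance = sum(nb_variance(m, dispersion) for m in isoform_means)

print(single_isoform_variance, split_isoform_variance)
```

Under these assumptions the two genes have the same mean but different variances (roughly 101000 versus 39000), because the quadratic term is not additive across isoforms.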

    Is there a good reason such a model is unnecessary? Is there a good reason to be content with locus-level abundances?

    Thanks for your input!
    ~Rachel

  • #2
    Check out DEXSeq. It looks at differential exon usage and is based on DESeq.

    • #3
      Originally posted by Rachel Hillmer
      I find this solution somewhat unsatisfying, though. If I have two spliceforms which, by definition, originate from the same genetic locus, but have different lengths, and varying expression levels in the two (or more) conditions I am surveying, then the noise associated with those two expression levels is different. Moreover, given the current model for mean-variance relationships (negative binomial), noise, unlike expression level, is not linear. So I would not expect the noise from two genes with the same average expression level, but one containing many differently-regulated spliceforms, and the other containing a single spliceform, to follow the same distribution.
      ~Rachel
      It is perfectly possible for a sum of negative binomial random variables to also be negative binomial distributed, even when the means are different. So it is perfectly possible for a quadratic mean-variance relationship to hold both for separate isoforms of varying expression levels and for the total aggregate count for the whole gene region. This requires only that the level of biological variability be comparable between the isoforms.

      Even if this relationship were not satisfied exactly, the quadratic variance function would likely still be more realistic than assuming biological variability to be absent, which is what one is doing by using a Poisson distribution.
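One construction under which the statement above holds can be sketched as follows. It is an illustrative assumption, not necessarily the model the reply has in mind. Suppose every isoform of a gene shares one gamma-distributed biological factor G with mean 1 and variance equal to the dispersion, and that, given G, isoform i is Poisson with mean mu_i × G. Each isoform is then negative binomial with the same dispersion. The gene total, given G, is Poisson with mean (sum of mu_i) × G, so it is also negative binomial with that dispersion. The code checks only the first two moments; the function name and numbers are hypothetical.

```python
def shared_factor_covariance(means, dispersion):
    """Covariance matrix of isoform counts under a shared gamma factor.

    Off-diagonal entries: dispersion * mu_i * mu_j (from the shared factor).
    Diagonal entries add the Poisson term mu_i.
    """
    n = len(means)
    cov = [[dispersion * means[i] * means[j] for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += means[i]
    return cov


dispersion = 0.1              # hypothetical common dispersion
means = [500, 300, 200]       # hypothetical isoform means

cov = shared_factor_covariance(means, dispersion)

# Variance of the gene-level total is the sum of all covariance entries.
total_variance = sum(sum(row) for row in cov)

# Quadratic variance function applied directly to the gene-level mean.
gene_mean = sum(means)
gene_level_nb_variance = gene_mean + dispersion * gene_mean ** 2

print(total_variance, gene_level_nb_variance)
```

The two numbers agree, so the quadratic mean-variance relationship holds for each isoform and for the aggregate. Setting the dispersion to 0 reduces every variance to its mean, which is the Poisson case with no biological variability.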

      Gordon

      • #4
        Originally posted by chadn737
        Check out DEXSeq. It looks at differential exon usage and is based on DESeq.
        Or the function spliceVariants() in the edgeR package.

        This and DEXSeq are designed to test for differential splicing though -- they don't attempt to quantify the expression levels of the different isoforms in absolute terms.

        Gordon
