Header Leaderboard Ad

Collapse

Multiple DGE libraries comparison. (EdgeR baySeq DESeq)

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • #46
    Hi Simon,

    When using the command "cds = estimateVarianceFunctions(cds)" I get the following error:

    Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    newsplit: out of vertex space
    In addition: Warning messages:
    1: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    2: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    3: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    4: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    5: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    6: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    7: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    8: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    9: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight
    10: In lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
    procv: no points with non-zero weight

    I'm not sure what's causing the error and was wondering if anyone has seen it before?

    Thank you.

    Comment


    • #47
      Hi Joro,

      "out of vertex space" is an error from locfit, which DESeq uses internally. I've seen the error occuring onl,y once so far, in a quite unusual data set.

      You can try to increase the number of vertices with:
      Code:
      cds <- estimateVarianceFunctions( cds, locfit_extra_args=list(maxk=300) )
      If this does not help, could you send me your CountDataSet object (save it with save( cds, file="cds.rda" )) so that I can try around with it?

      Cheers
      Simon

      Comment


      • #48
        recovering counts from cufflinks

        Originally posted by Boel View Post
        Is there a way to output the actual counts?
        Did anyone respond to this post?

        From the cufflinks output, I tried using the coverage value * (exon length/read length) to get counts, but it doesn't seem to be the right value. My reasoning was that if this is the counts (or fragments) in an exon, I should be able to recover the total number of mapped fragments and that value should be the same over all exons.

        fragments / (exon length/1000) = fragments/kb, then

        FPKM * fragments/kb = X million reads

        However, I'm getting a 2-fold range for how many million reads were sequenced. Could this be because the FPKM is a MLE?

        Thanks for any input/advice.

        Alisha

        Comment


        • #49
          Originally posted by agent99 View Post
          Did anyone respond to this post?

          From the cufflinks output, I tried using the coverage value * (exon length/read length) to get counts, but it doesn't seem to be the right value. My reasoning was that if this is the counts (or fragments) in an exon, I should be able to recover the total number of mapped fragments and that value should be the same over all exons.

          fragments / (exon length/1000) = fragments/kb, then

          FPKM * fragments/kb = X million reads

          However, I'm getting a 2-fold range for how many million reads were sequenced. Could this be because the FPKM is a MLE?

          Thanks for any input/advice.

          Alisha
          I have had some incosistencies with cufflinks with gene-level expression. I would sugest moving on to using a simple program such as HT-Seq or BEDtools to get the actual tag counts within the gene region. Then, you could use DESeq for the normalization.

          Comment


          • #50
            Originally posted by RockChalkJayhawk View Post
            I have had some incosistencies with cufflinks with gene-level expression. I would sugest moving on to using a simple program such as HT-Seq or BEDtools to get the actual tag counts within the gene region. Then, you could use DESeq for the normalization.
            Thanks for the advice. Apologies for being a little unclear in this post. I had hoped to get transcript level counts, so I was using cufflinks to determine abundance of different isoforms.

            Comment


            • #51
              Hi again Simon, I wonder if you had since then any suggestion about a possible method in order to compare multiple library condition.

              Thanks

              Comment


              • #52
                Multiple DGE libraries comparison. (EdgeR baySeq DESeq)

                Dear Roman,

                This is a very late reply to your original post of a year ago in which you described an experiment with four RNA-Seq libraries, Male-Old, Male-Young, Female-Old and Female-Young. You wanted to find transcripts differentially expressed between Male and Female.

                This type of experiment is typically treated in a way that is analogous to a paired t-test, in which you test for Male vs Female differences, adjusting for any differences between Young and Old.

                When you originally posted your question, edgeR did not have facilities for this type of analysis, but it has since the Bioconductor release of Oct 2010. The edgeR Vignette gives in Section 11 a worked example of an analysis with paired comparisons (tumour vs normal for the same patients) that is analogous to your problem. The analysis goes something like this:

                design <- model.matrix(~Age+Sex)
                y <- estimateGLMCommonDisp(y,design)
                disp <- y$common.dispersion
                fit <- glmFit(y,design,dispersion=disp)
                lrt <- glmLRT(y,fit)
                topTags(lrt)

                edgeR not require you to have replicates of any of the four
                conditions, because the linear model that is fitted has only 3 coefficients.

                Try out the paired comparison analysis and let us know if you have any problems with it.

                Best wishes
                Gordon

                Comment


                • #53
                  Originally posted by Simon Anders View Post
                  Wow, this sounds like an embarrassing bug!

                  I've had a look immediately -- and, guess what, the variance adjustment does exactly what it should, but what I have written about it in the vignette (and further up in this thread) was completely the wrong way round.

                  ...

                  I'll correct the vignette right away.

                  Thanks for pointing this out
                  Simon
                  Hi Simon,

                  Thanks for the helpful comments in this thread!

                  But, I was reading the whole thread from the beginning - and maybe you should edit the post - number 31 - to say you noticed later it is the wrong way round for others doing the same and reading the whole thread! I've only just noticed on post 45 when I could have saved myself some confusion earlier in the thread.

                  Chris

                  Comment

                  Working...
                  X