Seqanswers Leaderboard Ad



No announcement yet.
  • Filter
  • Time
  • Show
Clear All
new posts

  • Limma/voom for RNA-seq data

    I am a beginner in using limma in order to analyse data and have some problems when trying to use it for the analysis of RNA-seq data.
    To be more precize, I have a time course experiment, with samples from two conditions obtained at different time points, and I would like to see how gene expression changes through time.
    I have a count table dataframe with genes as rows and conditions as colums and here is what I've done:

    row.names(countTable)=make.names(countTable[,1], unique=TRUE)

    #creating the targets frame from the file Target.txt which is:
    #FileName Target
    #F1 condition1 (=first column of countTable)
    targets <- readTargets("Target.txt",sep="\t")

    #now for creating a design matrix

    f <- factor(targets$Target, levels=conditions)
    design <- model.matrix(~0+f)
    colnames(design) <- conditions
    #this gives me something like:
    # condition1 condition2 ...
    #1 0 1
    #2 1 0

    #Then I try to normalize my data
    Count=data.matrix(countTable,rownames.force=NA)#because it won't be numeric otherwise?
    norm.factor <- calcNormFactors(Count)
    #I try to use voom
    y<-DGEList(counts=Count, lib.size=colSums(Count)*norm.factor, norm.factors=norm.factor, genes=rownames(countTable))

    And this is where it returns an error:
    Error in plot.window(...) : finite values needed for 'ylim'
    1: In min(x) : no argument found for min ; Inf is returned
    2: In max(x) : no argument found for max ; -Inf is returned

    Could someone please help me? I don't really know what I have done wrong.
    Another thing, more general this time. Could someone explain to me how is used the design matrix exactly?
    Thanks for reading!
    (please forgive me if you see any grammar mistake- i'm french. But you're welcome to correct me.)
    Last edited by BioLion; 04-07-2014, 01:30 AM.

  • #2
    It looks a lot like you don't have replicates for your conditions (at least your "conditions" variable would seem to indicate that). Since voom tries to accurately model the mean-variance relationship I wouldn't be surprised if the lack of replications causes serious problems with that. The actual error message is due to it attempting to make a graph of the mean-variance relationship (at least that's my recollection). I recall that you can disable that, though then you'd likely just run into problems downstream (you might want to have a look at the returned values in any case).

    For how the design matrix is actually used, it's quite useful if you've taken linear algebra. With linear (or generalized linear) modeling, you're trying to solve the equation:

    Where "Y" is the matrix of observed values, B0 is an intercept or base expression level vector, B1 is the vector containing fit coefficients, "X" is your design matrix and "err" are the residuals. For more details on that, I'd refer you to a statistics/math book or even wikipedia (the English article is OK at least).


    • #3
      Thanks a lot for your clear answers. I get what is the design matric now.
      True, I don't have replicates, it is an exploratory analysis. Then, if I understand well, the programm wasn't able to estimate a mean-variance relationship and couldn't plot a graph?
      Another think: I just tried without precizing plot=TRUE and it gave me a new error:
      v <- voom(y,design)
      Erreor in approxfun(l, rule = 2) :
      need at least two non-NA values to interpolate

      However when I tried to see where were the NA-values, it did not returned any...
      > sum($counts))
      [1] 0
      > sum($samples))
      [1] 0
      > sum($genes))
      [1] 0

      Could you help me to understand this error?
      Thanks again, and sorry if these questions seems too basic.


      • #4
        I suspect that stems from the lack of replicates. Since voom tries to fit a mean-variance relationship using lowess regression (at least if my memory serves), then the regression step probably uses approxfun at some point, which wouldn't work with NAs introduced by a lack replicates. I suppose you could do the equivalent of a "blind" estimation, by doing:

        design2 <- model.matrix(~1)
        v <- voom(y,design2)
        and then using "design" instead of "design2" later on. However I wouldn't put much weight on the results. I suspect that there's no good way to do this without replicates, which should be included even in pilot experiments (the whole idea of a pilot experiment is to gauge the rough effect size, which means you need to know the background variance of at least one of the groups). It's really unfortunate that people so often attempt pilots without's mostly a waste of money and time.
        Last edited by dpryan; 04-07-2014, 05:30 AM.


        • #5
          I should add that before using the "blind" method I outlined, you should probably search the Bioconductor email list for mention of this sort of situation and, if not, ask Gordon Smyth or one of the other authors of voom/limma/edgeR for advice. I suspect that they'll echo my assessment of the usefulness of unreplicated experiments, but then at least you'll get an answer from the absolute most expert people on limma/voom.


          • #6
            Thanks again for your answers. I do realize that this is not the ideal situation.
            I'll try and contact the authors of voom/limma and then try to use the blind method if I don't find another accurate way of doing this.
            Have a good day, and thanks again!


            • #7
              Hello, BioLion

              Have you solved this problem? I met the same question with you.


              Latest Articles


              • seqadmin
                Best Practices for Single-Cell Sequencing Analysis
                by seqadmin

                While isolating and preparing single cells for sequencing was historically the bottleneck, recent technological advancements have shifted the challenge to data analysis. This highlights the rapidly evolving nature of single-cell sequencing. The inherent complexity of single-cell analysis has intensified with the surge in data volume and the incorporation of diverse and more complex datasets. This article explores the challenges in analysis, examines common pitfalls, offers...
                06-06-2024, 07:15 AM
              • seqadmin
                Latest Developments in Precision Medicine
                by seqadmin

                Technological advances have led to drastic improvements in the field of precision medicine, enabling more personalized approaches to treatment. This article explores four leading groups that are overcoming many of the challenges of genomic profiling and precision medicine through their innovative platforms and technologies.

                Somatic Genomics
                “We have such a tremendous amount of genetic diversity that exists within each of us, and not just between us as individuals,”...
                05-24-2024, 01:16 PM





              Topics Statistics Last Post
              Started by seqadmin, Today, 07:23 AM
              0 responses
              Last Post seqadmin  
              Started by seqadmin, 06-17-2024, 06:54 AM
              0 responses
              Last Post seqadmin  
              Started by seqadmin, 06-14-2024, 07:24 AM
              0 responses
              Last Post seqadmin  
              Started by seqadmin, 06-13-2024, 08:58 AM
              0 responses
              Last Post seqadmin