Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Limma/voom for RNA-seq data

    Hello,
    I am a beginner in using limma in order to analyse data and have some problems when trying to use it for the analysis of RNA-seq data.
    To be more precize, I have a time course experiment, with samples from two conditions obtained at different time points, and I would like to see how gene expression changes through time.
    I have a count table dataframe with genes as rows and conditions as colums and here is what I've done:

    countTable<-read.table("TableauDeseq.txt",header=TRUE,sep="\t",na.strings="NA")
    row.names(countTable)=make.names(countTable[,1], unique=TRUE)

    #creating the targets frame from the file Target.txt which is:
    #FileName Target
    #F1 condition1 (=first column of countTable)
    #etc
    targets <- readTargets("Target.txt",sep="\t")
    row.names(targets)=make.names(targets[,1])

    #now for creating a design matrix
    conditions<-factor(c("condition1","condition2","condition3","condition4","condition5","condition6","condition7","condition8"))

    f <- factor(targets$Target, levels=conditions)
    design <- model.matrix(~0+f)
    colnames(design) <- conditions
    #this gives me something like:
    # condition1 condition2 ...
    #1 0 1
    #2 1 0

    #Then I try to normalize my data
    library("edgeR")
    Count=data.matrix(countTable,rownames.force=NA)#because it won't be numeric otherwise?
    norm.factor <- calcNormFactors(Count)
    #I try to use voom
    y<-DGEList(counts=Count, lib.size=colSums(Count)*norm.factor, norm.factors=norm.factor, genes=rownames(countTable))


    And this is where it returns an error:
    Error in plot.window(...) : finite values needed for 'ylim'
    1: In min(x) : no argument found for min ; Inf is returned
    2: In max(x) : no argument found for max ; -Inf is returned



    Could someone please help me? I don't really know what I have done wrong.
    Another thing, more general this time. Could someone explain to me how is used the design matrix exactly?
    Thanks for reading!
    (please forgive me if you see any grammar mistake- i'm french. But you're welcome to correct me.)
    Last edited by BioLion; 04-07-2014, 01:30 AM.

  • #2
    It looks a lot like you don't have replicates for your conditions (at least your "conditions" variable would seem to indicate that). Since voom tries to accurately model the mean-variance relationship I wouldn't be surprised if the lack of replications causes serious problems with that. The actual error message is due to it attempting to make a graph of the mean-variance relationship (at least that's my recollection). I recall that you can disable that, though then you'd likely just run into problems downstream (you might want to have a look at the returned values in any case).

    For how the design matrix is actually used, it's quite useful if you've taken linear algebra. With linear (or generalized linear) modeling, you're trying to solve the equation:

    Code:
    Y=B0+B1X+err
    Where "Y" is the matrix of observed values, B0 is an intercept or base expression level vector, B1 is the vector containing fit coefficients, "X" is your design matrix and "err" are the residuals. For more details on that, I'd refer you to a statistics/math book or even wikipedia (the English article is OK at least).

    Comment


    • #3
      Thanks a lot for your clear answers. I get what is the design matric now.
      True, I don't have replicates, it is an exploratory analysis. Then, if I understand well, the programm wasn't able to estimate a mean-variance relationship and couldn't plot a graph?
      Another think: I just tried without precizing plot=TRUE and it gave me a new error:
      v <- voom(y,design)
      Erreor in approxfun(l, rule = 2) :
      need at least two non-NA values to interpolate


      However when I tried to see where were the NA-values, it did not returned any...
      > sum(is.na(y$counts))
      [1] 0
      > sum(is.na(y$samples))
      [1] 0
      > sum(is.na(y$genes))
      [1] 0

      Could you help me to understand this error?
      Thanks again, and sorry if these questions seems too basic.

      Comment


      • #4
        I suspect that stems from the lack of replicates. Since voom tries to fit a mean-variance relationship using lowess regression (at least if my memory serves), then the regression step probably uses approxfun at some point, which wouldn't work with NAs introduced by a lack replicates. I suppose you could do the equivalent of a "blind" estimation, by doing:

        Code:
        design2 <- model.matrix(~1)
        v <- voom(y,design2)
        and then using "design" instead of "design2" later on. However I wouldn't put much weight on the results. I suspect that there's no good way to do this without replicates, which should be included even in pilot experiments (the whole idea of a pilot experiment is to gauge the rough effect size, which means you need to know the background variance of at least one of the groups). It's really unfortunate that people so often attempt pilots without replicates...it's mostly a waste of money and time.
        Last edited by dpryan; 04-07-2014, 05:30 AM.

        Comment


        • #5
          I should add that before using the "blind" method I outlined, you should probably search the Bioconductor email list for mention of this sort of situation and, if not, ask Gordon Smyth or one of the other authors of voom/limma/edgeR for advice. I suspect that they'll echo my assessment of the usefulness of unreplicated experiments, but then at least you'll get an answer from the absolute most expert people on limma/voom.

          Comment


          • #6
            Thanks again for your answers. I do realize that this is not the ideal situation.
            I'll try and contact the authors of voom/limma and then try to use the blind method if I don't find another accurate way of doing this.
            Have a good day, and thanks again!

            Comment


            • #7
              Hello, BioLion

              Have you solved this problem? I met the same question with you.

              Comment

              Latest Articles

              Collapse

              • seqadmin
                Essential Discoveries and Tools in Epitranscriptomics
                by seqadmin




                The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
                Yesterday, 07:01 AM
              • seqadmin
                Current Approaches to Protein Sequencing
                by seqadmin


                Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
                04-04-2024, 04:25 PM

              ad_right_rmr

              Collapse

              News

              Collapse

              Topics Statistics Last Post
              Started by seqadmin, 04-11-2024, 12:08 PM
              0 responses
              58 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 04-10-2024, 10:19 PM
              0 responses
              54 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 04-10-2024, 09:21 AM
              0 responses
              45 views
              0 likes
              Last Post seqadmin  
              Started by seqadmin, 04-04-2024, 09:00 AM
              0 responses
              55 views
              0 likes
              Last Post seqadmin  
              Working...
              X