  • EdgeR script help

    Hi. I have just started learning R and have been working on an edgeR script to analyze differential expression in Illumina data. I have two different samples with two tissues from each, and a count data set for each of them. I am interested in comparing one tissue from one sample with a tissue from the other sample (and also in comparing tissues within a sample). Below is the script I was trying to use, which gives an error. Since I am new to R programming, I cannot work out what or where the problem is. I would really appreciate it if someone could help me with this. Thanks.

    library(edgeR)
    library(limma)
    setwd("C:\\cygwin\\home\\Prabhakaran")
    load("test1.txt")
    raw.data <- read.table(file="test1.txt", header=TRUE)
    names(raw.data)
    d <- raw.data[, 2:3]
    rownames(d) <- raw.data[ ,1]
    sample <- factor(R91+92_LFE_Hits, X95+96_clark_E_hits)
    d <- estimateCommonDisp(d)
    et <- exactTest(d)
    topTags(et)
    etTag = rownames(topTags(et)$table)
    sum(et$table$p.value <0.05)
    sum(p.adjust(et$table$p.value,method="BH") < 0.1)
    good = sum(et$table$p.value <0.05)
    goodList = topTags(et, n=good)
    sink("output1.csv")
    goodList
    sink()

  • #2
    What error do you get?

    Have you tried running the commands in your script one at a
    time to see how far you get before you get the error?


    • #3
      d <- raw.data[, 2:3]

      This means that you have only two columns of count data, one per tissue, so there are no biological replicates and a proper DE analysis is not possible. I'm not sure whether edgeR gives an error because of this, but it is the first thing to note.

      As Mastal says, if you run the lines one by one and show us the output, we could help more.


      • #4
        Thanks Mastal and Bruce.
        I did run the script line by line, and I get errors at lines 9 and 10. I am also not able to run exactTest. I tried modifying the two lines where I got the errors, but could not get them right.

        And yes, I have count data in the 2nd and 3rd columns of the file; it is the data for two tissues from the same sample, say epidermis and meristem. I am trying to look at the DE genes between the two.
        Last edited by Prabhakaran; 07-30-2013, 06:31 AM.


        • #5
          Originally posted by Prabhakaran:

          sample <- factor(R91+92_LFE_Hits, X95+96_clark_E_hits)
          d <- estimateCommonDisp(d)
          Are the lines above the lines that gave errors?

          What were the error messages that you got?

          I'm not surprised that you weren't able to perform the exact test,
          if the errors were in the lines immediately before that step.

          I'm not sure what you are trying to do with the variable 'sample'.
          A factor in R should have levels, so you need a vector with the allowed values for the levels. If you are trying to join two vectors, in R you would do it as 'c(vector1, vector2)'.
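
To illustrate the point about c() and factor levels, here is a minimal base-R sketch. The labels "epidermis" and "meristem" are borrowed from the tissues mentioned earlier in the thread; the variable names are made up for illustration and are not from the original script.

```r
# Hypothetical condition labels, one per count column (illustration only)
conditions <- c("epidermis", "meristem")

# c() joins values or vectors into a single vector
more <- c(conditions, "epidermis")

# factor() takes the vector of values; 'levels' lists the allowed values
group <- factor(more, levels = c("epidermis", "meristem"))

print(levels(group))
print(table(group))
```

The first argument of factor() is the data and the second is the set of levels, which is why two names passed as separate arguments do not produce a two-element factor.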


          • #6
            I wasn't surprised to see that error either, since the problem is in reading the input itself. I tried modifying the column names, but it still doesn't work. As for the R part, as I mentioned earlier, I'm just starting and trying to understand as I go along. The error is as follows:

            Error in source("C:\\cygwin\\home\\Prabhakaran\\test.R") :
            C:\cygwin\home\Prabhakaran\test.R:9:19: unexpected input
            8: rownames(d) <- raw.data[ ,1]
            9: sample <- c(R91+92_


            • #7
              The 'sample' variable is not used anywhere in the rest of your script. What is it for? Is it conditions? So what do you have when you do:
              head(raw.data)
              Do you get those two names you are trying to put into 'sample'? Try wrapping them in quotes:
              sample<-c("R91+92_LFE_Hits", "X95+96_clark_E_hits")
              otherwise R looks for variables with those names, which probably do not exist...
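
A small base-R sketch of what is probably going on with those names. The toy table below stands in for test1.txt, which was never posted, so its header and values are assumptions. It shows how read.table rewrites headers into syntactic names by default, and why the unquoted name fails to parse at all.

```r
# Toy table standing in for test1.txt (the real file was not posted; values are made up)
txt <- "gene R91+92_LFE_Hits 95+96_clark_E_hits
g1 10 12
g2 0 3"

# Default check.names = TRUE rewrites headers into syntactic R names
d_checked <- read.table(text = txt, header = TRUE, row.names = 1)
print(names(d_checked))

# check.names = FALSE keeps the headers exactly as written in the file
d_raw <- read.table(text = txt, header = TRUE, row.names = 1, check.names = FALSE)
print(names(d_raw))

# Unquoted, the name is read as the expression R91 + 92_LFE_Hits, which is not valid R
msg <- tryCatch(
  {
    parse(text = "sample <- c(R91+92_LFE_Hits)")
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Quoted, the names are ordinary character strings
sample_names <- c("R91+92_LFE_Hits", "X95+96_clark_E_hits")
```

With the default settings the "+" becomes "." and a leading digit gets an "X" prefix, so names(raw.data) is the safest place to copy the column names from.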


              • #8
                A couple comments:
                Code:
                raw.data <- read.table(file="test1.txt", header=TRUE)
                names(raw.data)
                d <- raw.data[, 2:3]
                rownames(d) <- raw.data[ ,1]
                can simply be:
                Code:
                d <- read.table(file="test1.txt", header=TRUE, row.names=1)
                Note that "rownames<-" also works on a data frame (it dispatches to "row.names<-"), so that line is unlikely to be the cause of the error.

                Code:
                sample <- factor(R91+92_LFE_Hits, X95+96_clark_E_hits)
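                As already pointed out, you need quotes, and the two names have to be combined with c() so that both are passed as values (the second argument of factor() is the levels):
                Code:
                sample <- factor(c("R91+92_LFE_Hits", "X95+96_clark_E_hits"))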
                Aside from that, with only two samples, you're wasting your time anyway.
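
For reference, a minimal sketch of how the script could be restructured with the fixes from this thread. Note that the original script calls estimateCommonDisp on a plain data frame and never builds a DGEList. The count values, the short column names and the bcv value below are all assumptions for illustration. Because there is one sample per group, the dispersion cannot be estimated from the data, so a guessed value is passed to exactTest. The resulting p-values are exploratory at best. The edgeR part runs only if the package is installed.

```r
# Toy counts standing in for test1.txt (made-up values; the real file was not posted)
counts <- matrix(c(10, 200, 0, 35,
                   12, 150, 3, 90),
                 ncol = 2,
                 dimnames = list(paste0("gene", 1:4), c("LFE", "clark_E")))

# One group label per count column
group <- factor(c("LFE", "clark_E"))

# Assumed biological coefficient of variation; a guess, not estimated from the data
bcv <- 0.2

if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  # edgeR functions work on a DGEList, not on a bare data frame
  y <- DGEList(counts = counts, group = group)
  # With one sample per group the dispersion has to be supplied by hand
  et <- exactTest(y, dispersion = bcv^2)
  print(topTags(et))
}
```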


                • #9
                  Hi,

                  Could anyone share an edgeR script for differential expression analysis of four different samples?

                  Thank you.


                  • #10
                    What's the experimental design and have you already read through the vignette?


                    • #11
                      Hello,
                      I have started using edgeR and need some input on the method.
                      I have a data set with read counts from two different patients (let's say it looks like the table below):
                      head(m)
                                  Sample_118z.0 Sample_132z.0 Sample_118p.0 Sample_132p2.0
                      XLOC_000001           626          3516          1534           2603
                      XLOC_000002            82           342           175            304
                      XLOC_000003           361          2000            80            195
                      XLOC_000004            30           143            49             66
                      XLOC_000005             0             0             0              0
                      XLOC_000006             0             0             0              1

                      (A sample of the data file.)
                      I also have the spike-in data, which should be used to normalize these counts:
                      head(sp)
                                 Sample_118z.0 Sample_132z.0 Sample_118p.0 Sample_132p2.0
                      ERCC-00009            30           143            49             66
                      ERCC-00025             2            13             9              7
                      ERCC-00031             0             0             0              0
                      ERCC-00034             1             9             1              3
                      ERCC-00035            11            35             5              7
                      ERCC-00042            37           186            43             78

                      I have used DESeq before with the same experimental conditions: I normalized the read counts with the spike-in data and then ran nbinomTest to extract DEGs. However, I cannot figure out how to do the same with edgeR. Can anyone help me?


                      • #12
                        I mentioned this on your thread on Biostars, but I'll reply here too for the sake of others. If you read the help for calcNormFactors, you'll see that it places the normalization factors in object$samples$norm.factors, so just fill them in there.
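
One way this could look in practice, as a rough sketch rather than a recommended recipe. It uses only the six rows shown by head(m) and head(sp) above, since the full tables were not posted. edgeR's effective library size is lib.size * norm.factors, so the sketch picks factors that make the effective library sizes proportional to the spike-in totals and rescales them to multiply to 1. That convention, and the "z" versus "p" grouping guessed from the column names, are assumptions. The edgeR part runs only if the package is installed.

```r
# The six rows shown by head(m) and head(sp) in the post; the full tables were not posted
samples <- c("Sample_118z.0", "Sample_132z.0", "Sample_118p.0", "Sample_132p2.0")
m <- matrix(c(626, 3516, 1534, 2603,
               82,  342,  175,  304,
              361, 2000,   80,  195,
               30,  143,   49,   66,
                0,    0,    0,    0,
                0,    0,    0,    1),
            ncol = 4, byrow = TRUE,
            dimnames = list(sprintf("XLOC_%06d", 1:6), samples))
sp <- matrix(c(30, 143, 49, 66,
                2,  13,  9,  7,
                0,   0,  0,  0,
                1,   9,  1,  3,
               11,  35,  5,  7,
               37, 186, 43, 78),
             ncol = 4, byrow = TRUE,
             dimnames = list(c("ERCC-00009", "ERCC-00025", "ERCC-00031",
                               "ERCC-00034", "ERCC-00035", "ERCC-00042"), samples))

lib_size <- colSums(m)      # per-sample total of the gene counts
spike_total <- colSums(sp)  # per-sample total of the spike-in counts

# Factors that make lib_size * nf proportional to the spike-in totals
nf <- spike_total / lib_size
nf <- nf / exp(mean(log(nf)))  # rescale so the factors multiply to 1

if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  # Assumed grouping: "z" versus "p" samples, guessed from the column names
  y <- DGEList(counts = m, group = factor(c("z", "z", "p", "p")))
  # Fill in the factors by hand instead of calling calcNormFactors
  y$samples$norm.factors <- nf
  print(y$samples)
}
```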
