  • Two-way ANOVA, replicates, software.

    Hi,

    I have a couple of quick questions regarding experimental setup and analysis. I am trying to work out how best to design my experiment to get the information I need while minimizing cost. I am using an inducible transgene to identify the targets of my transcription factor with mRNA-seq. I am setting up my experiment with a two-way ANOVA design:

    Inducible Line (not treated) ; Inducible line (treated with inducing chemical)
    Wild-type control (not treated); Wild-type control (treated with inducing chemical)
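
One way to read this layout is as a 2x2 factorial design with an interaction term. The sketch below is illustrative only: it uses plain Python rather than any NGS package, and the group labels, function names, and log2 expression values are made up for the illustration. It builds the design matrix for the four groups and shows that the interaction contrast isolates the part of the response to the chemical that is specific to the inducible line.

```python
from itertools import product

# Factor levels (hypothetical labels for the four groups above).
LINES = ("wildtype", "inducible")
TREATMENTS = ("untreated", "treated")

def design_row(line, treatment):
    """Return (intercept, line, treatment, line:treatment) using treatment coding."""
    l = int(line == "inducible")
    t = int(treatment == "treated")
    return (1, l, t, l * t)

design = {(l, t): design_row(l, t) for l, t in product(LINES, TREATMENTS)}

def interaction_contrast(cell_means):
    """(inducible treated - untreated) minus (wildtype treated - untreated)."""
    induced = cell_means[("inducible", "treated")] - cell_means[("inducible", "untreated")]
    background = cell_means[("wildtype", "treated")] - cell_means[("wildtype", "untreated")]
    return induced - background

# Made-up log2 expression for one gene: the chemical alone adds 0.5,
# and the induced transcription factor adds a further 2.0.
example = {
    ("wildtype", "untreated"): 5.0,
    ("wildtype", "treated"): 5.5,
    ("inducible", "untreated"): 5.0,
    ("inducible", "treated"): 7.5,
}

for group, row in design.items():
    print(group, row)
print("interaction (log2):", interaction_contrast(example))
```

Under this coding, a gene that responds to the chemical itself shows up in the treatment column, while a candidate target of the transcription factor shows up in the interaction column.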

    At a minimum, I plan to submit two biological replicates for sequencing for each line/condition combination. I am also hoping to use additional time points post induction (again, this will depend on cost).

    Question 1) I know more replicates are better, but cost is a limitation. How much will my statistical power increase by including a third replicate? Is it likely that just two replicates would be useful, or is this something that needs to be determined empirically? I have read the other threads on this website about Fisher's exact test (two replicates) vs. the t-test (needs three replicates), but I don't know whether the value of the third replicate similarly holds for the two-way ANOVA.
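
As a rough orientation for Question 1, and only for a classical per-gene fixed-effects ANOVA (count-based methods estimate variance differently), the residual degrees of freedom available for estimating variance can be counted directly. The function name below is hypothetical.

```python
def residual_df(n_lines, n_treatments, replicates_per_cell):
    """Residual degrees of freedom of a balanced two-way ANOVA with interaction."""
    n_cells = n_lines * n_treatments
    n_samples = n_cells * replicates_per_cell
    # One parameter per cell is fitted (main effects plus interaction).
    return n_samples - n_cells

for reps in (2, 3):
    print(reps, "replicates per group ->", residual_df(2, 2, reps), "residual df")
```

With two replicates per group, the 2x2 design already leaves 4 residual degrees of freedom, versus 2 for a simple two-group comparison with two replicates each; a third replicate raises this to 8.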

    Question 2) When I have my data, I will need software to accommodate the two-way ANOVA analysis. Do most NGS analysis software packages offer two-way ANOVA analysis? Does anyone have any recommendations? I am working with Arabidopsis.

    If anyone has any insight into either of these questions, I would be grateful for your input. Thanks!

    eggplant72

  • #2
    Some comments in random order:

    - Fisher's exact test, as usually employed, cannot deal with any replicates, and hence should not be used.

    - Replicates are not as much of a cost issue as people seem to think, because you can multiplex several samples on one lane. Hence, you should decide how many lanes you can afford and how many samples you can obtain. The best approach, in my opinion, is to tag all the samples with multiplexing tags, pool them, and spread the pool over the available lanes. (See e.g. Auer and Doerge, 2010.)

    - Power depends more on sequencing depth than on replicate number. This is because once you share information across genes, as edgeR and DESeq do, it does not make much difference whether you have two or three replicates. If you don't share information (e.g., with a standard t-test), you won't get anywhere with fewer than, say, six or seven replicates.

    - Nevertheless, if you want 100 counts for a given gene, you are better off getting them from four replicate samples, each sequenced to 25, than two sequenced to 50 each.

    - Outliers are an annoying issue, and more replicates help here.

    - edgeR and DESeq both support two-way ANOVA; BaySeq does not (if I recall correctly), and neither does cuffdiff.
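
The point about four replicates at 25 counts versus two at 50 can be made concrete with a little arithmetic. The sketch below assumes a negative binomial model, in which a count with mean mu has variance mu + alpha * mu^2; this is the model family that edgeR and DESeq are built on, but the dispersion value alpha = 0.1 is a made-up illustration and the function name is hypothetical. It computes the squared coefficient of variation of the average over n replicates.

```python
import math

def cv2_of_replicate_mean(n_replicates, mean_count, dispersion):
    """Squared coefficient of variation of the average of n negative binomial counts.

    Each count has variance mean + dispersion * mean**2, so the average has
    CV^2 = 1 / (n * mean) + dispersion / n.
    """
    shot_noise = 1.0 / (n_replicates * mean_count)  # depends only on total counts
    biological = dispersion / n_replicates          # shrinks only with more replicates
    return shot_noise + biological

ALPHA = 0.1  # assumed dispersion, for illustration only

four_by_25 = cv2_of_replicate_mean(4, 25, ALPHA)
two_by_50 = cv2_of_replicate_mean(2, 50, ALPHA)

print("4 replicates x 25 counts: CV =", round(math.sqrt(four_by_25), 3))
print("2 replicates x 50 counts: CV =", round(math.sqrt(two_by_50), 3))
```

Both layouts spend the same 100 counts, so the shot-noise term is identical; only the biological term differs, and it is halved with four replicates.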



    • #3
      Originally posted by Simon Anders (#2)
      Hi Simon, I will look into edgeR and DESeq. Thank you very much for your help!



      • #4
        Hi Simon,
        How do I run a two-way ANOVA using DESeq?
        Thanks,



        • #5
          See the vignette of the devel version: http://www.bioconductor.org/packages.../doc/DESeq.pdf



          • #6
            Thanks. How do I download or find the pasilla data package?



            • #7
              You will need the development version of R (2.14); then run:

              Code:
              # Load the Bioconductor installer script, then install the pasilla data package
              source("http://bioconductor.org/biocLite.R")
              biocLite("pasilla")
              Or from here:

              This package provides per-exon and per-gene read counts computed for selected genes from RNA-seq data that were presented in the article



              • #8
                You can also download the pasilla package source and install it on other versions of R (e.g., mine is 2.13.1):

                Code:
                # Install from the downloaded source tarball (adjust the path to where you saved it)
                install.packages("/home/xxx/pasilla_0.2.5.tar.gz", repos=NULL, type="source")
