  • Experimental Design: Which Kind of Replicate to Use?

    We are planning an F2 mapping project, and I'm unsure how I should perform replicates for the parentals.

    The parental strains we use are inbred model strains and hence supposedly genetically uniform. I have three choices for replication:
    1. Extract RNA from 10 animals of each strain, pool 1 µg of their RNA together, and sequence this pooled RNA twice as technical replicates;
    2. Extract RNA from 2 animals of each parental strain, and sequence each animal once;
    3. Extract RNA from 1 animal, and sequence it twice as technical replicates.


    Which one would make the most statistical sense in downstream analysis?

    (BTW: Does that also mean it would be appropriate for me to do technical replicates for each of the F2 animals? Each of them is most certainly a different treatment in its own right... This is going to cost us one extra flow cell...?)

  • #2
    It all depends on what downstream analysis you want to perform.

    Without this, I can only give you the general advice that possibilities 1 and 3 are wrong for any purpose, and 2 may or may not be useful. Also, please explain to us why you think that technical replicates are important. In most cases, you do need biological replicates (see the multitude of earlier threads on the subject), while technical replicates offer little extra value, except for troubleshooting. Also, why would technical replicates require extra flow cells? Are you sure you are not confusing this with sequencing depth requirements, or forgetting about multiplexing?


    • #3
      Thank you for your quick reply, Simon.

      We are planning to perform strictly expression analysis using NGS. The F2 data are mainly for eQTL mapping, but the parental data may be used for other kinds of expression analysis, especially since we're deriving some consomic and congenic lines from those strains.

      As for the issue of technical replicates: each F2 individual (n=40) would have a completely different genotype due to recombination, so finding a biological replicate for every animal is clearly impossible, unlike for their inbred parentals. If I understand you correctly, doing technical replicates for the F2 would not improve the statistical power in any way, so this is completely unnecessary?

      And as for the "extra flow cell": I should have said an extra lane.
      Last edited by SamCurt; 12-23-2011, 10:55 PM.


      • #4
        By the way, Simon, this raises a question about read depth vs. robustness.

        You have mentioned on this forum that in expression analyses it's better to sacrifice depth for robustness through multiplexing. But how much depth should we sacrifice?

        Our previous project was done on a GAII without multiplexing (one sample per lane), but we will switch to a HiSeq. Of course, since a HiSeq makes 5X as many reads as a GAII, a run of 5-plex or lower on a single lane would still have GAII-level depth per sample. But how about 9-plex (3 replicates for each strain), which would give only about 55% of GAII depth per sample? Would the decrease in depth be more than offset by the increase in degrees of freedom?
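
        The arithmetic behind these figures can be sketched as follows. This is only a rough illustration: it assumes reads are split perfectly evenly across barcodes, and the function name is made up for this example.

```python
# Per-sample depth on a multiplexed HiSeq lane, expressed as a
# fraction of one un-multiplexed GAII lane.
# HISEQ_VS_GAII is the "5X as many reads" figure quoted in the post.
HISEQ_VS_GAII = 5.0


def relative_depth(plex: int, lane_ratio: float = HISEQ_VS_GAII) -> float:
    """Depth per sample relative to one GAII lane.

    Assumes an even split of reads across all barcoded samples,
    which real runs only approximate.
    """
    if plex < 1:
        raise ValueError("plex must be at least 1")
    return lane_ratio / plex


for plex in (1, 5, 9, 12):
    print(f"{plex:2d}-plex: {relative_depth(plex):.2f}x GAII depth per sample")
```

        For 9-plex this gives 5/9, i.e. roughly 0.56 of the GAII depth per sample.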


        • #5
          Let's first discuss the simple case where it is clear how to replicate, e.g. treatment of cell cultures with some agent, where you can simply repeat the treatment on another sample. Then, if you run twice the number of replicates at half the depth each, you do not lose information, because your statistical power depends on the total number of reads per experimental condition, not on the number of reads per sample. DESeq, for example, keeps the replicates apart only to estimate the dispersion; for the actual test for differential expression, it sums up the counts from the replicates.

          Hence, you should estimate how many reads in total you want per condition and then spread this read budget over as many replicate samples as practical. If you want to get the same power as in your previous experiment, keep the total number of reads per condition the same.
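
          As a minimal sketch of this budgeting idea: the budget of 30 million reads and the gene fraction below are made-up numbers, and both function names are hypothetical. The point is only that the expected summed count for a gene stays the same however the budget is split.

```python
def reads_per_sample(condition_budget: int, n_replicates: int) -> float:
    """Reads each replicate gets when a fixed per-condition budget
    is spread evenly over n_replicates samples."""
    if n_replicates < 1:
        raise ValueError("need at least one replicate")
    return condition_budget / n_replicates


def expected_pooled_count(gene_fraction: float,
                          condition_budget: int,
                          n_replicates: int) -> float:
    """Expected count for one gene after summing over all replicates.

    gene_fraction is the share of all reads that fall on the gene
    (an illustrative quantity, assumed equal in every replicate).
    """
    per_sample = reads_per_sample(condition_budget, n_replicates)
    return n_replicates * gene_fraction * per_sample


budget = 30_000_000      # hypothetical total reads per condition
gene_fraction = 1e-5     # hypothetical share of reads on one gene

for n in (1, 2, 3, 6):
    print(n, "replicates:",
          f"{reads_per_sample(budget, n):,.0f} reads each,",
          f"expected summed count {expected_pooled_count(gene_fraction, budget, n):.1f}")
```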


          • #6
            Now for your specific experiment, first regarding the parents: Even the expression of isogenic litter mates reared together typically differs considerably. If you either pool several mice or sample several mice individually, this variation will average out to a degree, so that your result is closer to the population mean than if you had only one mouse. However, for most potential downstream analyses, it will be important to know how far away you still are from the population mean, i.e., how much variation is left, and this you cannot see if you pool all samples. It may, however, make sense to make two or three pools, each with a different subset of the available mice, if multiplex-tagging each mouse is not practical. The preferred way is, of course, always to pool only in silico, i.e., to get separate counts and sum them only later, in the analysis.
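
            To illustrate what pooling in silico buys you, here is a small sketch with made-up counts for three mice and two genes (all names and numbers are hypothetical). Keeping the counts separate lets you compute both the pooled total and the mouse-to-mouse spread, which a physical pool would hide.

```python
from statistics import mean, stdev

# Hypothetical raw read counts per mouse; not real data.
counts = {
    "mouse_1": {"geneA": 120, "geneB": 8},
    "mouse_2": {"geneA": 95, "geneB": 15},
    "mouse_3": {"geneA": 140, "geneB": 11},
}


def pool_in_silico(per_sample):
    """Sum the counts of every gene over all samples."""
    pooled = {}
    for sample_counts in per_sample.values():
        for gene, n in sample_counts.items():
            pooled[gene] = pooled.get(gene, 0) + n
    return pooled


def between_sample_spread(per_sample):
    """Mean and sample standard deviation of each gene across samples.

    Uses raw counts; a real analysis would first normalise for
    library size.
    """
    genes = list(next(iter(per_sample.values())))
    spread = {}
    for gene in genes:
        values = [sample[gene] for sample in per_sample.values()]
        spread[gene] = (mean(values), stdev(values))
    return spread


print(pool_in_silico(counts))
print(between_sample_spread(counts))
```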

            Of course, in the case of vertebrate animals, my statement about "as many replicate samples as practical" needs to be modified. We do not want to sacrifice more animals than strictly needed. Especially here, when looking for a good experimental design, it can help to take published data from a similar system and play with it to get a feel: re-analyse it after throwing out some samples or a fraction of the reads, and see how power changes as the number of reads or samples is reduced.
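
            One way to "throw out" reads or samples from an existing count table is sketched below. It assumes the data are already per-sample gene counts; the helper names and the example counts are hypothetical. Reads are discarded by binomial thinning, i.e. each read is kept independently with a fixed probability.

```python
import random


def thin_counts(counts, keep_fraction, rng):
    """Keep each read independently with probability keep_fraction.

    Simple loop-based binomial thinning; fine for a sketch, though a
    vectorised binomial draw would be faster on real data.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    return {
        gene: sum(rng.random() < keep_fraction for _ in range(n))
        for gene, n in counts.items()
    }


def drop_samples(per_sample, n_keep, rng):
    """Keep a random subset of n_keep samples."""
    kept = rng.sample(sorted(per_sample), n_keep)
    return {name: per_sample[name] for name in kept}


rng = random.Random(1)  # fixed seed so the sketch is repeatable
example = {
    "mouse_1": {"geneA": 120, "geneB": 8},
    "mouse_2": {"geneA": 95, "geneB": 15},
    "mouse_3": {"geneA": 140, "geneB": 11},
}

subset = drop_samples(example, 2, rng)
halved = {name: thin_counts(c, 0.5, rng) for name, c in subset.items()}
print(halved)
```

            The reduced table can then be passed through the same analysis as the full one to see how many calls survive.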

            For the F2 mice: No, you do not need technical replicates. You just need to make sure you have enough sequencing depth per mouse so that any expression effects are not drowned in Poisson noise; how much depth that takes depends on whether you want to look only at strongly expressed genes or also at moderately expressed ones. You need lots of mice, of course. 40 does sound good, but you may want to do some math before starting to make sure.
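
            A rough way to do part of that math: for a Poisson count with mean m, the coefficient of variation is 1/sqrt(m). The sketch below turns this into an approximate depth requirement. It covers counting noise only, not biological variation, and the gene fractions and the 10% target are made-up illustrative values.

```python
import math


def poisson_cv(expected_count: float) -> float:
    """Coefficient of variation of a Poisson count: 1 / sqrt(mean)."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    return 1.0 / math.sqrt(expected_count)


def reads_needed(gene_fraction: float, target_cv: float) -> float:
    """Approximate library size at which a gene receiving
    gene_fraction of all reads reaches target_cv from counting
    noise alone."""
    if not 0.0 < gene_fraction <= 1.0:
        raise ValueError("gene_fraction must be in (0, 1]")
    if target_cv <= 0:
        raise ValueError("target_cv must be positive")
    return 1.0 / (target_cv ** 2 * gene_fraction)


# Hypothetical expression levels, as fractions of all mapped reads.
for label, fraction in (("strongly expressed", 1e-4),
                        ("moderately expressed", 1e-5)):
    n = reads_needed(fraction, target_cv=0.10)
    print(f"{label}: about {n:,.0f} reads per mouse for 10% Poisson CV")
```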


            • #7
              Thank you for your detailed reply, Simon.

              For Parentals: My understanding from microarray statistics was that if you have n individuals, it is better to divide them into smaller subsets and run a microarray on each subset pool separately, rather than pool all n individuals and do technical replicates. Your basic idea of "keep the total read count of each biological group constant" pretty much echoes this view.
              Given our number of parentals (10 per strain), we're pretty much limited by the number of barcodes that can be run on a lane (12). The best would be 5 pools (i.e. what you called subsets) per strain, but we will probably have to settle for 3 or 4 pools per strain, even though the number of animals per pool may then not be the same.
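
              A quick sketch of how such pools could be laid out. The animal labels and helper names are hypothetical, and the figure of three strains is only inferred from the 9-plex example earlier in the thread, so treat it as an assumption.

```python
def max_pools_per_strain(n_barcodes: int, n_strains: int) -> int:
    """Largest number of pools per strain that fits on one lane."""
    if n_strains < 1:
        raise ValueError("need at least one strain")
    return n_barcodes // n_strains


def split_into_pools(animals, n_pools):
    """Deal animals round-robin into n_pools pools, so that pool
    sizes differ by at most one."""
    if not 1 <= n_pools <= len(animals):
        raise ValueError("n_pools must be between 1 and the number of animals")
    pools = [[] for _ in range(n_pools)]
    for i, animal in enumerate(animals):
        pools[i % n_pools].append(animal)
    return pools


animals = [f"animal_{i}" for i in range(1, 11)]  # 10 per strain

# 12 barcodes per lane; 3 strains is an assumption.
print("max pools per strain:", max_pools_per_strain(12, 3))

for k in (3, 4, 5):
    sizes = [len(pool) for pool in split_into_pools(animals, k)]
    print(f"{k} pools -> sizes {sizes}")
```

              In practice the animal order would be shuffled before dealing, so that pool membership is random.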

              F2: I am relieved that technical replicates are not needed. Of course I will do some model testing for robustness, but we have tissue from far more than 40 animals archived, and it would not be too much of a hassle to extract a bit more RNA...
