
  • differential gene expression and variance issues

    I have 5 time-points with 2 biological replicates (collected and prepared on separate days following exactly the same protocol) of bacteria during starvation-induced development. I've done the analysis using CLC Genomics Workbench and tophat-cufflinks-cuffdiff (yes, I realize I probably only need bowtie for bacteria, but I figured looking for nonexistent splice junctions would just take computational time and shouldn't change anything).

    My problem is this: there are a number of genes that I know are differentially regulated (previously published, and validated by me by qPCR) that go up by many fold (one example goes from 50 RPKM to about 4000) but that both programs report as not statistically significant because there is high variability between replicates.

    Instead, the genes that are reported as statistically significant are expressed at very low levels, with less variability and without a very large fold change up or down (from 20 to 2 RPKM, for example). These seem less likely to be interesting biologically.

    So my question is: am I going to be able to get anything statistically valid out of this data, or, if there's a lot of variation, am I just out of luck? I am sure I could just cherry-pick genes for future work, but that seems like a waste of data.

    If I try DESeq, will I just have the same problem in a different format, or might the different ways the programs analyze the data change the way statistics are calculated?

    Thanks,
    Anna

  • #2
    If you want to know whether DESeq will give you the same answer, you will just have to try.

    As for the qPCR validation: Have you only validated that the gene goes up in one replicate, or have you also validated that the variance is low by performing your qPCR on the time points of the second replicate, too?



    • #3
      I didn't do qPCR validation of the second data set, but if I do parallel analyses for each set of replicates (at least in the CLC software), I do see up-regulation of a number of known genes within each replicate set of timepoints. There is a bit of variation in timing, etc., but the genes I expect to go up do go up.


      The problem comes when I try to do statistics: the large variance in levels between the replicates makes the p-values really big for most of my "known" up-regulated genes.
      I'm considering whether I need to do some sort of paired comparison, but then I'm not sure if I'll have to do separate analyses for each timepoint, comparing each timepoint to 0 hrs. And if I do that, do I have to make an even more severe significance correction, since I'm effectively doing 4 separate tests? I wish I'd taken statistics more recently than 10 years ago.
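
On the question of correcting for four separate comparisons against 0 hrs: one common option is to pool the p-values from all comparisons and apply a single Benjamini-Hochberg (false discovery rate) adjustment. The following is a minimal sketch in plain Python, assuming the per-gene p-values have already been computed by whatever tool is used; the function name and the numbers are hypothetical and only illustrate the arithmetic.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values pooled from all timepoint-vs-0-hrs comparisons.
pooled = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
for raw, adj in zip(pooled, benjamini_hochberg(pooled)):
    print(f"raw={raw:.3f}  adjusted={adj:.4f}")
```

Because the adjustment is applied once over the pooled list, the number of tests is counted in `m` directly and no extra correction per timepoint is layered on top.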

      On a partly unrelated note, the more I look through my data, the more I feel like cufflinks/cuffdiff is just not ideal for bacterial genomes. I feel like it doesn't deal well with the whole "many genes are in operons" issue. Has anyone else had experience with this and did you find something better?

      And are there any programs that don't lump sense and antisense transcripts when counting reads mapping to a particular genomic region (also a somewhat bacteria-specific problem, I think)?



      • #4
        Yes, when a paired analysis is warranted, it can have much more power than a naive one. In that case, you have to use a tool like DESeq, because cuffdiff does not offer functionality for designs that go beyond a two-group comparison.
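
To illustrate why pairing can help when replicates differ in overall level but agree on the direction and size of the change, here is a rough sketch comparing a paired and an unpaired t statistic on log2 values. This is not how DESeq computes its statistics (it fits a negative binomial model to read counts); the RPKM numbers and function names are hypothetical, and with only two replicates the statistics are shown purely for intuition.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic on the per-replicate differences (paired design)."""
    diffs = [a - b for b, a in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


def welch_t(before, after):
    """t statistic treating the two groups as independent (unpaired)."""
    se = math.sqrt(
        statistics.variance(before) / len(before)
        + statistics.variance(after) / len(after)
    )
    return (statistics.mean(after) - statistics.mean(before)) / se


# Hypothetical RPKM values for one gene; list position = replicate.
rpkm_0h = [50.0, 10.0]
rpkm_late = [4000.0, 900.0]

log_0h = [math.log2(x) for x in rpkm_0h]
log_late = [math.log2(x) for x in rpkm_late]

t_paired = paired_t(log_0h, log_late)
t_unpaired = welch_t(log_0h, log_late)
print(f"paired t = {t_paired:.2f}, unpaired t = {t_unpaired:.2f}")
```

With these toy numbers the paired statistic comes out far larger than the unpaired one, because the replicate-to-replicate difference in baseline cancels out of the within-replicate fold changes.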



        • #5
          Thanks, Simon, I'll give DESeq a try.



          • #6
            Hi amcloon

            I am interested in the outcome of your analysis with DESeq, since I have a similar issue with multiple-timepoint analysis and variability between samples.

            Did you end up using the paired analysis, or staying with single analyses comparing everything to time zero?

            Cheers

            Sam
