  • miaom
    Member
    • Oct 2012
    • 12

    edgeR: how to calculate group averages/means?

    I am trying edgeR now to find differentially expressed genes.
    I ran:
    d <- DGEList(counts=data,group=g,lib.size=libSizes)
    d <- calcNormFactors(d)
    d <- estimateCommonDisp(d)
    d <- estimateTagwiseDisp(d)
    d <- exactTest(d)
    # d.final <- topTags(d, n = nrow(data))
    and got results like these:

                   logFC    logCPM    PValue
    A1bg     1.168034660 -3.842894 0.4137326
    A1cf     0.000000000      -Inf 1.0000000
    A1i3     0.000000000      -Inf 1.0000000
    A2m     -0.003703085  4.419204 0.9421192
    A3galt2  0.437990409  2.861665 0.2611483

    But I'd like to know the average (mean) expression of each gene in each condition/group.
    The logFC seems to differ from "log2(rowMeans(GrpB) / rowMeans(GrpA))" calculated from the cpm() values.

    How should I do the calculation?

    Thanks!
  • xrao
    Member
    • Mar 2014
    • 10

    #2
    Have you found the answer? I am also curious. Thank you!


    • Schelarina
      Member
      • Apr 2014
      • 18

      #3
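      Do you need the CPM for each sample after TMM normalization?
      You could do something like this:
      logcpm <- cpm(d, log=TRUE, normalized.lib.sizes=TRUE, prior.count=0.25)
      Here d has to be the DGEList itself, which already carries the library sizes, so there is no need to pass them again. Negative log-CPM values are normal for low-count genes; if you want them closer to zero, set prior.count to a larger value.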
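
For the per-group means asked about in this thread, one possible approach is to average the per-sample CPM values within each group. The following is only a minimal sketch on a made-up count matrix: the object names (counts, group, y, cpm_mat, group_means) are illustrative and not from the original posts. Note also that in the first post `d` ends up holding the exactTest() result, so cpm() would need the DGEList from before that step.

```r
library(edgeR)

# Toy count matrix: 4 genes x 4 samples (made-up numbers, for illustration only)
counts <- matrix(c( 10,  12,  30,  28,
                     0,   0,   0,   0,
                   100,  90,  95, 110,
                     5,   7,  20,  25),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                                 c("A1", "A2", "B1", "B2")))
group <- factor(c("A", "A", "B", "B"))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)   # TMM normalization factors

# CPM on the natural (unlogged) scale, using the TMM-adjusted library sizes
cpm_mat <- cpm(y, normalized.lib.sizes = TRUE)

# Average the samples of each group, gene by gene
group_means <- sapply(levels(group), function(g) {
  rowMeans(cpm_mat[, group == g, drop = FALSE])
})
group_means
```

The log2 ratio of these two columns will generally not match edgeR's logFC exactly, since the exact test does not work from a simple ratio of mean CPMs. Recent edgeR versions also appear to provide cpmByGroup() for a model-based version of the same idea; check the documentation of your installed version before relying on it.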


      • xrao
        Member
        • Mar 2014
        • 10

        #4
        Thank you! But we meant mean values for each group/condition.


        • ckruse6
          Junior Member
          • Aug 2018
          • 1

          #5
          Manual Calc option

          So, I know it isn't ideal, but you can work backwards from the logCPM and logFC values that edgeR already gave you. Keep in mind this only works as written if the experimental and control groups have the same number of replicates, though you could tweak it as needed. edgeR reports both values on a log2 scale. The method also treats 2^logCPM as the plain average of the two group means, which is only approximately what edgeR's logCPM is.

          Let "A" be the average expression of your experimental condition and "B" that of your control, as you've run it.

          FC = A/B, so
          2^logFC = A/B
          CPM = (A+B)/2, so
          2^logCPM = (A+B)/2

          Basic algebra solves for B:
          B = 2*(2^logCPM) / (1 + 2^logFC)

          and then
          A = 2*(2^logCPM) - B

          Starting from a typical edgeR results table, it takes minimal R work to generate the columns you wanted.
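
The algebra above could be sketched in base R roughly as follows. The function name approx_group_means and the data frame res are hypothetical, introduced only for illustration; the two example rows are the A1bg and A2m values from the first post. The result inherits every assumption of the manual method (log2 values, equal replicate numbers, 2^logCPM taken as the plain average of A and B), so treat the output as approximate.

```r
# Back-solve approximate group means from edgeR's logFC and logCPM columns.
# Assumes both are log2 values, equal replicate numbers per group, and that
# 2^logCPM is close to (A + B) / 2, which holds only approximately.
approx_group_means <- function(logFC, logCPM) {
  total <- 2 * 2^logCPM         # A + B
  B <- total / (1 + 2^logFC)    # control (denominator of the fold change)
  A <- total - B                # experimental (numerator of the fold change)
  data.frame(A = A, B = B)
}

# Values for A1bg and A2m copied from the first post
res <- data.frame(logFC  = c(1.168034660, -0.003703085),
                  logCPM = c(-3.842894, 4.419204),
                  row.names = c("A1bg", "A2m"))

est <- approx_group_means(res$logFC, res$logCPM)
rownames(est) <- rownames(res)
est
```

Genes with logCPM of -Inf (no counts in any sample) simply come out as 0 in both columns.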

          Hope this helps!
