  • Batch effect for RNAseq data

    Hi all

    I am fairly new to RNA-seq and am currently working with RNA-seq data from the BrainSpan database (http://brainspan.org/). The database provides normalized expression values and, as far as I know, these still need batch-effect correction. Is there a Bioconductor package, or some other way, to do this?

    Thanks

  • #2
    The standard tool is the sva package, specifically its ComBat() function.


    • #3
      Hi - I have a question about 'manually' adjusting for batch effects in RNA-seq data. By 'manually' I mean not relying on the built-in batch handling in packages like edgeR and DESeq2, but instead using ComBat, gene-wise normalization, or a linear model to adjust for batch effects.

      I realize there are several options for removing such effects, but most methods (such as ComBat or a linear model) require normalized, approximately normally distributed data to begin with. So, for instance, one would use cpm() in edgeR (or the equivalent in DESeq) to obtain normalized counts in log space, which can then be batch-adjusted using the batch variable from the experimental design.
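To make "normalized counts in log space" concrete, here is a minimal Python sketch of a log2 counts-per-million transform. It is an illustration only, not edgeR's implementation: the function names and the fixed `offset` are assumptions made for this example, and edgeR's own prior-count handling may differ in detail.

```python
import math

def library_sizes(counts):
    """Total counts per sample (column sums of a genes-by-samples table)."""
    return [sum(column) for column in zip(*counts)]

def log_cpm(counts, offset=0.5):
    """log2 counts-per-million with a small offset to avoid log(0).

    `counts` is a list of rows, one per gene, each holding one raw count
    per sample. The offset value is an assumption for this sketch.
    """
    libs = library_sizes(counts)
    return [
        [math.log2((c + offset) / (lib + 2 * offset) * 1e6)
         for c, lib in zip(row, libs)]
        for row in counts
    ]

# Two genes, two samples; the second sample was sequenced twice as deeply.
raw_counts = [[10, 20], [90, 180]]
log_values = log_cpm(raw_counts)
```

After the transform, the two samples have nearly identical values for each gene, because the difference in sequencing depth has been scaled out. Values like these are what a batch-adjustment method would take as input.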

      My question is this: once these normalized counts have been adjusted for batch effects (by any method), you cannot plug the adjusted numbers back into a differential expression function in edgeR or DESeq, because the results would be nonsensical. At the same time, we cannot batch-adjust the raw counts before normalizing them.

      How does one solve this issue? I have a pretty strong batch effect in my data that I'm struggling to remove effectively prior to differential expression testing.

      Thanks


      • #4
        With sva, you get a list containing the surrogate variables, which you then just add as covariates to your design. ComBat() itself produces an adjusted expression set, which is more useful for something like limma.
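The idea of adding covariates to the design can be sketched for a single gene with ordinary least squares. This is a toy illustration in plain Python, not what sva, edgeR, or DESeq2 do internally: it uses a known batch label rather than estimated surrogate variables, and the data and helper names (`solve`, `ols`) are made up for the example.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(design, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical log-expression of one gene in six samples.
# True condition effect = 1, true batch effect = 3, unbalanced design.
condition = [0, 0, 0, 1, 1, 1]
batch     = [0, 0, 1, 0, 1, 1]
log_expr  = [5.0, 5.0, 8.0, 6.0, 9.0, 9.0]

# Design without batch: intercept + condition.
coef_without = ols([[1, c] for c in condition], log_expr)
# Design with batch as a covariate: intercept + condition + batch.
coef_with = ols([[1, c, b] for c, b in zip(condition, batch)], log_expr)
```

In this toy data the condition coefficient is 2 when batch is ignored and 1 (the true value) when batch is in the design, because the batches are not balanced across conditions. The expression values themselves are never altered; the batch term only absorbs its share of the variation during fitting.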


        • #5
          Originally posted by dpryan
          In the case of SVA, you get a list containing the surrogate variables. You then just add them as covariates to your design. ComBat() itself produces a tweaked expression-set, which is more useful for something like limma.

          Thanks for your reply! I actually did try adding the batch term as a covariate in the design specification in both edgeR and DESeq2, but I see very few DE genes (10-20 out of 20,000 tested), which is why I was looking to do the adjustment independently through ComBat or another method.

          My main issue is that I may have corrected (normalized) counts from an independent batch-adjustment method, but any DE package (DESeq, edgeR, or even limma's voom) requires raw counts, because it does its own internal normalization/rescaling. Feeding it adjusted values would make the results meaningless.

          I don't see an easy way around this. Is there any package or option that lets you supply already-normalized data without applying any transformation internally?

          Thanks, any help would be greatly appreciated.


          • #6
            Just use limma directly on the already-normalized values. You don't need to run voom().
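As a rough picture of what testing on already-normalized values means, the Python sketch below computes an ordinary two-group t-statistic per gene from log-scale values. It is only the unmoderated core of the idea: limma additionally moderates the gene-wise variances with empirical Bayes, which is omitted here, and the gene names, values, and `pooled_t` helper are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Two-sample t-statistic (group_b minus group_a) with pooled variance."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_b) - mean(group_a)) / standard_error

# Hypothetical normalized, log-scale expression: (control, treated) per gene.
log_expression = {
    "geneA": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "geneB": ([2.0, 3.0, 4.0], [2.5, 3.0, 3.5]),
}

# No count model and no internal renormalization: the values are used as given.
t_stats = {gene: pooled_t(control, treated)
           for gene, (control, treated) in log_expression.items()}
```

Nothing in this calculation assumes raw counts, which is why log-scale values can be modelled directly in this way.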
