  • Matina
    Junior Member
    • Jun 2012
    • 6

    Comparison between different batches and global analysis

    Hello everyone,

    I have RNA-seq data from control, breast cancer and endometrial cancer patients. It is an ongoing study: we sequenced 15 patients (5 control, 5 breast cancer and 5 endometrial cancer) about a year ago and 5 more patients (1 control, 2 breast cancer and 2 endometrial cancer) a month ago.

    I want to compare the two independent batches and also combine them for a global analysis. I ran an independent DE analysis on each batch using edgeR and DESeq2; however, when I compare the gene lists I find very few common DE genes. Do I need to do some kind of normalisation before I compare them? For the global analysis, I will put everything together, but how can I correct for batch effects?

    Thank you so much!
  • blancha
    Senior Member
    • May 2013
    • 367

    #2
    Just a general comment first.
    I've had very poor results regarding correlation between human clinical samples.
    There are too many variables that are not controlled for.
    Often, when you enquire about the collection method of the samples, you'll find that they are not identical for all samples, e.g. some samples were frozen for a long period of time, the tumor samples contain different proportions of normal tissue, ...
    I've often been unpleasantly surprised that the samples that were described to me as replicates actually had significant differences.
    The best results are generally obtained when the person collecting the samples is experienced in RNA-Seq analysis, and is aware how the collection and preparation of the samples can affect the downstream analysis.

    There are also variations between cancers, but that is biologically relevant.

    As an analyst, you need to get as much information as possible about the collection method of the samples, and collect as many metrics as possible to better understand the samples (e.g. RNA integrity number for all the samples, exonic alignment rate, ...).

    -----

    Second, you have few or no replicates in your second batch, so you would expect DESeq and edgeR to give you a different list of differentially expressed genes. In the second batch, for example, you have only one control sample and therefore no control replicates, which will affect the dispersion estimates calculated by DESeq and edgeR.
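To make the replicate problem concrete, here is a minimal Python sketch that only tallies the sample counts given in the first post. The helper names `residual_df` and `unreplicated` are hypothetical, and the residual degrees of freedom assume a simple one-factor design (samples minus groups); this is not how DESeq or edgeR estimate dispersion, just an illustration of how little information the second batch carries.

```python
# Sample counts per batch, taken from the first post.
batches = {
    "batch1": {"control": 5, "breast": 5, "endometrial": 5},
    "batch2": {"control": 1, "breast": 2, "endometrial": 2},
}


def residual_df(groups):
    """Residual degrees of freedom for a one-factor design: samples minus groups."""
    return sum(groups.values()) - len(groups)


def unreplicated(groups):
    """Groups with a single sample, i.e. no replicate to estimate variability from."""
    return sorted(name for name, n in groups.items() if n < 2)


for name, groups in batches.items():
    print(name, "residual df:", residual_df(groups),
          "unreplicated groups:", unreplicated(groups))
```

With so few residual degrees of freedom in the second batch, its gene list is expected to be unstable, so a small overlap with the first batch is not surprising on its own.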

    -----

    To answer your question though, I would first try clustering the samples using the normalized counts. If the samples do not cluster according to the tissue of origin but rather according to the batch, I would try removing the batch effect with ComBat. I would then repeat the clustering to verify whether ComBat has successfully removed the batch effect.
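The check-correct-recheck loop described above can be sketched in plain Python. Everything here is an assumption for illustration: the toy expression values are hypothetical (taken to be already normalized and on a log scale), `separation` is a made-up score rather than a real clustering, and `center_by_batch` is a simple per-gene, per-batch mean-centering used as a stand-in. It is NOT ComBat, which fits an empirical Bayes model.

```python
import math
from itertools import combinations

# Hypothetical toy data: log-scale normalized expression, 6 genes, 4 samples.
expr = {
    "s1": [5, 6, 7, 8, 9, 10],   # control, batch A
    "s2": [6, 5, 8, 7, 9, 10],   # tumor,   batch A
    "s3": [8, 9, 4, 5, 12, 7],   # control, batch B
    "s4": [9, 8, 5, 4, 12, 7],   # tumor,   batch B
}
tissue = {"s1": "control", "s2": "tumor", "s3": "control", "s4": "tumor"}
batch = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}


def distance(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def separation(expr, labels):
    """Mean distance across labels minus mean distance within labels.

    A positive value means samples sharing a label sit closer together.
    """
    same, diff = [], []
    for a, b in combinations(sorted(expr), 2):
        d = distance(expr[a], expr[b])
        (same if labels[a] == labels[b] else diff).append(d)
    return sum(diff) / len(diff) - sum(same) / len(same)


def center_by_batch(expr, batch):
    """Subtract each gene's within-batch mean (simplified stand-in, not ComBat)."""
    centered = {}
    for b in set(batch.values()):
        members = [s for s in expr if batch[s] == b]
        n_genes = len(expr[members[0]])
        means = [sum(expr[s][g] for s in members) / len(members)
                 for g in range(n_genes)]
        for s in members:
            centered[s] = [v - m for v, m in zip(expr[s], means)]
    return centered


before = (separation(expr, tissue), separation(expr, batch))
corrected = center_by_batch(expr, batch)
after = (separation(corrected, tissue), separation(corrected, batch))
print("before: tissue %.2f, batch %.2f" % before)
print("after:  tissue %.2f, batch %.2f" % after)
```

In this toy case the samples group by batch before correction and by tissue afterwards. The toy design is balanced, with every tissue present equally in both batches; with an unbalanced design such as the 5/5/5 versus 1/2/2 split in this thread, naive centering can also remove part of the biological signal, so repeating the clustering after correction matters.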
    Last edited by blancha; 05-18-2014, 10:01 AM.


    • Matina
      Junior Member
      • Jun 2012
      • 6

      #3
      Thank you very much for your answer, blancha.
      I'll try using ComBat!
