  • Combining RNA-seq datasets

    I searched a lot of threads here and elsewhere without finding anything exactly like this.

    I have two RNA-seq datasets generated on different dates and on different platforms. Between them they cover 4 groups (normal and 3 stages of cancer), distributed as follows:

    Set 1
    1x normal
    2x stage 1
    3x stage 2
    7x stage 3

    Set 2
    2x normal
    2x stage 1
    2x stage 2
    2x stage 3

    We want to combine the datasets and make comparisons between the groups for differential expression. So far, I've tried:

    - Combined all FPKM values into one table.
    - Ran ComBat on the table, specifying dataset as the batch and stage as a covariate.
    - Skipped voom(), since these are not raw counts, and log2-transformed the ComBat output for limma.
    - Ran lmFit, contrasts.fit, and eBayes from limma on the transformed output.

    My questions are:

    1. Should I be using FPKM values or the raw counts for this, given the two datasets and need for batch removal?

    2. What is the best way to run limma on the ComBat output without conversion through voom()?

    3. Are there any other glaring problems with this approach?

    Thanks!

  • #2
    Hi -
    Were you able to find a solution to this problem?


    • #3
      This is actually a very common type of RNA-seq analysis where we combine two datasets. You can run voom and limma on the raw counts, as you would for any analysis. When you form the design matrix, include a term for the batch effect like this:

      design <- model.matrix(~Stage+Set)

      Here Set is the batch factor taking values "Set1" and "Set2" and Stage is the experimental factor taking values "Normal", "Stage1", "Stage2" and "Stage3".
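
To make the design matrix concrete, here is a minimal sketch in base R that builds the two factors from the sample counts given in the original question. The ordering of samples within each set is an assumption made for illustration; in a real analysis the factors must follow the column order of the count matrix.

```r
# Sample layout from the original question: 13 samples in Set 1, 8 in Set 2.
# The sample ordering here is assumed, not taken from the real data.
stage_levels <- c("Normal", "Stage1", "Stage2", "Stage3")

Stage <- factor(
  c(rep(stage_levels, times = c(1, 2, 3, 7)),   # Set 1
    rep(stage_levels, times = c(2, 2, 2, 2))),  # Set 2
  levels = stage_levels
)
Set <- factor(rep(c("Set1", "Set2"), times = c(13, 8)))

# Every stage appears in both sets, so stage and batch effects are separable.
table(Stage, Set)

design <- model.matrix(~ Stage + Set)
colnames(design)
# "(Intercept)" "StageStage1" "StageStage2" "StageStage3" "SetSet2"
```

The three Stage columns are each stage's difference from Normal, and SetSet2 absorbs the average difference between the two datasets.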

      This is a very standard type of analysis. There is no need for any external batch correction such as ComBat.
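
The workflow described in this reply could look roughly like the sketch below. The count matrix is simulated as a hypothetical stand-in for the combined raw counts, and the filtering and normalization steps from edgeR (DGEList, filterByExpr, calcNormFactors) are an assumption about typical preprocessing, not something stated in the reply.

```r
library(limma)
library(edgeR)

set.seed(1)

# Factors as described in the thread (sample ordering is assumed).
stage_levels <- c("Normal", "Stage1", "Stage2", "Stage3")
Stage <- factor(c(rep(stage_levels, times = c(1, 2, 3, 7)),
                  rep(stage_levels, times = c(2, 2, 2, 2))),
                levels = stage_levels)
Set <- factor(rep(c("Set1", "Set2"), times = c(13, 8)))

# Hypothetical raw counts: 1000 genes x 21 samples, simulated for illustration.
counts <- matrix(rnbinom(1000 * 21, mu = 100, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000),
                                 paste0("sample", 1:21)))

# Batch enters the model as an additive term alongside the factor of interest.
design <- model.matrix(~ Stage + Set)

# Assumed preprocessing: drop low-count genes, then TMM-normalize.
dge <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# voom converts counts to log2-CPM with precision weights for limma.
v <- voom(dge, design)
fit <- lmFit(v, design)

# Stage 3 versus Normal, adjusted for dataset.
top_stage3_vs_normal <- topTable(eBayes(fit), coef = "StageStage3", number = 5)

# Stage 3 versus Stage 1: a contrast over the design columns
# (Intercept, Stage1, Stage2, Stage3, Set2).
fit2 <- eBayes(contrasts.fit(fit, contrasts = c(0, -1, 0, 1, 0)))
top_stage3_vs_stage1 <- topTable(fit2, number = 5)
```

Because the batch term is part of the linear model, the stage comparisons are adjusted for dataset without altering the expression values themselves.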


      • #4
        Hi Gordon- Thanks for your reply.
        I am a beginner at this. I know how to use DESeq2 to analyze RNA-seq data from tables generated by the summarizeOverlaps function. I was wondering how your suggestion would be implemented in that setting.
        Hussein


        • #5
          Originally posted by Gordon Smyth
          This is actually a very common type of RNA-seq analysis where we combine two datasets. You can run voom and limma on the raw counts, as you would for any analysis. When you form the design matrix, include a term for the batch effect like this:

          design <- model.matrix(~Stage+Set)

          Here Set is the batch factor taking values "Set1" and "Set2" and Stage is the experimental factor taking values "Normal", "Stage1", "Stage2" and "Stage3".

          This is a very standard type of analysis. There is no need for any external batch correction such as ComBat.

          I have been using limma for a while now, but I am also a bit unsure about the syntax of the model.matrix command when it comes to batch effects, random effects, paired designs, etc.

          If I understood correctly, I don't need to use any special command like removeBatchEffect()? I see this command come up in some Bioconductor threads when I google, but I no longer see it in the limma manual. Is it now deprecated?

          Thank you for taking the time to answer my question.


          • #6
            Originally posted by NGSfan
            If I understood correctly, then I don't need to use any special command like removeBatchEffect()? I see this command come up on some Bioconductor threads when I google. But I no longer see it in the limma manual. Is this now deprecated?

            Type ?removeBatchEffect to read the documentation page. The documentation page explains that it is used to make unsupervised plots rather than for differential expression analyses.

            This is the way that removeBatchEffect has always been treated. It has not been removed from any documentation.
            Last edited by Gordon Smyth; 01-29-2015, 10:06 PM. Reason: minor grammar improvement
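
As an illustration of the plotting use mentioned above, the following sketch removes a batch shift from a log-expression matrix before drawing an MDS plot. The matrix is simulated and the size of the Set 2 shift is arbitrary; both are assumptions for illustration. With real data, the input would be log-scale values such as the E component of a voom object.

```r
library(limma)

set.seed(1)

# Factors as described in the thread (sample ordering is assumed).
stage_levels <- c("Normal", "Stage1", "Stage2", "Stage3")
Stage <- factor(c(rep(stage_levels, times = c(1, 2, 3, 7)),
                  rep(stage_levels, times = c(2, 2, 2, 2))),
                levels = stage_levels)
Set <- factor(rep(c("Set1", "Set2"), times = c(13, 8)))

# Hypothetical log-expression values with an artificial shift in Set 2.
logexpr <- matrix(rnorm(500 * 21), nrow = 500)
logexpr[, Set == "Set2"] <- logexpr[, Set == "Set2"] + 2

# Pass the stage design so that stage differences are preserved
# while the dataset effect is subtracted.
design_stage <- model.matrix(~ Stage)
corrected <- removeBatchEffect(logexpr, batch = Set, design = design_stage)

# For visualization only; the corrected matrix is not meant as input
# to the differential expression model.
plotMDS(corrected, labels = as.character(Stage), col = as.integer(Set))
```

For the differential expression test itself, the batch stays in the design matrix as `~Stage+Set`, as described earlier in the thread.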
