
  • DESeq2 analysis: Seeming incongruity between PCA & distance heatmap

    Hello all,

    I'm running an RNA-seq analysis with DESeq2 (R version 3.1.0, DESeq2_1.4.5). Looking at my QC plots, I noticed an odd discrepancy between the PCA plot and the distance heatmap.

    One of the samples (labeled Sample_4 in the attached images) clusters right among the other samples on the PCA, but on the heatmap it appears to be an outlier compared to the other samples.

    I've run a lot of analyses like this using DESeq2 with very similar code, and I've never seen a discrepancy this big between these two plots. Has anyone encountered this situation, or does anyone have a good idea as to what might explain it?

    Could it have to do with the relatively small amount of total variation explained by PC1 and PC2 (18.4% & 17.6% respectively)?



  • #2
    The distance between samples on the PCA plot is an approximation of the distance using all the genes, and the quality of the approximation depends on how much variance is captured by PC1 and PC2 (here only about 36%, i.e. 18.4% + 17.6%).
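To make that concrete, here is a minimal sketch in base R on simulated data (the matrix `mat` is a hypothetical stand-in for `assay(rld)`, not the poster's data). It compares Euclidean distances computed on all genes with distances computed on only the PC1/PC2 scores. Distances in the two-PC plane can only be smaller than or equal to the full distances, and using all PCs recovers the full distances.

```r
set.seed(1)
n_genes <- 2000
n_samples <- 8

# Hypothetical stand-in for assay(rld): genes in rows, samples in columns
mat <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
              dimnames = list(NULL, paste0("Sample_", seq_len(n_samples))))

# PCA on samples (prcomp expects samples in rows)
pca <- prcomp(t(mat))
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

d_full    <- dist(t(mat))        # what the distance heatmap shows
d_2pc     <- dist(pca$x[, 1:2])  # what the PC1 vs PC2 scatter shows
d_all_pcs <- dist(pca$x)         # distances using every PC

# Percent variance on PC1 and PC2
print(round(100 * var_explained[1:2], 1))

# Fraction of each pairwise distance that is visible in the 2D plot
print(summary(as.vector(d_2pc) / as.vector(d_full)))
```

When PC1 and PC2 carry only a modest share of the variance, a sample can look unremarkable in the scatter plot while still being far from everything else in the full distance matrix.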

    Also, if you were using plotPCA (it looks like you are not though), the PCA is calculated on the top n genes ranked by variance, instead of all the genes.
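A rough sketch of that gene selection step, in base R on simulated data. The cutoff `ntop <- 500` is an assumption here (I believe it matches the default of the `ntop` argument of plotPCA, but check your installed version), and `mat` is again a hypothetical stand-in for `assay(rld)`.

```r
set.seed(2)
n_genes <- 3000

# Hypothetical stand-in for assay(rld): genes in rows, samples in columns
mat <- matrix(rnorm(n_genes * 6), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("Sample_", 1:6)))

ntop <- 500  # assumed cutoff, see note above

# Rank genes by their variance across samples and keep the top ntop
rv <- apply(mat, 1, var)
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

# PCA sees only the selected genes ...
pca_top <- prcomp(t(mat[select, ]))

# ... while a heatmap built from dist(t(mat)) sees all of them
d_all <- dist(t(mat))
d_top <- dist(t(mat[select, ]))
print(summary(as.vector(d_top) / as.vector(d_all)))
```

If a sample differs from the rest mostly through many small shifts in genes outside the top set, those differences contribute to the all-gene distance but never enter the PCA.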


    • #3
      Hey Mike. Thanks for responding. I'm working with Alex on this. We also used the plotPCA function and got the same results (the code used here to make the PCA plots was based on plotPCA; we had difficulty changing the default colors at one point in plotPCA's history).

      I believe the sample distance heatmap was made using some code that may have been part of the DESeq vignette at one point - something like `heatmap.2(as.matrix(dist(t(assay(rld)))))` where rld is the regularized log transformed dataset.

      So I think I'm hearing you say we're seeing this because the first two PCs aren't explaining that much variance. Still seems odd to me that the distance matrix based on all genes (or top N based on variance) still shows this particular sample as a pretty obvious outlier where PCA did not.

      Looking forward to those time-series examples in the vignette you promised last month in Boston


      • #4
        hi Stephen,

        Something to explore: look into the other PCs to see if this sample sticks out in one of those.
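One way to do that check, sketched in base R on simulated data (everything here is made up for illustration: two group effects that dominate PC1 and PC2, plus a shift that affects only Sample_4). The idea is to standardize the scores on each PC and ask on which PC the sample of interest is most extreme.

```r
set.seed(42)
n_genes <- 2000
samples <- paste0("Sample_", 1:8)

# Hypothetical stand-in for assay(rld): genes in rows, samples in columns
mat <- matrix(rnorm(n_genes * 8), nrow = n_genes,
              dimnames = list(NULL, samples))

# Two simulated group effects (think condition and batch)
mat[1:300, 5:8] <- mat[1:300, 5:8] + 4
mat[301:600, c(2, 4, 6, 8)] <- mat[301:600, c(2, 4, 6, 8)] + 4
# A weaker shift that affects Sample_4 only
mat[601:900, 4] <- mat[601:900, 4] + 3

# Full-distance view: mean distance of each sample to all the others
d <- as.matrix(dist(t(mat)))
mean_dist <- rowSums(d) / (ncol(d) - 1)
print(round(mean_dist, 1))

# PCA view: standardized score of every sample on every PC
pca <- prcomp(t(mat))
scores <- pca$x[, 1:7]  # 8 samples give at most 7 informative PCs
z <- scale(scores)

# PC on which Sample_4 is most extreme
stick_out_pc <- unname(which.max(abs(z["Sample_4", ])))
print(stick_out_pc)
print(round(z["Sample_4", ], 2))
```

With real data, replace `mat` with `assay(rld)` and the sample name with the one you are suspicious of, then plot that PC against PC1 to see whether the sample separates there.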

        Yes, the time series dataset I submitted to Bioc is currently in review, and once that's done I can write up a workflow. It's a fission yeast time series.

