  • DESeq2 analysis: Seeming incongruity between PCA & distance heatmap

    Hello all,

    I'm running an RNA-seq analysis with DESeq2 (R version 3.1.0, DESeq2_1.4.5). Looking at my QC plots, I noticed an odd discrepancy between the PCA plot and the distance heatmap.

    One of the samples (labeled Sample_4 in the attached images) clusters right among the other samples on the PCA, but on the heatmap it appears to be an outlier compared to the other samples.

    I've run a lot of analyses like this using DESeq2 with very similar code, and I've never seen a discrepancy this big between these two plots. Has anyone encountered this situation, or have a good idea as to what might explain it?

    Could it have to do with the relatively small amount of total variation explained by PC1 and PC2 (18.4% & 17.6% respectively)?

    Thanks,

    Alex

  • #2
    The distance between samples on the PCA plot is an approximation of the distance using all the genes, and the quality of the approximation depends on how much variance is captured by PC1 and PC2 (here only ~35%).
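To make the approximation point concrete, here is a minimal base R sketch on simulated data. The matrix `mat` is a hypothetical stand-in for `assay(rld)`, not the poster's data, and nothing here uses DESeq2 itself. It compares the Euclidean distances on all genes with the distances you would read off a PC1 vs PC2 plot.

```r
set.seed(1)
n_genes <- 200
n_samples <- 8

# Hypothetical stand-in for assay(rld): genes in rows, samples in columns
mat <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
              dimnames = list(NULL, paste0("Sample_", seq_len(n_samples))))

pca <- prcomp(t(mat))                     # samples in rows, centered by default
pct_var <- pca$sdev^2 / sum(pca$sdev^2)   # fraction of variance per PC

d_full    <- as.vector(dist(t(mat)))        # what the distance heatmap shows
d_all_pcs <- as.vector(dist(pca$x))         # distances using every PC
d_pc12    <- as.vector(dist(pca$x[, 1:2]))  # distances visible on a PC1/PC2 plot

# Using all PCs reproduces the full distances (a rotation preserves distance)
max_diff <- max(abs(d_full - d_all_pcs))

# Dropping PCs can only shrink a distance, so each ratio is at most 1
ratio <- d_pc12 / d_full

round(pct_var[1:2], 3)
summary(ratio)
```

When PC1 and PC2 capture only a modest share of the variance, many of these ratios sit well below 1, which is how a sample can look unremarkable on the PCA plot and still be far from the others in the heatmap.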

    Also, if you were using plotPCA (it looks like you are not though), the PCA is calculated on the top n genes ranked by variance, instead of all the genes.
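A rough base R sketch of that gene selection step, on simulated data. The value `ntop <- 500` is an assumption meant to mirror the plotPCA default (check `?plotPCA` for your DESeq2 version), and `mat` is again a hypothetical stand-in for `assay(rld)`. Computing the distance matrix on the same gene subset puts the heatmap and the PCA on equal footing.

```r
set.seed(2)
n_genes <- 1000

# Hypothetical stand-in for assay(rld): genes in rows, samples in columns
mat <- matrix(rnorm(n_genes * 6), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("Sample_", 1:6)))

ntop <- 500   # assumed default; not read from DESeq2 here

# Rank genes by variance across samples and keep the top ntop
rv <- apply(mat, 1, var)
keep <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

pca_top <- prcomp(t(mat[keep, ]))   # PCA on the high-variance genes only
pca_all <- prcomp(t(mat))           # PCA on every gene

# Sample distances on the same two gene sets, for a like-for-like comparison
d_top <- as.vector(dist(t(mat[keep, ])))
d_all <- as.vector(dist(t(mat)))
```

If an outlier shows up in `d_all` but not in `d_top`, the genes driving it are outside the top-variance set; if it shows up in both, the gene filter is not the explanation.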

    • #3
      Hey Mike. Thanks for responding. I'm working with Alex on this. We also used the plotPCA function and got the same results (the code used here to make the PCA plots was based on plotPCA; at one point in plotPCA's history we had difficulty changing the default colors).

      I believe the sample distance heatmap was made using some code that may have been part of the DESeq vignette at one point - something like `heatmap.2(as.matrix(dist(t(assay(rld)))))` where rld is the regularized log transformed dataset.

      So I think I'm hearing you say we're seeing this because the first two PCs aren't explaining that much variance. Still seems odd to me that the distance matrix based on all genes (or top N based on variance) still shows this particular sample as a pretty obvious outlier where PCA did not.

      Looking forward to those time-series examples in the vignette you promised last month in Boston

      • #4
        hi Stephen,

        Something to explore: look into the other PCs to see if this sample sticks out in one of those.
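One possible way to do that check, sketched in base R on simulated data. Everything here is illustrative: the group and batch effects and the shift planted in Sample_4 are assumptions made up for the example, and `mat` is a hypothetical stand-in for `assay(rld)`.

```r
set.seed(3)
n_genes <- 500
samples <- paste0("Sample_", 1:8)
group <- rep(c(0, 1), each = 4)    # hypothetical condition
batch <- rep(c(0, 1), times = 4)   # hypothetical second factor

mat <- matrix(rnorm(n_genes * 8, sd = 0.5), nrow = n_genes,
              dimnames = list(NULL, samples))
mat[1:100, group == 1]   <- mat[1:100, group == 1] + 3     # strong group effect
mat[101:200, batch == 1] <- mat[101:200, batch == 1] + 2   # weaker batch effect
mat[201:300, "Sample_4"] <- mat[201:300, "Sample_4"] + 1.5 # planted outlier

pca <- prcomp(t(mat))

# Squared distance of each sample from the centroid, using all genes
centered <- scale(t(mat), center = TRUE, scale = FALSE)
d2_centroid <- rowSums(centered^2)

# Drop the last PC (numerically zero variance), then put every PC on a
# unit-sd scale so a sample's scores are comparable across components
scores <- pca$x[, seq_len(ncol(pca$x) - 1)]
z <- scale(scores)

# Which sample is most extreme on each PC, and how Sample_4 scores on each
extreme <- rownames(z)[apply(abs(z), 2, which.max)]
names(extreme) <- colnames(z)
extreme
round(z["Sample_4", ], 2)
```

A sample that is far from the centroid overall but has unremarkable standardized scores on PC1 and PC2 should show a large score on one of the later components. With real data, `plot(pca$x[, 3], pca$x[, 4])` or `pairs(pca$x[, 1:5])` gives the same information visually.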

        Yes, the time series dataset I submitted to Bioc is currently in review, and once that's done I can write up a workflow. It's a fission yeast time series.
