  • feralBiologist
    Member
    • Jun 2011
    • 61

    Interpreting a PCA plot produced with DESeq2

    I've used DESeq2 to analyse a few RNA-Seq samples. I followed the manual pretty closely and used the following code:

    Code:
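    library(DESeq2)
    library(ggplot2)
    # Variance-stabilise the counts before running the PCA
    vsd <- varianceStabilizingTransformation(dds)
    # returnData=TRUE returns the PC coordinates instead of drawing the plot
    data <- plotPCA(vsd, intgroup=c("condition"), returnData=TRUE)
    percentVar <- round(100 * attr(data, "percentVar"))
    # Recent DESeq2 releases name the sample-label column "name" (older ones used "names");
    # show.legend replaces the deprecated show_guide argument in ggplot2
    plotPCA <- ggplot(data, aes(PC1, PC2, color=condition)) +
      geom_point(size=3) +
      xlab(paste0("PC1: ",percentVar[1],"% variance")) +
      ylab(paste0("PC2: ",percentVar[2],"% variance")) +
      geom_text(aes(label=name), hjust=0.25, vjust=-0.5, show.legend = FALSE)
    ggsave("PCA.pdf", plot = plotPCA)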
    I created the following PCA plot:



    What I don't understand is this: what are the units recorded on the x/y axes, and what do they mean?
    Last edited by feralBiologist; 02-14-2015, 05:30 PM.
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    The axes are dimensionless; they have no units.

    • feralBiologist
      Member
      • Jun 2011
      • 61

      #3
      Originally posted by dpryan:
      The axes are dimensionless; they have no units.
      Thanks for your reply! So you are saying the units printed on the x/y axes have no meaning? Do you know how I can edit the code to remove them?

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        plotPCA <- plotPCA + theme(axis.text.x = element_blank(), axis.text.y=element_blank())

        Or something like that. Check the ggplot2 documentation.

        • Fatt
          Junior Member
          • Jan 2015
          • 1

          #5
          Originally posted by feralBiologist:
          So you are saying the units printed on the x/y axes have no meaning?
          They are the first (x-axis) and second (y-axis) principal components, respectively. It's a projection of your data onto a subspace that captures the maximum variance/minimum error. We can likely say they have no units, but they certainly do have meaning.
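To make the "projection" idea concrete, here is a minimal base-R sketch. It assumes a made-up matrix `expr` (samples in rows, genes in columns) standing in for transformed expression values, and it uses `prcomp`, which as far as I know is also what DESeq2's `plotPCA` calls internally on the most variable genes. The names `expr`, `centered` and `manual_scores` are only illustrative.

```r
# Toy data standing in for a transformed expression matrix:
# 6 samples (rows) x 4 genes (columns). All values are made up.
set.seed(1)
expr <- matrix(rnorm(24), nrow = 6,
               dimnames = list(paste0("sample", 1:6), paste0("gene", 1:4)))

# PCA on the centred (but unscaled) data
pca <- prcomp(expr, center = TRUE, scale. = FALSE)

# Reproduce the scores by hand: centre each gene, then project the
# samples onto the principal-component directions (the rotation matrix)
centered <- scale(expr, center = TRUE, scale = FALSE)
manual_scores <- centered %*% pca$rotation

# The numbers on the PC1/PC2 axes are the first two columns of these scores
scores_match <- isTRUE(all.equal(unname(manual_scores), unname(pca$x)))
print(scores_match)
```

So the tick values on the axes are simply each sample's coordinates along the PC1 and PC2 directions, in the same scale as the transformed input data.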

          • feralBiologist
            Member
            • Jun 2011
            • 61

            #6
            Originally posted by Fatt:
            They are the first (x-axis) and second (y-axis) principal components, respectively. It's a projection of your data onto a subspace that captures the maximum variance/minimum error. We can likely say they have no units, but they certainly do have meaning.
            Yes, I know these are the principal components. It is just that DESeq2 prints units on these axes (you can check the link to the plot in my first post) and I could not make any sense of them. I also saw a lot of other PCA plots (presumably produced by other programs) displaying units on the axes, so I wondered what these are. Just do an image search on Google for "PCA plot" and you will see plenty of graphs displaying units on the axes. It is the units that I find confusing, not the axes themselves.

            • feralBiologist
              Member
              • Jun 2011
              • 61

              #7
              I found the answer to this on Cross Validated: it seems the units denote the raw component scores. As the components are themselves linear combinations of multiple genes, it is hard to interpret these raw scores biologically. They are just coordinates in the two-dimensional PC space and simply help to place the individual samples in that space.
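A small base-R sketch of what "linear combination of multiple genes" means for a single score. The data are made up, and `gene1` is deliberately shifted in half of the samples so that it dominates PC1; the names `expr`, `loadings_pc1`, `top_gene` and `s1` are hypothetical and not part of DESeq2.

```r
# Toy matrix: 8 samples (rows) x 5 genes (columns), values are made up
set.seed(2)
expr <- matrix(rnorm(40), nrow = 8,
               dimnames = list(paste0("sample", 1:8), paste0("gene", 1:5)))

# Make gene1 differ strongly between the first and last four samples
expr[1:4, "gene1"] <- expr[1:4, "gene1"] + 10

pca <- prcomp(expr)

# Loadings: the weight each gene receives in PC1
loadings_pc1 <- pca$rotation[, "PC1"]
top_gene <- names(which.max(abs(loadings_pc1)))
print(top_gene)

# The PC1 score of one sample is the weighted sum of its centred gene values
s1 <- sum((expr["sample1", ] - pca$center) * loadings_pc1)
print(c(by_hand = s1, from_prcomp = unname(pca$x["sample1", "PC1"])))
```

With thousands of genes the weights are spread over many of them, which is why the raw score is hard to read biologically even though it is a perfectly well-defined number.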

              • Skiaphrene
                Member
                • Aug 2013
                • 18

                #8
                I might be saying this a bit late, but the % of inertia (i.e. total variance) captured by each PC is a very useful "quality control" measure for PCA and should be included in the figure if possible. The axis units, as said above, are not very interpretable.

                BTW, that's a whopping big 1st PC! Nice!
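For reference, the percentage of variance per PC can be computed from any `prcomp` result as sketched below. The data are made up and the variable names are illustrative; I believe the `percentVar` attribute returned by `plotPCA(..., returnData=TRUE)` is derived the same way (before the multiplication by 100), but treat that as an assumption and check the DESeq2 source for your version.

```r
# Toy matrix: 10 samples (rows) x 5 genes (columns), values are made up
set.seed(3)
expr <- matrix(rnorm(50), nrow = 10,
               dimnames = list(paste0("sample", 1:10), paste0("gene", 1:5)))

pca <- prcomp(expr)

# Each PC's variance is its squared standard deviation;
# dividing by the total gives the share of variance it captures
percent_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
names(percent_var) <- colnames(pca$x)

print(round(percent_var, 1))
# Cumulative share, handy for judging how much PC1 and PC2 show together
print(round(cumsum(percent_var), 1))
```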

                • student-t
                  Member
                  • Mar 2015
                  • 16

                  #9
                  So, what percentage of the total variance should be expected?

                  • dpryan
                    Devon Ryan
                    • Jul 2011
                    • 3478

                    #10
                    @student-t: I replied to this on biostars.
