  • Interpreting a PCA plot produced with DESeq2

    I've used DESeq2 to analyse a few RNA-Seq samples. I followed the manual pretty closely and, using the following code,

    Code:
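    library(DESeq2)
    library(ggplot2)
    # Variance-stabilize the counts, then get the PCA coordinates as a data frame
    vsd <- varianceStabilizingTransformation(dds)
    data <- plotPCA(vsd, intgroup=c("condition"), returnData=TRUE)
    percentVar <- round(100 * attr(data, "percentVar"))
    # Note: this variable shadows the name of the plotPCA() function
    plotPCA <- ggplot(data, aes(PC1, PC2, color=condition)) +
      geom_point(size=3) +
      xlab(paste0("PC1: ",percentVar[1],"% variance")) +
      ylab(paste0("PC2: ",percentVar[2],"% variance")) +
      # Check colnames(data) for the label column; recent DESeq2 versions call it "name".
      # show.legend replaces show_guide, which was deprecated in ggplot2 2.0.0.
      geom_text(aes(label=names),hjust=0.25, vjust=-0.5, show.legend = FALSE)
    ggsave("PCA.pdf", plot = plotPCA)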
    I created the following PCA plot:



    What I don't understand is - what are the units that are recorded on the x/y axes? What's their meaning?
    Last edited by feralBiologist; 02-14-2015, 05:30 PM.

  • #2
    The axes are dimensionless, they have no units.

    • #3
      Originally posted by dpryan
      The axes are dimensionless, they have no units.
      Thanks for your reply! So you are saying the units printed on the x/y axes have no meaning? Do you know how I can edit the code to remove them?

      • #4
        plotPCA <- plotPCA + theme(axis.text.x = element_blank(), axis.text.y=element_blank())

        Or something like that. Check the ggplot2 documentation.

        • #5
          So you are saying the units printed on the x/y axes have no meaning?
          They are the first (x-axis) and second (y-axis) principal components, respectively. It's a projection of your data onto a subspace that captures the maximum variance/minimum error. We can likely say they have no units, but they certainly do have meaning.
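
To make the "maximum variance" part concrete, here is a minimal sketch in base R. It does not use DESeq2: the matrix `x` is simulated and only stands in for a transformed expression matrix with samples in rows, and `prcomp` is used on the assumption that it is a fair stand-in for what a PCA plotting function does internally. The idea is that no direction in gene space should spread the samples out more than the first principal axis does.

```r
# Simulated stand-in for a transformed expression matrix:
# 6 samples (rows) x 50 genes (columns). All values are made up.
set.seed(1)
x <- matrix(rnorm(6 * 50), nrow = 6)
x[4:6, 1:10] <- x[4:6, 1:10] + 3   # shift 10 genes in half of the samples

pca <- prcomp(x)                    # centers each gene by default

# Variance of the samples along the first principal axis
var_pc1 <- var(pca$x[, 1])

# Variance of the samples along 1000 random unit directions in gene space
xc <- scale(x, center = TRUE, scale = FALSE)
random_var <- replicate(1000, {
  d <- rnorm(ncol(x))
  d <- d / sqrt(sum(d^2))           # normalize to unit length
  var(as.vector(xc %*% d))
})

# Compare the best random direction against PC1 (small tolerance for rounding)
max(random_var) <= var_pc1 + 1e-8
```

The last line should evaluate to TRUE for any input, because PC1 is by construction the unit direction with the largest projected variance.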

          • #6
            Originally posted by Fatt
            They are the first (x-axis) and second (y-axis) principal components, respectively. It's a projection of your data onto a subspace that captures the maximum variance/minimum error. We can likely say they have no units, but they certainly do have meaning.
            Yes, I know these are the principal components. It is just that DESeq2 prints units on these axes (you can check the link to the plot in my first post) and I could not make any sense of them. I have also seen a lot of other PCA plots (presumably produced by other programs) displaying units on the axes, so I wondered what these are. Just do an image search on Google for "PCA plot" and you will see plenty of graphs displaying units on the axes. It is the units that I find confusing, not the axes themselves.

            Comment


            • #7
              I found the answer to this on CrossValidated: it seems the units denote the raw component scores. As the components are themselves linear combinations of multiple genes it is hard to interpret these raw scores biologically. They are just coordinates in the two-dimensional PC space and are helpful to simply place the individual samples in that space.
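
For anyone who wants to see what "raw component scores" means in practice, here is a small base R sketch. The data are random numbers with hypothetical sample and gene names, and `prcomp` is used directly rather than DESeq2's plotting function, so treat it as an illustration under those assumptions. Each score is the centered expression of a sample multiplied by the gene loadings and summed over genes.

```r
set.seed(42)
# Hypothetical transformed expression values: 4 samples x 5 genes
x <- matrix(rnorm(20, mean = 8), nrow = 4,
            dimnames = list(paste0("sample", 1:4), paste0("gene", 1:5)))

pca <- prcomp(x)   # base R PCA; centers each gene, no scaling by default

# Recompute the scores by hand: centered data times the loadings matrix
centered <- sweep(x, 2, colMeans(x))
manual_scores <- centered %*% pca$rotation

# The hand-computed scores should match what prcomp stores in pca$x
all.equal(manual_scores, pca$x, check.attributes = FALSE)

# The PC1 coordinate of sample1 is a weighted sum over all genes
sum(centered["sample1", ] * pca$rotation[, "PC1"])
```

So the numbers on the axes are on the scale of the transformed data, but mixed across many genes at once, which is why they are hard to read biologically.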

              • #8
                I might be saying this a bit late, but the % of inertia (i.e. total variance) captured by each PC is a very useful "quality control" measure for PCA and should be included in the figure if possible. The axis units, as said above, are not very interpretable.

                BTW, that's a whopping big 1st PC! Nice!
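
As a rough sketch of where such percentages come from, the following base R snippet computes the share of total variance per component from the standard deviations that `prcomp` returns. The matrix is simulated and the group effect is invented for illustration; nothing here is taken from the plot in this thread.

```r
set.seed(7)
# Made-up data: 6 samples x 20 genes, with a group effect on the first 5 genes
x <- matrix(rnorm(120), nrow = 6)
x[1:3, 1:5] <- x[1:3, 1:5] + 4

pca <- prcomp(x)

# Proportion of total variance per component: squared sdev over their sum
percentVar <- pca$sdev^2 / sum(pca$sdev^2)
round(100 * percentVar)

# The proportions add up to 1 across all components
sum(percentVar)
```

The first two entries of `percentVar` correspond to the kind of values that the axis labels in the first post report for PC1 and PC2.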

                • #9
                  So, what percentage of the total variance should be expected?

                  • #10
                    @student-t: I replied to this on biostars.
