**gringer** · 05-26-2017, 08:12 PM

I've had the best results from PCAs based on DESeq2 results when I used the VST and did an additional correction for transcript length (i.e. divide by the length of the longest transcript per gene in kb). This was before rlogTransformation was visible/usable, so it might be that rlog works better for that.

What was your experimental design? Were all six samples separate biological replicates? It's concerning that your samples are clustering by ID first and by treatment second. In our case, samples clustered by cell population first and by treatment second. If your ID18_X and GP18_X come from the same (or similar) samples, or were sequenced/extracted in batches (we've noticed sequencing batch effects as well), that might explain why they're clustering together.

As a sanity check for PCAs, it's a good idea to make sure that the data you're generating the PCA from fits a normal distribution. You can do this by running qqnorm(<data>); the points should generally fall on a straight line along the diagonal, usually with a bit of deviation at the extremities. If the qqnorm plot isn't approximately a straight line, then the data will need additional normalisation applied before running a PCA.
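A minimal sketch of that check in base R, using simulated values as a stand-in for real data (the `raw` and `logged` vectors and the `qq_cor` helper are made up for illustration, not part of DESeq2):

```r
set.seed(1)

# Simulated stand-ins (assumption): skewed values and their log transform
raw    <- rlnorm(2000, meanlog = 5, sdlog = 1.5)
logged <- log2(raw)

# Side-by-side Q-Q plots; qqline() adds a reference line through the quartiles
op <- par(mfrow = c(1, 2))
qqnorm(raw, main = "untransformed"); qqline(raw)
qqnorm(logged, main = "log2 transformed"); qqline(logged)
par(op)

# Rough numeric summary: correlation between theoretical and sample quantiles.
# Values close to 1 mean the points sit near a straight line.
qq_cor <- function(x) {
  q <- qqnorm(x, plot.it = FALSE)
  cor(q$x, q$y)
}
qq_cor(raw)
qq_cor(logged)
```

The untransformed panel should bend sharply away from the line, while the log-transformed panel should be close to straight. The correlation is only a quick summary; looking at the plot itself is still the better check.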

**ronaldrcutler** · 05-29-2017, 09:51 AM

Hi Gringer,

The ID18_x were biological replicates from the same batch and GP18_x were biological replicates from the same batch. They did not come from the same samples or the same batch.

I was not able to generate a distribution with the regularized log transformation (unsure how to extract the values from the data.frame), but came up with a plot of variance over the read counts which shows that there does not seem to be a dependence of the variance on the mean.

**gringer** · 05-29-2017, 12:19 PM

By plotting the rank (assuming you have actually plotted the rank), you've removed any parametric factors from the plot. If you're doing a PCA on the rank then this would be fine, but I suspect your PCA is being done on something else. You need to make sure that the values you plot are the same values that the PCA calculation uses.
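One possible way to get at those values, sketched under a few assumptions: the DESeq2 package is installed, `makeExampleDESeqDataSet()` stands in for the real count data, and `plotPCA()` picks the 500 most variable genes by default (true of the DESeq2 versions I'm aware of, but worth checking against your installed version). `assay()` pulls the transformed matrix out of the rlog (or VST) object, so the same matrix can be passed to both `qqnorm()` and `prcomp()`:

```r
library(DESeq2)

# Simulated dataset standing in for the real counts (assumption)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Regularised log transform, then extract the genes x samples matrix
rld <- rlog(dds, blind = TRUE)
mat <- assay(rld)

# Select the most variable genes, mirroring what plotPCA() is assumed to do
ntop <- 500
rv   <- apply(mat, 1, var)
sel  <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
pca_input <- mat[sel, ]

# Q-Q plot of exactly the values that go into the PCA
qqnorm(as.vector(pca_input)); qqline(as.vector(pca_input))

# PCA on those same values (samples as rows), plus the DESeq2 plot for comparison
pca <- prcomp(t(pca_input))
plotPCA(rld, intgroup = "condition", ntop = ntop)
```

If the PCA is actually being run on something else (ranks, raw counts, length-corrected values), swap that matrix in for `pca_input` so the Q-Q plot and the PCA see the same numbers.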

DESeq2 PCA Plots
