**Simon Anders** · 05-21-2012, 12:29 AM

Size factors of your range are quite common, and DESeq's main functionality, i.e., testing for differential expression, copes well with it. Hence, just go ahead and run your tests.

The VST needs to resort to a certain approximation (details on request) and hence the heatmap might become misleading if the size factors are different. This does not affect the actual test functions because they do not use the VST.
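For readers unfamiliar with size factors, the following is a minimal sketch of the median-of-ratios idea behind them, written in plain Python rather than R. It is not the package's own code: the function name `estimate_size_factors` and the toy counts are made up for illustration, and the rule of skipping genes with a zero count in any sample is an assumption of this sketch.

```python
import math
from statistics import median

def estimate_size_factors(counts):
    """Median-of-ratios sketch.

    counts: list of genes, each a list of raw counts (one per sample).
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    # For each sample, collect log(count / geometric mean of that gene).
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        # Assumption of this sketch: skip genes with a zero count in any
        # sample, because their geometric mean would be zero.
        if any(c <= 0 for c in gene):
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in all samples")
    # The size factor of a sample is the median ratio over genes.
    return [math.exp(median(r)) for r in log_ratios]

# Hypothetical toy data: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3], [50, 100]]
size_factors = estimate_size_factors(toy_counts)
```

In this toy case the second size factor is twice the first, which mirrors the twofold difference in depth that was built into the counts.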

**dav1dmartin** · 05-21-2012, 03:27 PM

Thanks for the info. Do you know a convenient way to assess global changes in gene expression across samples, so that samples can be grouped in this case? In the vignette, for example, the blinded dispersion estimates followed by the VST and the distance matrix allowed an unbiased grouping of similar samples (given similar sizeFactors). What if one were to measure the covariance of each sample versus every other, using normalized ratios of individual gene counts to the average gene counts across all samples? Would this allow some sort of grouping of samples with positive vs. negative covariance? Or would you run into the same problem of high-variance genes skewing the comparison? If so, could one group the genes according to expression or variance and try this?

Thanks again. I am currently trying to generate a list of differentially expressed genes that I am confident are related to the treatment and not to high inter-animal variability. I have checked some with qPCR, with mixed results so far...
-David
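The comparison proposed above could be sketched roughly as follows, in plain Python. This is only one possible reading of the idea: the log2 scale, the pseudocount, the use of Pearson correlation in place of raw covariance, the function names, and the toy counts are all assumptions of this sketch, not something taken from DESeq or its vignette.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def sample_correlations(counts, size_factors, pseudocount=1.0):
    """Correlate samples on log2 ratios of each gene to its cross-sample mean.

    counts: list of genes, each a list of raw counts (one per sample).
    size_factors: one size factor per sample.
    """
    n_samples = len(size_factors)
    profiles = [[] for _ in range(n_samples)]
    for gene in counts:
        # Put the counts on a common scale first.
        normalized = [c / s for c, s in zip(gene, size_factors)]
        gene_mean = sum(normalized) / n_samples
        for j, value in enumerate(normalized):
            # The pseudocount (an assumption) avoids taking log of zero.
            ratio = (value + pseudocount) / (gene_mean + pseudocount)
            profiles[j].append(math.log2(ratio))
    return [[pearson(profiles[i], profiles[j]) for j in range(n_samples)]
            for i in range(n_samples)]

# Hypothetical counts: samples 0 and 1 form one group, 2 and 3 another.
toy_counts = [
    [100, 110, 10, 12],
    [20, 18, 200, 190],
    [50, 55, 52, 49],
    [5, 6, 60, 66],
    [300, 280, 30, 33],
]
corr = sample_correlations(toy_counts, [1.0, 1.0, 1.0, 1.0])
```

Here the two samples within a group correlate positively and samples from different groups correlate negatively. Every gene gets equal weight in this sketch, so it does nothing about the concern raised above that noisy, high-variance genes may dominate the comparison.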

**Simon Anders** · 05-22-2012, 12:56 AM

I am not quite sure I understand your problem. You want to know which genes changed due to treatment and want to guard against within-group variability. This is the default use case for DESeq, and you will get a statistically sound result if you follow the standard work-flow (which does not use the VST).

Hence, why again do you want to use the VST? You will need to explain your setup in more detail.

BTW, checking by qPCR is only very rarely useful. It helps to avoid technical noise (if you think that qPCR is more precise than RNA-Seq), but as your main worry is sample-to-sample variation due to biological causes (i.e., actual expression differences rather than measurement errors), measuring the same samples with another technique will not tell you anything new.

**pbarros** · 12-06-2012, 09:23 AM

DESeq - High Count Variability across Samples

Dear Simon,
I am using DESeq in the analysis of RNAseq data, but I'm still experimenting with the package to learn how to use it properly for my particular kind of data. In this analysis I have two 'control' (replicate) samples and only one 'test' sample (and unfortunately I will not have replicates for this condition). My goal for now is just to see whether I can use the two control samples as replicates, since the 'controlled' conditions in which the plant material was collected were slightly different.

Regarding your previous post, I'm not sure I understood it correctly.

Originally posted by Simon Anders

The VST needs to resort to a certain approximation (details on request) and hence the heatmap might become misleading if the size factors are different. This does not affect the actual test functions because they do not use the VST.

So does this mean that if there is some (high) variation between size factors, we may not trust the results obtained after the VST?
I am seeing results "similar" to those reported in the DESeq vignette, although in my case the number of replicates is smaller.
Specifically, if I build heatmaps (for count data and sample-to-sample distances) using VST data, my two replicates for the 'control' condition cluster together. But when I use untransformed counts, one of the 'control' samples clusters with the 'test' sample.

What intrigues me now is the fact that the size factors are

test:1.8420157
control1:0.8258893 (control1 is the one that clusters differently)
control2:0.6850067

So my question is this: can I just "trust" these results and accept my two controls as replicates, or is this a case where "heatmaps might become misleading"...?

Thank you in advance,

Pedro
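As a small illustration of why size factors matter when comparing samples by distance, the sketch below divides raw counts by the size factors quoted in the post above before computing a Euclidean distance. The size factors are the ones reported; the counts for the three genes are hypothetical, constructed so that all samples share the same underlying expression, and the helper names are invented for this sketch. It does not reproduce the VST and says nothing about whether the reported clustering is trustworthy.

```python
import math

# Size factors as reported in the post above.
size_factors = {"test": 1.8420157, "control1": 0.8258893, "control2": 0.6850067}

# Hypothetical raw counts for three genes. They were built so that every
# sample has roughly the same expression once sequencing depth is removed.
raw_counts = {
    "test": [184, 921, 37],
    "control1": [83, 413, 17],
    "control2": [69, 343, 14],
}

def normalize(counts, size_factor):
    """Divide raw counts by the sample's size factor."""
    return [c / size_factor for c in counts]

def euclidean(a, b):
    """Euclidean distance between two count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

normalized = {name: normalize(counts, size_factors[name])
              for name, counts in raw_counts.items()}

# Distance between 'test' and 'control2' before and after normalization.
raw_dist = euclidean(raw_counts["test"], raw_counts["control2"])
norm_dist = euclidean(normalized["test"], normalized["control2"])
```

With these made-up counts, the raw distance is large purely because of the difference in depth, while the distance between the normalized vectors is close to zero.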