Independent filtering and experimental design in DESeq

Hi everybody,

I know this problem has been discussed quite a lot. I have read the posts here and here (and also the papers mentioned in them).

I have two questions concerning my experiment. One is about the experimental design, the second about how to set the filtering. I think they are both somehow connected, so I would like to place them in one post.

In my experiment we have three conditions (ctrl, KO1 and KO2) and three separate cell types (I, P and NP).
I would like to better understand how to analyse the data in one go.

The aim of the experiment is not only to compare the ctrl vs. KO1 and/or KO2, but also to analyse the efficiency of cellular processes by comparing NP vs. P in ctrl and/or KO1 and KO2.

I first ran the analysis with all genes (without any filtering at all) and compared ctrl vs. KO1 and ctrl vs. KO2. Interestingly, every comparison of ctrl vs. KO1 gives a long list of significantly deregulated genes (FDR=0.1%), whereas the comparison ctrl vs. KO2 gives only 2-5 genes.
So I thought the low-count genes might be the explanation and that filtering them out could help. In search of a good cutoff I tried the genefilter package and got the following rank plot:

Q1: Is cutting the data set at 0.57 a good decision?

Then I looked for an FDR value and made the rejection plot, to see how many genes I am left with at each of the different FDR values.

It was interesting to see that from 0% to 50% the curves all overlap each other.

Q2: Does that mean that there is no difference between ϑ=0.5 and ϑ=0.1?
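
To make Q1 and Q2 concrete, here is a minimal base-R sketch of what I understand the filtering to do: drop the lowest ϑ fraction of genes ranked by a filter statistic, adjust the remaining p-values with Benjamini-Hochberg, and count the rejections. The data are simulated, and the names `rejections_after_filter`, `filter_stat` and `pvals` are hypothetical stand-ins of mine (not genefilter functions); in the real analysis the filter statistic would be something like the row sums of the count table and the p-values would come from the test.

```r
# Count BH rejections after removing the lowest `theta` fraction of genes,
# ranked by a filter statistic (e.g. the row sum of counts).
rejections_after_filter <- function(filter_stat, pvals, theta, alpha) {
  cutoff <- quantile(filter_stat, probs = theta)
  keep <- filter_stat > cutoff          # genes that pass the filter
  padj <- p.adjust(pvals[keep], method = "BH")
  sum(padj < alpha, na.rm = TRUE)
}

set.seed(1)
n_genes <- 2000

# Simulated (not real) filter statistic, standing in for overall counts.
filter_stat <- rexp(n_genes, rate = 1 / 100)

# Simulated p-values: uniform for most genes, small for the 200 genes
# with the highest filter statistic.
pvals <- runif(n_genes)
top <- order(filter_stat, decreasing = TRUE)[1:200]
pvals[top] <- rbeta(200, 0.1, 10)

# Number of rejections at FDR 10% for a range of theta values.
thetas <- seq(0, 0.8, by = 0.1)
rejections <- sapply(thetas, function(th)
  rejections_after_filter(filter_stat, pvals, th, alpha = 0.1))
names(rejections) <- thetas
print(rejections)
```

If the number of rejections is the same for two values of ϑ, the genes removed between those two cutoffs did not change what is called significant at that FDR, which is how I read the overlapping curves.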

pair-wise vs. multifactor design:

I read the DESeq manual and ran the analysis as described here:

Code:

pd <- read.delim2("../phenoData.txt", sep="\t",quote="", row.names=1)

featureCountTable = read.table( "countTable.txt", header=TRUE, row.names=1, quote="")

conditions = factor(pd$comparison) # the nine conditions are ctrl_I, ctrl_NP, ctrl_P, KO1_I, KO1_NP, KO1_P, KO2_I, KO2_NP and KO2_P

cds = newCountDataSet( featureCountTable, conditions )

cds = estimateSizeFactors( cds )
normResults <- counts( cds, normalized=TRUE ) 

# Variance estimation
cds = estimateDispersions(cds)

# I then ran a negative binomial test for each comparison
res_I_ctrl_KO1 = nbinomTest( cds, "ctrl_I", "KO1_I" )
res_P_ctrl_KO1 = nbinomTest( cds, "ctrl_P", "KO1_P" )
...

I was wondering if DESeq can work this way or if I need to run a multi-factor design such as

Code:

fit1 = fitNbinomGLMs( cdsFullDataSet, count ~ libType + condition )
fit0 = fitNbinomGLMs( cdsFullDataSet, count ~ libType )

where libType would be ctrl, KO1 and KO2, and condition would be I, NP and P.
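
As a rough sketch of what the two-factor layout would look like for my samples, the nine combined labels can be split into two factors in base R. The column names `genotype` and `cellType` are my own hypothetical choices, and the table below has only one row per group for brevity (the real table would have one row per sample, including replicates). The two model matrices show the difference between an additive model and one with an interaction term.

```r
# The nine combined labels used in the pairwise analysis.
labels <- c("ctrl_I", "ctrl_NP", "ctrl_P",
            "KO1_I",  "KO1_NP",  "KO1_P",
            "KO2_I",  "KO2_NP",  "KO2_P")

# Split each label into its two parts (hypothetical column names).
parts <- do.call(rbind, strsplit(labels, "_", fixed = TRUE))
design <- data.frame(
  genotype = factor(parts[, 1], levels = c("ctrl", "KO1", "KO2")),
  cellType = factor(parts[, 2], levels = c("I", "NP", "P")),
  row.names = labels
)
print(design)

# Additive model: one genotype effect shared by all cell types.
X_add <- model.matrix(~ cellType + genotype, data = design)

# Interaction model: the genotype effect may differ between cell types.
X_int <- model.matrix(~ cellType * genotype, data = design)

print(colnames(X_add))
print(colnames(X_int))
```

The additive matrix has 5 columns and the interaction matrix has 9, so only the interaction model can express a question like "does the NP vs. P difference change in KO1 or KO2 compared with ctrl".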

It would be great if I could get some help.

Thanks a lot,

Assa
