**marcus1487** · 09-11-2014, 04:25 PM

So I was able to get a workable solution to the second half of my problem, but I am still not sure whether it produces accurate results. Here is the function I wrote to run testForDEU on each cellLine independently using globally fit dispersions. The dxd object is taken from the code in my previous post.

Code:

fitOneCellLine <- function(cellLine){
  ## keep only the samples belonging to this cell line
  sampleData1 <- sampleData[sampleData$cellLine == cellLine,]
  countData1 <- countData[,sampleData$cellLine == cellLine]
  dxd1 <- DEXSeqDataSet(
    countData1, sampleData1, formula( ~ sample + exon + condition:exon),
    featureID, groupID)
  dxd1 <- estimateSizeFactors(dxd1)

  ## add dispersions from rep samples
  ## (copies metadata columns 5 onward from the global dxd object;
  ##  this assumes those columns hold the dispersion estimates)
  dxdDims <- dim(mcols(rowData(dxd)))[2]
  mcols(rowData(dxd1))[5:dxdDims] <- mcols(rowData(dxd))[5:dxdDims]
  mcols(mcols(dxd1)) <- mcols(mcols(dxd))
  dxd1@dispersionFunction <- dxd@dispersionFunction

  ## test condition:exon against the reduced model within this cell line
  dxd1 <- testForDEU(dxd1, reducedModel = ~ sample + exon)
  return(DEXSeqResults(dxd1))
}

aRes <- fitOneCellLine('a')
bRes <- fitOneCellLine('b')
cRes <- fitOneCellLine('c')

I note that the three results objects have different p-values, but I am still not sure whether I have transferred all of the appropriate information into the cellLine-specific object before testing for DEU. My main concern is that this workflow may not be producing accurate results. I am also not sure what to test against, as this is the only way I could get cellLine-specific results.

Also, I was not able to produce fold change values here, although in my data set generated with the DEXSeqDataSetFromHTSeq function I am able to estimate fold changes. I am not sure what else differs between this randomly generated data set and my biological data set aside from the generator function, but in any case that is not a huge concern of mine.

**areyes** · 09-23-2014, 06:39 AM

Dear Marcus,

Of course, the ideal would be to have replicates for the third cell line. But what you could also try is to estimate the dispersions using only the cell lines for which you have replicates, and pass these dispersion estimates to an object with all the data. This would assume that the variability is similar across the cell lines... although I don't know how true this is.

With regard to the implementation, testForDEU requires the columns "allZero", "dispGeneEst", "dispFit", "dispersion", "dispMAP" and "dispOutlier". You could pass these columns to the object used for testing, independently of how you estimated the dispersions before this step, and this should work.
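Copying those columns by name rather than by position could look roughly like the sketch below. The helper copyDispersionColumns is hypothetical (it is not part of DEXSeq), and it is shown on plain data frames with made-up values standing in for the per-exon metadata. With the real objects it would presumably be applied to mcols(dxd1) and mcols(dxd), which is an assumption and not tested here.

```r
## Columns that testForDEU reads, as listed in the post above.
dispersionCols <- c("allZero", "dispGeneEst", "dispFit",
                    "dispersion", "dispMAP", "dispOutlier")

## Hypothetical helper: copy the named columns from `source` into `target`.
## Both tables must describe the same exons in the same order.
copyDispersionColumns <- function(target, source, cols = dispersionCols) {
  missingCols <- setdiff(cols, colnames(source))
  if (length(missingCols) > 0) {
    stop("source lacks columns: ", paste(missingCols, collapse = ", "))
  }
  stopifnot(nrow(target) == nrow(source),
            identical(rownames(target), rownames(source)))
  for (col in cols) {
    target[[col]] <- source[[col]]
  }
  target
}

## Toy stand-ins for the per-exon metadata tables (made-up values).
exons <- c("g1:E001", "g1:E002", "g2:E001")
fitted <- data.frame(
  allZero     = c(FALSE, FALSE, TRUE),
  dispGeneEst = c(0.10, 0.20, NA),
  dispFit     = c(0.12, 0.18, NA),
  dispersion  = c(0.11, 0.19, NA),
  dispMAP     = c(0.11, 0.19, NA),
  dispOutlier = c(FALSE, FALSE, NA),
  row.names   = exons)
perCellLine <- data.frame(featureID = c("E001", "E002", "E001"),
                          row.names = exons)

updated <- copyDispersionColumns(perCellLine, fitted)
```

Selecting by name avoids depending on where the dispersion columns happen to sit in the metadata table, and the row-name check guards against the two objects listing exons in a different order.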

**marcus1487** · 09-23-2014, 02:45 PM

Hi areyes,

Thank you for your response.

In terms of the assumption that variability between each of the cell lines is similar, isn't this just the same assumption that is made when variability is estimated across all samples even when they are all replicated?

To spell it out a little more: if we are testing for the effect of a condition across cell lines, what is the difference between these two situations?

1) Estimating variability across a set of replicated cell line - condition combinations and then testing for differential expression between conditions within a single replicated cell line.
2) Estimating variability across a set of replicated cell line - condition combinations and then testing for differential expression between conditions within a separate UN-replicated cell line.

I may be wrong, but I think the only difference is the additional power from an additional experiment in the replicated case (assuming common read coverage). I guess the question is: is there reason to share dispersion estimates between different conditions when comparing differential expression across another condition?

In terms of the implementation details, this is incredibly helpful! Thank you so much for confirming that this works correctly, provided the above assumptions are valid.

**areyes** · 09-24-2014, 12:15 AM

Hi Marcus,

You are right: so far, a single dispersion estimate is used for all the levels of the factorial design.

I think that if you include the un-replicated cell line when estimating the dispersions, the final dispersion estimates might be underestimated, so it might be better to use only the replicated samples for this step.

Yes, including more samples should give you additional power. But the thing that I would keep in mind is that it would not be possible to estimate the variability in the un-replicated cell line, so it's hard to know whether the dispersion estimates based on the replicated cell lines model the variability of the un-replicated cell line well. I would just try it and see how much additional power you get by including the un-replicated cell line.

On another topic, I just read the formulas in your code again:

design <- formula( ~ sample + exon + cellLine + cellLine:exon + condition:exon )
subDesign <- formula( ~ sample + exon + cellLine + cellLine:exon )

In these formulas, you don't have to include the cellLine effect on its own; you can just do:

design <- formula( ~ sample + exon + cellLine:exon + condition:exon )
subDesign <- formula( ~ sample + exon + cellLine:exon )

to consider the variability in exon usage due to the different cell-lines.
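One way to convince yourself of this is a small check on a toy layout (a sketch with made-up sample, exon and condition names, not the data from this thread). Since each sample belongs to a single cell line, the sample term should already account for the cellLine main effect, so both full formulas are expected to span the same column space.

```r
## Full models from the thread, with and without the cellLine main effect.
withMain    <- ~ sample + exon + cellLine + cellLine:exon + condition:exon
withoutMain <- ~ sample + exon + cellLine:exon + condition:exon

## Toy layout (made up): four samples, two exons each.
toy <- expand.grid(sample = c("s1", "s2", "s3", "s4"),
                   exon   = c("E1", "E2"))
cellLineOf  <- c(s1 = "a", s2 = "a", s3 = "b", s4 = "b")
conditionOf <- c(s1 = "ctrl", s2 = "trt", s3 = "ctrl", s4 = "trt")
toy$cellLine  <- factor(cellLineOf[as.character(toy$sample)])
toy$condition <- factor(conditionOf[as.character(toy$sample)])

## Rank of a design matrix = number of linearly independent columns.
designRank <- function(X) qr(X)$rank

xWith    <- model.matrix(withMain, toy)
xWithout <- model.matrix(withoutMain, toy)

rankWith    <- designRank(xWith)
rankWithout <- designRank(xWithout)
## If stacking both matrices side by side adds no rank,
## neither design contains a direction the other lacks.
rankBoth    <- designRank(cbind(xWith, xWithout))
```

If rankWith, rankWithout and rankBoth all agree, the two formulas describe the same model on this layout, and dropping the standalone cellLine term only removes a redundant column.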


DEXSeq Partially Unreplicated Dataset: Weird EstimateDispersions behavior
