**dpryan** · 01-10-2014, 01:48 AM

Limma can handle paired setups. So can DESeq and edgeR ("person" is just used as a factor).

**raphael123** · 01-10-2014, 07:54 AM

Thank you! I found here that DESeq does not support paired tests:

[BioC] RNAseq expression analysis using DESeq: technical replicates, paired samples

https://stat.ethz.ch/pipermail/bioconductor/2010-April/032869.html

It's not clear for DESeq2, and hard to figure out for Limma. Maybe someone could give me an example?

**dpryan** · 01-10-2014, 08:05 AM

It sort of depends on what method you want to use to model the paired samples. The classic method would be a mixed-effects linear model, which DESeq doesn't do. You can also just put each patient/person in as a factor, which isn't doing exactly the same thing but is at least in the right direction. Limma has some of its own methods, since it's not using raw count data. In the edgeR vignette, there's an example of this in section 3.4.1 (DESeq works the same way). In the limma vignette, look at section 16.3 (or google around for other "tumor normal comparison"s, which are the most common use case).
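To make the "person as a factor" idea concrete, here is a minimal base-R sketch of what such a design matrix looks like. The sample table is made up for illustration (three hypothetical subjects, each sampled at week 0 and week 4); it is not taken from the data discussed in this thread.

```r
# Hypothetical paired layout: 3 subjects, each sampled at week 0 and week 4.
samples <- data.frame(
  subject = factor(rep(c("s1", "s2", "s3"), each = 2)),
  week    = factor(rep(c("0", "4"), times = 3))
)

# Subject enters as a blocking factor; week is the effect of interest.
design <- model.matrix(~ subject + week, data = samples)

# One intercept, one column per additional subject, one column for week.
print(colnames(design))
print(dim(design))
```

With a design like this, the week effect is the last column, so in edgeR it would presumably be the coefficient passed to `glmLRT()`, for example `coef = ncol(design)`.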

**raphael123** · 01-10-2014, 10:29 AM

As I understand it, edgeR, like DESeq, has both an exact test and a GLM test. In both cases the GLM test allows testing multiple factors, so we can use the subject ID as a factor.
The problem is that the GLM test seems less accurate: I compared both and the results are quite different.
I also compared my edgeR results with my DESeq results, and they are still very different.
I am a bit new to the field, so I would be happy to hear any comments. Here is my code:

Code:

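library("edgeR")

# Read the count table (features in rows, samples in columns)
miRNA=read.table("merged_filtered_data.csv", header=TRUE,row.names=1)
roundmiRNA=round(miRNA)
# Drop features with zero counts in every sample
roundmiRNA = roundmiRNA[apply(roundmiRNA,1,sum)!=0,]

# Build the sample table before it is used (info has one column per sample)
info=read.table("samples_infos.csv", header=TRUE,row.names=1)
miRNAdesign=data.frame(row.names = colnames(info),
  subject=factor(t(info)[,"subject"]),
  week=factor(t(info)[,"week"])
)
# Week-0 samples only (not used below)
counts=roundmiRNA[miRNAdesign$week=="0"]

# 1) Exact test, unpaired, grouping by week
y <- DGEList(counts=roundmiRNA,group=miRNAdesign$week)
y <- calcNormFactors(y)
y <- estimateCommonDisp(y)
y <- estimateTagwiseDisp(y)
et <- exactTest(y)
topTags(et)

# 2) GLM with week as the only factor
design <- model.matrix(~miRNAdesign$week)
y <- DGEList(counts=roundmiRNA)
y <- calcNormFactors(y)  # normalize as in the exact-test run
y <- estimateGLMCommonDisp(y,design)
y <- estimateGLMTrendedDisp(y,design)
y <- estimateGLMTagwiseDisp(y,design)
fit <- glmFit(y,design)
lrt <- glmLRT(fit,coef=2)
topTags(lrt)

# 3) Paired GLM: subject as a blocking factor, then week
design <- model.matrix(~miRNAdesign$subject+miRNAdesign$week)
y <- estimateGLMCommonDisp(y,design)
y <- estimateGLMTrendedDisp(y,design)
y <- estimateGLMTagwiseDisp(y,design)
fit <- glmFit(y,design)
# Week is the last coefficient (assuming only two time points)
lrt <- glmLRT(fit,coef=ncol(design))
topTags(lrt)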

**raphael123** · 01-10-2014, 02:51 PM

Limma uses contrast tests that I do not understand yet. My question still stands!

**dpryan** · 01-10-2014, 03:22 PM

roundmiRNA=round(miRNA)

You need to thoroughly reconsider what you're doing. That line in your script tells me that your counts are wrong (unless that line has no effect).

That the exact test and a GLM produce somewhat different results isn't exactly surprising. Normally the differences are at the margins.

**raphael123** · 01-10-2014, 11:17 PM

I rounded my counts because each read is split between all of its possible mapping locations in the genome when I build the table. That produces floating-point numbers!

I have to do the math because I don't really get what happened. Thank you for your help.

**dpryan** · 01-10-2014, 11:37 PM

Originally posted by raphael123

I rounded my counts because each read is split between all of its possible mapping locations in the genome when I build the table. That produces floating-point numbers!

That's not the appropriate way to go about things. All counts should be integer because they could be nothing but integer. Use featureCounts or htseq-count (or summarizeOverlaps in R, though that's probably really slow) to derive your count tables.
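As a rough illustration of why rounding hides the problem rather than fixing it, the base-R sketch below flags features whose counts are not whole numbers before any rounding is applied. The small count table is invented for the example and does not come from the data in this thread.

```r
# Hypothetical count table with fractional values from split multi-mapped reads.
counts <- matrix(c(10, 2.5, 0, 7, 3.5, 1), nrow = 3,
                 dimnames = list(c("miR-a", "miR-b", "miR-c"), c("s1", "s2")))

# TRUE for every feature (row) whose counts are all whole numbers.
is_integer_row <- apply(counts, 1, function(x) all(x == round(x)))
print(is_integer_row)

# Features that carry fractional (split) counts.
fractional_features <- names(is_integer_row)[!is_integer_row]
print(fractional_features)
```

Any feature listed in `fractional_features` received split reads, which is a sign that the table was not produced by a counter such as featureCounts or htseq-count.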

**raphael123** · 01-11-2014, 10:40 AM

Originally posted by dpryan

That's not the appropriate way to go about things. All counts should be integer because they could be nothing but integer.

How do you deal with a read that maps to two different places? You can make it disappear, as the tools you suggested do. At the end of the day you lose some information.

featureCounts : It does not count reads overlapping with more than one feature

htseq-count : If it contains more than one feature, the read is counted as ambiguous (and not counted for any features)

**dpryan** · 01-11-2014, 11:14 PM

You don't deal with ambiguous mappings since all of the count-based statistical tests (edgeR/DESeq/Limma/etc.) aren't meant to deal with them (a general rule of high-throughput data is that you need to whittle away the parts of the data that aren't useable).

Question on a paired test for RNA expression
