I am using DEXSeq for testing differential exon usage between two conditions: control and treatment. For each condition, I have 8 biological replicates (C1-C8, and T1-T8). The design is listed below.

| sample | condition | subject |
|--------|-----------|---------|
| C1 | control | 1 |
| C2 | control | 2 |
| C3 | control | 3 |
| C4 | control | 4 |
| C5 | control | 5 |
| C6 | control | 6 |
| C7 | control | 7 |
| C8 | control | 8 |
| T1 | treatment | 1 |
| T2 | treatment | 2 |
| T3 | treatment | 3 |
| T4 | treatment | 4 |
| T5 | treatment | 5 |
| T6 | treatment | 6 |
| T7 | treatment | 7 |
| T8 | treatment | 8 |
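To make the paired structure of this design explicit, here is a minimal base R sketch. The data frame name `sampleTable` is hypothetical (it is not a DEXSeq object), and the check only uses base R functions; it confirms that every subject appears once per condition and that a design with subject as a blocking factor is full rank.

```r
# Hypothetical sample table mirroring the design above (base R only).
sampleTable <- data.frame(
  row.names = c(paste0("C", 1:8), paste0("T", 1:8)),
  condition = factor(rep(c("control", "treatment"), each = 8)),
  subject   = factor(rep(1:8, times = 2))
)

# Each subject should appear exactly once under each condition (paired design).
pairing <- table(sampleTable$subject, sampleTable$condition)
isPaired <- all(pairing == 1)

# With subject as a blocking factor, the design matrix should be full rank:
# intercept + 7 subject contrasts + 1 condition contrast.
mm <- model.matrix(~ subject + condition, data = sampleTable)
isFullRank <- qr(mm)$rank == ncol(mm)

print(isPaired)
print(isFullRank)
```

If either check fails on a real sample table, the subject column is probably not coded as a factor or a sample is mislabeled.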

As you can see from the last column, we have 8 subjects involved in the experiment. Subject 1 has **both the control and the treatment**, and so on for all the other subjects. This is different from the situation discussed in the DEXSeq vignette, where, for example:

design(pasillaExons)

gives:

| sample | condition | type |
|--------|-----------|------|
| treated1fb | treated | single-read |
| treated2fb | treated | paired-end |
| treated3fb | treated | paired-end |
| untreated1fb | untreated | single-read |
| untreated2fb | untreated | single-read |
| untreated3fb | untreated | paired-end |
| untreated4fb | untreated | paired-end |

I think in the pasilla example, the biological replicates are all different. Thus in my situation, in order to see if there is differential exon usage between the treatment and control, can I do:

(1) **ignore the fact that each subject had both control and treatment**? In this case, in my implementation, shall I write:

**f_dispersion = count ~ sample + condition * exon**

pExons = estimateDispersions(pExons, formula=f_dispersion)

pExons = fitDispersionFunction(pExons)

**Null model: f_0 = count ~ sample + condition**

**Alternative model: f_1 = count ~ sample + condition * I(exon == exonID)**

pExons = testForDEU(pExons, formula0 = f_0, formula1 = f_1)

(2) incorporate the subject as a covariate (with that column coded as a factor), and then analyze in the GLM framework? In this case, in my implementation, shall I write:

**f_dispersion = count ~ sample + (condition + subject) * exon**

**Null model: f_0 = count ~ sample + subject * exon + condition**

**Alternative model: f_1 = count ~ sample + subject * exon + condition * I(exon == exonID)**

(3) I am not sure if including subject as a covariate is the best approach in my situation. Are there any other options that I can consider?

(4) I wrote the formulas for the null and alternative models following the vignette, but I am not sure whether they are what I should use in the R implementation.
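As a sanity check on the option (2) formulas, the following base R sketch compares the term labels of the null and alternative models. It does not call DEXSeq at all; it only inspects the formula objects, and the variable names `terms0`, `terms1`, and `extraTerms` are illustrative. The intent is to confirm that the null model is nested in the alternative and that the two differ only in the exon-specific condition terms.

```r
# Option (2) formulas written as plain R formula objects.
# The variables (count, sample, exon, ...) are never evaluated here.
f_0 <- count ~ sample + subject * exon + condition
f_1 <- count ~ sample + subject * exon + condition * I(exon == exonID)

# Expand each formula into its individual model terms.
terms0 <- attr(terms(f_0), "term.labels")
terms1 <- attr(terms(f_1), "term.labels")

# The null model is nested if all of its terms also occur in the alternative.
isNested <- all(terms0 %in% terms1)

# Terms present only in the alternative model; these are what the test assesses.
extraTerms <- setdiff(terms1, terms0)

print(isNested)
print(extraTerms)
```

The same comparison can be run on the option (1) formulas by dropping `subject * exon` from both sides.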

Thank you so much ;-)

## Comment