**Simon Anders** · 10-10-2011, 11:54 PM

It should be much easier to get dexseq_count.py to work than to try to use htseq_count.py. The reason that I have written two scripts is, after all, that the tasks are not exactly the same.

dexseq_count.py should not have any problems with a GFF file produced with dexseq_prepare.py.

I used htseq-count instead of dexseq-count because the GFF contained the attribute "exonic_part", not "exon", and dexseq_count was expecting "exon", whereas htseq-count allowed me to specify the attribute name.

dexseq_prepare.py has split and renamed all your "exon" lines to "exonic_part", and this is what dexseq_count.py expects.
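One quick way to confirm which kind of GFF file you have is to tally the feature types in its third column. The following is a minimal sketch, not part of DEXSeq or HTSeq; the function name `count_feature_types` and the example lines are made up for illustration.

```python
from collections import Counter

def count_feature_types(gff_lines):
    """Tally column 3 (the feature type) of GFF lines, skipping comments."""
    counts = Counter()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts

# Made-up lines in the general shape of a flattened annotation (illustrative only).
example = [
    "# comment line\n",
    'chr1\texample\taggregate_gene\t100\t900\t.\t+\t.\tgene_id "geneA"\n',
    'chr1\texample\texonic_part\t100\t200\t.\t+\t.\tgene_id "geneA"\n',
    'chr1\texample\texonic_part\t300\t900\t.\t+\t.\tgene_id "geneA"\n',
]
feature_counts = count_feature_types(example)
print(dict(feature_counts))
```

On a real file you would pass an open file handle instead of the list. A file that reports "exonic_part" features and no "exon" features has the form described above.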

We have tested it with TopHat SAM files but not yet with GSNAP SAM files. With TopHat, it works fine.

Maybe you can email me an excerpt from your TopHat and GSNAP SAM files, and then we can investigate.

**FuzzyCoder** · 10-11-2011, 06:21 PM

Thank you Simon.

After reading your reply, I decided to rerun dexseq_count.py and it indeed worked. It seems that I did not actually follow all the pre-processing steps necessary to make the SAM compatible when I last tried dexseq_count.

I have been able to generate count files and create an ExonCountSet object per the 10/2/2011 pasilla vignette (using my data) through estimateDispersions.

However, I now get an error when attempting to run fitDispersionFunction:

Code:

Error in glmgam.fit(mm, disps[good], start = coefs) : 
  More columns than rows in X
In addition: Warning message:
In is.na(rows) : is.na() applied to non-(list or vector) of type 'NULL'
Error in fitDispersionFunction(ecs) : 
  Failed to fit the dispersion function

Any thoughts?

I will attempt to email the RData containing the ecs. However, it is 7MB, so please let me know if that does not make it to you. I can give you access to it via DropBox or FTP at your preference.

**areyes** · 10-12-2011, 12:27 AM

Also... you don't have the latest version of DEXSeq (0.99.0). You could also try updating it.

**FuzzyCoder** · 10-12-2011, 07:26 AM

Alejandro-

I updated to 0.1.29 with biocLite. Same error.

I emailed the data file to you and Simon (~7MB). I also sent you an invitation to my DropBox folder.

Thanks for your assistance.

**areyes** · 10-12-2011, 08:15 AM

I found the reason why the function is breaking. DEXSeq follows the approach of the DESeq package: it uses biological replicates to estimate the variance between samples, in order to distinguish real effects of your treatments from mere technical and biological variation. In this case you don't have biological replicates, and the individual exon dispersion estimates come out as basically 0. (Try making Figure 1 of the vignette.) Then, when estimating the dispersion function, it just breaks, because the data behave differently from what we are expecting.
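The role of replicates can be illustrated with a simple count of spare observations. This is a deliberately simplified sketch, not DEXSeq's actual model: it assumes one mean is fitted per condition, and both function names are hypothetical.

```python
from collections import Counter

def residual_degrees_of_freedom(conditions):
    """Samples minus conditions: what is left for estimating variability
    after fitting one mean per condition (a simplifying assumption)."""
    return len(conditions) - len(set(conditions))

def conditions_without_replicates(conditions):
    """Return the conditions represented by fewer than two samples."""
    counts = Counter(conditions)
    return sorted(name for name, n in counts.items() if n < 2)

# One sample per condition, as in the failing run described above.
single = ["treated", "untreated"]
# Two samples per condition.
replicated = ["treated", "treated", "untreated", "untreated"]

print(residual_degrees_of_freedom(single), conditions_without_replicates(single))
print(residual_degrees_of_freedom(replicated), conditions_without_replicates(replicated))
```

With one sample per condition nothing is left over to estimate within-condition variability, which is consistent with dispersion estimates collapsing toward zero.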

**FuzzyCoder** · 10-12-2011, 08:26 AM

Thank you!

I will try it with additional replicates. I only used one per condition because I wanted to work my way through the GSNAP -> DEXSeq workflow successfully before I ran all the replicates through GSNAP (~12 hours per replicate). I will let you know how it goes tomorrow.

**oliviera** · 10-18-2012, 12:39 AM

Dear all,
I am trying to use DEXSeq but get the following error when I call the function

ecs<- fitDispersionFunction(ecs)

Warning message:
In glmgam.fit(mm, disps[good], coef.start = coefs) :
Too much damping - convergence tolerance not achievable

Here is the version in R I use
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C/UTF-8/C/C/C/C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] DEXSeq_1.4.0 Biobase_2.18.0 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] RCurl_1.95-1.1 XML_3.95-0.1 biomaRt_2.14.0 hwriter_1.3 plyr_1.7.1
[6] statmod_1.4.16 stringr_0.6.1

Any suggestion on what is going on?

Cheers
Olivier

**areyes** · 10-18-2012, 12:51 AM

Hi oliviera,

This warning sometimes happens when your data is sparse. Have you done the "fit diagnostics" as indicated in the vignette? As it is just a warning, you can go on with your analysis; maybe you will lose some power if the fit is "above" most of the points...

Alejandro

**oliviera** · 10-18-2012, 02:00 AM

Hi Alejandro,

Here is the graph to check dispersion estimate. What do you think?

meanvalues<- rowMeans(counts(ecs))
plot(meanvalues, fData(ecs)$dispBeforeSharing, log="xy", main="mean vs CR dispersion")
x<- 0.01:max(meanvalues)
y<- ecs@dispFitCoefs[1] + ecs@dispFitCoefs[2] / x
lines(x, y, col="red")

Attached Files

Screen shot 2012-10-18 at 11.55.12 AM.jpg (79.7 KB, 134 views)

**areyes** · 10-18-2012, 02:02 AM

I think it should be OK, how many hits do you get?

**oliviera** · 10-20-2012, 12:53 AM

Only 92 with adjp < 0.01.
With DESeq I get ~2000 genes at this cut-off. I think I did something wrong.

**metheuse** · 04-17-2013, 12:45 PM

I got the same problem. It's all zeros in the second column of the output of dexseq_count.py
I saw you said you realized later that you didn't follow the manual exactly. Could you tell me what the problem was? Thanks.

**Simon Anders** · 04-17-2013, 01:56 PM

Originally posted by oliviera

Only 92 with adjp < 0.01.
With DESeq I get ~2000 genes at this cut-off. I think I did something wrong.

Why would you use such a stringent cut-off?

Do you really need to make sure that your list of differentially used exons does not contain more than 1% false positives? In most use cases, 10% is considered acceptable.

And: Of course, you get many more genes than exons. Detecting differential expression needs much less information in the data than detecting differential exon usage.
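To see how much the choice of cut-off matters, here is a minimal sketch that adjusts a handful of p-values and counts hits at 1% and 10%. It assumes the adjusted p-values are Benjamini-Hochberg values; the p-values are invented for illustration, and `benjamini_hochberg` is a plain-Python helper, not a DEXSeq function.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def count_hits(adjusted, cutoff):
    """Number of tests whose adjusted p-value is below the cut-off."""
    return sum(1 for p in adjusted if p < cutoff)

# Invented p-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
padj = benjamini_hochberg(pvals)
hits_strict = count_hits(padj, 0.01)
hits_relaxed = count_hits(padj, 0.10)
print(hits_strict, hits_relaxed)
```

In this toy example, relaxing the cut-off from 1% to 10% changes the number of hits from one to four.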

**Simon Anders** · 04-17-2013, 01:58 PM

Originally posted by metheuse

I got the same problem. It's all zeros in the second column of the output of dexseq_count.py
I saw you said you realized later that you didn't follow the manual exactly. Could you tell me what the problem was? Thanks.

Have you checked your alignments with a genome browser? Load the SAM file and the GFF file produced by dexseq_prepare into, e.g., IGV, and look at one of the loci with zero counts. If there really are no reads, your experiment has failed (or you are using a wrong annotation file).
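One thing worth ruling out alongside the browser check is a mismatch in chromosome names between the alignments and the annotation (for example "chr1" versus "1"), which is one possible way for an annotation file to be "wrong" for a given SAM file. The sketch below is an assumption about a possible cause, not a diagnosis from this thread; the function names and example lines are hypothetical.

```python
def gff_seqnames(gff_lines):
    """Collect sequence names from column 1 of GFF lines."""
    names = set()
    for line in gff_lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names

def sam_seqnames(sam_header_lines):
    """Collect reference names from the SN: field of SAM @SQ header lines."""
    names = set()
    for line in sam_header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names

def shared_seqnames(gff_lines, sam_header_lines):
    """Names present in both files; an empty set suggests a naming mismatch."""
    return gff_seqnames(gff_lines) & sam_seqnames(sam_header_lines)

# Made-up example: the annotation uses "chr1", the alignments use "1".
gff_example = ['chr1\texample\texonic_part\t100\t200\t.\t+\t.\tgene_id "geneA"\n']
sam_example = ["@HD\tVN:1.0\tSO:unsorted\n", "@SQ\tSN:1\tLN:1000\n"]
overlap = shared_seqnames(gff_example, sam_example)
print(overlap)
```

If the overlap is empty for your real files, no read can fall on any annotated exon, whatever the biology. This relies on the SAM file carrying @SQ header lines.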

DEXSeq Using Counts File From htseq-count
