**areyes** · 10-10-2012, 03:58 AM

Hi Yerbol,

Can you include the output of your sessionInfo()?

I think the function estimateSizeFactors, which is called inside makeCompleteDEUAnalysis, is returning NAs. Do you have NA values in your counts, or any non-integer values?

Best wishes,
Alejandro

**gokhulkrishnakilaru** · 10-10-2012, 04:32 AM

I see NAs in my counts.

**areyes** · 10-10-2012, 05:43 AM

That should not be the case: a count is either 0 or a positive integer, so NA counts are strange.
What are you using to make the counts? Do you also see NAs in the count files?

**gokhulkrishnakilaru** · 10-10-2012, 05:55 AM

Oops, my bad. I saw the NAs after running the R script, not in the counts file. Sorry, my friend.

**yerbol** · 10-11-2012, 03:03 AM

> Can you include the output of your sessionInfo()?

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] SAJR_0.0 multicore_0.1-7 DEXSeq_1.2.1 Biobase_2.16.0 BiocGenerics_0.2.0

loaded via a namespace (and not attached):
[1] biomaRt_2.12.0 hwriter_1.3 MASS_7.3-7 plyr_1.7.1 RCurl_1.91-1 statmod_1.4.15
[7] stringr_0.6 tools_2.15.1 XML_3.9-4

> I think the function estimateSizeFactors, which is called inside makeCompleteDEUAnalysis, is returning NAs. Do you have NA values in your counts, or any non-integer values?

No, I checked with is.na(ecs). The counts seem to be OK; they were generated by dexseq_count.py.

Moreover, I guess the problem is somehow related to the number of conditions. When I run DEXSeq on two conditions (2 conditions × 3 replicates) it works, but when the condition vector in the design contains more than two categories, it fails.

**areyes** · 10-11-2012, 03:56 AM

Could you check using the following?

any(is.na(counts(ecs)))

**yerbol** · 10-11-2012, 04:15 AM

Yes, I mistyped; that is what I ran. It returns FALSE.
And as I said, it works when I subset only two conditions from the same dataset.

**areyes** · 10-11-2012, 04:48 AM

The number of conditions or replicates should not be a problem. I am unable to reproduce this error. Would you mind sending me your ExonCountSet object so I can take a closer look?

**yerbol** · 10-11-2012, 04:59 AM

Yes, of course. What is your email?

**areyes** · 10-11-2012, 06:59 AM

Hi Yerbol,

Thanks for sending me your ExonCountSet object. I did

Code:

> toOut <- colSums(counts(ecs1))
> names(toOut) <- NULL
> toOut
 [1]  1779963   136044    26189  3148937  6446319  3636717  1587331   609052
 [9]  1453709  5970441    57469  2804561       18       17       19  3630791
[17] 12749375  5368012

You have samples with very low counts (26189, 57469, 18, 19, 17), which is definitely not normal. Which sequencing technology are you using? Maybe you should check your read and alignment files. When I remove these strange samples, the normalization factors are no longer NA.

Alejandro Reyes
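
The result above, where every size factor is NA until the near-empty samples are removed, is what a median-of-ratios estimator would produce. The following is a minimal plain-Python sketch of that idea, not DEXSeq's actual code. It assumes that estimateSizeFactors follows the DESeq-style approach of taking, for each sample, the median ratio of its counts to the per-row geometric mean, using only rows where that geometric mean is non-zero. The function name `size_factors` and the toy matrices are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a rows-by-samples count matrix.

    Returns one factor per sample, or NaN for every sample when no row
    has a positive count in all samples.
    """
    n_samples = len(counts[0])
    # A row's geometric mean is only usable if every sample has a count > 0.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        # Nothing to take a median over, so no factor can be estimated.
        return [float("nan")] * n_samples
    log_geomeans = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geomeans)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Three well-covered samples; the second has twice the depth of the others.
good = [[10, 20, 10], [100, 200, 100], [50, 100, 50]]
print(size_factors(good))

# Same data plus a failed sample with no reads on any of these rows.
with_failed = [row + [0] for row in good]
print(size_factors(with_failed))  # every entry is nan, including the good samples
```

Under this assumption, a sample with only 17 to 19 reads in total has a zero in nearly every counting bin, so almost no row keeps a non-zero geometric mean. The median is then taken over an empty set, which would explain why the well-covered samples lose their size factors too.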

**yerbol** · 10-11-2012, 07:41 AM

Yes, I know, some samples failed and have no coverage.
And I haven't done any prefiltering yet. I thought it shouldn't influence the results, because with low counts there will be no significance in the DEU test.

So, formally, what should be the lowest (and maybe highest?) count number for estimateSizeFactors to run correctly? And it's really weird that it influences the other samples too.

And thank you VERY much for your help!

**areyes** · 10-11-2012, 08:05 AM

I suggest you run standard quality controls and discard the samples that are not good (% of aligned reads, quality per cycle, PCR duplicates, contamination, etc.). It is just impossible to compare a library with 12749375 read counts with one with 17 read counts.

No problem!

Alejandro
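
As a rough first pass before the fuller quality controls suggested above, samples can be flagged by total count alone. The sketch below is in plain Python rather than R and is only an illustration. The helper `low_count_samples` and its 5% of median cutoff are hypothetical choices, not a DEXSeq recommendation or a threshold given in this thread.

```python
from statistics import median


def low_count_samples(library_sizes, min_fraction=0.05):
    """Return indices of samples whose total count is below
    min_fraction times the median library size (hypothetical rule)."""
    cutoff = min_fraction * median(library_sizes)
    return [i for i, size in enumerate(library_sizes) if size < cutoff]


# Column sums reported earlier in the thread, in the same order.
sizes = [1779963, 136044, 26189, 3148937, 6446319, 3636717, 1587331, 609052,
         1453709, 5970441, 57469, 2804561, 18, 17, 19, 3630791,
         12749375, 5368012]

flagged = low_count_samples(sizes)
print([sizes[i] for i in flagged])
```

With this cutoff, the flagged libraries are the same five that were pointed out as abnormal (26189, 57469, 18, 17, 19). A different fraction would draw the line elsewhere, so the flagged samples still need to be checked against alignment rates and the other metrics listed above.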

**capricy** · 11-08-2013, 02:27 PM

I ran into a similar NA problem, and my library sizes look OK when checked with your sample script:

> out<-colSums(counts(ecs))
> names(out)<-NULL
> out
[1] 48205948 43440778 22486575 22125932 40800119 51553703 47167921 14781957
[9] 30978061 62536983 25509126 55754959 16406873 59322632 39926544 63520796
[17] 71058692 58530700

**capricy** · 11-08-2013, 02:39 PM

More details about my data:

> any(is.na(counts(ecs)))
[1] FALSE

> design
      countFile                             condition
MP    ./ACAGTG///accepted_hits.exonCounts   MP
FA    ./ACTTGA///accepted_hits.exonCounts   FA
FR    ./AGGTTT///accepted_hits.exonCounts   FR
FW    ./AGTCAA///accepted_hits.exonCounts   FW
MA    ./ATCACG///accepted_hits.exonCounts   MA
FW.1  ./CAGATC///accepted_hits.exonCounts   FW
MM    ./CGATGT///accepted_hits.exonCounts   MM
FP    ./CTTGTA///accepted_hits.exonCounts   FP
MA.1  ./GATCAG///accepted_hits.exonCounts   MA
MW    ./GCCAAT///accepted_hits.exonCounts   MW
FM    ./GGCTAC///accepted_hits.exonCounts   FM
FR.1  ./GTCCGC///accepted_hits.exonCounts   FR
MM.1  ./TAGCTT///accepted_hits.exonCounts   MM
MW.1  ./TGACCA///accepted_hits.exonCounts   MW
MP.1  ./TTAGGC///accepted_hits.exonCounts   MP
FP.1  ./index10///accepted_hits.exonCounts  FP
FM.1  ./index11///accepted_hits.exonCounts  FM
FA.1  ./index9///accepted_hits.exonCounts   FA

[DEXSeq] problem with estimateSizeFactors