**svl** · 11-17-2009, 05:50 AM

Hi Xi Wang,

--------------
Thanks again for your package; it seems to work fine. Will you be including more info in output.html? For instance:
* correlation measures
* nr of reads included for each
* number of differentially expressed genes
Of course we can extract that ourselves from the output_score.txt files, but it would still be nice to have more info directly.

--------------
And why are the header values for the two compared samples called "value1" and "value2"? If you give the sample names via the arguments "groupLabel1/2" in the function DEGexp(), it would be nice if they showed up in the output_score.txt files too.

--------------
And what exactly does the option "rawCount" do?

thanks,
-SvL

**adamreid** · 11-17-2009, 07:24 AM

Hi Xi,

I guess I'm asking whether there is/ought to be a correction for gene length because more reads are expected to map to longer genes.

Something else I was wondering about was the use of multiple examples for the two conditions. If I use multiple columns for expCol1 and expCol2, the number of reads appears to be summed. Is it therefore a bad idea to use, say, 3 columns for expCol1 and 2 for expCol2?

Adam

**Xi Wang** · 11-18-2009, 09:01 AM

Originally posted by svl

Thanks again for your package; it seems to work fine. Will you be including more info in output.html? For instance:
* correlation measures
* nr of reads included for each
* number of differentially expressed genes
Of course we can extract that ourselves from the output_score.txt files, but it would still be nice to have more info directly.

--------------
And why are the header values for the two compared samples called "value1" and "value2"? If you give the sample names via the arguments "groupLabel1/2" in the function DEGexp(), it would be nice if they showed up in the output_score.txt files too.

Thanks a lot for your suggestions. We will add this information in the next version. I am not sure what "nr" means in the line "* nr of reads included for each". Could you please give me more details? Thanks.

Originally posted by svl

And what exactly does the option "rawCount" do?

The rawCount option is only used when method="MATR" is chosen. If rawCount = TRUE, we adjust the mean of M to the same value for the case-and-control samples and the technical replicates. The difference in the mean of M is caused by the different sequencing depths of the two samples being compared. If rawCount = FALSE, we assume that the gene expression levels have already been normalized against sequencing depth (as RPKM values are), so there is no need to adjust the mean of M.
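
To illustrate why the mean of M depends on sequencing depth, here is a minimal base R sketch. The counts are made up, M and A are taken as the usual MA-plot quantities, and this is not DEGseq's internal code; it only shows the kind of offset that a depth adjustment removes.

```r
# Hypothetical counts for four genes; sample 2 was sequenced twice as
# deeply, with no real change in expression.
count1 <- c(geneA = 100, geneB = 200, geneC = 400, geneD = 800)
count2 <- 2 * count1

# M is the log2 fold change, A the average log2 expression level.
M <- log2(count1) - log2(count2)
A <- (log2(count1) + log2(count2)) / 2

# With raw counts, M is centred near log2(N1 / N2) rather than 0.
depth_offset <- log2(sum(count1) / sum(count2))

# Subtracting that offset recentres M.
M_centred <- M - depth_offset

print(mean(M))
print(mean(M_centred))
```

With values that are already depth-normalized, such as RPKM, the offset is expected to be close to zero, so no recentring is needed.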

I hope this information helps.

With best wishes,
Xi

**Xi Wang** · 11-18-2009, 09:16 AM

Originally posted by adamreid

I guess I'm asking whether there is/ought to be a correction for gene length because more reads are expected to map to longer genes.

It is true that more reads come from longer genes if the copy number of the transcripts is the same. However, when the aim is to identify differentially expressed genes, we can use raw read counts. The reason is that each gene is compared only with itself, and its length does not change between samples (ignoring alternative splicing). For the methods based on the random sampling model (such as LRT, FET, MARS), we suggest using the raw counts, which fit the random sampling model better.
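
A small base R sketch of the length argument, using made-up gene lengths and counts and assuming both samples have the same sequencing depth: dividing by gene length changes the values within a sample, but the between-sample ratio for each gene stays the same.

```r
# Hypothetical gene lengths (bp) and raw counts in two samples.
gene_length <- c(short = 500, long = 5000)
count1 <- c(short = 50, long = 500)
count2 <- c(short = 100, long = 1000)

# Length-corrected values (reads per kilobase) for each sample.
per_kb1 <- count1 / (gene_length / 1000)
per_kb2 <- count2 / (gene_length / 1000)

# The per-gene ratio between samples is identical either way,
# because the gene length cancels out.
ratio_raw <- count2 / count1
ratio_per_kb <- per_kb2 / per_kb1

print(ratio_raw)
print(ratio_per_kb)
```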

Originally posted by adamreid

Something else I was wondering about was the use of multiple examples for the two conditions. If I use multiple columns for expCol1 and expCol2, the number of reads appears to be summed. Is it therefore a bad idea to use, say, 3 columns for expCol1 and 2 for expCol2?

It works.

Thanks for your questions.
Xi

**svl** · 11-19-2009, 03:51 AM

Originally posted by Xi Wang

Thanks a lot for your suggestions. We will add this information in the next version. I am not sure what "nr" means in the line "* nr of reads included for each". Could you please give me more details? Thanks.
Xi

I wasn't too clear indeed :P, I meant the number of reads the analysis is based on. I just quickly wrote down some things that came to mind.

Originally posted by Xi Wang

If rawCount = FALSE, we assume that the gene expression levels have already been normalized against sequencing depth (as RPKM values are).
Xi

Right. I have RPKM values (cufflinks output), so do you suggest I'd be better off using the method=MATR with rawCount=F instead of method=MARS...? It's not all technical replicates I put up against each other...

**Xi Wang** · 11-19-2009, 08:40 AM

Originally posted by svl

I wasn't too clear indeed :P, I meant the number of reads the analysis is based on. I just quickly wrote down some things that came to mind.

Thanks. It's quite clear this time. We also feel that those statistics are quite important in practice.

Originally posted by svl

Right. I have RPKM values (cufflinks output), so do you suggest I'd be better off using the method=MATR with rawCount=F instead of method=MARS...? It's not all technical replicates I put up against each other...

Sorry that I didn't make myself clear either :-( The rawCount option applies only to method="MATR". For the other methods, there is no need to specify whether the gene expression levels are raw read counts or not.
Further, as we recommend using raw read counts as the gene expression level, you can multiply the RPKM by the gene length to get back the raw read count. If you don't want to do this, DEGexp also handles RPKM well.
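
A minimal sketch of the back-conversion, assuming the usual definition RPKM = count * 1e9 / (gene length in bp * total mapped reads). Under that definition the total number of mapped reads is needed as well as the gene length; multiplying by the length alone gives a value proportional to the count within one sample. The function name `rpkm_to_count` and the numbers are hypothetical and not part of DEGseq.

```r
# Invert RPKM = count * 1e9 / (gene_length_bp * total_mapped_reads).
rpkm_to_count <- function(rpkm, gene_length_bp, total_mapped_reads) {
  rpkm * gene_length_bp * total_mapped_reads / 1e9
}

# Hypothetical gene: RPKM of 10, 2000 bp long, 5 million mapped reads.
approx_count <- rpkm_to_count(10, 2000, 5e6)
print(approx_count)  # 100
```

Counts recovered this way are only approximate if the RPKM values were rounded or estimated by a model, so rounding them to integers may be needed before use.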

**ngcrawford** · 11-19-2009, 04:37 PM

I can't get DEGseq to run my data

DEGseq looks really nice, but I'm having trouble getting my data file read. Do I just need to substitute:

geneExpFile <- system.file("data", "GeneExpExample5000.txt", package = "DEGseq")

with

geneExpFile <- system.file("data", "MyData.txt", package = "DEGseq")

and then run DEGexp(commands)?

I'm getting the following error:

Error in read.table(geneExpFile1, header = header, sep = sep) :
no lines available in input
In addition: Warning message:
In file(file, "rt") :
file("") only supports open = "w+" and open = "w+b": using the former

Thanks in advance,
Nick

**Xi Wang** · 11-19-2009, 09:48 PM

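Originally posted by ngcrawford

DEGseq looks really nice, but I'm having trouble getting my data file read. Do I just need to substitute:

geneExpFile <- system.file("data", "GeneExpExample5000.txt", package = "DEGseq")

with

geneExpFile <- system.file("data", "MyData.txt", package = "DEGseq")

and then run DEGexp(commands)?

I'm getting the following error:

Error in read.table(geneExpFile1, header = header, sep = sep) :
no lines available in input
In addition: Warning message:
In file(file, "rt") :
file("") only supports open = "w+" and open = "w+b": using the former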

Thanks in advance,
Nick

Nick,

You can specify the gene expression file in this way:

Suppose the file path is "D:/data/MyData.txt" (on Windows); then

Code:

geneExpFile <- "D:/data/MyData.txt"
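
For background, system.file() only looks inside an installed package, so it returns an empty string for a file of your own, which is where the file("") warning comes from. A short base R sketch follows; the temporary file and its contents are made up for illustration, and the "base" package is used only so the example runs without DEGseq installed.

```r
# system.file() cannot find a user file inside a package and returns "".
missing_path <- system.file("data", "MyData.txt", package = "base")
print(missing_path == "")  # TRUE

# A plain path works. Here a small tab-delimited file is written to a
# temporary directory to stand in for your own data file.
geneExpFile <- file.path(tempdir(), "MyData.txt")
writeLines(c("gene\tvalue1\tvalue2", "g1\t10\t20"), geneExpFile)

# Checking the path first gives a clearer failure than read.table() does.
stopifnot(file.exists(geneExpFile))
expr <- read.table(geneExpFile, header = TRUE, sep = "\t")
print(expr)
```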

Xi

**ngcrawford** · 11-20-2009, 07:04 AM

Xi,

That worked like a charm. Thanks!

- Nick

**ngcrawford** · 11-20-2009, 01:21 PM

FDR?

How do you set it? It's mentioned in the paper, but I can only find ways to adjust the p-value cut-off.

Thanks in advance.

- Nick

**AmyL** · 11-20-2009, 06:30 PM

Question:

I am attempting to analyze samples that do not have the same number of reads. For example, one has 800K and another has 1.3M. With the analysis from DEGseq, it is obvious that the fold changes between samples are due to the difference in total read numbers. With this particular example, would you recommend using a reads/million normalization?

Thanks in advance!

**Xi Wang** · 11-20-2009, 06:45 PM

Originally posted by ngcrawford

How do you set it? It's mentioned in the paper, but I can only find ways to adjust the p-value cut-off.

Thanks in advance.

- Nick

Sorry, I don't quite catch your meaning. What does "it" refer to? Thanks.

Xi

**Xi Wang** · 11-20-2009, 07:08 PM

Originally posted by AmyL

Question:

I am attempting to analyze samples that do not have the same number of reads. For example, one has 800K and another has 1.3M. With the analysis from DEGseq, it is obvious that the fold changes between samples are due to the difference in total read numbers. With this particular example, would you recommend using a reads/million normalization?

Thanks in advance!

AmyL,

Thanks for your question.
If you only care about the fold changes, you can use the normalization you mentioned. Alternatively, in DEGseq, you can use the option normalMethod="median".

Xi

**Likun Wang** · 11-22-2009, 05:01 AM

Originally posted by ngcrawford

How do you set it? It's mentioned in the paper, but I can only find ways to adjust the p-value cut-off.

Thanks in advance.

- Nick

Hi ngcrawford and Xi,
I think ngcrawford wants to find a way to set the FDR cut-off.
Here is an example of how to set it:
DEGexp(geneExpFile1=geneExpFile,geneExpFile2=geneExpFile2,thresholdKind=4,qValue=0.001)
Please type ?DEGexp for details.
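
To show what an FDR cut-off does in general, here is a base R sketch using the Benjamini-Hochberg adjustment from p.adjust(). The p-values are made up, and the q-values that DEGexp reports may be computed by a different procedure, so treat this only as an illustration of the idea.

```r
# Hypothetical p-values for six genes.
p_values <- c(0.0001, 0.0004, 0.01, 0.03, 0.2, 0.8)

# Benjamini-Hochberg adjusted p-values.
q_bh <- p.adjust(p_values, method = "BH")

# Genes that pass an FDR cut-off of 0.05.
fdr_cutoff <- 0.05
significant <- q_bh <= fdr_cutoff

print(data.frame(p_values, q_bh, significant))
```

Here the fourth gene has p = 0.03 and still passes at FDR 0.05, while a cut-off as strict as 0.001 would keep only the first gene.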

---------------
Likun

**Likun Wang** · 11-22-2009, 05:13 AM

Originally posted by Xi Wang

AmyL,

Thanks for your question.
If you only care about the fold changes, you can use the normalization you mentioned. Alternatively, in DEGseq, you can use the option normalMethod="median".

Xi

BTW: for the fold changes, you can normalize as you mentioned or use the option normalMethod="median". But for the methods "LRT", "FET" (Fisher's exact test) and "MARS", raw counts and normalMethod="none" are recommended for your example.
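
A minimal base R sketch of the reads-per-million normalization discussed above, using the 800K and 1.3M totals from the question. The per-gene counts are made up, chosen so that the gene is equally expressed in both samples.

```r
# Total reads per sample, as in the question.
total_reads <- c(sample1 = 800000, sample2 = 1300000)

# Hypothetical raw counts for one gene with unchanged expression.
raw_count <- c(sample1 = 80, sample2 = 130)

# The raw fold change only reflects the difference in depth.
raw_fold <- raw_count[["sample2"]] / raw_count[["sample1"]]

# Reads per million removes the depth difference.
rpm <- raw_count / total_reads * 1e6
rpm_fold <- rpm[["sample2"]] / rpm[["sample1"]]

print(raw_fold)  # 1.625
print(rpm_fold)  # 1
```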
