Hi Xi,
Thank you for your so kind reply.
My mapped reads are the "eland" format like this:
26 CCTTTCCACATCTTTCTCCCTCGCT U1 0 1 1 chr12 81865484 R
So, my data should convert to the "eland" format that DEGseq supports like this:
26 CCTTTCCACATCTTTCTCCCTCGCT 81865484 U1 R
I'm just wondering whether my conversion was right.
BTW, after I used the getGeneExp() function, if all of the RPKM values in the expression value files are "0", does it mean that the DEGexp() will fail to read the expCol1 or expCol2 value?
And, is there any difference between the "valCol" in readGeneExp() and the "expCol" in DEGexp()?
Thanks.
lix
Seqanswers Leaderboard Ad
Collapse
Announcement
Collapse
No announcement yet.
X
-
Sorry for the late reply. I was just back home.
You can try the following script. If the problem remains, just let me know.
Code:library(DEGseq) sample1 <- "D:/data/sample1.txt" sample2 <- "D:/data/sample2.txt" refFlat <- "D:/data/refFlat.txt" mapResultBatch1 <- c(sample1) mapResultBatch2 <- c(sample2) outputDir <- "D:/data/DEGseqExample" DEGseq(mapResultBatch1, mapResultBatch2, fileFormat = "eland",refFlat = refFlat, outputDir = outputDir, method = "LRT")
Leave a comment:
-
DEGseq err
Hi Xi,
I'm trying to use DEGseq and my RNA-seq reads have already mapped. The results are the "eland" format. Does DEGseq support this kind of format?
I just follow these commands as the manual metioned:
library(DEGseq)
sample1 <- system.file("data","D:/data/sample1.txt",package = "DEGseq")
sample2 <- system.file("data","D:/data/sample2.txt",package = "DEGseq")
refFlat <- system.file("data","D:/data/refFlat.txt",package = "DEGseq")
mapResultBatch1 <- c(sample1)
mapResultBatch2 <- c(sample2)
outputDir <- file.path(tempdir(),"D:/data/DEGseqExample")
DEGseq(mapResultBatch1, mapResultBatch2, fileFormat = "eland",refFlat = refFlat, outputDir = outputDir, method = "LRT")
However, it reported:
Loading required package: qvalue
Loading required package: tcltk
Loading Tcl/Tk interface ... done
Loading required package: samr
Loading required package: impute
Error in file(file, "r") : cannot open the connection
Calls: DEGseq -> DEGexp -> read.table -> file
In addition: Warning message:
In file(file, "r") :
cannot open file 'D:/data/DEGseqExample/group1.exp': No such file or directory
Execution halted
Is my CMD all right? Any kind of suggetion will be appreciated.
Thanks.
lixLast edited by lix; 01-28-2010, 04:55 AM.
Leave a comment:
-
Hi Abhi,
Thanks for your questions.
Originally posted by apratap View PostWhile I understand if we ignore the isoforms of the same gene, normalizing for the gene len wont make a difference. I am not sure why we should not normalize for the difference in reads counts. Also I am not able to generate the RPKM result using the read count /gene length. Could you please explain.
Originally posted by apratap View Post1. why we are not accounting for diff read numbers in the output_score.txt
Originally posted by apratap View Post2. How is the RPKM being calculated here. Somehow I am not able to find the same number shown in the example taken above.
Hope this helps. If further questions, just let me know. I am glad to solve all the problems.
Best,
Xi
Leave a comment:
-
Hi Xi
Thanks for putting your work in the open source. I am quite happy to see many different methods at my hand to play around with numbers.
I think I have some repeated questions that have been asked on this thread in some way or the other. I may be getting down to very basics and hope that will help some others also.
As a pilot DEgseq usage run, I used the LRT method to find expression values for two samples(control/mutant). I am taking one specific gene count example to ask my questions. The input was bed format, alignment was done using TopHat.
HTML Code:Control Gene Count RPKM total no. reads gene length ABAT 811 1.57049 87082676 5930 Mutant ABAT 4264 10.3046 69779755 5930
1. why we are not accounting for diff read numbers in the output_score.txt
2. How is the RPKM being calculated here. Somehow I am not able to find the same number shown in the example taken above.
Thanks!
-Abhi
Leave a comment:
-
Thanks Likun,
I was confused by the q-value. It all makes sense now.
- Nick
Leave a comment:
-
Originally posted by Xi Wang View PostAmyL,
Thanks for your question.
If you only care the fold changes, you can use a normalization as you mentioned. Or, in DEGseq, you can use the option normalMethod="median".
Xi
Leave a comment:
-
Originally posted by ngcrawford View PostHow do you set it? It's mentioned in the paper, but I can only find ways to adjust the p-value cut-off.
Thanks in advance.
- Nick
I think ngcrawford want to find a way to set the fdr cut-off.
The following is an example to set it.
DEGexp(geneExpFile1=geneExpFile,geneExpFile2=geneExpFile2,thresholdKind=4,qValue=0.001)
Please type ?DEGexp for detail.
---------------
Likun
Leave a comment:
-
Originally posted by AmyL View PostQuestion:
I am attempting to analyze samples that do not have the same number of reads. For example, one has 800K and another has 1.3M. With the analysis from DEGseq, it is obvious that the fold changes between samples are due to the difference in total read numbers. With this particular example, would you recommend using a reads/million normalization?
Thanks in advance!
Thanks for your question.
If you only care the fold changes, you can use a normalization as you mentioned. Or, in DEGseq, you can use the option normalMethod="median".
Xi
Leave a comment:
-
Question:
I am attempting to analyze samples that do not have the same number of reads. For example, one has 800K and another has 1.3M. With the analysis from DEGseq, it is obvious that the fold changes between samples are due to the difference in total read numbers. With this particular example, would you recommend using a reads/million normalization?
Thanks in advance!
Leave a comment:
-
Fdr?
How do you set it? It's mentioned in the paper, but I can only find ways to adjust the p-value cut-off.
Thanks in advance.
- Nick
Leave a comment:
-
Originally posted by ngcrawford View PostDEGseq look really nice, but I'm having trouble getting my data file read. Do I just need to substitute:
with
and then run DEGexp(commands)?
I'm getting the following error:
Thanks in advance,
Nick
You can specify the gene expression file in this way:
Suppose the file path is "D:/data/MyData.txt" (windows platform), then
Code:geneExpFile <- "D:/data/MyData.txt"
Leave a comment:
-
I can't get DEGseq to run my data
DEGseq look really nice, but I'm having trouble getting my data file read. Do I just need to substitute:
>geneExpFile <- system.file("data", "GeneExpExample5000.txt", package = "DEGseq")
>geneExpFile <- system.file("data", "MyData.txt", package = "DEGseq")
I'm getting the following error:
Error in read.table(geneExpFile1, header = header, sep = sep) :
no lines available in input
In addition: Warning message:
In file(file, "rt") :
file("") only supports open = "w+" and open = "w+b": using the former
Nick
Leave a comment:
Latest Articles
Collapse
-
by seqadmin
The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...-
Channel: Articles
04-22-2024, 07:01 AM -
-
by seqadmin
Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...-
Channel: Articles
04-04-2024, 04:25 PM -
ad_right_rmr
Collapse
News
Collapse
Topics | Statistics | Last Post | ||
---|---|---|---|---|
Started by seqadmin, 04-25-2024, 11:49 AM
|
0 responses
20 views
0 likes
|
Last Post
by seqadmin
04-25-2024, 11:49 AM
|
||
Started by seqadmin, 04-24-2024, 08:47 AM
|
0 responses
20 views
0 likes
|
Last Post
by seqadmin
04-24-2024, 08:47 AM
|
||
Started by seqadmin, 04-11-2024, 12:08 PM
|
0 responses
62 views
0 likes
|
Last Post
by seqadmin
04-11-2024, 12:08 PM
|
||
Started by seqadmin, 04-10-2024, 10:19 PM
|
0 responses
61 views
0 likes
|
Last Post
by seqadmin
04-10-2024, 10:19 PM
|
Leave a comment: