DEGseq - SEQanswers

Xi Wang replied

11-23-2010, 08:54 PM
Originally posted by wdt

Hi,
Using the SAM-to-BED Perl script, I got a file like this:

chr1 435837 435913 U0 0 +
chr1 435837 435913 U0 0 -
chr1 435837 435913 U1 0 -
chr1 435838 435914 U1 0 +
chr1 435838 435914 U1 0 -
chr1 435838 435914 U1 0 -
chr1 435840 435916 U2 0 -
chr1 435840 435916 U2 0 -
chr1 435840 435916 U3 0 -
chr1 435840 435916 U2 0 -
chr1 435842 435918 U4 0 -
chr1 435842 435918 U4 0 -
chr1 435844 435920 U2 0 -
chr1 435844 435920 U2 0 -
chr1 437189 437265 U2 0 +

Could someone explain how U0, U1, U2 are assigned and
what they are?

Thanks,

U (unique) marks uniquely mapped reads. The script may be treating all reads as unique.

The integer after it is the number of mismatches.
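To make that reading of the fourth column concrete, here is a minimal Python sketch that splits each tag into its letter and its integer and tallies reads per mismatch count. It assumes the tag is always one letter followed by an integer, as in the lines posted above; the SAM-to-BED script itself is not shown in this thread, so `parse_tag` is a hypothetical helper, not part of that script.

```python
from collections import Counter

# A few lines copied from the post above (whitespace-separated BED6 fields).
bed_lines = """\
chr1 435837 435913 U0 0 +
chr1 435837 435913 U1 0 -
chr1 435840 435916 U2 0 -
chr1 435842 435918 U4 0 -
chr1 437189 437265 U2 0 +
""".splitlines()


def parse_tag(tag):
    """Split a name such as 'U2' into (prefix, mismatches).

    Assumption: one leading letter ('U' = unique) followed by an integer
    (number of mismatches), as described in the reply above.
    """
    return tag[0], int(tag[1:])


mismatch_counts = Counter()
for line in bed_lines:
    chrom, start, end, tag, score, strand = line.split()
    prefix, mismatches = parse_tag(tag)
    mismatch_counts[mismatches] += 1

# Number of reads seen for each mismatch count.
print(dict(mismatch_counts))
```

A tally like this is a quick way to check whether every read really carries the U prefix.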
wdt replied

11-23-2010, 08:31 PM
Hi,
Using the SAM-to-BED Perl script, I got a file like this:

chr1 435837 435913 U0 0 +
chr1 435837 435913 U0 0 -
chr1 435837 435913 U1 0 -
chr1 435838 435914 U1 0 +
chr1 435838 435914 U1 0 -
chr1 435838 435914 U1 0 -
chr1 435840 435916 U2 0 -
chr1 435840 435916 U2 0 -
chr1 435840 435916 U3 0 -
chr1 435840 435916 U2 0 -
chr1 435842 435918 U4 0 -
chr1 435842 435918 U4 0 -
chr1 435844 435920 U2 0 -
chr1 435844 435920 U2 0 -
chr1 437189 437265 U2 0 +

Could someone explain how U0, U1, U2 are assigned and
what they are?

Thanks,
Xi Wang replied

11-22-2010, 06:32 PM
Originally posted by Sol

How do you calculate the p-value cutoff in DEGseq? For example, cutoff = 2.
thanks

The cutoffs are specified by the user. If you are asking how the p-values are calculated, please refer to our paper: http://bioinformatics.oxfordjournals.../full/26/1/136

BTW, a p-value is always a real number between 0 and 1.
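As a small illustration of a user-specified cutoff, the Python sketch below rejects a cutoff outside the range 0 to 1 and then keeps the genes whose p-value falls below it. The function name, the strict `<` comparison and the example values are all assumptions made for this sketch; it does not reproduce DEGseq's own code.

```python
def select_by_pvalue(p_values, cutoff):
    """Return the gene names whose p-value is below a user-chosen cutoff.

    Genes without a p-value (None here, "NA" in a results table) are skipped.
    """
    if not 0.0 <= cutoff <= 1.0:
        raise ValueError("a p-value cutoff must lie between 0 and 1")
    return [gene for gene, p in p_values.items() if p is not None and p < cutoff]


# Made-up p-values, for illustration only.
example = {"geneA": 0.0004, "geneB": 0.03, "geneC": 0.6, "geneD": None}

print(select_by_pvalue(example, 0.001))
```

Calling `select_by_pvalue(example, 2)` raises a `ValueError`, which is the point made above: 2 cannot be a p-value cutoff.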
Sol replied

11-22-2010, 05:43 PM
How do you calculate the p-value cutoff in DEGseq? For example, cutoff = 2.
thanks
Xi Wang replied

11-20-2010, 08:03 PM
Originally posted by Sol

I would like to know what the letters NA mean in the DEGseq results.
Another question: does log2 mean a two-fold change or a four-fold change?
thanks

Thanks for your question.

NA: when the read counts for a gene in both samples are zero, or zero and a small number (say, <5), the program will not calculate the values (such as fold-change, p-value) for this gene. "NA"s appear in those places.

log2 means base-2 logarithm. So
if fold-change = 1, log2(fold-change) = 0;
if fold-change = 2, log2(fold-change) = 1;
if fold-change = 4, log2(fold-change) = 2;
if fold-change = 0.5, log2(fold-change) = -1.
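The two points above can be checked with a short Python sketch. `is_na_case` encodes the rule as stated in this reply (both counts zero, or one zero and the other small), with the threshold of 5 taken from the "say, <5" wording; the exact threshold DEGseq uses is not given here, so treat it as an assumption.

```python
import math


def is_na_case(count1, count2, small=5):
    """True when the reply above says the values are reported as "NA".

    Assumption: 'small' means fewer than 5 reads, per the "say, <5" wording.
    """
    low, high = sorted((count1, count2))
    return low == 0 and high < small


# log2 of a few fold-change values, matching the list above.
for fold_change in (1, 2, 4, 0.5):
    print(fold_change, math.log2(fold_change))

print(is_na_case(0, 0), is_na_case(0, 3), is_na_case(0, 50), is_na_case(10, 12))
```

So a log2 value of 1 corresponds to a two-fold change and a log2 value of 2 to a four-fold change.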
Sol replied

11-20-2010, 05:26 PM
I would like to know what the letters NA mean in the DEGseq results.
Another question: does log2 mean a two-fold change or a four-fold change?
thanks
anamaretti replied

11-09-2010, 02:00 PM
Hello

I'm using samWrapper to do some statistical analysis on my samples.
I have 2 groups, each with 5 biological replicates.
However, I'm getting some weird results.
Even when none of the samples has any reads for a gene, that gene is sometimes included in the list of differentially expressed genes (Signature = TRUE).

The parameters that I used are:
Values are in RPKM
nperm= 1000
min.fold-change=2
max.qValue=1e-04
paired=FALSE

Should I add some restriction to avoid that?

By the way, is the default seed value 100?
Is there any benefit to changing it?

Thanks for the help and for this program, it is great!
Marisa_Miller replied

11-09-2010, 08:36 AM
Thank you so much! That solved my problems!
Xi Wang replied

11-08-2010, 11:21 PM
Hi Marisa,

Thanks for your questions.

1) Please refer to this site: http://www.bioconductor.org/packages...t/doc/DEGseq.R
"exp[30:35,]" just displays rows 30-35 of the matrix "exp".

2) Yes, you need to specify your files, the column for gene names (geneCol=?), the columns for gene expression values (valCol=??), etc.

3) Pass your two files as geneExpFile1 and geneExpFile2:

Code:

DEGexp2(geneExpFile1="your_gene_exp_file_1", geneCol1=1, expCol1=c(7,9,12,15,18), groupLabel1="kidney",
        geneExpFile2="your_gene_exp_file_2", geneCol2=1, expCol2=c(8,10,11,13,16), groupLabel2="liver",
        method="MARS", outputDir=outputDir)

4) Set thresholdKind=3 and a qValue, like this:

Code:

DEGexp2(geneExpFile1="your_gene_exp_file_1", geneCol1=1, expCol1=c(7,9,12,15,18),
        geneExpFile2="your_gene_exp_file_2", geneCol2=1, expCol2=c(8,10,11,13,16),
        thresholdKind=3, qValue=1e-3, method="MARS", outputDir=outputDir)

5) Ignore the bold words in the first two lines; they are just for this example.
valCol specifies which column contains the (expression) values you want to analyze. For your case, you may set "valCol=6".
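Since geneCol and valCol are 1-based column numbers, it can help to check which columns they point at before running the analysis. The Python sketch below does this for three rows of the file format posted earlier in this thread. It assumes the file is tab-delimited with one header row; `read_columns` is a hypothetical helper for inspection only, not part of DEGseq. In that sample the gene IDs sit in column 2, because column 1 is Chr, and the read counts sit in column 6.

```python
import csv
import io

# Three rows in the format posted earlier in this thread; tab delimiters are assumed.
sample = (
    "Chr\tGene\tStart\tEnd\tGene_len\tReads\tRPKM\tLog2(RPKM)\n"
    "1\tAT1G01010.1\t3631\t5899\t1688\t45\t1.58899\t0.668107\n"
    "1\tAT1G01020.1\t5928\t8737\t1623\t104.73\t3.84621\t1.94344\n"
    "1\tAT1G01030.1\t11649\t13714\t1905\t78\t2.44051\t1.28718\n"
)


def read_columns(text, gene_col, val_col):
    """Pick two columns by 1-based index (as R counts them), skipping the header."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # header row
    return {row[gene_col - 1]: float(row[val_col - 1]) for row in reader}


# Column 2 holds the gene IDs and column 6 the read counts in this sample.
reads = read_columns(sample, gene_col=2, val_col=6)
print(reads)
```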
Last edited by Xi Wang; 11-08-2010, 11:23 PM.
Marisa_Miller replied

11-08-2010, 08:00 AM
Hi, and thanks for the reply! I have decided to use the MARS method after reading through your paper. The one thing I am confused about is the actual usage of R. After reading through your manual, I am still unsure of the actual commands used to run DEGexp. Below is the usage listed in the manual for DEGexp2 (which I think I need to use, since I have two different input files).

geneExpFile <- system.file("extdata", "GeneExpExample5000.txt", package="DEGseq")
outputDir <- file.path(tempdir(), "DEGexpExample")
exp <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,12,15,18))
exp[30:35,]
exp <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,11,13,16))
exp[30:35,]
DEGexp2(geneExpFile1=geneExpFile, geneCol1=1, expCol1=c(7,9,12,15,18), groupLabel1="kidney", geneExpFile2=geneExpFile, geneCol2=1, expCol2=c(8,10,11,13,16), groupLabel2="liver", method="MARS", outputDir=outputDir)
cat("outputDir:", outputDir, "\n")

Questions:
1) Is each command entered on a separate line? It is unclear where the line breaks are.
2) I am unsure which parts of the example usage listed above I need to change for my specific case. Obviously I need to specify the correct file paths, names, etc.
3) Since I am using two separate files, where do I specify this in the commands above? I can't tell from the example commands where to do this.
4) Where can I enter a q-value threshold?
5) I tried to highlight in bold the parts of the example whose meaning I do not understand or could not find explained in the manual.

The example of my input file is in the previous post.
Basically, my problem is with the usage of R. If you could help me by indicating how I can apply the example usage to my files that would be great.

Thank you,
Marisa
Xi Wang replied

11-03-2010, 08:59 PM
Hello Marisa,

Thanks for using DEGseq. You can use the function DEGexp to detect differentially expressed genes, and the read counts (the 6th column of your file) are the recommended input for DEGexp. Details can be found in our Bioinformatics paper. There are slight differences between the LRT, FET and MARS methods; MARS was proposed by us and is based on the M-A plot, using a normal distribution approximation.

Hope this helps.
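For readers unfamiliar with the M-A plot, its two per-gene coordinates are conventionally defined as M = log2(C1) - log2(C2), the log2 fold change, and A = (log2(C1) + log2(C2)) / 2, the average log2 expression. The Python sketch below computes only these two values from a pair of read counts. It does not implement the MARS test itself, and the function name and the choice to return None for a zero count are assumptions of this sketch.

```python
import math


def ma_values(count1, count2):
    """Return (M, A) for one gene, or None if either count is not positive.

    M = log2(count1) - log2(count2)        (log2 fold change)
    A = (log2(count1) + log2(count2)) / 2  (average log2 expression)
    """
    if count1 <= 0 or count2 <= 0:
        return None  # log2 is undefined for zero counts
    log1, log2_ = math.log2(count1), math.log2(count2)
    return log1 - log2_, (log1 + log2_) / 2


print(ma_values(8, 2))
print(ma_values(0, 5))
```

With counts of 8 and 2, M is 2 (a four-fold difference) and A is 2.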
Marisa_Miller replied

11-03-2010, 11:44 AM
Hello,
I am new to R and would like to use your DEGseq package to identify differentially expressed genes between my libraries. In my case I have 2 libraries to compare. I have calculated the RPKMs using a program written by a member of my lab. An example of the file is shown here:

Chr Gene Start End Gene_len Reads RPKM Log2(RPKM)
1 AT1G01010.1 3631 5899 1688 45 1.58899 0.668107
1 AT1G01020.1 5928 8737 1623 104.73 3.84621 1.94344
1 AT1G01020.2 6790 8737 1085 72.2697 3.97015 1.98919
1 AT1G01030.1 11649 13714 1905 78 2.44051 1.28718
1 AT1G01040.1 23146 31227 6251 1159 11.0513 3.46615
1 AT1G01046.1 28500 28706 207 4 1.15178 0.203866
1 AT1G01050.1 31170 33153 976 2186 133.5 7.06069

I am unsure which part of your package to use (I think DEGexp?) to analyze the data. Also, could you help me decide which method to use (e.g. LRT, MATR, etc.)?

I have read through the examples on the usage of the package, but am still unsure.

Thank you in advance for your help
Xi Wang replied

10-27-2010, 09:29 PM
Originally posted by Sol

Can the results of DEGseq be used directly for analysis, or should I apply some other standardization?
thanks

Yes, the results can be used directly. You can apply them to functional analysis, such as GO enrichment analysis. Alternatively, you can refer to GOseq, which takes gene length bias into account.
Xi Wang replied

10-27-2010, 09:21 PM
Originally posted by Sol

What is the function of the MA plot the fold change??
The graph shows the relationship in the MA-plot, no??
thanks

I am not sure what you want to know with these questions.
For detailed questions, I would prefer that you email me: [email protected]. I can reply more quickly to questions about DEGseq that way. Thanks.
Sol replied

10-27-2010, 12:20 PM
Can the results of DEGseq be used directly for analysis, or should I apply some other standardization?
thanks