I would recommend clustering the time-course expression profile of each gene using fuzzy c-means clustering, which can be done fairly easily in R. You can then look for enrichment of specific pathways or GO terms in each cluster, and see which genes are regulated early, in the middle, and late. Perhaps the middle or late genes are regulated by a transcription factor that is increased in the early group. Just an idea.
But I would definitely look into fuzzy c-means clustering. Look at Figure 7 in this paper for the type of output you can expect from it.
Rigbolt KT, Prokhorova TA, Akimov V, Henningsen J, Johansen PT, Kratchmarova
I, Kassem M, Mann M, Olsen JV, Blagoev B. System-wide temporal characterization
of the proteome and phosphoproteome of human embryonic stem cell differentiation.
Sci Signal. 2011 Mar 15;4(164):rs3. PubMed PMID: 21406692.
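To make the idea concrete, here is a minimal base-R sketch of fuzzy c-means on simulated early/middle/late profiles. Everything in it is an assumption made for illustration: the function name fuzzy_cmeans, the simulated data, and the parameter choices (fuzzifier m = 2, three clusters). It is not the code used in the paper above. For real data a maintained implementation, such as cmeans() in the e1071 package or the Mfuzz Bioconductor package, is probably the better choice.

```r
# Minimal fuzzy c-means sketch in base R (illustrative only).
# x: numeric matrix, rows = genes, columns = time points; k: number of clusters;
# m: fuzzifier (> 1); larger values give softer memberships.
fuzzy_cmeans <- function(x, k, m = 2, iter_max = 200, tol = 1e-6) {
  n <- nrow(x)
  # Random initial memberships; each row sums to 1.
  u <- matrix(runif(n * k), n, k)
  u <- u / rowSums(u)
  centers <- NULL
  for (iter in seq_len(iter_max)) {
    # Cluster centers are membership-weighted means of the profiles.
    w <- u^m
    centers <- (t(w) %*% x) / colSums(w)
    # Squared Euclidean distance of every gene to every center (n x k).
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    d2 <- pmax(d2, 1e-12)  # guard against division by zero
    # Standard membership update: u_ij is proportional to d_ij^(-2 / (m - 1)).
    inv <- d2^(-1 / (m - 1))
    u_new <- inv / rowSums(inv)
    delta <- max(abs(u_new - u))
    u <- u_new
    if (delta < tol) break
  }
  list(centers = centers, membership = u,
       cluster = max.col(u, ties.method = "first"))
}

# Simulated data (hypothetical): 30 genes each peaking early, in the middle, or late.
set.seed(1)
early  <- c(3, 2, 1, 0, 0, 0)
middle <- c(0, 1, 3, 3, 1, 0)
late   <- c(0, 0, 0, 1, 2, 3)
profiles <- rbind(
  t(replicate(30, early  + rnorm(6, sd = 0.3))),
  t(replicate(30, middle + rnorm(6, sd = 0.3))),
  t(replicate(30, late   + rnorm(6, sd = 0.3)))
)
rownames(profiles) <- paste0("gene", seq_len(nrow(profiles)))

# Standardize each gene so clustering reflects profile shape, not magnitude.
z <- t(scale(t(profiles)))

fit <- fuzzy_cmeans(z, k = 3)
print(table(fit$cluster))             # hard assignment by highest membership
print(head(round(fit$membership, 2))) # soft memberships per gene
```

Genes with a high membership in a single cluster are the ones to carry into a pathway or GO enrichment test; genes with split memberships sit between patterns.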
-
time courses and heat maps
From my previous experience with time-course experiments (although this was in proteomics), I recommend the following:
- First decide which time point is your reference. This should already be clear when you design the experimental protocol.
- Use the data from this time point as the "background" / zero / reference (whatever you would like to call it), then calculate the ratio of every other time point with respect to it.
- Once you have fold-change or log-ratio values per gene and time point, you can visualize them in a heatmap (I did this once with RPKMs using Gitools @ http://www.gitools.org).
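The steps above can be sketched in base R roughly as follows. The RPKM values, the gene names, the choice of "t0" as reference, and the pseudocount of 1 (added to avoid dividing by zero) are all assumptions for illustration. Base R's heatmap() stands in here for Gitools.

```r
# Hypothetical RPKM matrix: rows = genes, columns = time points.
rpkm <- matrix(
  c(10, 20, 40, 80,
    50, 50, 25, 12.5,
     0,  1,  3,  7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("t0", "t1", "t2", "t3"))
)

reference   <- "t0"  # assumed reference time point
pseudocount <- 1     # assumed; avoids log2(0) and division by zero

# log2 ratio of every time point against the reference, gene by gene.
log_ratio <- log2((rpkm + pseudocount) / (rpkm[, reference] + pseudocount))

# The reference column is all zeros by construction, so drop it before plotting.
log_ratio <- log_ratio[, colnames(log_ratio) != reference, drop = FALSE]
print(round(log_ratio, 2))

# Heatmap with time points kept in order (no column clustering, no rescaling).
heatmap(log_ratio, Colv = NA, scale = "none",
        col = colorRampPalette(c("blue", "white", "red"))(21),
        margins = c(5, 8))
```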
-
Good morning. Do I need to normalize the data that comes out of the SOLiD analysis software, BioScope?
Thanks
-
graphs and figures
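Any command in R that produces a figure can typically be wrapped so that it writes a pdf, tiff, or jpeg file rather than drawing to the on-screen graphics device. Look into the R help for each output type (?pdf, ?tiff, ?jpeg) for more information.
Here is a really simple pdf wrapper function:
makepdf <- function(x, filename, w, h) {
  pdf(file = filename, width = w, height = h)
  on.exit(dev.off())  # always close the device, even if plotting fails
  force(x)            # lazy evaluation: the plotting call runs here, inside the open device
  invisible(filename)
}
An example of how to use it:
makepdf(plot(1:10), "plot.pdf", 5, 5)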
-
I need to extract the differentially expressed genes shown in the MA-plot generated by DEGseq. Is there software or a script that does this?
Thanks
-
groupings
Originally posted by schmima: "group the genes in a sensible way prior to plotting (e.g. GO terms / gene families / PFAM domains etc.)."
Genes that show no expression at any time point can be removed from the analysis, which sometimes reduces your gene list substantially.
I have also seen analyses that group expression profiles by k-means clustering to try to identify the major themes in the expression.
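As a rough illustration of those two points, the sketch below drops genes with no expression at any time point and then runs k-means on the remaining profiles. The simulated RPKM values, the log2(x + 1) transform, the per-gene standardization, and the choice of two clusters are assumptions, not recommendations.

```r
set.seed(7)
timepoints <- paste0("t", 1:4)

# Hypothetical data: 10 rising genes, 10 falling genes, 5 never-expressed genes.
rising  <- t(replicate(10, c(1, 3, 6, 9) + runif(4)))
falling <- t(replicate(10, c(9, 6, 3, 1) + runif(4)))
silent  <- matrix(0, nrow = 5, ncol = 4)
rpkm <- rbind(rising, falling, silent)
dimnames(rpkm) <- list(paste0("gene", seq_len(nrow(rpkm))), timepoints)

# Remove genes with zero expression at every time point.
expressed <- rpkm[rowSums(rpkm) > 0, , drop = FALSE]
cat("kept", nrow(expressed), "of", nrow(rpkm), "genes\n")

# Log-transform, then standardize each gene so clusters reflect profile shape.
z <- t(scale(t(log2(expressed + 1))))

# k-means with several random starts; k = 2 is assumed for this toy data.
km <- kmeans(z, centers = 2, nstart = 10)
print(split(rownames(z), km$cluster))  # gene names per cluster
print(round(km$centers, 2))            # average profile of each cluster
```

On real data the number of clusters is a judgment call; trying several values of k and looking at the cluster centers is usually more informative than any single run.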
As with most data, I strongly recommend just playing with it, seeing what jumps out at you, and then following up on that. Look closely at the subgroups I mentioned above, and also at transcription factors and tissue-related gene families in the time series.
You can also look at the change in expression rather than the expression values themselves: how does the expression change between points 1 and 2, points 2 and 3, points 1 and 3, and so on?
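A small sketch of looking at changes rather than levels, assuming a hypothetical matrix of log2 expression values with time points in column order:

```r
# Hypothetical log2 expression values: rows = genes, columns = ordered time points.
log_expr <- matrix(
  c(1, 2, 4,
    5, 5, 3),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("t1", "t2", "t3"))
)

last <- ncol(log_expr)

# Change between consecutive time points (t2 - t1, t3 - t2).
step_change <- log_expr[, -1, drop = FALSE] - log_expr[, -last, drop = FALSE]
colnames(step_change) <- paste(colnames(log_expr)[-1],
                               colnames(log_expr)[-last], sep = "_vs_")
print(step_change)

# Overall change from the first to the last time point.
overall_change <- log_expr[, last] - log_expr[, 1]
print(overall_change)
```

Because the values are on a log2 scale, each difference is a log2 fold change between the two time points.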
-
some non-professional answers
The resulting dendrograms for the large sets of gene lists that come out of the next generation sequencing data can be difficult to visualize.
So, in my opinion, I would first think about what I want to show. If I have a time course and am interested in what makes the difference, I would first search for genes or gene sets (grouped in a sensible way, e.g. by function) that show the major differences between the samples, and plot only these. This reduces the amount of data plotted and, in the case of groups, links bare gene names to a term that one understands (e.g. 'ABC transporters' tells me personally more than 'ATXGXXXXX' or a '.' in a picture).
However, this requires some time-course analysis, which is not straightforward (e.g. due to correlation between time points). There is also the question of what is being tested and what you would like to know. I guess there is some helpful literature on time courses and ANOVA (not that you need to use ANOVA, but I think it is a good way to learn some general principles and problems of time-course studies).
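Purely as an illustration of the ANOVA idea, here is a per-gene one-way ANOVA over time points in base R. The data are simulated, three replicates per time point are assumed, and the model treats all samples as independent, so it ignores exactly the between-timepoint correlation mentioned above. Take it as a starting point for exploration rather than a proper time-course test.

```r
set.seed(1)

# Hypothetical design: 3 time points with 3 replicates each.
timepoint <- factor(rep(c("t0", "t1", "t2"), each = 3))

# Simulated log-scale expression: one flat gene, one rising gene.
expr <- rbind(
  flat   = rnorm(9, mean = 5, sd = 0.2),
  rising = rnorm(9, mean = rep(c(2, 5, 8), each = 3), sd = 0.2)
)
colnames(expr) <- paste(timepoint, rep(1:3, times = 3), sep = "_rep")

# One-way ANOVA per gene: does mean expression differ between time points?
anova_p <- function(y) anova(lm(y ~ timepoint))[["Pr(>F)"]][1]
p_values <- apply(expr, 1, anova_p)

# Adjust for testing many genes (Benjamini-Hochberg).
p_adjusted <- p.adjust(p_values, method = "BH")
print(signif(cbind(p_values, p_adjusted), 3))
```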
-
Boxplot-dendrogram
We ran into similar problems when looking at this kind of data. The resulting dendrograms for the large gene lists that come out of next-generation sequencing data can be difficult to visualize. We used both a heatmap approach and a combination of a dendrogram with boxplots over a time series in the paper we just published (RNA-Seq atlas of Glycine max -- http://seqanswers.com/forums/showthread.php?t=6321).
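One way such a dendrogram-plus-boxplot figure could be put together in base R is sketched below. This is a guess at the general idea, not the method from the paper; the simulated data, average linkage, and the cut into two clusters are all assumptions.

```r
set.seed(42)
timepoints <- paste0("t", 1:5)

# Hypothetical data: 20 genes trending up and 20 trending down over 5 time points.
up   <- t(replicate(20, 1:5 + rnorm(5, sd = 0.3)))
down <- t(replicate(20, 5:1 + rnorm(5, sd = 0.3)))
expr <- rbind(up, down)
dimnames(expr) <- list(paste0("gene", 1:40), timepoints)

# Hierarchical clustering of genes, then cut the tree into 2 groups.
tree   <- hclust(dist(expr), method = "average")
groups <- cutree(tree, k = 2)

# Dendrogram on the left, then one panel of per-timepoint boxplots per cluster.
op <- par(mfrow = c(1, 3))
plot(tree, labels = FALSE, main = "Gene dendrogram", xlab = "", sub = "")
for (g in sort(unique(groups))) {
  boxplot(expr[groups == g, , drop = FALSE],
          main = paste("Cluster", g),
          xlab = "Time point", ylab = "Expression")
}
par(op)
```

The boxplots summarize each branch of the tree as a distribution per time point, which stays readable when a branch contains hundreds of genes.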
-
I have a similar task and would be interested in a professional answer. Naively, I'll try HTSeq and DESeq on simple read count data and compare my samples pairwise.
-
RNA-seq, RPKM and heatmap???
I calculated RPKM values from my RNA-seq data. I am trying to cluster the data and explain the gene expression across a time series (along which my samples were taken).
Could anybody recommend a good method for doing this?
I am thinking of log-transforming the RPKM data and then making a heatmap, as is usually done for microarray data. What do you guys think about this?
Thanks
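For reference, the approach described in this question might look roughly like this in base R. The RPKM values are simulated, and the pseudocount of 1 and the row scaling are assumptions; whether this is a good idea for a given dataset is what the rest of the thread discusses.

```r
set.seed(3)

# Hypothetical RPKM matrix: 12 genes x 5 ordered time points.
rpkm <- matrix(rexp(60, rate = 0.05), nrow = 12,
               dimnames = list(paste0("gene", 1:12), paste0("t", 1:5)))

# log2 transform with a pseudocount so that zero RPKM stays finite.
log_rpkm <- log2(rpkm + 1)

# Cluster genes (rows) but keep time points in their original order.
# scale = "row" standardizes each gene so colors show its pattern over time.
heatmap(log_rpkm, Colv = NA, scale = "row",
        col = colorRampPalette(c("blue", "white", "red"))(21),
        xlab = "Time point", ylab = "Gene")
```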