  • gwilymh
    replied
    I think I have found a solution. Cuffdiff actually does output a count for each feature (genes.count_tracking, cds.count_tracking, etc.). I ran Cuffdiff with each sample treated as a separate condition, as opposed to identifying different samples from the same treatment as replicates using the -L/--labels option. This gave me count data for each feature in each sample.
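A minimal sketch of turning such a count_tracking file into a per-sample count table. The column layout used here (a `tracking_id` column plus one `<label>_count` column per condition) and the inline example rows are assumptions for illustration, not real Cuffdiff output, so check the header of your own file first. `read_count_matrix` is a hypothetical helper name.

```python
import csv
import io


def read_count_matrix(handle):
    """Return {feature_id: {label: count}} from a tab-delimited tracking table.

    Assumes (unverified) a 'tracking_id' column and one '<label>_count'
    column per condition; all other columns are ignored.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    count_columns = [name for name in reader.fieldnames if name.endswith("_count")]
    matrix = {}
    for row in reader:
        matrix[row["tracking_id"]] = {
            column[: -len("_count")]: float(row[column]) for column in count_columns
        }
    return matrix


# Hypothetical two-sample example, made up for this sketch.
example_text = (
    "tracking_id\ts1_count\ts1_status\ts2_count\ts2_status\n"
    "geneX\t10.0\tOK\t25.5\tOK\n"
    "geneY\t0.0\tOK\t3.0\tOK\n"
)
counts = read_count_matrix(io.StringIO(example_text))
print(counts)
```

For a real file, the same function could be called with `open("genes.count_tracking")` in place of the `io.StringIO` object.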


  • lucer105
    replied
    Originally posted by ashuchawla
    day 0
    sample 1 - untreated

    day 3
    sample 2 - treated with "A"
    sample 3 - treated with "B"
    sample 4 - treated with "C"

    day 6

    sample 5 - treated with "A"
    sample 6 - treated with "B"
    sample 7 - treated with "C"

    day 9

    sample 8 - treated with "A"
    sample 9 - treated with "B"
    sample 10 - treated with "C"

    They want to see DE across time as well as across treatments. I am just a bioinformatician. Somebody else did the experiments, and I have the RNA-Seq data for them. This is all I have been told.

    Ashu

    I think you need an "untreated" control for every time point; otherwise time becomes an extra variable that is confounded with treatment.
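The point about missing controls can be checked directly from the sample list quoted above. This is only a sketch: the tuples restate the design as posted, and the variable names are made up for illustration.

```python
from collections import Counter

# (sample number, day, treatment), exactly as listed in the quoted post.
samples = [
    (1, 0, "untreated"),
    (2, 3, "A"), (3, 3, "B"), (4, 3, "C"),
    (5, 6, "A"), (6, 6, "B"), (7, 6, "C"),
    (8, 9, "A"), (9, 9, "B"), (10, 9, "C"),
]

days = sorted({day for _, day, _ in samples})
days_with_control = {day for _, day, treatment in samples if treatment == "untreated"}
days_without_control = [day for day in days if day not in days_with_control]

# Number of samples in each (day, treatment) cell of the design.
cell_counts = Counter((day, treatment) for _, day, treatment in samples)

print("Days lacking an untreated control:", days_without_control)
print("Largest number of samples in any cell:", max(cell_counts.values()))
```

As listed, days 3, 6 and 9 have no untreated sample, and every day/treatment cell holds a single sample, so any change relative to day 0 mixes the effect of time with the effect of treatment.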


  • ashuchawla
    replied
    day 0
    sample 1 - untreated

    day 3
    sample 2 - treated with "A"
    sample 3 - treated with "B"
    sample 4 - treated with "C"

    day 6

    sample 5 - treated with "A"
    sample 6 - treated with "B"
    sample 7 - treated with "C"

    day 9

    sample 8 - treated with "A"
    sample 9 - treated with "B"
    sample 10 - treated with "C"

    They want to see DE across time as well as across treatments. I am just a bioinformatician. Somebody else did the experiments, and I have the RNA-Seq data for them. This is all I have been told.

    Ashu

    Originally posted by Simon Anders
    45 comparisons still would only take an hour or so. However, my feeling is that you are doing something fundamentally wrong if you are comparing pairs of samples, rather than pairs of conditions or time points.

    Maybe explain a bit more about the biology of your experiment.


  • Simon Anders
    replied
    45 comparisons still would only take an hour or so. However, my feeling is that you are doing something fundamentally wrong if you are comparing pairs of samples, rather than pairs of conditions or time points.

    Maybe explain a bit more about the biology of your experiment.


  • ashuchawla
    replied
    I have ten samples in total, so 10C2 = 45 pairwise combinations.
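For reference, the count of 45 can be reproduced by enumerating the pairs, and the same enumeration over the four time points (days 0, 3, 6 and 9 from this thread) gives far fewer comparisons. The sample labels below are placeholders.

```python
from itertools import combinations
from math import comb

# Placeholder labels for the ten samples.
samples = [f"sample{i}" for i in range(1, 11)]
sample_pairs = list(combinations(samples, 2))

# Pairing time points instead of individual samples.
days = [0, 3, 6, 9]
day_pairs = list(combinations(days, 2))

print(len(sample_pairs), comb(10, 2))  # pairs of samples
print(len(day_pairs), day_pairs)       # pairs of time points
```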

    Originally posted by Simon Anders
    BTW, what did you mean by "pairwise comparisons would take me a long time"? Can't be more than a few minutes calculation time. The issue is rather: what would the result tell you?


  • ashuchawla
    replied
    I have been told that some genes will be down-regulated for all 9 days, some will be up-regulated for all 9 days, and some will change from down to up or vice versa. I understand that I could get a list of DE genes from pairwise comparisons of my samples across the time points d0, d3, d6 and d9. In my previous project, I categorized genes with a negative log2 fold change in a pairwise comparison as down-regulated and those with a positive one as up-regulated. I wanted a way to do that for this project as well, but if I have to do it pairwise, it will take me a lot of time. I have BAM files for all samples, and I also have the HTSeq counts for all of them. What would be the best next move for me?

    Please let me know if you have any further questions...

    Thanks a million,
    Ashu
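The categorization described above (always down, always up, or switching) can be sketched as a small function over per-gene log2 fold changes relative to day 0. This is only an illustration of the labelling step: the gene names and values are invented, `classify_pattern` is a hypothetical helper, and the sign of a fold change alone says nothing about statistical significance.

```python
def classify_pattern(log2_fold_changes, threshold=0.0):
    """Label a gene from its log2 fold changes versus day 0 (one per later day)."""
    ups = [value > threshold for value in log2_fold_changes]
    downs = [value < -threshold for value in log2_fold_changes]
    if all(ups):
        return "up at all time points"
    if all(downs):
        return "down at all time points"
    if any(ups) and any(downs):
        return "switches direction"
    return "no consistent change"


# Invented log2 fold changes for d3, d6 and d9, each relative to d0.
example = {
    "geneA": [1.2, 0.8, 2.1],
    "geneB": [-0.5, -1.7, -0.9],
    "geneC": [-1.0, 0.3, 1.4],
    "geneD": [0.0, 0.6, 0.0],
}
patterns = {gene: classify_pattern(values) for gene, values in example.items()}
for gene, label in patterns.items():
    print(gene, label)
```

Raising `threshold` (for example to 1.0, a two-fold change) makes the labels stricter; combining them with adjusted p values from a proper test would be a separate step.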


    Originally posted by Simon Anders
    Yes, there have been a lot of updates to DESeq, especially the release of DESeq2.

    And there have always been plenty of methods to analyse time series data. My post above was not to claim that it cannot be done. Rather, it can only be done once you know what it is, i.e., what you actually want.

    You say you "need to know gene regulation pattern across time". What do you mean exactly by "pattern"?

    DESeq is a tool to test for statistical significance of differential expression. You ask a specific question and you get a p value, i.e., a yes/no answer (or rather: a yes / can't say answer). Once you can tell me the yes/no question, I can tell you how to use DESeq for it.

    The issue here is that people keep asking me about once a week "how do you analyse time course data?" but when I ask back "what is your precise question?" I never get an answer.

    Not that I'm surprised: In my experience, analysing time course data is rarely about answering yes/no questions but rather about answering "which?" questions, and hence it is a job not for methods of statistical hypothesis testing but for machine learning.

    Simon


  • Simon Anders
    replied
    BTW, what did you mean by "pairwise comparisons would take me a long time"? Can't be more than a few minutes calculation time. The issue is rather: what would the result tell you?


  • Simon Anders
    replied
    Yes, there have been a lot of updates to DESeq, especially the release of DESeq2.

    And there have always been plenty of methods to analyse time series data. My post above was not to claim that it cannot be done. Rather, it can only be done once you know what it is, i.e., what you actually want.

    You say you "need to know gene regulation pattern across time". What do you mean exactly by "pattern"?

    DESeq is a tool to test for statistical significance of differential expression. You ask a specific question and you get a p value, i.e., a yes/no answer (or rather: a yes / can't say answer). Once you can tell me the yes/no question, I can tell you how to use DESeq for it.

    The issue here is that people keep asking me about once a week "how do you analyse time course data?" but when I ask back "what is your precise question?" I never get an answer.

    Not that I'm surprised: In my experience, analysing time course data is rarely about answering yes/no questions but rather about answering "which?" questions, and hence it is a job not for methods of statistical hypothesis testing but for machine learning.

    Simon


  • ashuchawla
    replied
    Need Help with Time Series analysis of RNA-Seq Data

    Dear Simon or anybody with RNA-Seq data analysis expertise,

    I wanted to ask whether there have been any updates to DESeq, or another tool, since this post that would enable the analysis of RNA-Seq data for samples across time (day 0, day 3, day 6, day 9) without having to do pairwise comparisons. I have 10 samples in total, and pairwise comparisons would take me a long time. I need to know gene regulation pattern across time for these samples, and whether this could be done using all samples at once. Any help would be highly appreciated. I started working on RNA-Seq analysis only a month ago and do not have a lot of experience.

    Thanks
    Ashu



    Originally posted by Simon Anders
    DESeq allows you to perform pairwise comparisons, and, to my knowledge, the same is true for all other tools out there. So, you can pick pairs of time points and compare these. Using GLMs, you can also compare differences between pairs of time points for one drug with differences for another drug or for the untreated controls.

    But which comparisons (contrasts) are useful to analyze data? Figuring this out is, in my opinion, the main challenge of time course data.

    Of course, all these pairwise comparisons are a bit pedestrian if you have more than three or four time points. You might be more interested in curve fits, and this is a very different statistical task, with which I have little experience. I haven't yet seen any such analysis published for RNA-Seq data, but there are lots of papers on microarray time courses. Maybe the article by Hafemeister et al. in the current issue of Bioinformatics is a good starting point. Translating such methods to the RNA-Seq setting is certainly something that needs to be done now.

    S


  • Cole Trapnell
    replied
    For those of you reading this thread that wished to use DESeq with Cufflinks: please check out http://cufflinks.cbcb.umd.edu for release 1.0.0, which includes a major overhaul of replicate support in Cuffdiff. Cuffdiff now models overdispersion of fragment counts at the transcript model, building on ideas introduced by DESeq and edgeR to greatly improve accuracy in calling differentially expressed genes and transcripts.

    Thanks to all the commenters on this thread and elsewhere for helpful feedback!


  • Simon Anders
    replied
    The SCV plot is perfectly fine. The black curve just shows that you have genes with expression strength ranging from 1 count up to 300 or 1000 counts. For genes with more than around 100 counts, the biological noise is very low. For lowly expressed genes, you not only have strong shot noise but also strong biological noise.

    It is a priori surprising that the biological noise should depend so strongly on the gene's expression strength (after all, a coefficient of variation is already normalized for expression strength). However, I've seen this before; it is quite common. We are currently investigating the hypothesis that this may happen whenever the library preparation PCR was started with very low initial cDNA concentration.

    In your case, however, you have good replicability for your stronger genes, i.e., you should get good results.
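To make the distinction between shot noise and biological noise concrete, here is a small sketch. It assumes a negative-binomial-style variance, variance = mean + alpha * mean^2, so that the squared coefficient of variation (SCV) splits into a shot-noise part 1/mean and a biological part alpha. The value of alpha below is hypothetical and, unlike in the data discussed above, is held constant across expression strengths.

```python
def shot_noise_scv(mean_count):
    """SCV of pure Poisson counting noise: variance / mean^2 = 1 / mean."""
    return 1.0 / mean_count


def total_scv(mean_count, biological_scv):
    """Total SCV when variance = mean + biological_scv * mean^2 (assumed model)."""
    return shot_noise_scv(mean_count) + biological_scv


biological_scv = 0.04  # hypothetical value, i.e. a 20% biological CV

for mean_count in (1, 10, 100, 1000):
    print(
        mean_count,
        round(shot_noise_scv(mean_count), 4),
        round(total_scv(mean_count, biological_scv), 4),
    )
```

With these assumptions the shot-noise term dominates for genes with only a few counts and becomes negligible above roughly 100 counts, which is why the high-count end of the plot mostly reflects biological variability.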


  • ecofriendly
    replied
    Not sure if the attachment went through...trying again


  • ecofriendly
    replied
    Hi Simon, Please see the attachment.


  • Simon Anders
    replied
    I really need to change this plot; it seems to confuse people. The black curve (labelled "base mean"), which I suppose you are talking about, does not show SCV values (i.e., the y axis does not apply to it). Rather, it is only there to indicate which expression strengths actually occur in your sample, in order to tell you which parts of the colored curves (which do show SCV values) are of interest.

    Maybe post your plot here, then I can try to clarify it.


  • ecofriendly
    replied
    Dear Simon,

    First of all, thanks for your response and clarification. I see your point regarding Fig. 3 in the paper, which is consistent with my own MvA plots - that at low expression levels, the fold change has to be bigger for DESeq to call that change significant.

    To be clear, I just meant that when making the SCV plot as described in Fig. 1 of the vignette, I get a graph that looks different from the example shown. Basically, my mean curve doesn't follow a bell shape on the left side; instead of tailing off, the SCV values remain high on the left, and it looks as though there is a second, smaller peak. Is this something that other users have seen? How does one explain this?

    Thanks again for your help! It is much appreciated.

    Elena
