
  • Pathway analysis of Differentially expressed genes

    Dear all,
    I am new to the field of RNA-seq analysis. I used TopHat and then Cufflinks/Cuffdiff to get the differentially expressed genes. I had two conditions, wild type and tumor. I now want to do the pathway analysis and determine which genes in the PI3K/AKT/mTOR pathways are up- and down-regulated in the tumor. What should the approach be from here? Also, I have the Cuffdiff output, but I am not sure which of the files I should use for the pathway analysis.
    I am new to this field, so I apologize for the basic nature of the questions.
    Any help will be much appreciated.
    Thanks,
    Himanshu Sharma.

  • #2
    Hi Himanshu,
    We've done similar work ourselves.
    What we did was map each differentially expressed gene onto its respective Enzyme Commission (EC) accession. Doing so was quite helpful because, as you know, KEGG uses KO and/or EC accessions in its pathways. KEGG also has mTOR signaling; you noted you're interested in that pathway.
    You could then use tools such as KeggAtlas, DAVID, Paice, SubPathwayMiner... just to name a few. These tools interface your dataset with KEGG pathways.
    If the tools above do not do what you're looking for, you may want to build your own custom solution via the KEGG web-service API... it's well documented and available for many popular programming languages.
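A minimal Python sketch of what such a custom solution might look like, assuming the gene-to-pathway links have already been downloaded as tab-separated text (one "gene, pathway" pair per line, which is the layout the current KEGG REST `link` operation returns; the web-service API mentioned above may differ). The function name, the sample lines, and the gene IDs are illustrative and not from this thread; `mmu04150` and `mmu04151` are assumed to be the mouse mTOR and PI3K-Akt signaling maps and should be checked on the KEGG site.

```python
from collections import defaultdict


def genes_by_pathway(link_text, de_genes, pathways):
    """Group DE genes by the pathways of interest they belong to.

    link_text: lines of "<gene_id><TAB><pathway_id>"
    de_genes:  set of gene IDs in the same namespace as link_text
    pathways:  set of pathway IDs to keep
    """
    hits = defaultdict(set)
    for line in link_text.splitlines():
        line = line.strip()
        if not line:
            continue
        gene, pathway = line.split("\t")
        if gene in de_genes and pathway in pathways:
            hits[pathway].add(gene)
    # Sort for stable, readable output.
    return {pathway: sorted(genes) for pathway, genes in hits.items()}


# Illustrative input only; real links would be downloaded from KEGG.
SAMPLE_LINKS = "\n".join([
    "mmu:56717\tpath:mmu04150",
    "mmu:11651\tpath:mmu04150",
    "mmu:11651\tpath:mmu04151",
    "mmu:99999\tpath:mmu00010",
])

if __name__ == "__main__":
    de_genes = {"mmu:56717", "mmu:11651"}            # hypothetical DE gene IDs
    wanted = {"path:mmu04150", "path:mmu04151"}      # assumed mTOR / PI3K-Akt maps
    for pathway, genes in genes_by_pathway(SAMPLE_LINKS, de_genes, wanted).items():
        print(pathway, ", ".join(genes))
```

The gene IDs in the DE list have to be in the same namespace as the link table (here KEGG-style `mmu:` plus Entrez Gene ID), so gene symbols would need converting first.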
    Hope this was helpful.
    Last edited by phoss; 04-16-2012, 05:07 PM.


    • #3
      Dear phoss,
      Thanks a lot for the reply. It is very helpful indeed. However, I wanted to know how you mapped each DE gene to its Enzyme Commission accession. I know I need EC because it is compatible with KEGG, whereas GO gives me UniProt IDs. So what exactly did you use? I would be really grateful if you could give me some leads.
      Thanks,
      Himanshu Sharma.


      • #4
        Hi Himanshu,
        Glad it was helpful.
        We obtained accession-to-EC mappings via BioMart. If, however, such mappings do not exist for your model, you could use UniProt by running a UniProt BLAST against your DEGs. This is handy because a good number of their accessions have ECs. EBI and GOA are good resources too.

        In case you're wondering, we had good results with DAVID and the KEGG API (both Java and Python), as well as with KEGGAnnotator, KEGGanim and paice. PathRender in R is pretty neat too. I actually developed paice ~2 yrs ago, but it's always good to try other tools and use the one best suited for the job. We developed paice to help with gene-family visualization, since some ECs occur in multiple copies yet have different expression values.

        KEGG tools have been exhaustively studied so you'll have access to many tools / resources. Have you checked out other databases such as Metacyc or Reactome?


        • #5
          Dear Phoss,
          Thanks again for your help. It is really useful. I have a few more questions, if you don't mind. I have approximately 1200 significant genes, and my model is mouse (Mus musculus). Is there a way to get the ECs of all 1200 genes at once? Doing it one by one would be long and tedious. Also, could you guide me a bit on how to obtain the EC from the gene name via BioMart?
          Thanks again for your help. I really appreciate it.
          Thanks,
          Himanshu Sharma.


          • #6
            Hi Himanshu,
            I personally have not worked with M. musculus, but I recall that a well-known mouse resource is 'Mouse Genome Informatics' (MGI).
            You could use MGI BioMart (under Analysis Tools) to mine out GO mappings for your DEGs and then map those GO accessions against the GO->EC mappings from http://www.geneontology.org/external2go/ec2go
            Please correct me if I'm wrong, but I did not see any EC retrieval option at MGI BioMart. If you have GO accessions, you could easily cross-link them with EC accessions nonetheless.
            All of the above info is pretty much covered on the EBI-GOA page: http://www.ebi.ac.uk/GOA/downloads.html
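The GO->EC cross-linking step can be done for all genes in one pass. Here is a rough Python sketch, assuming the ec2go file uses the usual external2go layout (`EC:x.x.x.x > GO:term name ; GO:0000000`, with comment lines starting with `!`) and that you already have a gene-to-GO table exported from BioMart. The function names and the sample gene-to-GO dictionary are made up for illustration.

```python
def parse_ec2go(text):
    """Return {GO accession: set of EC numbers} from ec2go-style text."""
    go_to_ec = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!"):   # skip blanks and header comments
            continue
        left, _, right = line.partition(" > ")
        ec = left.strip()
        go_id = right.rsplit(";", 1)[-1].strip()  # accession follows the last ';'
        if ec.startswith("EC:") and go_id.startswith("GO:"):
            go_to_ec.setdefault(go_id, set()).add(ec)
    return go_to_ec


def genes_to_ec(gene_to_go, go_to_ec):
    """Map each gene to the EC numbers reachable through its GO terms."""
    result = {}
    for gene, go_ids in gene_to_go.items():
        ecs = set()
        for go_id in go_ids:
            ecs |= go_to_ec.get(go_id, set())
        result[gene] = sorted(ecs)
    return result


# Illustrative lines in the assumed ec2go layout.
SAMPLE_EC2GO = "\n".join([
    "! example header comment",
    "EC:1.1.1.1 > GO:alcohol dehydrogenase (NAD+) activity ; GO:0004022",
    "EC:2.7.11.1 > GO:protein serine/threonine kinase activity ; GO:0004674",
])

if __name__ == "__main__":
    # Hypothetical gene -> GO table, e.g. exported from BioMart.
    gene_to_go = {"Akt1": ["GO:0004674"], "GeneX": ["GO:0000000"]}
    print(genes_to_ec(gene_to_go, parse_ec2go(SAMPLE_EC2GO)))
```

Genes whose GO terms have no EC equivalent come back with an empty list, which makes it easy to see how much of the DE list is actually covered.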


            • #7
              I am working with a chicken RNA-seq dataset. I completed my analysis with Cuffdiff and have a list of DE genes. How can I go about finding out whether these genes are involved in a pathway? Can I use the diff_gene file from Cuffdiff for pathway analysis?


              • #8
                Hi Phoss,
                Thank you for your guidance on pathway analysis of my differentially expressed RNA-seq genes. However, as a beginner in pathway analysis, I would like to know the details of the protocol. Could you please give me a link to some study material so that I can study and follow it?

                I need to learn every step.
                Thanks again.


                • #9
                  Has anybody got an answer? I am also waiting.


                  • #10
                    If you know R, the GAGE/Pathview workflow in Bioconductor can do the pathway analysis with your data. It works for both RNA-seq and microarray data. No mapping to EC is needed, and no pre-selection or filtering of genes either.

                    Here is an example workflow: https://stat.ethz.ch/pipermail/bioco...ly/054021.html
                    The packages are available at:
                    GAGE is a published method for gene set (enrichment or GSEA) or pathway analysis. GAGE is generally applicable independent of microarray or RNA-Seq data attributes including sample sizes, experimental designs, assay platforms, and other types of heterogeneity, and consistently achieves superior performance over other frequently used methods. In the gage package, we provide functions for basic GAGE analysis, result processing and presentation. We have also built pipeline routines for multiple GAGE analyses in a batch, comparison between parallel analyses, and combined analysis of heterogeneous data from different sources/studies. In addition, we provide demo microarray data and commonly used gene set data based on KEGG pathways and GO terms. These functions and data are also useful for gene set analysis using other methods.

                    Pathview is a tool set for pathway-based data integration and visualization. It maps and renders a wide variety of biological data on relevant pathway graphs. All users need to do is supply their data and specify the target pathway. Pathview automatically downloads the pathway graph data, parses the data file, maps user data to the pathway, and renders the pathway graph with the mapped data. In addition, Pathview also seamlessly integrates with pathway and gene set (enrichment) analysis tools for large-scale and fully automated analysis.

                    Some example graphic outputs here:
                    Last edited by bigmw; 08-30-2013, 06:18 AM.


                    • #11
                      Hi charitra & jp,

                      Once you have your list of up- or down-regulated genes from either RNA-seq or microarray,
                      you can go to DAVID,
                      paste your list of genes under the Start Analysis option,
                      select the identifier type (usually official gene symbol) and proceed with the analysis.
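For anyone starting from Cuffdiff output, here is a hedged Python sketch of how the up- and down-regulated gene lists might be pulled out before pasting them into DAVID. It assumes the standard tab-delimited `gene_exp.diff` columns (`gene`, `log2(fold_change)`, `significant`, among others) and that a positive log2 fold change means higher expression in `sample_2`; the function name and the three sample rows are invented for illustration.

```python
import csv
import io
import math


def split_up_down(diff_text):
    """Return (up, down) gene-symbol lists for rows flagged significant."""
    up, down = [], []
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    for row in reader:
        if row["significant"] != "yes":
            continue
        log2fc = float(row["log2(fold_change)"])  # "inf"/"-inf" parse fine
        if math.isnan(log2fc) or log2fc == 0:
            continue
        if log2fc > 0:
            up.append(row["gene"])
        else:
            down.append(row["gene"])
    return up, down


# Made-up rows in the assumed gene_exp.diff layout.
HEADER = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2",
          "status", "value_1", "value_2", "log2(fold_change)", "test_stat",
          "p_value", "q_value", "significant"]
ROWS = [
    ["XLOC_1", "XLOC_1", "GeneA", "chr1:1-100", "WT", "Tumor", "OK",
     "10", "40", "2", "3.1", "0.0001", "0.001", "yes"],
    ["XLOC_2", "XLOC_2", "GeneB", "chr1:200-300", "WT", "Tumor", "OK",
     "40", "10", "-2", "-3.1", "0.0001", "0.001", "yes"],
    ["XLOC_3", "XLOC_3", "GeneC", "chr2:1-100", "WT", "Tumor", "OK",
     "10", "11", "0.1375", "0.2", "0.6", "0.8", "no"],
]
SAMPLE_DIFF = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS)

if __name__ == "__main__":
    up, down = split_up_down(SAMPLE_DIFF)
    print("up:", "\n".join(up))
    print("down:", "\n".join(down))
```

With a real file, `open("gene_exp.diff").read()` would replace `SAMPLE_DIFF`, and each printed list can be pasted into DAVID as a separate submission.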

                      Good luck


                      • #12
                        Dear bigmw and vishnuamaram,
                        Thank you for your kind guidance. However, I am not able to use either of the programs (pathview and DAVID). For pathview, I followed the tutorial successfully, but it did not work with my sample.
                        My main problem is that I have the RNA-seq cuff_data_diff_gene.txt or the original files, and I cannot upload them into either program.
                        Any help?


                        • #13
                          You haven’t really followed the GAGE/pathview workflow for RNA-seq data, where you don’t really need Cufflinks/Cuffdiff, just map the raw reads using tophat:


                          I would suggest you go through that first. You can’t input arbitrary data and ask the programs to figure it out for you. You may also want to read the GAGE/pathview documentation first.
                          I am not familiar with Cufflinks/Cuffdiff and its output. I assume it is a list of significant genes with a certain p-value cutoff. To be able to use GAGE/pathview, you will need to:
                          1. have a list of all genes (usually thousands or tens of thousands of entries) with their differential expression scores, such as fold change, t-statistics etc.
                          2. read that list into R as a vector or a 1-column matrix named with gene IDs (Entrez Gene, gene symbol etc). You may want to check the functions “read.delim” or “read.table” for how to do this.
                          3. make sure you know enough R.

                          You may want to collaborate with a bioinformatician if you are not sure how to do these.
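As a rough illustration of steps 1 and 2, here is the same idea sketched in Python rather than R: build a gene-ID-to-score mapping for all testable genes, not just the significant ones. It assumes the Cuffdiff `gene_exp.diff` layout with `status` and `log2(fold_change)` columns; the function name, the `id_column` parameter, and the sample rows are hypothetical. The resulting mapping is only the analogue of the named vector described above; GAGE itself would still be run in R.

```python
import csv
import io
import math


def fold_changes(diff_text, id_column="gene"):
    """Return {gene ID: log2 fold change} for every tested gene.

    Rows whose status is not "OK" and rows with a non-finite fold
    change are dropped. If an ID occurs twice, the last row wins.
    """
    scores = {}
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    for row in reader:
        if row["status"] != "OK":
            continue
        value = float(row["log2(fold_change)"])
        if math.isfinite(value):
            scores[row[id_column]] = value
    return scores


# Made-up rows; only the columns used here are included.
SAMPLE_DIFF = "\n".join([
    "gene\tstatus\tlog2(fold_change)\tsignificant",
    "GeneA\tOK\t2\tyes",
    "GeneB\tNOTEST\t0\tno",
    "GeneC\tOK\t0.1375\tno",
    "GeneD\tOK\tinf\tyes",
])

if __name__ == "__main__":
    for gene, score in fold_changes(SAMPLE_DIFF).items():
        # Two-column, tab-separated output that R could read back in.
        print(f"{gene}\t{score}")
```

Note that GeneC is kept even though it is not significant: the point of step 1 is a score for every gene, with no pre-selection.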
