  • rvaerle
    Junior Member
    • Oct 2011
    • 6

    Some cummeRbund questions

    I worked my way through the cummeRbund manual and so far I'm really impressed with it. However, I have a few questions as I'm new to R and I couldn't find the solution to these in the manual:

    1). After finding similarly expressed genes using:

    mySimilar<-findSimilar(cuff,"PINK1",n=20)

    how do I write the expression data (with statistics) for these genes to a file?

    2). How can I produce a heatmap of all differentially expressed genes?

    3). I have a list of gene IDs in a text file - how can I produce a heatmap of these?
  • rvaerle
    Junior Member
    • Oct 2011
    • 6

    #2
    I managed to create a heatmap for all differentially expressed genes (2 above) using a post by Loyal elsewhere in this forum:

    cuff <- readCufflinks()

    #Retrieve significant gene IDs (XLOC) with a pre-specified alpha
    diffGeneIDs <- getSig(cuff,level="genes",alpha=0.05)

    #Use returned identifiers to create a CuffGeneSet object with all relevant info for given genes
    diffGenes<-getGenes(cuff,diffGeneIDs)

    h<-csHeatmap(diffGenes,cluster='both')

    However, I'm still struggling with writing similarly expressed genes to a text file and with importing a list of gene IDs from a file. I would appreciate it if someone could point me in the right direction. Thanks!


    • lgoff
      Member
      • Feb 2008
      • 82

      #3
      Originally posted by rvaerle:
      I worked my way through the cummeRbund manual and so far I'm really impressed with it. However, I have a few questions as I'm new to R and I couldn't find the solution to these in the manual:

      1). After finding similarly expressed genes using:

      mySimilar<-findSimilar(cuff,"PINK1",n=20)

      how do I write the expression data (with statistics) for these genes to a file?

      2). How can I produce a heatmap of all differentially expressed genes?

      3). I have a list of gene IDs in a text file - how can I produce a heatmap of these?
      Hi rvaerle,
      Here's how you can do this with cummeRbund:

      1)
      Code:
      > write.table(diffData(mySimilar),"mySimilar.diff")
      > write.table(fpkm(mySimilar),"mySimilar.fpkm")
      > write.table(features(mySimilar),"mySimilar.features")
      2) It seems you got this one working already

      3)
      Code:
      myIDs<-read.table("Ids.txt")
      myGenes<-getGenes(cuff,myIDs)
      heat<-csHeatmap(myGenes)
      Please let me know if you need any additional help!

      Cheers,
      Loyal
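By default write.table separates fields with spaces and quotes strings, which can be awkward to open in a spreadsheet. Below is a minimal sketch of a tab-delimited export. It assumes a small made-up data frame (similar_fpkm, a hypothetical stand-in for whatever fpkm(mySimilar) returns, with illustrative column names) so that it runs without a cuffdiff database:

```r
# Hypothetical stand-in for fpkm(mySimilar); the real columns may differ.
similar_fpkm <- data.frame(
  gene_id     = c("XLOC_000001", "XLOC_000002"),
  sample_name = c("A", "A"),
  fpkm        = c(12.5, 0.8),
  stringsAsFactors = FALSE
)

# Write to a temporary location so the example leaves no files behind.
out_file <- file.path(tempdir(), "mySimilar.fpkm.txt")

# sep = "\t" writes tab-delimited text; quote = FALSE and row.names = FALSE
# drop the quotes and the row-name column that write.table adds by default.
write.table(similar_fpkm, out_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back to confirm that it round-trips.
check <- read.table(out_file, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
print(check)
```

The same three arguments should work for the diffData and features tables as well, since write.table only needs a data frame.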


      • rvaerle
        Junior Member
        • Oct 2011
        • 6

        #4
        Many thanks for this, Loyal! It looks so easy after seeing the code... Also, thanks for this great package!

        BW
        Ronny


        • waspboyz
          Junior Member
          • May 2012
          • 3

          #5
          Hello, this thread has already been extremely useful to me (as another novice in R and bioinformatics in general). I am also using the findSimilar function and have found some interesting results. My problem is that it mostly returns genes that cuffdiff deemed not significant (in general these genes are very lowly expressed, and any patterns detected are likely to be noise, I guess). If I increase the number of loci to return, I get more of these noisy genes but also more genes that might be "real" hits.
          So my question is whether there is any way to exclude the non-significant genes in the first place. Or can anyone suggest an efficient way to sort through the noise?
          Thanks,
          Jeremy


          • lgoff
            Member
            • Feb 2008
            • 82

            #6
            Originally posted by waspboyz:
            Hello, this thread has already been extremely useful to me (as another novice in R and bioinformatics in general). I am also using the findSimilar function and have found some interesting results. My problem is that it mostly returns genes that cuffdiff deemed not significant (in general these genes are very lowly expressed, and any patterns detected are likely to be noise, I guess). If I increase the number of loci to return, I get more of these noisy genes but also more genes that might be "real" hits.
            So my question is whether there is any way to exclude the non-significant genes in the first place. Or can anyone suggest an efficient way to sort through the noise?
            Thanks,
            Jeremy
            Hi waspboyz,
            findSimilar and getSig are essentially asking two very different questions of the data. Since different restrictions and filters can be applied to each question, the best way to find the set of genes you are interested in may be to generate a list of significant genes, generate a second list of similar genes, and then use intersect() to find the genes common to both lists.

            Cheers,
            Loyal
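The intersect() step can be sketched in base R as follows. The two ID vectors are made up for illustration. In a real session the first would come from getSig(), and the second would be the IDs pulled out of the findSimilar() result (how best to extract those is left open here as an assumption):

```r
# Hypothetical ID vectors standing in for real results, e.g.
#   sigIDs     <- getSig(cuff, level = "genes", alpha = 0.05)
#   similarIDs <- gene IDs extracted from findSimilar(cuff, "PINK1", n = 100)
sigIDs     <- c("XLOC_000003", "XLOC_000010", "XLOC_000042")
similarIDs <- c("XLOC_000001", "XLOC_000010", "XLOC_000042", "XLOC_000077")

# intersect() keeps only the IDs that appear in both vectors.
sigAndSimilar <- intersect(sigIDs, similarIDs)
print(sigAndSimilar)

# The resulting character vector could then be passed on,
# e.g. getGenes(cuff, sigAndSimilar), to build a gene set for plotting.
```

Asking findSimilar for a larger n before intersecting gives the significance filter more candidates to work with.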


            • wangli
              Member
              • Apr 2012
              • 48

              #7
              error in cummeRbund

              Hi, I found this post very helpful. I am struggling with cummeRbund now. I tried some of the code listed here and ran into some errors.

              > cuff_data <- readCufflinks('diff_out')
              > csDensity(genes(cuff_data))
              Error in dat$fpkm + pseudocount : non-numeric argument to binary operator

              > diffGeneIDs <- getSig(cuff_data, level="genes", alpha=0.05)
              > diffGenes <- getGenes(cuff_data, diffGeneIDs)
              Error in sqliteExecStatement(conn, statement, ...) :
              RS-DBI driver: (RS_SQLite_exec: could not execute1: cannot start a transaction within a transaction)

              > sessionInfo()
              R version 2.15.0 (2012-03-30)
              Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

              locale:
              [1] C

              attached base packages:
              [1] stats graphics grDevices utils datasets methods base

              other attached packages:
              [1] cummeRbund_1.2.0 reshape2_1.2.1 ggplot2_0.9.1 RSQLite_0.11.1
              [5] DBI_0.2-5 BiocInstaller_1.4.6

              loaded via a namespace (and not attached):
              [1] MASS_7.3-18 RColorBrewer_1.0-5 colorspace_1.1-1 dichromat_1.2-4
              [5] digest_0.5.2 grid_2.15.0 labeling_0.1 memoise_0.1
              [9] munsell_0.3 plyr_1.7.1 proto_0.3-9.2 scales_0.2.1
              [13] stringr_0.6 tools_2.15.0

              Does anyone have any idea how to get rid of these errors?
              Thanks
              Li


              • RUlearner
                Junior Member
                • Jun 2012
                • 1

                #8
                Hi all,

                I am getting the same error as Li when trying to run csDensity with cummeRbund. Any suggestions?

                > epi <- readCufflinks( )
                > epi
                CuffSet instance with:
                4 samples
                25078 genes
                42661 isoforms
                30261 TSS
                24612 CDS
                150468 promoters
                181566 splicing
                121338 relCDS
                > dens <- csDensity(genes(epi))
                Error in dat$fpkm + pseudocount : non-numeric argument to binary operator

                Thanks so much


                • lgoff
                  Member
                  • Feb 2008
                  • 82

                  #9
                  Version info...

                  Hi wangli and RUlearner,
                  Can you provide me with version information for both cuffdiff and cummeRbund so that I can try to help? Also, if either of you wants to email me your cuffData.db file so that I can reproduce the error, I can try to see what's going on.

                  Cheers,
                  Loyal


                  • Starr_Hazard
                    Member
                    • Nov 2010
                    • 19

                    #10
                    Reading in a table of Gene names for cummeRbund analysis

                    I want to examine the cuffdiff/cummeRbund analyses for a list of genes that other RNA-seq software has flagged as significant in my data.

                    I made both plain-text and CSV tables of the gene names, for example:

                    Fcrls
                    Ndst4
                    Prokr2
                    Grm2
                    Snap25

                    Then I use read.table to load the names into R/cummeRbund, as suggested by Loyal:

                    Code:
                    myIDs<-read.table("Ids.txt")
                    myGenes<-getGenes(cuff,myIDs)
                    heat<-csHeatmap(myGenes)
                    The heat map is never produced and there are no errors.

                    When I then examine the CuffGeneSet "myGenes", it contains only a single gene (the first one in the list):

                    myGeneFeatureNames<-featureNames(myGenes)

                    The object gives only the XLOC and gene_short_name for the first entry of the list.

                    If I run

                    print(myIDs)

                    I see every entry, so read.table works (and so does read.csv).

                    I can, however, use this syntax

                    myGeneID<-myIDs[[i,1]]
                    myGene<-getGene(cuff,myGeneID)

                    to get every gene one at a time and I can make isoform plots for example.

                    What more can I do to get a list of gene names loaded as a CuffGeneSet?


                    • ftorri
                      Member
                      • Oct 2010
                      • 11

                      #11
                      Hi all,

                      I am having the same problem as described in the last post: I can see all my genes when I run:

                      print(myIDs)

                      Then, if I produce the heatmap, it shows only the first gene.

                      Has anyone found a solution yet? Thanks!

                      Federica


                      • lgoff
                        Member
                        • Feb 2008
                        • 82

                        #12
                        Hi Starr and Fed,

                        myIDs must be a vector of gene IDs. If you are reading these in with read.table, then myIDs is most likely a data.frame. The following can confirm this:

                        is.vector(myIDs)

                        If this is FALSE, then getGenes will not read the list of identifiers correctly, and will most likely only use the first gene. You can probably do something like the following:

                        myIDs<-as.vector(myIDs$v1)

                        This coerces the first column of the data.frame to a vector.

                        Cheers,
                        Loyal
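As an alternative to coercing a data.frame, a file with one ID per line can be read straight into a character vector with base R's readLines(). This swaps in a different function from the read.table approach above. The sketch below assumes such a file and writes a small example to a temporary location, using gene names from earlier in the thread:

```r
# Create a small example ID file, one gene name per line.
id_file <- file.path(tempdir(), "Ids.txt")
writeLines(c("Fcrls", "Ndst4", "Prokr2", "Grm2", "Snap25"), id_file)

# readLines() returns one character string per line, so no data.frame is involved.
myIDs <- readLines(id_file)

# Drop surrounding whitespace and any blank lines.
myIDs <- trimws(myIDs)
myIDs <- myIDs[nzchar(myIDs)]

# Confirm that the result is a plain character vector of the expected length.
print(is.vector(myIDs))
print(length(myIDs))
```

A vector read this way could then be handed to getGenes(cuff, myIDs) as in the earlier posts.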


                        • ftorri
                          Member
                          • Oct 2010
                          • 11

                          #13
                          Hi Loyal,

                          Here is what I got:

                          > is.vector(myIDs)
                          [1] FALSE
                          > myIDs<-as.vector(myIDs$v1)

                          But it still isn't a vector. Am I doing something wrong?

                          > is.vector(myIDs)
                          [1] FALSE

                          Fed


                          • ftorri
                            Member
                            • Oct 2010
                            • 11

                            #14
                            Actually, when I do it, myIDs becomes 'empty' (NULL), and the list disappears:

                            > myIDs<-as.vector(myIDs$v1)
                            > is.vector(myIDs)
                            [1] FALSE
                            > myIDs
                            NULL

                            Fed
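A NULL result here is consistent with R's case-sensitive column lookup: read.table names unnamed columns V1, V2, ... with an upper-case V, so myIDs$v1 matches no column and returns NULL, and is.vector(NULL) is FALSE. A minimal sketch, assuming a small example file with no header line:

```r
# Create a small example ID file, one gene name per line, no header.
id_file <- file.path(tempdir(), "Ids.txt")
writeLines(c("Fcrls", "Ndst4", "Prokr2"), id_file)

# stringsAsFactors = FALSE keeps the names as character strings in any R version.
myIDs <- read.table(id_file, stringsAsFactors = FALSE)

# Show the default name that read.table gave the first column.
print(names(myIDs))

# Lower-case "v1" does not match any column, so $ returns NULL.
print(is.null(myIDs$v1))

# Upper-case V1 (or the positional form myIDs[[1]]) returns the column itself.
idVector <- as.vector(myIDs$V1)
print(is.vector(idVector))
print(idVector)
```

Indexing by position with myIDs[[1]] avoids depending on the column name at all.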


                            • ftorri
                              Member
                              • Oct 2010
                              • 11

                              #15
                              Hi,

                              trying to get the heatmap of the significant genes:

                              > diffGeneIDs <- getSig(cuff_data,level="genes",alpha=0.0001)
                              > diffGenes<-getGenes(cuff_data,diffGeneIDs)
                              > h<-csHeatmap(diffGenes,cluster='both')
                              > h

                              I get back a heatmap in which every gene name is preceded by NA|, but my gene.diff file doesn't have any NA fields.

                              Fed
