  • kelseyca
    Member
    • May 2013
    • 12

    Help with cuffdiff and cummeRbund?

    Hi all!

    Sorry to bother you with a simple question. I have read through all the cummeRbund posts and tutorials, but I seem to be stuck right at the start!

    I have run RNA-seq analyses on Galaxy online (TopHat, Cufflinks, Cuffmerge, and Cuffdiff). I would now like to visualize the results in cummeRbund. I downloaded the Cuffdiff files (11 each for two groups) from Galaxy, and they are in .tabular format. I installed cummeRbund and ran the following. It does not work. Could the issue be that the files should be in .db format? I don't know where the cuffData.db file came from: it appeared before I had even downloaded the Cuffdiff files.

    > source("http://bioconductor.org/biocLite.R")
    > biocLite("cummeRbund")
    > getwd()
    > setwd("C:/Users/caetano1/Downloads/SEDENTARYDFF")
    > list.files()
    > library(cummeRbund)
    > cuff= readCufflinks (dbFile = "cuffData.db",
    + geneFPKM = "CuffdiffSEDENTARY__gene_FPKM_tracking.tabular",
    + geneDiff = "CuffdiffSEDENTARY__gene_differential_expression_testing.tabular",
    + isoformFPKM = "CuffdiffSEDENTARY__transcript_FPKM_tracking.tabular",
    + isoformDiff = "CuffdiffSEDENTARY__transcript_differential_expression_testing.tabular",
    + TSSFPKM = "CuffdiffSEDENTARY__TSS_groups_FPKM_tracking.tabular",
    + TSSDiff = "CuffdiffSEDENTARY__TSS_groups_differential_expression_testing.tabular",
    + CDSFPKM = "CuffdiffSEDENTARY__CDS_FPKM_tracking.tabular",
    + CDSExpDiff = "CuffdiffSEDENTARY__CDS_FPKM_differential_expression_testing.tabular",
    + CDSDiff = "CuffdiffSEDENTARY__CDS_overloading_diffential_expression_testing.tabular",
    + promoterFile = "CuffdiffSEDENTARY__promoters_differential_expression_testing.tabular",
    + splicingFile = "CuffdiffSEDENTARY__splicing_differential_expression_testing.tabular",
    + rebuild = T)

    I think I'm missing something really obvious here!

    Thank you so much!

    Kelsey
  • muthu545
    Member
    • Jul 2011
    • 32

    #2
    Hi kelseyca,

    cuffData.db is the database file that cummeRbund creates to store all the Cuffdiff results in an easy-to-access format for its R commands.

    So if you run the readCufflinks(dbFile = "cuffData.db", ...) command before the Cuffdiff files have been placed in the directory, a default cuffData.db file will still be created.

    Hope this helps

    Thanks
    --
    Muthu


    • kelseyca
      Member
      • May 2013
      • 12

      #3
      Muthu,

      Thanks for your reply! So, how can I get R to read my cuffdiff files? Are they in the wrong format?

      Kelsey


      • sazz
        Member
        • Oct 2012
        • 28

        #4
        Output files should look like this:

        genes.read_group_tracking
        genes.fpkm_tracking
        genes.count_tracking
        gene_exp.diff

        I would guess that you need to drop the ".tabular" extension and rename the files so that they match these names.
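
The renaming suggested above can be sketched in base R. This is only an illustration under stated assumptions: `rename_galaxy_outputs` and the `galaxy_to_cuffdiff` lookup are hypothetical names, not part of cummeRbund; the Galaxy-side names are taken from the file names shown in this thread; and the target names are assumed to be the ones a local cuffdiff run writes.

```r
# Hypothetical lookup (not part of cummeRbund): Galaxy download name (without
# prefix and ".tabular") -> file name assumed for a local cuffdiff run.
galaxy_to_cuffdiff <- c(
  gene_FPKM_tracking                            = "genes.fpkm_tracking",
  gene_differential_expression_testing          = "gene_exp.diff",
  transcript_FPKM_tracking                      = "isoforms.fpkm_tracking",
  transcript_differential_expression_testing    = "isoform_exp.diff",
  TSS_groups_FPKM_tracking                      = "tss_groups.fpkm_tracking",
  TSS_groups_differential_expression_testing    = "tss_group_exp.diff",
  CDS_FPKM_tracking                             = "cds.fpkm_tracking",
  CDS_FPKM_differential_expression_testing      = "cds_exp.diff",
  # "diffential" is spelled this way in the downloaded file name itself.
  CDS_overloading_diffential_expression_testing = "cds.diff",
  promoters_differential_expression_testing     = "promoters.diff",
  splicing_differential_expression_testing      = "splicing.diff"
)

# Hypothetical helper: copy each Galaxy file that exists in `dir` to its
# assumed cuffdiff name and report what was copied.
rename_galaxy_outputs <- function(dir, prefix, mapping = galaxy_to_cuffdiff) {
  from <- file.path(dir, paste0(prefix, names(mapping), ".tabular"))
  to <- file.path(dir, unname(mapping))
  present <- file.exists(from)
  # Copy rather than rename so the original downloads are left untouched.
  copied <- file.copy(from[present], to[present], overwrite = TRUE)
  data.frame(from = basename(from[present]), to = basename(to[present]),
             copied = copied, stringsAsFactors = FALSE)
}
```

For the files in this thread, the call would presumably be `rename_galaxy_outputs("C:/Users/caetano1/Downloads/SEDENTARYDFF", "CuffdiffSEDENTARY__")`. Copying instead of renaming keeps the Galaxy downloads available in case the mapping needs adjusting.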


        • muthu545
          Member
          • Jul 2011
          • 32

          #5
          Kelsey,

          As Sazz mentioned, the output files from cuffdiff do not normally carry the .tabular extension.
          Please check the names of your cuffdiff output files. If they do not match the names given in the readCufflinks command, R will not recognize them.

          The simplest approach is to copy all the cuffdiff output files into one directory and run the following command.

          cuff <- readCufflinks(dbFile = "cuffData.db", dir = "C:/Users/caetano1/Downloads/SEDENTARYDFF",
          gtfFile = "DIRPATH/gtffile", genome = "genomename", rebuild = TRUE)

          This command finds all the files required to build the database, so you need not specify them individually.

          The GTF file is needed for some of the visualization commands in cummeRbund.

          Hope this is helpful

          Thanks
          --
          Muthu
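
Before pointing readCufflinks at a directory, it may help to confirm that the directory actually holds files with the expected names. The sketch below is a hedged illustration: `missing_cuffdiff_files` is a hypothetical helper, and the list of names is an assumption based on a standard local cuffdiff run, not something read from cummeRbund itself.

```r
# Assumed default cuffdiff output names (a subset; adjust to your run).
expected_cuffdiff_files <- c(
  "genes.fpkm_tracking", "gene_exp.diff",
  "isoforms.fpkm_tracking", "isoform_exp.diff",
  "tss_groups.fpkm_tracking", "tss_group_exp.diff",
  "cds.fpkm_tracking", "cds_exp.diff",
  "cds.diff", "promoters.diff", "splicing.diff"
)

# Hypothetical helper: return the expected file names that are absent from `dir`.
missing_cuffdiff_files <- function(dir, expected = expected_cuffdiff_files) {
  expected[!file.exists(file.path(dir, expected))]
}

# Example (commented out because it depends on a local path):
# missing_cuffdiff_files("C:/Users/caetano1/Downloads/SEDENTARYDFF")
```

An empty result suggests the directory is ready; any names returned are files that still need to be copied or renamed.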


          • kelseyca
            Member
            • May 2013
            • 12

            #6
            Hi Muthu,

            One last question. Sorry if I am missing something very obvious here and wasting your time. Thank you so much for being so patient and for all of your help.

            I cannot figure out how to export Cuffdiff files from Galaxy online in any format other than .tabular. I am just clicking "download" under the Cuffdiff run. All the manuals and FAQs I have been reading cover running the Tuxedo suite offline.

            Also, R cannot find the function "readCufflinks".

            Kelsey


            • kelseyca
              Member
              • May 2013
              • 12

              #7
              > cuff= readCufflinks (dbFile = "cuffData.db",dir="C:/Users/caetano1/Downloads/SEDENTARYDFF",
              + gtfFile='DIRPATH/gtffile', genome='genomename',rebuild = T)
              Creating database C:/Users/caetano1/Downloads/SEDENTARYDFF/cuffData.db
              Reading GTF file
              Error in import(FileForFormat(con), ...) :
              error in evaluating the argument 'con' in selecting a method for function 'import': Error in FileForFormat(con) : Format 'DIRPATH/gtffile' unsupported
              >


              • muthu545
                Member
                • Jul 2011
                • 32

                #8
                Hi Kelsey,

                Not a problem.

                If that's the case (Galaxy only offers .tabular output), then you could rename the files after you download them to remove the .tabular extension.

                If R cannot find the function 'readCufflinks', it means the 'cummeRbund' library has not been loaded in the current session.

                Thanks
                --
                Muthu
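
A quick way to tell "not installed" apart from "installed but not loaded" is sketched below in base R. `package_status` is a hypothetical helper name introduced here for illustration; it relies only on `search()` and `requireNamespace()`.

```r
# Hypothetical helper: report whether a package is attached to the session,
# merely installed, or not installed at all.
package_status <- function(pkg) {
  if (paste0("package:", pkg) %in% search()) {
    "attached"
  } else if (requireNamespace(pkg, quietly = TRUE)) {
    # Installed, but its functions are not on the search path yet.
    "installed, not attached"
  } else {
    "not installed"
  }
}

# Example (commented out because the result depends on the local session):
# package_status("cummeRbund")
```

If the result is "installed, not attached", running `library(cummeRbund)` in the same session should make `readCufflinks` visible.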


                • muthu545
                  Member
                  • Jul 2011
                  • 32

                  #9
                  Kelsey,

                  Right now it is throwing an error because it cannot find the directory 'DIRPATH' or the GTF file.

                  I used 'DIRPATH' as a placeholder for the directory that holds the .gtf file you used to run Cufflinks.
                  You could copy the XXX.gtf file to the same working directory, 'C:/Users/caetano1/Downloads/SEDENTARYDFF', then replace DIRPATH/gtffile in the command with 'C:/Users/caetano1/Downloads/SEDENTARYDFF/XXX.gtf' and 'genomename' with the name of the genome you are working with, e.g. 'hg19', 'hg18', 'pt03', 'mm9', 'mm10', etc.

                  Your readCufflinks command should then run without this error.

                  thanks
                  --
                  Muthu
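
Replacing the placeholder path can be sketched as follows. `resolve_gtf` is a hypothetical helper, not a cummeRbund function; it only builds the path with `file.path()` and stops with a readable message when the file is absent, so a leftover placeholder is caught before readCufflinks starts building the database.

```r
# Hypothetical helper: build the full GTF path and fail early if it is missing.
resolve_gtf <- function(dir, gtf_name) {
  path <- file.path(dir, gtf_name)
  if (!file.exists(path)) {
    stop("GTF file not found: ", path, call. = FALSE)
  }
  path
}

# Example (commented out: needs the local files and the cummeRbund package;
# "XXX.gtf" is the placeholder name used in the post above):
# gtf  <- resolve_gtf("C:/Users/caetano1/Downloads/SEDENTARYDFF", "XXX.gtf")
# cuff <- readCufflinks(dir = "C:/Users/caetano1/Downloads/SEDENTARYDFF",
#                       gtfFile = gtf, genome = "mm10", rebuild = TRUE)
```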


                  • kelseyca
                    Member
                    • May 2013
                    • 12

                    #10
                    > source("http://bioconductor.org/biocLite.R")
                    Bioconductor version 2.12 (BiocInstaller 1.10.2), ?biocLite for help
                    > biocLite("cummeRbund")
                    BioC_mirror: http://bioconductor.org
                    Using Bioconductor version 2.12 (BiocInstaller 1.10.2), R version 3.0.1.
                    Installing package(s) 'cummeRbund'
                    trying URL 'http://bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0/cummeRbund_2.2.0.zip'
                    Content type 'application/zip' length 2600163 bytes (2.5 Mb)
                    opened URL
                    downloaded 2.5 Mb

                    package ‘cummeRbund’ successfully unpacked and MD5 sums checked

                    The downloaded binary packages are in
                    C:\Users\caetano1\AppData\Local\Temp\RtmpQTqdVW\downloaded_packages
                    Warning message:
                    installed directory not writable, cannot update packages 'class', 'foreign',
                    'MASS', 'mgcv', 'nnet', 'spatial'
                    > getwd()
                    [1] "\\\\ansci-alpha/Homes/Grads/caetano1/Documents"
                    > setwd("C:/Users/caetano1/Downloads/SEDENTARYDFF")
                    > list.files()
                    [1] "cuffData.db"
                    [2] "CuffdiffSEDENTARY__CDS_FPKM_differential_expression_testing.tabular"
                    [3] "CuffdiffSEDENTARY__CDS_FPKM_tracking.tabular"
                    [4] "CuffdiffSEDENTARY__CDS_overloading_diffential_expression_testing.tabular"
                    [5] "CuffdiffSEDENTARY__gene_differential_expression_testing.tabular"
                    [6] "CuffdiffSEDENTARY__gene_FPKM_tracking.tabular"
                    [7] "CuffdiffSEDENTARY__promoters_differential_expression_testing.tabular"
                    [8] "CuffdiffSEDENTARY__splicing_differential_expression_testing.tabular"
                    [9] "CuffdiffSEDENTARY__transcript_differential_expression_testing.tabular"
                    [10] "CuffdiffSEDENTARY__transcript_FPKM_tracking.tabular"
                    [11] "CuffdiffSEDENTARY__TSS_groups_differential_expression_testing.tabular"
                    [12] "CuffdiffSEDENTARY__TSS_groups_FPKM_tracking.tabular"
                    [13] "mm10.gtf"
                    > library(cummeRbund)
                    Loading required package: BiocGenerics
                    Loading required package: parallel

                    Attaching package: ‘BiocGenerics’

                    The following objects are masked from ‘package:parallel’:

                    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
                    clusterExport, clusterMap, parApply, parCapply, parLapply,
                    parLapplyLB, parRapply, parSapply, parSapplyLB

                    The following object is masked from ‘package:stats’:

                    xtabs

                    The following objects are masked from ‘package:base’:

                    anyDuplicated, as.data.frame, cbind, colnames, duplicated, eval,
                    Filter, Find, get, intersect, lapply, Map, mapply, match, mget,
                    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
                    rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table,
                    tapply, union, unique, unlist

                    Loading required package: RSQLite
                    Loading required package: DBI
                    Loading required package: ggplot2
                    Loading required package: reshape2
                    Loading required package: fastcluster

                    Attaching package: ‘fastcluster’

                    The following object is masked from ‘package:stats’:

                    hclust

                    Loading required package: rtracklayer
                    Loading required package: GenomicRanges
                    Loading required package: IRanges
                    Loading required package: Gviz
                    Loading required package: grid

                    Attaching package: ‘cummeRbund’

                    The following object is masked from ‘package:GenomicRanges’:

                    promoters

                    The following object is masked from ‘package:IRanges’:

                    promoters

                    > cuff= readCufflinks (dbFile = "cuffData.db",dir="C:/Users/caetano1/Downloads/SEDENTARYDFF",
                    + gtfFile="C:/Users/caetano1/Downloads/SEDENTARYDFF/mm10.gtf", genome='mm10',rebuild = T)
                    Creating database C:/Users/caetano1/Downloads/SEDENTARYDFF/cuffData.db
                    Reading GTF file
                    Error in .parse_attrCol(attrCol, file, colnames) :
                    Some attributes do not conform to 'tag value' format
                    >
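
The error above says that some entries in the GTF attribute column are not in 'tag value' form. The thread does not establish the cause, so the following is only a diagnostic sketch under assumptions: `find_bad_gtf_attributes` is a hypothetical helper, and its check approximates (it is not identical to) whatever rtracklayer does internally. It reports the line numbers whose ninth column contains an attribute that is not a tag followed by whitespace and a value, which is what a GFF3-style `ID=gene1` entry would look like.

```r
# Hypothetical diagnostic: return line numbers of GTF records whose attribute
# column (column 9) has an entry that is not of the form `tag value`,
# e.g. gene_id "XLOC_000001". Approximate check only.
find_bad_gtf_attributes <- function(gtf_path, max_report = 10) {
  lines <- readLines(gtf_path, warn = FALSE)
  keep <- !grepl("^#", lines) & nzchar(lines)  # skip comments and blank lines
  line_no <- which(keep)
  fields <- strsplit(lines[keep], "\t", fixed = TRUE)
  is_bad <- vapply(fields, function(f) {
    if (length(f) < 9) return(TRUE)  # a GTF record needs nine tab-separated columns
    # Note: splitting on ";" will misjudge values that contain a semicolon.
    attrs <- trimws(strsplit(f[9], ";", fixed = TRUE)[[1]])
    attrs <- attrs[nzchar(attrs)]
    length(attrs) == 0 || any(!grepl("^\\S+\\s+\\S", attrs))
  }, logical(1))
  head(line_no[is_bad], max_report)
}

# Example (commented out because it depends on a local file):
# find_bad_gtf_attributes("C:/Users/caetano1/Downloads/SEDENTARYDFF/mm10.gtf")
```

Inspecting the reported lines should show whether the file is really GTF or a different annotation format saved with a .gtf extension.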


                    • jp.
                      Senior Member
                      • Jul 2013
                      • 142

                      #11
                      Please try this simpler call.
                      Note: keep your "diff_out" folder inside the cuff_data folder, and change the working directory to cuff_data first.
                      > cuff_data <- readCufflinks('diff_out', rebuild = TRUE)
