  • gkuffel
    Genomics Lab Mgr.
    • Oct 2012
    • 14

    Issues with Tophat

    Hi All-

    I am analyzing RNA-Seq data for the first time using TopHat. When I run the following command in the terminal, the TopHat help page is printed and nothing else seems to happen. No error or exception is displayed, and I have no idea what is wrong with my command. Any help would be greatly appreciated. I am following Illumina's tutorial for single-read data: http://www.illumina.com/documents/pr...ysisTopHat.pdf

    My command:

    --GTF Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output S1.fastq
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    Are you using the same version of TopHat mentioned in that guide (v.1.4.0)? The current version is v.2.0.13.

    Example from the document you linked.

    Code:
    $ tophat --GTF <iGenomesFolder>/Annotation/Genes/genes.gtf --library-type <LibraryType> --num-threads 1 --output-dir <SampleOutputFolder> <iGenomesFolder>/Sequence/BowtieIndex/genome <SampleID>.fastq
    For names like <something here>, you need to substitute real file or directory paths that exist on your system. This is the standard Unix convention for marking the variable parts of a command. You are not providing the path to the Bowtie index files for the genome, which TopHat needs.

    So your command would become something like this:

    Code:
    $ tophat --GTF Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output /path_to/Sequence/BowtieIndex/genome S1.fastq
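One way to keep the placeholders straight is to put each variable part in a shell variable and print the assembled command before running it. This is a minimal sketch, not part of TopHat: the `build_tophat_cmd` helper and its argument names are made up for illustration, it uses only the options shown in the example above, and it assumes the iGenomes folder follows the `Annotation/Genes` and `Sequence/BowtieIndex` layout from the tutorial command. It only prints the command; it does not run TopHat.

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble the TopHat command line from its variable parts.
# Arguments: iGenomes folder, library type, output folder, reads file.
build_tophat_cmd() {
    local igenomes="$1"   # stands in for <iGenomesFolder>
    local libtype="$2"    # stands in for <LibraryType>
    local outdir="$3"     # stands in for <SampleOutputFolder>
    local reads="$4"      # stands in for <SampleID>.fastq
    printf 'tophat --GTF %s/Annotation/Genes/genes.gtf --library-type %s --num-threads 1 --output-dir %s %s/Sequence/BowtieIndex/genome %s\n' \
        "$igenomes" "$libtype" "$outdir" "$igenomes" "$reads"
}

# Print the command for inspection instead of executing it.
build_tophat_cmd Mus_musculus/UCSC/mm10 fr-firststrand Mouse_output S1.fastq
```

Printing first makes it easy to see that the command has two positional arguments at the end (the index basename, then the reads file) in addition to the GTF option.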


    • gkuffel
      Genomics Lab Mgr.
      • Oct 2012
      • 14

      #3
      Thank you for taking the time to help. I am familiar with Unix conventions. I am calling TopHat from the directory that contains the FASTQ file, so the file name alone should be the correct path, and I don't think that is the issue. However, you raise a good point: I am using the newest version of TopHat, not the version listed in the tutorial, so maybe the syntax is outdated?


      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        You omitted the path to the Bowtie genome index files (unless you did not copy the entire command into your original post).


        • gkuffel
          Genomics Lab Mgr.
          • Oct 2012
          • 14

          #5
          Hmm, I guess I am not understanding, then. I am calling TopHat from a terminal window on a cluster that I ssh into. My account lives on the server, and I am calling TopHat from a folder in my account named WorkflowFolder, so I suppose the full path would be /home/gkuffel/WorkflowFolder/S1.fastq

          Do I also need to specify this for the reference genome, like this: /home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf?


          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            TopHat (like other aligners) requires the genome sequence to be indexed in a binary form (a Burrows-Wheeler transform, or FM index). This index is what your reads are compared against.

            Since you are using the mouse genome, you can get pre-made index files and annotation from the iGenomes site: http://support.illumina.com/sequenci...e/igenome.html. Get the build you like (several are available for mouse).

            <iGenomesFolder>/Sequence/BowtieIndex/genome - This part of the example command is referring to the genome index files.

            The GTF file only holds the feature annotation for your genome. It has to be used in combination with the actual sequence/index files.
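A quick way to confirm that an index basename actually has files behind it is to look for the first index file. The sketch below is illustrative only: `has_index` is a made-up helper, and the `index_base` path is an assumption about where the iGenomes folder was copied. It relies on the usual file suffixes (`.bt2`, or `.bt2l` for large indexes, for Bowtie 2; `.ebwt` for Bowtie 1).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if at least the first index file exists
# for the given basename (for example ".../Bowtie2Index/genome").
has_index() {
    base="$1"
    for ext in 1.bt2 1.bt2l 1.ebwt; do
        if [ -e "$base.$ext" ]; then
            return 0
        fi
    done
    return 1
}

# Assumed location; adjust to wherever the iGenomes folder was copied.
index_base="Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome"
if has_index "$index_base"; then
    echo "index found: $index_base"
else
    echo "no index files found for basename: $index_base"
fi
```

Note that the basename itself ("genome") is not a file; it is the shared prefix of the index files, which is why it is passed without any extension.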


            • gkuffel
              Genomics Lab Mgr.
              • Oct 2012
              • 14

              #7
              That is exactly where I got the genome that I am using. I used the FTP site to transfer the file to my computer and then I used FUGU to get the file to my account on our server that I use to run TopHat.


              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #8
                Perhaps you did not paste the entire command in the original post?

                It is unfortunate, but the pre-built genome index is named "genome". That is the "basename" of the index; there should be multiple files with that basename in the BowtieIndex directory. That part, the index path just before S1.fastq in the command below, is missing from your original command.

                $ tophat --GTF Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output /path_to/Sequence/BowtieIndex/genome S1.fastq


                • GenoMax
                  Senior Member
                  • Feb 2008
                  • 7142

                  #9
                  You need to transfer everything in the "BowtieIndex" directory over to the server.


                  • gkuffel
                    Genomics Lab Mgr.
                    • Oct 2012
                    • 14

                    #10
                    The file from igenomes is:

                    Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf

                    I have transferred the entire folder for Mus_musculus to our server.

                    S1.fastq is my data.

                    I don't think that is the issue. My command now is:

                    tophat --GTF home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output home/gkuffel/WorkflowFolder/S1.fastq

                    Tophat started running this time!!! But then it gave me this error: Expected bowtie2 to be in the same directory with bowtie2-align: /usr/local/share/bowtie2-2.1.0/

                    Exiting now...


                    • GenoMax
                      Senior Member
                      • Feb 2008
                      • 7142

                      #11
                      Irrespective of what the error message says now, TopHat is not going to work right until you provide the location of the genome index files.

                      That is likely to be this: /home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome


                      • gkuffel
                        Genomics Lab Mgr.
                        • Oct 2012
                        • 14

                        #12
                        I finally understand what you mean. Thanks for your patience. Here is my new command:

                        tophat --GTF home/gkuffel/Workflowfolder/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome S1.fastq

                        Sorry about the confusion; at least I have that figured out. I am still getting the same error as before, though. Any thoughts?


                        • GenoMax
                          Senior Member
                          • Feb 2008
                          • 7142

                          #13
                          Did you install the Tuxedo suite on this machine, or did someone else?

                          My feeling is that you are also missing a leading "/" before "home" in both places in your command line. Without it, the paths are relative, so the command would only work if you ran it from a directory that itself contains a subdirectory called "home".

                          Do you see these two programs with these commands?

                          Code:
                          $ which tophat
                          $ which bowtie2
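The relative-versus-absolute distinction can be checked directly in the shell. This is a small sketch using the two path spellings from this thread; `is_absolute` is a made-up helper name, and `command -v` is used as the POSIX counterpart of `which`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the path starts at the filesystem root.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Compare the path without and with the leading slash.
for p in home/gkuffel/WorkflowFolder/S1.fastq /home/gkuffel/WorkflowFolder/S1.fastq; do
    if is_absolute "$p"; then
        echo "absolute: $p"
    else
        echo "relative, resolved against $(pwd): $p"
    fi
done

# Report where each program is found on PATH, if anywhere.
for prog in tophat bowtie2; do
    command -v "$prog" || echo "$prog not found on PATH"
done
```

A relative path is resolved against the current working directory, so the same string can point to different places depending on where the command is run.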
