**GenoMax** · 02-09-2015, 09:54 AM

Are you using the same version of TopHat mentioned in that guide (v.1.4.0)? Current version is v.2.0.13.

Example from the document you linked.

Code:

$ tophat --GTF <iGenomesFolder>/Annotation/Genes/genes.gtf --library-type <LibraryType> --num-threads 1 --output-dir <SampleOutputFolder> <iGenomesFolder>/Sequence/BowtieIndex/genome <SampleID>.fastq

For placeholders like <something here>, you need to substitute the real file or directory paths on your system. Angle brackets are a standard Unix convention for marking the variable parts of a command. Your command does not include the path to the Bowtie index files for the genome, which TopHat needs.

So your command would become something like this:

Code:

$ tophat --GTF Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output /path_to/Sequence/BowtieIndex/genome S1.fastq
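As a rough sketch of the placeholder substitution described above, the variable parts can be kept in shell variables and the command assembled from them. The variable names are hypothetical, and the assumption that the index sits under the same iGenomes folder as the annotation should be checked against your own download. The script only prints the command so it can be inspected before running.

```shell
#!/bin/sh
# Hypothetical values: replace each one with the real location on your system.
IGENOMES="Mus_musculus/UCSC/mm10"                # assumed top of the iGenomes build
GTF="$IGENOMES/Annotation/Genes/genes.gtf"       # annotation file
INDEX="$IGENOMES/Sequence/BowtieIndex/genome"    # index basename, not a single file
READS="S1.fastq"                                 # your reads

# Assemble the command with the same options as the example above.
CMD="tophat --GTF $GTF --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output $INDEX $READS"

# Print it for review instead of executing it.
echo "$CMD"
```

The last two positional arguments are the index basename and then the reads file, in that order.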

**gkuffel** · 02-09-2015, 10:45 AM

Thank you for taking the time to help. I am familiar with Unix conventions. I am calling tophat from the directory that contains the fastq file, so the bare file name should be a valid path. I don't think that is the issue. However, you raise a good point: I am using the newest version of tophat, not the version listed in the tutorial, so maybe the syntax is outdated?

**GenoMax** · 02-09-2015, 10:57 AM

You omitted the path to the Bowtie genome index files (unless you did not copy the entire command into your original post).

**gkuffel** · 02-09-2015, 11:52 AM

Hmm, I guess I am not understanding then. I am calling tophat from a terminal window. We have a cluster that I ssh into. I have an account that lives on the server. I am calling tophat from a folder in my account. This folder is named WorkflowFolder, so the full path I suppose would be /home/gkuffel/WorkflowFolder/S1.fastq

Do I also need to specify this for the reference genome, like this: /home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf

**GenoMax** · 02-09-2015, 12:01 PM

TopHat (and other aligners) require that the genome sequence be indexed in a binary form (Burrows-Wheeler transform or FM index). This index is what your reads are aligned against.

Since you are using the mouse genome, you can get pre-made index files and annotation from the iGenomes site: http://support.illumina.com/sequenci...e/igenome.html. Get the build you like (several are available for mouse).

<iGenomesFolder>/Sequence/BowtieIndex/genome - This part of the example command is referring to the genome index files.

The gtf file only has information about features/annotation for your genome. It has to be used in combination with the actual sequence/index files.
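A quick way to confirm that index files are present for a given basename is sketched below. The function name and the path are hypothetical. The check relies on the usual suffixes of small Bowtie 2 (.1.bt2) and Bowtie 1 (.1.ebwt) indexes and does not cover other index variants.

```shell
#!/bin/sh
# has_index BASENAME: succeed if a Bowtie 2 (.bt2) or Bowtie 1 (.ebwt)
# index file exists for BASENAME. Only the first index file is tested.
has_index() {
    base=$1
    [ -e "$base.1.bt2" ] || [ -e "$base.1.ebwt" ]
}

# Hypothetical basename: replace with the location on your system.
INDEX="/path_to/Sequence/BowtieIndex/genome"

if has_index "$INDEX"; then
    echo "index files found for $INDEX"
else
    echo "no index files found for $INDEX"
fi
```

Note that the argument is the shared prefix of the index files, so "genome" here is not a file on its own.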

**gkuffel** · 02-09-2015, 12:05 PM

That is exactly where I got the genome that I am using. I used the FTP site to transfer the file to my computer and then I used FUGU to get the file to my account on our server that I use to run TopHat.

**GenoMax** · 02-09-2015, 12:19 PM

Perhaps you did not paste the entire command in the original post?

It is unfortunate, but the pre-built genome index is named "genome" (that is the "basename" for the index; there should be multiple files with that basename in the BowtieIndex directory). That part (the /path_to/Sequence/BowtieIndex/genome argument in the command below) is missing from your original command.

$ tophat --GTF Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output /path_to/Sequence/BowtieIndex/genome S1.fastq

**GenoMax** · 02-09-2015, 12:22 PM

You need to transfer everything in the "BowtieIndex" directory over to the server.

**gkuffel** · 02-09-2015, 12:34 PM

The file from igenomes is:

Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf

I have transferred the entire folder for Mus_musculus to our server.

S1.fastq is my data.

I don't think that is the issue. My command is now:

tophat --GTF home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output home/gkuffel/WorkflowFolder/S1.fastq

Tophat started running this time!!! But then it gave me this error: Expected bowtie2 to be in the same directory with bowtie2-align: /usr/local/share/bowtie2-2.1.0/

Exiting now...

**GenoMax** · 02-09-2015, 12:42 PM

Irrespective of what the error message says now, TopHat is not going to work right until you provide the location of the genome index files.

That is likely to be this: /home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome

**gkuffel** · 02-09-2015, 01:22 PM

I finally understand what you mean. Thanks for your patience. Here is my new command:

tophat --GTF home/gkuffel/Workflowfolder/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf --library-type fr-firststrand --num-threads 1 --output-dir Mouse_output home/gkuffel/WorkflowFolder/Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome S1.fastq

Sorry about the confusion, at least I have that figured out. I am still getting the same error as before though, any thoughts?

**GenoMax** · 02-09-2015, 02:52 PM

Did you install the Tuxedo suite on this machine, or did someone else?

My feeling is that you are also missing a leading "/" before "home" in both places in your command line. The only way that command would work is if you were in the directory that has S1.fastq and that directory also contained a directory called "home".
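To illustrate the difference, here is a small sketch (the helper name is hypothetical) that reports whether a path is absolute and, if it is not, shows where the shell would look for it relative to the current directory.

```shell
#!/bin/sh
# is_absolute PATH: succeed if PATH starts with "/".
is_absolute() {
    case $1 in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# The same path with and without the leading slash.
for p in "home/gkuffel/WorkflowFolder/S1.fastq" "/home/gkuffel/WorkflowFolder/S1.fastq"; do
    if is_absolute "$p"; then
        echo "absolute: $p"
    else
        # A relative path is resolved against the current working directory.
        echo "relative, resolves to: $PWD/$p"
    fi
done
```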

Do you see these two programs with these commands?

Code:

$ which tophat
$ which bowtie2
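Going by the wording of the error quoted earlier in the thread, it may also help to check whether bowtie2 and bowtie2-align resolve to the same directory. The sketch below assumes both programs are expected on PATH; the function name is hypothetical, and the name bowtie2-align is taken from the error message.

```shell
#!/bin/sh
# same_dir A B: succeed if programs A and B are both found on PATH
# and live in the same directory.
same_dir() {
    a=$(command -v "$1") || return 1
    b=$(command -v "$2") || return 1
    [ "$(dirname "$a")" = "$(dirname "$b")" ]
}

if same_dir bowtie2 bowtie2-align; then
    echo "bowtie2 and bowtie2-align are in the same directory"
else
    echo "bowtie2 and bowtie2-align were not both found in one directory on PATH"
fi
```

If the second message appears, comparing the output of the two `which` commands above with the directory named in the error is a reasonable next step.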


Issues with Tophat
