Tophat can't find Bowtie index files

Ajayi Oyeyemi replied

02-07-2013, 05:11 PM
Originally posted by Dario1984

It's exactly what the error message tells you. You don't have Bowtie 2 indexes. If the files end in ebwt, then they are Bowtie 1 indexes.

Thanks. So what do you suggest I do? Should I go ahead and use the pre-built index files from the Bowtie website, or should I build one? In any case, what are the indicators that an index is for Bowtie 2? Please forgive my ignorance.
Dario1984 replied

02-07-2013, 03:00 PM
It's exactly what the error message tells you. You don't have Bowtie 2 indexes. If the files end in ebwt, then they are Bowtie 1 indexes.
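
One way to check which kind of index a basename points at is to look at the file extensions. This is only a minimal sketch, assuming the two extensions mentioned in this thread (.bt2 for Bowtie 2, .ebwt for Bowtie 1). index_kind is a hypothetical helper name, not part of Bowtie or TopHat:

```shell
# index_kind BASENAME
# Prints "bowtie2", "bowtie1" or "none" depending on which index files
# exist for BASENAME (e.g. /path/to/bosTau7, without any extension).
# Hypothetical helper for illustration; it only checks the first index file.
index_kind() {
    base=$1
    if [ -e "$base.1.bt2" ]; then
        echo bowtie2
    elif [ -e "$base.1.ebwt" ]; then
        echo bowtie1
    else
        echo none
    fi
}
```

Called as index_kind /path/to/indexes/bosTau7, it prints bowtie1 when only bosTau7.1.ebwt is present, which is the situation the error message describes.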
Ajayi Oyeyemi replied

02-07-2013, 01:53 PM
Errors in Bowtie index files.

Hi everyone, I'm new to RNA-seq analysis and this thread has been very helpful. However, I've been having difficulty with my alignment using TopHat.
[ooa4@cbsum1c1b009 rnaseq]$ tophat -p12 /home/workdir/ooa4/rnaseq/ bosTau7 SRR594497.fastq SRR594499.fastq

[2013-02-07 14:09:57] Beginning TopHat run (v2.0.7)
-----------------------------------------------
[2013-02-07 14:09:57] Checking for Bowtie
Bowtie version: 2.0.6.0
[2013-02-07 14:09:57] Checking for Samtools
Samtools version: 0.1.18.0
[2013-02-07 14:09:57] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files (/home/workdir/ooa4/rnaseq/.*.bt2)

My ls
[ooa4@cbsum1c1b009 rnaseq]$ ls
bosTau7.1.ebwt bosTau7.3.ebwt bosTau7.fa bosTau7.rev.1.ebwt bovine gtf files.gz SRR594497.sra SRR594499.fastq tophat_out
bosTau7.2.ebwt bosTau7.4.ebwt bosTau7.fa.gz bosTau7.rev.2.ebwt SRR594497.fastq SRR594497.sra.1 SRR594499.sra

Can someone please help me?

Yemi.
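
A note on the path in the error above: judging only from the messages quoted in this thread, the pattern in parentheses looks like the first positional argument followed by ".*.bt2". The sketch below just imitates that message so the effect of the argument is visible. bt2_search_pattern is a hypothetical helper, and this is an assumption about the message format, not a description of TopHat's internals:

```shell
# bt2_search_pattern INDEX_ARG
# Prints the pattern that the error messages in this thread appear to be
# built from: the first positional argument followed by ".*.bt2".
# Hypothetical helper; it imitates the message, not TopHat's internals.
bt2_search_pattern() {
    printf '%s.*.bt2\n' "$1"
}

# Directory only (what was typed above): no basename in the pattern.
bt2_search_pattern /home/workdir/ooa4/rnaseq/
# Directory and basename joined without a space.
bt2_search_pattern /home/workdir/ooa4/rnaseq/bosTau7
```

With the space between the directory and bosTau7, the directory alone becomes the index argument and bosTau7 is presumably read as an input file. Even with the basename joined on, the listing only shows .ebwt files, so a Bowtie 2 index would still have to be built first (for example with bowtie2-build bosTau7.fa bosTau7) or downloaded.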
wanfahmi replied

02-01-2013, 03:31 AM
Hey Everybody,

The reason TopHat couldn't find the Bowtie indexes is that you didn't specify their path and they aren't in your working directory.

The easiest fix is to link the reference genome's existing index files into your working directory:

ln -s /path/to/your/genome/hg19/Annotation/Genes/genes.gtf .
ln -s /path/to/your/genome/hg19/Sequence/BowtieIndex/genome.* .

Then, when you run TopHat from your working directory, it can find those indexes.

Just my thoughts, hope this helps!
neoyoung replied

01-04-2013, 07:58 AM
When I got the error message below while running TopHat,
"Error: Could not find Bowtie 2 index files (...)"

I fixed it with a TopHat command line like this:
# tophat -p 12 /home/NGS_work/bowtie2/indexes/hg18/hg18 seq_1.fa

My folder for Bowtie indexes is laid out like this:
"indexes" folder > "hg18" folder > hg18.1.bt2, hg18.2.bt2, hg18.3.bt2, hg18.4.bt2, hg18.rev.1.bt2, hg18.rev.2.bt2 files

You have to append the index basename, such as "hg18", to the index path on the TopHat command line.

If you have hg19 index files in the "Test" folder, you would write the command like this:
# tophat -p 12 /absolute path to "Test" folder/hg19 yourSeqfile.fa
Bulak replied

06-16-2012, 04:08 AM
For Bowtie index problems: make sure to include a trailing slash '/' at the end of the path when you export it in .bashrc:
export BOWTIE_INDEXES=/some_full_path/to_bowtie_indexes/
Otherwise, when you run TopHat you get missing-index messages even though all the index files were created properly.
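
If the value comes from somewhere you don't control, the trailing slash can be added defensively before exporting. A minimal sketch, assuming the behaviour reported in this post; with_trailing_slash is a hypothetical helper and the path is the placeholder from the post, not a real directory:

```shell
# with_trailing_slash PATH
# Prints PATH with a "/" appended if it does not already end in one.
# Hypothetical helper for illustration.
with_trailing_slash() {
    case $1 in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# Placeholder path from the post above; replace it with the real directory.
BOWTIE_INDEXES=$(with_trailing_slash /some_full_path/to_bowtie_indexes)
export BOWTIE_INDEXES
```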

To billstevens: I had the same problem. One thing I figured out is that TopHat only checks whether the folder set by --transcriptome-index exists. If the folder is there, even with incomplete contents, it seems to ignore the --GTF flag and then fails when it discovers that the contents are incomplete. So make sure to remove the transcriptome-index folder before the first run AND after each error message when you want to try something new. I got --transcriptome-index to work with the --bowtie1 option, but so far I have had many other issues with Bowtie 2 (unfortunately) towards the end of the analysis, when tophat-report is called. I hope this helps.

Last edited by Bulak; 06-16-2012, 11:45 PM.
wmyashar replied

06-15-2012, 12:40 PM
I am having the same problem, but I cannot even get the test_data to run. Aren't the Bowtie indexes in the same folder that comes from the zip? I have tried reinstalling TopHat to make sure I did not mess anything up, but it keeps giving me the same error (shown below with the command I entered):

Wills-MacBook-Pro:tophat-2.0.3 wmyashar$ tophat2 /Users/wmyashar/Desktop/test_data/ reads_2.fq

[2012-06-15 13:37:18] Beginning TopHat run (v2.0.3)
-----------------------------------------------
[2012-06-15 13:37:18] Checking for Bowtie
Bowtie version: 2.0.0.6
[2012-06-15 13:37:18] Checking for Samtools
Samtools version: 0.1.18.0
[2012-06-15 13:37:18] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files (/Users/wmyashar/Desktop/test_data/.*.bt2)
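
Here too the command passes only the directory, so no index basename reaches TopHat. A quick way to find out which basename a folder actually contains is to strip the ".1.bt2" suffix from the files in it. This is a rough sketch; list_bt2_basenames is a hypothetical helper, not a TopHat or Bowtie command:

```shell
# list_bt2_basenames DIR
# Prints one candidate index basename per line for every DIR/*.1.bt2 file,
# skipping the reverse index (*.rev.1.bt2). Hypothetical helper.
list_bt2_basenames() {
    dir=${1%/}
    for f in "$dir"/*.1.bt2; do
        [ -e "$f" ] || continue          # glob matched nothing
        case $f in
            *.rev.1.bt2) continue ;;
        esac
        printf '%s\n' "${f%.1.bt2}"
    done
}
```

Whatever it prints for the test_data folder is the string to pass as the first argument to tophat2 in place of the bare directory. If it prints nothing, the folder holds no Bowtie 2 index files.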
SilviaBCE replied

06-13-2012, 08:17 AM
Hi everybody! I'm new to this forum and to the RNA-seq world, so I hope my questions won't seem too naive!
I'm trying to do my first TopHat run. I have Illumina paired-end reads and I want to align them to the human genome.
I downloaded the pre-built index from Bowtie, so I suppose I don't need to use bowtie-build in this case, do I?
Anyway, this is the command I typed:
$ cd /bowtie/
$ tophat -r 54 --mate-std-dev 35 --solexa1.3-quals hg19 ~/RNA.seq.data/My.data/exp007.s_1_1.fq ~/RNA.seq.data/My.data/exp007.s_1_2.fq

And this is the output with the error message:

[Wed Jun 13 18:08:00 2012] Beginning TopHat run (v1.4.1)
-----------------------------------------------
[Wed Jun 13 18:08:00 2012] Preparing output location ./tophat_out/
[Wed Jun 13 18:08:00 2012] Checking for Bowtie index files
Error: Could not find Bowtie index files hg19.*

What can I do to fix this problem?
Thanks a lot.

Last edited by SilviaBCE; 06-15-2012, 01:35 AM.
billstevens replied

06-12-2012, 08:07 AM
Yup, it works fine. Additionally, I just tried it with -G but without the

\ --transcriptome-index=transcriptome_data/known \

part and it works. I'm guessing there has to be something wrong with the syntax. The folder is created too, but I keep getting the same error. Any other ideas?
Dario1984 replied

06-11-2012, 06:00 PM
What happens if you give the absolute path to the GTF file? Can you load the GTF file into a genome browser without any complaints that it is wrongly formatted?
billstevens replied

06-11-2012, 09:16 AM
Hi guys,

Any idea on why the -G function won't work for me? I am at a complete loss.
findingdan replied

06-10-2012, 07:25 AM
Originally posted by colinmolter

I have the same error. Though the bowtie2 index are there:

Code:

tophat2 /path/Bowtie2Index/hg19 mysample.fastq

-->
[2012-05-22 22:49:13] Beginning TopHat run (v2.0.0)
-----------------------------------------------
[2012-05-22 22:49:13] Checking for Bowtie
Bowtie version: 2.0.0.6
[2012-05-22 22:49:13] Checking for Samtools
Samtools version: 0.1.18.0
[2012-05-22 22:49:13] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files (/path/Bowtie2Index/hg19.*.bt2)

but:
ls -lh /path/Bowtie2Index/hg19.*

do i miss something?

thanks
colin

Hey, Colin. Don't click on the "part 1" link; click on the main hg18 (or whatever reference you want) link above. I think parts 1..3 are just listed so you can download parts of the whole indexes. -Dan king
billstevens replied

06-08-2012, 09:57 AM
I just ran it without the -G function and it works.....

But I really want the -G function since I need to run 6 more samples. Any other thoughts?
billstevens replied

06-08-2012, 09:48 AM
Hey, actually that didn't work. I tried every variation of that:

tophat -p 4 -G hg19_knowngene2.gtf \ --transcriptome-index=transcriptome_data/known \ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz....
tophat -p 4 -G hg19_knowngene2.gtf \ --transcriptome-index=transcriptome_data/known\ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz...
tophat -p 4 -G hg19_knowngene2.gtf \--transcriptome-index=transcriptome_data/known\ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz...
tophat -p 4 -G hg19_knowngene2.gtf \--transcriptome-index=transcriptome_data/known \ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz...

I keep getting the same error.

Could not find Bowtie index files --transcriptome-index=transcriptome_data/known hg19.*
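
A possible explanation, based on how the shell treats backslashes rather than on anything TopHat-specific: a backslash continues a command only when it is the very last character on a line. In the middle of a line, "\ " is an escaped space, so the words on either side are glued into a single argument. The sketch below makes that visible. count_args is a hypothetical helper, and the arguments are copied from the attempts above:

```shell
# count_args prints how many arguments it received, then each one in
# angle brackets, so stray spaces inside an argument become visible.
# Hypothetical helper for illustration.
count_args() {
    echo "$# argument(s)"
    for a in "$@"; do
        printf '<%s>\n' "$a"
    done
}

# "\ " in the middle of a line escapes the space, so the option and the
# index name are glued into ONE argument (as in the second attempt above):
count_args -G hg19_knowngene2.gtf \ --transcriptome-index=transcriptome_data/known\ hg19

# Without the backslashes, the same text splits into four arguments:
count_args -G hg19_knowngene2.gtf --transcriptome-index=transcriptome_data/known hg19
```

The first call reports 3 arguments, the last of which is " --transcriptome-index=transcriptome_data/known hg19" with a leading space. That is the same string that appears in the error message, which suggests TopHat took the whole thing as the index name. Typing the command on one line without backslashes, or pressing Enter immediately after each backslash as in the manual's example, should avoid this.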
billstevens replied

06-08-2012, 06:23 AM
Thanks Dario!

Colin, yes, that's exactly what I'm doing. It's in the TopHat manual:

TopHat should be first run with the -G option and with the --transcriptome-index option pointing to a directory and a name prefix which will indicate where the transcriptome data files will be stored. Then subsequent TopHat runs using the same --transcriptome-index option value will directly use the transcriptome data created in the first run (no -G option needed for subsequent runs).
For example the first TopHat run could look like this:
tophat -o out_sample1 -G known_genes.gtf \
--transcriptome-index=transcriptome_data/known \
hg19 sample1_1.fq.z
In this example the first run will create the transcriptome_data directory if it doesn't exist, and files known.fa, known.gff and known.*ebwt (Bowtie index files) will be generated in that directory. Then for subsequent runs with the same genome and known transcripts but different reads (e.g. sample2_2.fq.z etc.), TopHat will no longer spend time building the transcriptome index because it can directly use the previously built transcriptome index, so the -G option can be even discarded for subsequent runs:
tophat -o out_sample2 \
--transcriptome-index=transcriptome_data/known \
hg19 sample2_1.fq.z