
  • Tophat can't find Bowtie index files

    I'm new to TopHat/Bowtie. I have already set PATH for bowtie2, tophat and samtools. However, when I tried TopHat with the test example from the TopHat website and with the Nature Protocols example procedure, I got the error message shown below. I tried linking the files with ln, prefixing the file name with the path to the directory, and exporting the path of the directory containing the index files (they are actually in the same directory), but none of these worked. The error message asks for .bt2 index files, but all the provided index files are in .ebwt format. Should I change the file extension? Please advise. Thanks in advance.

    [2012-05-17 14:54:46] Beginning TopHat run (v2.0.0)
    -----------------------------------------------
    [2012-05-17 14:54:46] Checking for Bowtie
    Bowtie version: 2.0.0.6
    [2012-05-17 14:54:46] Checking for Samtools
    Samtools version: 0.1.18.0
    [2012-05-17 14:54:46] Checking for Bowtie index files
    Error: Could not find Bowtie 2 index files (test_ref.*.bt2)

  • #2
    You should get new indexes from the Bowtie2 website.


    • #3
      Or build one from your reference genome.


      • #4
        Thanks for help

        Very helpful. Thanks a lot.


        • #5
          I have the same error, even though the Bowtie 2 index files are there:
          Code:
          tophat2 /path/Bowtie2Index/hg19 mysample.fastq
          -->
          [2012-05-22 22:49:13] Beginning TopHat run (v2.0.0)
          -----------------------------------------------
          [2012-05-22 22:49:13] Checking for Bowtie
          Bowtie version: 2.0.0.6
          [2012-05-22 22:49:13] Checking for Samtools
          Samtools version: 0.1.18.0
          [2012-05-22 22:49:13] Checking for Bowtie index files
          Error: Could not find Bowtie 2 index files (/path/Bowtie2Index/hg19.*.bt2)

          but:
          ls -lh /path/Bowtie2Index/hg19.*

          -rw-r--r-- 1 916M /path/Bowtie2Index/hg19.1.bt2
          -rw-r--r-- 1 684M /path/Bowtie2Index/hg19.2.bt2
          Am I missing something?

          thanks
          colin


          • #6
            Hi Colin,
            I have

            <basename>.1.bt2
            <basename>.2.bt2
            <basename>.3.bt2
            <basename>.4.bt2
            <basename>.rev.1.bt2
            <basename>.rev.2.bt2
            Maybe your indexing was not complete?
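A minimal sketch of how the completeness check suggested above could be scripted, assuming a small (non-"large") Bowtie 2 index that uses the six .bt2 suffixes listed in this post. The function name check_bt2_index is hypothetical and not part of Bowtie 2 or TopHat. Note that the listing in post #5 shows only the .1.bt2 and .2.bt2 files.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Bowtie 2 or TopHat): report which of the
# six expected index files are missing or empty for a given basename,
# e.g. /path/Bowtie2Index/hg19.
check_bt2_index() {
    local base="$1" suffix missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "${base}.${suffix}" ]; then
            echo "missing or empty: ${base}.${suffix}"
            missing=$((missing + 1))
        fi
    done
    # Exit status is the number of missing files (0 means the set is complete)
    return "$missing"
}
```

With this defined, something like `check_bt2_index /path/Bowtie2Index/hg19 && tophat2 /path/Bowtie2Index/hg19 mysample.fastq` would only start TopHat when all six files are present. If files are missing, re-downloading the full index or rebuilding it with bowtie2-build from the reference FASTA are the two options mentioned earlier in the thread.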


            • #7
              Hey guys, I'm having trouble getting my Tophat run to start. Here is my code:

              tophat -p 4 -G hg19_knowngene2.gtf \ --transcriptome-index=transcriptome_data/known \ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz,Luxs-11-23_ACTTGA_L005_R1_002.fastq.gz,Luxs-11-23_ACTTGA_L005_R1_003.fastq.gz,Luxs-11-23_ACTTGA_L005_R1_004.fastq.gz,Luxs-11-23_ACTTGA_L005_R1_005.fastq.gz,Luxs-11-23_ACTTGA_L005_R1_006.fastq.gz,Luxs-11-23_ACTTGA_L005_R1_007.fastq.gz Luxs-11-23_ACTTGA_L005_R2_001.fastq.gz,Luxs-11-23_ACTTGA_L005_R2_002.fastq.gz,Luxs-11-23_ACTTGA_L005_R2_003.fastq.gz,Luxs-11-23_ACTTGA_L005_R2_004.fastq.gz,Luxs-11-23_ACTTGA_L005_R2_005.fastq.gz,Luxs-11-23_ACTTGA_L005_R2_006.fastq.gz,Luxs-11-23_ACTTGA_L005_R2_007.fastq.gz > output.txt 2> errors.txt 0> input.txt

              I have the six hg19.ebwt index files in the directory.

              I keep getting this error:
              Could not find Bowtie index files --transcriptome-index=transcriptome_data/known.

              What do you guys think?


              • #8
                You have a space before the escape character. It's only supposed to be after it.
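One possible reading of the problem, offered as a sketch rather than a confirmed diagnosis: in bash, a backslash only continues a command when it is the last character on the line. When it is followed by a space on the same line, it escapes that space, and the space becomes part of the next argument. The function show_args and the file name genes.gtf below are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each argument it receives inside brackets,
# one per line, so that leading spaces become visible.
show_args() {
    printf '[%s]\n' "$@"
}

# Backslash followed by a space, all on one line: the escaped space is glued
# onto the next word, so the option no longer starts with "--".
show_args -G genes.gtf \ --transcriptome-index=transcriptome_data/known \ hg19
# prints:
# [-G]
# [genes.gtf]
# [ --transcriptome-index=transcriptome_data/known]
# [ hg19]

# Backslash as the very last character of each line: a real line continuation.
show_args -G genes.gtf \
    --transcriptome-index=transcriptome_data/known \
    hg19
# prints:
# [-G]
# [genes.gtf]
# [--transcriptome-index=transcriptome_data/known]
# [hg19]
```

In the first call the third argument begins with a space, so a program would likely treat it as a positional argument rather than an option. That would be consistent with TopHat reporting "--transcriptome-index=transcriptome_data/known" as the index it could not find. Typing the command on a single line without any backslashes avoids the issue entirely.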


                • #9
                  transcriptome_index

                  Originally posted by billstevens
                  Hey guys, I'm having trouble getting my Tophat run to start. Here is my code:

                  tophat -p 4 -G hg19_knowngene2.gtf \ --transcriptome-index=transcriptome_data/known \ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz,Luxs-11-... ,Luxs-11-23_ACTTGA_L005_R2_007.fastq.gz > output.txt 2> errors.txt 0> input.txt

                  Hi, I have never used --transcriptome-index, but as I understand it, it should replace the '-G hg19' option on the second pass.
                  In fact, I was wondering whether, since I work with hg19 all the time, I could create this index once and then reuse it without specifying the -G option.
                  Does anyone have good tips on how to use it?

                  thanks
                  colin


                  • #10
                    Thanks Dario!

                    Colin, yes, that's exactly what I'm doing. It's in the TopHat manual.

                    TopHat should be first run with the -G option and with the --transcriptome-index option pointing to a directory and a name prefix which will indicate where the transcriptome data files will be stored. Then subsequent TopHat runs using the same --transcriptome-index option value will directly use the transcriptome data created in the first run (no -G option needed for subsequent runs).
                    For example the first TopHat run could look like this:
                    tophat -o out_sample1 -G known_genes.gtf \
                    --transcriptome-index=transcriptome_data/known \
                    hg19 sample1_1.fq.z
                    In this example the first run will create the transcriptome_data directory if it doesn't exist, and files known.fa, known.gff and known.*ebwt (Bowtie index files) will be generated in that directory. Then for subsequent runs with the same genome and known transcripts but different reads (e.g. sample2_2.fq.z etc.), TopHat will no longer spend time building the transcriptome index because it can directly use the previously built transcriptome index, so the -G option can be even discarded for subsequent runs:
                    tophat -o out_sample2 \
                    --transcriptome-index=transcriptome_data/known \
                    hg19 sample2_1.fq.z
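The two-pass pattern from the manual excerpt above could be wrapped roughly as follows. This is only a sketch: the function name tophat_cmd is hypothetical, it prints the command instead of running it, and it assumes that the presence of "<prefix>.fa" (known.fa in the excerpt) is a good enough sign that the transcriptome index has already been built.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print the tophat command for one sample, adding -G
# only when the transcriptome index has not been built yet.
tophat_cmd() {
    local gtf="$1" prefix="$2" genome="$3" outdir="$4" reads="$5"
    local -a cmd=(tophat -o "$outdir")
    # Assumption: a finished first run leaves "<prefix>.fa" behind, as the
    # manual excerpt describes (known.fa, known.gff, known.*ebwt).
    if [ ! -f "${prefix}.fa" ]; then
        cmd+=(-G "$gtf")
    fi
    cmd+=("--transcriptome-index=${prefix}" "$genome" "$reads")
    # Dry run: print the command instead of executing it
    echo "${cmd[*]}"
}

tophat_cmd known_genes.gtf transcriptome_data/known hg19 out_sample1 sample1_1.fq.z
```

Replacing the final echo with "${cmd[@]}" would execute the command instead of printing it. Building the command as an array keeps each argument intact without relying on backslash continuations.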


                    • #11
                      Hey, actually that didn't work. I tried every variation of that:

                      tophat -p 4 -G hg19_knowngene2.gtf \ --transcriptome-index=transcriptome_data/known \ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz....
                      tophat -p 4 -G hg19_knowngene2.gtf \ --transcriptome-index=transcriptome_data/known\ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz...
                      tophat -p 4 -G hg19_knowngene2.gtf \--transcriptome-index=transcriptome_data/known\ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz...
                      tophat -p 4 -G hg19_knowngene2.gtf \--transcriptome-index=transcriptome_data/known \ hg19 Luxs-11-23_ACTTGA_L005_R1_001.fastq.gz...

                      I keep getting the same error.

                      Could not find Bowtie index files --transcriptome-index=transcriptome_data/known hg19.*


                      • #12
                        I just ran it without the -G option and it works.

                        But I really want the -G option since I need to run 6 more samples. Any other thoughts?


                        • #13
                          Originally posted by colinmolter
                          I have the same error, even though the Bowtie 2 index files are there:
                          Code:
                          tophat2 /path/Bowtie2Index/hg19 mysample.fastq
                          -->
                          [2012-05-22 22:49:13] Beginning TopHat run (v2.0.0)
                          -----------------------------------------------
                          [2012-05-22 22:49:13] Checking for Bowtie
                          Bowtie version: 2.0.0.6
                          [2012-05-22 22:49:13] Checking for Samtools
                          Samtools version: 0.1.18.0
                          [2012-05-22 22:49:13] Checking for Bowtie index files
                          Error: Could not find Bowtie 2 index files (/path/Bowtie2Index/hg19.*.bt2)

                          but:
                          ls -lh /path/Bowtie2Index/hg19.*



                          Am I missing something?

                          thanks
                          colin


                          Hey, Colin. Don't click on the "part 1" link; click on the main hg18 (or whatever reference you want) link above. I think parts 1 to 3 are just listed so that you can download the full set of indexes in pieces. -Dan king


                          • #14
                            Hi guys,

                            Any idea why the -G option won't work for me? I am at a complete loss.


                            • #15
                              What happens if you give the absolute path to the GTF file? Can you load the GTF file into a genome browser without it complaining that the file is wrongly formatted?
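Both checks can be roughed out on the command line before reaching for a genome browser. The sketch below is hypothetical helper code, not part of TopHat: abs_path turns a relative file name into an absolute one, and gtf_bad_lines counts records that do not have the nine tab-separated fields a GTF line is expected to have. It is only a basic sanity check and does not validate the attribute column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the absolute path of a file given a relative
# or absolute name (the containing directory must exist).
abs_path() {
    printf '%s/%s\n' "$(cd "$(dirname "$1")" && pwd)" "$(basename "$1")"
}

# Hypothetical helper: count GTF records that do not have exactly nine
# tab-separated fields. Comment lines starting with "#" and empty lines
# are skipped.
gtf_bad_lines() {
    awk -F '\t' '!/^#/ && NF > 0 && NF != 9 { bad++ } END { print bad + 0 }' "$1"
}
```

For example, `gtf_bad_lines hg19_knowngene2.gtf` printing anything other than 0 would suggest a formatting problem, such as spaces used where tabs are required, and `abs_path hg19_knowngene2.gtf` gives the full path to pass to -G.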
