  • madsen
    Member
    • Sep 2009
    • 10

    Tophat - accepted_hits.sam file is empty?

    Hi all.

    I've recently installed TopHat and the test files ran without problems, thus I assume the installation went OK.
    Now, applying it to my own data, things do not seem to go so smoothly. I ran a subset (1,000,000 sequences) of my paired-end Illumina GA2 reads as a test. I don't get any junctions (which I also wouldn't expect with only 1,000,000 reads on a mammalian genome), but it surprised me that the accepted_hits.sam file is empty. If I understand correctly, this file should contain the positions and sequences of the reads aligned to the genome? Since I thought the problem could be caused by a wrong fastq format, I also aligned my subset with bowtie against my reference genome. This seems to go OK. The reason for my suspicion is that TopHat indicates a seed length of 52bp, but my sequences are 51bp.
    Thus, does anyone have any idea what is going wrong, and is it somehow possible to control the seed length in TopHat (as in bowtie with the -l option)?

    Regards, Ole

    Some information:

    example of my fastq format:
    @HWI-EA332:5:13596#0/2
    GCTGATCCGGGACTGCCGGCCTGTGAGGCTGCCCACCTGCGCGGCGGGGGC
    +HWI-EA332:5:13596#0/2
    `aa__]ZHZ_]\]V[]NXX_[FJFSJTY]R\\]VWHZFQ][JOWMZ\[_BB

    The tophat screen:
    [Wed Sep 30 09:29:55 2009] Preparing output location ./tophat_out/
    [Wed Sep 30 09:29:55 2009] Checking for Bowtie index files
    [Wed Sep 30 09:29:55 2009] Checking for reference FASTA file
    [Wed Sep 30 09:29:55 2009] Checking for Bowtie
    Bowtie version: 0.10.1.0
    [Wed Sep 30 09:29:55 2009] Checking reads
    seed length: 52bp
    format: fastq
    quality scale: --solexa1.3-quals
    [Wed Sep 30 09:30:20 2009] Mapping reads against RefGenome with Bowtie
    [Wed Sep 30 09:34:15 2009] Joining segment hits
    Splitting reads into 2 segments
    [Wed Sep 30 09:34:23 2009] Mapping reads against RefGenome with Bowtie
    [Wed Sep 30 09:39:36 2009] Mapping reads against RefGenome with Bowtie
    [Wed Sep 30 09:44:53 2009] Mapping reads against RefGenome with Bowtie
    [Wed Sep 30 09:48:42 2009] Joining segment hits
    Splitting reads into 2 segments
    [Wed Sep 30 09:48:49 2009] Mapping reads against RefGenome with Bowtie
    [Wed Sep 30 09:54:02 2009] Mapping reads against RefGenome with Bowtie
    [Wed Sep 30 09:59:22 2009] Searching for junctions via segment mapping
    Warning: junction database is empty!
    [Wed Sep 30 10:01:08 2009] Joining segment hits
    [Wed Sep 30 10:01:08 2009] Joining segment hits
    [Wed Sep 30 10:01:08 2009] Reporting output tracks
    -----------------------------------------------
    Run complete [00:31:12 elapsed]

    My command:
    ./tophat --solexa1.3-quals RefGenome part10_1.ma.fq part10_2.ma.fq
  • madsen
    Member
    • Sep 2009
    • 10

    #2
    My fastq format seems to have changed during upload (the GGGG C at the end),
    thus here it is again:

    @HWI-EA332:5:13596#0/2
    GCTGATCCGGGACTGCCGGCCTGTGAGGCTGCCCACCTGCGCGGCGGGGGC
    +HWI-EA332:5:13596#0/2
    `aa__]ZHZ_]\]V[]NXX_[FJFSJTY]R\\]VWHZFQ][JOWMZ\[_BB

    • madsen
      Member
      • Sep 2009
      • 10

      #3
      Hmmm, that didn't help. When opening my data with any text editor etc., I don't see the GGGG C, so I presume this is not the problem?

      • Cole Trapnell
        Senior Member
        • Nov 2008
        • 213

        #4
        Hi. Can you verify that your Bowtie index's record names contain no spaces? You can list them by typing: bowtie-inspect --names <your_index>

        There is a known interoperability bug between TopHat and Bowtie (which is fixed in the upcoming Bowtie 0.10.2) which results in behavior like this when the index has spaces in the names.

        If your index has simple names, and you are still seeing this, can you email me your logs from the run?
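The check described above can be scripted. This is a minimal sketch, assuming you have already captured the output of bowtie-inspect --names <your_index> as text (one reference name per line); the function name names_with_spaces is hypothetical and not part of Bowtie or TopHat.

```python
def names_with_spaces(names_output):
    """Return the reference names that contain whitespace.

    names_output is assumed to be the text printed by
    `bowtie-inspect --names <your_index>`, one name per line.
    """
    flagged = []
    for line in names_output.splitlines():
        name = line.strip()
        # Any internal whitespace marks a name that may trigger the bug.
        if name and any(ch.isspace() for ch in name):
            flagged.append(name)
    return flagged


# Illustrative input: one name with spaces, one simple name.
sample = "1 dna:chromosome chromosome:Sscrofa9:1:1:295534705:1\nchr2\n"
for bad_name in names_with_spaces(sample):
    print("name contains spaces:", bad_name)
```

An empty result suggests the index names are simple; any flagged name means the index may be affected by the problem described in this post.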

        • madsen
          Member
          • Sep 2009
          • 10

          #5
          Dear Cole

          Seems like the index names have spaces. I'll send the log files.

          ole

          1 dna:chromosome chromosome:Sscrofa9:1:1:295534705:1
          2 dna:chromosome chromosome:Sscrofa9:2:1:140138492:1
          3 dna:chromosome chromosome:Sscrofa9:3:1:123604780:1
          4 dna:chromosome chromosome:Sscrofa9:4:1:136259946:1
          5 dna:chromosome chromosome:Sscrofa9:5:1:100521970:1
          6 dna:chromosome chromosome:Sscrofa9:6:1:123310171:1
          7 dna:chromosome chromosome:Sscrofa9:7:1:136414062:1
          8 dna:chromosome chromosome:Sscrofa9:8:1:119990671:1
          9 dna:chromosome chromosome:Sscrofa9:9:1:132473591:1
          10 dna:chromosome chromosome:Sscrofa9:10:1:66741929:1
          11 dna:chromosome chromosome:Sscrofa9:11:1:79819395:1
          12 dna:chromosome chromosome:Sscrofa9:12:1:57436344:1
          13 dna:chromosome chromosome:Sscrofa9:13:1:145240301:1
          14 dna:chromosome chromosome:Sscrofa9:14:1:148515138:1
          15 dna:chromosome chromosome:Sscrofa9:15:1:134546103:1
          16 dna:chromosome chromosome:Sscrofa9:16:1:77440658:1
          17 dna:chromosome chromosome:Sscrofa9:17:1:64400339:1
          18 dna:chromosome chromosome:Sscrofa9:18:1:54314914:1
          X dna:chromosome chromosome:Sscrofa9:X:1:125876292:1
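Names like the ones listed above come from the FASTA header lines of the reference. One possible workaround, sketched here under assumptions, is to cut each header at the first whitespace and then rebuild the index from the cleaned FASTA with bowtie-build. The function name simplify_fasta_headers is hypothetical, and the thread does not say that this is how the problem was eventually fixed.

```python
def simplify_fasta_headers(lines):
    """Yield FASTA lines with each header cut at the first whitespace.

    A header such as ">1 dna:chromosome ..." becomes ">1".
    Sequence lines are passed through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            # split(None, 1) splits on the first run of whitespace.
            yield line.split(None, 1)[0] + "\n"
        else:
            yield line


# Illustrative in-memory example using one of the names from this post.
example = [
    ">1 dna:chromosome chromosome:Sscrofa9:1:1:295534705:1\n",
    "ACGTACGT\n",
]
print("".join(simplify_fasta_headers(example)), end="")
```

For a real reference, the same generator can be fed an open file handle and its output written with writelines to a new file. Check afterwards that the shortened names are still unique.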

          • greggrant
            Member
            • Dec 2008
            • 28

            #6
            I have run TopHat on a set of 454 runs of mouse transcripts. Oddly, it produced no junctions. Somebody else here installed TopHat a couple of weeks ago and got the exact same result from a completely different data set (Solexa data from a different lab), also against mouse. I can't find any support on this problem. Can anybody please help? We really need this to work! Thank you, Greg (Univ of Pennsylvania)

            [Mon Oct 12 15:15:08 2009] Preparing output location ./tophat_out/
            [Mon Oct 12 15:15:08 2009] Checking for Bowtie index files
            [Mon Oct 12 15:15:08 2009] Checking for reference FASTA file
            Warning: Could not find FASTA file /Applications/bowtie-0.10.0/indexes/m_musculus.fa
            [Mon Oct 12 15:15:08 2009] Reconstituting reference FASTA file from Bowtie index
            [Mon Oct 12 15:32:45 2009] Checking for Bowtie
            Bowtie version: 0.10.0.0
            [Mon Oct 12 15:32:45 2009] Checking reads
            Warning: found a read < 20bp in 4.TCA.454Reads.fna
            Warning: found a read < 20bp in 4.TCA.454Reads.fna
            seed length: 20bp
            format: fasta
            [Mon Oct 12 15:32:46 2009] Mapping reads against m_musculus with Bowtie
            [Mon Oct 12 15:33:25 2009] Joining segment hits
            [Mon Oct 12 15:33:25 2009] Searching for junctions via segment mapping
            Warning: junction database is empty!
            [Mon Oct 12 15:36:12 2009] Joining segment hits

            • Cole Trapnell
              Senior Member
              • Nov 2008
              • 213

              #7
              Two things that may be causing problems:

              1) Did you check that the Bowtie index records have no spaces in the names? If your index has spaces in the names, you should upgrade to Bowtie version 0.11.x, as we recently resolved an interoperability bug that can trigger this.

              2) Are the sequences for your 454 reads all on a single line, or does a read span more than one line? The current version of TopHat has a bug in handling FASTA or FASTQ files where the sequence record for a given read spans more than one line.

              If neither of these is the case for you, please email me the logs from the run. I'll need more information to see what's wrong.
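For point 2 above, a multi-line FASTA file can be rewritten so that every read's sequence sits on a single line. This is a minimal sketch for FASTA only (multi-line FASTQ needs more care, because quality lines can start with ">" or "@"); the function name unwrap_fasta is hypothetical.

```python
def unwrap_fasta(lines):
    """Yield FASTA lines with each record's sequence joined onto one line."""
    seq_parts = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_parts:
                yield "".join(seq_parts) + "\n"
                seq_parts = []
            yield line + "\n"
        elif line:
            # Collect sequence chunks; blank lines are skipped.
            seq_parts.append(line)
    # Emit the sequence of the final record.
    if seq_parts:
        yield "".join(seq_parts) + "\n"


# Illustrative input: the first read is wrapped over two lines.
wrapped = [">read1\n", "ACGT\n", "TTGA\n", ">read2\n", "GG\n"]
print("".join(unwrap_fasta(wrapped)), end="")
```

The generator accepts any iterable of lines, so an open file handle works as input and the result can be written out with writelines.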

              • greggrant
                Member
                • Dec 2008
                • 28

                #8
                Thank you very much for your help! I downloaded the index from the TopHat site, so I assume it is correct, and I installed TopHat and Bowtie just today, so I assume I'm up to date on versions. There are no spaces. But indeed my fasta file has multi-line records. I'm going to fix that and try again, and I'll let you know. Thanks again!!!

                • madsen
                  Member
                  • Sep 2009
                  • 10

                  #9
                  Hi All/Cole
                  Just to update on my question. It was indeed the space in the ref genome names which caused the problems. Now everything is running without any problem. Thanks to Cole for his help.

                  Ole

                  • greggrant
                    Member
                    • Dec 2008
                    • 28

                    #10
                    I fixed the multiple line thing and unfortunately it did the same thing again. Here are my files. This has my input file, the command I used (in note.txt) and the entire directory tophat_out. I installed it today with the latest 64 bit versions on a power mac g6 desktop. Thank you for any help you can provide!

                    • Cole Trapnell
                      Senior Member
                      • Nov 2008
                      • 213

                      #11
                      The index linked from the TopHat site unfortunately IS affected by the interoperability bug I mentioned above - I never had a chance to rebuild them with simpler names. I checked the logs in these files, and you appear to have Bowtie 0.10.0 installed, which will trigger the bug. Please upgrade to Bowtie 0.11.2 and give this another shot. Sorry for the inconvenience.
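The Bowtie version that TopHat picked up can be read from the run log, which prints a line of the form "Bowtie version: 0.10.0.0" as shown earlier in this thread. This is a rough sketch that extracts that version and compares it with a minimum; the function name is hypothetical, and the default minimum of 0.11.2 simply mirrors the upgrade suggested in this post.

```python
import re


def bowtie_version_from_log(log_text):
    """Return the Bowtie version in a TopHat log as a tuple of ints, or None."""
    match = re.search(r"Bowtie version:\s*([\d.]+)", log_text)
    if match is None:
        return None
    # "0.10.0.0" -> (0, 10, 0, 0); empty parts from stray dots are dropped.
    return tuple(int(part) for part in match.group(1).split(".") if part)


# Assumed threshold, taken from the advice in this post.
MINIMUM_VERSION = (0, 11, 2)

log_excerpt = "[Mon Oct 12 15:32:45 2009] Checking for Bowtie\nBowtie version: 0.10.0.0\n"
version = bowtie_version_from_log(log_excerpt)
if version is not None and version < MINIMUM_VERSION:
    print("Bowtie is older than the suggested version:", version)
```

Comparing tuples of integers avoids the trap of comparing version strings, where "0.9" would sort after "0.11".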

                      • greggrant
                        Member
                        • Dec 2008
                        • 28

                        #12
                        Thanks again for your help! I downloaded this file:

                        bowtie-0.11.2-bin-macos-10.5-x86_64.zip

                        But now when I run this version of bowtie it throws this error:

                        > bowtie
                        dyld: unknown required load command 0x80000022
                        Trace/BPT trap

                        Sorry I'm having so much trouble but I hope I've almost got it, thanks again for your help!

                        • greggrant
                          Member
                          • Dec 2008
                          • 28

                          #13
                          I tried 0.11.3 and got the same error; only 0.10.0 seems to run... What am I doing wrong?

                          > dyld: unknown required load command 0x80000022
                          > Trace/BPT trap

                          • Cole Trapnell
                            Senior Member
                            • Nov 2008
                            • 213

                            #14
                            Hmm - that's a new one. What version of OS X are you running this on?

                            • NJD
                              Junior Member
                              • Aug 2009
                              • 3

                              #15
                              I was getting the same message with bowtie-0.11.2-bin-macos-10.5-x86_64.zip. Working from source and setting BITS=64 seems to be fine. Mac OS X 10.5.8.
