  • altodor
    Member
    • Nov 2009
    • 12

    TopHat - empty junction database

    Hi all,

    I ran TopHat, it aligned reads but couldn't find any junctions! So I received files with left-hand reads and right-hand reads (around 30 mln reads) , but the junction database is empy. What can be the problem?

    Command line:
    /home/programs/tophat/bin/tophat -r 200 -p 8 -o tophatout-11-3 --segment-length 24 --keep-tmp /home/genomes/bowtie/hg19 SRR027870_1.fastq SRR027870_2.fastq

    Read length is 45 bp. I tried --segment-length 24, as suggested in previous threads, but it had no effect.

    Here is some information:
    File segment_junc:
    segment_juncs v1.1.2 (exported)
    ---------------------------
    Loading reference sequences...
    Loading ...done
    Loading left segment hits... Processed 500000 root segment groups
    Microaligned 0 segments
    done.
    Loading right segment hits... Processed 500000 root segment groups
    Processed 1000000 root segment groups
    Processed 1500000 root segment groups
    Microaligned 0 segments
    done.
    Found 0 potential split-segment junctions
    Indexing extensions in /tophatout-11-3/tmp/left_kept_reads_seg1_missing.fq
    Indexing extensions in /tophatout-11-3/tmp/left_kept_reads_seg2_missing.fq
    Indexing extensions in /tophatout-11-3/tmp/right_kept_reads_seg1_missing.fq
    Indexing extensions in /tophatout-11-3/tmp/right_kept_reads_seg2_missing.fq
    Total extensions: 72218724
    Total extensions: 72218724
    Total extensions: 72218724
    Total extensions: 72218724
    Looking for junctions by island end pairings
    Adding hits from segment file 0 to coverage map
    Adding hits from segment file 1 to coverage map
    Adding hits from segment file 2 to coverage map
    Adding hits from segment file 3 to coverage map
    Map covers 31687325 bases
    Map covers 30905855 bases in sufficiently long segments
    Map contains 777008 good islands
    38604205 are left looking bases
    38603996 are right looking bases
    Collecting potential splice sites in islands
    reporting synthetic splice junctions...
    Found 0 potential island-end pairing junctions
    done
    -- seg --
    -- done --
    -- cov --
    -- done --
    -- buf --
    -- done --
    Reporting potential splice junctions...done
    Reported 0 total possible splices


    Report from TopHat:

    [Wed Jan 12 12:59:35 2011] Beginning TopHat run (v1.1.2)
    -----------------------------------------------
    [Wed Jan 12 12:59:35 2011] Preparing output location /tophatout-11-3/
    [Wed Jan 12 12:59:35 2011] Checking for Bowtie index files
    [Wed Jan 12 12:59:35 2011] Checking for reference FASTA file
    [Wed Jan 12 12:59:35 2011] Checking for Bowtie
    Bowtie version: 0.12.7.0
    [Wed Jan 12 12:59:36 2011] Checking for Samtools
    Samtools version: 0.1.10.0
    [Wed Jan 12 13:00:41 2011] Checking reads
    min read length: 45bp, max read length: 45bp
    format: fastq
    quality scale: phred33 (default)
    [Wed Jan 12 13:06:58 2011] Mapping reads against hg19 with Bowtie
    [Wed Jan 12 13:18:52 2011] Joining segment hits
    [Wed Jan 12 13:21:33 2011] Mapping reads against hg19 with Bowtie(1/2)
    [Wed Jan 12 13:25:00 2011] Mapping reads against hg19 with Bowtie(2/2)
    [Wed Jan 12 13:29:56 2011] Mapping reads against hg19 with Bowtie
    [Wed Jan 12 13:41:14 2011] Joining segment hits
    [Wed Jan 12 13:43:53 2011] Mapping reads against hg19 with Bowtie(1/2)
    [Wed Jan 12 13:48:26 2011] Mapping reads against hg19 with Bowtie(2/2)
    [Wed Jan 12 13:55:16 2011] Searching for junctions via segment mapping
    Warning: junction database is empty!
    [Wed Jan 12 13:58:43 2011] Joining segment hits
    [Wed Jan 12 14:01:41 2011] Joining segment hits
    [Wed Jan 12 14:04:35 2011] Reporting output tracks
    -----------------------------------------------
    Run complete [01:19:07 elapsed]


    Any help is much appreciated!
  • Camg
    Member
    • Jan 2011
    • 21

    #2
    Empty junction file

    Hi,

    I'm having the same problem. I've tried using different parameters in a few different combinations and I'm still getting nothing. As you can see in the command below I've relaxed some of the parameters for finding junctions, I changed the --segment-length as people have recommended, and I used the --butterfly-search which is supposed to run a more sensitive search for junctions.

    I've also tried giving tophat a splice junction file as a .junc or .gtf file for a few of the introns that I'm interested in, and I got an error saying the junction file is empty. This is a separate issue though.

    FYI: Dataset is Illumina single reads (~30M) of 100bp trimmed to 85bp.

    Command line: tophat -p 4 -g 1 -a 4 -m 1 -F 0 -i 10 -I 200 --segment-length 24 --butterfly-search -o /media/Data_1/Cam/genome/trimmedreads/goe_1.9/ /media/Data_1/Cam/genome/trimmedreads/ecun /media/Data_1/Cam/genome/GOE1_trimmed85.fastq

    Tophat report:

    [Tue Jan 18 15:22:33 2011] Beginning TopHat run (v1.1.4)
    -----------------------------------------------
    [Tue Jan 18 15:22:33 2011] Preparing output location /media/Data_1/Cam/genome/trimmedreads/goe_1.9///
    [Tue Jan 18 15:22:33 2011] Checking for Bowtie index files
    [Tue Jan 18 15:22:33 2011] Checking for reference FASTA file
    Warning: Could not find FASTA file /media/Data_1/Cam/genome/trimmedreads/ecun.fa
    [Tue Jan 18 15:22:33 2011] Reconstituting reference FASTA file from Bowtie index
    [Tue Jan 18 15:22:33 2011] Checking for Bowtie
    Bowtie version: 0.12.7.0
    [Tue Jan 18 15:22:33 2011] Checking for Samtools
    Samtools version: 0.1.8.0
    [Tue Jan 18 15:22:33 2011] Checking reads
    min read length: 85bp, max read length: 85bp
    format: fastq
    quality scale: phred33 (default)
    [Tue Jan 18 15:34:57 2011] Mapping reads against ecun with Bowtie
    [Tue Jan 18 15:40:30 2011] Joining segment hits
    [Tue Jan 18 15:49:20 2011] Mapping reads against ecun with Bowtie(1/3)
    [Tue Jan 18 15:53:02 2011] Mapping reads against ecun with Bowtie(2/3)
    [Tue Jan 18 15:56:53 2011] Mapping reads against ecun with Bowtie(3/3)
    [Tue Jan 18 16:00:31 2011] Searching for junctions via segment mapping
    Warning: junction database is empty!
    [Tue Jan 18 16:10:39 2011] Joining segment hits
    [Tue Jan 18 16:12:14 2011] Reporting output tracks
    -----------------------------------------------
    Run complete [00:51:53 elapsed]

    It seems like this is a fairly common problem. Does anyone have any thoughts?

    Thanks


    • altodor
      Member
      • Nov 2009
      • 12

      #3
      Check your reference .fa file. Mine was empty; maybe that was the problem.
      I rebuilt it manually using bowtie-inspect (it has an option for this; see the bowtie-inspect help).
      Now everything works.
      Did it help you?
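
For anyone who wants to run the check described in this reply, here is a minimal sketch. It assumes TopHat looks for the FASTA at the index prefix plus ".fa" (as the "Could not find FASTA file .../ecun.fa" warning earlier in the thread suggests). The helper name fasta_ok and the INDEX_PREFIX variable are made up for illustration. The script only prints the bowtie-inspect command instead of running it.

```shell
#!/bin/sh
# Return 0 if the FASTA exists, is non-empty, and has at least one '>' header line.
fasta_ok() {
    fa="$1"
    [ -s "$fa" ] || return 1     # missing or zero bytes
    grep -q '^>' "$fa"           # at least one sequence header
}

# Assumption: index prefix taken from the first post; override with INDEX_PREFIX.
index_prefix="${INDEX_PREFIX:-/home/genomes/bowtie/hg19}"
fasta="${index_prefix}.fa"

if fasta_ok "$fasta"; then
    echo "reference FASTA looks usable: $fasta"
else
    echo "reference FASTA missing or empty: $fasta"
    # bowtie-inspect writes the indexed sequences as FASTA to stdout by default.
    echo "to rebuild: bowtie-inspect $index_prefix > $fasta"
fi
```

If the check fails, running the printed command from a shell with write access to the index directory should recreate the file. Rerun TopHat afterwards.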


      • Camg
        Member
        • Jan 2011
        • 21

        #4
        Re: empty junction file

        Thanks for the reply altodor.

        When I changed the name of my ref.fa file to match the index name, TopHat actually found a few junctions, so thanks for that suggestion. However, it only found 2 junctions when there are over 30, and the boundaries of the ones it found are not perfect. So I still have to figure out why it won't recognize most of the splice boundaries.
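
The rename described above can also be done without moving the original file. This is a rough sketch, assuming the FASTA must sit next to the index as the index prefix plus ".fa". The function name link_fasta_to_index is hypothetical, and the source path should be absolute so the symlink resolves.

```shell
#!/bin/sh
# Expose an existing FASTA under the name derived from the Bowtie index prefix.
# Usage: link_fasta_to_index /abs/path/to/ref.fa /abs/path/to/index_prefix
link_fasta_to_index() {
    src="$1"
    prefix="$2"
    target="${prefix}.fa"
    # Refuse to link a missing or empty source file.
    [ -s "$src" ] || { echo "source FASTA missing or empty: $src" >&2; return 1; }
    # Leave an existing target alone rather than overwrite it.
    if [ -e "$target" ]; then
        echo "already present: $target"
        return 0
    fi
    ln -s "$src" "$target" && echo "linked $target -> $src"
}
```

For the index in this thread, the second argument would be /media/Data_1/Cam/genome/trimmedreads/ecun. The first argument is wherever your own ref.fa lives.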
