
  • TopHat v2.0.8b Error (tophat_reports & long_spanning_reads)

    Hi All,

    I am having a recurrent error with the new versions of TopHat and Bowtie (2.0.8b and 2.1.0, respectively). I had been using previous releases without a problem. Since our system administrator updated our Linux cluster (system info below) to the new releases of TopHat and Bowtie, every run fails with one of the two error messages below (from different runs). The system administrator has rebuilt the Bowtie indexes and reinstalled the TopHat and Bowtie binaries, without success (same errors). Below I have copied the two error messages from my runs, the kernel errors that the system administrators see when the job terminates, and the system info.

    I've searched SeqAnswers exhaustively, and these same errors have been reported by multiple users on earlier versions of TopHat. I have tried all of the suggestions, such as changing parameters, allocating more memory, and using earlier versions, but nothing has worked. In one of the threads, a poster says the developers released a new version that fixed the bug (see http://seqanswers.com/forums/showthr...tophat_reports and more discussion in http://seqanswers.com/forums/showthread.php?t=20538 ), but that was a few versions before the current release that is crashing for me. Any idea why the bug would resurface in a new version?

    It makes me wonder if it is a user error (me) that is causing the problem. I've reported the issue to the developers, but in the meantime, I was hoping someone here would give me (and others with the same issue), a little insight into the cause of the problem (is it something as a user I can fix or is it really a bug?). Thanks so much in advance!

    Cheers,

    Katie


    DATA
    100bp single-end Illumina reads


    COMMAND RUN
    tophat -p 8 -G /gpfs/group/databases/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf -a 5 --coverage-search --microexon-search --library-type fr-unstranded -o /gpfs/group/su/kfisch/OA/results/mRNA/alignments/tophat_out /gpfs/group/databases/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome /gpfs/group/su/kfisch/OA/data/rawdata/R85L2/Normal1.fastq


    ERROR 1

    [2013-05-06 12:59:54] Beginning TopHat run (v2.0.8b)
    -----------------------------------------------
    [2013-05-06 12:59:54] Checking for Bowtie
    Bowtie version: 2.1.0.0
    [2013-05-06 12:59:54] Checking for Samtools
    Samtools version: 0.1.18.0
    [2013-05-06 12:59:55] Checking for Bowtie index files
    [2013-05-06 12:59:55] Checking for reference FASTA file
    [2013-05-06 12:59:55] Generating SAM header for /gpfs/group/databases/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome
    format: fastq
    quality scale: phred33 (default)
    [2013-05-06 13:02:00] Reading known junctions from GTF file
    [2013-05-06 13:02:06] Preparing reads
    left reads: min. length=100, max. length=100, 5492656 kept reads (26291 discarded)
    [2013-05-06 13:04:06] Creating transcriptome data files..
    [2013-05-06 13:06:00] Building Bowtie index from genes.fa
    [2013-05-06 13:17:23] Mapping left_kept_reads to transcriptome genes with Bowtie2
    [2013-05-06 13:32:43] Resuming TopHat pipeline with unmapped reads
    [2013-05-06 13:32:43] Mapping left_kept_reads.m2g_um to genome genome with Bowtie2
    [2013-05-06 14:00:59] Mapping left_kept_reads.m2g_um_seg1 to genome genome with Bowtie2 (1/4)
    [2013-05-06 14:04:40] Mapping left_kept_reads.m2g_um_seg2 to genome genome with Bowtie2 (2/4)
    [2013-05-06 14:08:19] Mapping left_kept_reads.m2g_um_seg3 to genome genome with Bowtie2 (3/4)
    [2013-05-06 14:11:57] Mapping left_kept_reads.m2g_um_seg4 to genome genome with Bowtie2 (4/4)
    [2013-05-06 14:15:46] Searching for junctions via segment mapping
    Coverage-search algorithm is turned on, making this step very slow
    Please try running TopHat again with the option (--no-coverage-search) if this step takes too much time or memory.
    [2013-05-06 22:12:32] Retrieving sequences for splices
    [2013-05-06 22:17:35] Indexing splices
    [2013-05-06 22:43:53] Mapping left_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/4)
    [2013-05-06 22:59:31] Mapping left_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/4)
    [2013-05-06 23:14:37] Mapping left_kept_reads.m2g_um_seg3 to genome segment_juncs with Bowtie2 (3/4)
    [2013-05-06 23:30:31] Mapping left_kept_reads.m2g_um_seg4 to genome segment_juncs with Bowtie2 (4/4)
    [2013-05-06 23:45:46] Joining segment hits
    [2013-05-07 00:02:15] Reporting output tracks
    [FAILED]
    Error running /gpfs/home/applications/tophat/2.0.8b/bin/tophat_reports --min-anchor 5 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p8 --gtf-annotations /gpfs/group/databases/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf --gtf-juncs /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/tmp/genes.juncs --no-closure-search --library-type fr-unstranded --sam-header /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/tmp/genome_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=/opt/applications/samtools/0.1.18/gnu/bin/samtools --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 /gpfs/group/databases/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome.fa /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/junctions.bed /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/insertions.bed /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/deletions.bed /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/fusions.out /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/tmp/accepted_hits /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/tmp/left_kept_reads.m2g.bam,/gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/tmp/left_kept_reads.m2g_um.mapped.bam,/gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/tmp/left_kept_reads.m2g_um.candidates /gpfs/group/su/kfisch/OA/results/mRNA/alignments/Normal_Cart_1_1/tmp/left_kept_reads.bam
    Loaded 216457 junctions
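
For anyone comparing runs, it can help to see where the wall-clock time goes before the crash. The log above warns that coverage search makes the junction step very slow, and the timestamps bear that out. Below is a minimal Python sketch that computes the time between consecutive stamped lines. It assumes the `[YYYY-MM-DD HH:MM:SS] message` format seen in this log; the function name `step_durations` and the `SAMPLE` excerpt are only illustrative and not part of TopHat.

```python
import re
from datetime import datetime

# Matches TopHat log lines of the form "[YYYY-MM-DD HH:MM:SS] message".
STAMP = re.compile(r"^\s*\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s*(.*)$")


def step_durations(log_text):
    """Return (message, timedelta) pairs: time from each stamped line to the next."""
    steps = []
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            steps.append((when, match.group(2)))
    # The last stamped line has no successor, so it gets no duration.
    return [(msg, nxt - when) for (when, msg), (nxt, _) in zip(steps, steps[1:])]


# Four lines copied from the ERROR 1 log above.
SAMPLE = """\
[2013-05-06 14:11:57] Mapping left_kept_reads.m2g_um_seg4 to genome genome with Bowtie2 (4/4)
[2013-05-06 14:15:46] Searching for junctions via segment mapping
[2013-05-06 22:12:32] Retrieving sequences for splices
[2013-05-06 22:17:35] Indexing splices
"""

for message, elapsed in step_durations(SAMPLE):
    print(f"{elapsed}  {message}")
```

On this excerpt the junction search accounts for 7:56:46, which is most of the run. That says nothing about the cause of the crash, only that each failed attempt is expensive to repeat with `--coverage-search` enabled.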


    ERROR 2

    [2013-05-06 13:51:01] Beginning TopHat run (v2.0.8b)
    -----------------------------------------------
    [2013-05-06 13:51:01] Checking for Bowtie
    Bowtie version: 2.1.0.0
    [2013-05-06 13:51:01] Checking for Samtools
    Samtools version: 0.1.18.0
    [2013-05-06 13:51:01] Checking for Bowtie index files
    [2013-05-06 13:51:01] Checking for reference FASTA file
    [2013-05-06 13:51:01] Generating SAM header for /gpfs/group/databases/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome
    format: fastq
    quality scale: phred33 (default)
    [2013-05-06 13:53:28] Reading known junctions from GTF file
    [2013-05-06 13:53:33] Preparing reads
    left reads: min. length=100, max. length=100, 12378532 kept reads (346481 discarded)
    [2013-05-06 13:58:21] Creating transcriptome data files..
    [2013-05-06 14:00:34] Building Bowtie index from genes.fa
    [2013-05-06 14:12:37] Mapping left_kept_reads to transcriptome genes with Bowtie2
    [2013-05-06 15:14:18] Resuming TopHat pipeline with unmapped reads
    [2013-05-06 15:14:18] Mapping left_kept_reads.m2g_um to genome genome with Bowtie2
    [2013-05-06 16:25:21] Mapping left_kept_reads.m2g_um_seg1 to genome genome with Bowtie2 (1/4)
    [2013-05-06 16:33:08] Mapping left_kept_reads.m2g_um_seg2 to genome genome with Bowtie2 (2/4)
    [2013-05-06 16:39:51] Mapping left_kept_reads.m2g_um_seg3 to genome genome with Bowtie2 (3/4)
    [2013-05-06 16:45:36] Mapping left_kept_reads.m2g_um_seg4 to genome genome with Bowtie2 (4/4)
    [2013-05-06 16:50:57] Searching for junctions via segment mapping
    Coverage-search algorithm is turned on, making this step very slow
    Please try running TopHat again with the option (--no-coverage-search) if this step takes too much time or memory.
    [2013-05-07 05:05:23] Retrieving sequences for splices
    [2013-05-07 05:14:27] Indexing splices
    [2013-05-07 05:45:12] Mapping left_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/4)
    [2013-05-07 06:06:02] Mapping left_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/4)
    [2013-05-07 06:25:11] Mapping left_kept_reads.m2g_um_seg3 to genome segment_juncs with Bowtie2 (3/4)
    [2013-05-07 06:43:05] Mapping left_kept_reads.m2g_um_seg4 to genome segment_juncs with Bowtie2 (4/4)
    [2013-05-07 07:00:28] Joining segment hits
    [FAILED]
    Error running 'long_spanning_reads':Warning: 6517936 malformed closure


    ERROR LOGS FROM CLUSTER ADMINISTRATOR

    Job 1125636.garibaldi01-adm.cluster.net;user=kfisch group=su jobname=TH_Normal_Cart_

    May 7 00:07:22 nodea1329.cluster.net kernel: [1782648.116483] tophat_reports[19181]: segfault at 18 ip 000000000041032a sp 00007fff9de41c20 error 4 in tophat_reports[400000+117000]

    1125639.garibaldi01-adm.cluster.net;user=kfisch group=su jobname=TH_Normal_Cart_

    May 7 03:43:30 nodea1418.cluster.net kernel: [411648.947929] long_spanning_r[31143]: segfault at 8 ip 000000000042c96f sp 00007f635b46f520 error 4 in long_spanning_reads[400000+10b000]

    1125640.garibaldi01-adm.cluster.net;user=kfisch group=su jobname=TH_Normal_Cart_

    May 7 07:09:44 nodea1428.cluster.net kernel: [4960730.719744] long_spanning_r[14136]: segfault at 8 ip 000000000042c96f sp 00007f9b35bf8520 error 4 in long_spanning_reads[400000+10b000]
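
The kernel lines above carry a bit more information than "it segfaulted". Here is a small Python sketch that pulls out the fields and computes the offset of the faulting instruction inside the binary (instruction pointer minus the load address in the square brackets). The regular expression is my own reading of the line format shown here, not an official parser. The error-code bits are decoded according to the x86 page-fault convention (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user mode).

```python
import re

# Field layout assumed from the kernel messages quoted above.
SEGFAULT = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<image>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract the fault details from one kernel segfault line, or return None."""
    match = SEGFAULT.search(line)
    if match is None:
        return None
    err = int(match["err"])
    return {
        "image": match["image"],
        "fault_address": int(match["addr"], 16),
        # Offset of the faulting instruction from the start of the mapped binary.
        "offset_in_image": int(match["ip"], 16) - int(match["base"], 16),
        "page_not_present": not (err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


# Two of the lines reported by the cluster administrators (hostnames trimmed).
LINES = [
    "kernel: [1782648.116483] tophat_reports[19181]: segfault at 18 ip "
    "000000000041032a sp 00007fff9de41c20 error 4 in tophat_reports[400000+117000]",
    "kernel: [411648.947929] long_spanning_r[31143]: segfault at 8 ip "
    "000000000042c96f sp 00007f635b46f520 error 4 in long_spanning_reads[400000+10b000]",
]

for line in LINES:
    info = parse_segfault(line)
    print(info["image"], hex(info["fault_address"]), hex(info["offset_in_image"]))
```

Both faults are user-mode reads at very low addresses (0x18 and 0x8), which is the pattern you would typically see when code reads a field through a null pointer. The two `long_spanning_reads` crashes also hit the same instruction address, so the failure looks deterministic rather than random. That is only an inference from the log lines; it does not say which input triggered it.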


    SYSTEM INFO

    # uname -a
    Linux nodea1329 2.6.34-12-desktop #1 SMP PREEMPT 2010-06-29 02:39:08 +0200 x86_64 x86_64 x86_64 GNU/Linux
    # cat /etc/SuSE-release
    openSUSE 11.3 (x86_64)
    VERSION = 11.3
    # gcc --version
    gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292] Copyright (C) 2010 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    # cat /proc/cpuinfo (2 x eight-core machines) [...]
    processor : 15
    vendor_id : GenuineIntel
    cpu family : 6
    model : 45
    model name : Intel(R) Xeon(R) CPU E5-2450 0 @ 2.10GHz
    stepping : 7
    cpu MHz : 2099.859
    cache size : 20480 KB
    physical id : 1
    siblings : 8
    core id : 7
    cpu cores : 8
    apicid : 46
    initial apicid : 46
    fpu : yes
    fpu_exception : yes
    cpuid level : 13
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm ida arat tpr_shadow vnmi flexpriority ept vpid
    bogomips : 4199.44
    clflush size : 64
    cache_alignment : 64
    address sizes : 46 bits physical, 48 bits virtual
    power management:

    # free
                 total       used       free     shared    buffers     cached
    Mem:      49482780     605996   48876784          0     128544     175908
    -/+ buffers/cache:     301544   49181236
    Swap:      4200992      19492    4181500

  • #2
    **UPDATE: Problem Fixed!**

    Hi All,

    I fixed the TopHat 2.0.8b crash that I was experiencing, so I wanted to update my post in case this helps anyone else. I had our system administrator reinstall the UCSC hg19 iGenome, and running TopHat with the new genome fixed the problem. We didn’t make any changes to the iGenome prior to the crash of TopHat (and it doesn’t look like it has been updated since last year on the iGenome website), so I’m not sure why this caused TopHat to crash. Either way, TopHat is not throwing the error messages now and is completing normally.

    Cheers,

    Katie
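
Since reinstalling the genome was what fixed it, one cheap sanity check before a long run is to confirm that the reference FASTA agrees with the SAM header TopHat generated from the Bowtie2 index (in the run above, `genome.fa` and `tmp/genome_genome.bwt.samheader.sam`). The Python sketch below compares sequence names and lengths between the two. It is only one possible consistency check, and nothing in this thread establishes what was actually wrong with the original files. The function names and the toy demo files are made up for illustration.

```python
import os
import tempfile


def fasta_lengths(path):
    """Map each sequence name in a FASTA file to its length in bases."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def sam_header_lengths(path):
    """Map each @SQ sequence name (SN) in a SAM header to its declared length (LN)."""
    lengths = {}
    with open(path) as handle:
        for line in handle:
            if not line.startswith("@SQ"):
                continue
            fields = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
            lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def mismatches(fasta_path, sam_header_path):
    """List (name, fasta_length, header_length) for every sequence that disagrees."""
    fasta = fasta_lengths(fasta_path)
    header = sam_header_lengths(sam_header_path)
    return [
        (name, fasta.get(name), header.get(name))
        for name in sorted(set(fasta) | set(header))
        if fasta.get(name) != header.get(name)
    ]


# Toy demo: chr2 is 2 bases long in the FASTA but declared as 3 in the header.
with tempfile.TemporaryDirectory() as tmp:
    fa = os.path.join(tmp, "demo.fa")
    sam = os.path.join(tmp, "demo.samheader.sam")
    with open(fa, "w") as out:
        out.write(">chr1 toy\nACGT\nAC\n>chr2\nGG\n")
    with open(sam, "w") as out:
        out.write("@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr1\tLN:6\n@SQ\tSN:chr2\tLN:3\n")
    demo_problems = mismatches(fa, sam)

print(demo_problems)
```

An empty list means the names and lengths agree. A missing sequence shows up with `None` on one side. A truncated or partially copied FASTA would be caught this way, but a file with the right lengths and wrong bases would not.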



    • #3
      I am having this same error, but it sounds like I will have to solve it differently (I don't think we're using "iGenome," so I guess I can't just re-install that to fix things, unfortunately).

      I am using Tophat 2.0.9, downloaded directly from the web site (not compiled locally!).

      Code:
      [2013-08-07 09:29:12] Joining segment hits
      [2013-08-07 09:33:53] Reporting output tracks
              [FAILED]
      Error running /work/Apps/Bio/tophat/tophat-2.0.9/tophat_reports --min-anchor 8 --splice-mismatches 2 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p6 --inner-dist-mean 150 --inner-dist-std-dev 20 --gtf-annotations tophat_index_cr3/index.gff --gtf-juncs Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/index.juncs --no-closure-search --no-coverage-search --sam-header Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/hg19_genome.bwt.samheader.sam --samtools=/work/Apps/bin/samtools --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 /work/Common/Data/Bowtie_Indexes/hg19.fa Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/junctions.bed Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/insertions.bed Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/deletions.bed Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/fusions.out Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/accepted_hits Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/left_kept_reads.m2g.bam,Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/left_kept_reads.m2g_um.mapped.bam,Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/left_kept_reads.m2g_um.candidates Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/left_kept_reads.bam Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/right_kept_reads.m2g.bam,Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/right_kept_reads.m2g_um.mapped.bam,Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/right_kept_reads.m2g_um.candidates Tophat_New_1_FASTA_FILES/Sample_CR3_D00/CR3_D0_ATCACG_L1_PAIRED/tmp/right_kept_reads.bam
      Loaded 314277 junctions



      • #4
        Apparently I was using iGenomes after all, without realizing. It seems to be from this page: http://cufflinks.cbcb.umd.edu/igenomes.html

        I am going to download the files from here and see if that fixes things.
