  • Tophat failure: Reporting output tracks and bam_merge

    Something wrong with bam_merge?

    TopHat run (v2.0.6)
    Bowtie version: 0.12.7.0
    Samtools version: 0.1.18.0


    tophat --bowtie1 -o ./SRR486241 --solexa-quals -p 10 -g 1 --no-coverage-search --no-novel-juncs --library-type fr-firststrand -G genes.gtf GRCh37 ESC_1.fastq ESC_2.fastq





    Reporting output tracks
    [FAILED]
    Error running /usr/local/bin/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir ./SRR486241/ --max-multihits 1 --max-seg-multihits 10 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 --bowtie1 -z gzip -p10 --inner-dist-mean 50 --inner-dist-std-dev 20 --gtf-annotations genes.gtf --gtf-juncs ./SRR486241/tmp/genes.juncs --no-closure-search --no-coverage-search --no-microexon-search --solexa-quals --library-type fr-firststrand --sam-header ./SRR486241/tmp/GRCh37_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=/Users/yingtao/Jerry/Tools/samtools/samtools --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 GRCh37.fa ./SRR486241/junctions.bed ./SRR486241/insertions.bed ./SRR486241/deletions.bed ./SRR486241/fusions.out ./SRR486241/tmp/accepted_hits ./SRR486241/tmp/left_kept_reads.m2g.bam,./SRR486241/tmp/left_kept_reads.m2g_um.mapped.bam,./SRR486241/tmp/left_kept_reads.m2g_um.candidates ./SRR486241/tmp/left_kept_reads.bam ./SRR486241/tmp/right_kept_reads.m2g.bam,./SRR486241/tmp/right_kept_reads.m2g_um.mapped.bam,./SRR486241/tmp/right_kept_reads.m2g_um.candidates ./SRR486241/tmp/right_kept_reads.bam

    Error: bam_merge failed to open BAM file ./SRR486241/tmp/right_kept_reads.m2g_um.candidates3.bam


    The input files are very large: about 200 million read pairs.
    When I ran the same parameters on only the first 10,000 pairs, everything worked and there was no failure.
    Therefore, is it because the input FASTQ file is too large?
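
The small test run described above can be reproduced with a short helper. This is only a sketch: the function name `subsample_pairs` and the output prefix are made up for illustration, and it assumes plain 4-line FASTQ records with both mate files listing reads in the same order.

```shell
# Keep the first N read pairs from two mate files.
# Assumption: each FASTQ record is exactly 4 lines (no wrapped sequence lines)
# and the two files list the mates in the same order.
subsample_pairs() {
    n_pairs="$1"; left="$2"; right="$3"; prefix="$4"
    n_lines=$((n_pairs * 4))
    head -n "$n_lines" "$left"  > "${prefix}_1.fastq"
    head -n "$n_lines" "$right" > "${prefix}_2.fastq"
}

# Example (file names taken from the command in the post):
# subsample_pairs 10000 ESC_1.fastq ESC_2.fastq ESC_10k
```

Rerunning with progressively larger subsets (the sizes are up to you) may help narrow down roughly where the failure starts to appear.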
    Last edited by Jerry_Zhao; 12-13-2012, 01:39 PM.

  • #2
    >Therefore, is it because the input FASTQ file is too large?

    Well, that would be the root cause, yes. Smaller input means less load on the system overall. Whether bam_merge itself is having problems is a different issue. You could simply be running out of disk space or disk quota, hitting a network file error, running into memory constraints, etc., rather than hitting a problem in the bam_merge program per se. Are there any other error messages preceding the one that you posted?
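
A few of these possibilities can be checked quickly from the shell. The sketch below is not a TopHat feature; `check_resources` is a made-up helper name. It also checks the per-process open-file limit, which is not mentioned above but is another system limit worth ruling out, and it assumes that TopHat 2 writes its per-step logs to a `logs/` subdirectory of the output directory.

```shell
# Print free disk space, the open-file limit, and any logged errors
# for a TopHat output directory (defaults to the current directory).
check_resources() {
    out_dir="${1:-.}"
    echo "== Free disk space on the filesystem holding $out_dir =="
    df -h "$out_dir"
    echo "== Open-file limit for this shell (ulimit -n) =="
    ulimit -n
    # Assumption: TopHat 2 keeps per-step logs in <output dir>/logs.
    if [ -d "$out_dir/logs" ]; then
        echo "== Lines mentioning errors in $out_dir/logs =="
        grep -i -n "error" "$out_dir"/logs/* || echo "(none found)"
    fi
}

# Check the directory given as the first argument, or "." if none is given.
check_resources "${1:-.}"
```

For the run in this thread the call would be `check_resources ./SRR486241`. Disk quota and network file system problems need site-specific tools and are not covered here.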



    • #3
      We have a 2 TB hard drive and the input files are 67 GB + 67 GB, so disk space is unlikely to be the problem.
      We have 32 GB of memory and TopHat only uses half of it.

      I used the same parameters for several datasets with 20 million single-end reads last week, and everything was fine.


      I have changed -p 10 to -p 7 and am running TopHat again.
      I will post the result today or tomorrow.



      • #4
        any news?

        Jerry,

        I am having a similar problem with TopHat 2.0.6. I have a 2 TB hard drive and 96 GB of memory. The input files are 20 GB + 20 GB. I, too, have used the same parameter (-p 25) for smaller datasets with no problem. Did anything change when you decreased the number of threads from 10 to 7?

        Appreciate any information...



        • #5
          Hi Labrat73,

          Sorry, I forgot to update.

          I only changed -p 10 to -p 7, and no error message occurred.
          Nevertheless, I do not have high confidence in the results.

          Therefore, I am going to try STAR, another RNA-seq aligner.
          I will update after I get the STAR results and compare the two.

          Best,
          Jerry



          • #6
            Hi,

            I am running into that error for one of my samples. So the trick is to lower the number of threads. Did I read that correctly?



            • #7
              Yes.
              Reduce the number of threads (I went from -p 10 to -p 7), and there is no error message during the run.

              Nevertheless, the absence of an error message does not mean that the mapping is correct. This is just my opinion.
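
For reference, the workaround discussed in this thread amounts to rerunning the original command with a lower `-p` value. The wrapper below is only an illustrative sketch: `run_tophat` and the `DRY_RUN` switch are made-up names, and every TopHat option is copied unchanged from the first post.

```shell
# Rerun the TopHat command from the first post with a chosen thread count.
# Only the -p value changes; all other options are as originally posted.
run_tophat() {
    threads="$1"
    # If DRY_RUN is set to a non-empty value, the command is printed, not run.
    ${DRY_RUN:+echo} tophat --bowtie1 -o ./SRR486241 --solexa-quals -p "$threads" \
        -g 1 --no-coverage-search --no-novel-juncs --library-type fr-firststrand \
        -G genes.gtf GRCh37 ESC_1.fastq ESC_2.fastq
}

# Example: print the command that would be run with 7 threads.
# DRY_RUN=1; run_tophat 7
```

As noted above, a run that completes without errors still needs its output checked before the results are trusted.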



              • #8
                Yup, the job went through without crashing.
