
  • Error: tagBam call failed when running pe_utils.py --compute-insert-len

    Dear All:

    I am using MISO to test for differential exon usage between a control and a treatment group. I got an error when computing the insert length distribution using pe_utils.py --compute-insert-len. I list the steps I used below:

    1. sort the BAM file from TopHat (by coordinate):
    samtools sort control.bam control_sorted

    2. index the BAM file:
    samtools index control_sorted.bam control_sorted.bai

    3. run pe_utils.py:
    python pe_utils.py --compute-insert-len control.bam /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff --output-dir /directories/insert-dist/

    After the command above, I got the error message:

    Preparing to call bedtools 'tagBam'
    tagBam -i control.bam -files /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff -labels gff -intervals -f 1 | samtools view - -h | egrep '^@|:gff:' | samtools view - -Shb -o /directories/insert-dist/bam2gff_Homo_sapiens.GRCh37.65.min_1000.const_exons.gff/control.bam
    [samopen] SAM header is present: 25 sequences.
    [sam_read1] reference 'ID:TopHat CL:/informatics/tools/Linux-AS5/bin/tophat -o Lane3 -g 1 --coverage-search --microexon -r 100 --phred64-quals --library-type fr-unstranded -p 4 -G gene_models/Homo_sapiens.GRCh37.72_norm.gtf --transcriptome-index=gene_models/transcripts /directories/Genomes/NCBI_Jul-09-2012/Human/bowtie/human_ref_genome Lane3_1.fq.gz Lane3_2.fq.gz VN:1.4.1
    ' is recognized as '*'.
    [main_samview] truncated file.
    Traceback (most recent call last):
    File "/pe_utils.py", line 520, in <module>
    main()
    File "pe_utils.py", line 517, in main
    sd_max=sd_max)
    File "pe_utils.py", line 271, in compute_insert_len
    output_dir)
    File "exon_utils.py", line 185, in map_bam2gff
    raise Exception, "Error: tagBam call failed."
    Exception: Error: tagBam call failed.

    I used Homo_sapiens.GRCh37.72_norm.gtf from Ensembl as the annotation file when preparing my data, but I downloaded "Human genome (hg19) alternative events v2.0" from the MISO website and unzipped it. I see that it is based on Homo_sapiens.GRCh37.65. Is the version mismatch the problem? If so, could anyone provide the latest GFF3 file? Thank you for your suggestions!

  • #2
    I'm actually having the same problem with MISO. I also haven't been able to find anything as to why this happens, so if anyone could give us some insight, that would be great.

    For now I'm trying to do the same thing using Bowtie and Picard-tools (outlined here: http://vinaykmittal.blogspot.ca/2012...or-paired.html ) but I'm sure it would be much easier using MISO's function...
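For a quick sanity check of the insert length distribution, here is a rough sketch that reads SAM text records and summarizes the TLEN column (field 9) for properly paired reads. It is not what MISO or Picard compute: MISO restricts itself to reads on long constitutive exons so that introns do not inflate the estimate, while this sketch uses every proper pair it sees. The function names and the toy records are made up for illustration; in practice the lines could come from something like `samtools view control_sorted.bam`.

```python
import math

PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned


def insert_lengths(sam_lines):
    """Yield positive TLEN values from properly paired SAM records."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        tlen = int(fields[8])
        # Each pair is counted once: only the leftmost mate has TLEN > 0.
        if flag & PROPER_PAIR and tlen > 0:
            yield tlen


def mean_and_sd(values):
    """Return (mean, population standard deviation), or None if empty."""
    values = list(values)
    if not values:
        return None
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def record(name, flag, tlen):
    """Build a toy SAM record (hypothetical values, for illustration only)."""
    return "\t".join([name, str(flag), "chr10", "100", "50", "4M",
                      "=", "300", str(tlen), "ACGT", "IIII"])


toy_sam = [
    "@SQ\tSN:chr10\tLN:1000",
    record("pair1", 99, 200),    # proper pair, leftmost mate
    record("pair1", 147, -200),  # its mate, negative TLEN, skipped
    record("pair2", 99, 220),    # proper pair, leftmost mate
    record("pair3", 65, 500),    # paired but not proper, skipped
]

stats = mean_and_sd(insert_lengths(toy_sam))
print(stats)
```

On spliced RNA-seq alignments the TLEN of a pair spanning an intron includes the intron, so numbers from this sketch should be treated as an upper-biased estimate unless the reads are first restricted to long exons.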

    • #3
      Hi @space_monkey:

      Here are some comments. I contacted the authors and sent them my partial output; their reply is below:

      It does look like a headers mismatch then. Your BAM file contains "chr" style chromosomes (e.g. "chr10" and not "10"). I believe your GFF, /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff, is from Ensembl which would not contain chr-prefixes. Just look in that gff file and see what the chromosome entries are like. If they don't have chr, the operation will fail. All you need to do is generate a constitutive exons file from a UCSC gff which contains chromosome headers that match your .bam file.

      See:



      If you're using hg19, you can use this GFF:



      Use our exon_utils program to generate constitutive exons from this file and then rerun pe_utils with that, instead of the GRCh37 gff file.
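The naming check described in the reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the header text is assumed to be what `samtools view -H control.bam` prints, and the toy inputs below are made up. An empty intersection means no GFF record can overlap any read, which matches the failure above.

```python
def sam_header_chroms(header_text):
    """Collect reference names from the @SQ lines of a SAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def gff_chroms(gff_text):
    """Collect sequence IDs (column 1) from GFF records, skipping comments."""
    names = set()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def shared_chroms(header_text, gff_text):
    """Return the chromosome names present in both the BAM header and the GFF."""
    return sam_header_chroms(header_text) & gff_chroms(gff_text)


# Toy inputs (hypothetical): a UCSC-style BAM header and an Ensembl-style GFF.
toy_header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:chr10\tLN:1000\n"
    "@SQ\tSN:chr11\tLN:1000\n"
)
toy_gff = (
    "##gff-version 3\n"
    "10\tensembl\texon\t100\t1200\t.\t+\t.\tID=exon1\n"
)

print(shared_chroms(toy_header, toy_gff))  # prints set(): no names in common
```

If the intersection is empty or much smaller than expected, the BAM and GFF use different naming schemes and the GFF should be replaced with one that matches, as the reply suggests.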

      I switched to ensGene.gff3 and that works. However, I still cannot get the results (not sure why...)
