Dear All:
I am using MISO to test for differential exon usage between a control and a treatment group. I got an error when computing the insert length distribution using pe_utils.py --compute-insert-len. I list the steps I used below:
1. sort the BAM file from TopHat (by coordinate):
samtools sort control.bam control_sorted
2. index the BAM file:
samtools index control_sorted.bam control_sorted.bai
3. run pe_utils.py:
python pe_utils.py --compute-insert-len control.bam /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff --output-dir /directories/insert-dist/
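For anyone who wants to reproduce this, the three steps can be written as a small Python sketch. This is only a convenience wrapper around the exact commands listed in steps 1-3; the function names `build_commands` and `run_all` are my own, and the input name `control.bam` for step 3 is taken from the tagBam line in the log. Nothing is executed unless `run_all` is called.

```python
import subprocess


def build_commands(bam, sorted_prefix, exons_gff, output_dir):
    """Return the commands from steps 1-3 as argument lists (nothing is run here)."""
    sorted_bam = sorted_prefix + ".bam"
    return [
        # Step 1: coordinate sort; this samtools syntax takes an output prefix.
        ["samtools", "sort", bam, sorted_prefix],
        # Step 2: index the sorted BAM file.
        ["samtools", "index", sorted_bam, sorted_prefix + ".bai"],
        # Step 3: compute the insert length distribution on constitutive exons.
        ["python", "pe_utils.py", "--compute-insert-len", bam, exons_gff,
         "--output-dir", output_dir],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        print(" ".join(cmd))
        subprocess.check_call(cmd)


commands = build_commands(
    "control.bam",
    "control_sorted",
    "/directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff",
    "/directories/insert-dist/",
)
# run_all(commands)  # uncomment to execute; requires samtools and MISO installed
```

Step 3, the pe_utils.py call, is the one that fails for me.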
After the command above, I got the error message:
Preparing to call bedtools 'tagBam'
tagBam -i control.bam -files /directories/exons/Homo_sapiens.GRCh37.65.min_1000.const_exons.gff -labels gff -intervals -f 1 | samtools view - -h | egrep '^@|:gff:' | samtools view - -Shb -o /directories/insert-dist/bam2gff_Homo_sapiens.GRCh37.65.min_1000.const_exons.gff/control.bam
[samopen] SAM header is present: 25 sequences.
[sam_read1] reference 'ID:TopHat CL:/informatics/tools/Linux-AS5/bin/tophat -o Lane3 -g 1 --coverage-search --microexon -r 100 --phred64-quals --library-type fr-unstranded -p 4 -G gene_models/Homo_sapiens.GRCh37.72_norm.gtf --transcriptome-index=gene_models/transcripts /directories/Genomes/NCBI_Jul-09-2012/Human/bowtie/human_ref_genome Lane3_1.fq.gz Lane3_2.fq.gz VN:1.4.1
' is recognized as '*'.
[main_samview] truncated file.
Traceback (most recent call last):
File "/pe_utils.py", line 520, in <module>
main()
File "pe_utils.py", line 517, in main
sd_max=sd_max)
File "pe_utils.py", line 271, in compute_insert_len
output_dir)
File "exon_utils.py", line 185, in map_bam2gff
raise Exception, "Error: tagBam call failed."
Exception: Error: tagBam call failed.
I used Homo_sapiens.GRCh37.72_norm.gtf from Ensembl as the annotation file when preparing my data, but I downloaded the "Human genome (hg19) alternative events v2.0" annotation from the MISO website and unzipped it. I see that it is based on Homo_sapiens.GRCh37.65. Is the error caused by this version mismatch? If so, could anyone provide the latest GFF3 file? Thank you for your suggestions!
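P.S. One check I can think of, though I have not confirmed it is related to the error, is whether the reference names in the BAM header match the sequence names in the first column of the GFF file. Below is a minimal sketch of that comparison. It assumes the header text comes from `samtools view -H control.bam` and that the GFF is plain tab-separated text; the helper names and the toy input at the bottom are made up for illustration and are not my real data.

```python
def sam_header_refs(header_text):
    """Collect reference names (SN: fields) from the @SQ lines of a SAM header."""
    refs = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    refs.add(field[3:])
    return refs


def gff_seqids(gff_text):
    """Collect the sequence names from column 1 of a GFF file."""
    ids = set()
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/directive lines
        ids.add(line.split("\t")[0])
    return ids


def compare_names(header_text, gff_text):
    """Report which sequence names are shared and which appear on one side only."""
    refs = sam_header_refs(header_text)
    ids = gff_seqids(gff_text)
    return {
        "shared": sorted(refs & ids),
        "bam_only": sorted(refs - ids),
        "gff_only": sorted(ids - refs),
    }


# Toy input (hypothetical, for illustration only).
toy_header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:900\n"
toy_gff = "##gff-version 3\nchr1\tconst\texon\t10\t50\t.\t+\t.\tID=e1\n"
result = compare_names(toy_header, toy_gff)
```

If `shared` comes back empty on the real files, the two annotations use different naming schemes, which would be a separate issue from the annotation release number.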