I'm trying to use a Cufflinks GTF that includes novel isoforms as input for DEXSeq. I'm having trouble with the first Python script (dexseq_prepare_annotation.py), which ran successfully with the Ensembl GTF.
Here is a portion of my Cufflinks GTF:
1 protein_coding exon 5473 5485 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "Vom2r-ps1"; oId "ENSRNOT00000044270"; nearest_ref "ENSRNOT00000044270"; class_code "="; tss_id "TSS1"; p_id "P10";
1 protein_coding exon 5524 5725 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "2"; gene_name "Vom2r-ps1"; oId "ENSRNOT00000044270"; nearest_ref "ENSRNOT00000044270"; class_code "="; tss_id "TSS1"; p_id "P10";
When I run the Python script on this GTF, I get the following error:
Traceback (most recent call last):
  File "/home/ega2d/bin/dexseq_prepare_annotation.py", line 33, in <module>
    exons[f.iv] += ( f.attr['gene_id'], f.attr['transcript_id'] )
  File "_HTSeq.pyx", line 509, in HTSeq._HTSeq.GenomicArray.__getitem__ (src/_HTSeq.c:8702)
KeyError: 'Non-stranded index used for stranded GenomicArray.'
The records shown above clearly have strand information, so I'm at a loss as to how to make my GTF work with the Python script. Any help from HTSeq/DEXSeq experts would be greatly appreciated!
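One check that may narrow this down: the excerpt above only shows the first two records, so other exon lines in the file could still have "." in the strand column. That is an assumption about the cause, not a confirmed diagnosis, but it would be consistent with the "Non-stranded index" message. The sketch below is plain Python with no HTSeq dependency; the function name `find_unstranded_exons`, the second sample record, and the file name `merged.gtf` are all made up for illustration.

```python
def find_unstranded_exons(lines):
    """Yield (line_number, strand, attributes) for exon records whose
    strand column (7th GTF field) is neither '+' nor '-'."""
    for line_number, line in enumerate(lines, start=1):
        # Skip comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated fields; ignore anything shorter.
        if len(fields) < 9:
            continue
        feature, strand = fields[2], fields[6]
        if feature == "exon" and strand not in ("+", "-"):
            yield line_number, strand, fields[8]


# Illustrative sample records (tab-separated). The second record is
# hypothetical and exists only to show what an unstranded exon looks like.
sample = [
    '1\tprotein_coding\texon\t5473\t5485\t.\t+\t.\t'
    'gene_id "XLOC_000001"; transcript_id "TCONS_00000001";\n',
    '1\tCufflinks\texon\t9000\t9500\t.\t.\t.\t'
    'gene_id "XLOC_000002"; transcript_id "TCONS_00000002";\n',
]

hits = list(find_unstranded_exons(sample))
for line_number, strand, attributes in hits:
    print(line_number, repr(strand), attributes)

# To scan a real file, pass an open handle instead
# ("merged.gtf" is a placeholder name):
# with open("merged.gtf") as handle:
#     for hit in find_unstranded_exons(handle):
#         print(hit)
```

If this reports any lines, those records would be the first place to look. Whether to drop them or assign them a strand before running dexseq_prepare_annotation.py is a separate decision that depends on the analysis.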