Am I crazy, or does TopHat's gtf_to_fasta utility ignore strand information? Take the hg19 iGenomes release as an example: transcript NR_024540 is on the minus strand.
Code:
585 NR_024540 chr1 - 14361 29370 29370 29370 11 14361,14969,15795,16606,16857,17232,17605,17914,18267,24737,29320, 14829,15038,15947,16765,17055,17368,17742,18061,18366,24891,29370, 0 WASH7P unk unk -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
Code:
>NR_024540.1
tccggcagagcggaagcggcggcgggagcttccgggagggcggctcgcag
gcaccatgactcctgtgaggatgcagcactccctggcaggtcagacctat
Code:
>1 NR_024540 chr1- 14362-14829,14970-15038,15796-15947,16607-16765,16858-17055,17233-17368,17606-17742,17915-18061,18268-18366,24738-24891,29321-29370
TCCTGCACAGCTAGAGATCCTTTATTAAAAGCACACTGTTGGTTTCTGCTCAGTTCTTTA
TTGATTGGTGTGCCGTTTTCTCTGGAAGCCTCTTAAGAACACAGTGGCGCAGGCTGGGTG
# nope not the same!
Code:
TCCGGCAGAGCGGAAGCGGCGGCGGGAGCTTCCGGGAGGGCGGCTCGCAG
GCACCATGACTCCTGTGAGGATGCAGCACTCCCTGGCAGGTCAGACCTAT
# yay this seems to match!
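For anyone who wants to repeat this kind of check, here is a minimal Python sketch of the comparison. It is not anything from TopHat itself: `reverse_complement` and `strand_of_output` are hypothetical helper names, and the "tool output" is simulated by reverse-complementing the RefSeq prefix quoted above instead of being taken from a real gtf_to_fasta run.

```python
# Complement table for upper- and lower-case bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def strand_of_output(reference_start, tool_output):
    """Report how tool_output relates to the reference transcript.

    reference_start: first bases of the transcript, written 5'->3'.
    tool_output: full sequence a tool emitted for the same transcript.
    """
    ref = reference_start.upper()
    out = tool_output.upper()
    if out.startswith(ref):
        return "same strand"
    if reverse_complement(out).startswith(ref):
        return "reverse complement"
    return "no match"


# First 50 bases of NR_024540.1, copied from the RefSeq record above.
refseq_start = "tccggcagagcggaagcggcggcgggagcttccgggagggcggctcgcag"

# Simulated stand-in for output written on the genomic forward strand
# (an assumption for illustration, not real gtf_to_fasta output).
simulated_output = reverse_complement(refseq_start)

print(strand_of_output(refseq_start, simulated_output))
```

To test a real case, replace `simulated_output` with the full sequence from the gtf_to_fasta record. A result of "reverse complement" for a minus-strand transcript would mean the exons were concatenated in genomic orientation without being flipped.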
Now, I sort of remember seeing this a while back, but forget what became of it. Anyone have insight? Is this a bug?