I'm running TopHat v1.2.0 with Bowtie 0.12.7.0 and Samtools 0.1.12a on a particular RNA-seq sample. Everything runs correctly until TopHat calls samtools, which chokes on this nonsensical alignment that TopHat created:
HWUSI-EAS552R_0001:1:4:10279:10944#0 163 chr17 7357209 255 78M97N536870908M92N22M = 7357535 0 CAACTATAGTCCCACATCACCCAGCTATTCGCCAACTTCACCCAGCTACTCACCCACTTCTCCCAGCTACTCACCTACCTCTCCAAGCTACTCACC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC;CC?CC?CBACCCCCCCC2C7*;;'8?2??=@?=CC@@=BA<AA@+:>50::>A+:>@:A###### NM:i:5 XS:A:- NH:i:1
If you look at the CIGAR string you can see the problem: there's a match region 536870908 bp long in the middle of this 96 bp read. Obviously a bug.
Has anyone else encountered this, or know whether it's specific to this version of TopHat? Since TopHat only chokes at the final conversion of accepted_hits.sam to accepted_hits.bam, I'm planning to filter accepted_hits.sam to remove this record, then run samtools myself to do the .bam conversion outside of TopHat.
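For anyone wanting to try the same workaround, here is a minimal sketch of the filtering step in Python. It assumes a tab-separated SAM file and drops any record whose CIGAR string consumes a different number of read bases than the SEQ field actually contains, which is the case for the alignment above. The function names and the file name filtered.sam are made up for illustration, and this is not anything TopHat itself provides.

```python
import re
import sys

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read (query) sequence.
QUERY_CONSUMING = set("MIS=X")


def cigar_query_length(cigar):
    """Sum the lengths of the CIGAR operations that consume read bases."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def is_sane(sam_line):
    """Return True for header lines and for records whose CIGAR matches SEQ."""
    if sam_line.startswith("@"):
        return True
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return False  # not a complete SAM record
    cigar, seq = fields[5], fields[9]
    if cigar == "*" or seq == "*":
        return True  # nothing to compare
    return cigar_query_length(cigar) == len(seq)


def filter_sam(in_path, out_path):
    """Copy in_path to out_path, skipping inconsistent records; return the drop count."""
    dropped = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if is_sane(line):
                dst.write(line)
            else:
                dropped += 1
    return dropped


if __name__ == "__main__" and len(sys.argv) == 3:
    n = filter_sam(sys.argv[1], sys.argv[2])
    sys.stderr.write("dropped %d record(s)\n" % n)
```

Saved under a hypothetical name such as filter_sam.py, it would be run as "python filter_sam.py accepted_hits.sam filtered.sam", followed by "samtools view -bS filtered.sam > accepted_hits.bam". That last command assumes the SAM file carries @SQ header lines; if it does not, samtools view needs a reference list via -t. Note that dropping one read of a pair leaves its mate in the file with its paired flags unchanged, which may or may not matter downstream.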
Is there a better place to report tophat bugs?