Hello,
I have been trying to convert extracted fusion evidence reads from a Tophat-aligned RNA-Seq BAM file (with the "XF" tag) back to FASTQ format.
If I am not wrong, this is tricky because Tophat writes multiple lines of output for each fusion-junction-spanning read: one segment of the read maps to one location on the reference and the other segment to a different location, and Tophat reports one line per segment (in the case of unique mapping).
If I pull out the fusion evidence reads with a simple "grep XF" and try to convert them to FASTQ using Picard SamToFastq, I get an "Illegal mate" error, which is expected since the segments of each read are scattered across multiple lines.
Does anybody know a better way to do this?
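For reference, here is a minimal sketch of the kind of workaround I have in mind. It is not something Tophat or Picard provides. It groups the grep-ed SAM lines by read name and mate, and writes one FASTQ record per group instead of relying on mate pairing. The function name `sam_to_fastq_records` is made up for illustration. It assumes that at least one line per read carries the full read sequence in the SEQ field (it keeps the longest one); if every line only holds its own segment, the segments would still need to be stitched together.

```python
# Complement table for reverse-complementing reads aligned to the reverse strand.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_to_fastq_records(sam_lines):
    """Yield one FASTQ record per (read name, mate) from SAM text lines.

    Assumption: the longest SEQ seen for a read is the full read. This is
    a guess about the input, not documented Tophat behaviour.
    """
    best = {}
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment line
            continue
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if seq == "*":                    # sequence not stored on this line
            continue
        # 0x40 = first in pair, 0x80 = second in pair, neither = unpaired
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        if flag & 0x10:
            # SEQ is stored on the reference forward strand, so restore
            # the original read orientation for reverse-strand alignments.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        key = (name, mate)
        if key not in best or len(seq) > len(best[key][0]):
            best[key] = (seq, qual)
    for (name, mate), (seq, qual) in sorted(best.items()):
        suffix = "/%d" % mate if mate else ""
        yield "@%s%s\n%s\n+\n%s\n" % (name, suffix, seq, qual)
```

With the grep-ed SAM text on standard input, something like `sys.stdout.writelines(sam_to_fastq_records(sys.stdin))` (after `import sys`) would write the FASTQ. Mates come out interleaved in a single stream, so they would still have to be split into two files if the downstream tool expects that.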