Update:
You must not rename any files in the tophat output directories. My scripts did this automatically, renaming the output files with the sample names - this causes tophat-fusion-post to fail.
Hi all,
First off, thanks to the developers for what is already an excellent package. Tophat2 works great for me and I look forward to using it more.
On to my question - has anyone successfully used tophat-fusion-post with Tophat2 --bowtie1 using the fusion search options?
I am running Linux, using the pre-built Linux binaries.
Tophat2 with --bowtie1 completes fine, and I have a large fusions.out file. For my current datasets I have between 350k and 1M candidates - filtering is absolutely necessary, and I've not seen any sign that simply screening for high coverage numbers will help (repeats, narrow coverage windows, etc - all the reasons they list in the paper for having the further filtering steps).
I've made the structure as indicated on the Tophat/Fusion webpage: tophat_* directories containing all the typical output files, plus the ensGene, refGene and mcl files.
I have two copies of tophat-fusion-post, one from the original Tophat-Fusion, and one from Tophat2. They are not identical. Both behave similarly.
My Tophat2 directory does not have the annotations (ensGene.txt etc.) but my Tophat-Fusion directory does - I have used those, though tophat-fusion-post from Tophat-Fusion wants refGene_sorted.txt, whereas the Tophat2 version does not.
For testing purposes, I have used --skip-blast. tophatfusion_out/ is created but fusion_seq.fa is empty. bowtie does try to align, but nothing happens. Everything completes with no errors, just 0 fusions found.
As a test, I also ran this in a completely empty directory, and it does not complain. This is worrisome, since the program should complain it has no files to work with! Based on this, I think it might not be properly reading the fusions.out file to begin with, even though it's in the directory in the structure on their webpage.
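Since tophat-fusion-post does not complain about missing input, a small pre-flight check can catch the problem before running it. The following is only a sketch of what I mean, not part of Tophat: the function name `preflight` is my own, the annotation file names (`refGene.txt`, `ensGene.txt`) are assumptions based on my setup, and it only checks for tophat_* directories that each contain a non-empty file named exactly fusions.out.

```python
from pathlib import Path

# Assumed annotation file names; adjust these to match your own setup.
ANNOTATION_FILES = ("refGene.txt", "ensGene.txt")


def preflight(workdir):
    """Return a list of human-readable problems found in workdir."""
    root = Path(workdir)
    problems = []

    # Annotation files are expected next to the tophat_* directories.
    for name in ANNOTATION_FILES:
        if not (root / name).is_file():
            problems.append(f"missing annotation file: {name}")

    # Each sample is expected to live in its own tophat_* directory.
    sample_dirs = sorted(p for p in root.glob("tophat_*") if p.is_dir())
    if not sample_dirs:
        problems.append("no tophat_* directories found")

    for d in sample_dirs:
        fusions = d / "fusions.out"
        if fusions.is_file():
            if fusions.stat().st_size == 0:
                problems.append(f"{d.name}: fusions.out is empty")
            continue
        # The file is missing under its exact name; look for likely renames
        # such as <sample>_fusions.out.
        renamed = sorted(p.name for p in d.glob("*fusions.out"))
        if renamed:
            problems.append(
                f"{d.name}: fusions.out missing, possibly renamed to "
                + ", ".join(renamed)
            )
        else:
            problems.append(f"{d.name}: fusions.out missing")

    return problems
```

Calling `preflight(".")` from the directory where tophat-fusion-post will be run returns an empty list when the layout looks as expected, and otherwise one message per problem. It does not check the other tophat output files or the contents of fusions.out.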
Tophat-fusion is in my path, and bowtie, bowtie2 (not that I'm using it), tophat and tophat2 all call the proper programs. All the BAM files use "chr" as a chromosome-name prefix, as do the ensGene and refGene files.
Hopefully someone has used Tophat2 with --bowtie1 and then used tophat-fusion-post on the resultant fusions.out and can provide some advice.
Thank you all for your time,
-sf