I downloaded the pre-built TopHat indices for Mus musculus (mm10) from the TopHat website (http://ccb.jhu.edu/software/tophat/igenomes.shtml). The package includes a .gtf file that specifies junction boundaries and start/stop codon positions, as well as a .fa file for the mouse genome.
I would like to produce a .fa file that contains all of the mouse RNA transcript sequences from the above two files. Is there any software that does this? I have read of de novo transcriptome assembly software like Trinity, but I do not think it would be ideal for what I would like to do here. I merely need some kind of script/tool to generate the annotated mRNA (and ideally, rRNA, mtRNA, and ncRNA) sequences that will match the annotation in the aforementioned two files (.gtf and .fa from the tophat pre-built index).
Thank you kindly for your time and help. Please let me know if there is any further information that would help you evaluate my query. If I do not find an appropriate tool for this problem, I plan to write a script to extract and splice the necessary sequences from the genomic .fa file, and will upload it for others to use (but I would rather avoid re-inventing the wheel if possible!).
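To make the goal concrete, here is a minimal sketch of the kind of extract-and-splice script I have in mind. It is untested on the real mm10 files and rests on several assumptions: the .gtf has "exon" feature lines carrying a transcript_id attribute, coordinates are 1-based and inclusive, the chromosome names match between the two files, and the whole genome fits in memory. All function names (read_fasta, read_exons, build_transcripts, write_fasta) are my own placeholders, not part of any existing tool.

```python
import re
from collections import defaultdict

# Translation table for complementing DNA (case is preserved).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def read_fasta(path):
    """Load a FASTA file into {sequence_name: sequence}. Holds everything in memory."""
    seqs, name, chunks = {}, None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                # Keep only the first word of the header as the name.
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def read_exons(gtf_path):
    """Collect exon records per transcript: {transcript_id: [(chrom, strand, start, end), ...]}."""
    exons = defaultdict(list)
    with open(gtf_path) as fh:
        for line in fh:
            if line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9 or fields[2] != "exon":
                continue
            match = TRANSCRIPT_ID.search(fields[8])
            if match is None:
                continue
            chrom, strand = fields[0], fields[6]
            start, end = int(fields[3]), int(fields[4])
            exons[match.group(1)].append((chrom, strand, start, end))
    return exons


def build_transcripts(genome, exons):
    """Concatenate exon sequences per transcript; reverse-complement minus-strand transcripts."""
    transcripts = {}
    for tid, records in exons.items():
        chrom, strand = records[0][0], records[0][1]
        if chrom not in genome:
            continue  # annotation refers to a sequence missing from the FASTA
        # Sort exons by genomic start so they are joined in genomic order.
        ordered = sorted(records, key=lambda r: r[2])
        # GTF is 1-based inclusive, Python slices are 0-based half-open.
        seq = "".join(genome[chrom][start - 1:end] for _, _, start, end in ordered)
        if strand == "-":
            seq = seq.translate(COMPLEMENT)[::-1]
        transcripts[tid] = seq
    return transcripts


def write_fasta(transcripts, path, width=60):
    """Write {name: sequence} as FASTA with fixed-width lines."""
    with open(path, "w") as out:
        for tid, seq in transcripts.items():
            out.write(f">{tid}\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")
```

The intended use would be something like write_fasta(build_transcripts(read_fasta("genome.fa"), read_exons("genes.gtf")), "transcripts.fa"), where the file names are placeholders for the two files from the index package. Because it keys only on exon lines, it should pick up any annotated transcript regardless of type, but it does no validation of the annotation.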