Hi,
I have RNA-seq data (paired-end, 100 bp, Illumina) from a fungus with no reference genome. However, we have sequenced the genome of a closely related species and are currently running gene predictions on that genome.
I am wondering what the best strategy is for assembling a transcriptome for the species without a reference genome. I thought of using STAMPY (or maybe TOPHAT with appropriate mismatch settings) to map the reads to the genome of the related species, as an alternative to a de novo assembly with, for example, TRINITY. Does anyone have experience with this? Is STAMPY a good choice?
Whether this will work of course depends on how closely related the two organisms are. Judging from the coding regions of individual genes compared between the two organisms, there are only ca. 5% base-pair differences per gene, but more base-pair differences and indels in the non-coding regions.
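To get a rough feel for what "appropriate mismatch settings" would have to mean, here is a back-of-the-envelope sketch. It assumes the ca. 5% divergence applies uniformly and that mismatches fall independently along a 100 bp read (a simple binomial model, which is surely too optimistic near indels and non-coding regions). The function name and the thresholds tried are just illustrative, not settings from any particular mapper.

```python
from math import comb

def prob_at_most_k_mismatches(read_len: int, divergence: float, k: int) -> float:
    """Probability that a read carries at most k mismatches, assuming each
    base differs independently with probability `divergence` (binomial model)."""
    return sum(
        comb(read_len, i) * divergence**i * (1 - divergence) ** (read_len - i)
        for i in range(k + 1)
    )

read_len = 100      # read length from the post (100 bp)
divergence = 0.05   # ca. 5% differences in coding regions, from the post

# Expected number of mismatches per read under this model
expected_mismatches = read_len * divergence
print(f"expected mismatches per read: {expected_mismatches:.1f}")

# Fraction of reads that would pass a hypothetical mismatch cap of k
for k in (2, 5, 10):
    p = prob_at_most_k_mismatches(read_len, divergence, k)
    print(f"P(at most {k} mismatches) = {p:.3f}")
```

If this model is even roughly right, an average read would differ from the related genome at about 5 positions, so a mapper limited to only a couple of mismatches per read would discard most of the data.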
Many thanks in advance for any thoughts and recommendations.