Hi guys,
I'm having a problem with viral genome assembly from my Illumina data. Some of the genomes possess direct terminal repeats (DTRs), and this causes every assembly program I've tried to assemble them into a pseudo-circular genome and then break it at some point. The rest of the genome appears to be fine, and I've checked this with PCR, but I'm left with two halves of a gene at either end of the molecule.
I've tried very large k-mer sizes, but this has no discernible effect, I suspect because the DTRs are fairly large (~1000 bp). I also mapped my reads back to the assemblies to try to find the actual end points and then resolve them by genome walking, but I could not see anything indicative of the termini. Unfortunately, as these are novel viral sequences, a reference-guided assembler is out of the question.
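To make the read-mapping check concrete: if a DTR has been collapsed into a single copy, that stretch of the contig would be expected to show roughly double the read depth of the rest. Below is a minimal Python sketch of that scan. It assumes the per-base depths have already been exported into a list (for example from `samtools depth -a`); the function name, window size, and ratio are illustrative choices, not values from any particular tool.

```python
from statistics import median

def high_depth_windows(depth, window=200, ratio=1.6):
    """Return merged (start, end) half-open intervals whose mean depth
    is at least `ratio` times the contig-wide median depth.

    `window` and `ratio` are illustrative defaults (assumptions)."""
    n = len(depth)
    if n < window or window <= 0:
        return []
    threshold = ratio * median(depth)

    # Sliding-window sum, updated in O(1) per step.
    running = sum(depth[:window])
    flagged = []
    for start in range(n - window + 1):
        if start > 0:
            running += depth[start + window - 1] - depth[start - 1]
        if running / window >= threshold:
            flagged.append(start)

    # Merge overlapping or adjacent flagged windows into intervals.
    intervals = []
    for s in flagged:
        e = s + window
        if intervals and s <= intervals[-1][1]:
            intervals[-1] = (intervals[-1][0], max(intervals[-1][1], e))
        else:
            intervals.append((s, e))
    return intervals
```

An interval of about the suspected DTR length (~1000 bp here) would be a candidate repeat; uneven coverage from library prep can blur or hide the signal, so an empty result does not rule a DTR out.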
Does anyone have experience with this? Strategies that deal with circular genomes should also be effective here.
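As an example of the circular-genome style of fix: if the contig really is a pseudo-circle broken inside a gene, it can be rotated so the break falls somewhere else. This is only a minimal sketch with a hypothetical helper name; it makes the split gene whole again but does not identify the true termini, and `new_start` has to come from other evidence (an intergenic position, a coverage boundary, PCR).

```python
def rotate_circular(seq, new_start):
    """Rotate a circular sequence so it begins at 0-based index `new_start`.

    Hypothetical helper for illustration; assumes `seq` has no duplicated
    overlap left at its ends by the assembler."""
    if not seq:
        return seq
    new_start %= len(seq)
    return seq[new_start:] + seq[:new_start]


# Example: move the break point 3 bases downstream.
rotated = rotate_circular("ACGTTGCA", 3)
```

The length and base composition are unchanged; only the position of the break moves.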
Thanks in advance.