Hi all,
I want to obtain the complete genome sequences of three closely related bacterial strains in order to identify genes involved in toxin metabolism.
We have a reference genome of the type strain, which consists of a single (unannotated) contig of about 2.8 Mb.
So far the most cost-effective solution seems to be a partial-throughput run on a HiSeq using 2x100 bp reads. The service provider I've talked to tells me that they can do this for $700 per sample. This should correspond to about 1-2 Gb per sample, which for a 2.8 Mb genome means roughly 350-700x coverage.
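For anyone who wants to check my arithmetic, here is a minimal Python sketch of the coverage estimate. It assumes coverage is simply total sequenced bases divided by genome length, and that the 1-2 Gb figure is usable yield (I have not confirmed that with the provider). The function name `fold_coverage` is just something I made up for illustration.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Expected average depth: total sequenced bases / genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


GENOME_SIZE = 2.8e6  # approximate size of the reference contig, in bases

# Quoted yield range per sample, in gigabases (assumed to be usable yield).
for yield_gb in (1, 2):
    depth = fold_coverage(yield_gb * 1e9, GENOME_SIZE)
    print(f"{yield_gb} Gb over {GENOME_SIZE / 1e6:.1f} Mb -> ~{depth:.0f}x")
```

This is only an average; actual depth will vary along the genome, and quality trimming would lower it somewhat.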
The price is right, but I'm unsure whether short Illumina reads are going to present problems when we try to assemble the genomes, even with such high coverage. I have no experience with genome assembly, so I am hoping people here can share their experience and tell me whether we should be looking at alternative platforms that give longer reads for assembly.
Edit: I realised too late that my title says 'de novo'. This is not really the case, as we have a reference genome. The accuracy of this reference is an unknown quantity, however.