So far we have collected 32 Gbp of 2x100 reads from a no-PCR library with an average insert size of about 500 bp. We also have part of a lane from a 5 kb mate-pair library, but that library was very bottomed-out: probably only about 3 Gbp of sequence comes from read pairs with unique endpoints.
Both libraries were constructed from a DNA prep from a single plant. The species is genetically tetraploid, with a 1C genome size of around 0.7-1 Gbp.
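For reference, the nominal depth implied by these figures can be worked out directly. This is only a rough sketch: it assumes every sequenced base is usable and divides by the 1C size range quoted above. The function name `fold_coverage` is made up for illustration.

```python
GBP = 1e9  # bases per gigabase-pair


def fold_coverage(total_bases, genome_size):
    """Nominal fold coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


if __name__ == "__main__":
    paired_end_bases = 32 * GBP  # 2x100 no-PCR library, from the post
    # Lower and upper ends of the quoted 1C genome size range.
    for genome_size in (0.7 * GBP, 1.0 * GBP):
        depth = fold_coverage(paired_end_bases, genome_size)
        print(f"1C = {genome_size / GBP:.1f} Gbp -> about {depth:.0f}x")
```

With the quoted range this comes to roughly 32x to 46x per 1C genome copy, before any filtering or duplicate removal.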
Our best result with our usual assembler, ABySS-PE, was a scaffold N50 of 2.3 kb, with a summed non-N scaffold length of 875 Mbp.
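For anyone comparing numbers across assemblers, here is a minimal sketch of the N50 statistic as it is conventionally defined: the length L such that scaffolds of length L or longer hold at least half of the total assembled bases. The helper `n50` is illustrative only and is not taken from ABySS. Assemblers and stats tools may apply a minimum-length cutoff before computing it, so reported values can differ.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one length")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest scaffold down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy lengths, not real data: total is 20, so half is 10.
    print(n50([2, 3, 4, 5, 6]))
```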
A cursory evaluation of the reads mapped back to scaffolds annotated by CEGMA suggests the allelic diversity is high -- maybe at the 1-3% level inside exons.
What assembler would you use?
We are planning to sequence the no-PCR library more deeply, as it does not seem to be even close to bottoming out. We also have a 600-cycle, 12 Gbp MiSeq run on a pool of 6 or so plants of the same species that we could add to the assembly.