Next-generation sequencing (NGS) technologies have rapidly transformed microbiological research. To date, sequencing-based applications have relied on fully assembled reference genomes for bioinformatics analyses. However, despite the availability of consensus-driven genome sequences in public databases, the quality, completeness, authenticity, accuracy, and traceability of the genomic data are often inadequate. This lack of standards for genomic data can lead to errors when researchers interpret their genomic information and attempt to draw meaningful correlations.
ATCC addressed these challenges by implementing a robust NGS and genome assembly workflow to enrich the characterization of the biological materials in our collection. The result is authenticated ATCC biological materials paired with reference-quality microbial genomes and their corresponding metadata, now publicly available to the scientific community on the ATCC Genome Portal.
Recent innovations in second- and third-generation sequencing have made it possible to produce complete reference-grade microbial genomes and to improve the assembly contiguity of large and highly heterozygous fungal genomes. So-called hybrid assembly techniques achieve this by combining highly accurate Illumina short reads with the revolutionary scaffolding ability of Oxford Nanopore Technologies (ONT) ultra-long reads.
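To make "assembly contiguity" concrete, the sketch below computes N50, a widely used contiguity statistic: the contig length L such that contigs of length L or longer cover at least half of the assembly. The text above does not say which metric ATCC uses, so N50 is an assumption chosen for illustration, and the contig lengths are hypothetical values, not ATCC data.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bases), for illustration only.
fragmented = [400_000, 300_000, 200_000, 100_000]  # a fragmented assembly
complete = [1_000_000]                             # a single closed contig

print(n50(fragmented))
print(n50(complete))
```

Both hypothetical assemblies contain the same number of bases, but the single-contig assembly has the higher N50, which is the kind of gain in contiguity that long reads are meant to provide.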
The ATCC microbial whole-genome sequencing workflow is an optimized methodology designed to achieve complete, circularized (when biologically appropriate), and contiguous genomic elements by using short-read (virology collection) and hybrid (bacteriology and mycology collections) assembly techniques. This methodology comprises five primary steps:
- Extraction of nucleic acids from authenticated ATCC strains
- Sequencing of the nucleic acids
- Assembly of sequencing data into a genome
- Annotation of the resultant genome
- Estimation of relatedness between a genome and all other genomes in our collection
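The last step, estimating relatedness between one genome and the rest of a collection, can be sketched with a simple k-mer comparison. This is a minimal illustration only: the list above does not name the method ATCC uses, so the Jaccard similarity over canonical k-mers shown here is a swapped-in, assumed technique. The function names, the strain labels, the toy sequences, and the value `k=5` are all hypothetical; real genomes would call for much larger k and a sketching scheme rather than full k-mer sets.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers in seq.

    A canonical k-mer is the lexicographically smaller of a k-mer and its
    reverse complement, so both strands yield the same set. K-mers that
    contain characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_relatedness(query, collection, k=5):
    """Score the query against every genome in collection, most similar first."""
    query_kmers = canonical_kmers(query, k)
    scores = {
        name: jaccard(query_kmers, canonical_kmers(seq, k))
        for name, seq in collection.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy sequences (hypothetical, not real genomes).
query = "ACGTTGCAAGGCTTAACCGGTA"
collection = {
    "strain_A": query,                      # identical sequence
    "strain_B": reverse_complement(query),  # same genome, opposite strand
    "strain_C": "TTTTTTTTTTTTTTTTTTTT",     # unrelated sequence
}

ranked = dict(rank_by_relatedness(query, collection))
for name, score in ranked.items():
    print(f"{name}\t{score:.2f}")
```

Using canonical k-mers makes the score independent of which strand was assembled, which is why the reverse-complemented copy scores the same as the identical sequence.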