Two papers were recently published in Nature Methods detailing an astonishing new technique that combines the flexibility of on-demand array technology with the power of next-generation sequencing instrumentation.
These papers demonstrate the targeted capture and sequencing of specific genomic regions using a programmable microarray platform (from Nimblegen):
"Direct selection of human genomic loci by microarray hybridization" (Albert et al.)
and
"Microarray-based genomic selection for high-throughput resequencing" (Okou et al.)
Both studies capture defined regions of the genome (either ~6,700 exons or up to a 5Mb contiguous region) using Nimblegen arrays, and present 454 sequencing data of the enriched regions. The studies reported ~65-75% on-target reads, with median single-base coverage of 5-7 fold. The Albert study validated its technique against four samples from the HapMap collection and was able to identify 79-94% of known variants in the target regions.
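For readers unfamiliar with the two metrics quoted above, here is a minimal Python sketch of how an on-target read fraction and a median single-base coverage could be computed. It is not the pipeline used in either paper. The function names, the half-open (start, end) interval representation, the single-chromosome simplification, and the toy coordinates are all assumptions made for illustration.

```python
from statistics import median

# Assumption: reads and targets are half-open (start, end) intervals
# on a single chromosome. Real data would also carry a chromosome name.

def overlaps(read, targets):
    """Return True if the read overlaps at least one target interval."""
    r_start, r_end = read
    return any(r_start < t_end and t_start < r_end for t_start, t_end in targets)

def on_target_fraction(reads, targets):
    """Fraction of reads that overlap any target region."""
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if overlaps(read, targets))
    return hits / len(reads)

def median_target_coverage(reads, targets):
    """Median read depth over every base inside the target regions."""
    depths = []
    for t_start, t_end in targets:
        for pos in range(t_start, t_end):
            depth = sum(1 for r_start, r_end in reads if r_start <= pos < r_end)
            depths.append(depth)
    return median(depths) if depths else 0

# Toy data (hypothetical coordinates, not taken from the papers)
targets = [(100, 110), (200, 205)]
reads = [(95, 105), (100, 110), (103, 120), (300, 350)]

print(on_target_fraction(reads, targets))      # 3 of 4 reads hit a target
print(median_target_coverage(reads, targets))  # median depth across 15 target bases
```

The per-base loop is deliberately naive to keep the definitions visible; a real analysis over megabases of target would use an interval index or a dedicated coverage tool instead.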
A third paper in the same issue of Nature Methods from the Church lab ("Multiplex amplification of large sets of human exons", Porreca et al) demonstrates another technique for capture. Rather than capturing the targets using hybridization directly on the array, this study uses a programmable array to synthesize "molecular inversion" capture probes that are cleaved from the array and used to fish out small regions of interest (~60-191 bp exon fragments). Enriched fractions were then sequenced with the Solexa Genome Analyzer.
The results reported in this study were less than impressive, with only 28% on-target hits. There was also a significant problem with calling heterozygous polymorphisms; however, the authors hope this can be optimized at the reaction level. This technique, which relies on a dual hybridization event surrounding the region of interest followed by gap-filling/ligation, is much more complicated, and it seems it will require intense optimization to approach the success of direct capture.
In any event, this enrichment technology will make a significant impact on any study examining a defined subset of the genome, such as candidate region sequencing workflows. What was once a laborious process of PCR primer design, optimization, amplification, and normalization has become a simple one-pot hybridization event.