Illumina's $600 million acquisition of Solexa in November 2006 gave the company a head start in the next generation sequencing market.
Here I present a brief overview of Solexa's sequencing-by-synthesis chemistry. The sample prep methods differ slightly from those used in ABI's SOLiD system, but the basic goals are the same: generate large numbers of unique "polonies" (polymerase-generated colonies) that can be sequenced simultaneously. These parallel reactions occur on the surface of a "flow cell" (basically a water-tight microscope slide), which provides a large surface area for many thousands of parallel chemical reactions.
Step 1: Sample Preparation
The DNA sample of interest is sheared to an appropriate size (average ~800bp) using a compressed air device known as a nebulizer. The ends of the DNA are polished, and two unique adapters are ligated to the fragments. Ligated fragments in the 150-200bp size range are isolated via gel extraction and amplified using a limited number of PCR cycles.
Complete detailed protocols for DNA and small RNA library preparation can be found in the documents attached to this post ("dna_libe_prep.pdf" and "rna_libe_small_prep.pdf", respectively). Library preparation is a fairly straightforward multi-step molecular biology process, but there are many pitfalls that can skew results downstream.
Steps 2-6: Cluster Generation by Bridge Amplification
In contrast to the 454 and ABI methods, which use bead-based emulsion PCR to generate "polonies", Illumina utilizes a unique "bridged" amplification reaction that occurs on the surface of the flow cell.
The flow cell surface is coated with single-stranded oligonucleotides that correspond to the sequences of the adapters ligated during the sample preparation stage. Single-stranded, adapter-ligated fragments are bound to the surface of the flow cell and exposed to reagents for polymerase-based extension. Priming occurs as the free/distal end of a ligated fragment "bridges" to a complementary oligo on the surface.
Repeated denaturation and extension results in localized amplification of single molecules in millions of unique locations across the flow cell surface. This process occurs in what is referred to as Illumina's "cluster station", an automated flow cell processor.
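To get a feel for how quickly a cluster grows, here is a rough Python sketch of an idealized amplification model. It assumes each denaturation/extension cycle copies a fraction of the strands in a cluster; the function name `cluster_copies` and the `efficiency` parameter are my own illustrative choices, not anything from Illumina, and real clusters will saturate well before the numbers get large.

```python
def cluster_copies(cycles, efficiency=1.0):
    """Idealized number of strands in one cluster after bridge amplification.

    Assumption (hypothetical model): every cycle, a fraction `efficiency`
    of the existing strands is copied, so the count grows by a factor of
    (1 + efficiency) per cycle. Surface saturation is ignored.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Compare perfect doubling with a less efficient reaction.
    for eff in (1.0, 0.5):
        print(eff, cluster_copies(10, efficiency=eff))
```

With perfect doubling, a single molecule becomes about a thousand copies in 10 cycles, which is the sense in which amplification stays "localized" while still producing enough signal to image.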
Steps 7-12: Sequencing by Synthesis
A flow cell containing millions of unique clusters is now loaded into the 1G sequencer for automated cycles of extension and imaging.
The first cycle of sequencing consists first of the incorporation of a single fluorescent nucleotide, followed by high resolution imaging of the entire flow cell. These images represent the data collected for the first base. Any signal above background identifies the physical location of a cluster (or polony), and the fluorescent emission identifies which of the four bases was incorporated at that position.
This cycle is repeated, one base at a time, generating a series of images, each representing a single base extension at a specific cluster. Base calls are derived with an algorithm that identifies the emission color at each cluster over successive cycles. At this time, reports of useful Illumina read lengths range from 26 to 50 bases.
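The idea of turning a stack of images into a read can be sketched in a few lines of Python. This is a toy illustration only, not Illumina's base-calling algorithm: it assumes each cycle has already been reduced to four intensity values per cluster (one per base, in A, C, G, T order), and the `background` threshold and function name `call_bases` are hypothetical.

```python
CHANNELS = "ACGT"  # assumed channel order for this sketch


def call_bases(intensities, background=0.0):
    """Call one read for a single cluster.

    `intensities` is a list with one entry per cycle; each entry holds
    four numbers, the signal in the A, C, G and T channels at this
    cluster's position. The brightest channel wins; if even the
    brightest signal is not above `background`, the base is called 'N'.
    """
    read = []
    for cycle in intensities:
        if len(cycle) != 4:
            raise ValueError("each cycle needs exactly four channel values")
        best = max(range(4), key=lambda i: cycle[i])
        read.append(CHANNELS[best] if cycle[best] > background else "N")
    return "".join(read)


if __name__ == "__main__":
    # Three cycles for one cluster; the last cycle has no usable signal.
    cluster = [(900, 50, 40, 30), (20, 30, 800, 10), (5, 6, 7, 8)]
    print(call_bases(cluster, background=100))
```

A real base caller also has to deal with things this sketch ignores, such as overlap between the dye emission spectra and clusters that drift out of sync as cycles accumulate.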
The use of physical location to identify unique reads is a critical concept for all next generation sequencing systems. The density of the reads and the ability to image them without interfering noise are vital to the throughput of a given instrument. Each platform has its own issues that determine this number: 454 is limited by the number of wells in its PicoTiterPlate, Illumina is limited by the fragment length that can effectively "bridge", and all providers are limited by flow cell real estate.
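The relationship between read density and throughput is simple arithmetic, sketched below in Python. The function name `run_yield`, the `usable_fraction` parameter, and the example figures are placeholders I made up for illustration; they are not instrument specifications.

```python
def run_yield(clusters, read_length, usable_fraction=1.0):
    """Estimate total bases from one run.

    clusters        -- number of clusters imaged on the flow cell
    read_length     -- bases read per cluster (one per sequencing cycle)
    usable_fraction -- assumed share of clusters that give a clean read
    """
    if clusters < 0 or read_length < 0:
        raise ValueError("clusters and read_length must be non-negative")
    if not 0.0 <= usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be between 0 and 1")
    return round(clusters * usable_fraction * read_length)


if __name__ == "__main__":
    # Hypothetical numbers: 1 million clusters, 36-base reads, half usable.
    print(run_yield(1_000_000, 36, usable_fraction=0.5))
```

Because yield is a straight product, doubling cluster density or doubling read length has the same effect on total bases, which is why both flow cell real estate and read length matter so much.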
Hopefully that serves as a brief introduction to the technology! If I have made any errors or omissions, please feel free to correct me by posting here!