Tech Summary: ABI's SOLiD (Seq. by Oligo Ligation/Detection), UPDATED for v2.0

  • Tech Summary: ABI's SOLiD (Seq. by Oligo Ligation/Detection), UPDATED for v2.0


    Applied Biosystems has just launched their instrument, which supports their version of high-throughput sequencing chemistry, termed “SOLiD™” (little “i”, please). Acquired from Agencourt Personal Genomics in late 2006, SOLiD is a unique parallel chemistry which enables simultaneous sequencing of thousands of individual DNA molecules.

    Here I will present a brief overview of the technology, aimed at those who haven’t had time to become intimately familiar with the chemistry. Figures and information are taken directly from this presentation on ABI’s website.

    Sequencing on the SOLiD machine starts with library preparation. In the simplest fragment library, two different adapters are ligated to sheared genomic DNA (left panel of Fig. 1). If more rigorous structural analysis is desired, a “mate-pair” library can be generated in a similar fashion, by incorporating a circularization/cleavage step prior to adapter ligation (right panel of Fig. 1).


    Figure 1. Library generation schematic.

    Once the adapters are ligated to the library, emulsion PCR is conducted using the common primers to generate “bead clones” which each contain a single nucleic acid species.


    Figure 2. Clonal bead library generation via emulsion PCR.

    Each bead is then attached to the surface of a flow cell via 3’ modifications to the DNA strands.


    Figure 3. Depositing beads into flow cell via end modifications.

    At this point, we have a flow cell (basically a microscope slide that can be serially exposed to any liquids desired) whose surface is coated with thousands of beads, each containing a single genomic DNA species with unique adapters on either end. Each microbead can be considered a separate sequencing reaction, and all of them are monitored simultaneously via sequential digital imaging. Up to this point, all next-gen sequencing technologies are very similar; this is where ABI/SOLiD diverges dramatically (see Figure 4).


    Figure 4. Schematic of ABI SOLiD v2.0 sequencing chemistry. SOLiD 2.0 chemistry uses 1,2 encoding (meaning bases 1 and 2 of the probe are the specific bases linked to the colorspace calls). The original version of the chemistry used 4,5 encoded probes.

    The actual base detection is no longer done by the polymerase-driven incorporation of labeled dideoxy terminators. Instead, SOLiD uses a mixture of labeled oligonucleotides and queries the input strand with ligase. Understanding the labeled oligo mixture is key to understanding SOLiD technology.

    Each oligo has one of 16 specific dinucleotides at positions 1-2 (numbered from the 3' end) and degenerate positions at bases 3-5 (N’s). Positions 6 through the 5’ end are also degenerate (likely inosine, not confirmed), and the 5' end holds one of four fluorescent dyes. The sequencing involves:
    1. Anneal a primer, then hybridize and ligate a mixture of fluorescent oligos (8-mers) whose 1st & 2nd 3' bases match that of the template
    2. Capping unextended fragments with the same mixture of nonfluorescent probes
    3. Phosphatase treatment to prevent any remaining unextended strands from contributing to out of phase ligation events
    4. Detection of the specific fluor
    5. Removal of fluor via two step chemical cleavage of the three 5' bases. This leaves behind a 5 base ligated probe, with a 5' phosphate
    6. Repeat, this time querying the 6th & 7th bases
    7. After 5-7 cycles of this, perform a “reset”, in which the initial primer and all ligated portions are melted from the template and discarded.
    8. Next a new initial primer is used that is N-1 in length. Repeating the initial cycling (steps 1-5) now generates an overlapping data set (bases 1/2, 6/7, etc, see Fig 4, Step 8 above).

    Thus, 5-7 ligation reactions are run for each of 5 primers (with a reset between primers), generating sequence data for ~35 contiguous bases, in which each base has been queried by two different oligonucleotides.
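
    The way the primer rounds interleave can be sketched in a few lines of Python. This is only a toy model under stated assumptions: position 1 is taken to be the first template base after the adapter, position 0 the last adapter base, each reset is assumed to shift the ligation frame by exactly one base toward the adapter, and the "bridge probes" of v2.0 are ignored. The names (`interrogated_pairs`, `coverage`) are made up for illustration.

```python
# Toy model of the ligation schedule. Assumptions (not from ABI documentation):
# position 1 is the first template base after the adapter, position 0 is the
# last adapter base, and every reset shifts the frame one base toward the adapter.
from collections import Counter

LIGATIONS_PER_PRIMER = 7   # "5-7 cycles" in the text; 7 gives the 35-call maximum
PRIMER_ROUNDS = 5          # primers n, n-1, ..., n-4
PROBE_STEP = 5             # 5 probe bases remain on the template after cleavage


def interrogated_pairs(ligations=LIGATIONS_PER_PRIMER, rounds=PRIMER_ROUNDS):
    """Return {primer_offset: [(pos1, pos2), ...]} for the dinucleotides queried."""
    schedule = {}
    for offset in range(rounds):
        schedule[offset] = [
            (PROBE_STEP * cycle + 1 - offset, PROBE_STEP * cycle + 2 - offset)
            for cycle in range(ligations)
        ]
    return schedule


def coverage(schedule):
    """Count how many probes touched each template position."""
    counts = Counter()
    for pairs in schedule.values():
        for first, second in pairs:
            counts[first] += 1
            counts[second] += 1
    return counts


schedule = interrogated_pairs()
counts = coverage(schedule)
total_color_calls = sum(len(pairs) for pairs in schedule.values())

print(schedule[0][:3])    # first three ligations with primer n
print(schedule[1][:3])    # first three ligations with primer n-1
print(total_color_calls)  # one color call per ligation
```

    In this model the five rounds tile the template without gaps, so every interior position ends up inside two overlapping dinucleotide queries, which is the property the color-space decoding relies on.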

    If you’re doing the math you’ve realized there are 16 possible dinucleotides (4^2) and only 4 dyes. So data from a single color call does not tell you what base is at a given position. This is where the brilliance (and potential confusion) comes about with regard to SOLiD. There are 4 oligos for every dye, meaning there are four dinucleotides that are encoded by each dye.


    Figure 5. Schematic of dibase encoding, and how it relates to calling the actual template sequence

    For example (see Fig. 5), the dinucleotides CA, AC, TG, and GT are all encoded by the green dye. Because each base is queried twice, it is possible, using the two colors, to determine which bases were at which positions. This two-color query system (known as “color space” in ABI-speak) has some interesting consequences with regard to the identification of errors. A detailed explanation of color space and its unique issues can be found in the PDF files attached to this post (“2Base_Pair_SOLiD_Data_V1.pdf” and "SOLiD_Dibase_Sequencing_and_Color_Space_Analysis.pdf").
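
    A compact way to think about the dibase code is as an XOR of 2-bit base values. The sketch below assumes the commonly used assignment A=0, C=1, G=2, T=3, with the color of a dinucleotide being the XOR of its two bases; under that assumption, color 1 is exactly the CA/AC/TG/GT group described above as green. The numeric color labels and the function names are illustrative, not ABI's.

```python
# Dibase (color space) encoding as XOR of 2-bit base codes.
# Assumption: A=0, C=1, G=2, T=3; the color numbers 0-3 are arbitrary labels,
# with 1 matching the "green" group (AC, CA, GT, TG) named in the post.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode(seq):
    """Translate a base sequence into its list of color calls."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode(first_base, colors):
    """Rebuild the base sequence from a known first base plus color calls."""
    bases = [first_base]
    current = BASE_TO_BITS[first_base]
    for color in colors:
        current ^= color  # the next base is fixed by the previous base and the color
        bases.append(BITS_TO_BASE[current])
    return "".join(bases)


def dinucleotides_by_color():
    """Group all 16 dinucleotides by the color that encodes them."""
    groups = {0: [], 1: [], 2: [], 3: []}
    for a in "ACGT":
        for b in "ACGT":
            groups[BASE_TO_BITS[a] ^ BASE_TO_BITS[b]].append(a + b)
    return groups


groups = dinucleotides_by_color()
colors = encode("ACGT")
print(groups[1])            # the four dinucleotides sharing one dye
print(colors)
print(decode("A", colors))  # decoding needs one known base to anchor the chain
```

    Note that `decode` needs a known starting base: a color alone only says how two neighboring bases relate, so the chain has to be anchored somewhere (in practice, at the known last base of the adapter).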

    One of the side effects of this dual encoding is that, when aligning to a reference and calling variants, true variants will follow specific color change "rules", as defined below in Figure 6.


    Figure 6. Colorspace valid variant rules

    Detection of a true SNP is reflected by changes in two adjacent colorspace calls, which must follow the rules above. Figure 7 below gives some examples of this principle in examining alignments.


    Figure 7. Colorspace examples
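
    The two-adjacent-colors principle can be checked with a short sketch. It assumes the same XOR model of the dibase code (A=0, C=1, G=2, T=3) and only covers an isolated single-base substitution; it is not a reproduction of the full rule table in Figure 6. Function names and the example sequences are made up.

```python
# Why a true SNP changes two adjacent colors, while a lone color change is suspect.
# Assumption: XOR model of the dibase code with A=0, C=1, G=2, T=3.
BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Translate a base sequence into its list of color calls."""
    return [BITS[a] ^ BITS[b] for a, b in zip(seq, seq[1:])]


def changed_positions(ref_colors, read_colors):
    """Indices where the read's color differs from the reference's color."""
    return [i for i, (r, o) in enumerate(zip(ref_colors, read_colors)) if r != o]


def looks_like_snp(ref_colors, read_colors):
    """True if the mismatches are consistent with one isolated base substitution."""
    diffs = changed_positions(ref_colors, read_colors)
    if len(diffs) != 2 or diffs[1] - diffs[0] != 1:
        return False
    i, j = diffs
    # The flanking bases are unchanged, so the XOR of the two colors must be too.
    return ref_colors[i] ^ ref_colors[j] == read_colors[i] ^ read_colors[j]


ref = encode("ACGTAC")
snp = encode("ACTTAC")   # G -> T at the third base
err = list(ref)
err[2] ^= 1              # a single miscalled color, no underlying base change

print(ref, snp, err)
print(looks_like_snp(ref, snp))  # two adjacent, consistent color changes
print(looks_like_snp(ref, err))  # one isolated color change
```

    The check works because the two colors flanking a base XOR together to a value that depends only on the neighbors of that base, so a real substitution must preserve it.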

    Hopefully that gives you a brief introduction to ABI’s SOLiD technology.

    ____________________________________________
    EDIT May 2008: SOLiD 2.0 has been released.
    EDIT Sept 2008: This post has been updated entirely for v2.0 chemistry
    Attached Files

  • #2
    For sure, there's a lot of power behind the 2 color base-code. Very, very clever.
    AB has often had very good ideas (though this one came from Agencourt, didn't it?) but such weird ways of commercializing them. Let's see how they are gonna face the economics against 454 in the short term, and Illumina in the medium term.
    Thanxx for the explanation details.

    • #3
      Solid chemistry

      Thanks for the terrific explanation of the technology and the color space document. Very helpful. What are the "z" bases and how are they cleaved?

      • #4
        I have difficulties finding the SOLID instrument specs.
        How long will be a run? How much data will it generate? Do we have these published somewhere?

        • #5
          Originally posted by jwolf
          Thanks for the terrific explanation of the technology and the color space document. Very helpful. What are the "z" bases and how are they cleaved?
          I couldn't tell you. I haven't come across that in my reading. I'm sure someone here (or maybe one of our ABI members? ) will chime in.

          Originally posted by DNAcowboy
          I have difficulties finding the SOLID instrument specs.
          How long will be a run? How much data will it generate? Do we have these published somewhere?
          I believe a run is on the order of ~7-8 days, depending on the application. (From www.in-sequence.com).

          From ABI's website:
          Read Length
          --Fragment libraries - up to 35 bases
          --Mate-paired libraries - 2 x 25 bases

          Typical Mappable* Output/Slide
          --Fragment libraries: 1-1.5 GB
          --Mate-paired libraries: 1.5-2 GB

          • #6
            Thanxx Eco. I should be visited soon by AB's representatives and should post some answers asap.

            • #7
              SOLiD 2.0 has been released.

              Basically the same as above with a few changes to the probe details...instead of 4,5 encoded probes it will use 1,2 encoded probes and a series of "bridge probes" that prevent backing up too far into the P1 primer region.

              Still only 35bp reads max (7 ligations + 5 primers).

              Will post more details as they become available. I've put the SOLiD data definitions in the first post.



              • #8
                Is there a fixed (or recommended) order for adding the oligos? I don’t understand why the first four oligos to be added would be CA CT GC GG (as suggested by the diagram). Why would you not start with AA AC AG AT, hence identifying all the beads that start with an A, followed by a round of oligos starting with C, etc.?

                My suspicious mind tells me that there must be a reason for the presented order so I'm concerned there's something I'm not fully understanding?

                • #9
                  My understanding is that all 16 possible 2-base combinations for the oligos are present in the mix, not only those 4 on their diagrams. Am I wrong?

                  • #10
                    I think I get it now: because the n-1 reaction types the 3' base of the primer, which is known, the first base of the clonal product is identified definitively. Thanks, DNAcowboy.

                    • #11
                      This is my first post..
                      Regarding SOLiD sequencing, I have a question : how are the three 3' last nucleotides, including the fluorophore, cleaved ? What is the cleavage agent ?

                      • #12
                        Originally posted by ECO
                        I couldn't tell you. I haven't come across that in my reading. I'm sure someone here (or maybe one of our ABI members? ) will chime in.



                        My guess would be that 'z' bases would be inosine. But that is almost pure speculation on my part.

                        --
                        Phillip

                        • #13
                          You're right! The last 5' nucleotides!
                          I submitted this question to SOLiD staff. The answer: chemical cleavage with a proprietary reagent.

                          • #14
                            I've finally found the time to update this post for v2.0 chemistry. Enjoy.

                            • #15
                              Thanks for taking the time to explain this marvelous technology!
                              I have never used the SOLiD system, but I think I understand the gist of it. Just a couple of questions: How can the color space be translated into sequence, if the first position can be any one of 4 bases? Don't they have to know what base they started with in order to figure out all downstream bases? And second question: in figure 4 (schematic), it appears as if the coloring for primers n-2, n-3, and n-4 is off? It looks as if the read (for n-2) started at base +4?
                              Thanks so much!
                              And, oh, this is my first post... :-)
