  • Andersen
    Member
    • Oct 2015
    • 15

    Illumina Unique Molecular Identifier Adaptor

    Hi All,

    I want to generate RRBS libraries in which we can track each unique molecule with a UMI. To that end, I have designed new adaptors based on the TruSeq adaptors that we normally use.

    The regular TruSeq adaptor oligos look like this:
    A1_P5: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T
    A1_P7 (AR005): P-GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG

    What we did was to add 8 random nucleotides to the P5 so that it looks like this:
    A1_P5 UMI8: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNN*T

    From what I have heard, I should anneal the two adaptors. The annealing efficiency can be seen here: https://imgur.com/a/70LZU

    What I did now was to try our regular protocol with old adaptors vs new adaptors.

    The libraries look like this: https://imgur.com/a/mHZeF

    Stupidly, I did not generate another P5 oligo without the 8 UMI bases.

    The new adaptors seem to form adaptor dimers, but they do not seem to ligate to the library inserts.

    Suggestions of other designs or ways to get it to work would be highly appreciated.

    I have looked into the dual-index adaptors and replacing one index with a UMI instead, but I could not find the sequences. Do any of you have them?

    Best regards
    Emil
  • pmiguel
    Senior Member
    • Aug 2008
    • 2328

    #2
    Is this a Y-adapter design? If so, the P7 and P5 need to anneal over the last 12 bases of the P5/first 12 bases of the P7, with a 3' "T" overhang, to work with many Illumina workflows. When you add 8 N's and a T to the P5, you create a 9 base 3' overhang. There is no chance you can ligate that to anything using a double-stranded DNA ligase.

    I suppose the opposite design, where you put the 8 N's at the 5' end of the P7, could work. But you would have to anneal the P5 and P7 by their 12 bases of complementarity and then extend the 3' end of the P5 strand using Klenow, for example. That would give you a blunt adapter, so your inserts would need to be blunt as well, which would allow chimeric inserts and the formation of massive amounts of adapter dimers.

    BTW, Illumina asks for the following:
    Oligonucleotide sequences © 2017 Illumina, Inc. All rights reserved.
    and
    Oligonucleotide sequences © 2017 Illumina, Inc. All rights reserved. Derivative works created by
    Illumina customers are authorized for use with Illumina instruments and products only. All other
    uses are strictly prohibited.


    to be added to any sequence information derived from their adapter sequences if it is distributed or published outside one's institution.

    The correct position to place a UMI in the P5 index site would be after the TCTACAC. That is:
    AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATCT

    But, because 8 N's can compose, at most, 4^8 = 65,536 (~64K) sequence combinations, they would not be "unique" in the context of a sample producing millions of sequences. And I don't know how far you can push the length of the i5 index. Probably to 12, at least. But going beyond 8 bases, you would need to give up cycles from the sequence reads to make up for the reagents you would be using to read a longer UMI.

    --
    Phillip
    Last edited by pmiguel; 10-11-2017, 09:05 AM.
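The arithmetic in post #2 can be checked with a minimal sketch. It assumes every UMI is drawn uniformly at random, which real oligo synthesis only approximates; the function names and the figure of one million molecules are illustrative and not from the thread.

```python
def umi_space(length: int) -> int:
    """Number of distinct sequences a run of `length` random bases can encode."""
    return 4 ** length


def expected_distinct(length: int, molecules: int) -> float:
    """Expected number of different UMIs seen when `molecules` tags are
    drawn uniformly at random (an idealising assumption)."""
    n = umi_space(length)
    return n * (1.0 - (1.0 - 1.0 / n) ** molecules)


if __name__ == "__main__":
    molecules = 1_000_000  # illustrative library size
    for length in (8, 12):
        n = umi_space(length)
        seen = expected_distinct(length, molecules)
        print(f"{length} N's: {n:,} possible UMIs, "
              f"about {seen:,.0f} distinct among {molecules:,} molecules")
```

With 8 N's the tag space is exhausted long before a million molecules have been labelled, whereas 12 N's leave most molecules with a tag of their own under this model.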


    • Andersen
      Member
      • Oct 2015
      • 15

      #3
      Thank you so much for your answer Phillip.

      1. I am actually not really sure if it is a Y-adapter design. But I am pretty sure that it is. I tried to find information about it but couldn't, but it is the same adapter as used for TruSeq LT.

      2. Thank you for suggesting not to go with blunt inserts.

      3. Thank you for that suggestion. Do you know if any of the current kits from Illumina use dual indexing together with the Y-adaptor setup that I presumably have? I believe this would be the best setup to run, since I would be sure to have intact annealing at the complementary 12 bases.

      Thank you for your answers
      Kind regards
      Emil



      • pmiguel
        Senior Member
        • Aug 2008
        • 2328

        #4
        Hi Emil,
        Most TruSeq Illumina kits use the Y-adapters. The common "TruSeq" DNA and RNAseq ones will offer a normal single index kit option (usually going up to 24 indexes) or a "high throughput" one with dual indexes allowing you to multiplex 96 samples in a lane. Well, as long as you don't need them to be "unique dual".

        The parts of the dual index Y-adapters containing the indexes do not anneal! Only the 12 bases of the Y-adapter most proximate to the insert are annealed. The rest of the adapter has non-complementary sequence that won't anneal and hangs off like a forked tail. Get it? "Y" where the double-stranded part is the stem of the "Y" and the single-stranded tails are the top part. It is only during subsequent PCR that this tail region becomes double-stranded.

        This is one of those brilliant solutions to a bunch of amplicon construction issues when using ligation. You need to have a forward and a reverse adapter. But if you just make them both double-stranded, so that T4 DNA ligase will use them as a substrate, you create 2 problems. First, some of your constructs will get forward adapters on both ends or reverse adapters on both ends and won't be suitable templates for clustering. Second, with a double-stranded molecule you have two ends that can be ligated -- possibly to each other.

        So some fiendish genius came up with the idea of annealing single-stranded versions of both the left and right adapters to each other such that only one end was actually annealed. This way each double-stranded insert was guaranteed to have an R adapter on one end and an F adapter on the other end. Actually, each strand of the double-stranded insert would have a single-stranded R on one end and F on the other: the top strand in one orientation (say, F-insert-R), the bottom strand in the other orientation (R-insert-F).

        --
        Phillip
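The strand layout described in post #4 can be sketched in a few lines. This is a simplified model, not a protocol: it assumes an A-tailed insert, uses the two oligos from post #1, and the insert sequence and function names are made up for illustration.

```python
# Oligonucleotide sequences (c) 2017 Illumina, Inc. All rights reserved.
# Both oligos are copied from post #1; the "*" and 5' phosphate marks are dropped.
P5 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
P7 = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG"

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def ligated_strands(insert_top: str) -> list[str]:
    """Both strands (written 5'->3') of an A-tailed insert after a
    Y-adapter has been ligated to each end."""
    strands = []
    for insert_strand in (insert_top, revcomp(insert_top)):
        # The P5 oligo (3' T) joins the 5' end of each insert strand; the
        # A-tail and the 5'-phosphorylated P7 oligo follow at the 3' end.
        strands.append(P5 + insert_strand + "A" + P7)
    return strands


if __name__ == "__main__":
    hypothetical_insert = "ACGTTGCAAGGCTTAACCGG"  # invented for the example
    for name, strand in zip(("top", "bottom"), ligated_strands(hypothetical_insert)):
        print(name, strand.startswith(P5), strand.endswith(P7))
```

Every strand carries the P5 oligo at its 5' end and the P7 oligo at its 3' end, so no construct ends up with the same adapter arm on both ends of one strand.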


        • nucacidhunter
          Jafar Jabbari
          • Jan 2013
          • 1250

          #5
          Sequence and structure of TruSeq HT adapters is attached.

          You would need to replace the i5 index sequence with N's to use it as a UMI.

          Another option you might consider is: http://www.nugen.com/products/ovatio...hyl-seq-system

          It has a 6-base UMI which follows the index, so the index 1 read has to be 12 cycles to utilize the UMI, or 6 cycles for the index alone. Another advantage is that they have included diversity nucleotides, so libraries can be sequenced with a 1% PhiX spike-in. In the conventional protocol, a higher PhiX fraction (>30%) is required.
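A 12-cycle index 1 read of the kind described in post #5 could be split as sketched below. The layout (6 index bases first, then 6 UMI bases) is an assumption taken from the wording "follows the index", and the function name and example read are hypothetical.

```python
def split_index_read(read: str, index_len: int = 6, umi_len: int = 6) -> tuple[str, str]:
    """Split an index read into (sample index, UMI).

    Assumes the sample index comes first and the UMI directly after it.
    """
    if len(read) < index_len + umi_len:
        raise ValueError("index read is shorter than index + UMI")
    return read[:index_len], read[index_len:index_len + umi_len]


if __name__ == "__main__":
    index, umi = split_index_read("ACAGTGTTGACC")  # invented 12-base read
    print(index, umi)
```

If the index read is run for only 6 cycles, the UMI bases are never sequenced and only the sample index is available.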
          Attached Files


          • torben
            Member
            • Oct 2012
            • 21

            #6
            For a way to create TruSeq adapters with UMI at the end see Kennedy SR, et al. Detecting ultralow-frequency mutations by Duplex Sequencing. Nat Protoc. 2014 Nov;9(11):2586-606. doi: 10.1038/nprot.2014.170. https://www.nature.com/nprot/journal....2014.170.html


            • Andersen
              Member
              • Oct 2015
              • 15

              #7
              Thanks again Phillip!

              Indeed, they must be some fiendish geniuses.

              I have already generated the TruSeq DNA LT adapter piece with a 6 nt index. Do you think it would work to anneal the i5 adapter to this adapter, or should I generate new i7 adapters as well?

              Also to nucacidhunter and torben, thanks for the suggestions!

              Cheers
              Emil


              • pmiguel
                Senior Member
                • Aug 2008
                • 2328

                #8
                Hi Emil,
                I would strongly recommend that you verify this yourself by aligning your p5 and the reverse (in the 3' - 5' direction) of your p7 sequence. You will see the terminal 12 bases on one side are complements of each other with just a 3' "T" overhang provided by the p5 oligo.
                Once you have done that, you will understand how a Y-adapter is structured to function as it does.
                --
                Phillip
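The check suggested in post #8 can also be done programmatically. The sketch below walks in from the 3' end of the P5 oligo and the 5' end of the P7 oligo and counts paired bases; N's are treated as unpaired. The function names and the `min_stem` threshold are illustrative choices, and the second P5 is the i5-UMI design from post #2.

```python
# Oligonucleotide sequences (c) 2017 Illumina, Inc. All rights reserved.
P5 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
P5_I5_UMI = "AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACGCTCTTCCGATCT"
P7 = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG"

COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}  # N never counts as paired


def stem_length(p5: str, p7: str, overhang: int) -> int:
    """Consecutive base pairs between the 3' end of p5 (after skipping
    `overhang` unpaired bases) and the 5' end of p7."""
    p5_paired = p5[: len(p5) - overhang][::-1]  # read p5 in the 3'->5' direction
    paired = 0
    for b5, b7 in zip(p5_paired, p7):
        if COMP.get(b7) != b5:
            break
        paired += 1
    return paired


def find_stem(p5: str, p7: str, min_stem: int = 8):
    """Return (stem length, 3' overhang) for the smallest overhang that
    gives a stem of at least `min_stem` bases, or (0, None) if none does."""
    for overhang in range(len(p5)):
        paired = stem_length(p5, p7, overhang)
        if paired >= min_stem:
            return paired, overhang
    return 0, None


if __name__ == "__main__":
    for name, p5 in (("regular P5", P5), ("i5-UMI P5", P5_I5_UMI)):
        stem, overhang = find_stem(p5, P7)
        print(f"{name}: {stem} paired bases, {overhang} base 3' overhang")
```

For both of these P5 oligos the sketch reports a 12-base stem with a single-base 3' overhang, which is the structure described in the thread; placing the N's in the i5 position leaves the ligatable end untouched.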


                • Andersen
                  Member
                  • Oct 2015
                  • 15

                  #9
                  Thank you for all your help!

                  I have now ordered the adapters and hope they will work!


                  • nucacidhunter
                    Jafar Jabbari
                    • Jan 2013
                    • 1250

                    #10
                    I hope that you have asked for all C residues to be synthesized as mC (which is very expensive) to prevent C conversion to U during bisulfite treatment, unless you are using techniques that do not require mC in adapters.
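The point in post #10 can be illustrated with a toy model of bisulfite conversion. It assumes complete conversion and that an oligo's C residues are either all methylated or all unmethylated; the function names are hypothetical, and the 12-mer is the start of the P7 oligo from post #1.

```python
def bisulfite_convert(seq: str, methylated_c: bool = False) -> str:
    """Sequence as it reads after bisulfite treatment and PCR.

    Unmethylated C is converted to U and read as T; methylated C is
    left unchanged (toy model with 100% conversion).
    """
    return seq if methylated_c else seq.replace("C", "T")


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Oligonucleotide sequences (c) 2017 Illumina, Inc. All rights reserved.
    p7_start = "GATCGGAAGAGC"  # first 12 bases of the P7 oligo
    for label, protected in (("mC adapter", True), ("unmethylated adapter", False)):
        converted = bisulfite_convert(p7_start, methylated_c=protected)
        print(f"{label}: {converted} ({mismatches(p7_start, converted)} changed bases)")
```

Any changed base in the adapter is a mismatch against the PCR and sequencing primers designed for the original sequence, which is why adapters ligated before bisulfite treatment are usually ordered with mC.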


                    • Andersen
                      Member
                      • Oct 2015
                      • 15

                      #11
                      Originally posted by nucacidhunter View Post
                      I hope that you have asked for all C residues to be synthesized as mC (which is very expensive), unless you are using techniques that do not require mC in adapters.

                      Indeed expensive, but yes it is synthesized with mC. Thanks for the heads up.
