
  • Temporary deterioration of Q-scores at start of run

    We are performing some 16S sequencing using 2x300 bp reads (V3 chemistry) on the MiSeq. Libraries were prepared using Illumina's protocol for 16S V3V4 amplicon libraries (2-step PCR, NexteraXT indexes). We're multiplexing between 192 and 230 samples, and the pool is diluted to 4 pM for loading on the sequencer.

    Our cluster density and clusters PF are both good (about 830 K/mm^2 and >90%, respectively); however, we are seeing a transient drop in Q-score early in read 1, generally between cycles 5 and 25, after which the Q-scores recover. Illumina's public data set shows a similar transient drop in Q-scores, but their tech support is at a loss to explain it.

    Does anyone have any idea what the cause of this drop is, and how we can fix it? I've attached the Q30 chart and heatmap from a run currently in progress as an example of what we're seeing.

  • #2
    That looks like a low-diversity issue from sequencing through the primers. Do you spike in any PhiX? If quality is a concern, spike in 20% PhiX; that is well above Illumina's recommendation, but it gives good quality and more consistent results.



    • #3
      If you want to achieve optimal results, use custom primers with slightly different lengths of padding so that sequencing does not always start at the same position; that will allow automatic color balancing at the beginning of the read.

      If you can't do that, then as nucacidhunter stated, more PhiX should help, as should a lower cluster density or a broader range of primer sequences (with substitutions).



      • #4
        I had heard from tech support that the base caller can only handle 400-500K clusters/mm^2 for each of the 4 channels (A, T, G, and C). If there are more clusters per channel than that, it occasionally 'chokes' and returns horrible Q-scores. A cluster density of 890 is fine for a high diversity library (890K clusters/4 channels = 222.5K clusters/mm^2/channel) but in a low diversity library, 890K clusters all lighting up in the same channel is way more than the 400-500K clusters/mm^2 that the base caller can handle.
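The per-channel arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the `limit_k_per_mm2` parameter are made up for the example, the 500K/mm^2 default is the secondhand figure quoted above rather than a documented specification, and it treats each base as its own channel, as the post does.

```python
def clusters_per_channel(density_k_per_mm2, base_fractions):
    """Split a total cluster density (K/mm^2) across the four bases for one cycle."""
    return {base: density_k_per_mm2 * frac for base, frac in base_fractions.items()}


def overloaded_channels(density_k_per_mm2, base_fractions, limit_k_per_mm2=500.0):
    """Return the bases whose per-channel density exceeds the assumed limit."""
    per_channel = clusters_per_channel(density_k_per_mm2, base_fractions)
    return sorted(b for b, d in per_channel.items() if d > limit_k_per_mm2)


# Base composition at a single cycle (fractions of clusters showing each base).
balanced = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
low_diversity = {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0}  # every cluster reads G

print(clusters_per_channel(890, balanced)["A"])   # 222.5
print(overloaded_channels(890, balanced))         # []
print(overloaded_channels(890, low_diversity))    # ['G']
```

With a balanced library every channel stays well under the assumed limit, while a cycle where all clusters show the same base puts the full 890K/mm^2 into one channel.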

        My lab sequences a home-brewed 16S V1V2 amplicon on MiSeq and we always aim for 450-500K clusters/mm^2 and a 15% PhiX spike-in and we see excellent Q-scores even in low diversity regions.



        • #5
          We run a lot of amplicons (mostly 16S) on the MiSeq at > 900K clusters/mm^2 by using the shifted padding method Brian mentioned. No need to waste money on re-sequencing PhiX.



          • #6
            We currently spike in PhiX at 10%. We could go higher, but I don't really want to, as it seems a waste to be continually re-sequencing that much PhiX. Our cluster density is based on what Illumina recommended for this particular procedure, but I suppose we could lower it further. I'm kind of loath to reduce cluster density AND increase PhiX at the same time, as then we're really reducing yield and we'd have to lower our multiplex.
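To put rough numbers on that trade-off, here is a minimal sketch. The function name, the `density_scale` parameter, and the 20 million read total are all hypothetical placeholders, not values from a real run, and it assumes yield scales linearly with cluster density, which is a simplification.

```python
def reads_per_sample(total_pf_reads, phix_fraction, n_samples, density_scale=1.0):
    """Estimate usable (non-PhiX) reads per sample for an evenly pooled run.

    density_scale is the assumed ratio of the new cluster density to the
    current one (1.0 means unchanged).
    """
    if not 0.0 <= phix_fraction < 1.0:
        raise ValueError("phix_fraction must be in [0, 1)")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total_pf_reads * density_scale * (1.0 - phix_fraction) / n_samples


TOTAL_PF_READS = 20_000_000  # hypothetical run total, for illustration only

for phix in (0.10, 0.20):
    full = reads_per_sample(TOTAL_PF_READS, phix, 192)
    reduced = reads_per_sample(TOTAL_PF_READS, phix, 192, density_scale=0.6)
    print(f"PhiX {phix:.0%}: {full:,.0f} reads/sample, "
          f"{reduced:,.0f} at 60% of the cluster density")
```

Plugging in your own run totals shows how quickly the per-sample depth drops when both knobs are turned at once.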

            Can you give me an example of the shifted padding you mean? I assume you mean custom sequencing primers and not custom amplicon primers, right? Do you simply use a mixture of primers that have one, two or three extra bases on the 3' end, for example? You order each of these separately and then mix them in equimolar amounts?
            Last edited by cheezemeister; 02-05-2015, 08:24 PM.



            • #7
              I don't really know the biochemistry, just the theory. But basically:

              Let's say 16S starts with a set sequence Z. You should amplify with primer Z (and ideally a mix of other primers that are like Z but with common substitutions).

              Now you have a lot of molecules that start with sequence Z. If you sequence them, for the first lengthOf(Z) bases, everything will be identical and thus have terrible quality. But if the sequencing primers have variable-length padding with respect to the binding locus, such that some reads start at base 1 of the sequencing primer sequence; some of them start at base 2 of the sequencing primer sequence; and some start at base 3 of the sequencing primer sequence - then you will have high diversity and color-balancing. Unless 16S starts with a homopolymer, which hopefully it doesn't.



              • #8
                Just to clarify - we use 5'-padded amplicon primers (consisting of variable numbers of Ns, inline barcodes and 16S sequence). We use up to 48 different inline barcodes and pool the amplicons in normal sequencing libraries.
                PM me if you need more details - that's as much as I'm allowed to say publicly.



                • #9
                  Originally posted by Brian Bushnell
                  I don't really know the biochemistry, just the theory. But basically:

                  Let's say 16S starts with a set sequence Z. You should amplify with primer Z (and ideally a mix of other primers that are like Z but with common substitutions).

                  Now you have a lot of molecules that start with sequence Z. If you sequence them, for the first lengthOf(Z) bases, everything will be identical and thus have terrible quality. But if the sequencing primers have variable-length padding with respect to the binding locus, such that some reads start at base 1 of the sequencing primer sequence; some of them start at base 2 of the sequencing primer sequence; and some start at base 3 of the sequencing primer sequence - then you will have high diversity and color-balancing. Unless 16S starts with a homopolymer, which hopefully it doesn't.
                  Just to be clear, the padding/offset has to be in the amplicon primers, not the sequencing primers. Each cluster consists of ~1000 copies of the initial amplicon template, and each copy must be read by the same primer. Using a mixture of offset sequencing primers would result in phase shifting within the cluster and yield no useable data. In contrast, padding the amplicon primer produces phase shifting between clusters, and all can be sequenced by the identical primer yet produce per-cycle sequence diversity.
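A toy simulation can make the between-cluster phase shifting concrete. This is a sketch under stated assumptions: the sequence `Z` and the `SPACERS` are hypothetical stand-ins (not real 16S primers or published spacer designs), each list entry represents one cluster's read, and the four padding lengths are assumed to be equally represented.

```python
from collections import Counter

# Hypothetical stand-in for the conserved primer region "Z"; not a real primer.
Z = "GTGCCAGCAGCCGCGGTAA"
# Hypothetical 5' spacers of 0-3 bases added to the amplicon primers.
SPACERS = ["", "T", "GT", "CGT"]


def dominant_base_fraction(reads, n_cycles):
    """For each cycle, return the fraction of reads showing the most common base."""
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads)
        fractions.append(max(counts.values()) / len(reads))
    return fractions


unpadded = [Z] * len(SPACERS)                # every cluster starts at the same base
padded = [spacer + Z for spacer in SPACERS]  # clusters are offset from each other

unpadded_fracs = dominant_base_fraction(unpadded, len(Z))
padded_fracs = dominant_base_fraction(padded, len(Z))

print("unpadded:", unpadded_fracs[:5])
print("padded:  ", padded_fracs[:5])
```

Without padding, one base accounts for 100% of the clusters at every cycle; with padded amplicon primers the dominant base drops below that at most cycles, even though a single sequencing primer reads every cluster.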



                  • #10
                    For someone looking to get practical information: http://www.nature.com/nmeth/journal/...meth.2634.html

                    Examples of frame-shifting primers are in supplementary materials, Figure 2.



                    • #11
                      Originally posted by HESmith
                      Just to be clear, the padding/offset has to be in the amplicon primers, not the sequencing primers. Each cluster consists of ~1000 copies of the initial amplicon template, and each copy must be read by the same primer. Using a mixture of offset sequencing primers would result in phase shifting within the cluster and yield no useable data. In contrast, padding the amplicon primer produces phase shifting between clusters, and all can be sequenced by the identical primer yet produce per-cycle sequence diversity.
                      Thanks for the correction; I had assumed that each cluster would get only a single sequencing primer.



                      • #12
                        Originally posted by Brian Bushnell
                        I don't really know the biochemistry, just the theory. But basically:

                        Let's say 16S starts with a set sequence Z. You should amplify with primer Z (and ideally a mix of other primers that are like Z but with common substitutions).

                        Now you have a lot of molecules that start with sequence Z. If you sequence them, for the first lengthOf(Z) bases, everything will be identical and thus have terrible quality….
                        Brian,

                        We do a lot of 16S (V4 and V3-V4) sequencing using the method you just described, i.e. all sequence reads starting at the "identical" point and the quality is definitely not terrible. It's quite good in fact. We typically will load a v2 flow cell @ ~800K/mm^2, add 7.5-10% PhiX. We add custom, degenerate sequencing/index primers which overlap the degenerate PCR primers (as described by Rob Knight's and Patrick Schloss' groups) and it works out well.

                        Here's a FastQC plot of one we just ran.



                        • #13
                          That's interesting. Our lab specifically moved to variable-length padding for V4 because we got bad results otherwise. And we're now doing some new procedure that also has identical sequences in one part of the read, without variable-length padding, and those do come out with terrible quality in that part of the read. I suppose the key is the degree of degeneracy in the primers.
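One rough way to compare "degree of degeneracy" between primer designs is to count how many distinct sequences a degenerate primer expands to. The sketch below uses the standard IUPAC nucleotide codes; the function names and the example primer string are hypothetical and not taken from any protocol mentioned in this thread.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for code in primer.upper():
        total *= len(IUPAC[code])
    return total


def choices_per_position(primer):
    """Number of possible bases at each position (1 means no diversity there)."""
    return [len(IUPAC[code]) for code in primer.upper()]


example = "ACGTNRY"  # hypothetical primer, for illustration only
print(degeneracy(example))            # 16
print(choices_per_position(example))  # [1, 1, 1, 1, 4, 2, 2]
```

Positions with a count of 1 are the cycles where every cluster would show the same base, so the per-position list may be more informative than the overall total.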
