  • Basequality Dropoff

    Hello people,

    First-time poster and also kind of a first-time sequencer.
    I just started my PhD, which involves a lot of NGS on a MiSeq, and my supervisor and I ran 11 samples last week.

    The initial metrics looked very nice (cluster density of 1100 K/mm^2 and 92% of clusters passing filter), but the data after the run showed a quality drop-off in both directions after about 100 cycles, with only 40% of bases at Q30 or above.
    Neither of us really knows what the issue could be. We think it was probably not an issue with the adaptors or clustering, since that looked nice in the beginning, so I decided to take on the troubleshooting!

    You will find the screenshot attached. Any input is valued. Thanks in advance!
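For anyone wanting to reproduce the "% of bases at Q30 or above" figure per cycle from the FASTQ files, here is a minimal sketch. It assumes Phred+33 quality encoding (the usual encoding for current Illumina FASTQ output); the function name and the toy quality strings are made up for illustration and are not data from this run.

```python
from typing import Iterable, List


def q30_fraction_per_cycle(quality_strings: Iterable[str], offset: int = 33) -> List[float]:
    """Return, for each cycle, the fraction of bases with Phred quality >= 30.

    `offset` is the ASCII offset of the quality encoding (33 is assumed here).
    """
    at_least_q30: List[int] = []
    totals: List[int] = []
    for qual in quality_strings:
        for cycle, char in enumerate(qual):
            # Grow the per-cycle counters the first time a cycle is seen.
            if cycle == len(totals):
                totals.append(0)
                at_least_q30.append(0)
            totals[cycle] += 1
            if ord(char) - offset >= 30:
                at_least_q30[cycle] += 1
    return [good / n for good, n in zip(at_least_q30, totals)]


# Toy quality strings (hypothetical): 'I' encodes Q40 and '#' encodes Q2.
toy_qualities = ["III#", "II##", "I###"]
fractions = q30_fraction_per_cycle(toy_qualities)
for cycle, fraction in enumerate(fractions, start=1):
    print(f"cycle {cycle}: {fraction:.2f} of bases >= Q30")
```

Plotting these fractions against the cycle number should show the same cliff that the run summary shows, and at which cycle it starts.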

  • #2
    1. Are these amplicons or sequences where there is low nucleotide diversity expected?
    2. You possibly have short inserts. Once you have read through an insert, you are sequencing into the adapter on the 3'-end. That leads to low nucleotide diversity and drop-offs in Q-scores.
    3. How much phiX is spiked into this run?
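The arithmetic behind point 2 can be sketched as follows. The read length is not stated in this thread, so the 300 cycles used below is purely an assumed value for illustration, as are the function names.

```python
from typing import Optional


def cycles_past_insert(insert_length: int, read_length: int) -> int:
    """Number of cycles in one read that fall beyond the end of the insert."""
    return max(0, read_length - insert_length)


def first_cycle_in_adapter(insert_length: int, read_length: int) -> Optional[int]:
    """1-based cycle at which the read leaves the insert, or None if it never does."""
    if insert_length < read_length:
        return insert_length + 1
    return None


ASSUMED_READ_LENGTH = 300  # hypothetical; not given in the thread

for insert in (100, 300, 500):
    lost = cycles_past_insert(insert, ASSUMED_READ_LENGTH)
    start = first_cycle_in_adapter(insert, ASSUMED_READ_LENGTH)
    print(f"insert {insert} bp: {lost} cycles past the insert, starting at cycle {start}")
```

Under that assumption, a 100 bp insert would leave the insert at cycle 101, which is the kind of position where a read-through signature would be expected to show up.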


    • #3
      Thanks for your ideas.

      1. In a population-genetics sense? And what is the underlying mechanism linking nucleotide diversity and basecall quality? But I guess not; every sample is the complete Human Cytomegalovirus genome isolated from patients.

      2. We ran the samples through an Agilent Bioanalyzer prior to sequencing. The insert length is not perfectly normally distributed and some samples have small spikes around 300 bp, but the bulk of the library was between 470 and 550 bp.

      3. 1% phiX


      • #4
        Low nucleotide diversity in this case means that in a given cycle the majority of clusters will have the same base, e.g. an "A". When that happens, the ability of the image analysis software to distinguish among clusters is hampered, which can then lead to poor Q scores.
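To check for this kind of low diversity in the reads themselves, one can tabulate the base composition per cycle. Below is a minimal sketch; the function names and the toy reads are hypothetical, and FastQC's "Per base sequence content" plot reports the same kind of information.

```python
from collections import Counter
from typing import Dict, Iterable, List


def base_fractions_per_cycle(reads: Iterable[str]) -> List[Dict[str, float]]:
    """Return, for each cycle, the fraction of reads showing A, C, G, T or N."""
    counters: List[Counter] = []
    for read in reads:
        for cycle, base in enumerate(read.upper()):
            # Add a counter the first time this cycle is encountered.
            if cycle == len(counters):
                counters.append(Counter())
            counters[cycle][base] += 1
    fractions = []
    for counts in counters:
        total = sum(counts.values())
        fractions.append({base: counts[base] / total for base in "ACGTN"})
    return fractions


def dominant_base_fraction(cycle_fractions: Dict[str, float]) -> float:
    """Fraction taken by the most common base in one cycle (1.0 = no diversity)."""
    return max(cycle_fractions.values())


# Toy reads (hypothetical): cycle 1 is balanced, cycles 2 and 3 have a single base.
toy_reads = ["ACGT", "CCGA", "GCGT", "TCGC"]
per_cycle = base_fractions_per_cycle(toy_reads)
for cycle, fractions in enumerate(per_cycle, start=1):
    print(f"cycle {cycle}: dominant base fraction {dominant_base_fraction(fractions):.2f}")
```

A dominant base fraction that climbs towards 1.0 from a certain cycle onwards would point to many clusters reading the same sequence there, for example adapter.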

        I assume this is a V3 reagent run (since you have 1100 k/mm^2 cluster density). While it is possible to push the limit of cluster density (with good/diverse libraries), the fall over the cliff (in terms of Q score drop) is precipitous.

        Have you analyzed the data to see if your assumption in #2 above checks out? In general, smaller fragments cluster more efficiently and will outcompete larger ones every time. Since you have a reference genome available, you can use the method described by Brian in this post to find the real insert sizes in your data. It would be interesting to see how the results compare to your expectation.
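The linked method is not reproduced here. As a different, aligner-agnostic way to get at the same number, the sketch below tabulates observed template lengths from the TLEN column of a paired-end SAM file, which any aligner writing standard SAM should provide. The function name and the toy SAM records are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Standard SAM FLAG bits.
PROPER_PAIR = 0x2
UNMAPPED = 0x4
FIRST_IN_PAIR = 0x40
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def insert_size_histogram(sam_lines: Iterable[str]) -> Counter:
    """Count absolute template lengths (TLEN), once per properly paired fragment."""
    histogram: Counter = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        # Use only the first read of each proper pair so fragments count once.
        if not (flag & PROPER_PAIR) or not (flag & FIRST_IN_PAIR):
            continue
        if tlen != 0:
            histogram[abs(tlen)] += 1
    return histogram


# Toy records (hypothetical): two proper pairs and one unmapped read.
seq, qual = "A" * 50, "I" * 50
toy_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    f"r1\t99\tref\t100\t60\t50M\t=\t150\t100\t{seq}\t{qual}",
    f"r1\t147\tref\t150\t60\t50M\t=\t100\t-100\t{seq}\t{qual}",
    f"r2\t99\tref\t200\t60\t50M\t=\t600\t450\t{seq}\t{qual}",
    f"r3\t77\t*\t0\t0\t*\t*\t0\t0\t{seq}\t{qual}",
]
histogram = insert_size_histogram(toy_sam)
for size in sorted(histogram):
    print(f"{size} bp: {histogram[size]} fragment(s)")
```

If short fragments really dominated clustering, the histogram from the real data should peak well below the 470 to 550 bp seen on the Bioanalyzer.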

        Can you show us what the "Summary" looks like for phiX alignments in that third tab?
        Last edited by GenoMax; 09-24-2018, 05:13 AM.


        • #5
          Ah, low diversity between the clusters in each sequencing cycle, I understand how that would be problematic.
          Yes, it is a V3 run.
          And I see how an overclustering of smaller inserts would lead to sequencing into the opposite adapter with low nucleotide diversity. As soon as I get a PC and access to the institute's server I will look into the linked method of determining the true insert length.
          As I see the dropoff after around 100 bases, I should also see that as the true insert length, right? And that should be pretty salient in the data?

          As my supervisor will only return next week and my PC is still not set up, I won't be able to look up the phiX controls, but I remember him saying that they looked okay.
          Thank you for your help so far, I will post any developments.


          • #6
            As I see the dropoff after around 100 bases, I should also see that as the true insert length, right?
            While that is likely, let us hope it is not the case, because otherwise you would be losing a lot of data to adapters and would have short reads.

            When you have had a chance to investigate let us know what the FastQC profiles and the BBMap insert sizes look like.

            Is this a MiSeq you have control over/physical access to? It may be worth having Illumina tech support take a look at this run remotely. They can diagnose if there was a hardware issue that led to the Q-score drop. If you have a maintenance contract they will likely replace the reagents at no charge in that case.


            • #7
              Do you have access to the BaseSpace account that this was run on? I'd be curious to see what the metrics tab looks like.


              • #8
                It is our own MiSeq and we will have tech support take a look at it. But from a first look at the reads, it actually does seem that the short inserts outcompeted the larger ones during clustering.
