
  • NovaSeq 6000: very low insert size

    Hi, I've been seeing very short insert sizes on the NovaSeq 6000 with 2x150 bp whole-genome sequencing.
    Insert sizes inferred from the mapped reads are much smaller than the DNA fragment sizes of the library, as measured by electrophoresis (average 450 bp).
    [Image: electrophoresis.jpg]

    See the histogram below. As most insert sizes are <300 bp, mates overlap, and many overlap over their full length (insert size 150 bp). In that case the two mates don't represent 2x150 bp of sequence, but only 1x150 bp. We might as well perform single-end sequencing.

    [Image: inserts.jpg]
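The overlap arithmetic above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: both mates are the same length, they start at opposite ends of the fragment, and a read is trimmed to the fragment when the fragment is shorter than the read. The function names and the example insert sizes are made up for the sketch.

```python
def unique_bases(insert_size: int, read_len: int = 150) -> int:
    """Distinct fragment bases covered by one read pair (assumptions above)."""
    if insert_size <= 0:
        return 0
    # A pair can never cover more than the fragment, nor more than two reads.
    return min(insert_size, 2 * read_len)


def mate_overlap(insert_size: int, read_len: int = 150) -> int:
    """Number of bases sequenced twice because the two mates overlap."""
    if insert_size <= 0:
        return 0
    sequenced = 2 * min(read_len, insert_size)  # bases read across both mates
    return max(0, sequenced - insert_size)


def useful_fraction(insert_sizes, read_len: int = 150) -> float:
    """Share of the nominal 2 x read_len yield that is non-redundant."""
    insert_sizes = list(insert_sizes)
    total = sum(unique_bases(s, read_len) for s in insert_sizes)
    return total / (2 * read_len * len(insert_sizes))


# Hypothetical insert sizes, for illustration only.
for size in (150, 200, 450):
    print(size, unique_bases(size), mate_overlap(size))
```

With a 150 bp insert the pair covers 150 distinct bases and the mates overlap entirely; with a 450 bp insert the pair covers the full 300 bases with no overlap.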

    We never had this issue with HiSeq, but we've had it with NovaSeq at two different sequencing centers. The same problem affected a colleague working on a different organism, with a different sequencing center that also uses NovaSeq. I've yet to see decent insert sizes obtained with this technology, but people usually don't report this metric and perhaps rarely measure it.

    Is there a bias favoring the sequencing of shorter fragments on the NovaSeq platform?

    Thanks.

    Jean




    Last edited by jeanlain; 04-13-2024, 08:00 AM.

  • #2
    Which version of HiSeq did you use previously? The HiSeq 4000, NovaSeq, and NextSeq 2000 all use a newer clustering chemistry known as Exclusion Amplification (called ExAmp in most Illumina documentation), in which a template seeds a microwell on the flow cell and is amplified immediately, so that the cluster occupies the well before other templates can seed. MiSeqs, NextSeq 500s, and HiSeq 2000/2500s use random seeding, which doesn't favor specific fragment sizes. The rapid seeding during ExAmp favors short fragments, which seed first. If you map read positions on the flow cell, you should see that the insert length at the start of the lane is shorter than at the end of the lane.
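One way to check for a position effect is to group insert sizes by flow cell tile. The sketch below is a minimal, standard-library-only example. It assumes the read names in the SAM file keep the usual Illumina layout `instrument:run:flowcell:lane:tile:x:y`, and it uses the TLEN column (field 9) of properly paired, primary alignments as the insert size. The function name is hypothetical, and how tile numbers relate to physical position along the lane depends on the instrument, so that mapping is left to the reader.

```python
from collections import defaultdict
from statistics import median


def median_insert_by_tile(sam_lines):
    """Return {(lane, tile): median insert size} from SAM text lines."""
    by_tile = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        qname, flag, tlen = fields[0], int(fields[1]), int(fields[8])
        # Keep properly paired (0x2) records, drop secondary (0x100) and
        # supplementary (0x800) ones, and count each pair once by keeping
        # only the mate with a positive TLEN.
        if not flag & 0x2 or flag & 0x900 or tlen <= 0:
            continue
        parts = qname.split(":")
        if len(parts) < 7:                # read name not in the assumed layout
            continue
        lane, tile = parts[3], parts[4]
        by_tile[(lane, tile)].append(tlen)
    return {key: median(values) for key, values in sorted(by_tile.items())}
```

Typical use would be to stream the output of `samtools view` into this function, for example by passing it an open text file or `sys.stdin`, and then to plot the medians against tile.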

    • #3
      Thanks for the reply. I think we were using HiSeq 2500.

      • #4
        This is an old article, but the author thoroughly explains some of the drawbacks of ExAmp, including the short fragment bias.

        The HiSeq 4000 was Illumina's way of making the patterned flowcell technology available to non X Ten customers, and opening up patterned ...

        • #5
          Thanks.
          As you can see from my first post, the bias towards shorter fragments is very strong. Is it always that strong? I don't see many people complaining about it, but it's a big problem if half of your sequence is duplicated because the mates overlap.

          • #6
            I don't know that anyone has measured how extreme the bias is for library fragments. If I remember correctly Illumina published some rough numbers early on stating that adapter dimers (much shorter than a library) could take up 5-10x more of your reads on a patterned flow cell compared to what they did on a nonpatterned, but I don't know how to find where I read that initially.

            The closest I've found is point 3 on this post about HiSeq 4000 services, which says that 1% dimer can translate to 6% of reads, and 10% dimer up to 84% of reads.
            Illumina HiSeq 3000 HiSeq 4000 instrument: considerations, limitations and service prices.
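As a rough worked example, the two figures quoted above can be converted into an implied clustering advantage for dimers. The sketch assumes a very simple model that is not taken from the linked post: every molecule of a given type clusters with a fixed relative weight, so the advantage is the ratio of the odds in the reads to the odds in the library. The function names are hypothetical, and whether the quoted percentages are molar or mass fractions is not stated.

```python
def implied_advantage(library_fraction: float, read_fraction: float) -> float:
    """Odds ratio: how much more often a short molecule clusters than a library one."""
    library_odds = library_fraction / (1.0 - library_fraction)
    read_odds = read_fraction / (1.0 - read_fraction)
    return read_odds / library_odds


def expected_read_fraction(library_fraction: float, advantage: float) -> float:
    """Inverse of the above: read share expected for a given advantage."""
    weighted = library_fraction * advantage
    return weighted / (weighted + (1.0 - library_fraction))


print(round(implied_advantage(0.01, 0.06), 2))  # 1% dimer -> 6% of reads
print(round(implied_advantage(0.10, 0.84), 2))  # 10% dimer -> 84% of reads
```

Under this model the first figure implies an advantage of about 6.3 and the second about 47, so a single constant weight does not reproduce both numbers. The sketch is useful for order-of-magnitude reasoning only.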


            My recommendation would be to fragment gDNA less to create larger inserts, or if you're performing a double-sided cleanup at the end of library prep to generate the profile in the electropherogram you posted above, adjust your ratios to eliminate more short fragments and shift the distribution to the right. If short fragments aren't present, the bias won't allow them to be over-represented.

            • #7
              Thanks for the recommendation. We don't prepare the libraries ourselves, we just send genomic DNA to sequencing platforms. We may ask to maximize insert size.
              An analysis of the selection bias would be worth publishing, as the problem may be important. It can greatly reduce the amount of useful sequence data you obtain. Not only can the number of distinct bases sequenced be much less than 2x150 per read pair, but you also end up with nothing useful if the read pair has a mapping quality of zero, because the effective sequence is too short and maps to several locations with equal score. In the end, you may lose a lot of data.

              • #8
                Dear all, which sequencing platform do you recommend for isolated pathogenic bacteria: the Illumina NovaSeq 6000 or the Illumina NextSeq 550? We intend to explore virulence genes, resistance genes, and all SNPs and other variants.
