Hi, I've been seeing very short insert sizes on the NovaSeq 6000 with 2x150 bp whole-genome sequencing.
Insert sizes inferred from read mapping are much shorter than the DNA fragment sizes of the library as measured by electrophoresis (average 450 bp).
See the histogram below. As most insert sizes are <300 bp, the mates overlap, and many overlap over their full length (insert size 150 bp). In that case the two mates don't yield 2x150 bp of sequence, but only 1x150 bp. We might as well perform single-end sequencing.
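To make the overlap arithmetic concrete, here is a minimal sketch. It assumes "insert size" means the length of the DNA fragment between the adapters and that both reads are exactly 150 bp; the function name `pair_coverage` is just made up for illustration.

```python
def pair_coverage(insert_size: int, read_len: int = 150) -> tuple[int, int]:
    """Return (unique_bases, overlapping_bases) for one read pair.

    Assumption: insert_size is the fragment length without adapters,
    and both mates have the same length read_len.
    """
    sequenced = 2 * read_len
    # A pair can never cover more distinct bases than the fragment contains.
    unique = min(insert_size, sequenced)
    # Bases read by both mates; for fragments shorter than one read,
    # the whole fragment is covered twice.
    overlap = min(insert_size, max(0, sequenced - insert_size))
    return unique, overlap


for size in (150, 250, 300, 450):
    unique, overlap = pair_coverage(size)
    print(f"insert {size} bp: {unique} unique bp, {overlap} bp read twice")
```

With a 150 bp insert the pair carries only 150 distinct bases, which is what I mean by 1x150 bp; only from 300 bp upwards does a pair deliver the full 300 bp.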
We never had this issue with HiSeq, but we've had it with NovaSeq at two different sequencing centers. The same problem affected a colleague working with a different sequencing center using NovaSeq, on a different organism. I've yet to see decent insert sizes obtained with this technology, but people usually don't report this metric and perhaps rarely measure it.
Is there a bias favoring the sequencing of shorter fragments on the NovaSeq platform?
Thanks.
Jean