Hello all,
I have made several attempts at troubleshooting the sequencing of a low-diversity, small amplicon library on a NextSeq 2000 using P1 50-cycle reagent kits. The run settings are generally single-end, 50 bp read length, with 2 index reads. The library is fairly custom in both generation and structure. Following other groups doing DNA barcoding of LNPs, I PCR-amplified a mixture of 10 - 20 DNA barcodes from either pure input controls or experimental samples. The final library is 155 bp long and uses Nextera-derived dual indices. Following the Read 1 sequence, the insert consists of a 7 bp UMI, followed immediately by the experimental 8 bp barcode, then the Read 2 sequence.
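To make the insert layout concrete, here is a minimal Python sketch of how I would expect a read to be split, assuming the read starts at the first base of the UMI. The function name `split_read` and the example sequence are made up for illustration and are not from my actual pipeline or library.

```python
# Insert layout as described above: 7 bp UMI, then 8 bp barcode,
# then the start of the Read 2 sequence.
UMI_LEN = 7
BARCODE_LEN = 8


def split_read(read: str) -> tuple[str, str, str]:
    """Split a read into (umi, barcode, remainder).

    Assumes the read begins at the first base of the UMI
    (hypothetical helper, for illustration only).
    """
    if len(read) < UMI_LEN + BARCODE_LEN:
        raise ValueError("read is shorter than UMI + barcode")
    umi = read[:UMI_LEN]
    barcode = read[UMI_LEN:UMI_LEN + BARCODE_LEN]
    remainder = read[UMI_LEN + BARCODE_LEN:]
    return umi, barcode, remainder


if __name__ == "__main__":
    # Made-up example: 7 bp UMI + 8 bp barcode + placeholder downstream bases.
    example = "ACGTACG" + "TTGCAAGC" + "NNNN"
    print(split_read(example))
```

With this layout, only the first 15 cycles of the 50 bp read carry the UMI and barcode; the remaining cycles read into the constant Read 2 sequence.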
After several runs, I have consistently seen low pass-filter (PF) rates (~20 - 45%) and a large proportion of reads with ambiguous base calls. The poor base calling follows a specific pattern: the first 35 bases all have a mean Q30% < 5, and then at base 36 (partway into the Read 2 sequence) the value jumps to > 30. I have tried titrating down the final loading concentration (1000, 650, 100, 10 pM) and increasing the PhiX spike-in (5, 20, 50, 90%), although I haven't tested every combination. The most "successful" run (as measured by usable reads for analysis) used a 100 pM load with a 30% PhiX spike-in, which gave a ~30% PF rate and the same base-calling problems. The 50% and 90% PhiX runs showed the expected PhiX alignment % but no real improvement in base calling.
I am also confused that the BaseSpace run summary shows an average Q30 score of > 90%, while the run report shows each demultiplexed sample with a Read 1 Q-score base % of only about 1 - 5%. Does the former metric include the PhiX reads? At this point, I am open to any thoughts on what may be happening. I am admittedly not well-versed in amplicon-seq, so it's very possible I am missing an obvious reason for this.
Thanks for your time!