I calculated 8000 PPM of index swapping (cross-contamination) for our NovaSeq run with single indexes, and 120 PPM for dual indexes, when allowing zero barcode mismatches.
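For readers who want to reproduce this kind of figure, here is a minimal sketch of the PPM arithmetic. The post does not say how misassigned reads were counted, so the function name and the read counts below are hypothetical, chosen only so that the output matches the quoted rates.

```python
def swap_rate_ppm(misassigned_reads: int, total_reads: int) -> float:
    """Return misassigned reads per million total reads (PPM)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 1_000_000 * misassigned_reads / total_reads


# Hypothetical counts, picked only to reproduce the rates quoted in the post.
single_index_ppm = swap_rate_ppm(8_000, 1_000_000)
dual_index_ppm = swap_rate_ppm(120, 1_000_000)

# How much lower the dual-index rate is than the single-index rate.
fold_reduction = single_index_ppm / dual_index_ppm

print(f"single index: {single_index_ppm:.0f} PPM")
print(f"dual index:   {dual_index_ppm:.0f} PPM")
print(f"reduction:    {fold_reduction:.1f}x")
```

With the quoted rates, dual indexing comes out roughly 67-fold lower than single indexing.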
-
Originally posted by GenoMax: @cement_head: See if this blog post helps.
In my latest test, NovaSeq only had a 4-5% duplication rate. That's using our own NovaSeq data rather than external data. Overall it's not a huge problem, though it's certainly worth removing. I'm not sure why the number is lower than in my previous tests on external data, which indicated >12%; possibly the chemistry got better. (Edit - I should note that this run used lots of libraries from different organisms multiplexed together, which reduces the apparent duplication rate, but makes it more accurate. That should not be relevant to such a huge discrepancy, though.)
This run was extremely high quality (average 99.6% identity to the reference, or ~Q24) so duplicates were easy to detect. I'm really quite impressed with NovaSeq quality. It's unfortunate that there are only 4 quality scores, but CalcTrueQuality seems to do a good job of recalibrating them to the full range of 0-41, yielding a 0.04 average deviation from the correct quality, down from 1.1 on the raw data. 1.1 is still really good (better than the HiSeq 2500 I compared it to), but having only 4 quality scores makes many operations like trimming and merging less accurate. It's actually very impressive that NovaSeq managed, with 4 quality scores, to get better quality score accuracy than HiSeq 2500. I've drawn a couple of conclusions from this: 1) The HiSeq quality score algorithm is terrible. And 2) NovaSeq is calibrated for successful runs only and cannot produce correct quality scores if there are any anomalies (e.g., if there is a lighting failure producing no signal, it will still output really high quality scores even though all the data is wrong). With our previous unsuccessful run (there was a lighting failure), the average deviation from the correct quality was ~20 (2 orders of magnitude).
Last edited by Brian Bushnell; 07-14-2017, 05:25 AM.
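The "99.6% identity, or ~Q24" conversion in the post above follows the standard Phred definition, Q = -10 * log10(error probability). A minimal sketch, assuming the error rate is simply one minus the identity (the function names are illustrative and not part of any BBTools API):

```python
import math


def identity_to_phred(identity: float) -> float:
    """Convert fractional identity (e.g. 0.996) to a Phred-scaled quality."""
    if not 0.0 <= identity < 1.0:
        raise ValueError("identity must be in [0, 1)")
    error_rate = 1.0 - identity
    return -10.0 * math.log10(error_rate)


def phred_to_error(q: float) -> float:
    """Convert a Phred quality score back to an error probability."""
    return 10.0 ** (-q / 10.0)


q = identity_to_phred(0.996)
print(f"99.6% identity is about Q{q:.1f}")
print(f"Q41 corresponds to an error probability of {phred_to_error(41):.2e}")
```

The same formula explains why a deviation of ~20 Q units is so severe: every 10 units is a tenfold change in the implied error probability.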
-
Forgive this really basic question, but what is the cause of the duplicates on patterned flow cells as opposed to the older HiSeq 2500 approach? Is this due to the density of the clusters and the likelihood of a library molecule detaching and then re-attaching a short distance away? Also, how is this different from a PCR duplicate? Is there any way to tell them apart other than spatial relatedness (i.e., prediction based on XY location)?
-
Originally posted by pmiguel: Okay, I take your point, but an S2 should produce 3 billion clusters per flowcell, whereas a HiSeq 2500 produces about 1.6 billion with v3 chemistry. So the NovaSeq is about 4x less efficient than the HiSeq 2500 in this regard.
A NextSeq produces about 0.4 billion clusters per flowcell. So, the relative efficiencies would be:
(I'm using PF clusters per flowcell / ~number of input amplicon molecules)
HiSeq2500v3 = 1.6/7 = 23%
NextSeq = 0.4/1.4 = 29%
NovaSeqS2 = 3/90 = 3.3%
So, it absolutely looks like a much lower efficiency of clustering on the NovaSeq. (Anyone know if this is also the case for the HiSeq3000/4000?)
From what I could glean from the published specs (which are really vague, perhaps on purpose), the amount of library loaded ranges from 3 to 9 billion.
The yield is 0.75 billion to ??? billion (I think those who use these should chime in; it is not clear whether the total yields stated are per flow cell or for both flow cells).
Mind you, the % efficiencies (as you've defined them) are way better than the MiSeq (0.3-0.4%) and the MiniSeq (1-5%).
That said, how much difference will this make for most runs? If you use the standard HiSeq2500 method, you start with 10ul of a 2nM library pool for denaturation. Since it gets diluted down to 20 pM (at least) you end up with 1 ml for each denaturation you do. One denaturation could be used to cluster all 8 lanes of the flowcell. But how often does that happen?
For us, I can't think of a single case where we have clustered more than 2-3 lanes per denatured sample pool. Usually it is 8 sample pools for 8 lanes.
There are cases where the amount of library produced is limiting. And the NovaSeq would not be a good choice where this is your critical parameter.
So in most cases I would say that being forced from 8 lanes to 1 lane and losing the flexibility to run a much smaller flowcell (the 2-lane rapid-chemistry flow cells) are the major limitations of the NovaSeq.
Illumina expects you to just buy a NextSeq to deal with the 2nd issue above. That would be okay (for some definitions of "okay") if they hadn't just decided all the NextSeqs should now have the ability to scan their microarrays. But the option is there.
Then there are the data issues considered in this thread. But I'm pretty sure that is something Illumina can fix (as they did for a period of time with the NextSeq, just after they introduced the v2 version of its chemistry/software) if they focus their attention on it.
Mind you 30% is not bad...it is an interesting threshold when you think about occupancy in space.
Cheers, A.
-
Originally posted by misterc: Is 150ul of a 1nM library what Illumina recommends for a single S2 flow cell?!?
-
I don't know if you can truly compare efficiencies of the ExAmp chemistry with the other instruments.
On the HiSeq and NextSeq instruments you are randomly clustering across the flowcell with a good correlation between how much DNA you load and how many clusters are produced.
On the ExAmp instruments there are only a fixed number of wells in which clusters can be formed. Additionally, you have to deal with the duplicates coming out of those wells and those duplicates that are formed in solution prior to the library going onto the flowcell.
I think what Illumina is trying to do with ExAmp is saturate the array as fully as is practical.
No argument, though, about the loss of flexibility with the NovaSeq. In its current iteration it's not something useful for an all-comers core lab.
-
Originally posted by austinso: On another note:
150 uL of a 1 nM library (~90 billion molecules) minimum for loading is a lot of library when you consider you can get by with 1.4 billion for the NextSeq and 7 billion for the HiSeq.
FWIW...
A NextSeq produces about 0.4 billion clusters per flowcell. So, the relative efficiencies would be:
(I'm using PF clusters per flowcell / ~number of input amplicon molecules)
HiSeq2500v3 = 1.6/7 = 23%
NextSeq = 0.4/1.4 = 29%
NovaSeqS2 = 3/90 = 3.3%
So, it absolutely looks like a much lower efficiency of clustering on the NovaSeq. (Anyone know if this is also the case for the HiSeq3000/4000?)
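The three efficiencies above can be recomputed with a short sketch. The cluster and input-molecule figures (in billions) are the ones given in this post; the dictionary layout and function name are just illustrative.

```python
# (PF clusters per flowcell, input amplicon molecules), both in billions,
# taken from the figures quoted in the post.
PLATFORMS = {
    "HiSeq2500v3": (1.6, 7.0),
    "NextSeq": (0.4, 1.4),
    "NovaSeqS2": (3.0, 90.0),
}


def clustering_efficiency_pct(pf_clusters: float, input_molecules: float) -> float:
    """Percent of loaded molecules that end up as pass-filter clusters."""
    return 100.0 * pf_clusters / input_molecules


efficiencies = {
    name: clustering_efficiency_pct(clusters, molecules)
    for name, (clusters, molecules) in PLATFORMS.items()
}

for name, pct in efficiencies.items():
    print(f"{name:12s} {pct:5.1f}%")

# Ratio of HiSeq 2500 efficiency to NovaSeq S2 efficiency.
fold = efficiencies["HiSeq2500v3"] / efficiencies["NovaSeqS2"]
print(f"HiSeq2500v3 / NovaSeqS2 = {fold:.1f}x")
```

On these inputs the HiSeq 2500 comes out roughly 7-fold more efficient than the NovaSeq S2 by this definition.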
That said, how much difference will this make for most runs? If you use the standard HiSeq2500 method, you start with 10ul of a 2nM library pool for denaturation. Since it gets diluted down to 20 pM (at least) you end up with 1 ml for each denaturation you do. One denaturation could be used to cluster all 8 lanes of the flowcell. But how often does that happen?
For us, I can't think of a single case where we have clustered more than 2-3 lanes per denatured sample pool. Usually it is 8 sample pools for 8 lanes.
There are cases where the amount of library produced is limiting. And the NovaSeq would not be a good choice where this is your critical parameter.
So in most cases I would say that being forced from 8 lanes to 1 lane and losing the flexibility to run a much smaller flowcell (the 2-lane rapid-chemistry flow cells) are the major limitations of the NovaSeq.
Illumina expects you to just buy a NextSeq to deal with the 2nd issue above. That would be okay (for some definitions of "okay") if they hadn't just decided all the NextSeqs should now have the ability to scan their microarrays. But the option is there.
Then there are the data issues considered in this thread. But I'm pretty sure that is something Illumina can fix (as they did for a period of time with the NextSeq, just after they introduced the v2 version of its chemistry/software) if they focus their attention on it.
--
Phillip
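The HiSeq 2500 denaturation arithmetic in this post (10 ul of a 2 nM pool diluted to 20 pM gives about 1 ml) is a plain C1*V1 = C2*V2 calculation. A minimal sketch, with a hypothetical function name and units chosen for convenience:

```python
def diluted_volume_ul(stock_conc_pm: float, stock_volume_ul: float,
                      target_conc_pm: float) -> float:
    """Final volume (ul) after diluting a stock to the target concentration.

    Uses C1 * V1 = C2 * V2; concentrations are in pM, volumes in ul.
    """
    if target_conc_pm <= 0 or target_conc_pm > stock_conc_pm:
        raise ValueError("target must be positive and no higher than the stock")
    return stock_conc_pm * stock_volume_ul / target_conc_pm


# 10 ul of a 2 nM (2000 pM) pool diluted to 20 pM, as described in the post.
final_volume_ul = diluted_volume_ul(2_000.0, 10.0, 20.0)
print(f"final volume: {final_volume_ul:.0f} ul ({final_volume_ul / 1000:.1f} ml)")
```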
-
Is 150ul of a 1nM library what Illumina recommends for a single S2 flow cell?!?
-
On another note:
150 uL of a 1 nM library (~90 billion molecules) minimum for loading is a lot of library when you consider you can get by with 1.4 billion for the NextSeq and 7 billion for the HiSeq.
FWIW...
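The "~90 billion molecules" figure follows from volume times molar concentration times Avogadro's number. A minimal sketch of that conversion (the function name is illustrative):

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_loaded(volume_ul: float, conc_nm: float) -> float:
    """Number of library molecules in a given volume (ul) at a given nM."""
    moles = (volume_ul * 1e-6) * (conc_nm * 1e-9)  # litres * mol/litre
    return moles * AVOGADRO


n = molecules_loaded(150.0, 1.0)
print(f"150 ul at 1 nM is about {n / 1e9:.0f} billion molecules")
```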
-
Originally posted by misterc: Does anyone have even a lane's worth of these new .cbcl files from a NovaSeq? I'd like to test our bioinformatics pipeline with the new bcl2fastq converter v.2.19 that supports NovaSeq.
-
Does anyone have even a lane's worth of these new .cbcl files from a NovaSeq? I'd like to test our bioinformatics pipeline with the new bcl2fastq converter v.2.19 that supports NovaSeq.