There was zero PhiX in the NovaSeq data. I was wondering a bit about mitochondrial content, but still, the source DNA is the same for both platforms. Anyway, coincidental duplicates won't follow the pattern in the graph, which is a curve with a negative second derivative (it keeps rising, but the slope flattens). They would cause a positive second derivative because the number of potential matches increases with the square of the radius, so random matches would yield a curve that looks like Y=X^2, whereas the curve I plotted looks like... nothing with which I am familiar.
Edit:
Or, maybe, I should say it looks a bit like a step function plus a linear, or square-root, or X^Y function where Y is between 0.5 and 1. The step function has a steep increase until a point (say, 2500 for NovaSeq), which models "traditional" optical- or well-duplicates. The other function models "drifters" that break off and land in remote wells.
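To make the two shapes concrete, here is a minimal sketch comparing an area-proportional "coincidental" curve with a step-plus-power-law curve. All parameter values (density, plateau height, exponent 0.7) are made up for illustration and are not fitted to the plotted data; only the 2500 cutoff comes from the post.

```python
import math

def coincidental_curve(radius, density):
    """Expected chance matches within `radius`: proportional to the area pi*r^2."""
    return density * math.pi * radius ** 2

def step_plus_power_curve(radius, cutoff, plateau, scale, exponent):
    """Ramp that saturates at `cutoff` (local optical/well duplicates)
    plus a power-law term (hypothetical model of 'drifters')."""
    local = plateau * min(radius / cutoff, 1.0)
    drifters = scale * radius ** exponent
    return local + drifters

def second_difference(f, r, h=1.0):
    """Discrete stand-in for the second derivative of f at r."""
    return f(r + h) - 2 * f(r) + f(r - h)

if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the sign of the curvature.
    random_model = lambda r: coincidental_curve(r, density=1e-6)
    drifter_model = lambda r: step_plus_power_curve(
        r, cutoff=2500.0, plateau=1.0, scale=1.0, exponent=0.7)
    for r in (5000.0, 10000.0, 20000.0):
        print(r, second_difference(random_model, r) > 0,
              second_difference(drifter_model, r) < 0)
```

Past the cutoff the step term is flat, so any exponent between 0.5 and 1 gives a curve that keeps rising while bending downward, which is the opposite curvature from Y=X^2.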
-
Was there a higher phiX concentration in the NovaSeq run? Wouldn't phiX produce pseudo-duplicates given the small genome, especially if library prep had a biased fragmentation?
I agree with your "fragment break-off" possibility. We were just chatting about that idea recently over here regarding the HiSeq4000.
-
I did a comparison of duplicate rates on HiSeq2500 and NovaSeq, using Illumina's public data on BaseSpace:
NovaSeq seems to have a problem, but it's not clear why. These are not normal optical/well duplicates; they are extremely remote. It looks like during colony formation, some reads break off and reattach to an empty well somewhere else. The farthest-right point (at 25000) is not for distance 25000 but for distance infinity, including inter-tile duplicates.
These libraries are PCR-free WGS and thus should not really have more than a tiny fraction of duplicates, as seen on the HiSeq. Does anyone have any idea what's causing this? Does my hypothesis sound reasonable? Previous Illumina platforms had a very obvious distance cutoff where the number of duplicates increases rapidly up to a point, then plateaus (which is true for this HiSeq data, at around dist=45, but you can't see it in this graph). That is not the case for NovaSeq - it just keeps ascending, and there is no clear cutoff. It gradually bends, so there is no clear inflection point like there is on other platforms.
For reference, the libraries are both human NA12878 runs. NovaSeq is 2x150 and HiSeq 2500 is 2x100. Pairs are considered duplicates when the distance between colony centers is at most the stated distance, and both R1 and R2 match with some number of substitutions allowed, to account for sequencing error (8 for 150bp reads and 5 for 100bp reads). The insert sizes are quite large on average (>500bp) which reduces the rate of coincidental duplicates. HS2500 is ~10x and NovaSeq is ~30x coverage so the coincidental duplicate rate should be extremely low in both cases.
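The duplicate criterion described above can be sketched roughly as follows. This is not the code that produced the graph; the `Cluster` record, the function names, and the choice to treat reads of unequal length as non-matching are assumptions for illustration. Clusters on different tiles get infinite distance, so they only count when the distance limit is itself infinite.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for one read pair and its flow-cell position."""
    tile: int
    x: float
    y: float
    r1: str
    r2: str

def mismatches(a, b):
    """Count substitutions between two reads (assumes equal length;
    unequal lengths are treated as non-matching)."""
    if len(a) != len(b):
        return math.inf
    return sum(1 for p, q in zip(a, b) if p != q)

def center_distance(a, b):
    """Euclidean distance between colony centers; infinite across tiles."""
    if a.tile != b.tile:
        return math.inf
    return math.hypot(a.x - b.x, a.y - b.y)

def is_duplicate(a, b, max_dist, max_subs):
    """True when the centers are within max_dist and both R1 and R2
    match with at most max_subs substitutions each."""
    return (center_distance(a, b) <= max_dist
            and mismatches(a.r1, b.r1) <= max_subs
            and mismatches(a.r2, b.r2) <= max_subs)
```

With these definitions, `max_subs` would be 8 for the 150bp reads and 5 for the 100bp reads, and `max_dist=math.inf` corresponds to the farthest-right point on the graph.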
P.S. This is an underestimate of the duplicate rate for both platforms, as it was generated in a way that is not robust to sequencing error. I will regenerate the data, but it won't change the discrepancy, just the magnitude.
Last edited by Brian Bushnell; 03-01-2017, 07:43 PM.
-
Couple things that have changed on this lately.
1 - S4 flow cells now slated to ship in Q3 this year.
2 - S4 reagent kits will only be 20% cheaper than HiSeq X if you buy 5 NovaSeq instruments. Bleh. Still about half the cost per Gb versus HiSeq 4000.
-
Yeah, if you already have a HiSeq X then the only major advantage is that there are no library type limitations on the NovaSeq.
What NovaSeq does is offer the average core a shot at a price per base previously only available to those with the throughput to need 5+ HiSeq X.
That said, you would need to run S4 reagents to get that price per base and:
(1) S4 won't be ready until late 2017
(2) It will generate 3 Tb of data in a single run == a single lane (logically, if not physically).
--
Phillip
-
Originally posted by AllSeq:
I'm pretty sure they meant 80% of the running cost (per Gb), not 80% of the specific kit cost. However, we've still only seen hints at specific pricing, so we can't say for sure.
Then from the cost perspective, it is not that impressive.
A big jump in throughput is always welcomed by the big genome centers. However, if base accuracy is down due to the new chemistry, then that won't even be a plus.
Anyway, I think we need to wait a little bit more to assess this new toy.
-
I'm pretty sure they meant 80% of the running cost (per Gb), not 80% of the specific kit cost. However, we've still only seen hints at specific pricing, so we can't say for sure.
-
Reagent cost is $6375 per flowcell for HiSeq X. If the price of the new reagent is 80% of HiSeq X, then it is $5100 per flowcell for NovaSeq 6000.
This means that the new reagent cost is $1.7/Gbp, which is a huge drop from the previous $7/Gbp. Correct?
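For anyone checking the arithmetic, here is a quick sketch of the two ways the "80%" figure can be read. The 3000 Gb per S4 flowcell is an assumption: it is the yield that the $1.7/Gbp figure implies and it matches the 3 Tb per run mentioned elsewhere in the thread, but actual kit pricing and yields have not been confirmed here.

```python
# Figures taken from the post.
HISEQ_X_FLOWCELL_USD = 6375.0
HISEQ_X_USD_PER_GB = 7.0
NOVASEQ_PRICE_RATIO = 0.80

# Assumption: yield implied by the post's $1.7/Gbp result.
S4_YIELD_GB = 3000.0

def per_flowcell_reading():
    """80% applies to the flowcell kit price, spread over the S4 yield."""
    flowcell_usd = HISEQ_X_FLOWCELL_USD * NOVASEQ_PRICE_RATIO
    return flowcell_usd / S4_YIELD_GB

def per_gb_reading():
    """80% applies directly to the running cost per Gb."""
    return HISEQ_X_USD_PER_GB * NOVASEQ_PRICE_RATIO

if __name__ == "__main__":
    print(f"80% of kit price:   ${per_flowcell_reading():.2f}/Gb")
    print(f"80% of cost per Gb: ${per_gb_reading():.2f}/Gb")
```

The first reading reproduces the $1.7/Gbp in the post; the second gives roughly $5.60/Gb, which is a much smaller drop from $7/Gbp.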
-
Originally posted by massspecgeek:
Sorry, should have said that support will continue. Only sales of new instruments affected.
Perhaps we will see a new sequencer (or two) slot in between there, in future.
Last edited by GenoMax; 01-12-2017, 09:40 AM.
-
Sequencing reagents for GAIIx still appear to be available so those who want to keep using their 2500's should be fine.
-
Originally posted by Brian Bushnell:
Ouch... hope that just means selling new ones, rather than maintaining and supplying existing ones.
-
Originally posted by massspecgeek:
I spoke to our Illumina sales rep...Was also told that 2500 and 3000 are being discontinued effective end of either Q1 or Q2, forgotten which. Our rep hasn't had full briefing yet so I'd be cautious in relying on that, but it's what I've heard so far. Should be getting something in writing this week.
-
I spoke to our Illumina sales rep on the phone yesterday. I was told $30/Gb for S2 100 cycle kits and $15/Gb for 300 cycle. Was also told that 2500 and 3000 are being discontinued effective end of either Q1 or Q2, forgotten which. Our rep hasn't had full briefing yet so I'd be cautious in relying on that, but it's what I've heard so far. Should be getting something in writing this week.
-
Okay, here is the actual quote from BioIT:
deSouza ran quickly through the comparisons. For a HiSeq 2500 customer, NovaSeq delivers 50% price reduction per Gb; 100% more output per run on the S2 flow cell. For HiSeq 4000 customers, NovaSeq delivers 45% price reduction and 2.5x the output based on the S3 flow cell. For X customers, “NovaSeq will be 20% more economical while delivering three times the throughput.”
So, if your core can generate enough libraries (dual indexed, I would presume) to make an S3 flowcell run worthwhile, you would generate sequence at 1/4th to 1/3rd the reagent costs of a HiSeq 2500. Even considering the logistical complexities that would entail, it seems like it would be difficult to brush off that kind of a price difference.
I just wish the S1 price per gigabase was going to come in close to that of the S2. But I'm doubting it will.
--
Phillip