NovaSeq from Illumina

Brian Bushnell replied

03-01-2017, 11:05 PM
There was zero PhiX in the NovaSeq data. I was wondering a bit about mitochondrial content, but still, the source DNA is the same for both platforms. Anyway, coincidental duplicates won't follow the pattern in the graph, which is a rising curve whose slope decreases with distance (negative second derivative). They would produce a slope that increases with distance (positive second derivative), because the number of potential matches increases with the square of the radius, so random matches would yield a curve that looks like Y=X^2, whereas the curve I plotted looks like... nothing with which I am familiar.

Edit:

Or, maybe, I should say it looks a bit like a step function plus a linear, or square-root, or X^Y function where Y is between 0.5 and 1. The step function has a steep increase until a point (say, 2500 for NovaSeq), which models "traditional" optical- or well-duplicates. The other function models "drifters" that break off and land in remote wells.
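To make the Y=X^2 argument concrete, here is a rough Python sketch (not the code used to make the plot). It scatters random points on a unit square and counts pairs within a given radius, so doubling the radius should roughly quadruple the count. It also evaluates a hypothetical step-plus-power curve of the kind described above. The function names, the linear ramp used for the step, and the cutoff, scale, and exponent values are all illustrative assumptions, not fitted values.

```python
import random


def random_points(n, seed=0):
    """Return n uniformly random (x, y) points on the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def pairs_within(points, radius):
    """Count point pairs separated by at most `radius` (brute force)."""
    r2 = radius * radius
    count = 0
    for i in range(len(points)):
        xi, yi = points[i]
        for j in range(i + 1, len(points)):
            dx = xi - points[j][0]
            dy = yi - points[j][1]
            if dx * dx + dy * dy <= r2:
                count += 1
    return count


def step_plus_power(dist, cutoff=2500.0, step_height=1.0,
                    scale=1e-4, exponent=0.75):
    """Hypothetical curve: a ramp that saturates at `cutoff` (local
    optical/well duplicates) plus scale * dist**exponent ("drifters").
    All parameter values here are made up for illustration."""
    step = step_height * min(dist / cutoff, 1.0)
    drift = scale * dist ** exponent
    return step + drift


if __name__ == "__main__":
    pts = random_points(1000, seed=1)
    for r in (0.01, 0.02, 0.04):
        print("radius", r, "coincidental pairs", pairs_within(pts, r))
    for d in (0, 2500, 5000, 7500):
        print("dist", d, "model value", round(step_plus_power(d), 4))
```

The random-pair counts grow faster and faster with radius, while the step-plus-power curve keeps rising but flattens, which is the qualitative difference being described.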

Last edited by Brian Bushnell; 03-01-2017, 11:18 PM.
SNPsaurus replied

03-01-2017, 10:44 PM
Was there a higher phiX concentration in the NovaSeq run? Wouldn't phiX produce pseudo-duplicates given the small genome, especially if library prep had a biased fragmentation?

I agree with your "fragment break-off" possibility. We were just chatting about that idea recently over here regarding the HiSeq4000.
Brian Bushnell replied

03-01-2017, 06:50 PM
I did a comparison of duplicate rates on HiSeq2500 and NovaSeq, using Illumina's public data on BaseSpace:

NovaSeq seems to have a problem, but it's not clear why. These are not normal optical/well duplicates; they are extremely remote. It looks like during colony formation, some reads break off and reattach to an empty well somewhere else. The farthest-right point (at 25000) is not for distance 25000 but for distance infinity, including inter-tile duplicates.

These libraries are PCR-free WGS and thus should not really have more than a tiny fraction of duplicates, as seen on the HiSeq. Does anyone have any idea what's causing this? Does my hypothesis sound reasonable? Previous Illumina platforms had a very obvious distance cutoff where the number of duplicates increases rapidly up to a point, then plateaus (which is true for this HiSeq data, at around dist=45, but you can't see it in this graph). That is not the case for NovaSeq - it just keeps ascending, and there is no clear cutoff. It gradually bends, so there is no clear inflection point like there is on other platforms.

For reference, the libraries are both human NA12878 runs. NovaSeq is 2x150 and HiSeq 2500 is 2x100. Pairs are considered duplicates when the distance between colony centers is at most the stated distance, and both R1 and R2 match with some number of substitutions allowed, to account for sequencing error (8 for 150bp reads and 5 for 100bp reads). The insert sizes are quite large on average (>500bp) which reduces the rate of coincidental duplicates. HS2500 is ~10x and NovaSeq is ~30x coverage so the coincidental duplicate rate should be extremely low in both cases.
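The duplicate rule above can be sketched roughly as follows. This is a minimal Python illustration, not the tool that produced the graph. The `ReadPair` type, the function names, and the choice to require the same tile whenever a finite distance is given are assumptions; passing `max_dist=None` stands in for "distance infinity", which also admits inter-tile duplicates.

```python
from math import hypot
from typing import NamedTuple, Optional


class ReadPair(NamedTuple):
    """Hypothetical record: tile ID, colony-center coordinates, both reads."""
    tile: int
    x: float
    y: float
    r1: str
    r2: str


def within_subs(a: str, b: str, max_subs: int) -> bool:
    """True if equal-length reads differ by at most max_subs substitutions."""
    if len(a) != len(b):
        return False
    return sum(ca != cb for ca, cb in zip(a, b)) <= max_subs


def is_duplicate(p: ReadPair, q: ReadPair,
                 max_dist: Optional[float], max_subs: int) -> bool:
    """Apply the rule from the post: colony centers at most max_dist
    apart, and both R1 and R2 matching within max_subs substitutions.
    max_dist=None means no distance limit (inter-tile pairs allowed)."""
    if max_dist is not None:
        # Assumption: coordinates are only comparable within one tile.
        if p.tile != q.tile:
            return False
        if hypot(p.x - q.x, p.y - q.y) > max_dist:
            return False
    return (within_subs(p.r1, q.r1, max_subs)
            and within_subs(p.r2, q.r2, max_subs))
```

With the thresholds quoted above, `max_subs` would be 8 for the 150bp reads and 5 for the 100bp reads.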

P.S. This is an underestimate of the duplicate rate for both platforms, as it was generated in a way that is not robust to sequencing error. I will regenerate the data, but it won't change the discrepancy, just the magnitude.
Attached file: NovaSeq_Duplicates.png
Last edited by Brian Bushnell; 03-01-2017, 07:43 PM.
misterc replied

03-01-2017, 03:21 PM
Couple things that have changed on this lately.

1 - S4 flow cells now slated to ship in Q3 this year.
2 - S4 reagent kits will only be 20% cheaper than HiSeq X if you buy 5 NovaSeq instruments. Bleh. Still about half the cost per Gb versus HiSeq 4000.
GenoMax replied

01-24-2017, 10:24 AM
Added some information from the webinar to the original post.
pmiguel replied

01-17-2017, 12:38 PM
Yeah, if you already have a HiSeq X then the only major advantage is that there are no library type limitations on the NovaSeq.
What NovaSeq does is offer the average core a shot at a price per base previously only available to those with the throughput to need 5+ HiSeq X.
That said, you would need to run S4 reagents to get that price per base and:
(1) S4 won't be ready until late 2017
(2) It will generate 3 Tb of data in a single run == a single lane (logically, if not physically).

--
Phillip
ymc replied

01-15-2017, 07:42 PM
Originally posted by AllSeq:

I'm pretty sure they meant 80% of the running cost (per Gb), not 80% of the specific kit cost. However, we've still only seen hints at specific pricing, so we can't say for sure.

Thanks for your reply.

Then from the cost perspective, it is not that impressive.

A big jump in throughput is always welcome at the big genome centers. However, if base accuracy is down due to the new chemistry, then that won't even be a plus.

Anyway, I think we need to wait a little bit more to assess this new toy.
AllSeq replied

01-14-2017, 09:44 AM
I'm pretty sure they meant 80% of the running cost (per Gb), not 80% of the specific kit cost. However, we've still only seen hints at specific pricing, so we can't say for sure.
ymc replied

01-14-2017, 06:46 AM
Reagent cost is $6375 per flowcell for HiSeq X. If the price of the new reagent is 80% of HiSeq X, then it is $5100 per flowcell for NovaSeq 6000.

This means that the new reagent cost is $1.7/Gbp which is a huge drop from the previous $7/Gbp. Correct?
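For anyone who wants to check the arithmetic, here is a small Python sketch. It assumes the 3 Tb per S4 run figure quoted elsewhere in this thread as the flowcell output, which is what makes $5100 come out to $1.7/Gbp. It also shows the alternative reading from AllSeq's reply, where the 80% applies to the running cost per Gb rather than to the kit price. The function names are just for illustration, and none of this is confirmed pricing.

```python
def flowcell_price(reference_price_usd, fraction):
    """Price of a new flowcell as a fraction of a reference flowcell."""
    return reference_price_usd * fraction


def cost_per_gb(flowcell_price_usd, output_gb):
    """Reagent cost per Gb for one flowcell."""
    return flowcell_price_usd / output_gb


HISEQ_X_FLOWCELL_USD = 6375.0   # figure from the post
PRICE_FRACTION = 0.80           # "80% of HiSeq X"
S4_OUTPUT_GB = 3000.0           # assumption: 3 Tb per S4 run
PREVIOUS_USD_PER_GB = 7.0       # figure from the post

if __name__ == "__main__":
    novaseq_usd = flowcell_price(HISEQ_X_FLOWCELL_USD, PRICE_FRACTION)
    print("NovaSeq flowcell (kit-price reading): $%.0f" % novaseq_usd)
    print("Cost per Gb at 3 Tb/run: $%.2f"
          % cost_per_gb(novaseq_usd, S4_OUTPUT_GB))

    # HiSeq X output per flowcell implied by $6375 and $7/Gbp.
    print("Implied HiSeq X output: %.0f Gb"
          % (HISEQ_X_FLOWCELL_USD / PREVIOUS_USD_PER_GB))

    # Alternative reading: 80% of the per-Gb running cost.
    print("Cost per Gb (per-Gb reading): $%.2f"
          % (PREVIOUS_USD_PER_GB * PRICE_FRACTION))
```

The two readings differ by a lot, so which one Illumina meant matters more than the exact kit price.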
GenoMax replied

01-12-2017, 09:33 AM
Originally posted by massspecgeek:

Sorry, should have said that support will continue. Only sales of new instruments affected.

Taking out the HiSeq 2500 would leave a gap in the "Illumina"verse continuum between the NextSeq 550 and the HiSeq 4K/NovaSeq 5000.

Perhaps we will see a new sequencer (or two) slot in between there, in future.

Last edited by GenoMax; 01-12-2017, 09:40 AM.
GenoMax replied

01-12-2017, 09:29 AM
Sequencing reagents for the GAIIx still appear to be available, so those who want to keep using their 2500s should be fine.
massspecgeek replied

01-12-2017, 09:28 AM
Originally posted by Brian Bushnell:

Ouch... hope that just means selling new ones, rather than maintaining and supplying existing ones.

Sorry, should have said that support will continue. Only sales of new instruments affected.
Brian Bushnell replied

01-12-2017, 09:25 AM
Originally posted by massspecgeek:

I spoke to our Illumina sales rep...Was also told that 2500 and 3000 are being discontinued effective end of either Q1 or Q2, forgotten which. Our rep hasn't had full briefing yet so I'd be cautious in relying on that, but it's what I've heard so far. Should be getting something in writing this week.

Ouch... hope that just means selling new ones, rather than maintaining and supplying existing ones.
massspecgeek replied

01-12-2017, 08:16 AM
I spoke to our Illumina sales rep on the phone yesterday. I was told $30/Gb for S2 100 cycle kits and $15/Gb for 300 cycle. Was also told that 2500 and 3000 are being discontinued effective end of either Q1 or Q2, forgotten which. Our rep hasn't had full briefing yet so I'd be cautious in relying on that, but it's what I've heard so far. Should be getting something in writing this week.
pmiguel replied

01-11-2017, 09:13 AM
Okay, here is the actual quote from BioIT:

deSouza ran quickly through the comparisons. For a HiSeq 2500 customer, NovaSeq delivers 50% price reduction per Gb; 100% more output per run on the S2 flow cell. For HiSeq 4000 customers, NovaSeq delivers 45% price reduction and 2.5x the output based on the S3 flow cell. For X customers, “NovaSeq will be 20% more economical while delivering three times the throughput.”

So the cost comparison to the HiSeq 2500 is to a NovaSeq S2 flowcell. Whereas the HiSeq 4000 comparison is to a NovaSeq S3 flowcell.

So, if your core can generate enough libraries (dual indexed, I would presume) to make an S3 flowcell run worthwhile, you would generate sequence at 1/4th to 1/3rd the reagent costs of a HiSeq 2500. Even considering the logistical complexities that would entail, it seems like it would be difficult to brush off that kind of a price difference.

I just wish the S1 price per gigabase was going to come in close to that of the S2. But I'm doubting it will.

--
Phillip
