The latest HiSeq3000 run (we did receive a few flowcells) averaged 378 million clusters passing filter per lane. All libraries were size-selected.
One obvious feature of the exclusion amplification as implemented is the very viscous enzyme mix. The diffusion of the library fragments towards the flowcell is probably slowed down considerably (which may also be why higher library concentrations are required?), giving the molecule that arrives first the chance to be amplified and fill the entire nanowell before a second one arrives (http://www.google.com/patents/WO2013188582A1?cl=en). The viscosity-enhanced "drag" could also explain the stronger bias towards reads with smaller insert sizes?
The high-viscosity buffer, together with high library concentrations and "RPA" amplification ("Recombinase Polymerase Amplification", http://www.twistdx.co.uk/our_technology/ ) for the clustering process, might be sufficient for the Kinetic Exclusion Amplification on the nanowell flowcells? It seems to me that the other methods described in the patent might not be compatible with the old cBots (these can be used for HiSeq3000/4000 clustering after a software upgrade)?
Originally posted by Brian Bushnell: Impressive; I was under the impression that inserts much over 800bp simply would not bridge-amplify. Maybe we should try that approach! Anyway, rather than shorter molecules vastly out-competing longer molecules at all lengths, that could be more of a case where the rates are fairly similar up to a point (1kbp?), after which longer molecules start failing to form clusters at all (even if there were no short molecules present). I'm just guessing, though.
--
Phillip
-
Brian, when we were developing local assembly of paired-end RAD, we were surprised to see contigs of 1200 bp being assembled (see http://journals.plos.org/plosone/art...l.pone.0018561 figure 4), meaning that there must have been fragments of 1200 bp undergoing bridge amplification. We had to use a "triangle cut" in the gel size selection to over-represent the larger fragments, but they did bridge.
I think the size preference in the patterned flow cells could be because a small fragment could enter a well after a larger fragment but then outcompete the larger fragment to fill the well. Or could it lie in the diffusion kinetics?
-
Originally posted by pmiguel: In the 4th post in the thread, I actually converted the mass-based/log-linear plot results from the Agilent bioanalyzer chip to a linear, molecule-based plot. This way it can be directly compared to the insert sizes found by mapping the read pairs back to the genome from which they came.
The result showed that the shorter amplicons must have clustered preferentially. Really preferentially.
To me this has always suggested there must be some sort of competition for clustering that favors shorter amplicons.
-
Originally posted by Brian Bushnell:
The insert size distribution is fairly interesting for a couple reasons. It looks like the platform can probably handle inserts over 450bp fairly well; there were some short inserts, but they did not overwhelmingly out-compete the long ones. But the flat distribution of the short-insert tail is odd.
In the 4th post in the thread, I actually converted the mass-based/log-linear plot results from the Agilent bioanalyzer chip to a linear, molecule-based plot. This way it can be directly compared to the insert sizes found by mapping the read pairs back to the genome from which they came.
The result showed that the shorter amplicons must have clustered preferentially. Really preferentially.
To me this has always suggested there must be some sort of competition for clustering that favors shorter amplicons.
At the much higher clustering concentrations used for the 3000/4000, this process may be exacerbated.
--
Phillip
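The mass-to-molecule conversion described in this post can be sketched as follows. This is only an illustration, not the calculation that was actually used: it assumes the Bioanalyzer signal in each size bin is proportional to DNA mass, so the relative number of molecules is mass divided by fragment length. The function name and the three-bin example values are made up, and re-plotting from a log to a linear size axis is not covered.

```python
def mass_to_molar_fractions(sizes_bp, mass_signal):
    """Convert a mass-weighted size profile into relative molecule counts.

    sizes_bp    -- fragment length (bp) for each bin
    mass_signal -- signal per bin, assumed proportional to DNA mass
    Returns the fraction of molecules falling in each bin.
    """
    if len(sizes_bp) != len(mass_signal):
        raise ValueError("sizes_bp and mass_signal must have the same length")
    if any(s <= 0 for s in sizes_bp):
        raise ValueError("fragment sizes must be positive")
    # The same mass of longer fragments contains proportionally fewer molecules.
    molar = [m / s for s, m in zip(sizes_bp, mass_signal)]
    total = sum(molar)
    return [x / total for x in molar]


# Hypothetical profile: equal mass signal at 200, 400 and 800 bp.
sizes = [200, 400, 800]
mass = [1.0, 1.0, 1.0]
fractions = mass_to_molar_fractions(sizes, mass)
for size, frac in zip(sizes, fractions):
    print(f"{size} bp: {frac:.3f} of molecules")
```

Even with equal mass in each bin, the 200 bp bin holds four times as many molecules as the 800 bp bin. That is why a molecule-based plot is the fair one to compare against mapped insert sizes.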
-
Originally posted by DNATECH: Hi Pmiguel,
the basic procedure looks like:
- 5 ul of library (2 nM to 3 nM including PhiX)
- add 5 ul 0.1 N NaOH
- add 5 ul Tris (200 mM)
- add 35 ul Enzyme Master Mix
- load all 50 ul onto cBot
So you cluster at 200-300 pM. About 10-15x what we use on our HiSeq2500.
--
Phillip
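For anyone checking the arithmetic, here is a minimal sketch of the dilution implied by the quoted procedure (5 ul of library in a 50 ul final volume). The function name and defaults are only for illustration.

```python
def final_concentration_pm(stock_nm, library_ul=5.0, total_ul=50.0):
    """Final clustering concentration in pM after dilution.

    stock_nm   -- library concentration before denaturation, in nM
    library_ul -- volume of library added
    total_ul   -- final volume loaded (5 + 5 + 5 + 35 ul in the quoted steps)
    """
    return stock_nm * 1000.0 * library_ul / total_ul


low = final_concentration_pm(2.0)   # 2 nM library
high = final_concentration_pm(3.0)  # 3 nM library
print(f"{low:.0f} pM to {high:.0f} pM")
```

A 2 nM to 3 nM library therefore ends up at 200 pM to 300 pM on the cBot, which matches the 200-300 pM figure in the reply above.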
-
Thanks for the clarification, and thanks for sharing your data!
I did some mapping of the first 16 million reads, and generated the following graphs:
The "Other" category refers to soft-clipped bases, which is very high in this case because PhiX is small, so many of the reads ran off the end. (Considering these reads have been adapter-trimmed, I have no idea what is being sequenced past the ends of the PhiX genome; it might be interesting to investigate.) Overall, the average error rate is below 1% but above 0.1% across the read. Read 2 has a higher-than-expected insertion rate in the first half of the read. Oddly, R2 has some Ns only in the first half, and R1 has some Ns only in the second half. Unlike other platforms, the error rate for R2 seems fairly flat across the read.
This is a different way of looking at the same data.
The quality accuracy graph indicates that again the Q-scores are binned, and like NextSeq V1, they are highly inflated. Over 70% of the bases were assigned Q41, but the average observed quality for Q41 bases was actually Q31.
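To put the Q41-versus-Q31 gap in terms of error probability, here is a small sketch using the standard Phred definition, Q = -10 * log10(p). The function names are only for illustration.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score corresponding to an error probability."""
    return -10 * math.log10(p)


claimed = phred_to_error_prob(41)   # what a Q41 call promises
observed = phred_to_error_prob(31)  # what the mapping suggests for those bases
fold = observed / claimed

print(f"Q41 claims p = {claimed:.2e}, Q31 means p = {observed:.2e}")
print(f"observed error rate is about {fold:.0f}x the claimed rate")
# The 0.1% to 1% average error rate mentioned above, as Phred scores:
print(error_prob_to_phred(0.001), error_prob_to_phred(0.01))
```

A 10-point drop on the Phred scale is a tenfold increase in error probability, so bases labelled Q41 that behave like Q31 are wrong about ten times as often as their scores claim.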
The insert size distribution is fairly interesting for a couple reasons. It looks like the platform can probably handle inserts over 450bp fairly well; there were some short inserts, but they did not overwhelmingly out-compete the long ones. But the flat distribution of the short-insert tail is odd.
Lastly, it's worth noting that around 83% of the reads mapped to the reference with no mismatches or indels.
For comparison, I've attached the mhist of a 2x150bp HS2500 run (not on PhiX), below. To me the HS2500 looks better, but not drastically better, in terms of error rates.
Last edited by Brian Bushnell; 05-08-2015, 07:02 PM.
-
Hi Brian,
Thanks for looking at the data. The files that I uploaded have 482,680,800 reads. The sequencer generates a "read" for every nanowell, whether it is loaded or not, so a figure of 30% or more "failing" reads is expected. The SAV viewer indicates a total of 482.68 million nanowells. According to Illumina, 60% to 70% of clusters passing filter is considered very good, because the figure is calculated with respect to the total number of nanowells. I intentionally uploaded files that include all non-passing reads (though the majority of the "not passing filter" data are likely just empty nanowells).
Lutz
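The pass-filter numbers in this post can be put side by side with a short sketch. The nanowell total and the 60% to 70% range come from the post, and the 29.36% failure rate is the one reported for the first 4 million reads. The 378 million clusters figure is from a different run mentioned elsewhere in the thread, so applying the same per-lane nanowell count to it is an assumption. The helper names are only for illustration.

```python
TOTAL_NANOWELLS = 482_680_800  # from the post: one "read" per nanowell in the lane


def percent_of_wells(clusters, total_wells=TOTAL_NANOWELLS):
    """Percentage of nanowells represented by a given cluster count."""
    return 100.0 * clusters / total_wells


def clusters_for_percent(percent, total_wells=TOTAL_NANOWELLS):
    """Number of passing-filter clusters that corresponds to a %PF value."""
    return round(total_wells * percent / 100.0)


# %PF for the PhiX subset in which 29.36% of reads failed the chastity filter.
pf_phix_subset = 100.0 - 29.36

# Cluster counts matching the 60% to 70% range described as very good.
good_low = clusters_for_percent(60)
good_high = clusters_for_percent(70)

# Assumption: the run averaging 378 million PF clusters per lane has the
# same number of nanowells per lane as this one.
pf_other_run = percent_of_wells(378_000_000)

print(f"PhiX subset: {pf_phix_subset:.2f}% PF")
print(f"60-70% PF = {good_low:,} to {good_high:,} clusters")
print(f"378M PF clusters = {pf_other_run:.1f}% of nanowells")
```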
Originally posted by Brian Bushnell: I finally finished downloading these, and I'll take a look at the quality from mapping. But before I do that, I always trim adapters... but I was never sure what kind of adapters PhiX reads had. They don't exactly match any adapters in my list, so I'll call them "PhiX adapters". Here they are, for reference:
>Read1_adapter
AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAA
>Read2_adapter
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA
Also, at least for the first 4 million reads, 29.36% failed the chastity filter.
Last edited by DNATECH; 05-08-2015, 03:38 PM.
-
I finally finished downloading these, and I'll take a look at the quality from mapping. But before I do that, I always trim adapters... but I was never sure what kind of adapters PhiX reads had. They don't exactly match any adapters in my list, so I'll call them "PhiX adapters". Here they are, for reference:
>Read1_adapter
AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAA
>Read2_adapter
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA
Also, at least for the first 4 million reads, 29.36% failed the chastity filter.
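For readers who want to see what trimming against these two sequences involves, here is a deliberately simple sketch. It is not the tool or the settings used for the analysis in this thread. It only does exact matching, so a single sequencing error in the adapter would defeat it. The `trim_adapter` function, the `min_overlap` default and the example insert are all hypothetical.

```python
READ1_ADAPTER = "AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAA"
READ2_ADAPTER = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAA"


def trim_adapter(read, adapter, min_overlap=8):
    """Remove a 3' adapter from a read using exact matching only.

    Scans the read from left to right. At each position, the rest of the
    read (capped at the adapter length) must be an exact prefix of the
    adapter, and at least min_overlap bases long, to count as a hit.
    Returns the read unchanged if no hit is found.
    """
    for start in range(len(read) - min_overlap + 1):
        tail = read[start:start + len(adapter)]
        if adapter.startswith(tail):
            return read[:start]
    return read


# Hypothetical read: a short insert followed by the first 20 adapter bases.
insert = "ACGTACGTTTGACCA"
read = insert + READ1_ADAPTER[:20]
trimmed = trim_adapter(read, READ1_ADAPTER)
print(trimmed)
```

Production trimmers tolerate mismatches and can use read-pair overlap to find adapters, so treat this only as an illustration of the idea.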
-
Hi Pmiguel,
the basic procedure looks like:
- 5 ul of library (2 nM to 3 nM including PhiX)
- add 5 ul 0.1 N NaOH
- add 5 ul Tris (200 mM)
- add 35 ul Enzyme Master Mix
- load all 50 ul onto cBot
Originally posted by pmiguel: Wow, 2000 pM? I think the highest we ever went on the HiSeq2500 was 23 pM.
--
Phillip
-
Originally posted by DNATECH: Hi Miguel,
the input was 5ul of PhiX at 2 nM. So far we have used 2 nM concentrations for all our libraries/lanes. Illumina recommends up to 3 nM.
From what our FAS told us, I got the impression under-loading could be more detrimental than over-loading.
--
Phillip
-
Hi GenoMax,
perhaps we are just being careful at the moment, since Illumina seems to be very careful and there is very little information so far. The customer samples (n=11) have been looking great so far, except one; this sample had a larger low-complexity component to it (which we were not aware of). For this sample, the Q30 rates dropped from 95% to 70% after the first 60 to 70 low-complexity bases.
Originally posted by GenoMax: @DNATECH: Based on this (and your other post) it sounds like you need "near perfect libraries" to get good data from patterned flowcells. This could be a problem for core facilities, where "variable" quality libraries come in from customers.
It would be interesting to hear about your experiences as real world customer libraries start flowing through.
-
Hi Miguel,
the input was 5ul of PhiX at 2 nM. So far we have used 2 nM concentrations for all our libraries/lanes. Illumina recommends up to 3 nM.
From what our FAS told us, I got the impression under-loading could be more detrimental than over-loading.
Originally posted by pmiguel: What final concentration was the phiX library that you clustered? I mean after neutralization?
I mean is there no danger of overclustering anymore? That was what I was hoping for when I heard about the patterned flowcells...
--
Phillip