I am a new Illumina user. What is the best way to do targeted sequencing (<100 amplicons) on the MiSeq?
-
I've often wondered if you couldn't include some Ns in the primer for low complexity libraries which you could then remove post sequencing along with your target primer. So your fwd primer would look like this
[P5][Seq Primer]NNNNN[Target Primer]
In theory the first 5 bases would be random which would give decent cluster identification.
There's probably some obvious reason why this wouldn't work that I haven't thought of yet.
Or maybe designing PCRs to both strands would increase the complexity enough?
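To make the "remove post sequencing" step concrete, here is a minimal Python sketch. It assumes the read starts with exactly 5 random bases followed directly by the target primer; the function name `trim_read`, the example primer, and the one-mismatch tolerance are all made up for illustration and are not from any particular tool.

```python
def trim_read(read, target_primer, n_len=5, max_mismatches=1):
    """Strip the random N spacer and the target primer from the start of a read.

    Returns the remaining sequence, or None if the read is too short or the
    primer is not found directly after the spacer.
    """
    start = n_len
    end = start + len(target_primer)
    if len(read) < end:
        return None
    observed = read[start:end]
    # Simple mismatch count; no indels are tolerated in this sketch.
    mismatches = sum(1 for a, b in zip(observed, target_primer) if a != b)
    if mismatches > max_mismatches:
        return None
    return read[end:]


primer = "ACGTTGCA"                       # hypothetical target primer
read = "GATTC" + primer + "TTTGGGCCCAAA"  # 5 random bases + primer + insert
print(trim_read(read, primer))
```

Reads where the primer is not found in the expected position come back as `None`, so they can be counted or inspected separately rather than silently kept.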
Comment
-
Originally posted by TonyBrooks:
I've often wondered if you couldn't include some Ns in the primer for low complexity libraries which you could then remove post sequencing along with your target primer. So your fwd primer would look like this
[P5][Seq Primer]NNNNN[Target Primer]
In theory the first 5 bases would be random which would give decent cluster identification.
There's probably some obvious reason why this wouldn't work that I haven't thought of yet.
Or maybe designing PCRs to both strands would increase the complexity enough?
Comment
-
Cluster caller?
Originally posted by krobison:
A related trick I saw published was to use a variable length sequence between the Illumina primers & targeting sequence; this way the targeting sequences aren't all in phase & the complexity is significantly increased in the eyes of the cluster caller.
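Can you educate me more on "cluster caller"? I recently got results from a MiSeq paired-end run (150bp) and I suspect there's a double read problem. Qscores are rather wavy at the beginning.
My samples are multiplexed targeted re-sequencing of exomes from 16 genotypes. Could this be due to a low complexity issue discussed on this thread?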
I also heard rumors that if the first 4 bp of two reads are identical (which is very likely in targeted re-sequencing) they will be assigned to the same cluster. Is this true?
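As a rough illustration of the variable-length spacer idea quoted above, the Python sketch below compares the per-cycle base composition of in-phase constructs with staggered ones. The targeting sequence, the spacers, and the function names are invented for the example, and the four constructs are assumed to be equally represented on the flow cell.

```python
from collections import Counter


def per_cycle_max_fraction(reads, n_cycles):
    """For each of the first n_cycles, return the fraction of reads that
    carry the most common base at that cycle (1.0 means no diversity)."""
    fractions = []
    for cycle in range(n_cycles):
        bases = Counter(r[cycle] for r in reads if len(r) > cycle)
        total = sum(bases.values())
        fractions.append(max(bases.values()) / total if total else 0.0)
    return fractions


def stagger(target, spacers):
    """Prepend each spacer to the targeting sequence, one construct per spacer."""
    return [s + target for s in spacers]


target = "ACGTTGCAGG"             # hypothetical targeting sequence
spacers = ["", "T", "GA", "CAT"]  # hypothetical 0 to 3 nt phasing spacers

in_phase = [target] * len(spacers)
staggered = stagger(target, spacers)

print("in phase: ", per_cycle_max_fraction(in_phase, 4))
print("staggered:", per_cycle_max_fraction(staggered, 4))
```

A value near 1.0 at a cycle means nearly every cluster shows the same base there; the staggered set pulls that number down because the targeting sequences no longer line up cycle for cycle.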
Comment
-
Originally posted by ShiveringFire:
Can you educate me more on "cluster caller"? I recently got results from a MiSeq paired-end run (150bp) and I suspect there's a double read problem. Qscores are rather wavy at the beginning.
My samples are multiplexed targeted re-sequencing of exomes from 16 genotypes. Could this be due to a low complexity issue discussed on this thread?
I also heard rumors that if the first 4 bp of two reads are identical (which is very likely in targeted re-sequencing) they will be assigned to the same cluster. Is this true?
Our MiSeq is apparently 6 weeks from being delivered, so I can only extrapolate from the performance of our HiScanSQ (which is similar to a HiSeq). There are two issues masquerading as a single "problem".
(1) The first is that the actual instrument focusing software goes nuts and misfocuses if you have an empty tile (or set of tiles?) early in the run. Probably mainly during the first cycle. For our HiScanSQ this means if you don't have a fair number of clusters in the G&T channel and the A&C channel, you have a good chance that the instrument will pick the wrong focal plane to image.
Does anyone have connections inside Illumina to whom they could point out this blunder? If focus is good for one channel and bad for the other, why not just use the good focal point?
Anyway, I think this is only an issue during the first few cycles (maybe only the first) of a read. Then the focal points seem to be carried along with little (or no) adjustments.
The effect? Once you are out of focus, you are hosed for that tile for the run, I think. (Or "pseudo-tile" -- but I'll just presume that everyone reading this realizes that no masonry is involved in this process and leave it at "tile".) This applies no matter how perfect your cluster spacing is.
(2) If the instrument did not hose up its focal plane during the first cycle, you can still have diminished results if you have a low complexity of base calls. This is the one everyone thinks about -- two adjacent clusters give the same base calls for 4 bases. If they are very close together, then neither of them may "register" (if I use the terminology correctly) well, if at all. This is a little more understandable. Except that Illumina foreclosed the use of methodology that could circumvent this issue on the GA-IIx for the HiSeq by making the early layers of data processing possible only on the instrument console.
So if issue 1 doesn't kill you, issue 2 probably will. Work-arounds are required.
--
Phillip
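To put a number on issue 2, the Python sketch below estimates how often two reads picked at random from a library share the same first 4 bases. This is only a crude proxy for library complexity and not how the instrument software works; the function name and the toy reads are made up for illustration.

```python
from collections import Counter


def prefix_collision_probability(reads, k=4):
    """Probability that two reads drawn at random (with replacement)
    share the same first k bases."""
    prefixes = Counter(r[:k] for r in reads if len(r) >= k)
    total = sum(prefixes.values())
    if total == 0:
        return 0.0
    return sum((n / total) ** 2 for n in prefixes.values())


# Toy library: two amplicons at equal abundance (hypothetical sequences).
amplicon_reads = ["ACGTAA", "ACGTCC", "TTGAAA", "TTGACC"]
print(prefix_collision_probability(amplicon_reads, k=4))
```

For comparison, a library with uniformly random starts would give 1/256 for k=4, so a value close to 1 means nearly every neighbouring pair of clusters looks identical over those cycles.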
Comment
-
no more deferred cluster calling
Thanks Phillip,
I came across this paper formally defining the problem and a possible solution:
Massively parallel DNA sequencing is capable of sequencing tens of millions of DNA fragments at the same time. However, sequence bias in the initial cycles, which are used to determine the coordinates of individual clusters, causes a loss of fidelity in cluster identification on Illumina Genome Analysers. This can result in a significant reduction in the numbers of clusters that can be analysed. Such low sample diversity is an intrinsic problem of sequencing libraries that are generated by restriction enzyme digestion, such as e4C-seq or reduced-representation libraries. Similarly, this problem can also arise through the combined sequencing of barcoded, multiplexed libraries. We describe a procedure to defer the mapping of cluster coordinates until low-diversity sequences have been passed. This simple procedure can recover substantial amounts of next generation sequencing data that would otherwise be lost.
Sadly I was told that the MiSeq doesn't save raw image files, so "deferred cluster calling" is no longer possible. This sounds like getting rid of the evidence for a dirty job...
So spiking in a lot of PhiX seems like the only solution for low complexity. Can one spike PhiX even into custom-made libraries?
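As a back-of-the-envelope check on how much a PhiX spike helps, the Python sketch below works out what fraction of clusters still show the same base at a cycle where every amplicon read is identical. It assumes PhiX contributes each of the four bases about equally at any given cycle (an approximation) and that the spike fraction is the fraction of clusters, not of input DNA; the function name is made up for the example.

```python
def dominant_base_fraction(phix_fraction, phix_base_share=0.25):
    """Fraction of clusters showing the amplicon's base at a cycle where
    all amplicon reads carry the same base.

    phix_base_share is the assumed share of PhiX clusters that happen to
    show that same base (about 0.25 if PhiX is roughly balanced).
    """
    if not 0.0 <= phix_fraction <= 1.0:
        raise ValueError("phix_fraction must be between 0 and 1")
    return (1.0 - phix_fraction) + phix_fraction * phix_base_share


for f in (0.0, 0.1, 0.3, 0.5):
    share = dominant_base_fraction(f)
    print(f"{f:.0%} PhiX: {share:.1%} of clusters show the same base")
```

Under these assumptions even a 50% spike leaves well over half of the clusters showing the same base at a fully monomorphic cycle, which is why the primer-design tricks discussed in this thread are attractive.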
Comment
-
I am looking for an alternative to spiking in 50% or more PhiX while sequencing low diversity amplicon libraries.
I am wondering if using longer barcodes with balanced representation of bases (up to 24 bases) would be sufficient.
Additionally, does anyone know if I can specify barcodes of different lengths on a MiSeq sample sheet?
Comment
-
I don't think balanced barcodes will work. The index read occurs after read 1 so the instrument will still have issues calling clusters.
I am doing some tag sequencing that requires a custom library prep protocol, and I have the same issue with homogeneous nucleotides for the first 10 bases of read 1. Illumina tech support recommended adding a stretch of 12 Ns (not 6) right after the read 1 primer binding site. It seems like that would be fairly straightforward to add to primers for amplicon sequencing.
I am pretty sure (but don't quote me on this) that you can specify indices of different lengths on the sample sheet. I think that the 2 index reads use a 6 base and a 7 base index.
Comment
-
Originally posted by bbeitzel:
I don't think balanced barcodes will work. The index read occurs after read 1 so the instrument will still have issues calling clusters.
I am doing some tag sequencing that requires a custom library prep protocol, and I have the same issue with homogeneous nucleotides for the first 10 bases of read 1. Illumina tech support recommended adding a stretch of 12 Ns (not 6) right after the read 1 primer binding site. It seems like that would be fairly straightforward to add to primers for amplicon sequencing.
Comment