In October 2009, I prepared a paired-end multiplexed genomic DNA library using Illumina's kits and enriched for my regions of interest using Agilent's SureSelect Target Enrichment System. I cloned and Sanger sequenced a fraction of my library before proceeding to cluster generation and sequencing.
The good news was that the target enrichment worked well: almost 90% of the sequences were in my regions of interest, which was great considering that Agilent had not yet developed blockers specific to the multiplex adapters and primers, only the paired-end ones.
The bad news was that of the 99 clones I sequenced (10 from each of 10 samples, except that one clone didn't sequence properly), only 54 had the correct adapter/primer sequences on both ends. Of the rest, 25 had a truncation on one end, 5 had a truncation on one end plus a mismatch, 1 had a truncation on both ends, 11 had one mismatch, and 3 had two mismatches. (The truncations were not all small either: only 19 of the truncations were 3 bp or shorter, 5 were 9 bp, and 1 was 10 bp.)
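As a quick sanity check on the clone counts above, here is a minimal Python sketch that tallies the categories and prints each one as a fraction of the total. The counts are the ones reported above; the category labels and variable names are just my own shorthand, not anything from Illumina or Agilent.

```python
# Clone counts as reported in the post; labels are illustrative shorthand.
clone_counts = {
    "correct on both ends": 54,
    "truncation on one end": 25,
    "truncation on one end plus a mismatch": 5,
    "truncation on both ends": 1,
    "one mismatch": 11,
    "two mismatches": 3,
}

# Total clones, clones with any adapter/primer defect, and clones
# carrying at least one truncated end.
total = sum(clone_counts.values())
defective = total - clone_counts["correct on both ends"]
with_truncation = sum(
    n for label, n in clone_counts.items() if "truncation" in label
)

for label, n in clone_counts.items():
    print(f"{label:40s} {n:3d}  {n / total:6.1%}")

print(f"total clones: {total}")
print(f"clones with any defect: {defective} ({defective / total:.1%})")
print(f"clones with a truncation: {with_truncation} ({with_truncation / total:.1%})")
```

The categories add up to the 99 clones sequenced, so 45 of 99 clones (about 45%) had some defect in the adapter/primer sequence, and 31 of those involved a truncation.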
In contrast, in my sequences of interest, I detected a total of 21 known SNPs (in dbSNP) and only 3 unknown SNPs. If polymerase error were the main cause, I would expect unknown variants to show up in the inserts as well, so I would guess that most of the mismatches in the Illumina primers/adapters were not due to polymerase error.
I talked to Illumina customer support about this, and since nobody else had reported this problem they thought that maybe it was normal and that most people just don't clone a fraction of their library to check. They told me to go ahead and sequence the library.
So I sequenced it on a paired-end flow cell (the run is actually still going and will be done tomorrow). I ran my samples in 3 lanes and the Illumina multiplex control in a 4th lane; the other 4 lanes were for other users who had regular paired-end libraries that went through the SureSelect protocol without multiplexing. Unfortunately, I only got 3,500 clusters, while the control and the other users got over 100,000. I am pretty sure I didn't make any dilution errors.
I was wondering whether anyone else has (1) cloned and Sanger sequenced their library, and can report whether they saw any mismatches or truncations, or (2) done Illumina multiplex sequencing recently (perhaps with the same batch of adapters/primers I used), so that I can tell whether I am the only one with this problem. (3) Any other advice would also be welcome.
Thanks!