Seqanswers Leaderboard Ad

**HMorrison** · 02-21-2012, 12:42 PM

Originally posted by rcorbett

Hi.
We have an Ion Torrent PGM where we have sequenced a number of genomes. Looking at the TMAP alignments we definitely see sequencing errors of the indel variety around homopolymer regions in the reference. The homopolymer runs don't necessarily have to be more than two consecutive bases either.

Some quick analysis showed me that the positions of the indels in the reads aren't completely random, and do stack up in the alignments, which could cause false positives in the indel calling (as well as potentially interfering with true variants that are in the region).

Downloading example E. coli data from the Life website I see the same sorts of errors.

To my dismay when I google this I find all sorts of technical and sponsored reports from Illumina et al, pointing out the errors in Ion Torrent data. Furthermore, I see the reports getting fired back from Life discounting all the analysis in the first paper, and so it continues.

My question:
Can anyone who has sequenced and analyzed data on the PGM objectively comment on the rate of indels in the reads? I would like to hear if other people have seen what I've seen, or even better if they know of a magic fix.

thanks!

I've posted on the Ion Community site recently about some very non-random errors; generally a C or T deletion in a CCTT type motif. Link is http://lifetech-it.hosted.jivesoftwa...sage/6220#6220 .
I had waited to see what information I got from Life Tech about it before coming to this site. They suggested it was a problematic sequence. Would be happy to provide more details or discuss it offline.

Hilary Morrison

**IonTorrent** · 03-01-2012, 06:57 PM

Here's a recording of Dr. Niall Lennon from the Broad on their experiences with semiconductor sequencing.

http://www.youtube.com/watch?v=N2nbbBo0zT0

**david2** · 03-02-2012, 05:34 AM

error model available for indels?

We are looking into amplicon sequencing for variant detection. Some of the genes have several repeat regions, which can generate a lot of false positive heterozygous indels.

Is there a model describing how the Ion Torrent generates read errors in this area? With such a model we could adapt our filtering strategy to reduce the false positive rate (although we want to be sure not to miss a true positive).
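A positional filter of the kind described above might look roughly like the sketch below. It is not an error model of the Ion Torrent chemistry, only a heuristic that flags indel calls lying in or next to a homopolymer run so they can be reviewed rather than discarded. The function names, the `flank` parameter, and the toy reference are all hypothetical; the default `min_len=2` follows the observation earlier in the thread that runs of only two bases can be affected.

```python
import re

def homopolymer_runs(reference, min_len=2):
    """Return half-open (start, end) intervals of single-base runs of at least min_len."""
    pattern = re.compile(r"A+|C+|G+|T+")
    return [
        (m.start(), m.end())
        for m in pattern.finditer(reference.upper())
        if m.end() - m.start() >= min_len
    ]

def flag_indels(reference, indel_positions, min_len=2, flank=1):
    """Map each 0-based indel position to True if it lies in, or within
    `flank` bases of, a homopolymer run in the reference."""
    runs = homopolymer_runs(reference, min_len)
    return {
        pos: any(start - flank <= pos < end + flank for start, end in runs)
        for pos in indel_positions
    }

# Toy reference and made-up indel positions, for illustration only.
reference = "ACGTCCTTGACAAAAGT"
calls = [1, 5, 12, 16]
flags = flag_indels(reference, calls)
for pos in calls:
    label = "near homopolymer, review" if flags[pos] else "not near homopolymer"
    print(pos, label)
```

Flagging rather than dropping keeps true positives in the call set, at the cost of a manual or secondary check on the flagged calls.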

**HMorrison** · 03-02-2012, 05:48 AM

Originally posted by IonTorrent

Here's a recording of Dr. Niall Lennon from the Broad on their experiences with semiconductor sequencing.

http://www.youtube.com/watch?v=N2nbbBo0zT0

I don't need the advertisement.

**david2** · 03-02-2012, 06:03 AM

Hilary,
I tried to follow your link to the IT community website, but I get an error message:
"It appears you're not allowed to view what you requested"
(I am registered, but not as an IT customer).
David

**david2** · 03-02-2012, 06:04 AM

Video from Broad: I just lost 10 min. with it, no info on indel read errors

**HMorrison** · 03-02-2012, 06:14 AM

Originally posted by david2

Hilary,
I tried to follow your link to the IT community website, but I get an error message:
"It appears you're not allowed to view what you requested"
(I am registered, but not as an IT customer).
David

This is what I posted there:

"I've just finished looking through a set of reads from control templates (16S tag sequencing using fusion primers) and see a very interesting (and sad) error pattern. In this image, the top sequence is the most abundant *incorrect* read; the bottom (blue) is the correct read. Number of each is at the left. E. coli tag results: less than 4% perfect reads. We have 43 controls including K12; the percent correct varied from 0% up to 82%. Seems to have happened on both runs, same day; one was 314 and the other 316. Some more investigation to do."

They told us it was a difficult sequence and to try the enzyme in the 200 nt sequencing kit.

Attached Files

IT_ErrorPattern.jpg (91.7 KB, 246 views)
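For anyone wanting to run the same kind of check on their own control templates, here is a minimal sketch of the tally: percent of reads that match the expected sequence exactly, plus the most abundant incorrect read. It assumes the reads have already been trimmed to the same span as the expected tag and compares by exact string match; the function name and the toy sequences are made up for illustration and are not the data from the runs described above.

```python
from collections import Counter

def summarize_control(reads, expected):
    """Return (percent of reads identical to `expected`,
    (sequence, count) of the most abundant incorrect read or None)."""
    counts = Counter(reads)
    total = sum(counts.values())
    perfect = counts.get(expected, 0)
    percent_perfect = 100.0 * perfect / total if total else 0.0
    incorrect = [(seq, n) for seq, n in counts.most_common() if seq != expected]
    top_incorrect = incorrect[0] if incorrect else None
    return percent_perfect, top_incorrect

# Toy data: the dominant wrong read drops one C from a CCTT motif.
expected = "GACCTTAG"
reads = ["GACCTTAG"] * 2 + ["GACTTAG"] * 7 + ["GACCTAG"] * 1
percent, top = summarize_control(reads, expected)
print(f"{percent:.1f}% perfect; most abundant incorrect read: {top}")
```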

**david2** · 03-02-2012, 07:06 AM

Thanks Hilary for the details, interesting case indeed.

**low99** · 03-04-2012, 10:40 PM

Unfortunately this is an inherent problem of the 454, Ion Torrent and probably the Proton chemistry. It is well documented. This is from the NEJM article on the sequencing of the German E. coli outbreak:
"We also performed sequencing on the Illumina HiSeq platform in accordance with the manufacturer's instructions. An initial single-end run was used to correct errors in the Ion Torrent sequence, principally in homopolymeric tracts."

http://www.nejm.org/doi/full/10.1056/NEJMoa1107643?query=featured_home&

**ngseq** · 03-17-2012, 04:30 PM

CCTT calling error

Wow, this would be a very big issue if proven to be a reproducible error for PGM. But given how many CCTTs there are in genomes (occurring once in every 256 bp in totally random sequence) one would imagine this would have been identified much earlier in-house by LT. Looks like it may have something to do with the specific context within which a CCTT lies?

Thanks Hilary for the very intriguing observation. Have any other PGM users seen this?
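The 256 bp figure is just 4^4: with four equally likely, independent bases, a given 4-mer starts at any position with probability 1/256. A quick sketch to check that against a simulated sequence is below. It assumes uniform base composition and counts one strand only, so real genomes will deviate; the function names are illustrative.

```python
import random

def expected_spacing(motif, alphabet_size=4):
    """Average distance between motif starts in uniform random sequence."""
    return alphabet_size ** len(motif)

def count_motif(sequence, motif):
    """Count occurrences of motif in sequence, overlapping matches included."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count

# Simulate 1 Mb of uniform random sequence with a fixed seed.
rng = random.Random(0)
sequence = "".join(rng.choice("ACGT") for _ in range(1_000_000))

observed = count_motif(sequence, "CCTT")
expected = (len(sequence) - len("CCTT") + 1) / expected_spacing("CCTT")
print(f"observed {observed} CCTT sites, expected about {expected:.0f}")
```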

**HMorrison** · 03-20-2012, 06:47 AM

Non-random error reduced with new enzyme

I would love to know what the two different enzymes are, but whatever enzyme is included in the 200 nt PGM sequencing kit has almost eliminated the problem I first reported. Errors are mainly in what I would consider true homopolymer runs (i.e. more than two of the same base). Much more likely to continue using the system for pyrotag-like (ph-tag?) sequencing.

**jonathanjacobs** · 03-21-2012, 05:43 AM

FWIW - there's a parallel discussion about this at the IonTorrent community here.

http://lifetech-it.hosted.jivesoftwa.../2299?tstart=0

As HMorrison mentioned - the 200nt kit largely eliminates this issue.

**HMorrison** · 03-21-2012, 05:50 AM

Parallel discussions

I know; I posted it there too. LifeTech doesn't seem to like me using SeqAnswers exclusively.

**ngseq** · 03-21-2012, 05:52 AM

thanks! looks like we should try to stick to 200nt kits.

Ion Torrent data quality impressions?