  • ECO
    --Site Admin--
    • Oct 2007
    • 1360

    Ion Torrent claims of MiSeq showing post-homopolymer substitution errors

    I wanted to start some discussion here of perhaps the most interesting thing going on in the sequencing marketing world this week (while we wait for Roche to up its bid for ILMN or bail).

    Ion Torrent posted an analysis of public MiSeq data on the Ion Community that describes a "clear systematic bias within MiSeq® data". A choice quote is below (PDF export of the post is attached...you know, for openness):

    "These substitution errors often fall to the last base of a homopolymer region - based on the direction of the read. For example, in a stretch of three G bases, the fourth base is often erroneously called a G. This strand-specific pattern is wide spread, and explains 49.9% and 51.8% of MiSeq® substitution errors overall in DH10B and K12, respectively. This dominant error profile that can be found so frequently next to homopolymer regions suggests a clear systematic bias within MiSeq® data."
    Keith Robison and Monkol Lek have taken a look at the claims on their respective blogs.
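To make the quoted pattern concrete, here is a minimal sketch (not Ion Torrent's actual analysis) of how one might flag reference positions that sit immediately after a homopolymer run in the direction of the read. The function name `preceding_run_base` and the `min_run` threshold of 3 are assumptions chosen for illustration.

```python
def preceding_run_base(ref, pos, min_run=3, reverse=False):
    """Return the base of the homopolymer run that a read traverses just
    before reaching ref[pos], or None if there is no such run.

    Forward-strand reads approach pos from the left; reverse-strand reads
    approach it from the right (in reference coordinates).
    """
    step = 1 if reverse else -1
    i = pos + step
    if not 0 <= i < len(ref):
        return None
    base = ref[i]
    if ref[pos] == base:
        # pos is itself part of the run, not the base after it
        return None
    run = 0
    while 0 <= i < len(ref) and ref[i] == base:
        run += 1
        i += step
    return base if run >= min_run else None


ref = "ACGGGTACCCA"
print(preceding_run_base(ref, 5))                # T after GGG, forward read
print(preceding_run_base(ref, 6, reverse=True))  # A before CCC, reverse read
```

A substitution at a flagged position would match the described pattern when the miscalled base equals the returned run base (for reverse-strand reads, compared in reference coordinates).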
    Attached Files
  • koadman
    Member
    • May 2010
    • 65

    #2
    I love it. The good folks at Life Tech may have in fact helped to make analysis pipelines for MiSeq better by publicizing a bias that is probably much more fixable than the homopolymer issues on their own platform. Keep up the good work Ion.


    • TonyBrooks
      Senior Member
      • Jun 2009
      • 303

      #3
      Surely as this is strand specific it's not too big a problem. You just need to be sure that any SNP is visible in both forward and reverse reads. If it's only seen in reads from one direction, then you should ignore it, treat it with caution, or at least give it a really low mappability score - something I think most aligners do (correct me if I'm wrong).

      The only problem is if you had a single base flanked by homopolymers in both directions. Then the base would be miscalled on both strands.
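The both-strands check described above can be sketched as follows. This is only an illustration: the `Candidate` record, the read counts, and the `min_per_strand` threshold are hypothetical and do not come from any particular variant caller.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    pos: int      # reference position of the candidate SNP
    fwd_alt: int  # forward-strand reads supporting the alternate base
    rev_alt: int  # reverse-strand reads supporting the alternate base


def seen_on_both_strands(c, min_per_strand=2):
    """Keep a candidate only if both strands independently support it."""
    return c.fwd_alt >= min_per_strand and c.rev_alt >= min_per_strand


# Hypothetical counts for three candidate sites.
candidates = [
    Candidate(pos=1042, fwd_alt=7, rev_alt=6),  # balanced support
    Candidate(pos=2310, fwd_alt=9, rev_alt=0),  # forward reads only
    Candidate(pos=5127, fwd_alt=1, rev_alt=8),  # mostly reverse reads
]
kept = [c.pos for c in candidates if seen_on_both_strands(c)]
print(kept)
```

As noted above, a base flanked by homopolymers on both sides could be miscalled on both strands and would still pass a filter like this one.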


      • ulz_peter
        Senior Member
        • Feb 2010
        • 219

        #4
        Is someone else also getting tired of companies trying to prove the weaknesses of the opponent rather than focussing on their own system?


        • sinaian
          Junior Member
          • Jan 2011
          • 4

          #5
          So ONLY NOW someone finally realizes weakness of opponent is not a proper subject? How convenient is the timing ...

          BTW, trading sensitivity for specificity is always a great solution.



          • #6
            I wonder if this is related to the fast chemistry times of Illumina's newest platforms? It seems odd that such a prevalent error profile would go unnoticed.


            • lh3
              Senior Member
              • Feb 2008
              • 686

              #7
              Let me discard the previous post.

              IonTorrent is finding something real. However, I think this is not caused by homopolymer runs, at least not mainly, but by the "GGC" and/or the inverted repeat artifact [PMID:21576222]. This region is particularly enriched with GGC on both forward and backward strands. In addition, the screenshot exaggerates the Illumina problem a little: they disabled shading in IGV; the majority of mismatches have quality below 10 and are barely visible under the IGV default setting. Some mismatches do get Q20 recurrently, which is worrying.
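One rough way to check whether a region is enriched with GGC on both strands is to count the motif in the sequence and in its reverse complement. The sketch below uses a made-up sequence and hypothetical helper names; it is not the method used in the cited paper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def ggc_counts(seq):
    """Return (forward-strand GGC count, reverse-strand GGC count)."""
    seq = seq.upper()
    return count_motif(seq, "GGC"), count_motif(reverse_complement(seq), "GGC")


region = "AGGCTTGCCGGC"  # made-up example sequence
fwd, rev = ggc_counts(region)
print(fwd, rev)
```

A GGC on the reverse strand appears as GCC in forward coordinates, so the second count is equivalent to counting GCC in the original sequence.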
              Last edited by lh3; 02-01-2012, 07:54 PM.


              • snetmcom
                Senior Member
                • Oct 2008
                • 158

                #8
                Originally posted by sinaian
                So ONLY NOW someone finally realizes weakness of opponent is not a proper subject? How convenient is the timing ...

                BTW, trading sensitivity for specificity is always a great solution.
                just poking through their documentation, there are several publications that have found this before.


                • pmiguel
                  Senior Member
                  • Aug 2008
                  • 2328

                  #9
                  Originally posted by snetmcom
                  just poking through their documentation, there are several publications that have found this before.
                  Yes, I think MIRA creator, Bastien Chevreux, noticed it first -- and changed MIRA to compensate for the Illumina GGCxG issue. Bizarre Illumina has not fixed it themselves, but there are a handful of issues Illumina seems blind to.

                  --
                  Phillip


                  • alanwan
                    Junior Member
                    • Sep 2008
                    • 4

                    #10
                    The system bias does exist. But it is usually very small: no more than 1 in 1,000 detected SNVs are caused by system errors. That is why few people notice it.

                    However, it is fatal for detecting novel causal SNPs in rare diseases, because system errors occur randomly across the whole genome, and since the known SNPs occupy only 1/100 (db135 ~30M/3G) of the genome base positions, most of the erroneous SNVs fall on novel sites. That leads to a high false positive rate among your novel SNPs.
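The arithmetic behind that argument can be written out as a small sketch. The 30M and 3G figures are the ones quoted above; the counts passed to `novel_false_fraction` are hypothetical and only illustrate how quickly errors can dominate the novel calls.

```python
KNOWN_SNP_SITES = 30e6  # ~30M known SNP positions, as quoted above
GENOME_SIZE = 3e9       # ~3G base positions

known_fraction = KNOWN_SNP_SITES / GENOME_SIZE  # 1/100 of positions
novel_fraction = 1 - known_fraction             # the remaining positions


def novel_false_fraction(true_novel, error_calls):
    """Fraction of novel calls that are errors, assuming the errors are
    spread uniformly over genome positions (the assumption made above)."""
    errors_at_novel_sites = error_calls * novel_fraction
    return errors_at_novel_sites / (true_novel + errors_at_novel_sites)


print(known_fraction, novel_fraction)
# Hypothetical example: 100 real novel SNPs against 1,000 erroneous calls.
print(novel_false_fraction(true_novel=100, error_calls=1000))
```

Because almost every uniformly placed error lands on a site that is not in the known-SNP set, even a small absolute number of errors can outnumber the true novel variants.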

                    This problem can be far worse if you want to find novel SNPs common to population samples of size >= 3. In fact, we found a terrible FPR (>98%) among the common novel SNVs detected in a whole exome sequencing project (family samples, size=3, sequence generated by one GAII) in 2010. However, it is important to note that not all our Illumina sequence data have such a high error rate.

                    In my observation, the preceding homopolymer leads to most of the false positives, while the GGC problem is minor. I think it may depend on sample properties and other factors.

                    As you may already have found, many articles have introduced methods to address the system bias problems of NGS instruments, such as GATK variant calibration, VarScan, CRISP, SERVIC4E, etc. Unfortunately there is no consensus on which method provides the best solution. No offense, but I personally had a bad experience with GATK's old versions, which crashed again and again and were too picky about BAM files exported by other aligners. I have not tried the other tools yet, and I am still using my own scripts to filter the false positives.


                    • james hadfield
                      Moderator
                      Cambridge, UK
                      Community Forum
                      • Feb 2008
                      • 224

                      #11
                      Is "weakness of opponent" (as sinaian put it) a popular strategy in the US at the moment due to your 57th presidential election?

                      Bashing opponents always makes juicier headlines than demonstrating minor improvements to your own system. I would very much prefer to hear Ion discussing the very real improvements they have made. The technology has raced forward as fast as we hoped it would.


                      • sinaian
                        Junior Member
                        • Jan 2011
                        • 4

                        #12
                        Originally posted by james hadfield
                        Bashing opponents always makes juicier headlines than demonstrating minor improvements to your own system. I would very much prefer to hear Ion discussing the very real improvements they have made. The technology has raced forward as fast as we hoped it would.
                        Fully agreed. But it is just interesting to compare the atmosphere when one party comes out bashing the other versus when the opponent actually answers back.


                        • pmiguel
                          Senior Member
                          • Aug 2008
                          • 2328

                          #13
                          Can anyone verify that this is the old "GGCxG" issue?

                          If so, I have my doubts that Illumina will address the issue on the basis of LifeTech pointing it out. Seems either to be firmly in their corporate blind spot or an intractable issue.

                          --
                          Phillip


                          • alanwan
                            Junior Member
                            • Sep 2008
                            • 4

                            #14
                            Originally posted by pmiguel
                            Can anyone verify that this is the old "GGCxG" issue?

                            If so, I have my doubts that Illumina will address the issue on the basis of LifeTech pointing it out. Seems either to be firmly in their corporate blind spot or an intractable issue.

                            --
                            Phillip
                            This system bias problem probably can never be completely solved. But I believe new algorithms will help distinguish the error calls.
