  • Roche's gsMapper

    Hello

    Has anyone here ever changed the parameters used by gsMapper when mapping their read data to a reference genome? If so, can anyone elaborate on what "minimum overlap length" and "alignment identity score" mean? (The definitions in the manual are far too brief.)

    Cheers

    Layla

  • #2
    I have not changed the default settings when running gsMapper.

    The gsMapper algorithm is similar to other assembly software (e.g. phrap): it uses the same concept of "overlap" between reads to build contigs.

    The difference is that 454's gsMapper works entirely in raw flow space. So I believe the scores and lengths are measured in flow space too.

    For example, the default minimum overlap length is 40 according to the manual. I believe that means 40 flows, not 40 bases. 40 flows is roughly 16 to 20 bp.

    You can play with the value, but I doubt it will make any real difference to the result.
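
To get a feel for how flows relate to bases, here is a minimal sketch that counts how many flows a cyclic flow order needs to read through a given sequence. The flow order `TACG` and the function name `flows_needed` are assumptions made for illustration; nothing here is taken from the gsMapper manual or its source.

```python
def flows_needed(seq, flow_order="TACG"):
    """Count the flows needed to read `seq` with a cyclic flow order.

    Each flow incorporates the whole homopolymer run of the flowed base
    if the next template base matches, and nothing otherwise.
    The default flow order is an assumption, not a documented setting.
    """
    seq = seq.upper()
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base missing from the flow order")
    pos = 0    # index of the next unread base in seq
    flows = 0  # number of flows used so far
    while pos < len(seq):
        base = flow_order[flows % len(flow_order)]
        flows += 1
        # Consume the full homopolymer run when it matches the flowed base.
        while pos < len(seq) and seq[pos] == base:
            pos += 1
    return flows
```

Running this over a stretch of your own reference and dividing the sequence length by the flow count gives a rough bases-per-flow figure to compare against the 16 to 20 bp estimate above.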



    • #3
      I don't think this is true. I think it's 40 bases, not 40 flows. IIRC (not that it's in the manual), flow space is only used for calling the consensus *after* the reads have been mapped (in sequence space).

      I could be wrong. It's a shame it's not easy to find these things out.

      Also, I think these settings should have a big effect on the result. 'Seed size' is a trade-off between sensitivity and running time. The bigger the seed size, the shorter the running time, but the more 'nearly perfect' hits you will miss. The smaller the seed size, the higher the sensitivity, but at some point the specificity drops dramatically, so many false matches have to be inspected at later stages of the mapping.
      Last edited by dan; 09-13-2009, 11:44 PM. Reason: Responding to the second point too.
      Homepage: Dan Bolser
      MetaBase the database of biological databases.
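
The seed-size trade-off described in this post can be illustrated with a toy exact-match seeding scheme. This is only a sketch of the general idea, assuming plain k-mer lookup; it is not how gsMapper is implemented, and the names `build_index` and `seed_hits` as well as the tiny sequences are made up for the example.

```python
from collections import defaultdict

def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def seed_hits(read, index, k):
    """Return candidate read start positions supported by at least one exact seed."""
    hits = set()
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            hits.add(i - j)  # where the read would start if this seed is real
    return hits

# Toy data: the read is reference[3:11] with one substitution in the middle.
reference = "AAACCCGGGTTTACGT"
read = "CCCGAGTT"

for k in (2, 3, 6):
    candidates = seed_hits(read, build_index(reference, k), k)
    print(k, sorted(candidates))
```

With a long seed the single mismatch leaves no exact seed intact, so the read is missed; with a very short seed the true position is found along with spurious candidates that a later alignment stage would have to reject.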



      • #4
        We used some different values for "minimum length" and "minimum identity": -ml 90% -mi 96% to get more reliable variation detection in areas with lower coverage.



        • #5
          Maybe silly, but I simply ran a BLAT analysis of the reads against a reference genome (which is really fast). That let me choose any cut-off I like (length as well as sensitivity, i.e. % homology). But this probably also depends on the specific requirements...
          My 2 cents.
          Alex
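
A cut-off filter of this kind could look roughly like the sketch below, which reads BLAT's tab-separated PSL output and keeps alignments above an identity and a length threshold. The function name `filter_psl` is hypothetical, the default thresholds are simply borrowed from the 96% / 90% values mentioned earlier in the thread, and the identity used here is a simplified ratio that ignores gaps, not BLAT's own percent-identity score.

```python
def filter_psl(lines, min_identity=0.96, min_length_fraction=0.90):
    """Yield (query, target, identity, covered) for PSL rows passing both cut-offs.

    identity = (matches + repMatches) / aligned bases, ignoring indels
    covered  = aligned span of the read / read length
    Both definitions are simplifying assumptions for this sketch.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional PSL header and anything that is not a 21-column data row.
        if len(fields) != 21 or not fields[0].isdigit():
            continue
        matches, mismatches, rep_matches = (int(x) for x in fields[:3])
        q_name, q_size = fields[9], int(fields[10])
        q_start, q_end = int(fields[11]), int(fields[12])
        t_name = fields[13]
        aligned = matches + mismatches + rep_matches
        if aligned == 0 or q_size == 0:
            continue
        identity = (matches + rep_matches) / aligned
        covered = (q_end - q_start) / q_size
        if identity >= min_identity and covered >= min_length_fraction:
            yield q_name, t_name, identity, covered
```

Because this identity ignores the insert columns of the PSL row, it says nothing about small indels; those would need separate handling.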



          • #6
            Originally posted by AlexB
            Maybe silly, but I simply ran a BLAT analysis of the reads against a reference genome (which is really fast). That let me choose any cut-off I like (length as well as sensitivity, i.e. % homology). But this probably also depends on the specific requirements...
            My 2 cents.
            Alex
            Alex, given the homopolymer issue, do you have a standard way of dealing with all the small indels that BLAT might be returning? I believe gsMapper has some built-in filters to take care of some of those false positives.
            --
            bioinfosm



            • #7
              I have to admit we never looked at it in that much detail, so I can't comment. Since we were relatively new to the technology at the time, we compared the gsMapper results to the ones returned by BLAT, and with certain homology/length cut-offs we more or less reproduced them. This was with a 2 Mb genome, though... Can you be more precise about what exactly you mean? I will keep an eye on it.
