  • SliderII: High Quality SNP Calling Using Illumina Data at Shallow Coverage

    SliderII is now available from:

    High quality SNP calling using Illumina data at minimal coverage


    Sorry for the delay,

    Nawar

  • #2
    Thanks for this. I have always been fascinated by Slider. I guess this is the first SNP caller that explicitly uses all four quality values. James Bonfield and Mark Daly both believe, and have shown in preliminary results, that using four values leads to better SNP calls. Some comments on the figures on your website:

    1. It is interesting to see that you also arrived at using known allele frequency as a prior, the same as BGI's SNP caller. When I did SNP calling for NA18507, I suggested this too, but everyone else said it was somehow cheating and rejected my suggestion. They prefer to think of two separate problems: SNP discovery and genotyping. For SNP discovery we use only a flat prior, and for genotyping we use the allele frequency.

    2. How does Slider detect paralogous regions? Does it detect CNVs first and then filter out the SNPs in CNVs? I agree that setting a maximum depth, as maq does, is not a good way.

    3. I am not sure I read your paper properly. As I understand it, only one mutation (not counting sequencing errors) is allowed per read. Is that right?
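
To illustrate point 1, here is a minimal Python sketch of how a flat prior and an allele-frequency prior change the genotype posterior at one site. It is not Slider's or BGI's code: the function names, the single per-base error rate, and the Hardy-Weinberg form of the frequency prior are all assumptions made for the example.

```python
from math import prod

GENOTYPES = ("RR", "RA", "AA")  # R = reference allele, A = alternate allele


def flat_prior():
    """Equal prior weight on every genotype (the 'SNP discovery' setting)."""
    return {g: 1.0 / len(GENOTYPES) for g in GENOTYPES}


def hwe_prior(alt_freq):
    """Prior from a known alternate-allele frequency, assuming Hardy-Weinberg."""
    p, q = 1.0 - alt_freq, alt_freq
    return {"RR": p * p, "RA": 2 * p * q, "AA": q * q}


def genotype_likelihoods(bases, error_rate=0.01):
    """Likelihood of the observed bases ('R' or 'A', one per read) per genotype."""
    p_alt = {"RR": error_rate, "RA": 0.5, "AA": 1.0 - error_rate}
    return {
        g: prod(p_alt[g] if b == "A" else 1.0 - p_alt[g] for b in bases)
        for g in GENOTYPES
    }


def posterior(likelihoods, prior):
    """Normalised product of likelihood and prior for each genotype."""
    joint = {g: likelihoods[g] * prior[g] for g in GENOTYPES}
    total = sum(joint.values())
    return {g: v / total for g, v in joint.items()}


# Three reads at shallow coverage, one of which shows the alternate allele.
lik = genotype_likelihoods("RRA")
post_flat = posterior(lik, flat_prior())
post_rare = posterior(lik, hwe_prior(0.001))
```

With the same three reads, the flat prior favours the heterozygous call, while a prior built from a very rare allele frequency pulls the posterior back towards homozygous reference. That is the effect the discovery-versus-genotyping argument is about.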



    • #3
      step by step

      I checked http://www.bcgsc.ca/platform/bioinfo/software/SliderII
      and I think it does the alignment in steps:

      # Alignment.Java: Find read locations on the reference sequence with an exact match or a one-off match (one base mismatch) to prb-derived sequences.
      # Extend.java: Extend reads to include up to 3 mismatches.
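
The two steps above can be sketched roughly as follows. This is a naive linear scan written only to show the idea of matching a seed with at most one mismatch and then accepting the full read with up to three. The function names and the `seed_len` parameter are hypothetical, the sketch works on plain base strings rather than prb-derived sequences, and SliderII's actual implementation is not shown in this thread.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def seed_and_extend(read, reference, seed_len,
                    max_seed_mismatches=1, max_total_mismatches=3):
    """Return (position, mismatches) for every placement that passes both steps."""
    hits = []
    seed = read[:seed_len]
    for pos in range(len(reference) - len(read) + 1):
        # Step 1: the seed must match exactly or with one mismatch.
        if hamming(seed, reference[pos:pos + seed_len]) > max_seed_mismatches:
            continue
        # Step 2: extend to the full read, tolerating up to three mismatches.
        total = hamming(read, reference[pos:pos + len(read)])
        if total <= max_total_mismatches:
            hits.append((pos, total))
    return hits


reference = "ACGTACGTTTGACCA"
hits = seed_and_extend("ACGTTTGA", reference, seed_len=4)
# hits == [(0, 3), (4, 0)]: an exact placement at 4 and a 3-mismatch one at 0
```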



      • #4
        Any insight into how Slider results compare to MAQ SNP calling on single/paired data?

        Originally posted by lh3 (post #2 above)
        --
        bioinfosm



        • #5
          Yes, just check the link:

          High quality SNP calling using Illumina data at minimal coverage


          N.



          • #6
            - Regarding paralogous regions, Slider identifies paralogous SNPs (and contig-edge SNPs) because they are likely to be at the edges of the reads.
            - Yes, Slider (and SliderII) allows up to one mutation. In addition, it considers all possible bases in the prb data, and when using PET reads, SliderII force-aligns a read if the other end of the pair is aligned.

            Nawar
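
For readers unfamiliar with prb data, here is a minimal sketch of what "all possible bases" could mean in practice. It assumes the usual `_prb.txt` layout (one read per line, cycles separated by tabs, four Solexa-scaled scores per cycle in A, C, G, T order) and the standard Solexa log-odds conversion. The function names and the `min_prob` cut-off are hypothetical, and this is not how Slider itself weighs the four values.

```python
BASES = "ACGT"


def solexa_to_prob(score):
    """Convert a Solexa log-odds score, 10*log10(p/(1-p)), to a probability."""
    return 1.0 / (1.0 + 10.0 ** (-score / 10.0))


def parse_prb_line(line):
    """Split one prb line into per-cycle tuples of four integer scores."""
    return [tuple(int(s) for s in cycle.split()) for cycle in line.strip().split("\t")]


def candidate_bases(cycle_scores, min_prob=0.2):
    """Bases whose probability reaches min_prob (an arbitrary illustrative cut-off)."""
    return "".join(
        base for base, score in zip(BASES, cycle_scores)
        if solexa_to_prob(score) >= min_prob
    )


# Two cycles: a confident A, then an ambiguous call between A and G.
cycles = parse_prb_line("40 -40 -40 -40\t-5 -40 3 -40")
candidates = [candidate_bases(c) for c in cycles]
```

An aligner that keeps both A and G at the second cycle can try either base against the reference instead of committing to the single best call.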



            • #7
              2. Do you mean you exclude SNPs towards the ends of a read? These are the false SNPs caused by indels. A better strategy would be to filter out SNPs close to predicted indels.

              3. Sorry that I did not read through the whole page. I now realize that this is a seed-and-extend algorithm. You allow at most one mutation in the seed but may extend the seed to allow more. By the way, the page says "the smaller the seed size is, the faster the alignment will be". Is this a typo?



              • #8
                > Hi,
                >
                > I am very interested in your SNP caller SliderII and I am trying to use it. I have one question for you. What is the meaning of SNP_in in the config file? I didn't find an explanation of this item on the SliderII website.
                >
                > Thank you very much.
                >
                > Rebecca

                SNP_in is the expected number of bases in the reference genome per SNP. For the human genome, this number should be 1000.

                Nawar
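
A minimal sketch of what a value such as SNP_in = 1000 implies numerically. Reading it as a per-site prior of 1/SNP_in is an interpretation of the answer above, not a statement of how SliderII uses the value internally, and the function names and the round genome length are assumptions for the example.

```python
def snp_prior(snp_in):
    """Per-site prior probability of a SNP if one is expected every snp_in bases."""
    if snp_in <= 0:
        raise ValueError("snp_in must be positive")
    return 1.0 / snp_in


def expected_snps(genome_length, snp_in):
    """Expected number of SNPs over genome_length bases at that density."""
    return genome_length / snp_in


prior = snp_prior(1000)                      # one SNP per 1000 bases
total = expected_snps(3_000_000_000, 1000)   # round figure for a human-sized genome
```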



                • #9
                  What coordinate system are you using to generate the table of known SNPs to feed into SliderII? Is it 1-based or 0-based? Does anyone have a table for mouse 2007, mm9?



                  • #10
                    Hi Nix,

                    I used the Ensembl Variation database (version 50) SNPs.
                    You need to adjust the format.

                    Nawar
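
The thread does not say which convention SliderII expects, so the sketch below only shows the conversion itself, between 0-based half-open intervals (as in BED files) and 1-based inclusive intervals. The function names are hypothetical. Check the convention of the source table and of the SliderII input before applying either direction.

```python
def zero_to_one_based(start0, end0):
    """0-based half-open [start0, end0) -> 1-based inclusive [start1, end1]."""
    return start0 + 1, end0


def one_to_zero_based(start1, end1):
    """1-based inclusive [start1, end1] -> 0-based half-open [start0, end0)."""
    return start1 - 1, end1


# A single-base SNP at the 100th base of a chromosome:
snp_one_based = zero_to_one_based(99, 100)    # (100, 100)
snp_zero_based = one_to_zero_based(100, 100)  # (99, 100)
```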



                    • #11
                      Has the SliderII paper been published?
                      Also, the picture at this link cannot be displayed:
                      High quality SNP calling using Illumina data at minimal coverage
                      Last edited by pengchy; 10-09-2011, 11:52 PM. Reason: more questions
