  • SliderII: High Quality SNP Calling Using Illumina Data at Shallow Coverage

    SliderII is now available from:

    High quality SNP calling using Illumina data at minimal coverage


    Sorry for the delay,

    Nawar

  • #2
    Thanks for this. I have always been fascinated by Slider. I guess this is the first SNP caller that explicitly uses all four quality values. James Bonfield and Mark Daly both believe, and have shown in preliminary results, that using four values leads to better SNP calls. Some comments on the figures on your website:

    1. It is interesting that you also arrived at using known allele frequency as a prior, the same as BGI's SNP caller. When I did SNP calling for NA18507, I suggested this too, but everyone else said it was somehow cheating and rejected my suggestion. They prefer to think of two separate problems: SNP discovery and genotyping. For SNP discovery we use only a flat prior, and for genotyping we use the allele frequency.

    2. How does Slider detect paralogous regions? Does it detect CNVs first and then filter out the SNPs inside them? I agree that setting a maximum depth, as maq does, is not a good way.

    3. I am not sure I read your paper properly. As I understand it, only one mutation (as opposed to sequencing errors) is allowed per read. Is that right?
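
To make the flat-prior versus allele-frequency-prior distinction in point 1 concrete, here is a minimal Python sketch. It is not Slider's or BGI's actual model: the function names, the Hardy-Weinberg form of the prior, and the use of the four per-base probabilities directly as likelihood weights are all assumptions made for illustration.

```python
def genotypes(ref, alt):
    """The three diploid genotypes at a biallelic site (hypothetical helper)."""
    return [(ref, ref), (ref, alt), (alt, alt)]

def flat_prior(ref, alt):
    """Same prior weight for every genotype (the 'SNP discovery' view)."""
    gts = genotypes(ref, alt)
    return {g: 1.0 / len(gts) for g in gts}

def allele_frequency_prior(ref, alt, alt_freq):
    """Hardy-Weinberg prior from a known alt allele frequency (the 'genotyping' view)."""
    f = alt_freq
    return {
        (ref, ref): (1 - f) ** 2,
        (ref, alt): 2 * f * (1 - f),
        (alt, alt): f ** 2,
    }

def genotype_posteriors(reads, prior):
    """reads: one dict per read mapping each of A, C, G, T to a probability.
    Assumption: each read samples either allele of the genotype with chance 0.5."""
    unnormalised = {}
    for (a1, a2), p in prior.items():
        likelihood = 1.0
        for r in reads:
            likelihood *= 0.5 * r[a1] + 0.5 * r[a2]
        unnormalised[(a1, a2)] = p * likelihood
    total = sum(unnormalised.values())
    return {g: v / total for g, v in unnormalised.items()}

# Two reads at shallow coverage: one clearly A, one leaning towards G.
reads = [
    {"A": 0.90, "C": 0.03, "G": 0.04, "T": 0.03},
    {"A": 0.30, "C": 0.05, "G": 0.60, "T": 0.05},
]
post_flat = genotype_posteriors(reads, flat_prior("A", "G"))
post_common = genotype_posteriors(reads, allele_frequency_prior("A", "G", 0.5))
post_rare = genotype_posteriors(reads, allele_frequency_prior("A", "G", 0.01))
```

With so few reads the prior dominates: the heterozygous call gains weight when the alt allele is known to be common and loses it when the allele is rare, which is why the choice matters at shallow coverage.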



    • #3
      step by step

      I checked http://www.bcgsc.ca/platform/bioinfo/software/SliderII
      and I think it does the alignment in steps:

      # Alignment.Java: Find read locations on the reference sequence with an exact match and one-off match (one base mismatch) to prb derived sequences.
      # Extend.java: Expand reads to include up to 3 mismatches
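
As a rough illustration of the first step (exact and one-off matching), here is a small Python sketch. It is not taken from SliderII's Java source; the index structure and the function names `build_index` and `exact_and_one_off` are hypothetical, and it works on a single called sequence rather than on prb-derived alternatives.

```python
from collections import defaultdict

def build_index(reference, read_len):
    """Map every substring of length read_len in the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - read_len + 1):
        index[reference[i:i + read_len]].append(i)
    return index

def exact_and_one_off(read, index):
    """Return sorted (position, mismatches) pairs for exact and one-mismatch hits."""
    hits = {(pos, 0) for pos in index.get(read, [])}
    for i, base in enumerate(read):
        for alt in "ACGT":
            if alt == base:
                continue
            # Substitute one base and look the variant up as an exact match.
            variant = read[:i] + alt + read[i + 1:]
            for pos in index.get(variant, []):
                hits.add((pos, 1))
    return sorted(hits)

reference = "ACGTACGTTAGC"
index = build_index(reference, 4)
exact_hits = exact_and_one_off("ACGT", index)
one_off_hits = exact_and_one_off("ACGA", index)
```

Extending to three mismatches, as the second step describes, would need a different strategy than enumerating every variant, since the number of variants grows quickly with the mismatch count.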



      • #4
        Any insight on how slider results compare to MAQ SNP calling on single/paired data?

        Originally posted by lh3 (post #2 above)
        --
        bioinfosm



        • #5
          Yes, just check the link:

          High quality SNP calling using Illumina data at minimal coverage


          N.



          • #6
            - Regarding paralogous regions: Slider identifies paralogous SNPs (and contig edge SNPs), as they are likely to be at the edges of the reads.
            - Yes, Slider (and SliderII) allows up to one mutation. In addition, it considers all possible bases in the prb data, and when using PET reads, SliderII force-aligns a read if the other side of the pair is aligned.

            Nawar



            • #7
              2. Do you mean you exclude SNPs towards the ends of a read? These are the false SNPs caused by indels. A better strategy would be to filter out SNPs close to predicted indels.

              3. Sorry that I did not read through the whole page. I now realize that this is a seeding-extension algorithm. You allow at most one mutation in the seed but may extend the seed to allow more. By the way, the page says "the smaller the seed size is, the faster the alignment will be". Is this a typo?



              • #8
                > Hi,
                >
                > I am very interested in your SNP caller SliderII and I am trying to use it. I have one question for you: what is the meaning of SNP_in in the config file? I didn't find an explanation of this item on the SliderII website.
                >
                > Thank you very much.
                >
                > Rebecca

                SNP_in is the expected number of reference-genome bases per SNP. For the human genome, this number should be 1000.
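
To spell out the arithmetic behind that value, a short Python sketch follows. The function names are hypothetical, the genome length is an approximate figure, and treating 1 / SNP_in as a per-base prior is an interpretation of the description above, not something confirmed from SliderII's code.

```python
def snp_rate_per_base(snp_in):
    """Expected fraction of reference bases that are SNPs, taken as 1 / SNP_in."""
    if snp_in <= 0:
        raise ValueError("SNP_in must be positive")
    return 1.0 / snp_in

def expected_snp_count(genome_length, snp_in):
    """Expected number of SNPs in a genome of the given length."""
    return genome_length / snp_in

HUMAN_GENOME_LENGTH = 3_100_000_000  # approximate, for illustration only

rate = snp_rate_per_base(1000)
count = expected_snp_count(HUMAN_GENOME_LENGTH, 1000)
print(rate, count)
```

So SNP_in = 1000 corresponds to about one SNP per thousand bases, or roughly three million SNPs across a human genome.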

                Nawar



                • #9
                  What coordinate system are you using to generate the table of known SNPs to feed into SliderII? Is it 1-based or 0-based? Does anyone have a table for mouse 2007 (mm9)?



                  • #10
                    Hi Nix,

                    I used the Ensembl Variation database (version 50) SNPs.
                    You need to adjust the format.

                    Nawar



                    • #11
                      Has the SliderII paper been published?
                      Also, the picture at this link cannot be displayed:
                      High quality SNP calling using Illumina data at minimal coverage
                      Last edited by pengchy; 10-09-2011, 11:52 PM. Reason: more questions
