
  • how to tell SNPs synonymous v. nonsynonymous

    Hi,
    I have aligned my Illumina reads to hg19 and used SAMtools to give me a list of SNPs. I uploaded the list of SNPs to Galaxy and used tools there to filter out all dbSNPs and to select only the SNPs that are in coding exons. Now I would like to determine whether each SNP is synonymous or nonsynonymous. Is there an automated way of doing this? It will take me a long time if I need to enter each SNP into the UCSC browser to see whether it changes the amino acid.

  • #2
    I have the same problem as well



    • #3
      Finding the good ones

      Originally posted by Margot
      Hi,
      I have aligned my Illumina reads to hg19 and used SAMtools to give me a list of SNPs. I uploaded the list of SNPs to Galaxy and used tools there to filter out all dbSNPs and to select only the SNPs that are in coding exons. Now I would like to determine whether each SNP is synonymous or nonsynonymous. Is there an automated way of doing this? It will take me a long time if I need to enter each SNP into the UCSC browser to see whether it changes the amino acid.
      Hi Margot,

      I use SIFT (http://sift.jcvi.org/) to do this, and I checked that it does support hg19/GRCh37. It runs relatively fast and is fairly user friendly, though I wish they didn't use * for nonsense codons. It really messes up my database, but you can replace them in Excel using find/replace with find=~* and replace=X (the ~ makes Excel treat * as a literal asterisk rather than a wildcard).

      Hope that helps,

      Jonathan
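For anyone who would rather script the find/replace step than do it in Excel, here is a minimal Python sketch. It assumes the annotation output is tab-delimited text; the function name and the example field values are hypothetical and not taken from SIFT itself.

```python
def replace_stop_symbol(line: str, old: str = "*", new: str = "X") -> str:
    """Replace the stop-codon symbol in every tab-separated field of one line.

    Assumption: '*' only ever appears as the nonsense/stop marker, so a blind
    replacement is safe. Check your own file before relying on that.
    """
    fields = line.rstrip("\n").split("\t")
    return "\t".join(field.replace(old, new) for field in fields)


if __name__ == "__main__":
    # Hypothetical example row: chromosome, position, substitution, effect.
    print(replace_stop_symbol("1\t100\tR123*\tNONSENSE\n"))
```

Unlike Excel, Python's `str.replace` does not treat `*` as a wildcard, so no escape character is needed.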



      • #4
        another tool is http://gvs.gs.washington.edu/SeattleSeqAnnotation/



        • #5
          Looks like SeattleSeqAnnotation only supports NCBI36/hg18 (1-indexed coordinates) and doesn't support the newer GRCh37/hg19 genome build, but that was only a quick check, so I might be wrong. Does anyone have experience with both?



          • #6
            As far as I know you are right; see http://gvs.gs.washington.edu/Seattle...lpHowToUse.jsp



            • #7
              Thanks for the help!



              • #8
                I would suggest writing a program yourself.

                By definition, a "synonymous" change at a nucleotide position does not change the translated amino acid, whereas a "non-synonymous" change does. So before you can determine whether each SNP is synonymous or non-synonymous, you need to know the translational frame of the exon and the genetic code (translation table) that applies to your organism and organelle.
                Website: http://www.fengfengzhou.org/FengfengZhou/
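The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard nuclear genetic code (NCBI translation table 1), expects the coding sequence (CDS) on the coding strand and already in frame, and takes a 0-based offset into that CDS. Mapping a genomic coordinate to a CDS offset (exon boundaries, minus-strand genes) is not handled here. The function name `classify_snp` is made up for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... with '*' marking stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds: str, pos: int, alt: str) -> tuple[str, str, str]:
    """Classify a single-base change in an in-frame coding sequence.

    cds: coding-strand sequence starting at the first base of a codon.
    pos: 0-based offset of the SNP within cds.
    alt: alternate base, given on the coding strand.
    Returns (label, reference amino acid, alternate amino acid).
    """
    cds, alt = cds.upper(), alt.upper()
    offset = pos % 3                     # position of the SNP inside its codon
    start = pos - offset                 # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if pos < 0 or len(ref_codon) != 3:
        raise ValueError("SNP does not fall inside a complete codon")
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return label, ref_aa, alt_aa


if __name__ == "__main__":
    cds = "ATGGCTTGA"                    # toy CDS: Met, Ala, stop
    print(classify_snp(cds, 5, "C"))     # third base of GCT changed to C
    print(classify_snp(cds, 3, "A"))     # first base of GCT changed to A
```

For a mitochondrial gene or a non-standard organism, the `AMINO_ACIDS` string would need to be swapped for the appropriate translation table.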



                • #9
                  Thanks for the information, which is just what I was looking for. SIFT needs each SNP's orientation (strand), while GVS only needs the ref_allele and the first/second allele, with no strand required.

                  samtools pileup does not provide strand information. Does it matter if I just give 1 or -1 so that SIFT will run? Thanks.



                  • #10
                    Originally posted by bair
                    Thanks for the information, which is just what I was looking for. SIFT needs each SNP's orientation (strand), while GVS only needs the ref_allele and the first/second allele, with no strand required.

                    samtools pileup does not provide strand information. Does it matter if I just give 1 or -1 so that SIFT will run? Thanks.
                    For SIFT you can just mark all SNPs as forward strand if the strand is not known.

                    "Orientation:
                    Use 1 for positive strand and -1 for negative strand. If orientation is not known, use 1 as default."
                    The one trick I noticed is that you must always include the known reference allele as the first allele in the allele column; otherwise it defaults to reporting a synonymous SNP. I caught this issue after two samples had the identical mutation, but one was heterozygous with the reference allele (called nonsynonymous) and the other was homozygous for the mutation (called synonymous).
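The reference-allele-first trick can be sketched in Python as follows. The comma-separated layout (chromosome, position, orientation, alleles joined by "/") is an assumption made for this example, so check the SIFT help page for the exact input format. The function name and the coordinates below are hypothetical.

```python
def sift_input_line(chrom, pos, ref, observed_alleles, orientation=1):
    """Build one input line with the reference allele always listed first.

    ref: reference base at this position (e.g. taken from the pileup).
    observed_alleles: alleles called in the sample; a homozygous variant
        call may not include the reference base at all.
    orientation: 1 is used as the default when the strand is unknown.
    Assumption: fields are comma-separated and alleles are joined by '/'.
    """
    ref = ref.upper()
    others = []
    for allele in observed_alleles:
        allele = allele.upper()
        # Keep each non-reference allele once, in the order first seen.
        if allele != ref and allele not in others:
            others.append(allele)
    alleles = "/".join([ref] + others)
    return f"{chrom},{pos},{orientation},{alleles}"


if __name__ == "__main__":
    # Homozygous variant call (C/C) and heterozygous call (C/T) at a
    # made-up position both come out with the reference T listed first.
    print(sift_input_line("1", 1000, "T", ["C", "C"]))
    print(sift_input_line("1", 1000, "T", ["C", "T"]))
```

Both calls produce the same allele column, which avoids the heterozygous versus homozygous discrepancy described above.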



                    • #11
                      Thank you Jon, the trick is very helpful. I will need to be careful when modifying my input file.
                      Actually, I have not set up SIFT yet. It may be worth comparing the outputs from SIFT and GVS.



                      • #12
                        Originally posted by Margot
                        I uploaded the list of SNPs to Galaxy and used tools there to filter out all dbSNPs and to select only the SNPs that are in coding exons. Now I would like to determine whether each SNP is synonymous or nonsynonymous.
                        Late answer I know, but this is a very common requirement, so this may help someone. Unfortunately Galaxy lacks a simple SNV annotation tool, but with some work you *can* separate your synonymous and nonsynonymous SNPs in Galaxy (hint: read the help text for the "Mutate Codons" tool).

                        Or you could use a variant annotation tool e.g.:
                        http://gvs.gs.washington.edu/SeattleSeqAnnotation/
                        http://www.svaproject.org/
                        http://www.openbioinformatics.org/annovar/
                        http://seqant.genetics.emory.edu/
                        http://www.ualberta.ca/~stothard/dow...SNP/index.html



                        • #13
                          Originally posted by BetterPrimate
                          Late answer I know, but this is a very common requirement, so this may help someone. Unfortunately Galaxy lacks a simple SNV annotation tool, but with some work you *can* separate your synonymous and nonsynonymous SNPs in Galaxy (hint: read the help text for the "Mutate Codons" tool).

                          Or you could use a variant annotation tool e.g.:
                          http://gvs.gs.washington.edu/SeattleSeqAnnotation/
                          http://www.svaproject.org/
                          http://www.openbioinformatics.org/annovar/
                          http://seqant.genetics.emory.edu/
                          http://www.ualberta.ca/~stothard/dow...SNP/index.html
                          Thank you very much.



                          • #14
                            Use the latest ANNOVAR program... it's great!



                            • #15
                              The Ensembl SNP Effect Predictor is very good for this:

                              http://bioinformatics.oxfordjournals...6/16/2069.full
                              http://www.ensembl.org/Homo_sapiens/...loadVariations
