
  • how to tell SNPs synonymous v. nonsynonymous

    Hi,
    I have aligned my Illumina reads to hg19 and used SAMtools to give me a list of SNPs. I uploaded the list of SNPs to Galaxy and used tools there to filter out all dbSNPs and to select only the SNPs that are in coding exons. Now I would like to determine whether each SNP is synonymous or nonsynonymous. Is there an automated way of doing this? It will take me a long time if I need to enter each SNP into the UCSC browser to see whether it changes the amino acid.

  • #2
    I have the same problem as well



    • #3
      Finding the good ones

      Originally posted by Margot
      Hi,
      I have aligned my Illumina reads to hg19 and used SAMtools to give me a list of SNPs. I uploaded the list of SNPs to Galaxy and used tools there to filter out all dbSNPs and to select only the SNPs that are in coding exons. Now I would like to determine whether each SNP is synonymous or nonsynonymous. Is there an automated way of doing this? It will take me a long time if I need to enter each SNP into the UCSC browser to see whether it changes the amino acid.
      Hi Margot,

      I use SIFT (http://sift.jcvi.org/) to do this, and I checked that it does support hg19/GRCh37. It works relatively fast and is fairly user friendly, though I wish they didn't use * for nonsense codons... it really messes up my database, but you can replace them in Excel using find/replace with find=~* and replace=X (the tilde makes Excel treat * as a literal character rather than a wildcard).

      Hope that helps,

      Jonathan
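For anyone who would rather not round-trip through Excel, here is a minimal sketch of the same find/replace in Python. It assumes the SIFT results have been saved as tab-delimited text with a header row; the function name and the column names in the usage example (`pos`, `aa_change`) are hypothetical, not SIFT's actual headers.

```python
import csv
import io

def replace_stop_symbol(tsv_text, column, old="*", new="X"):
    """Replace `old` with `new` in one named column of tab-delimited text.

    Only the chosen column is touched, so a literal '*' elsewhere is kept.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row[column] = row[column].replace(old, new)
        writer.writerow(row)
    return out.getvalue()

# Hypothetical two-row table; the column names are made up for illustration.
example = "pos\taa_change\n100\tR12*\n200\tA5T\n"
cleaned = replace_stop_symbol(example, "aa_change")
```

Plain `str.replace` treats `*` literally, so no escaping is needed the way it is in Excel's find dialog.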



      • #4
        Another tool is http://gvs.gs.washington.edu/SeattleSeqAnnotation/



        • #5
          Looks like the SeattleSeqAnnotation only supports NCBI36/hg18 1-indexed and doesn't support the new GRCh37/hg19 genome build, but that was only on a quick check so I might be wrong. Anyone with experience with both?



          • #6
            As far as I know you are right; see http://gvs.gs.washington.edu/Seattle...lpHowToUse.jsp



            • #7
              Thanks for the help!



              • #8
                I would suggest writing a program yourself.

                By definition, a "synonymous" change in a nucleotide position doesn't change the translated amino acid, whereas a "non-synonymous" change does. So you need to determine the translational frame of the exon (and the strand of the gene), and the genetic code (codon translation table) of your organism and organelle, before you can determine whether each SNP is "synonymous" or "non-synonymous".
                Website: http://www.fengfengzhou.org/FengfengZhou/
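The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard nuclear genetic code (NCBI translation table 1), it expects the coding sequence already spliced, in frame, and on the coding strand, and `classify_snp` with its arguments is a hypothetical helper rather than part of any existing tool. Mapping a genomic coordinate to an offset in the coding sequence is left out.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... with '*' marking stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_snp(cds, offset, alt):
    """Classify a single-base change in an in-frame coding sequence.

    cds    -- coding sequence on the coding strand, starting at codon position 1
    offset -- 0-based position of the SNP within cds
    alt    -- alternate base, given on the same strand as cds
    Returns (label, reference amino acid, alternate amino acid).
    """
    cds, alt = cds.upper(), alt.upper()
    start = offset - offset % 3          # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("SNP falls in an incomplete codon")
    pos = offset % 3                     # position of the SNP inside the codon
    alt_codon = ref_codon[:pos] + alt + ref_codon[pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return label, ref_aa, alt_aa

# Toy sequence ATG GCT TGA: changing the third base of GCT keeps alanine,
# changing its first base to A gives ACT (threonine).
third_base = classify_snp("ATGGCTTGA", 5, "C")
first_base = classify_snp("ATGGCTTGA", 3, "A")
```

For a gene on the minus strand, the reference sequence and the alternate allele would both need to be reverse-complemented first, and a mitochondrial gene would need a different translation table.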



                • #9
                  Thanks, this is the information I was looking for. SIFT needs each SNP's orientation, while GVS only needs ref_allele and the first/second allele, with no strand required.

                  samtools pileup does not provide strand information. Does it matter if I just give 1 or -1 for SIFT to run? Thanks.



                  • #10
                    Originally posted by bair
                    Thanks, this is the information I was looking for. SIFT needs each SNP's orientation, while GVS only needs ref_allele and the first/second allele, with no strand required.

                    samtools pileup does not provide strand information. Does it matter if I just give 1 or -1 for SIFT to run? Thanks.
                    For SIFT you can just mark all SNPs as forward strand if the strand is not known.

                    "Orientation:
                    Use 1 for positive strand and -1 for negative strand. If orientation is not known, use 1 as default."
                    The one trick I noticed is that you must always include the known reference allele as the first allele in the allele column; otherwise it defaults to reporting a synonymous SNP. I caught this issue after two samples had the identical mutation, but one was heterozygous with the reference allele (called nonsynonymous) and the other was homozygous for the mutation (called synonymous).
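A small sketch of how the input could be normalized so the reference allele always comes first, whether the sample is heterozygous or homozygous for the variant. The comma-separated layout used here (chromosome, coordinate, orientation, alleles) is an assumption about SIFT's input format and should be checked against the SIFT help page; `sift_input_line` and the coordinate in the example are hypothetical.

```python
def sift_input_line(chrom, pos, ref, observed, orientation=1):
    """Build one input line with the reference allele listed first.

    ref         -- reference base at this position
    observed    -- alleles called in the sample, e.g. ["C", "T"] or ["C", "C"]
    orientation -- 1 (the documented default when the strand is unknown) or -1
    The 'chrom,pos,orientation,alleles' layout is assumed, not confirmed.
    """
    ref = ref.upper()
    alts = []
    for allele in observed:
        allele = allele.upper()
        # Keep each non-reference allele once, in the order first seen.
        if allele != ref and allele not in alts:
            alts.append(allele)
    if not alts:
        raise ValueError("no non-reference allele at this position")
    return f"{chrom},{pos},{orientation}," + "/".join([ref] + alts)

# Heterozygous and homozygous calls of the same mutation give the same line.
het_line = sift_input_line("3", 81780820, "T", ["C", "T"])
hom_line = sift_input_line("3", 81780820, "T", ["C", "C"])
```

Building the allele column from the reference base rather than from the genotype call avoids the case described above, where a homozygous variant was reported as synonymous.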



                    • #11
                      Thank you Jon, the trick is very helpful. I will need to be careful to modify my input file.
                      Actually I have not set up SIFT yet; it may be worth comparing the outputs from SIFT and GVS.



                      • #12
                        Originally posted by Margot
                        I uploaded the list of SNPs to Galaxy and used tools there to filter out all dbSNPs and to select only the SNPs that are in coding exons. Now I would like to determine whether each SNP is synonymous or nonsynonymous.
                        Late answer, I know, but this is a very common requirement, so this may help someone. Unfortunately Galaxy lacks a simple SNV annotation tool, but with some work you *can* separate your synonymous and nonsynonymous SNVs in Galaxy (hint: read the help text for the "Mutate Codons" tool).

                        Or you could use a variant annotation tool e.g.:
                        http://gvs.gs.washington.edu/SeattleSeqAnnotation/
                        http://www.svaproject.org/
                        http://www.openbioinformatics.org/annovar/
                        http://seqant.genetics.emory.edu/
                        http://www.ualberta.ca/~stothard/dow...SNP/index.html



                        • #13
                          Originally posted by BetterPrimate
                          Late answer, I know, but this is a very common requirement, so this may help someone. Unfortunately Galaxy lacks a simple SNV annotation tool, but with some work you *can* separate your synonymous and nonsynonymous SNVs in Galaxy (hint: read the help text for the "Mutate Codons" tool).

                          Or you could use a variant annotation tool e.g.:
                          http://gvs.gs.washington.edu/SeattleSeqAnnotation/
                          http://www.svaproject.org/
                          http://www.openbioinformatics.org/annovar/
                          http://seqant.genetics.emory.edu/
                          http://www.ualberta.ca/~stothard/dow...SNP/index.html
                          Thank you very much.



                          • #14
                            Use the latest ANNOVAR program... it's great!



                            • #15
                              The Ensembl SNP Effect Predictor is very good for this:

                              http://bioinformatics.oxfordjournals...6/16/2069.full
                              http://www.ensembl.org/Homo_sapiens/...loadVariations
