  • lre1234
    Senior Member
    • Aug 2011
    • 110

    SNP Allele frequency data

    Hi All,
    I have a list of a few thousand SNPs for which I am trying to get population allele frequency data. Ideally, I would like the frequencies for all 26 populations in the 1000 Genomes Project. I can't seem to find this information anywhere. Does anyone know where I might be able to get it?
    Thanks for your help
  • Emily_Ensembl
    Member
    • Dec 2013
    • 12

    #2
    You could try the Ensembl REST API variation POST endpoint. You'd need to chunk your list of variants into 200s but it would be relatively easy.
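The chunking step mentioned above could be sketched roughly as follows. This is only an illustration: the helper name `chunked` and the made-up rsIDs are assumptions, and the batch size of 200 is taken from the reply above.

```python
from typing import Iterator, List


def chunked(ids: List[str], size: int = 200) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Hypothetical input: 450 made-up rsIDs standing in for a real SNP list.
snp_ids = [f"rs{n}" for n in range(1, 451)]
batches = list(chunked(snp_ids))
print([len(batch) for batch in batches])  # [200, 200, 50]
```

Each batch can then be sent as the "ids" list of one POST request.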


    • lre1234
      Senior Member
      • Aug 2011
      • 110

      #3
      Thanks Emily_Ensembl.
      This API does work, but I am still looking to get the allele frequencies for all 26 subpopulations of the 1000 Genomes. This API seems to give only the main continental ancestries. I know that I can probably download all of the genotype information and calculate these myself, but I would have thought there would be a simple way to download the information from somewhere.


      • Emily_Ensembl
        Member
        • Dec 2013
        • 12

        #4
        pops=1 should get you all 26 populations


        • lre1234
          Senior Member
          • Aug 2011
          • 110

          #5
          Then I must be doing something wrong (quite likely). I'm trying it with the wget example:

          Code:
          wget -q --header='Content-type:application/json' --header='Accept:application/json' \
          --post-data='{ "ids" : ["rs56116432" ] }' \
          'http://rest.ensembl.org/variation/homo_sapiens' -O temp.out pops=1
          And only get this as the output:

          {"rs56116432":{"ambiguity":"Y","ancestral_allele":null,"minor_allele":"T","mappings":[{"allele_string":"C/T","start":133256042,"coord_system":"chromosome","assembly_name":"GRCh38","end":133256042,"strand":1,"seq_region_name":"9","location":"9:133256042-133256042"},{"strand":1,"seq_region_name":"CHR_HG2030_PATCH","end":133256189,"assembly_name":"GRCh38","location":"CHR_HG2030_PATCH:133256189-133256189","allele_string":"C/T","coord_system":"chromosome","start":133256189}],"MAF":0.00259585,"most_severe_consequence":"missense_variant","synonyms":["NM_020469.2:c.689G>A","NP_065202.2.Gly230Asp"],"evidence":["Frequency","1000Genomes","ESP","ExAC","TOPMed","gnomAD"],"source":"Variants (including SNPs and indels) imported from dbSNP","var_class":"SNP","name":"rs56116432"}}
          I've also tried it with other SNPs just to see if it was a SNP specific thing, but get similar outputs.


          • lre1234
            Senior Member
            • Aug 2011
            • 110

            #6
            Then I must be doing something wrong (highly probable). I'm trying it with the wget example:

            Code:
            wget -q --header='Content-type:application/json' --header='Accept:application/json' \
            --post-data='{ "ids" : ["rs56116432" ] }' \
            'http://rest.ensembl.org/variation/homo_sapiens' -O temp.out pops=1
            And I get:

            {"rs56116432":{"ambiguity":"Y","ancestral_allele":null,"minor_allele":"T","mappings":[{"allele_string":"C/T","start":133256042,"coord_system":"chromosome","assembly_name":"GRCh38","end":133256042,"strand":1,"seq_region_name":"9","location":"9:133256042-133256042"},{"strand":1,"seq_region_name":"CHR_HG2030_PATCH","end":133256189,"assembly_name":"GRCh38","location":"CHR_HG2030_PATCH:133256189-133256189","allele_string":"C/T","coord_system":"chromosome","start":133256189}],"MAF":0.00259585,"most_severe_consequence":"missense_variant","synonyms":["NM_020469.2:c.689G>A","NP_065202.2.Gly230Asp"],"evidence":["Frequency","1000Genomes","ESP","ExAC","TOPMed","gnomAD"],"source":"Variants (including SNPs and indels) imported from dbSNP","var_class":"SNP","name":"rs56116432"}}
            Unless I'm missing something in the output, all I see is the total MAF and not broken down by populations. I've also tried this with other SNPs and get similar results.


            • Emily_Ensembl
              Member
              • Dec 2013
              • 12

              #7
              Try adding pops=1 to the URL, like:

              Code:
              wget -q --header='Content-type:application/json' --header='Accept:application/json' --post-data='{ "ids" : ["rs56116432" ] }' 'http://rest.ensembl.org/variation/homo_sapiens?pops=1' -O temp.out
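For anyone scripting this rather than calling wget, the same request might look something like the sketch below, using only the Python standard library. The function names `build_request` and `fetch_variants` are hypothetical; the URL, headers, and JSON body are copied from the wget command above. The key point is that pops=1 goes into the query string of the URL, not after it as a separate argument.

```python
import json
import urllib.request
from typing import Any, Dict, List

# URL taken from the wget example in this thread.
ENDPOINT = "http://rest.ensembl.org/variation/homo_sapiens"


def build_request(ids: List[str], pops: bool = True) -> urllib.request.Request:
    """Build a POST request for a batch of variant IDs (hypothetical helper)."""
    # pops=1 must be part of the URL's query string to have any effect.
    url = ENDPOINT + ("?pops=1" if pops else "")
    body = json.dumps({"ids": ids}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_variants(ids: List[str]) -> Dict[str, Any]:
    """Send the request and decode the JSON reply (performs a network call)."""
    with urllib.request.urlopen(build_request(ids)) as response:
        return json.load(response)
```

Calling `fetch_variants(["rs56116432"])` would send the same request as the wget command and return the decoded JSON as a dictionary keyed by variant ID.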


              • lre1234
                Senior Member
                • Aug 2011
                • 110

                #8
                That works. Thanks. I have a couple of thousand SNPs to get frequencies for, so I can write some sort of wrapper to run through them all.

                One last question: is there a way to limit the output to the 1000 Genomes populations only?


                • Emily_Ensembl
                  Member
                  • Dec 2013
                  • 12

                  #9
                  No, you would need to parse your query response to limit the data in that way.
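Parsing the response to keep only the 1000 Genomes entries could be done along these lines. This is a sketch under assumptions not confirmed in this thread: that each variant record returned with pops=1 carries a "populations" list, that each entry has a "population" name field, and that 1000 Genomes names start with "1000GENOMES:". Check these against a real response before relying on them. The function name `keep_1000_genomes` and the example data are made up.

```python
from typing import Any, Dict, List

# Assumed prefix of 1000 Genomes population names in the response.
PREFIX = "1000GENOMES:"


def keep_1000_genomes(
    response: Dict[str, Any], prefix: str = PREFIX
) -> Dict[str, List[Dict[str, Any]]]:
    """Map each variant ID to the population entries whose name starts with `prefix`."""
    return {
        variant_id: [
            entry
            for entry in record.get("populations", [])
            if entry.get("population", "").startswith(prefix)
        ]
        for variant_id, record in response.items()
    }


# Made-up response fragment for illustration only; IDs and numbers are not real data.
example = {
    "rs0000001": {
        "name": "rs0000001",
        "populations": [
            {"population": "1000GENOMES:phase_3:GBR", "allele": "T", "frequency": 0.1},
            {"population": "OTHER_SOURCE:example", "allele": "T", "frequency": 0.2},
        ],
    }
}
filtered = keep_1000_genomes(example)
print([entry["population"] for entry in filtered["rs0000001"]])
# ['1000GENOMES:phase_3:GBR']
```

Variants with no "populations" key simply map to an empty list, so the filter is safe to run on responses fetched without pops=1.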
