Where to download all exon locations?

  • Where to download all exon locations?


    I am doing some exome sequencing. Since the capture region is not completely accurate, I would like to compare it with all exon regions and remove the SNP calls that fall outside the exons.

    Does anyone know where I can find a database or file of all exons, with start and end locations, for hg18?


  • #2
    Hi foxyg,

    The UCSC Browser can give you that information:

    Go to the Table Browser and choose hg18 as the assembly,
    choose group: Genes and Gene Prediction Tracks,
    choose track: UCSC Genes (more isoforms, less accurate) or RefSeq Genes (fewer isoforms, more accurate),
    choose output format: BED (or any other format you'd like, but BED can be displayed within the genome browser),
    enter a file name in the text box below and click "get output".
    You are then taken to another page, where you can choose which features to get (all exons, introns, just coding exons, 5' UTR exons, ...).

    That's it.

    The BED format contains one line per exon, with tab-delimited fields:
    <chromosome> <start position> <end position> <identifier> <score (always 0 for exons, as they do not need one)> <+/- strand>

    Hope that helps,
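Once the exon BED file is downloaded, the filtering step asked about in the first post could be sketched roughly as follows. This is a minimal sketch, not a tested pipeline: the function names (`load_exons`, `merge_intervals`, `in_exon`) are made up for illustration, and it assumes SNP positions are 1-based (as in VCF) while BED starts are 0-based with exclusive ends.

```python
import bisect
from collections import defaultdict


def merge_intervals(intervals):
    """Collapse overlapping or touching (start, end) pairs into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def load_exons(bed_lines):
    """Parse BED lines into {chromosome: merged, sorted exon intervals}."""
    per_chrom = defaultdict(list)
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and the optional header lines of a BED file.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        per_chrom[fields[0]].append((int(fields[1]), int(fields[2])))
    return {chrom: merge_intervals(ivs) for chrom, ivs in per_chrom.items()}


def in_exon(exons, chrom, pos_1based):
    """Return True if a 1-based position falls inside any exon on chrom."""
    intervals = exons.get(chrom, [])
    pos0 = pos_1based - 1  # convert to the 0-based BED coordinate system
    # Index of the last interval whose start is <= pos0.
    i = bisect.bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]
```

To use it, pass an open file handle of the downloaded BED file to `load_exons`, then keep only the SNP calls for which `in_exon(exons, chrom, pos)` is true.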


    • #3

      Did you get results from that? I was looking at something similar, and it seems that quite a few exons are not covered by the Agilent target regions!


      • #4
        The Ensembl API is also very handy for these kinds of tasks, or if you want to avoid scripting altogether, then BioMart is another option.


        • #5
          exon_coords has more bases than Agilent target region?

          I used a different method to generate the exon_coords.bed file: I ran the script from the Illumina CASAVA pipeline on the UCSC refFlat.txt file. This produced a total of 65M bases' worth of exonic coordinates, whereas the region targeted by Agilent SureSelect is 50M bases.

          How are you handling this?
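One thing worth checking before comparing the two totals is whether the exon file counts the same bases more than once, since exons shared between isoforms appear on several lines. The sketch below (with a made-up function name, `exon_base_counts`) reports both the raw sum of exon lengths and the number of unique bases after merging overlaps. Whether this accounts for any of the 65M vs. 50M gap in this particular case is an assumption, not something confirmed in the thread.

```python
from collections import defaultdict


def exon_base_counts(bed_lines):
    """Return (raw_bases, unique_bases) for the exons in a BED file.

    raw_bases sums every line's length; unique_bases counts each genomic
    position once, even if several exon lines cover it.
    """
    per_chrom = defaultdict(list)
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip header or malformed lines that lack numeric coordinates.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        per_chrom[fields[0]].append((int(fields[1]), int(fields[2])))

    raw = 0
    unique = 0
    for intervals in per_chrom.values():
        raw += sum(end - start for start, end in intervals)
        cur_start = cur_end = None
        for start, end in sorted(intervals):
            if cur_end is None or start > cur_end:
                # Close the previous merged block and open a new one.
                if cur_end is not None:
                    unique += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        if cur_end is not None:
            unique += cur_end - cur_start
    return raw, unique
```

If the two numbers differ a lot, the unique count is the fairer one to set against the size of the capture target.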


          • #6
            hg18 versus hg19 for exome capture data

            Has anyone looked in detail at how the chosen genome build (hg18 vs. hg19) affects coverage and depth?

            Also, has anyone used a human exome capture kit for mouse?


            • #7
              A nice way to download exon coordinates. But when I download the BED file, how do I get the usual gene names (like HOX, SSBP1, etc.) in the identifier column?
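One possible route to gene symbols is the refFlat.txt file mentioned earlier in the thread, which carries the gene symbol in its first column, followed by the transcript name, chromosome, strand, transcript and CDS bounds, exon count, and comma-separated exon starts and ends. The sketch below turns such records into per-exon BED lines labelled with the symbol. The function name `refflat_to_exon_bed` and the `symbol|transcript|exonN` label layout are made up for illustration, and the exon index simply follows genomic order, so for minus-strand genes it does not match transcript order.

```python
def refflat_to_exon_bed(lines):
    """Yield one BED line per exon from refFlat-formatted lines.

    Expected refFlat columns (tab-delimited): geneName, name, chrom, strand,
    txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        gene, transcript, chrom, strand = fields[0], fields[1], fields[2], fields[3]
        # exonStarts/exonEnds are comma-separated and end with a trailing comma.
        starts = [int(x) for x in fields[9].split(",") if x]
        ends = [int(x) for x in fields[10].split(",") if x]
        for index, (start, end) in enumerate(zip(starts, ends), start=1):
            label = f"{gene}|{transcript}|exon{index}"
            yield "\t".join([chrom, str(start), str(end), label, "0", strand])
```

Writing the yielded lines to a file gives a BED file whose identifier column starts with the gene symbol rather than a transcript accession.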