  • Where to download all exon locations?

    Hi,

    I am doing some exome sequencing. Since the capture region is not completely accurate, I would like to compare it with all exon regions and remove the SNP calls that fall outside the exons.

    Does anyone know where I can find a database or file of all exons with start and end locations for hg18?

    Thanks

  • #2
    Hi foxyg,

    The UCSC Browser can give you that information:

    Go to the Table Browser and choose hg18 as the assembly
    choose group: Genes and Gene Prediction Tracks
    choose track: UCSC Genes (more isoforms, less accurate) or RefSeq Genes (fewer isoforms, more accurate)
    choose output format: BED (or any other format you'd like, but BED can be displayed within the genome browser)
    enter a file name in the text box below and click "get output"
    you are then taken to another page, where you can choose what to export (all exons, additional intron coordinates, just coding exons, 5' UTR exons, ...)

    That's it.

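    The BED output contains one line per exon, with tab-delimited fields:
    <chromosome> <start position> <end position> <identifier> <score> <+/- strand>
    The score is always 0 for the exons, as they do not need one. Note that BED start positions are 0-based and end positions are exclusive.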
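
As a rough illustration of the filtering step asked about in the first post, here is a minimal Python sketch that indexes the exon BED lines and tests whether a SNP position falls inside any exon. The function names (`build_exon_index`, `snp_in_exon`) are made up for this example, and it assumes SNP positions are 1-based (as in VCF) while BED starts are 0-based with exclusive ends.

```python
import bisect
from collections import defaultdict
from itertools import accumulate


def build_exon_index(bed_lines):
    """Index exon BED lines as {chrom: (sorted starts, running max of ends)}."""
    by_chrom = defaultdict(list)
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines, comments, and "track"/"browser" header lines.
        if len(fields) < 3 or fields[0].startswith(("track", "browser", "#")):
            continue
        by_chrom[fields[0]].append((int(fields[1]), int(fields[2])))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [start for start, _ in intervals]
        # max_ends[i] is the largest end among the first i + 1 exons, so
        # overlapping exons from different isoforms are handled correctly.
        max_ends = list(accumulate((end for _, end in intervals), max))
        index[chrom] = (starts, max_ends)
    return index


def snp_in_exon(index, chrom, pos_1based):
    """Return True if a 1-based position lies inside at least one exon."""
    if chrom not in index:
        return False
    starts, max_ends = index[chrom]
    pos0 = pos_1based - 1  # convert to the 0-based BED coordinate
    i = bisect.bisect_right(starts, pos0) - 1  # last exon starting at or before pos0
    return i >= 0 and max_ends[i] > pos0
```

To use it, pass an open BED file (or any iterable of lines) to `build_exon_index`, then keep only the SNP calls for which `snp_in_exon` returns True.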

    Hope that helps,



    • #3
      foxyg,

      Did you get results from that? I was looking at something similar, and it seems there are quite a few exons that are not covered by the Agilent target regions!
      --
      bioinfosm



      • #4
        The Ensembl API is also very handy for these kinds of tasks, or if you want to avoid scripting altogether, BioMart is another option.

        http://www.ensembl.org/info/docs/api..._tutorial.html

        http://www.ensembl.org/biomart/martview



        • #5
          exon_coords has more bases than Agilent target region?

          I used a different method to generate the exon_coords.bed file: I ran the nonoverlapping_exon_coords.pl script from the Illumina CASAVA pipeline on the UCSC refFlat.txt file. This produced a total of 65M bases' worth of exonic coordinates, whereas the region targeted by Agilent SureSelect is 50M bases.

          How are you handling this?
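
One way to check numbers like these is to merge overlapping intervals before counting, then measure how much of the exon set the target regions actually cover. The Python sketch below is only an illustration of that idea, not the CASAVA script; the function names are hypothetical, and intervals are assumed to be half-open `(start, end)` pairs on a single chromosome, as in BED.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extends the previous interval instead of being counted twice.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def total_bases(intervals):
    """Count bases covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


def overlap_bases(a, b):
    """Count bases covered by both interval sets (e.g. exons and targets)."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total
```

Running `total_bases` per chromosome on the exon file and on the target file, and `overlap_bases` on the pair, gives the exonic bases inside and outside the capture design.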



          • #6
            hg18 versus hg19 for exome capture data

            Has anyone looked in detail at how the choice of genome build (hg18 vs. hg19) affects coverage and depth?

            Also, has anyone used a human exome capture kit for mouse?



            • #7
              Hello,
              That is a nice way to download exon coordinates. But when I download the BED file, how do I get ordinary gene names (like HOX, SSBP1, etc.) in the identifier column for the genes/exons?
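
One possible approach, sketched here in Python under some assumptions, is to look up each transcript accession in the UCSC refFlat.txt file mentioned earlier in the thread, whose first two columns are the gene name and the transcript name. The sketch assumes the BED identifier has the form `<transcript>_exon_...` (for example `NM_003143_exon_0_0_chr7_101_f`); check this against your own download. The function names are hypothetical, and because refFlat is keyed by RefSeq accession, this would only apply to the RefSeq Genes track.

```python
def load_gene_symbols(refflat_lines):
    """Map transcript accession -> gene name from refFlat lines."""
    symbols = {}
    for line in refflat_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            gene_name, transcript = fields[0], fields[1]
            symbols[transcript] = gene_name
    return symbols


def rename_bed_line(line, symbols):
    """Replace the transcript accession in the BED name field with a gene name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        return line.rstrip("\n")
    # Assumed name layout: "<transcript>_exon_<...>"; anything else is left as-is.
    transcript, sep, rest = fields[3].partition("_exon_")
    gene_name = symbols.get(transcript)
    if gene_name:
        fields[3] = gene_name + sep + rest
    return "\t".join(fields)
```

Identifiers with no match in the lookup table are left unchanged, so the output stays a valid BED file either way.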
