  • foxyg
    Member
    • May 2010
    • 54

    Translate coordinates between 2 references

    Hi,

    Are there any tools to translate coordinates between two reference FASTA files, such as hg18 and hg19? I need a tool that takes two references, an indel file, and a list of locations in one reference, and returns the corresponding locations in the other reference.

    I would hate to have to write that myself.

    Thanks
  • Chipper
    Senior Member
    • Mar 2008
    • 323

    #2
    UCSC liftOver


    • dawe
      Senior Member
      • Apr 2009
      • 258

      #3
      As long as a liftOver chain file (i.e. the "dictionary") exists for your pair of assemblies, you can use liftOver, either online or by downloading the binary and the chain file. It works with intervals (BED files), so if you need to translate a wiggle file, convert it to bedGraph first.

      d


      • malachig
        Senior Member
        • Aug 2010
        • 117

        #4
        To elaborate: to use 'liftOver' you need to download the executable and the right dictionary (chain) file, i.e. the one that corresponds to your current and target genome versions. Links:

        Liftover executable & Liftover files.

        Find your genome of interest, follow the appropriate 'LiftOver Files' link, then find the file that corresponds to the two genome builds of interest (e.g. hg18ToHg19.over.chain.gz).


        • hengdai
          Junior Member
          • Oct 2010
          • 2

          #5
          I have a similar question. I aligned some sequences with TopHat/Bowtie against the NCBI reference v37 instead of UCSC hg19, so the output files (SAM, WIG, etc.) have IDs like NC_0000001 instead of chr1. Unfortunately, I then realized that these IDs don't work with IGV.
          I heard that NCBI v37.3 and UCSC hg19 are identical, so can I just use Perl to run a search-and-replace on these text files? (I guess it is similar to liftOver, but I did not see a dictionary for NCBI -> hg.)

          Thanks
          Heng


          • malachig
            Senior Member
            • Aug 2010
            • 117

            #6
            Liftover is primarily concerned with converting the coordinates of features on one genome build to the corresponding coordinates on a different genome build of the same species (or orthologous position on a different species build). The difference in naming of the chromosomes themselves is due to different conventions used by UCSC versus NCBI. You should be able to remap the names (but check to see if they have a one-to-one relationship). You can confirm that your NCBI build corresponds to a particular UCSC build here:
            UCSC Releases

            When dealing with genome builds from different sources, particularly human, it is important to think about how the source (NCBI, UCSC, Ensembl) deals with the haplotype chromosomes and unassembled contigs (those pieces of the genome that still have not been assigned to a chromosome). For these sequences, figuring out the mapping of names is not always obvious and, worse, there isn't necessarily a one-to-one relationship. For example, NCBI may keep unassembled contigs from chr1 as separate entries, whereas UCSC may place them in a 'chrUn' entry. Thankfully, the bulk of the human build (corresponding to chromosomes 1-22, X, Y, and the mitochondrial genome) should be consistent and one-to-one. UCSC provides detailed descriptions of the idiosyncrasies of each build on their website under 'assembly details' for each assembly.
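As a rough illustration of the name remapping and the one-to-one check suggested above, here is a minimal Python sketch. It assumes a tab-delimited, BED-like file with the sequence name in the first column; SAM files (header lines, RNAME and RNEXT columns) and wiggle declaration lines would need their own handling. The function names and the two-entry mapping are hypothetical and only for illustration; build the full mapping from the assembly details of your own builds and verify it before use.

```python
def check_one_to_one(name_map):
    """Raise ValueError if two source names map to the same target name."""
    seen = {}
    for src, dst in name_map.items():
        if dst in seen:
            raise ValueError(f"{seen[dst]} and {src} both map to {dst}")
        seen[dst] = src


def rename_column(lines, name_map, column=0):
    """Yield tab-delimited lines with one column renamed via name_map.

    Names that are not in name_map are left unchanged, so contigs without
    a known counterpart are not silently altered.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > column and fields[column] in name_map:
            fields[column] = name_map[fields[column]]
        yield "\t".join(fields) + "\n"


# Illustrative mapping only (assumption): extend and verify for your build.
example_map = {"NC_000001.10": "chr1", "NC_000002.11": "chr2"}
check_one_to_one(example_map)
renamed = list(rename_column(["NC_000001.10\t100\t200\n"], example_map))
```

Leaving unknown names untouched is a deliberate choice here: it makes the haplotype and unassembled sequences easy to spot afterwards instead of guessing a name for them.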


            • hengdai
              Junior Member
              • Oct 2010
              • 2

              #7
              malachig,

              I just compared h_sapiens_37_asm.fa and hg19.fa; they are indeed the same except for the names. Each has 25 FASTA sequences, so they are one-to-one. So I guess the unmapped ones are at least consistent for this version of the human reference genome.

              Thanks for the update. I guess I will just run a replace on my text output for now, since it took me several days to run my program. (I only have very limited nodes.)
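A comparison like the one described in this post could be sketched in Python as follows. This is not the method the poster actually used; it is one possible approach that computes a digest per FASTA record and pairs records whose sequences are identical. Sequences are upper-cased before hashing, on the assumption that soft-masking (lowercase repeats) should not count as a difference. The function names are hypothetical.

```python
import hashlib


def fasta_digests(lines):
    """Return a list of (name, md5 of the upper-cased sequence) in file order."""
    digests = []
    name, md5 = None, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                digests.append((name, md5.hexdigest()))
            parts = line[1:].split()
            name = parts[0] if parts else ""
            md5 = hashlib.md5()
        elif name is not None:
            # Hash incrementally so whole chromosomes never sit in memory.
            md5.update(line.upper().encode("ascii"))
    if name is not None:
        digests.append((name, md5.hexdigest()))
    return digests


def pair_by_sequence(digests_a, digests_b):
    """Map each name in A to the name in B with an identical sequence, or None."""
    by_digest = {digest: name for name, digest in digests_b}
    return {name: by_digest.get(digest) for name, digest in digests_a}
```

For real files, pass open file handles, e.g. `fasta_digests(open("hg19.fa"))`. Any name that pairs with `None` has no identical counterpart and should be checked by hand before renaming.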


              • foxyg
                Member
                • May 2010
                • 54

                #8
                I downloaded liftOver and the chain file. The instructions say:

                liftOver oldFile map.chain newFile unMapped

                If I want to translate one position, chr1:344-344, given in BED format, how do I do this? What are oldFile and unMapped?


                • d17
                  Member
                  • Sep 2008
                  • 27

                  #9
                  Originally posted by foxyg
                  I download the liftover and chain file, the instruction says

                  liftOver oldFile map.chain newFile unMapped

                  If I want to translate 1 position in bed format chr1:344-344, how do I do this, what is oldFile and unMapped?
                  oldFile is the file of coordinates you want to convert (typically BED).
                  unMapped is a file that is created when you run liftOver: it contains those features in oldFile that did not lift over to the new coordinates, and gives the reason why (e.g. partially deleted).
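To make the single-position case concrete, here is a small Python sketch. BED uses 0-based, half-open intervals, so a single base at 1-based position 344 on chr1 becomes the interval 343-344; whether the 344 in the question is 1-based is an assumption. The helper names and the file names in the usage function are hypothetical, and the argument order simply follows the usage line quoted above.

```python
import subprocess


def position_to_bed(chrom, pos, name=None):
    """Convert a 1-based position to a BED line (0-based start, exclusive end)."""
    if pos < 1:
        raise ValueError("pos must be 1-based (>= 1)")
    fields = [chrom, str(pos - 1), str(pos)]
    if name is not None:
        fields.append(name)
    return "\t".join(fields) + "\n"


def liftover_command(old_file, chain_file, new_file, unmapped_file):
    """Build the argument list: liftOver oldFile map.chain newFile unMapped."""
    return ["liftOver", old_file, chain_file, new_file, unmapped_file]


def lift_single_position(chrom, pos, chain_file):
    """Write a one-line BED file and run liftOver on it.

    Assumes the liftOver binary is on PATH; the file names are placeholders.
    """
    with open("input.bed", "w") as handle:
        handle.write(position_to_bed(chrom, pos))
    cmd = liftover_command("input.bed", chain_file, "output.bed", "unmapped.bed")
    subprocess.run(cmd, check=True)
```

After a run, the lifted interval would be in output.bed, and the end column is the 1-based position in the new build; if the position could not be converted, it appears in unmapped.bed with the reason.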
