  • Liam_Gallagher
    Member
    • Oct 2011
    • 18

    Combine annotations

    Hi all. Could you suggest a way to combine annotations that come from different files but refer to the same subject (e.g. the same genomic coordinate)? I am thinking of Annovar (a tool for annotating nucleotide variants, indels, etc.), which produces a lot of files. I usually use Excel to combine all of these annotations into one final document containing all the information.
    This method works, but it takes me a lot of time.
    Thank you!

    Another question: how can I annotate a BED file containing a list of genomic coordinates with the corresponding gene symbols? I tried the UCSC Genome Browser (Tables section), but I was only able to annotate with UCSC gene names, not with standard names (such as HUGO gene symbols). Can BEDTools do it? Thanks!
  • jxchong
    Junior Member
    • Feb 2013
    • 6

    #2
    Annovar has a summarize_annovar.pl script you can take a look at...

    Personally, I combine the annotations using my own scripts. Excel is definitely not the way to do it! You're just going to end up making mistakes, plus Excel can't handle more than a handful of data sets.


    • Liam_Gallagher
      Member
      • Oct 2011
      • 18

      #3
      Yes, I know about summarize_annovar, but I could not get it working because of an error. Today I re-downloaded some of the databases, and now it works fine!
      Fortunately I only have a few variants to annotate (we look at just a few genes; it's not a big sequencing project), and for that I use a macro I wrote that does the work automatically. But combining all the input files for the macro takes time, so I was wondering whether you know of other tools that do it. In any case, thank you for the suggestions!

      What about my second question (annotating a BED file with HGNC gene symbols)? :-)



      • jxchong
        Junior Member
        • Feb 2013
        • 6

        #4
        The other option would be to write a script yourself to do it (usually in Perl or Python).
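As a rough illustration of what such a script might look like, here is a minimal Python sketch that merges several tab-separated annotation tables on a shared genomic-coordinate key. It assumes every input has a header row containing `chrom`, `start`, and `end` columns; those column names, the `combine_annotations` function, and the sample data are all hypothetical, and real Annovar output files would need their own parsing.

```python
import csv
import io

# Hypothetical key columns; adjust to whatever your files actually share.
KEY_COLUMNS = ("chrom", "start", "end")


def combine_annotations(sources, key_columns=KEY_COLUMNS, missing="."):
    """Merge tab-separated tables that share key columns into one table.

    sources: list of (label, file-like object) pairs, each with a header row.
    Returns (header, rows); rows keep the order in which keys were first seen.
    """
    merged = {}          # key tuple -> {prefixed column name: value}
    extra_columns = []   # annotation columns in first-seen order
    for label, handle in sources:
        for record in csv.DictReader(handle, delimiter="\t"):
            key = tuple(record[col] for col in key_columns)
            row = merged.setdefault(key, {})
            for name, value in record.items():
                if name in key_columns:
                    continue
                # Prefix with the source label so equal names do not collide.
                column = f"{label}.{name}"
                if column not in extra_columns:
                    extra_columns.append(column)
                row[column] = value
    header = list(key_columns) + extra_columns
    rows = [
        list(key) + [values.get(col, missing) for col in extra_columns]
        for key, values in merged.items()
    ]
    return header, rows


# Made-up example data standing in for two annotation files.
gene_table = io.StringIO(
    "chrom\tstart\tend\tgene\n"
    "chr1\t100\t101\tGENE_A\n"
    "chr2\t500\t501\tGENE_B\n"
)
freq_table = io.StringIO(
    "chrom\tstart\tend\tmaf\n"
    "chr1\t100\t101\t0.02\n"
)

header, rows = combine_annotations([("gene", gene_table), ("freq", freq_table)])
print("\t".join(header))
for row in rows:
    print("\t".join(row))
```

With real files you would pass open file handles instead of the `io.StringIO` objects. Coordinates that are missing from one of the inputs get a `.` placeholder in that input's columns rather than being dropped.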



        • AlexReynolds
          Member
          • Feb 2013
          • 45

          #5
          Originally posted by Liam_Gallagher
          Another question: how can I annotate a BED file containing a list of genomic coordinates with the corresponding gene symbols? I tried the UCSC Genome Browser (Tables section), but I was only able to annotate with UCSC gene names, not with standard names (such as HUGO gene symbols). Can BEDTools do it? Thanks!
          BEDOPS is another suite of tools for manipulating BED data.

          You can use the bedmap tool to annotate genomic regions with IDs or other data from other sets (gene names, etc.).

          As an example, if you have regions in a sorted file called Regions.bed and your genes in a sorted file called Genes.bed (where gene IDs are in the fourth column, per UCSC specification), the file AnnotatedRegions.bed will contain your answer:

          Code:
          $ bedmap --echo --echo-map-id --delim '\t' Regions.bed Genes.bed > AnnotatedRegions.bed
          The only requirement is that the inputs are sorted. Use the sort-bed utility for this purpose, e.g.:

          Code:
          $ sort-bed UnsortedRegions.bed > SortedRegions.bed
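To make the idea behind this kind of annotation concrete, here is a small pure-Python sketch of overlap-based mapping. It is not how bedmap is implemented and does not call BEDOPS; the `annotate_regions` function, the `;` separator between names, and the sample intervals are assumptions made for illustration. Intervals are treated as 0-based and half-open, as in BED, so intervals that merely touch do not count as overlapping.

```python
def annotate_regions(regions, genes, delim=";"):
    """Attach the names of all overlapping genes to each region.

    regions: list of (chrom, start, end) tuples.
    genes:   list of (chrom, start, end, name) tuples.
    Returns a list of (chrom, start, end, joined_gene_names) tuples.
    """
    # Group genes by chromosome so each region is only compared
    # against genes on the same chromosome.
    by_chrom = {}
    for chrom, start, end, name in genes:
        by_chrom.setdefault(chrom, []).append((start, end, name))

    annotated = []
    for chrom, start, end in regions:
        hits = [
            name
            for g_start, g_end, name in by_chrom.get(chrom, [])
            # Half-open intervals overlap when each starts before the other ends.
            if g_start < end and start < g_end
        ]
        annotated.append((chrom, start, end, delim.join(hits)))
    return annotated


# Made-up example intervals.
regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 0, 50)]
genes = [
    ("chr1", 150, 300, "GENE_A"),
    ("chr1", 190, 250, "GENE_B"),
    ("chr1", 600, 700, "GENE_C"),
    ("chr2", 10, 20, "GENE_D"),
]

annotated = annotate_regions(regions, genes)
for chrom, start, end, names in annotated:
    print(chrom, start, end, names, sep="\t")
```

This simple version compares every region against every gene on its chromosome, which is fine for a short gene panel. For genome-scale inputs, a sorted sweep such as the one the dedicated tools rely on is the better choice.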


          • Liam_Gallagher
            Member
            • Oct 2011
            • 18

            #6
            Thank you very much for your suggestions. They are very helpful!
