  • [Announcement] Testers needed for new peak annotation tool

    Hi,

    I am developing a new open-source, C++-based tool (peak-tool) that annotates peaks from a BED file using the GENCODE annotation database.

    Once the GENCODE database has been loaded and processed, each BED line is looked up in constant time thanks to an indexing data structure.
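
The post does not say which indexing structure peak-tool uses. As a rough illustration only, one common way to get near-constant-time position lookups is a fixed-width bin index held in a hash map. Everything in this sketch (the bin width, the function names, the toy features, the 1-based inclusive coordinates) is an assumption made for the example, not taken from the tool:

```python
from collections import defaultdict

BIN_SIZE = 1000  # hypothetical bin width in bases


def build_index(features, bin_size=BIN_SIZE):
    """Map (chrom, bin number) -> features overlapping that bin.

    `features` is an iterable of (chrom, start, end, name) tuples with
    1-based inclusive coordinates (an assumption for this sketch).
    """
    index = defaultdict(list)
    for chrom, start, end, name in features:
        # Register the feature in every bin it touches.
        for b in range(start // bin_size, end // bin_size + 1):
            index[(chrom, b)].append((start, end, name))
    return index


def lookup(index, chrom, pos, bin_size=BIN_SIZE):
    """Return the names of features that contain `pos`."""
    # One hash lookup, then a scan of a (usually small) bucket.
    bucket = index.get((chrom, pos // bin_size), [])
    return [name for start, end, name in bucket if start <= pos <= end]


# Toy features, not real GENCODE records.
features = [
    ("chr1", 1500, 4200, "geneA"),
    ("chr1", 3900, 5000, "geneB"),
    ("chr2", 100, 900, "geneC"),
]
index = build_index(features)
print(lookup(index, "chr1", 4000))  # ['geneA', 'geneB']
print(lookup(index, "chr1", 6000))  # []
```

A structure like this trades memory for speed: long features are stored once per bin they span, so the cost per lookup depends on bucket size rather than on the total number of features.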

    Here are some example annotations:

    Bed:
    chr14 57735721 57735722 MACS_peak_13241 1159.23

    Annotated:
    chr14 57735721 MACS_peak_13241 1159.23 PROMOTER AP5M1 + AP5M1-001 protein_coding 57735627 94
    chr14 57735721 MACS_peak_13241 1159.23 PROMOTER EXOC5 - EXOC5-001 protein_coding 57735726 5


    Bed:
    chr1 12556659 12556660 MACS_peak_330 1733.05

    Annotated:
    chr1 12556659 MACS_peak_330 1733.05 PROMOTER VPS13D + VPS13D-012 retained_intron 12557280 -621


    Bed:
    chr1 1778750 1778751 MACS_peak_51 102.12

    Annotated:
    chr1 1778750 MACS_peak_51 102.12 INTRON GNB1 - GNB1-001 protein_coding 1822495 43745
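
Reading the four annotated lines above, the last two columns look like a reference coordinate (plausibly the transcript start) followed by a strand-aware signed offset of the peak from it. That reading is inferred from the examples, not confirmed by the post, and the function name below is hypothetical. A minimal sketch that reproduces the four offsets:

```python
def tss_distance(peak_pos, tss, strand):
    """Signed distance from a reference coordinate to the peak position.

    Positive values lie downstream in the transcript's direction,
    negative values upstream. This convention is an assumption inferred
    from the example output, not taken from the peak-tool source.
    """
    if strand == "+":
        return peak_pos - tss
    if strand == "-":
        return tss - peak_pos
    raise ValueError(f"unknown strand: {strand!r}")


# (peak position, reference coordinate, strand) from the examples above.
examples = [
    (57735721, 57735627, "+"),  # AP5M1
    (57735721, 57735726, "-"),  # EXOC5
    (12556659, 12557280, "+"),  # VPS13D
    (1778750, 1822495, "-"),    # GNB1
]
for peak_pos, tss, strand in examples:
    print(tss_distance(peak_pos, tss, strand))
# prints 94, 5, -621, 43745 (one per line)
```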



    Can you please test this tool and provide feedback for further improvements?

    Here's the link to the GitHub repository:
    https://github.com/goxed/peak-tool

    Right now the tool can only annotate BED files (human, hg19), using annotations from the GENCODE database (included in the Git repo).

    The tool needs 16 GB of RAM on Mac OS X (10.9.x or later) and at least 20 GB of RAM on Linux (16 GB if you use zram memory compression or a very fast SSD swap).
    Last edited by amitra; 05-07-2015, 01:46 PM.

  • #2
    What's the benefit of this over things like bedtools or bedops? Also, it looks like you're actually annotating the midpoint of peaks, rather than peaks themselves (e.g., https://github.com/goxed/peak-tool/b....cpp#L496-L498). That would seem rather problematic.
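
To illustrate the concern with a toy case: a broad peak can overlap a feature even when its midpoint falls outside that feature, so a midpoint-only lookup would miss it. The coordinates and function names here are made up for the example and assume BED-style half-open intervals; nothing is taken from the peak-tool source:

```python
def midpoint(start, end):
    """Integer midpoint of a half-open interval [start, end)."""
    return (start + end) // 2


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


peak = (1000, 3000)     # hypothetical broad peak
feature = (2500, 2800)  # hypothetical feature inside the peak

mid = midpoint(*peak)
mid_hit = feature[0] <= mid < feature[1]
interval_hit = overlaps(*peak, *feature)

print(mid)           # 2000
print(mid_hit)       # False
print(interval_hit)  # True
```

For 1 bp summit intervals like the ones in the first post, the two tests agree; they diverge only for wider peaks.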


    • #3
      Originally posted by dpryan:
      What's the benefit of this over things like bedtools or bedops? Also, it looks like you're actually annotating the midpoint of peaks, rather than peaks themselves (e.g., https://github.com/goxed/peak-tool/b....cpp#L496-L498). That would seem rather problematic.
      Thanks for the feedback!

      I see this as a tool specifically for converting peaks to gene names. It currently works on peak summits, but I plan to add more features in the future, including coverage, to address the issue you mention.

      The tool uses an in-memory index of the GENCODE features to make the search extremely fast, which makes it suitable for large BED files.
      Last edited by amitra; 05-07-2015, 01:40 PM.
