  • gtduarte
    Junior Member
    • Jan 2016
    • 5

    Script for creating GO annotation file from Interproscan output

    Dear community,

    I am working with a non-model organism, thus I have to use alternative approaches for many of the analyses. At the moment I want to run the GO enrichment analysis of my differentially expressed transcripts using BiNGO, and for that I have to load my own annotation file of the transcriptome assembly which I created with Interproscan. The output of IPRS can be found here: https://github.com/ebi-pf-team/inter...example-output

    So basically, using the IPRS output, I have to create one association per line between the transcript ID in the 1st column and each GO term in the 14th column. The issue is that many transcripts have several GO terms associated with them, while others have none, for instance:

    transcript_1 ...columns_2-13... GO:0004601|GO:0006979|GO:0020037|GO:0055114
    transcript_1 ...columns_2-13... GO:0004601|GO:0006979|GO:0020037|GO:0055114
    transcript_1 ...columns_2-13...
    transcript_1 ...columns_2-13... GO:0004601|GO:0055114
    transcript_1 ...columns_2-13... GO:0004601|GO:0042744
    transcript_2 ...columns_2-13...
    transcript_2 ...columns_2-13... GO:0055085

    And here is how the custom annotation file should look:

    transcript_1 = 0004601
    transcript_1 = 0006979
    transcript_1 = 0020037
    transcript_1 = 0055114
    transcript_1 = 0042744
    transcript_2 = 0055085

    Please, can someone help me with that? It wouldn't be a problem if the script generates redundant lines; I can remove duplicated values later.

    Best regards,

    Gustavo
  • neavemj
    Member
    • Feb 2014
    • 58

    #2
    Hi Gustavo,

    I've attached a little python script that should do what you want. I had to name it "rearrange_go.txt" because of seqanswers restrictions - just rename it "rearrange_go.py".

    First open the python file in a text editor and replace INPUT_FILE_NAME with the name of your file. Then put the script in the same directory as your file and run the following:

    python rearrange_go.py

    Another file called 'custom_annotation.txt' should be created. The script assumes you have Python installed (macOS and Linux usually do by default). It also assumes that column 14 always contains GO terms and nothing else, though it can be blank.

    Give it a go and let me know if it works!

    Cheers,

    Matt.
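
    The attached script is not reproduced in this thread. As a rough illustration of the approach described above, here is a minimal sketch, not the original attachment. It assumes the Interproscan output is tab-separated with the transcript ID in column 1 and GO terms in column 14; the function name go_associations and the command-line arguments are hypothetical. It pulls GO IDs out with a regular expression, so blank or placeholder values in column 14 simply produce no lines, and it drops duplicate associations while keeping first-seen order.

```python
import re
import sys

# Matches a GO identifier and captures only its 7-digit number.
GO_ID = re.compile(r"GO:(\d{7})")


def go_associations(lines):
    """Return unique 'transcript = GOID' strings in first-seen order.

    Assumes tab-separated rows with the transcript ID in column 1
    and GO terms in column 14 (an assumption about the input file).
    """
    seen = set()
    out = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 14:
            # Row has no 14th column, so there is no GO annotation.
            continue
        transcript = fields[0]
        for go_id in GO_ID.findall(fields[13]):
            pair = (transcript, go_id)
            if pair not in seen:
                seen.add(pair)
                out.append(f"{transcript} = {go_id}")
    return out


# Hypothetical usage: python rearrange_go_sketch.py input.tsv custom_annotation.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    in_path, out_path = sys.argv[1], sys.argv[2]
    with open(in_path) as src, open(out_path, "w") as dst:
        for assoc in go_associations(src):
            dst.write(assoc + "\n")
```

    Passing the file names on the command line avoids editing the script for each input, which is a design choice of this sketch rather than how the attached script works.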
    • gtduarte
      Junior Member
      • Jan 2016
      • 5

      #3
      Hi Matt,

      It worked, thanks a lot!!! You saved my day xD

      Cheers,

      Gustavo


      • neavemj
        Member
        • Feb 2014
        • 58

        #4
        Excellent! Glad it worked
