Add count numbers to headers in a fasta file

  • syfo
    replied
    A short one:

    Code:
    awk '/^>/{$0=$0"_"(++i)}1' infile
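
    For readers less familiar with awk: the pattern /^>/ selects header lines, the action appends "_" plus a pre-incremented counter to the line, and the trailing 1 is an always-true pattern that prints every line. A rough Python sketch of the same idea follows; the function name number_headers is made up for illustration.

```python
def number_headers(lines):
    """Yield FASTA lines, appending _1, _2, ... to each header line."""
    count = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Mirrors awk's /^>/{$0=$0"_"(++i)}: bump the counter, then append it.
            count += 1
            line = line + "_" + str(count)
        # Mirrors awk's trailing 1: every line is passed through.
        yield line


records = [">OakDna", "ACTC", ">OakDna", "AAAA"]
print("\n".join(number_headers(records)))
```

    Because it tests for a leading ">" rather than counting lines, this also works when a sequence is wrapped over several lines.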


  • Heisman
    replied
    Originally posted by wieni
    Here a quick and dirty solution in python - was still missing :-)


    You can use the "code" tags to keep the indentation intact: surround the code with [code ] and [/code ] (but without the spaces):

    Code:
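    #!/usr/bin/env python

    import re
    import sys

    # Read the whole input file into memory.
    infile = open(sys.argv[1])
    data = infile.readlines()
    infile.close()

    outfile = open(sys.argv[2], "w")
    c = 1  # current line number (1-based)
    l = 1  # running sequence number appended to headers
    for i in data:
        i = re.sub("\n|\r", "", i)  # strip line endings
        # Assumes two-line records: headers on odd lines, sequences on even lines.
        if c % 2 != 0:
            outfile.write(i + "_" + str(l) + "\n")
            l += 1
        else:
            outfile.write(i + "\n")
        c += 1
    outfile.close()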
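
    The script above reads the whole file and treats every odd-numbered line as a header, so it assumes each record is exactly two lines long. As a sketch of an alternative (not the poster's method), the version below tests for a leading ">" instead and streams the file line by line. The function name renumber_fasta and the demo file names are made up for illustration.

```python
import os
import tempfile


def renumber_fasta(in_path, out_path):
    """Copy in_path to out_path, suffixing each '>' header with _1, _2, ...

    Returns the number of headers seen (the final n).
    """
    n = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:  # stream line by line instead of readlines()
            line = line.rstrip("\r\n")
            if line.startswith(">"):
                n += 1
                line = line + "_" + str(n)
            dst.write(line + "\n")
    return n


if __name__ == "__main__":
    # Small demo on a wrapped record (sequence split over two lines).
    tmp_dir = tempfile.mkdtemp()
    src_path = os.path.join(tmp_dir, "in.fa")
    dst_path = os.path.join(tmp_dir, "out.fa")
    with open(src_path, "w") as fh:
        fh.write(">OakDna\nACTC\nTAAA\n>OakDna\nAAAA\n")
    print(renumber_fasta(src_path, dst_path))
    with open(dst_path) as fh:
        print(fh.read())
```

    Returning the count gives the final n directly, and the with blocks close both files even if an error occurs partway through.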


  • wieni
    replied
    Ah, and correct the indentation; it was lost here...


  • wieni
    replied
    Here is a quick and dirty solution in Python, which was still missing :-)


    #!/usr/bin/env python

    import re
    import string
    import sys


    infile = open(sys.argv[1])
    data = infile.readlines()
    infile.close()

    outfile = open(sys.argv[2], "w")
    c = 1
    l = 1
    for i in data:
        i = re.sub("\n|\r", "", i)
        if c%2 != 0:
            outfile.write(i+"_" +str(l) +"\n")
            l+=1
        else:
            outfile.write(i +"\n")
        c += 1
    outfile.close()


    Save the code above in a file called, for example, "numberFasta.py".
    On a terminal, call the program with: python numberFasta.py <yourInfile> <outfilename>


  • gsgs
    replied
    I'd just write a simple program to do it.

    5 minutes?


  • Giorgio C
    replied
    I tried to google it, but couldn't find what I was looking for. Btw the link you posted seems to be good...THANKS a lot!


  • Heisman
    replied
    Google is your friend in situations like these:

    http://www.linuxquestions.org/questi...f-line-803625/


  • Giorgio C
    started a topic Add count numbers to headers in a fasta file

    Hi all,

    I have a FASTA file with the same header for each sequence. I would like to add natural numbers at the end of each header line:

    >OakDna
    ACTCTAAATCAGTGCGAG...
    >OakDna
    AAAAACCCTTTACACTTT...
    >OakDna
    CTCTAAACCTTTAACCTT..
    etc.

    I want something like this:

    >OakDna_1
    ACTCTAAATCAGTGCGAG...
    >OakDna_2
    AAAAACCCTTTACACTTT...
    >OakDna_3
    CTCTAAACCTTTAACCTT..
    etc.
    >OakDna_n
    ACTCATCCAAAACTTTTT..

    where n is the number of the last sequence in the file.

    Any quick suggestion?

    Thanks in advance,
    Giorgio
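
    To check whether a file still has repeated headers, before or after renumbering, a small helper along these lines may be useful. It is only a sketch; the name duplicate_headers is made up for illustration, and the sample records are shortened versions of the ones in the question.

```python
from collections import Counter


def duplicate_headers(lines):
    """Return {header: count} for every FASTA header that occurs more than once."""
    counts = Counter(
        line.rstrip("\r\n") for line in lines if line.startswith(">")
    )
    return {header: n for header, n in counts.items() if n > 1}


before = [">OakDna", "ACTC", ">OakDna", "AAAA", ">OakDna", "CTCT"]
after = [">OakDna_1", "ACTC", ">OakDna_2", "AAAA", ">OakDna_3", "CTCT"]
print(duplicate_headers(before))  # the repeated header and how often it occurs
print(duplicate_headers(after))   # empty dict when all headers are unique
```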