
**maubp** · 10-11-2010, 03:00 AM

You have this:

Code:

>GIK1EHM01A0TLT length=84 xy=0302_1103 region=1 run=R_2010_06_08_08_48_07_
GTGTTTCTGTGTGGAGGTGTGTCTCTGTGGTGTGTGTGTCTGTGTGGGTGTACGTGTGTC
TCTGTCTGTGGTGTGTGTGTCTGT

According to the mirTools help page (http://222.73.178.238/mirtools/help.php), naming the read like this tells mirTools that it occurred 84 times:

Code:

>GIK1EHM01A0TLT_x84
GTGTTTCTGTGTGGAGGTGTGTCTCTGTGGTGTGTGTGTCTGTGTGGGTGTACGTGTGTC
TCTGTCTGTGGTGTGTGTGTCTGT

You need to use the read count, which is probably one, when renaming the read:

Code:

>GIK1EHM01A0TLT_x1
GTGTTTCTGTGTGGAGGTGTGTCTCTGTGGTGTGTGTGTCTGTGTGGGTGTACGTGTGTC
TCTGTCTGTGGTGTGTGTGTCTGT
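
To make the naming rule concrete, here is a minimal sketch. It is not part of mirTools, and the helper names `with_read_count` and `split_read_count` are made up for illustration. It assumes, as described above, that the number after the final `_x` is how many times the read occurred, and it shows that `length=84` in the original header is a description of the read rather than a count.

```python
def with_read_count(header, count=1):
    """Return a FASTA header of the form >ID_x<count>.

    Only the first whitespace-separated word of the header is kept as the ID;
    fields such as length=84 describe the read and are not a read count.
    """
    read_id = header.lstrip(">").split()[0]
    return ">%s_x%d" % (read_id, count)


def split_read_count(header):
    """Split >ID_x<count> back into (ID, count); assumes the suffix is present."""
    read_id, _, count = header.lstrip(">").rpartition("_x")
    return read_id, int(count)


original = ">GIK1EHM01A0TLT length=84 xy=0302_1103 region=1 run=R_2010_06_08_08_48_07_"
renamed = with_read_count(original)
print(renamed)
print(split_read_count(renamed))
```

The default of `count=1` matches the "probably one" case above; pass a different count only if you have actually collapsed identical reads and counted them.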

Do you know any scripting languages, e.g. Perl or Python?

This Biopython script will probably do what you want...

Code:

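from Bio import SeqIO

input_fasta = "original.fasta"
output_fasta = "fixed.fasta"

def fix_for_mirtools(records):
    """Rename each read as <id>_x1, i.e. a read count of one."""
    for record in records:
        # Clear the description so only the new identifier is written
        record.description = ""
        record.id += "_x1"
        yield record

records = SeqIO.parse(input_fasta, "fasta")
count = SeqIO.write(fix_for_mirtools(records), output_fasta, "fasta")
# Parenthesised form works on both Python 2 and Python 3
print("Saved %i records" % count)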

**Giorgio C** · 10-11-2010, 03:58 AM

Thank you for your answer.
I know a little Python, so I'll try your suggestion. I had hoped something similar existed for Linux, because I find Python difficult to use.

**maubp** · 10-11-2010, 05:22 AM

Originally posted by Giorgio C View Post

Thank you for your answer.
I know a little Python, so I'll try your suggestion. I had hoped something similar existed for Linux, because I find Python difficult to use.

Python works fine on Linux - or did you mean you would like a command line based solution?

You can turn it into a simple command line script that reads piped input if you want:

Code:

#!/usr/bin/env python
"""Quick script to read a FASTA file from stdin and write it to stdout,
formatting identifiers for mirTools assuming single read coverage."""
import sys
from Bio import SeqIO

def fix_for_mirtools(records):
    for record in records:
        record.description = ""
        record.id += "_x1"
        yield record

records = SeqIO.parse(sys.stdin, "fasta")
count = SeqIO.write(fix_for_mirtools(records), sys.stdout, "fasta")
# Report on stderr so the message does not end up in the FASTA output
sys.stderr.write("Saved %i records\n" % count)

Then save that script (e.g. as fix_for_mirtools) and mark it as executable with chmod, then call it at the command line:

Code:

./fix_for_mirtools < original.fasta > fixed.fasta

or:

Code:

python fix_for_mirtools < original.fasta > fixed.fasta

Alternatively, someone might suggest a one-line trick using sed.

**Giorgio C** · 10-11-2010, 05:35 AM

I tried what you suggested:

from Bio import SeqIO
>>> input_fasta = "C:\Users\Giorgio Casaburi\Desktop\singleton.fna"
>>> output_fasta = "C:\Users\Giorgio Casaburi\Desktop\singletonfixed.fna"
>>> def fix_for_mirtools (records) :
for record in records:
record.description=""
record.id += "_x1"
yield record
records = SeqIO.parse(singleton.fna, "fasta")
count = SeqIO.write(fix_for_mirtools(records), singletonfixed.fna, "fasta")
print "Saved %i records" % count
>>>

Nothing happens. Is there something else I need to do?

**maubp** · 10-11-2010, 05:41 AM

From your filenames you are using Windows - not Linux.

It looks like you are trying to cut and paste directly at the Python prompt, but the indentation is all wrong. Save the example as a python script file (a plain text file, usually with the extension .py) and run that. You can do this from within the IDLE GUI that comes with Python.

**Giorgio C** · 10-11-2010, 05:46 AM

Yes, I tried it on Windows because that is where I have Python installed. I use VNC to work on a remote Linux PC at the centre, but I don't know whether Python is installed there, and in any case I don't have administrator privileges to install it. I'm just starting out with Python, so it is difficult for me to understand what you are saying. I'll try. Thank you very much for your invaluable help.

**maubp** · 10-11-2010, 05:49 AM

This may help: http://hkn.eecs.berkeley.edu/~dyoo/p...tro/index.html

**Giorgio C** · 10-11-2010, 06:16 AM

Sorry,
I have read everything and tried, but something always goes wrong: syntax errors, etc. I really don't know what to do. (My .fna file is on the desktop.)

**Giorgio C** · 10-11-2010, 06:44 AM

from Bio import SeqIO
input_fasta = "C:\Users\Giorgio Casaburi\Desktop\singleton.fna"
output_fasta = "C:\Users\Giorgio Casaburi\Desktop\singletonfixed.fna"
def fix_for_mirtools(records):
record.description=""
record.id += "_x1"
yield record
records = SeqIO.parse(input_fasta, "fasta")
count = SeqIO.write(fix_for_mirtools(records), output_fasta, "fasta")
print "Saved %i records" % count

(run module).....save....

and then:

IDLE 2.6.5
>>> ================================ RESTART ================================
>>>

Traceback (most recent call last):
File "C:/Python26/singleton", line 9, in <module>
count = SeqIO.write(fix_for_mirtools(records), output_fasta, "fasta")
File "C:\Python26\lib\site-packages\Bio\SeqIO\__init__.py", line 398, in write
count = writer_class(handle).write_file(sequences)
File "C:\Python26\lib\site-packages\Bio\SeqIO\Interfaces.py", line 271, in write_file
count = self.write_records(records)
File "C:\Python26\lib\site-packages\Bio\SeqIO\Interfaces.py", line 255, in write_records
for record in records:
File "C:/Python26/singleton", line 5, in fix_for_mirtools
record.description=""
NameError: global name 'record' is not defined
>>>

I don't know where I went wrong. Can you see my error, please?

**maubp** · 10-11-2010, 06:46 AM

Do you have any programmers in your group/department? That would be the easiest way to get help. Once you have the basic skills it will be easier to get help online.

So you have this - I have added the [ code ] and [ /code ] tags for display:

Originally posted by Giorgio C View Post

Code:

from Bio import SeqIO
input_fasta = "C:\Users\Giorgio Casaburi\Desktop\singleton.fna"
output_fasta = "C:\Users\Giorgio Casaburi\Desktop\singletonfixed.fna"
def fix_for_mirtools(records):
    record.description=""
    record.id += "_x1"
    yield record
records = SeqIO.parse(input_fasta, "fasta")
count = SeqIO.write(fix_for_mirtools(records), output_fasta, "fasta")
print "Saved %i records" % count

You are missing the line 'for record in records:' inside the function, hence the NameError.
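
The traceback points into SeqIO.write rather than at the function definition because the body of a generator function does not run until something iterates over it. A minimal sketch without Biopython reproduces the failure and the fix. The `Record` class here is a stand-in invented for illustration, not Biopython's SeqRecord.

```python
class Record(object):
    """Hypothetical stand-in for a sequence record, for illustration only."""
    def __init__(self, id):
        self.id = id
        self.description = "length=84"


def broken(records):
    # No loop: 'record' is never assigned, so this line raises NameError,
    # but only once the generator is actually iterated.
    record.description = ""
    yield record


def fixed(records):
    for record in records:  # the missing line
        record.description = ""
        record.id += "_x1"
        yield record


reads = [Record("GIK1EHM01A0TLT")]

gen = broken(reads)   # no error yet, nothing has run
try:
    next(gen)         # the body runs here and fails
    error = None
except NameError as exc:
    error = exc

fixed_ids = [r.id for r in fixed(reads)]
print(type(error).__name__)
print(fixed_ids)
```

In the real script the iteration happens inside SeqIO.write, which is why that call shows up in the traceback even though the mistake is in fix_for_mirtools.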

**Giorgio C** · 10-11-2010, 06:53 AM

Yes, we have a bioinformatics group, but it's not very friendly. I am a first-year PhD student, and I wanted to try to do this alone or with online help. I understand how difficult it is for you to explain this kind of thing. Thank you very much for all your help.

**dschika** · 10-11-2010, 06:53 AM

one line trick:

sed 's/ length=.*$/_x1/g' your.fna

**maubp** · 10-11-2010, 06:56 AM

Originally posted by maubp View Post

Alternatively someone might suggest a one line trick using sed

Originally posted by dschika View Post

one line trick:

sed 's/ length=.*$/_x1/g' your.fna

I wondered how long it would take

Giorgio - sed is a command line tool which will probably be available on the Unix/Linux machine you have access to. Getting sed on Windows is more complicated.
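
On Windows, where sed is not readily available, roughly the same substitution can be done with Python's standard `re` module, with no Biopython needed. This is only a sketch, assuming every header contains ` length=` as in the example at the top of the thread; the names `rename_headers` and `rename_file` are placeholders chosen here.

```python
import re

# Same pattern as the sed command: from " length=" to the end of the line
LENGTH_TO_END = re.compile(r" length=.*$")


def rename_headers(lines):
    """Yield lines with ' length=...' replaced by '_x1', like the sed one-liner."""
    for line in lines:
        yield LENGTH_TO_END.sub("_x1", line.rstrip("\r\n")) + "\n"


def rename_file(input_path, output_path):
    """Apply rename_headers to a whole file, writing the result to a new file."""
    with open(input_path) as src, open(output_path, "w") as dst:
        dst.writelines(rename_headers(src))


example = [
    ">GIK1EHM01A0TLT length=84 xy=0302_1103 region=1 run=R_2010_06_08_08_48_07_\n",
    "GTGTTTCTGTGTGGAGGTGTGTCTCTGTGGTGTGTGTGTCTGTGTGGGTGTACGTGTGTC\n",
]
result = list(rename_headers(example))
print("".join(result), end="")
```

To process a real file, call `rename_file` with the input and output paths. For Windows paths, use raw strings (an `r` prefix before the opening quote) so that backslashes are not treated as escape sequences. Sequence lines pass through unchanged because they never contain ` length=`.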

**Giorgio C** · 10-11-2010, 07:05 AM

one line trick:

sed 's/ length=.*$/_x1/g' your.fna

Wonderful trick!!! Thank you very much


mirTools with 454 Data for non coding Rna analysis
