**GenoMax** · 08-10-2014, 05:16 PM

When you run BLAST at NCBI, the results page gives you the option of saving the alignments as an XML file. See this post for a screenshot: http://seqanswers.com/forums/showpos...92&postcount=7

**PurplePancake** · 08-10-2014, 05:26 PM

Thank you!

Thanks so much for the reply!

I'm actually unsure how to put my whole list into BLAST NCBI as well!

If you know of how to do that, I will hopefully be able to reach the results page you kindly provided the screenshot for! Thank you!!!!

**GenoMax** · 08-10-2014, 05:36 PM

Those numbers you posted in this thread (http://seqanswers.com/forums/showthread.php?t=45711) are not going to work at NCBI. You will need to parse out the protein sequences from this file http://marinegenomics.oist.jp/genome...0.1.prot.fa.gz.

If the IDs you have match the ones in this protein FASTA file, then you can use the faSomeRecords program from the Kent utilities to extract the protein sequences you are interested in (http://seqanswers.com/forums/showpos...0&postcount=13).

You can then use them for the Blast search at NCBI.

**PurplePancake** · 08-10-2014, 05:56 PM

Thanks so much for your help!

I am not sure how to parse out the proteins, though. I need to do this pretty fast (unfortunately). Is this a simple process? Any scripts online that I could borrow?

This would make things much easier, because then I could just work with the proteins in BLAST.... Thanks again...

**GenoMax** · 08-10-2014, 06:04 PM

I posted the procedure in #4 above. Here is a step-wise version. You will need access to a Linux machine (or OS X) for this to work, though.

1. Download the protein sequence file and then gunzip (uncompress) it.

2. Have your IDs of interest in a text file.

3. Run this program http://hgdownload.soe.ucsc.edu/admin.../faSomeRecords (this is the Linux version) like below.

Code:

$   faSomeRecords protein_file.fa yourlistFile output_with_proteins_of_interest.fa
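A rough sketch of these three steps on toy data, for anyone who wants to see the moving parts before touching the real file. The file names, IDs, and sequences here are made up, and awk is used as a stand-in for faSomeRecords (it is not the Kent tool), under the assumption that an ID is the first word of a header line after the ">".

```shell
# Toy stand-in for the downloaded file (hypothetical IDs and sequences)
cat > toy_prot.fa <<'EOF'
>prot_0001 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>prot_0002 another protein
MSDNELLK
>prot_0003
MALWMRLLPL
EOF
gzip -f toy_prot.fa            # now toy_prot.fa.gz, like a compressed download

# Step 1: uncompress the protein file
gunzip -f toy_prot.fa.gz       # back to toy_prot.fa

# Step 2: IDs of interest, one per line, plain text, no ">" prefix
printf 'prot_0001\nprot_0003\n' > toy_ids.txt

# Step 3 (awk stand-in for: faSomeRecords toy_prot.fa toy_ids.txt toy_out.fa)
# The first pass (NR==FNR) loads the IDs; the second pass prints every record
# whose first header word, minus the ">", is in the list.
awk 'NR==FNR { want[$1]; next }
     /^>/    { keep = (substr($1, 2) in want) }
     keep' toy_ids.txt toy_prot.fa > toy_out.fa

# Number of records extracted
grep -c '^>' toy_out.fa        # prints 2
```

If the count is 0 on real data, the IDs in the list most likely do not match the header names in the FASTA file.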

**PurplePancake** · 08-10-2014, 06:05 PM

Thanks. Does MacOSX count as Linux?

**GenoMax** · 08-10-2014, 06:07 PM

> Originally posted by PurplePancake:
> Thanks. Does MacOSX count as Linux?

OS X is a certified variant of Unix.

In that case, use the Mac version of the faSomeRecords program in step 3 above: http://hgdownload.soe.ucsc.edu/admin.../faSomeRecords

**PurplePancake** · 08-10-2014, 06:12 PM

Thanks so much! I did as you said, but I get the error:

"-bash: faSomeRecords: command not found"

I am *really bad* with Linux, and especially installation. I hope to take a course in two semesters so I can do this... because this happens a lot, and I never figure it out...

**PurplePancake** · 08-10-2014, 06:18 PM

Basically, all I did was download the faSomeRecords file and move the other two files into the Downloads folder. Then I typed:

faSomeRecords prot.fa geneList.rtf outProteins.fa

I know there are additional steps to do to "prepare the command"? But I always mess things up, and really freeze when it comes to installation....
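For context on the "command not found" message: the shell looks up a bare command name only in the directories listed in $PATH, and a download folder is normally not one of them. A downloaded program also usually needs its execute bit set. The sketch below shows both points with a made-up script called my_fa_tool standing in for faSomeRecords. With the real program, the equivalent would presumably be `chmod +x faSomeRecords` followed by `./faSomeRecords ...`, run from the folder that holds it.

```shell
# Hypothetical stand-in for a freshly downloaded program such as faSomeRecords
printf '#!/bin/sh\necho "tool ran"\n' > my_fa_tool

# Downloaded files usually lack the execute bit, so add it
chmod +x my_fa_tool

# A bare name is searched for only in the directories listed in $PATH.
# The current directory is normally not among them, hence "command not found".
my_fa_tool 2>/dev/null || echo "bare name failed with status $?"

# An explicit path (./ means "this directory") bypasses the $PATH search
./my_fa_tool
```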

**PurplePancake** · 08-10-2014, 06:41 PM

Hello,

I don't feel that I really changed much, but for some reason, there is no error any more. I just did this instead:

./faSomeRecords prot.fa geneList.odt outProteins.fa

There is no error, but there is also nothing in outProteins.fa.

Does this mean there is not enough information to identify the proteins and genes in geneList?

Thanks!!
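One possible cause of an empty output file, independent of the biology: as far as I know, faSomeRecords expects the list to be plain text with one ID per line. An .rtf or .odt file carries formatting markup around the text (an .odt is a zipped bundle), so the lines in it would not be bare IDs. The sketch below illustrates this with a made-up ID list and a minimal hand-written RTF wrapper. The IDs and file names are hypothetical.

```shell
# Hypothetical ID list saved two ways: plain text vs. a minimal RTF wrapper
printf 'prot_0001\nprot_0003\n' > ids_plain.txt
printf '{\\rtf1\\ansi\n\\f0 prot_0001\\par\nprot_0003\\par\n}\n' > ids_demo.rtf

# Count lines that consist of exactly one ID and nothing else (-x = whole line)
plain_hits=$(grep -c -x 'prot_[0-9]*' ids_plain.txt)
rtf_hits=$(grep -c -x 'prot_[0-9]*' ids_demo.rtf || true)

echo "clean ID lines in plain text: $plain_hits"
echo "clean ID lines in RTF:        $rtf_hits"
```

Saving the list as plain text (for example via "Make Plain Text" in TextEdit on OS X, or any editor's plain .txt export) should avoid this particular problem, though it will not help if the IDs themselves are absent from the FASTA file.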

**PurplePancake** · 08-10-2014, 06:50 PM

Also, is there any reason why nothing would match the proteins? I had asked similar questions elsewhere, and was told that since coral is so old, and branched off before mammals, some of this RNA-seq analysis can be difficult or impossible?

Okay, I will stop asking so many questions now

**GenoMax** · 08-11-2014, 04:39 AM

The IDs you posted in the other thread do not seem to match the protein IDs for either of the sets here: http://marinegenomics.oist.jp/genome...s?project_id=3 What file did you get your IDs from?
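A quick way to check this kind of mismatch yourself is to compare the ID list against the header names in the FASTA file. The sketch below does that on toy data with standard tools. All file names and IDs are invented, and it assumes the ID is the first word of each header line.

```shell
# Toy FASTA and ID list (hypothetical names, for illustration only)
cat > chk_prot.fa <<'EOF'
>prot_0001 hypothetical protein
MKTAYIAKQR
>prot_0002 another protein
MSDNELLK
EOF
printf 'prot_0001\ngene_42\n' > chk_ids.txt

# Header IDs: drop the ">" and keep only the first word of each header
grep '^>' chk_prot.fa | cut -c2- | awk '{print $1}' | LC_ALL=C sort -u > chk_fasta_ids.txt
LC_ALL=C sort -u chk_ids.txt > chk_ids.sorted.txt

# IDs found in both files
LC_ALL=C comm -12 chk_ids.sorted.txt chk_fasta_ids.txt > chk_shared.txt

# IDs from the list that are missing from the FASTA headers
LC_ALL=C comm -23 chk_ids.sorted.txt chk_fasta_ids.txt > chk_missing.txt

echo "shared:";  cat chk_shared.txt
echo "missing:"; cat chk_missing.txt
```

An empty "shared" list would suggest the IDs come from a different file or naming scheme than the protein FASTA.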


Generate .xml file from BLAST (Blast2Go)
