**maubp** · 09-10-2012, 02:11 AM

Are you using the -parse_seqids option? If so, try it without this. I only ever use this if my FASTA file identifiers follow the NCBI naming conventions.

It would be useful to show the command you used to run makeblastdb as that might help us to understand what you are doing.
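Before deciding on -parse_seqids, it can help to look at the identifiers makeblastdb would have to parse. The following is a minimal sketch, not part of BLAST+: `fasta_ids` and `duplicate_ids` are hypothetical helper names, and it assumes the identifier is the first word after `>` on each header line. Repeated identifiers are one thing worth ruling out when -parse_seqids fails.

```shell
# fasta_ids: print the identifier (first word after ">") of every FASTA record.
# Hypothetical helper, not a BLAST+ command.
fasta_ids() {
    awk '/^>/ { print substr($1, 2) }' "$1"
}

# duplicate_ids: print each identifier that occurs more than once in the file.
duplicate_ids() {
    fasta_ids "$1" | sort | uniq -d
}

# Example usage with the file name from this thread:
# fasta_ids CrFP.fasta | head
# duplicate_ids CrFP.fasta
```

If `duplicate_ids` prints nothing, every identifier in the file is unique.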

**Tsuyoshi** · 09-10-2012, 02:22 AM

Dear Maubp,
Thanks for your reply.
Yes, I used -parse_seqids. Following your suggestion, I ran it without -parse_seqids, and a different error showed up:
*******************************************************************
Error: (CArgException::eNoArg) Argument "dbtype". Mandatory value is missing: `String, `nucl', `prot''
Error: (CArgException::eNoArg) Application's initialization failed
*****************************************************************

The command I used was
makeblastdb -in CrFP.fasta -out CrFP

Thanks

**maubp** · 09-10-2012, 02:30 AM

That error is clear, isn't it? You have to tell makeblastdb whether your FASTA file contains protein or nucleotide sequences, i.e. either:

Code:

makeblastdb -in CrFP.fasta -out CrFP -dbtype nucl

or,

Code:

makeblastdb -in CrFP.fasta -out CrFP -dbtype prot
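If it is not obvious which of the two applies, the residues themselves usually give it away. The following is a rough sketch, not a BLAST+ feature: `guess_dbtype` is a hypothetical helper, and the 90% cutoff for "mostly A/C/G/T/U/N" is an assumed heuristic, so treat its answer as a hint only.

```shell
# guess_dbtype: guess the makeblastdb -dbtype value from the residues in a FASTA file.
# Hypothetical helper; the 0.9 threshold below is an assumption, not an NCBI rule.
guess_dbtype() {
    awk '
        /^>/ { next }                           # skip header lines
        {
            seq = toupper($0)
            gsub(/[^A-Z]/, "", seq)             # keep letters only
            total += length(seq)
            nucl += gsub(/[ACGTUN]/, "", seq)   # count nucleotide-like letters
        }
        END {
            if (total == 0) { print "unknown"; exit }
            kind = (nucl / total >= 0.9) ? "nucl" : "prot"
            print kind
        }
    ' "$1"
}

# Example usage with the file name from this thread:
# makeblastdb -in CrFP.fasta -out CrFP -dbtype "$(guess_dbtype CrFP.fasta)"
```

For a file with no sequence lines the function prints `unknown`, which makeblastdb would reject, so check the output before passing it on.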

**Tsuyoshi** · 09-10-2012, 02:38 AM

YES!
What a stupid mistake I made. It succeeded now!

Thank you!

**maubp** · 09-10-2012, 02:41 AM

Oh good. Understanding the NCBI BLAST+ error messages gets easier with practice

**Tsuyoshi** · 09-10-2012, 02:45 AM

Yep!

I couldn't agree with you more. Many thanks!

**Tsuyoshi** · 09-10-2012, 03:02 AM

Hi Maubp,
I still have a question about the protein IDs, though. No database seems to name proteins this way. For example, several of them are:

C_1620015|156900
C_10830001|152917
C_2020008|159281
C_510029|166481
C_510029|166481
C_510029|166481
C_510029|166481

I do not think they are NCBI accession numbers for Chlamydomonas, but I want to find their real NCBI accession numbers. Do you have any idea how to do that?

**maubp** · 09-10-2012, 03:09 AM

That's a different question - the only way your sequences would have real NCBI accession numbers would be if they have already been submitted to one of the databases. I would explore the NCBI databases for this using Entrez search term "chlamydomonas[orgn]" and see if anything matches your dataset:

http://www.ncbi.nlm.nih.gov/sites/gquery?term=chlamydomonas[orgn]

(square brackets in the URL confuse the forum software)

Or you could try BLAST'ing some of your sequences against the NR database to see if any give perfect matches?
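For the BLAST route, a perfect match can be picked out of tabular output mechanically. The following is a minimal sketch: `perfect_hits` and the file name `hits.tsv` are hypothetical, and it assumes the search was run with the custom column set `-outfmt "6 qseqid sseqid pident length qlen"` so that each row carries the query length.

```shell
# perfect_hits: print "query<TAB>subject" for hits that are 100% identical
# over the full length of the query.
# Expects tab-separated columns: qseqid sseqid pident length qlen, e.g. from
#   blastp -query CrFP.fasta -db nr -outfmt "6 qseqid sseqid pident length qlen" -out hits.tsv
# (hits.tsv is a hypothetical name; searching nr needs a local copy or -remote).
perfect_hits() {
    awk -F '\t' '$3 == 100 && $4 == $5 { print $1 "\t" $2 }' "$1"
}

# Example usage:
# perfect_hits hits.tsv > id_to_accession.tsv
```

The resulting two-column table pairs each private identifier with the subject identifier of its perfect match, which is a starting point for looking up the NCBI records.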

**Tsuyoshi** · 09-10-2012, 03:12 AM

The sequences themselves perfectly match the submitted Chlamydomonas data. I just have no idea what kind of IDs the authors used.

**maubp** · 09-10-2012, 03:14 AM

If you can work out how to get the data from the NCBI with their accessions, that might be simpler than working with the original author's private identifiers.

**Tsuyoshi** · 09-10-2012, 03:22 AM

That's right.
Anyway, I will try to extract the accession numbers from NCBI.
Thank you very much, Maubp!

Protein ID that blast could not identify
