Hi,
I downloaded a proteome in FASTA format containing hundreds of proteins (http://labs.umassmed.edu/chlamyfp/in...p?content=help). I want to BLAST my data against these proteins using BLAST+, but when I ran makeblastdb on the proteome dataset, the following error occurred:
*******************************************************************
Error: NCBI C++ Exception:
"/am/ncbiapdata/release/blast/src/2.2.26/IntelMAC-universal/c++/GCC401-ReleaseMT--IntelMAC-universal/../src/objects/seq/../seqloc/Seq_id.cpp", line 1679: Error: ncbi::objects::CSeq_id::x_Init() - Unsupported ID type C_1150005
*******************************************************************
I think something must be wrong with the proteome data, because BLAST+ worked fine when I used data downloaded directly from NCBI.
So I opened the proteome file in TextEdit. The header of each sequence looks like this, for example:
*****************************************************************
>C_680011|168600 FAP45, Flagellar Associated Protein Weakly Similar to Nasopharyngeal Epithelium Specific Protein 1
MPQTPPRSGGYRSGKQSYVDESLFGGSKRTGAAQVETLDSLKLTAPTRTISPKDRDVVTLTKGDLTRMLKASPIMTAEDVAAAKREAEAKREQLQAVSKA
RKEKMLKLEEEAKKQAPPTETEILQRQLNDATRSRATHMMLEQKDPVKHMNQMMLYSKCVTIRDAQIEEKKQMLAEEEEEQRRLDLMMEIERVKALEQYE
ARERQRVEERRKGAAVLSEQIKERERERIRQEELRDQERLQMLREIERLKEEEMQAQIEKKIQAKQLMEEVAAANSEQIKRKEGMKVREKEEDLRIADYI
LQKEMREQ
*****************************************************************
Here "C_680011|168600" should be the protein ID, I think, but searching for it at NCBI returns nothing. What kind of ID is this, and what should I do to make BLAST+ recognise it?
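One workaround that might be worth trying, not verified here: the message "Unsupported ID type C_1150005" suggests the parser is reading the text before the "|" as an NCBI-style ID type. Assuming that is the cause, replacing the "|" in the ID part of each header with "_" may let makeblastdb accept the file. Below is a minimal Python sketch of that idea; the function names and the file names in the usage comment are made up for illustration.

```python
def sanitize_header(line: str) -> str:
    """Replace '|' with '_' in the ID token of a FASTA header line.

    Sequence lines (anything not starting with '>') are returned unchanged.
    Assumption: the '|' in the ID is what the ID parser rejects.
    """
    if not line.startswith(">"):
        return line
    # The ID is everything up to the first space; the rest is the description.
    ident, sep, description = line[1:].partition(" ")
    return ">" + ident.replace("|", "_") + sep + description


def rewrite_fasta(src_path: str, dst_path: str) -> None:
    """Copy a FASTA file, sanitizing each header line on the way."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(sanitize_header(line))


# Hypothetical usage (file names are placeholders):
# rewrite_fasta("proteome.fasta", "proteome_clean.fasta")
```

The cleaned file could then be passed to makeblastdb with -dbtype prot. Only the ID token is changed; the description text after the first space is left as it is.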
Thanks!