Hi everybody
I'm having trouble setting up local BLAST 2.2.26 on Linux (CentOS 7).
Please help me. I'm a beginner in bioinformatics.
I have downloaded local BLAST 2.2.26, the Homo sapiens genome, and RefSeq data from the sites below.
Genome data from Ensembl: ftp://ftp.ensembl.org/pub/release-75...o_sapiens/dna/
RefSeq data from NCBI: ftp://ftp.ncbi.nlm.nih.gov/genomes/H..._level.gff3.gz
Now I have two problems.
1. I don't know how to check that the genome data and the RefSeq data are a compatible combination.
2. I don't know how to make a BLAST database from these two data sets.
My understanding is that one first combines the two files into one using the 'cat' command and then runs the 'formatdb' command:
>formatdb -i input-file-name -n database-name -p F
but I got the following warning:
[formatdb] WARNING: Sequence number 1 (lcl|1_input-file-name), 8139970 illegal characters were removed:
54 Es, 114 Fs, 42 Is, 117 Js, 2 Ls, 114 Qs
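For anyone trying to reproduce this, here is a minimal sketch of how the input files could be checked before running 'cat' and 'formatdb'. It is only a guess that the warning comes from the .gff3 file being tab-separated annotation text rather than FASTA sequence. The function name guess_format is hypothetical and not part of BLAST, and the files are assumed to be already decompressed.

```shell
# Hypothetical helper (not part of BLAST): guess whether a decompressed
# file is FASTA or GFF3 by looking at its first non-blank line.
guess_format() {
    # awk prints the first line that has at least one field, then stops.
    first_line=$(awk 'NF { print; exit }' "$1")
    case "$first_line" in
        '>'*)             echo "fasta" ;;    # FASTA records start with '>'
        '##gff-version'*) echo "gff3" ;;     # GFF3 files start with this pragma
        *)                echo "unknown" ;;
    esac
}
```

Calling guess_format on each downloaded file before concatenating them would show whether both really contain sequences. If one is reported as gff3, it probably should not be given to formatdb as sequence input.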
I would appreciate any reply to the questions above.
Best regards