Hi everyone,
I'm new to the forum and a newbie in bioinformatics.
I'm trying to BLAST some contigs to use the output in MEGAN. I need the top 10 hits for every sequence to feed to MEGAN. I installed the BLAST+ suite (on a Mac) and downloaded the nr database in FASTA format from the NCBI FTP site.
I then formatted the database using:
Code:
makeblastdb -in nr -dbtype nucl -out nr.db
The output looks OK to me:
Code:
Building a new DB, current time: 10/18/2012 19:12:48
New DB name: nr.db
New DB title: nr
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 21062489 sequences in 1624.48 seconds.
Now I'm trying to run blastn against my local database, but I get ***** No hits found ***** for all my contigs. I've tried a few of them on the BLAST website and they get many good hits.
What am I doing wrong?
These are a few contigs from my .fasta file:
Code:
>Assembly_Contig_1 GGGCGGTCGCCTCCGTAAAAAGTAACGGGAGGACGTTACAAAGTTCGGBTCAGGTGGGTTGGAAWHCCACCGTAGAGTATAATGGCATAAGCCGGACTGACTGTGAGACATACAAGTCGAGCAGAGTCGAAAGACGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCATCCTGGGGCTGGAGCAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAAGCGGTACGCGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCCTATCTTCCGTGGGCGTAGGAACGTTGARGAGAGCTGACCCTAGTACGAGAGGACCGGGTTGGACGTGCCACTGGTGCACCAGTTGTTCTGCCAAGAGCATCGCTGGGTAGCTACGCACGGATGAGATAACCGCTGAAAGCATCTAAGCGGGAAGCCAACTCYGAGATGAACGTTCCCTGAAGTACGCTTGAAGACTACAAGCTTGAKASKMKGSWKGTTGTACCGCACGAGTAATCT >Assembly_Contig_2 CTCCCCGTCGATGTGAGCTCTTGGGGGAGATCAGCCTGTTATCCCCGTGCACCTTTACTATAGCTTGACACTGCAATTGGGATATWYWTGTGCAGGATAGGTGGGARSCWTTGATTCATAGTCGCYAGATTATGATGAGSYATCCTTGAGATACCACCCTTATATATTCTGATTGCTAACTTGCKMCAGTTATCCTGKSSGAGGACAATGTCTGGTGGGTAGTTTGACTGGGGCGGTCGCCTCCTAAAAAGTAACGGAGGCTTACAAAGGTTGGYTCAGATGGGTTGGAAATCCATCGYAGAGTATAATGGTACAARCCAGCTTAACTGYGAGACRTACAKGTCGARCAGAGACGAAAGTCGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCATCCTGGGGCTGAAGCAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAAGCGGTACGCGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCCTATCTGCCGTGGGCGTTGGATGATTGAGGAGAGTTGCCCCTAGTACGAGAGGACCGGGGTGAACGAACCACTRGTGCACCARTTKTBSTGCCAAGRGCATMGSTKGGKWRGCTACGTTCGGATGG >Assembly_Contig_3 
CTACGGTGGATTTCCAACCCACCTGAGCCGAACTTTGTAAGCCTCCGTTACTTTTTAGGAGGCTTACAAAGGTTGGCTCATATCGGTTGGAAAYCSATMGCAGAGTATAATGGTACAARCCAGCTTAACTGCGAGACRTACATGTCGAGCAGAGACGAAAGTCGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCRCATCCTGGGGCTGAAGCAGGTCCCAAGGGTAYGGCTGTTCGCCRTTTAAAGYGGTACGCGAGCTGGGTTCAGAACGTCGTGAGACAGTTCGGTCCCTATCTGCCGTGGGCGTTGGATGATTGAGGAGAGTTGCCCCTAGTACGAGAGGACCGGGGTGAACGAACCACTAGTGCACCAATTGTTCTGCCAAGAGCATAGTTGGGTAGCTACGTTCGGATGWGATAACCGCTGAAGGCATCTAAGCGGGAAGCCAACTCCAAGATTAATCATCCCTGAAGATCCCAAGAAGACTACTTGGTTGATAGGCTGGGTGTGTAAGCGATGTAAGTCGTTTAGCTGACCAGTACTAATAGATCGTTTRKHTWWAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA >Assembly_Contig_4 CATATATATCCCAATTGCAGTGTCAAGCTGTAGTGGAGGTGAAAATTCCTCCTACCCGCGGAAGACGGAAAGACCCCGTGCACCTTTACTATAGCTTGACACTGCTGTTGGKAWWTTCATGTGCAGGATAGGTGGGAGCCATTGATTCATRGWCGCCAGWTTATGATGAGGCATCCYTKRRRWWMCACCCTTGAATATTCTGATAGCTAACTCCGTACAATTATCTTGTGCGAGGACAATGTCTGGTGGGTAGTTTGACTGGGGCGGTCGCCTCCTAAAAAGTAACGGAGGCTTACAAAGTTCGGCTCAGGTGGGTTGGAAATCCACCGTAGAGTATAATGGCATAAGCCGGACTGACTGTGAGACATACAWGTCGAGCAGAGTCGAAAGACGGTCATAGTGATCCGGTGGTTCTGTGTGGAAGGGCCATCGCTCAAAGGATAAAAGGTACGCCGGGGATAACAGGCTGATCTCCCCCAAGAGCTCACATCGACGGGGAGGTTTGGCACCTCGATGTCGGCTCATCGCATCCTGGGGCTGGAGCAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAASSGGDVSGSSRRSYKGKTYHVRACGTCGTGAGACAGTTCGGTCCCTTA
And this is the command I'm using:
Code:
blastn -db nr.db -query contigs.fasta -out outblastn.txt -export_search_strategy blastn_parameters.txt -num_threads 4
At the moment my output is:
Code:
BLASTN 2.2.27+
Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), "A greedy algorithm for aligning DNA sequences", J Comput Biol 2000; 7(1-2):203-14.
Database: nr
21,062,489 sequences; 7,218,481,314 total letters
Query= Assembly_Contig_1
Length=603
***** No hits found *****
Lambda K H
1.35 0.627 1.14
Gapped
Lambda K H
1.28 0.460 0.850
Effective search space used: 3755491256660
Query= Assembly_Contig_2
Length=717
***** No hits found *****
and so on.....
Can someone help me please?
Don