Hi everybody,
I wrote an automated FASTQ-based 16S rRNA searcher: you give it your FASTQ file and it tells you which 16S sequence matches it best. Although most people should know which genome they sequenced, I enjoy having my computer tell me what I did ;-) It may also help you spot contaminants. The code is at https://github.com/beaumontlab/antonie
However, the Greengenes database (at http://greengenes.secondgenome.com/downloads ) only gives me a GenBank accession number, like this:
Best current guess: Genbank GU198115.1
But I'd like to show my user "Pseudomonas fluorescens strain LMG 7207 16S ribosomal RNA gene, partial sequence."
I have found several e-utils queries that work, like http://eutils.ncbi.nlm.nih.gov/entre...ta&retmode=xml
But these often return the entire genome sequence, which I really don't need! Is there a way to send a limited query that only returns TSeq_defline or TSeq_orgname?
Or alternatively, is there a database of accession numbers/names that I can download somewhere?
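To make clear which part of the response I'm after, here is a minimal sketch of pulling just those two fields out of a TSeq-style XML document. The sample XML is hand-written for illustration (the TSeqSet/TSeq wrapper layout is my assumption, and the sequence is a placeholder), and extract_names is a hypothetical helper, not part of antonie:

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT a real server response. The TSeqSet/TSeq wrapper
# layout is an assumption; the sequence is a short placeholder.
SAMPLE_TSEQ_XML = """<TSeqSet>
  <TSeq>
    <TSeq_orgname>Pseudomonas fluorescens</TSeq_orgname>
    <TSeq_defline>Pseudomonas fluorescens strain LMG 7207 16S ribosomal RNA gene, partial sequence</TSeq_defline>
    <TSeq_sequence>ACGT</TSeq_sequence>
  </TSeq>
</TSeqSet>
"""


def extract_names(xml_text):
    """Return a list of (orgname, defline) tuples, one per TSeq record.

    Hypothetical helper: the sequence itself is ignored, and a missing
    element yields None for that field.
    """
    root = ET.fromstring(xml_text)
    records = []
    for tseq in root.iter("TSeq"):
        orgname = tseq.findtext("TSeq_orgname")
        defline = tseq.findtext("TSeq_defline")
        records.append((orgname, defline))
    return records


if __name__ == "__main__":
    for orgname, defline in extract_names(SAMPLE_TSEQ_XML):
        print(orgname, "|", defline)
```

This still requires downloading the full record first, which is exactly what I would like to avoid.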
Thanks!