-
As Peter suggests, it may be simpler to run the BLAST search again and ask for tabular output. Is that not feasible?
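For reference, here is a minimal sketch of what re-running the search with tabular output could look like. With the legacy blastall program, -m 8 requests tabular output (-m 9 adds comment lines); in BLAST+ the equivalent is -outfmt 6. The helper function, the database name "nr", and the file names below are placeholders for illustration, not values confirmed in this thread.

```python
import shlex


def legacy_tabular_command(program, database, query, output, with_comments=False):
    """Build a legacy blastall command line that writes tabular output.

    -m 8 is plain tabular, -m 9 is tabular with comment lines.
    """
    view = "9" if with_comments else "8"
    return ["blastall", "-p", program, "-d", database,
            "-i", query, "-m", view, "-o", output]


# Placeholder names: substitute your own database and query file.
cmd = legacy_tabular_command("blastx", "nr", "query.fasta", "nitrogen.txt")
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be pasted into a terminal, or the list can be passed to subprocess.run on a machine where blastall is installed.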
-
Strange - this ought to have given XML output with the legacy NCBI BLAST suite:
Code:
blastall ... -m 7 ...
-
Oh, when I ran BLAST I used -m 7, and it says that is XML format. Is there any software to convert this to tabular format? I used the old BLAST, not BLAST+.
-
The one I posted earlier is an extract from MEGAN.
I have the original file, but it didn't work either. Here it is:
BLASTX 2.2.20 [Feb-08-2009]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= HZKDEPY02G6265
(375 letters)
Database: /nfs/scratch/sdpapet/db/mpidatabase/NCBI05252013nr
25,805,290 sequences; 8,915,431,356 total letters
Score E
Sequences producing significant alignments: (bits) Value
ref|YP_003474070.1| glutamine synthetase [Thermocrinis albus DSM... 220 1e-55
ref|YP_003433234.1| glutamine synthetase [Hydrogenobacter thermo... 219 3e-55
ref|YP_002120801.1| glutamine synthetase, type I [Hydrogenobacul... 204 9e-51
ref|YP_007499517.1| glutamine synthetase, type I [Hydrogenobacul... 202 6e-50
ref|NP_213074.1| glutamine synthetase [Aquifex aeolicus VF5] >gi... 199 5e-49
ref|WP_008286412.1| glutamine synthetase [Hydrogenivirga sp. 128... 196 3e-48
ref|YP_002731353.1| glutamine synthetase, type I [Persephonella ... 187 1e-45
ref|YP_002728028.1| glutamine synthetase [Sulfurihydrogenibium a... 187 1e-45
ref|YP_001931244.1| glutamine synthetase, type I [Sulfurihydroge... 183 2e-44
ref|WP_007545780.1| glutamine synthetase, type I [Sulfurihydroge... 183 3e-44
>ref|YP_003474070.1| glutamine synthetase [Thermocrinis albus DSM 14484]
ref|WP_012992349.1| glutamine synthetase [Thermocrinis albus]
gb|ADC89943.1| glutamine synthetase, type I [Thermocrinis albus DSM 14484]
Length = 469
Score = 220 bits (561), Expect = 1e-55
Identities = 104/110 (94%), Positives = 107/110 (97%)
Frame = +2
Query: 26 KHGPALTAFTNPTINSYHRLVPGFEAPVRLAYSARNRSAAIRIPTYSQSPKAKRIEIRFP 205
KHGPALTAFTNPT+NSYHRLVPGFEAPVRLAYSARNRSAAIRIPTYSQSPKAKRIEIRFP
Sbjct: 303 KHGPALTAFTNPTVNSYHRLVPGFEAPVRLAYSARNRSAAIRIPTYSQSPKAKRIEIRFP 362
Query: 206 DPTCNPYLAFSAILMAAIDGVENKIHPGEPFDKDIYSLPPEELKDIPNCP 355
DPTCNPYLAFSAILMAAIDG+EN+IHPGEP DKDIYSLPPEELKDIP P
Sbjct: 363 DPTCNPYLAFSAILMAAIDGIENRIHPGEPLDKDIYSLPPEELKDIPQLP 412
-
BLAST XML output looks like this, and is designed for a computer to read:
Code:
<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">
<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_version>BLASTP 2.2.24+</BlastOutput_version>
  <BlastOutput_reference>Stephen F. Altschul, Thomas L. Madden, Alejandro A. Sch&auml;ffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.</BlastOutput_reference>
  <BlastOutput_db>nr</BlastOutput_db>
  <BlastOutput_query-ID>Query_1</BlastOutput_query-ID>
  <BlastOutput_query-def>Sample</BlastOutput_query-def>
  <BlastOutput_query-len>516</BlastOutput_query-len>
  ...
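A quick way to tell the formats apart before running a converter is to look at the start of the file. The sketch below is only an illustration: the function name is made up, and the check is a simple heuristic based on the XML header shown above, not a full validation.

```python
def looks_like_blast_xml(text):
    """Heuristic: True if the text starts like NCBI BLAST XML output."""
    head = text.lstrip()[:1000]
    return head.startswith("<?xml") and "<BlastOutput" in head


# First lines of the three kinds of file seen in this thread.
xml_head = ('<?xml version="1.0"?>\n'
            '<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" '
            '"NCBI_BlastOutput.dtd">\n<BlastOutput>')
text_head = "BLASTX 2.2.20 [Feb-08-2009]\nQuery= HZKDEPY02G6265"
megan_head = "BLAST-like file generated by MEGAN\nQuery=HZKDEPY02FP29T"

for name, head in [("xml", xml_head), ("text", text_head), ("megan", megan_head)]:
    print(name, looks_like_blast_xml(head))
```

Only the first sample is reported as XML; the plain-text BLASTX report and the MEGAN extract are pairwise text output, which an XML converter cannot read.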
-
BLAST-like file generated by MEGAN
Query=HZKDEPY02FP29T
>ref|YP_003433234.1| glutamine synthetase [Hydrogenobacter thermophilus TK-6]
ref|YP_005512249.1| glutamine synthetase, type I [Hydrogenobacter thermophilus TK-6]
ref|WP_012964213.1| glutamine synthetase [Hydrogenobacter thermophilus]
dbj|BAI70033.1| glutamine synthetase [Hydrogenobacter thermophilus TK-6]
gb|ADO45956.1| glutamine synthetase, type I [Hydrogenobacter thermophilus TK-6]
Length = 469
Score = 80.9 bits (198), Expect = 1e-13
Identities = 35/39 (89%), Positives = 39/39 (100%)
Frame = -2
Query: 161 PLTRERYGRDTRYVAQKAEQYLRQTGIGDTAYFGPEAEF 45
P+TRERYGRDTRY+AQKAEQYL+QTGIGDTAY+GPEAEF
Sbjct: 98 PITRERYGRDTRYIAQKAEQYLKQTGIGDTAYYGPEAEF 136
>ref|YP_003474070.1| glutamine synthetase [Thermocrinis albus DSM 14484]
ref|WP_012992349.1| glutamine synthetase [Thermocrinis albus]
gb|ADC89943.1| glutamine synthetase, type I [Thermocrinis albus DSM 14484]
Length = 469
Score = 80.9 bits (198), Expect = 1e-13
Identities = 35/39 (89%), Positives = 39/39 (100%)
Frame = -2
Query: 161 PLTRERYGRDTRYVAQKAEQYLRQTGIGDTAYFGPEAEF 45
P+TRERYGRDTRY+AQKAEQYL+QTGIGDTAY+GPEAEF
Sbjct: 98 PITRERYGRDTRYIAQKAEQYLKQTGIGDTAYYGPEAEF 136
>ref|YP_007499517.1| glutamine synthetase, type I [Hydrogenobaculum sp. HO]
ref|YP_007646578.1| glutamine synthetase, type I [Hydrogenobaculum sp. SN]
ref|WP_015418780.1| glutamine synthetase, type I [Hydrogenobaculum sp. HO]
gb|AEF18544.1| glutamine synthetase, type I [Hydrogenobaculum sp. 3684]
gb|AEG45832.1| glutamine synthetase, type I [Hydrogenobaculum sp. SHO]
gb|AGG14474.1| glutamine synthetase, type I [Hydrogenobaculum sp. HO]
gb|AGH92778.1| glutamine synthetase, type I [Hydrogenobaculum sp. SN]
Length = 469
Score = 80.1 bits (196), Expect = 2e-13
Identities = 35/39 (89%), Positives = 38/39 (97%)
Frame = -2
Query: 161 PLTRERYGRDTRYVAQKAEQYLRQTGIGDTAYFGPEAEF 45
P+TRERYGRDTRY+AQKAEQYL+QTGIGD AYFGPEAEF
Sbjct: 97 PITRERYGRDTRYIAQKAEQYLKQTGIGDVAYFGPEAEF 135
-
In this case, either is fine:
Code:
python blastxml_to_tabular.py -o nitrogen.txt nitrogen.xml
Code:
python blastxml_to_tabular.py nitrogen.xml -o nitrogen.txt
What does your file "nitrogen.xml" look like? Can you share it via http://gist.github.com or perhaps show the first ten lines here inside [ code ] and [ /code ] tags?
(The code tags are available via the forum's advanced editor view using the "#" icon.)
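To illustrate why the argument order does not matter while an unexpected -i is rejected, here is a small sketch using Python's argparse module. This is an assumption made for illustration only: example_converter.py is a made-up name, and blastxml_to_tabular.py may parse its arguments differently.

```python
import argparse


def build_parser():
    """Hypothetical parser: one positional input file plus a -o option."""
    parser = argparse.ArgumentParser(prog="example_converter.py")
    parser.add_argument("xml_file", help="BLAST XML input file")
    parser.add_argument("-o", "--output", required=True, help="tabular output file")
    return parser


parser = build_parser()
first = parser.parse_args(["-o", "nitrogen.txt", "nitrogen.xml"])
second = parser.parse_args(["nitrogen.xml", "-o", "nitrogen.txt"])
print(first == second)  # both orders give the same result

try:
    # -i is not a defined option, so argparse reports an error and exits.
    parser.parse_args(["-i", "nitrogen.xml", "-o", "nitrogen.txt"])
except SystemExit:
    print("-i was rejected")
```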
-
Hi, I want to convert XML to text.
Should it be python blastxml_to_tabular.py nitrogen.xml -o nitrogen.txt?
Do I need to type "-i" in front of "nitrogen.xml"?
The error is: invalid data format
-
What errors? If you copy/paste the message here it would be far easier to guide you - but I guess part of the problem is you are using -i which is not expected.
Assuming you installed Python 2.7, you would run something like this - the default is the standard 12 column output:
Code:
C:\Python27\python blastxml_to_tabular.py -o nitrogen.txt nitrogen.xml
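For readers curious what the standard 12 columns are, here is a rough sketch of the conversion using only the Python standard library. It is not the blastxml_to_tabular.py script itself: the function names and the toy XML are invented for illustration, and the mismatch and gap-opening values are simple approximations (alignment length minus identities minus gaps, and the number of gap runs in the aligned sequences).

```python
import re
import xml.etree.ElementTree as ET

# The standard 12 tabular columns (legacy -m 8, BLAST+ -outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Toy, hand-written XML fragment; real files contain many more elements.
SAMPLE_XML = """<BlastOutput><BlastOutput_iterations><Iteration>
<Iteration_query-def>query1 sample description</Iteration_query-def>
<Iteration_hits><Hit><Hit_id>subject1</Hit_id><Hit_hsps><Hsp>
<Hsp_bit-score>20.5</Hsp_bit-score><Hsp_evalue>1e-05</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from><Hsp_query-to>10</Hsp_query-to>
<Hsp_hit-from>5</Hsp_hit-from><Hsp_hit-to>13</Hsp_hit-to>
<Hsp_identity>8</Hsp_identity><Hsp_gaps>1</Hsp_gaps>
<Hsp_align-len>10</Hsp_align-len>
<Hsp_qseq>ACDEFGHIKL</Hsp_qseq><Hsp_hseq>ACDE-GHIRL</Hsp_hseq>
</Hsp></Hit_hsps></Hit></Iteration_hits>
</Iteration></BlastOutput_iterations></BlastOutput>"""


def count_gap_openings(*sequences):
    """Count runs of '-' across the aligned sequences."""
    return sum(len(re.findall(r"-+", seq)) for seq in sequences)


def blast_xml_to_rows(xml_text):
    """Yield one 12-column row (list of strings) per HSP."""
    root = ET.fromstring(xml_text)
    for iteration in root.iter("Iteration"):
        query_def = iteration.findtext("Iteration_query-def", "") or "unknown"
        query_id = query_def.split()[0]  # first word of the query description
        for hit in iteration.iter("Hit"):
            subject_id = hit.findtext("Hit_id", "")
            for hsp in hit.iter("Hsp"):
                identity = int(hsp.findtext("Hsp_identity"))
                length = int(hsp.findtext("Hsp_align-len"))
                gaps = int(hsp.findtext("Hsp_gaps", "0"))
                yield [
                    query_id, subject_id,
                    "%.2f" % (100.0 * identity / length),
                    str(length),
                    str(length - identity - gaps),  # approximate mismatches
                    str(count_gap_openings(hsp.findtext("Hsp_qseq", ""),
                                           hsp.findtext("Hsp_hseq", ""))),
                    hsp.findtext("Hsp_query-from"), hsp.findtext("Hsp_query-to"),
                    hsp.findtext("Hsp_hit-from"), hsp.findtext("Hsp_hit-to"),
                    hsp.findtext("Hsp_evalue"), hsp.findtext("Hsp_bit-score"),
                ]


print("\t".join(COLUMNS))
for row in blast_xml_to_rows(SAMPLE_XML):
    print("\t".join(row))
```

For real work the existing script is the better choice, since it handles details this sketch ignores, such as very large files and multiple queries with long identifiers.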
-
I tried on Windows, but it gives some errors.
Can you give me an example of using this script if I want to convert to a 12-column text file?
-
It looks like you got the Galaxy XML file by mistake. This link, https://github.com/peterjc/galaxy_bl..._to_tabular.py, is the human-readable page for the Python script; this link, https://raw.githubusercontent.com/pe..._to_tabular.py, will download the script itself, ready to run.
Are you using Linux, Mac OS X, or Windows? If Windows, you will also need to install Python, and running it is a little more complicated. Linux and the Mac should already have a suitable version of Python installed.
-
Originally posted by maubp:
Here's my Python script https://github.com/peterjc/galaxy_bl..._to_tabular.py with Galaxy wrapper https://github.com/peterjc/galaxy_bl...to_tabular.xml - does that count as a GUI?
Should I type python blastxml_to_tabular.py -i nitrogen.xml -o nitrogen.txt?
I am sorry about these newbie questions. I am a biology student and don't have any background in computer science.
I attached blastxml_to_tabular.py. Please have a quick look and see whether the format is correct.
-
Here's my Python script https://github.com/peterjc/galaxy_bl..._to_tabular.py with Galaxy wrapper https://github.com/peterjc/galaxy_bl...to_tabular.xml - does that count as a GUI?