**neavemj** · 07-09-2018, 04:03 PM

Hi Irene,

That error is because Python is trying to get the second item in a list, but the list only contains one item. Looking at the code (line 299), it appears that the script is trying to make a list from the hit definition by splitting it apart at the ">" symbol.

As you can see from the output, that particular hit 'tr_E9FSX5_Daphnia_pul_Cru_Bra' does not contain a ">" symbol, and, therefore, the resulting list only contains this single item.
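To make that concrete, here is a small sketch of the failure. It is not the script's actual code, just an illustration of what happens when you split on ">" and then ask for the second item. The helper `second_item_or_whole` is a made-up name showing one tolerant way to handle a definition line without the separator.

```python
# The hit definition from the error output; it contains no ">" character.
hit_def = "tr_E9FSX5_Daphnia_pul_Cru_Bra"

# Splitting on a separator that is absent returns a one-element list.
parts = hit_def.split(">")

# Asking for the second item of that list raises IndexError.
try:
    second = parts[1]
    error_seen = False
except IndexError:
    error_seen = True


def second_item_or_whole(text, sep=">"):
    """Hypothetical helper: return the second piece after splitting on sep,
    or the whole text unchanged if sep does not occur in it."""
    pieces = text.split(sep)
    return pieces[1] if len(pieces) > 1 else text
```

Whether falling back to the whole string is the right behaviour depends on what the downstream columns are supposed to contain, so treat this as a starting point rather than a fix.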

Basically I think the input is just not in the correct format for this script. You could probably change the code a bit to get it to run but perhaps easiest would be to generate another input format? This help is provided in the script:

# Expecting either this,
# <Hit_id>gi|3024260|sp|P56514.1|OPSD_BUFBU</Hit_id>
# <Hit_def>RecName: Full=Rhodopsin</Hit_def>
# <Hit_accession>P56514</Hit_accession>
# or,
# <Hit_id>Subject_1</Hit_id>
# <Hit_def>gi|57163783|ref|NP_001009242.1| rhodopsin [Felis catus]</Hit_def>
# <Hit_accession>Subject_1</Hit_accession>
#
# apparently depending on the parse_deflines switch
#
# Or, with a local database not using -parse_seqids can get this,
# <Hit_id>gnl|BL_ORD_ID|2</Hit_id>
# <Hit_def>chrIII gi|240255695|ref|NC_003074.8| Arabidopsis
# thaliana chromosome 3, complete sequence</Hit_def>
# <Hit_accession>2</Hit_accession>

Cheers,

Matt.

**CsprsSassyHrly** · 07-10-2018, 06:00 AM

Thank you for your reply, Matt. I guess that's what I'm finding strange... The xml file is being generated using the same BLAST command line that I have used before and haven't had this issue... the only thing I am changing is the query and the database.

The only thing that is really different is that the fasta file I turned into a database was converted from a phylip file into a fasta file before being turned into a database, while the other files I have turned into a database were downloaded as fasta files from Uniprot. I'll keep playing with it and see if I can figure it out!

Thanks again,

Irene

**neavemj** · 07-10-2018, 02:44 PM

Hmm, yep might need a bit of digging. It does seem that the script is requiring headers that look like NCBI / uniprot, e.g:

<Hit_id>gi|3024260|sp|P56514.1|OPSD_BUFBU</Hit_id>

Perhaps when you go from phylip to fasta, this header information is lost? You could also open up your xml file and compare the hit information to an xml file that you know works.
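If eyeballing the XML gets tedious, something like the following could print the three hit fields side by side. This is only a sketch using Python's standard `xml.etree.ElementTree`. The function name `hit_headers` and the tiny in-memory sample are my own inventions for illustration, and the `Hit_id` and `Hit_accession` values in the second sample hit are guesses, not taken from your file.

```python
import io
import xml.etree.ElementTree as ET


def hit_headers(source):
    """Return (Hit_id, Hit_def, Hit_accession) for every <Hit> element.

    source may be a file name or an open file object containing BLAST XML.
    """
    tree = ET.parse(source)
    return [
        (hit.findtext("Hit_id"), hit.findtext("Hit_def"), hit.findtext("Hit_accession"))
        for hit in tree.iter("Hit")
    ]


# Made-up miniature document: the first hit follows the NCBI-style example
# from the script's help text, the second mimics a local database hit
# (its Hit_id and Hit_accession are assumed values).
SAMPLE = b"""<BlastOutput>
  <Hit>
    <Hit_id>gi|3024260|sp|P56514.1|OPSD_BUFBU</Hit_id>
    <Hit_def>RecName: Full=Rhodopsin</Hit_def>
    <Hit_accession>P56514</Hit_accession>
  </Hit>
  <Hit>
    <Hit_id>gnl|BL_ORD_ID|0</Hit_id>
    <Hit_def>tr_E9FSX5_Daphnia_pul_Cru_Bra</Hit_def>
    <Hit_accession>0</Hit_accession>
  </Hit>
</BlastOutput>"""

for hit_id, hit_def, hit_acc in hit_headers(io.BytesIO(SAMPLE)):
    print(hit_id, hit_def, hit_acc, sep="\t")
```

Running it once on an xml file that works and once on the one that fails (passing each file name to `hit_headers`) should show fairly quickly whether the headers differ in the way the script cares about.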

Good luck!

Matt.

python blastxml_to_tabular.py
