Thank you so much for this!
The second option seems to be the best, though I would like to be as memory efficient as possible. I'm not entirely sure what a generator function is, so that might be a little complicated for the moment.
-
OK, keeping this simple (but not memory efficient), load all the records into memory as a list, modify the list, save list of records:
Code:
from Bio import SeqIO
from Bio.SeqFeature import SeqFeature, FeatureLocation
node = ...
records = list(SeqIO.parse("D13_multi.gbk", "gb"))
for record in records:
    if record.id == node:
        record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feature"))
SeqIO.write(records, "D13_edited.gbk", "gb")
Code:
from Bio import SeqIO
from Bio.SeqFeature import SeqFeature, FeatureLocation
node = ...
# Open the output once, then write each record as it is parsed,
# so only one record is held in memory at a time.
handle = open("D13_edited.gbk", "w")
for record in SeqIO.parse("D13_multi.gbk", "gb"):
    if record.id == node:
        record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feature"))
    SeqIO.write(record, handle, "gb")
handle.close()
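For the memory-efficient route, a generator function is one way to go: it is an ordinary function that uses yield to hand back one record at a time instead of building a whole list. Below is a minimal sketch, not a definitive implementation. The name add_feature_to, the stand-in records (SimpleNamespace objects with made-up IDs) and the placeholder feature string are all assumptions for illustration, chosen so the example runs without Biopython.

```python
from types import SimpleNamespace


def add_feature_to(records, target_id, feature):
    """Yield every record in order, appending `feature` to the matching one."""
    for record in records:
        if record.id == target_id:
            record.features.append(feature)
        # yield hands back one record and pauses here until the next is requested
        yield record


# Hypothetical stand-in records; real ones would come from SeqIO.parse(...).
demo_records = (
    SimpleNamespace(id=name, features=[])
    for name in ["NODE_1", "NODE_2", "NODE_3"]
)

edited = list(add_feature_to(demo_records, "NODE_2", "misc_feature placeholder"))
for record in edited:
    print(record.id, record.features)
```

With Biopython, the same generator could wrap SeqIO.parse("D13_multi.gbk", "gb") and be handed straight to SeqIO.write(..., "D13_edited.gbk", "gb"), since SeqIO.write accepts any iterable of records and pulls them through one at a time.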
-
Thanks. I've fixed the indentation for the code now, so hopefully that will help.
I'm pretty certain that if I keep calling SeqIO.write, I overwrite whatever I had already written. It seems that when I open the multi-entry GenBank file with SeqIO.parse, it gets to the end of the file and then doesn't go back to the start. So in the for loop, I loop over all the records, get to the record I want and add the feature, and then it loops over the rest of the records. Once the for loop is done, I call SeqIO.write, but there isn't anything to write because it's gotten to the end of the file.
So I'm just not sure how to write out my edited genbank with the features I want added in.
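The behaviour described above is what any one-shot Python iterator does, and it can be reproduced without Biopython. The sketch below uses a made-up generator, fake_parse, with made-up contig names as a stand-in for SeqIO.parse; it only illustrates the general iterator behaviour and is not Biopython's actual code.

```python
def fake_parse(ids):
    """Hypothetical stand-in for SeqIO.parse: yields each item exactly once."""
    for record_id in ids:
        yield record_id


contigs = ["contig_1", "contig_2", "contig_3"]

# Looping to the end consumes the iterator, so a second pass finds nothing.
genbank = fake_parse(contigs)
seen = [record_id for record_id in genbank]
leftover = list(genbank)

# Stopping part-way leaves only the items after the stopping point.
genbank = fake_parse(contigs)
for record_id in genbank:
    if record_id == "contig_2":
        break
remaining = list(genbank)

print(seen)
print(leftover)
print(remaining)
```

Here leftover comes back empty and remaining holds only the last contig, which is why writing from the same iterator after the loop produces either nothing or just the tail of the file.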
-
When editing a post, choose "Edit" --> "Go Advanced" at the bottom of the edit window. In the window that opens, use the mouse to highlight the text you want to designate as "code", then use the "#" button in the edit menu to add the code tags:
Code:
$ code_example
For example:
Code:
>>> handle = open("D13_multi.gbk", "rU")
>>> genbank = SeqIO.parse(handle, "genbank")
>>> for record in genbank:
...     if record.id == node:
...         record.features.append(SeqFeature.SeqFeature(SeqFeature.FeatureLocation(start, end), type = "misc_feature"))
...         print record.features
...
[SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(35), strand=1), type='misc_feature'), SeqFeature(FeatureLocation(ExactPosition(652), ExactPosition(145)), type='misc_feature')]
Last edited by GenoMax; 11-11-2013, 04:12 AM.
-
Adding features to a multi entry genbank
I'm attempting to add features to a multi-entry GenBank file using BioPython. Each entry in the file is a contig, and I'd like to annotate some BLAST hits.
I've performed a BLAST of my query against a multi-entry FASTA, so I know which hits belong to which contig and what each contig is called. I'm able to parse my GenBank file and find the contig I'd like to add the feature to. I then append the feature, but I'm having problems at the final stage, where I try to write out the entire GenBank file with the new feature added to one of the entries.
This is my code so far:
Code:
>>> handle = open("D13_multi.gbk", "rU")
>>> genbank = SeqIO.parse(handle, "genbank")
>>> for record in genbank:
...     if record.id == node:
...         record.features.append(SeqFeature.SeqFeature(SeqFeature.FeatureLocation(start, end), type = "misc_feature"))
...         print record.features
...
[SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(35), strand=1), type='misc_feature'), SeqFeature(FeatureLocation(ExactPosition(652), ExactPosition(145)), type='misc_feature')]
If I do this - SeqIO.write(genbank, "D13_multi_edited.gbk", "genbank"), it writes out all the entries after the contig that I've just appended a feature to.
I've also tried this:
Code:
>>> for record in genbank:
...     if record.id != node:
...         SeqIO.write(genbank, "D13_test.gbk", "genbank")
...     if record.id == node:
...         record.features.append(SeqFeature.SeqFeature(SeqFeature.FeatureLocation(start, end), type = "misc_feature"))
...         SeqIO.write(record, "D13_test.gbk", "genbank")
...
109
What am I doing wrong?
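One part of the second attempt can be reproduced with plain file handles: passing a filename to a writer inside a loop reopens the file in write mode on every call, which truncates it each time. The sketch below is an illustration under assumptions, using made-up contig names and temporary files rather than real GenBank records.

```python
import os
import tempfile

records = ["contig_1", "contig_2", "contig_3"]  # hypothetical stand-ins
tmpdir = tempfile.mkdtemp()

# Reopening by filename inside the loop: mode "w" truncates the file each time.
overwritten = os.path.join(tmpdir, "overwritten.txt")
for record in records:
    with open(overwritten, "w") as handle:
        handle.write(record + "\n")

# One handle opened before the loop: each write lands after the previous one.
kept = os.path.join(tmpdir, "kept.txt")
with open(kept, "w") as handle:
    for record in records:
        handle.write(record + "\n")

with open(overwritten) as handle:
    overwritten_lines = handle.read().split()
with open(kept) as handle:
    kept_lines = handle.read().split()

print(overwritten_lines)
print(kept_lines)
```

Only the last record survives in the first file, while the second file keeps all three, which is the reason for opening a single output handle before the loop.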