Thank you so much for this!
The second option seems to be the best, though I would like to be as memory efficient as possible. I'm not entirely sure what a generator function is, so that might be a little complicated for the moment.
OK, keeping this simple (but not memory efficient): load all the records into memory as a list, modify the list, then save the list of records:
Code:
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation
    node = ...
    # Read every record into a list so they can all be written back later
    records = list(SeqIO.parse("D13_multi.gbk", "gb"))
    for record in records:
        if record.id == node:
            # start and end are the hit coordinates from your own session
            record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feature"))
    # A single call writes all the records, edited or not
    SeqIO.write(records, "D13_edited.gbk", "gb")
Getting a bit more clever, avoid loading all the records into memory: loop over the file once, writing out the modified file one record at a time in the main loop:
Code:
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation
    node = ...
    # Open the output once and keep the handle, so each write appends to it
    handle = open("D13_edited.gbk", "w")
    for record in SeqIO.parse("D13_multi.gbk", "gb"):
        if record.id == node:
            record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feature"))
        # Write every record, not just the edited one
        SeqIO.write(record, handle, "gb")
    handle.close()
If you want to do this in a memory-efficient way (like the second version) but with a single call to SeqIO.write, you could define a generator function, or use a generator expression with a custom function that annotates an individual SeqRecord. But I'm guessing that might be a little too complicated for now?
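For reference, the generator-function idea mentioned above might look roughly like the sketch below. The names annotate_records and edit_genbank are made up for illustration, and the small demo uses stand-in objects rather than real SeqRecords so it runs without a GenBank file; the Biopython part assumes the same file format and feature type as the earlier examples.

```python
from types import SimpleNamespace


def annotate_records(records, node, feature):
    """Yield every record in turn, appending `feature` to the one whose id is `node`.

    Hypothetical helper: only one record is held in memory at a time.
    """
    for record in records:
        if record.id == node:
            record.features.append(feature)
        yield record


def edit_genbank(in_path, out_path, node, start, end):
    """Hypothetical wrapper: stream a GenBank file through annotate_records."""
    # Imported here so the generator above can be tried without Biopython
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    feature = SeqFeature(FeatureLocation(start, end), type="misc_feature")
    records = SeqIO.parse(in_path, "gb")
    # SeqIO.write accepts any iterable of records and returns how many it wrote
    return SeqIO.write(annotate_records(records, node, feature), out_path, "gb")


# Demo with stand-in records (assumption: only .id and .features matter here)
demo_records = [
    SimpleNamespace(id="contig_1", features=[]),
    SimpleNamespace(id="contig_2", features=[]),
]
demo_result = list(annotate_records(iter(demo_records), "contig_2", "misc_feature"))
```

With a real file this would be called as edit_genbank("D13_multi.gbk", "D13_edited.gbk", node, start, end). Every record passes through the generator, so the unedited ones are written out as well.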
Thanks. I've fixed the indentation for the code now, so hopefully that will help.
I'm pretty certain that if I keep calling SeqIO.write, I overwrite whatever I had already written. It also seems that when I open the multi-entry GenBank file with SeqIO.parse, it gets to the end of the file and then doesn't go back to the start. So in the for loop, I loop over all the records, get to the record I want, add the feature, and then it loops over the rest of the records. Once the for loop is done, I call SeqIO.write, but there isn't anything to write because it has already reached the end of the file.
So I'm just not sure how to write out my edited GenBank file with the features I want added in.
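The "doesn't go back to the start" behaviour described here is how any Python iterator works, and it can be sketched without Biopython. In the example below, fake_parse is a made-up stand-in for SeqIO.parse, and the contig names are placeholders.

```python
def fake_parse(ids):
    """Stand-in for SeqIO.parse: hands out items one at a time, once only."""
    for record_id in ids:
        yield record_id


# Looping to the end uses the iterator up, so a second pass finds nothing
genbank = fake_parse(["contig_1", "contig_2", "contig_3"])
first_pass = [record_id for record_id in genbank]
second_pass = list(genbank)

# Stopping partway leaves only the records that have not been read yet
genbank = fake_parse(["contig_1", "contig_2", "contig_3"])
for record_id in genbank:
    if record_id == "contig_2":
        break
remainder = list(genbank)
```

Here second_pass is empty and remainder holds only the last name. That is why the records have to be kept in a list, or written out inside the loop, for anything to reach the output file.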
When editing a post, choose "Edit" --> "Go Advanced" at the bottom of the edit window. In the window that opens, use the mouse to highlight the text you want to designate as "code", then use the "#" button in the edit menu to wrap it in a code block, which formats it properly:
Code:
    $ code_example
For example:
Code:
    >>> handle = open("D13_multi.gbk", "rU")
    >>> genbank = SeqIO.parse(handle, "genbank")
    >>> for record in genbank:
    ...     if record.id == node:
    ...         record.features.append(SeqFeature.SeqFeature(SeqFeature.FeatureLocation(start, end), type = "misc_feature"))
    ...         print record.features
    ...
    [SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(35), strand=1), type='misc_feature'), SeqFeature(FeatureLocation(ExactPosition(652), ExactPosition(145)), type='misc_feature')]
Try it on the post above.
Last edited by GenoMax; 11-11-2013, 04:12 AM.
Adding features to a multi entry genbank
I'm attempting to add features to a multi-entry GenBank file using BioPython. Each entry in the file is a contig, and I'd like to annotate some BLAST hits.
I've performed a BLAST of my query against a multi-entry FASTA file, so I know which hits belong to which contig and what the contig is called. I'm able to parse my GenBank file and find the contig I'd like to add the feature to. I then append the feature, but I'm having problems at the final stage, where I attempt to write out the entire GenBank file with the new feature added to one of the entries.
This is my code so far:
Code:
    >>> handle = open("D13_multi.gbk", "rU")
    >>> genbank = SeqIO.parse(handle, "genbank")
    >>> for record in genbank:
    ...     if record.id == node:
    ...         record.features.append(SeqFeature.SeqFeature(SeqFeature.FeatureLocation(start, end), type = "misc_feature"))
    ...         print record.features
    ...
    [SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(35), strand=1), type='misc_feature'), SeqFeature(FeatureLocation(ExactPosition(652), ExactPosition(145)), type='misc_feature')]
After this for loop, SeqIO.write() writes 0 features to the new file I specify.
If I do this instead, SeqIO.write(genbank, "D13_multi_edited.gbk", "genbank"), it writes out all the entries after the contig that I've just appended a feature to.
I've also tried this:
Code:
    >>> for record in genbank:
    ...     if record.id != node:
    ...         SeqIO.write(genbank, "D13_test.gbk", "genbank")
    ...     if record.id == node:
    ...         record.features.append(SeqFeature.SeqFeature(SeqFeature.FeatureLocation(start, end), type = "misc_feature"))
    ...         SeqIO.write(record, "D13_test.gbk", "genbank")
    ...
    109
In this case it misses the first entry (there should be 110 entries) and writes out the entry that I've added a feature to, but without the feature being added.
What am I doing wrong?
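One likely contributor to the second attempt is that passing a filename to SeqIO.write opens that file afresh for writing on every call, so each call replaces what the previous one wrote. The sketch below shows the same effect with plain Python file handles; the temporary path and the contig names are placeholders, not taken from the real data.

```python
import os
import tempfile

# Placeholder output path in a throwaway directory
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
names = ["contig_1", "contig_2", "contig_3"]

# Reopening by name in "w" mode truncates the file each time round the loop
for name in names:
    with open(path, "w") as out:
        out.write(name + "\n")
with open(path) as result:
    overwritten = result.read().splitlines()

# Opening once and reusing the handle keeps everything that was written
with open(path, "w") as out:
    for name in names:
        out.write(name + "\n")
with open(path) as result:
    kept = result.read().splitlines()
```

After the first loop only the last name survives in the file; with the single handle all three are there. Passing an open handle to SeqIO.write, rather than a filename, avoids the overwriting in the same way.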