extracting predicted gene from scaffold: end position precedes start position

amango

Member

Join Date: Dec 2009

Posts: 17
#1

09-01-2012, 06:36 AM

I am trying to extract sequences for a list of predicted genes from genomic scaffolds. The list of predicted genes with Scaffold IDs, start and end positions, and other info comes from published supplementary data. My script to extract the sequences doesn't work because for some genes, the start position is a larger number than the end position (fourth-to-last and third-to-last columns below). Here is an example (numbers have been changed from original):

geneID Gene_family Class ScaffoldID start_position end_position Number_of_exons Annotation_status
CSP1 cs Protein candidate gi|294506227|gb|GL650210.1| 61498 52100 2 intact
CSP10 cs Protein candidate gi|294507212|gb|GL649715.1| 293074 297989 2 intact
CSP2 cs Protein candidate gi|294507210|gb|GL650017.1| 234944 236074 2 intact
CSP3 cs Protein candidate gi|294507295|gb|GL649612.1| 323100 323743 2 intact
CSP4 cs Protein candidate gi|294506227|gb|GL650210.1| 41911 40888 2 intact
CSP5 cs Protein candidate gi|294507205|gb|GL649712.1| 274408 272617 2 intact

I am new to working with annotated genomes. Does it make sense that some "starts" come after the "ends"? Is this because the ORF for this gene is on the opposite strand of the scaffold? If so, and if I want to obtain that sequence, what's the best way to get it: should I extract the sequence in the scaffold between the two numbers and then take the reverse complement?

Thanks for any pointers.
zhidkov.ilia

Member

Join Date: Dec 2010

Posts: 25
#2

09-02-2012, 07:25 AM

Some genes are transcribed from the opposite strand of the DNA, which results in reversed coordinates. You can add an additional column (i.e. strand), with '+' when start_position < end_position and '-' when start_position > end_position.
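
For what it's worth, the approach described in the question (take the region between the two coordinates, then reverse-complement it when start > end) can be sketched in Python roughly as follows. This is only a minimal sketch: the function names `reverse_complement` and `extract_gene` are made up for illustration, the scaffold is assumed to be already loaded as a plain string, and the coordinates are assumed to be 1-based and inclusive, which the thread does not state and should be checked against the supplementary data.

```python
# Translation table for complementing DNA; handles N and lowercase bases too.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_gene(scaffold_seq, start, end):
    """Return (strand, sequence) for one predicted gene.

    ASSUMPTION: start/end are 1-based, inclusive scaffold coordinates,
    and start > end means the gene lies on the opposite strand.
    """
    lo, hi = min(start, end), max(start, end)
    if lo < 1 or hi > len(scaffold_seq):
        raise ValueError("coordinates fall outside the scaffold")
    # Convert 1-based inclusive coordinates to a Python slice.
    region = scaffold_seq[lo - 1:hi]
    if start > end:
        return "-", reverse_complement(region)
    return "+", region


# Toy scaffold, not real data.
scaffold = "AACCGGTTAC"
forward = extract_gene(scaffold, 2, 5)   # start < end: '+' strand
reverse = extract_gene(scaffold, 5, 2)   # start > end: '-' strand
print(forward, reverse)
```

The returned strand value is the same '+'/'-' label suggested above, so it can be written out as an extra column alongside the extracted sequence.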