Use Bowtie Index to get sequences using locations
-
Although bowtie index essentially keeps the genome, I doubt it is optimized or designed for your purpose.
The code points to a way to retrieve ranges:
Code:
/* Parse a .2bit file and sequence spec into an object.
 * The spec is a string in the form:
 *
 *    file/path/input.2bit[:seqSpec1][,seqSpec2,...]
 *
 * where seqSpec is either
 *     seqName
 *  or
 *     seqName:start-end
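To make the spec format in that comment concrete, here is a minimal Python sketch of a parser for it. This is not the BLAT/UCSC code; `parse_twobit_spec` and `SeqSpec` are hypothetical names, and the sketch assumes the file path ends at the first ".2bit" and that sequence names contain no ":" or ",". It only splits the string and does not interpret the coordinate convention.

```python
from typing import List, NamedTuple, Optional, Tuple


class SeqSpec(NamedTuple):
    """One requested sequence; start/end are None when the whole sequence is wanted."""
    name: str
    start: Optional[int]
    end: Optional[int]


def parse_twobit_spec(spec: str) -> Tuple[str, List[SeqSpec]]:
    """Split 'path.2bit[:seqSpec1][,seqSpec2,...]' into (path, [SeqSpec, ...])."""
    marker = ".2bit"
    idx = spec.find(marker)  # assumption: the path ends at the first ".2bit"
    if idx < 0:
        raise ValueError("spec does not name a .2bit file: %r" % spec)
    cut = idx + len(marker)
    path, rest = spec[:cut], spec[cut:]
    if not rest:
        return path, []  # no sequence specs given
    if not rest.startswith(":"):
        raise ValueError("expected ':' after the file name: %r" % spec)
    specs = []
    for item in rest[1:].split(","):
        name, sep, rng = item.partition(":")
        if not sep:
            # bare seqName
            specs.append(SeqSpec(name, None, None))
        else:
            # seqName:start-end
            start, _, end = rng.partition("-")
            specs.append(SeqSpec(name, int(start), int(end)))
    return path, specs


print(parse_twobit_spec("file/path/input.2bit:chr1:100-200,chr2"))
```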
edit: indeed, BLAT has such functions included. See here for a bit of discussion about 2bit retrieval using Perl:
Last edited by gringer; 10-31-2013, 03:43 PM.
-
Yeah, I'm torn on whether to hold it in memory or not. I'll toy with different workflows.
-
If you really have a LOT of positions, then it's best to read the genome into memory. samtools faidx is great for a smallish number of sites, but it grabs the sequence from disk, making it a bit slow for a large number of queries.
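A minimal Python sketch of the in-memory approach, assuming the FASTA file is still available and that positions are 1-based and inclusive. `load_fasta` and `fetch_in_memory` are hypothetical helper names, and the tiny inline FASTA stands in for a real genome:

```python
import io


def load_fasta(handle):
    """Read every FASTA record into a dict mapping name -> sequence string."""
    genome = {}
    name, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            # use the first whitespace-delimited token of the header as the name
            name = (line[1:].split() or [""])[0]
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def fetch_in_memory(genome, chrom, start, end):
    """Return bases start..end (1-based, inclusive) of chrom."""
    return genome[chrom][start - 1:end]


# Toy data in place of a real genome file.
demo = io.StringIO(">chr1 demo\nACGTAC\nGTTT\n>chr2\nGGGCCC\n")
genome = load_fasta(demo)
print(fetch_in_memory(genome, "chr1", 5, 8))
```

Loading is paid once up front; after that each lookup is a plain string slice, which is why this wins when there are many queries.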
-
I want to retrieve lots of regions efficiently, but thanks for pointing me to faidx, I'll see how it works.
-
Although bowtie index essentially keeps the genome, I doubt it is optimized or designed for your purpose. Use faidx if you only want to retrieve a few regions.
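For anyone curious why faidx works without loading the genome: a .fai index stores, per sequence, its length, the byte offset of its first base, the bases per line, and the bytes per line, so a position can be turned into a file offset and read with a seek. The Python sketch below illustrates that idea only; it is not samtools' implementation, the helper names are hypothetical, and it assumes every sequence line (except the last) has the same length. Positions are taken as 1-based and inclusive.

```python
import os
import tempfile


def build_index(path):
    """Scan a FASTA once; return name -> [length, offset, line_bases, line_width]."""
    index = {}
    name = None
    pos = 0  # byte offset of the start of the current line
    with open(path, "rb") as fh:
        for raw in fh:
            next_pos = pos + len(raw)
            if raw.startswith(b">"):
                name = raw[1:].split()[0].decode()
                index[name] = [0, next_pos, 0, 0]  # sequence starts after the header
            elif name is not None:
                entry = index[name]
                bases = len(raw.rstrip(b"\r\n"))
                if entry[2] == 0:
                    entry[2], entry[3] = bases, len(raw)
                entry[0] += bases
            pos = next_pos
    return index


def fetch_indexed(path, index, chrom, start, end):
    """Return bases start..end (1-based, inclusive) by seeking into the file."""
    length, offset, line_bases, line_width = index[chrom]
    end = min(end, length)
    first0, last0 = start - 1, end - 1
    first = offset + (first0 // line_bases) * line_width + first0 % line_bases
    last = offset + (last0 // line_bases) * line_width + last0 % line_bases
    with open(path, "rb") as fh:
        fh.seek(first)
        chunk = fh.read(last - first + 1)
    # drop the line breaks that fall inside the requested span
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


# Toy file in place of a real genome.
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(b">chr1\nACGTAC\nGTTT\n>chr2\nGGGCCC\n")
    demo_path = tmp.name
demo_index = build_index(demo_path)
print(fetch_indexed(demo_path, demo_index, "chr1", 5, 8))
os.remove(demo_path)
```

Each lookup costs one seek and one short read, which is cheap for a handful of regions but adds up over a very large number of queries.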
-
The bowtie-inspect thing does get all the info out, but that's 3 GB of output, since I can't select a location.
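One workaround, sketched in Python under the assumption that the inspect tool's FASTA output is piped in line by line: filter the stream and keep only the requested region, so the full dump is never stored. `extract_region` is a hypothetical helper; it matches on the first whitespace-delimited token of each header and takes 1-based, inclusive positions.

```python
import io


def extract_region(lines, chrom, start, end):
    """Stream FASTA lines; return bases start..end (1-based, inclusive) of chrom."""
    pieces = []
    in_target = False
    seen = 0  # bases of the target record consumed so far
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if in_target:
                break  # reached the record after the target
            fields = line[1:].split()
            in_target = bool(fields) and fields[0] == chrom
            continue
        if not in_target:
            continue
        lo = max(start - 1 - seen, 0)
        hi = min(end - seen, len(line))
        if lo < hi:
            pieces.append(line[lo:hi])
        seen += len(line)
        if seen >= end:
            break  # region complete, stop reading
    return "".join(pieces)


# Toy stream in place of real inspect output; pass sys.stdin to read from a pipe.
demo = io.StringIO(">chr1\nACGTAC\nGTTT\n>chr2\nGGGCCC\n")
print(extract_region(demo, "chr1", 5, 8))
```

This keeps memory flat, but every record before the target still has to be streamed past, so it does not make lookups fast.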
-
Just to clarify, I mean using the index - giving it a chromosome name (fasta header) and location numbers, and getting back a sequence.
I don't want to run an alignment, just pull out the sequence. So no SAM output.
For this I'm using bowtie, not bowtie2. But if bowtie2 can do this...
Thanks
-
Originally posted by shawn.mek:
We have the fasta files (obviously) for the hg19 genome, and we used them to create a big Bowtie index.
I was hoping not to have to keep the fasta file. Instead just look up sequences in the Bowtie index when I get chromosome locations.
I know when the alignment comes back it tells me where the alignment occurs and which fasta record (header) it came from. So all the info is there, but I can't figure out how to pull out a sequence given a location.
Does anyone know if this is possible, or know much about the index format (perhaps I could write a little program to fish out a sequence)?
Thanks
Code:
bowtie2-inspect
No index name given!
Bowtie 2 version 2.1.0 by Ben Langmead ([email protected], www.cs.jhu.edu/~langmea)
Usage: bowtie2-inspect [options]* <bt2_base>
  <bt2_base>         bt2 filename minus trailing .1.bt2/.2.bt2

  By default, prints FASTA records of the indexed nucleotide sequences to
  standard out.  With -n, just prints names.  With -s, just prints a summary of
  the index parameters and sequences.  With -e, preserves colors if applicable.

Options:
  -a/--across <int>  Number of characters across in FASTA output (default: 60)
  -n/--names         Print reference sequence names only
  -s/--summary       Print summary incl. ref names, lengths, index properties
  -e/--bt2-ref       Reconstruct reference from .bt2 (slow, preserves colors)
  -v/--verbose       Verbose output (for debugging)
  -h/--help          print detailed description of tool and its options
  --help             print this usage message
-
We have the fasta files (obviously) for the hg19 genome, and we used them to create a big Bowtie index.
I was hoping not to have to keep the fasta file. Instead just look up sequences in the Bowtie index when I get chromosome locations.
I know when the alignment comes back it tells me where the alignment occurs and which fasta record (header) it came from. So all the info is there, but I can't figure out how to pull out a sequence given a location.
Does anyone know if this is possible, or know much about the index format (perhaps I could write a little program to fish out a sequence)?
Thanks