I am running some experiments using Bowtie and Q-Pick. One of them (Bowtie) works with the full human genome database, while the other works with the corresponding per-chromosome databases (chromosomes 1, 2, 3, ... 23). I found the full human genome database for hg19, which contains 23 chromosome files, one per chromosome (the chromFa.tar.gz archive). What I can't work out is this: if I concatenate all 23 files into a single file (say, with the cat command) and give that as input to Bowtie, is that acceptable? In other words, are the concatenated chromosome files equivalent to the full genome database? More specifically, each chromosome file starts with a header line of the form >chr(chromosome number). Should I keep those headers when concatenating, or remove them?
Full Genomic Database and corresponding chromosomal databases
Last edited by Arupsss; 06-18-2012, 07:31 AM. -
I am not sure I understand your question. Bowtie can work with an entire genome, with chromosomes, or with parts of chromosomes. So there is no need to have one large file, and there are plenty of reasons not to (e.g., ease of manipulation, ease of visualization). However, if you do wish to concatenate all of the chromosomes into one large genome file, then leave the '>' part in place. Good luck with your analysis.
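To illustrate the point that one large file is not required: bowtie-build accepts its reference sequences as a comma-separated list of FASTA files. The sketch below only assembles such a command line in Python and does not run it. The file names and the index name hg19_subset are made up for the example, and the helper bowtie_build_command is hypothetical, not part of Bowtie.

```python
from pathlib import Path


def bowtie_build_command(fasta_files, index_name):
    """Return the argument list for a bowtie-build call over several FASTA files."""
    if not fasta_files:
        raise ValueError("need at least one FASTA file")
    # bowtie-build takes the reference files as one comma-separated argument,
    # followed by the base name of the index to write.
    reference_arg = ",".join(str(Path(f)) for f in fasta_files)
    return ["bowtie-build", reference_arg, index_name]


# Hypothetical per-chromosome files and index name, for illustration only.
command = bowtie_build_command(["chr1.fa", "chr2.fa", "chrX.fa"], "hg19_subset")
print(" ".join(command))
```

The resulting list could be handed to subprocess.run on a machine where bowtie-build is installed.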
-
Your files should look something like:
>chr1
NN..AG..NN
And the next file should look like:
>chr2
NN..GC..NN
When you cat these files together leave in the '>' part to get a large file that looks like:
>chr1
NN..AG..NN
>chr2
NN..GC..NN
Unless I am misunderstanding your question, this is simple FASTA format manipulation.
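For anyone who wants to script this rather than use cat, here is a minimal Python sketch of the same concatenation. It keeps every '>' header and, as an extra precaution not mentioned above, adds a newline where a file does not end with one, so the next header cannot get glued onto the previous sequence. The function names and the tiny test files are invented for the illustration.

```python
import tempfile
from pathlib import Path


def concatenate_fasta(fasta_paths, output_path):
    """Write the given FASTA files back to back, keeping every '>' header."""
    with open(output_path, "w") as out:
        for path in fasta_paths:
            with open(path) as src:
                for line in src:
                    # Make sure each line ends with a newline so the next
                    # file's '>' header always starts on a line of its own.
                    out.write(line if line.endswith("\n") else line + "\n")


def fasta_headers(path):
    """Return the sequence names (header lines without the leading '>')."""
    with open(path) as handle:
        return [line[1:].strip() for line in handle if line.startswith(">")]


# Toy demonstration with two tiny made-up "chromosome" files.
with tempfile.TemporaryDirectory() as tmp:
    chr1 = Path(tmp) / "chr1.fa"
    chr2 = Path(tmp) / "chr2.fa"
    chr1.write_text(">chr1\nNNAGNN")  # deliberately no trailing newline
    chr2.write_text(">chr2\nNNGCNN\n")
    combined = Path(tmp) / "genome.fa"
    concatenate_fasta([chr1, chr2], combined)
    headers = fasta_headers(combined)
    combined_text = combined.read_text()

print(headers)
```

Listing the headers of the combined file is a quick way to confirm that each chromosome is still a separate record.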
-
Save yourself a significant amount of effort and just download the pre-built bowtie indexes for hg19 from here: ftp://ftp.cbcb.umd.edu/pub/data/bowt.../hg19.ebwt.zip
-
I guess you are trying to do much of this on Windows. It may be time to put some effort into using a Unix distro. There are several Unix distributions that you can try. You may want to experiment with "Bio-Linux", which has a lot of pre-built bioinformatics apps (http://nebc.nerc.ac.uk/tools/bio-linux/bio-linux-6.0).
You are bound to run into issues sooner rather than later when trying to do this type of analysis on Windows (editing and handling large files is one thing that comes to mind).
A simple Unix command like "cat file1 file2 file3 > final.fa" would achieve what you were asking about in the original question.
Originally posted by Arupsss:
Thanks a lot. However, I have many chromosomal sequences (not only for human or hg19/hg18), and I have to do this for all of them. I don't think I can get pre-built indexes for all of them. Another point is that in some cases I have to include or exclude the sex chromosome sequences.
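For the include/exclude case, one option is to choose the list of chromosome files in a small script before concatenating or indexing them. The sketch below assumes files named like chr1.fa, chr2.fa, ... chrX.fa, chrY.fa, and treats chrX and chrY as the sex chromosomes; the function names are made up for the example, and other organisms would need a different set.

```python
# Assumed names of the sex chromosomes; adjust for other organisms.
SEX_CHROMOSOMES = {"chrX", "chrY"}


def chromosome_name(filename):
    """Strip the extension: 'chrX.fa' -> 'chrX'."""
    return filename.rsplit(".", 1)[0]


def chromosome_sort_key(name):
    """Order numbered chromosomes numerically, then named ones alphabetically."""
    label = name[3:] if name.startswith("chr") else name
    return (0, int(label), "") if label.isdigit() else (1, 0, label)


def select_chromosome_files(filenames, include_sex=True):
    """Return the files to use, optionally leaving out the sex chromosomes."""
    selected = [
        f for f in filenames
        if include_sex or chromosome_name(f) not in SEX_CHROMOSOMES
    ]
    return sorted(selected, key=lambda f: chromosome_sort_key(chromosome_name(f)))


# Hypothetical directory listing, for illustration only.
files = ["chr10.fa", "chrX.fa", "chr2.fa", "chrY.fa", "chr1.fa"]
autosomes_only = select_chromosome_files(files, include_sex=False)
everything = select_chromosome_files(files, include_sex=True)
print(autosomes_only)
print(everything)
```

The selected list can then be passed to cat, or to any other step that expects FASTA files in a fixed order.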