**westerman** · 06-18-2012, 07:40 AM

I am not sure I understand your question. Bowtie can work with an entire genome, with chromosomes, or with parts of chromosomes, so there is no need to have one large file and there are plenty of reasons not to (e.g., ease of manipulation, ease of visualization). However, if you do wish to concatenate all of the chromosomes into one large genome file, leave the '>' part in place. Good luck with your analysis.
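
For anyone who wants to keep the per-chromosome files separate, here is a minimal sketch. It relies on `bowtie-build` accepting a comma-separated list of FASTA files as its reference argument. The file names, the toy sequences, and the index base name `my_genome` are assumptions made up for illustration, and the command is only printed, not run.

```shell
# Toy per-chromosome files in a scratch directory (placeholder sequences, not real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '>chr1\nNNAGNN\n' > chr1.fa
printf '>chr2\nNNGCNN\n' > chr2.fa

# Join the file names with commas, giving: chr1.fa,chr2.fa
refs=$(printf '%s\n' chr*.fa | paste -sd, -)

# Print the indexing command; drop the "echo" to actually run it
# (needs Bowtie installed and real sequences). "my_genome" is a made-up base name.
echo bowtie-build "$refs" my_genome
```

Leaving a chromosome out is then just a matter of leaving its file out of the list.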

**Arupsss** · 06-18-2012, 08:01 AM

Thanks a lot. So, when concatenating, suppose I have chr1>NN..AG..NN and chr2>NN..GC...NN. Should I remove the '>', so that the output is chr1NN..AG..NNchr2NN..GC...NN, and then give the concatenated file to Bowtie as input? Am I correct?

**westerman** · 06-18-2012, 08:17 AM

Your files should look something like:

>chr1
NN..AG..NN

And the next file should look like:

>chr2
NN..GC..NN

When you cat these files together leave in the '>' part to get a large file that looks like:

>chr1
NN..AG..NN
>chr2
NN..GC..NN

Unless I am misunderstanding your question, this is simple FastA format manipulation.
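
The steps above can be sketched in a few shell commands. This is only an illustration: the file names `chr1.fa`, `chr2.fa`, and `genome.fa` and the short sequences are placeholders, not real data.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two toy per-chromosome FASTA files: a '>' header line, then the sequence.
printf '>chr1\nNNAGNN\n' > chr1.fa
printf '>chr2\nNNGCNN\n' > chr2.fa

# Concatenate them; the '>' header lines are kept exactly as they are.
cat chr1.fa chr2.fa > genome.fa

# List the headers: there should be one per chromosome.
grep '^>' genome.fa
```

Nothing is removed or edited during the concatenation; each header line stays at the start of its own record, which is how the sequences remain distinguishable in the combined file.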

**Arupsss** · 06-18-2012, 08:26 AM

Yes, I am trying to do exactly that simple FastA format manipulation so that I can give Bowtie a single input file. However, does "'>' part" mean only the ">" character or the whole ">chr2" header? I ask because in the large-file example above you simply cat the files and no part is dropped.

**GenoMax** · 06-19-2012, 08:12 AM

Save yourself a significant amount of effort and just download the pre-built bowtie indexes for hg19 from here: ftp://ftp.cbcb.umd.edu/pub/data/bowt.../hg19.ebwt.zip

**Arupsss** · 06-20-2012, 10:24 AM

Thanks a lot. However, I have many chromosomal sequences (not only for human or hg19/hg18), and I have to do this for all of them. I don't think I can get prebuilt indexes for all of them. Another point is that in some cases I have to include or exclude the sex chromosome sequences.

**GenoMax** · 06-20-2012, 11:01 AM

I guess you are trying to do much of this on Windows. It may be time to put some effort into using a Unix distro. There are several Unix distributions that you can try. You may want to experiment with "Bio-Linux", which has a lot of pre-built bioinformatics apps (http://nebc.nerc.ac.uk/tools/bio-linux/bio-linux-6.0).

You are bound to run into some issue (sooner rather than later) when trying to do this type of analysis on Windows (editing/handling large files is one thing that comes to mind).

A simple Unix command like "cat file1 file2 file3 > final.fa" would achieve what you were asking about in the original question.
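
For the case where some chromosomes have to be left out, a slightly longer sketch along the same lines follows. The file names and toy sequences are assumptions for illustration. The newline check is there because a file that does not end in a newline would otherwise get the next '>' header glued onto the end of its sequence.

```shell
# Toy input files in a scratch directory (placeholder sequences, not real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '>chr1\nNNAGNN' > chr1.fa        # deliberately missing its final newline
printf '>chr2\nNNGCNN\n' > chr2.fa
printf '>chrX\nNNTTNN\n' > chrX.fa
printf '>chrY\nNNCCNN\n' > chrY.fa

: > final.fa                            # start from an empty output file
for f in chr*.fa; do
    case "$f" in
        chrX.fa|chrY.fa) continue ;;    # skip the sex chromosomes
    esac
    cat "$f" >> final.fa
    # If the last byte of the file is not a newline, add one so the
    # next header starts on its own line.
    if [ -n "$(tail -c 1 "$f")" ]; then
        echo >> final.fa
    fi
done

# Only the chr1 and chr2 headers should be listed.
grep '^>' final.fa
```

To include the sex chromosomes instead, remove the `case` block, at which point the loop does the same job as a plain `cat` plus the newline safeguard.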


Full Genomic Database and corresponding chromosomal databases
