**maubp** · 11-01-2010, 07:15 AM

I think something is wrong with your FASTA file. The index failed, apparently because your sequence is too long to index (2 to the power of 32 bases is very big, about 4.3 billion!).

What URL did you download the FASTA file from?
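As a rough way to check the numbers, the sketch below sums the sequence lengths in a FASTA file and compares the total with 2 to the power of 32. The names `total_bases` and `OLD_INDEX_LIMIT` and the tiny in-memory demo records are illustrative assumptions, not part of BWA; for a real file you would pass an open file handle instead.

```python
import io

# 2**32 = 4,294,967,296 bases, the limit mentioned above.
OLD_INDEX_LIMIT = 2 ** 32

def total_bases(handle):
    """Sum the sequence length over all records in a FASTA text handle."""
    total = 0
    for line in handle:
        # Header lines start with '>' and carry no sequence.
        if not line.startswith(">"):
            total += len(line.strip())
    return total

# Demo on made-up records; use open("your_file.fa") for a real file.
demo = io.StringIO(">demo_record_1\nACGTACGT\nACG\n>demo_record_2\nNNNN\n")
n = total_bases(demo)
status = "over" if n >= OLD_INDEX_LIMIT else "under"
print(f"{n} bases, {status} the 2**32 limit")
```

If the total for the downloaded file comes out above the limit, the file itself may be fine and simply too large to index.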

**louis7781x** · 11-01-2010, 07:37 AM

Hi, this is the URL I downloaded the file from: ftp://ftp.ensembl.org/pub/current/fa...toplevel.fa.gz

The file comes from Ensembl.

Could you help me find the error? Thanks!

**maubp** · 11-01-2010, 07:39 AM

Did you decompress it properly? e.g. try:

head Homo_sapiens.GRCh37.59.dna.toplevel.fa

**louis7781x** · 11-01-2010, 07:40 AM

I used the command "gunzip Filename.gz" to decompress this file.

Sorry, I don't understand "head". What is this command?

Thanks!

**maubp** · 11-01-2010, 08:01 AM

head is a Unix command to see the start of a text file (short for header, I think); tail shows you the end of a text file (head and tail being the opposite ends of an animal).
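For anyone without a Unix shell to hand, the same idea can be sketched in Python. The helpers `head`, `tail` and `looks_like_fasta` below are hypothetical illustrations, not a library API. The check relies on the fact that a properly decompressed FASTA file should begin with a header line starting with ">".

```python
import io
from collections import deque
from itertools import islice

def head(handle, n=10):
    """Return the first n lines of a text handle (Unix head also defaults to 10)."""
    return list(islice(handle, n))

def tail(handle, n=10):
    """Return the last n lines, holding at most n lines in memory."""
    return list(deque(handle, maxlen=n))

def looks_like_fasta(handle):
    """True if the first line starts with '>', as a FASTA header should."""
    first = head(handle, 1)
    return bool(first) and first[0].startswith(">")

# Demo on in-memory text; use open("your_file.fa") for a real file.
demo_text = ">demo_record description\nACGTACGT\nTTGGCCAA\n"
print(head(io.StringIO(demo_text), 2))
print(tail(io.StringIO(demo_text), 1))
print(looks_like_fasta(io.StringIO(demo_text)))
```

Both helpers read the file line by line, so they stay cheap even on a multi-gigabyte genome file.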

**Jon_Keats** · 11-01-2010, 08:11 AM

To the best of my understanding, you can't use the top level files, as their size exceeds the maximum supported by the BWT used in BWA. This is because the top level files include entire duplicate chromosomes for the different haplotypes. Most people are using the 1000 Genomes version of GRCh37.
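One way to see which records inflate a file is to list the length of each FASTA record, largest first. This is only a sketch: `record_lengths` is a hypothetical helper, and the two record names in the demo are made up, not real Ensembl identifiers.

```python
import io

def record_lengths(handle):
    """Map each FASTA record name (first word of its header) to its length."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths

# Made-up demo records; use open("your_file.fa") for a real file.
demo = io.StringIO(
    ">demo_chr some description\nACGTACGTAC\nGT\n"
    ">demo_chr_alt_haplotype\nACGTACGTAC\n"
)
# Print records from longest to shortest so large duplicates stand out.
for name, length in sorted(record_lengths(demo).items(), key=lambda kv: -kv[1]):
    print(name, length)
```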

**louis7781x** · 11-01-2010, 09:05 AM

Hi Jon, my research is to find gene fusions in a brain tumor cDNA library sequenced on 454.
I have read many papers, and they usually align the 454 data against hg19 and the RefSeq cDNA set and extract only the non-mapping reads. They then use the non-mapping reads to find reads that align across two exons.

I don't understand. Is the 1000 Genomes version useful for my research?

Thanks!

**Jon_Keats** · 11-01-2010, 09:15 AM

Hi Louis,

The 1000 Genomes version of human genome build GRCh37 should be as useful as the top level file at Ensembl, maybe more so, as you can then get BWA running. I'm assuming your analysis strategy is to map all reads to the human genome, take all those that don't map and map them against the human transcriptome, then take those that still do not map and BLAST them against the genome to look for novel hybrid junctions? The only difference is that they curated this version to get rid of the redundant duplication, which is not necessary and is likely to cause problems in your analysis. If you want a bit more explanation, see my thread (http://seqanswers.com/forums/showthread.php?t=4589).
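The "take all those that don't map" step can be sketched as below, assuming the aligner writes SAM output, where bit 0x4 of the FLAG column (the second field) marks an unmapped read. `unmapped_read_names` and the two demo records are made up for illustration, not part of any tool.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"

def unmapped_read_names(sam_lines):
    """Return the names of reads whose SAM FLAG has the unmapped bit set."""
    names = []
    for line in sam_lines:
        # Skip SAM header lines, which start with '@'.
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & UNMAPPED_FLAG:
            names.append(fields[0])
    return names

# Two made-up alignment records: read_a is mapped, read_b is not.
demo_sam = [
    "@HD\tVN:1.0\n",
    "read_a\t0\tdemo_chr\t100\t37\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n",
    "read_b\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTGGGG\tIIIIIIII\n",
]
print(unmapped_read_names(demo_sam))
```

The names returned here would then be used to pull those reads out for the next round of mapping.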

**maria_mari** · 04-02-2012, 02:23 AM

Hi,
The size limit has been lifted in recent versions of bwa (use bwa 0.6).

A question about BWA index
