need GRCh37 database

lh3 replied

03-23-2012, 03:37 PM
Just to clarify: the decoy reference (I made it) is used by phase 2. The phase-1 reference genome is the most widely used at present.
Jon_Keats replied

03-23-2012, 03:26 PM
If you remove the mitochondrial sequence from the FASTA file, you run the risk that mitochondrial reads align erroneously to the nuclear genome. As lh3 said, most people use the 1000 Genomes version including the supercontigs, for the same reason you don't want to delete the mitochondrial genome. Look at the 1000 Genomes decoy documentation for a full list of reasons you want the most comprehensive FASTA file possible.
Richard Finney replied

03-22-2012, 11:34 AM
Edit the chrM entry out of the BAM file header:
Dump the header using "samtools view -H".
Use a text editor to delete the chrM line.
Use "samtools reheader" to write the edited header back to the BAM file.
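
The text-editor step above can also be scripted. Below is a minimal sketch, not part of the original reply: the function name `drop_contig_from_header` and the sample header are hypothetical, and it assumes the header was dumped as plain text with "samtools view -H". It only makes sense if no reads are aligned to chrM and chrM is the last @SQ entry (as it is in the reads contig list quoted in this thread), because BAM records refer to contigs by their position in the header.

```python
def drop_contig_from_header(header_text, contig="chrM"):
    """Return SAM header text without the @SQ line for `contig`.

    Assumes `header_text` is the plain-text output of `samtools view -H`.
    All other header lines (@HD, @RG, @PG, other @SQ lines) are kept as-is.
    """
    kept = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        # An @SQ line names its contig in an exact "SN:<name>" field.
        if fields[0] == "@SQ" and f"SN:{contig}" in fields[1:]:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical two-contig header, for illustration only.
    example = (
        "@HD\tVN:1.0\tSO:coordinate\n"
        "@SQ\tSN:chr1\tLN:249250621\n"
        "@SQ\tSN:chrM\tLN:16571\n"
    )
    print(drop_contig_from_header(example), end="")
```

A possible workflow would be to save the output of "samtools view -H in.bam" to a file, pass its contents through this function, and then run "samtools reheader new_header.sam in.bam > out.bam" (the file names here are placeholders).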
moty replied

03-22-2012, 01:50 AM
I have no need for the chrM reads at all, so removing them might be a good option.
I tried using

samtools view -b chr1 .. chrY
samtools index..

but it still gave me the same error. How would you remove it?
lh3 replied

03-20-2012, 12:53 PM
For human alignment, the 1000g phase 1 reference is the most widely used, by nearly all the human projects involving Sanger, Broad and UMich. It is available from the 1000g website, the GATK bundle and the Sanger FTP site that others have pointed out. If possible, try to use that. It is not so trivial to build the right reference genome, though for most people this has little practical effect.
Richard Finney replied

03-20-2012, 11:47 AM
This is the "new chrM plus" GRCh37 problem. I'm sure others have shorter, more Germanic descriptions for it. Many hours have been spent dealing with this important "forking" of the data.

Basically, there are two chrM's in common use for hg19/GRCh37 analysis.
You can delete chrM from your analysis or get the version that matches your data.

I did a bl2seq on the two chrM's and there wasn't much difference: one had 3 inserts and the other had 1, for a difference of 2 (which you see in the file size difference).

see:
ftp://ftp.sanger.ac.uk/pub/1000genom...ference/README

note comments on NC_012920

I hope that when GRCh38/hg20 comes out, everybody just sticks with the snapshot.
moty replied

03-20-2012, 11:14 AM
Well, I've done that but, sadly, now it says

##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: Found contigs with the same name but different lengths:
##### ERROR contig reads = chrM / 16571
##### ERROR contig reference = chrM / 16569.

Thanks for all the help thus far, all of you.

Has anyone ever had this?
Richard Finney replied

03-20-2012, 08:45 AM
Yes, manually changing "1" --> "chr1" and so on will solve your problem. Writing a script is an even better way. You might even strip the GL0*, etc. entries and just keep chr1-22,X,Y,M to keep things simple (since your reads were only aligned to chr1-22,X,Y,M).
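
The script suggested above might look roughly like the following. This is a sketch, not the poster's own code: the names `to_ucsc_name` and `rename_fasta`, and the choice to map "MT" to "chrM", are assumptions made for illustration. It renames 1-22, X, Y and MT, and drops every other record (the GL0* contigs).

```python
import re

# Matches the GRCh37-style names of the primary chromosomes.
_PRIMARY = re.compile(r"([1-9]|1[0-9]|2[0-2]|X|Y)")


def to_ucsc_name(name):
    """Map a GRCh37-style contig name to a chr-prefixed one.

    Returns None for contigs that should be dropped (GL0* and others).
    Mapping "MT" to "chrM" is an assumption of this sketch.
    """
    if name == "MT":
        return "chrM"
    if _PRIMARY.fullmatch(name):
        return "chr" + name
    return None


def rename_fasta(lines):
    """Yield FASTA lines with renamed headers, skipping dropped records."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            new = to_ucsc_name(parts[0]) if parts else None
            keep = new is not None
            if keep:
                yield ">" + new + "\n"
        elif keep:
            yield line if line.endswith("\n") else line + "\n"


if __name__ == "__main__":
    import sys
    # Usage (file names are placeholders):
    #   python rename_fasta.py < reference.fa > reference.chr.fa
    sys.stdout.writelines(rename_fasta(sys.stdin))
```

Renaming only changes the labels, not the sequences, so a length difference such as the chrM one reported in this thread (16571 vs 16569) would remain. After renaming, the FASTA index would need to be rebuilt, for example with "samtools faidx", along with any sequence dictionary the downstream tool expects.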
moty replied

03-20-2012, 08:32 AM
I've tried that one, this time I get:

##### ERROR reads contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM]
##### ERROR reference contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, MT, GL000207.1, GL000226.1, GL000229.1,...ETC

Is manually renaming the reference contigs from 1 to chr1 and so on worth a shot?
nexgengirl replied

03-20-2012, 03:25 AM
Yes, the Broad FTP is good, or try here as well:

ftp://ftp.sanger.ac.uk/pub/1000genom...ect_reference/
moty replied

03-20-2012, 12:38 AM
Thank you very much for your help. I did manage to get that reference, but apparently that wasn't enough.
I know this question has been asked before, but I never found a solution that helped me. I am getting this error:

##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: Order of contigs differences, which is unsafe.
##### ERROR reads contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22, chrX, chrY, chrM]
##### ERROR reference contigs = [chr1, chr2, chr3, chr4, chr5, chr6, chr7, chrX, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr20, chrY, chr19, chr22, chr21, chr6_ssto_hap7, chr6_mcf_hap5, chr6_cox_hap2, chr6_mann_hap4, chr6_apd_hap1, chr6_qbl_hap6, chr6_dbb_hap3, chr17_ctg5_hap1, chr4_ctg9_hap1, chr1_gl000192_random, chrUn_gl000225, chr4_gl000194_random, chr4_gl000193_random, chr9_gl000200_random, chrUn_gl000222, chrUn_gl000212, chr7_gl000195_random, chrUn_gl000223, chrUn_gl000224, chrUn_gl000219, chr17_gl000205_random, chrUn_gl000215, chrUn_gl000216, chrUn_gl000217, chr9_gl000199_random, chrUn_gl000211, chrUn_gl000213, chrUn_gl000220, chrUn_gl000218, chr19_gl000209_random, chrUn_gl000221, chrUn_gl000214, chrUn_gl000228, chrUn_gl000227, chr1_gl000191_random, chr19_gl000208_random, chr9_gl000198_random, chr17_gl000204_random, chrUn_gl000233, chrUn_gl000237, chrUn_gl000230, chrUn_gl000242, chrUn_gl000243, chrUn_gl000241, chrUn_gl000236, chrUn_gl000240, chr17_gl000206_random, chrUn_gl000232, chrUn_gl000234, chr11_gl000202_random, chrUn_gl000238, chrUn_gl000244, chrUn_gl000248, chr8_gl000196_random, chrUn_gl000249, chrUn_gl000246, chr17_gl000203_random, chr8_gl000197_random, chrUn_gl000245, chrUn_gl000247, chr9_gl000201_random, chrUn_gl000235, chrUn_gl000239, chr21_gl000210_random, chrUn_gl000231, chrUn_gl000229, chrM, chrUn_gl000226, chr18_gl000207_random]
##### ERROR ------------------------------------------------------------------------------------------

any idea what could be done?
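
The kinds of mismatch that come up in this thread (different names, different order, different lengths) can be checked before running GATK. The following is a rough sketch and not GATK's actual validation logic: the function name `diagnose_contigs` is hypothetical, and it assumes you have already extracted ordered (name, length) pairs from the BAM header @SQ lines and from the reference's .fai index or sequence dictionary.

```python
def diagnose_contigs(reads, reference):
    """Compare two ordered lists of (contig_name, length) pairs.

    Returns a list of human-readable problems; an empty list means the
    read contigs are all present in the reference with the same lengths
    and in the same relative order.
    """
    issues = []
    ref_len = dict(reference)
    read_names = [name for name, _ in reads]
    ref_names = [name for name, _ in reference]

    # Names used by the reads but absent from the reference (e.g. chr1 vs 1).
    missing = [name for name in read_names if name not in ref_len]
    if missing:
        issues.append("missing from reference: " + ", ".join(missing))

    # Same name, different length (e.g. the two chrM versions).
    for name, length in reads:
        if name in ref_len and ref_len[name] != length:
            issues.append(
                f"length mismatch for {name}: "
                f"reads={length}, reference={ref_len[name]}"
            )

    # Relative order of the contigs that both sides share.
    shared = set(read_names) & set(ref_names)
    if [n for n in read_names if n in shared] != [n for n in ref_names if n in shared]:
        issues.append("shared contigs appear in a different order")

    return issues


if __name__ == "__main__":
    # Toy lengths for chr1/chr2; the chrM lengths are the two from this thread.
    reads = [("chr1", 100), ("chr2", 90), ("chrM", 16571)]
    reference = [("chr2", 90), ("chr1", 100), ("chrM", 16569)]
    for problem in diagnose_contigs(reads, reference):
        print(problem)
```

An empty result does not guarantee that the sequences themselves are identical; it only rules out the three header-level problems listed above.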
neha replied

03-14-2012, 03:04 AM
You can download the GRCh37 files from the UCSC Genome Browser or the Broad Institute FTP site.
Jon_Keats replied

03-13-2012, 08:56 PM
The GATK FTP site has all of the files you will need; just look in the bundle folders.
moty started a topic need GRCh37 database

03-13-2012, 07:06 AM
need GRCh37 database

I am trying to find variants in some BAM files I got, but GATK requires the exact reference that was used for the alignment.
Apparently it is GRCh37. Any idea how I can download it?
I have downloaded a file called homo-cre-GRCh37.zip containing a bunch of homo-##.#.ebwt files. Does this help me in any way?
UnifiedGenotyper needs a FASTA, as far as I know.

A little confused here, would love some help.
Thanks
Moty
Tags: database, download, gatk, grch37, hg19
