Hi,
I have some Illumina TruSeq exome data and I want to use the Picard tool CalculateHsMetrics.jar to assess the hybrid selection. I downloaded the TruSeq BED file from http://www.illumina.com/support/sequ...downloads.ilmn and the reference file for mapping from ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz
My question is about the format of the interval file. I've looked at http://www.broadinstitute.org/gsa/wi...s_for_the_GATK and used that as a template for my header, but Picard complains about this header and reports that the sequence dictionaries are not the same size.
Here's what my interval_list file looks like:
@HD VN:1.0 SO:coordinate
@SQ SN:1 LN:249250621 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:1b22b98cdeb4a9304cb5d48026a85128
SP:Homo Sapiens
@SQ SN:2 LN:243199373 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:a0d9851da00400dec1098a9255ac712e
SP:Homo Sapiens
@SQ SN:3 LN:198022430 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:fdfd811849cc2fadebc929bb925902e5
SP:Homo Sapiens
@SQ SN:4 LN:191154276 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:23dccd106897542ad87d2765d28a19a1
SP:Homo Sapiens
@SQ SN:5 LN:180915260 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:0740173db9ffd264d728f32784845cd7
SP:Homo Sapiens
@SQ SN:6 LN:171115067 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:1d3a93a248d92a729ee764823acbbc6b
SP:Homo Sapiens
@SQ SN:7 LN:159138663 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:618366e953d6aaad97dbe4777c29375e
SP:Homo Sapiens
@SQ SN:8 LN:146364022 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:96f514a9929e410c6651697bded59aec
SP:Homo Sapiens
@SQ SN:9 LN:141213431 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:3e273117f15e0a400f01055d9f393768
SP:Homo Sapiens
@SQ SN:10 LN:135534747 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:988c28e000e84c26d552359af1ea2e1d
SP:Homo Sapiens
@SQ SN:11 LN:135006516 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:98c59049a2df285c76ffb1c6db8f8b96
SP:Homo Sapiens
@SQ SN:12 LN:133851895 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:51851ac0e1a115847ad36449b0015864
SP:Homo Sapiens
@SQ SN:13 LN:115169878 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:283f8d7892baa81b510a015719ca7b0b
SP:Homo Sapiens
@SQ SN:14 LN:107349540 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:98f3cae32b2a2e9524bc19813927542e
SP:Homo Sapiens
@SQ SN:15 LN:102531392 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:e5645a794a8238215b2cd77acb95a078
SP:Homo Sapiens
@SQ SN:16 LN:90354753 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:fc9b1a7b42b97a864f56b348b06095e6
SP:Homo Sapiens
@SQ SN:17 LN:81195210 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:351f64d4f4f9ddd45b35336ad97aa6de
SP:Homo Sapiens
@SQ SN:18 LN:78077248 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c
SP:Homo Sapiens
@SQ SN:19 LN:59128983 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:1aacd71f30db8e561810913e0b72636d
SP:Homo Sapiens
@SQ SN:20 LN:63025520 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:0dec9660ec1efaaf33281c0d5ea2560f
SP:Homo Sapiens
@SQ SN:21 LN:48129895 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:2979a6085bfe28e3ad6f552f361ed74d
SP:Homo Sapiens
@SQ SN:22 LN:51304566 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:a718acaa6135fdca8357d5bfe94211dd
SP:Homo Sapiens
@SQ SN:X LN:155270560 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:7e0e2e580297b7764e31dbc80c2540dd
SP:Homo Sapiens
@SQ SN:Y LN:59373566 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:1fa3474750af0948bdf97d5a0ee52e51
SP:Homo Sapiens
@SQ SN:MT LN:16569 AS:GRCh37 UR:ftp://ftp.sanger.ac.uk/pub/1000genom...k_v37.fasta.gz M5:c68f52674c9fb33aef52dcf399755519
SP:Homo Sapiens
1 14362 14829 + chr1:14363-14829:WASH5P
1 14969 15038 + chr1:14970-15038:WASH5P
1 15795 15947 + chr1:15796-15947:WASH5P
1 16606 16765 + chr1:16607-16765:WASH5P
1 16857 17055 + chr1:16858-17055:WASH5P
1 17232 17368 + chr1:17233-17368:WASH5P
1 17605 17742 + chr1:17606-17742:WASH5P
1 69090 70008 + chr1:69091-70008:OR4F5
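A quick way to sanity-check a file like the one above is a small script. The sketch below is only an illustration, not something Picard provides. It assumes the usual interval_list layout: a SAM-style header whose lines start with `@` and have tab-separated fields, followed by records with five tab-separated columns (sequence name, 1-based start, end, strand, interval name). The function name `check_interval_list` and the rule that start must not exceed end are my own choices for this example.

```python
import sys


def check_interval_list(lines):
    """Return a list of (line_number, message) for lines that look malformed.

    Assumed layout (not taken from the post): header lines start with '@' and
    use tabs between fields; every other non-empty line is an interval record
    with exactly five tab-separated columns.
    """
    problems = []
    seen_interval = False
    for n, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            if seen_interval:
                problems.append((n, "header line after interval records"))
            if len(fields) < 2:
                problems.append((n, "header fields are not tab-separated"))
            continue
        if len(fields) != 5:
            problems.append(
                (n, f"expected 5 tab-separated columns, found {len(fields)}")
            )
            continue
        seen_interval = True
        _, start, end, strand, _ = fields
        if not (start.isdigit() and end.isdigit()):
            problems.append((n, "start and end must be integers"))
        elif int(start) < 1 or int(start) > int(end):
            # Assumption of this sketch: coordinates are 1-based and start <= end.
            problems.append((n, "start must be >= 1 and <= end"))
        if strand not in ("+", "-"):
            problems.append((n, "strand must be '+' or '-'"))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_interval_list.py my.interval_list
    with open(sys.argv[1]) as handle:
        for number, message in check_interval_list(handle):
            print(f"line {number}: {message}")
```

Run against a file, it prints one line per problem. A stray line such as `SP:Homo Sapiens` sitting on its own, or a header that uses spaces instead of tabs, would be reported by this check.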
I'm stuck, so any help would be appreciated. Thanks.