I'm trying to put together genotype data for wheat samples from this resource:
The two types of file I get look like this (array probe set):
Code:
Affymetrix Code,Bristol Affy Code,Bristol SNP Code,Bristol Contig Code,Sequence (including SNP ambiguity code)
AX-94381124,BA00222391,No BS code,GJKKTUG01CGTKM_219,...ACGCA[R]ACNTC...
AX-94381126,BA00232763,No BS code,FZU8VVO01APRT5_148,...TCACT[Y]GNGTG...
AX-94381127,BA00233916,No BS code,contig77177_250,...CCCGA[M]NCGAC...
...
Code:
A_mutica,A_speltoides,Adhoc,Ae_caudata,Akteur,A...
AX-94381124,1,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...
AX-94381126,1,2,2,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2
AX-94381127,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2...
...
0 = AA, 1 = AB, 2 = BB, -1 = No call
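So the question is: what information is required to map a genotype 'score' to the actual genotype? Heterozygous calls are unambiguous, since AB is the same pair of bases whichever base is labelled A. For homozygous calls, though, I need to know which base is allele A and which is allele B. For example, is A. speltoides AA or GG at AX-94381124?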
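To make the missing piece concrete, here is a minimal Python sketch of the decoding I have in mind. The IUPAC table is standard nomenclature, but the function names (snp_alleles, decode) are just illustrative, and the choice of which base is allele A is passed in explicitly because that is exactly what I don't know. The alphabetical ordering used in the example call is only an assumption, not something the files confirm.

```python
import re

# Standard IUPAC two-base ambiguity codes.
IUPAC = {
    "R": ("A", "G"), "Y": ("C", "T"), "M": ("A", "C"),
    "K": ("G", "T"), "S": ("C", "G"), "W": ("A", "T"),
}


def snp_alleles(sequence):
    """Return the two bases encoded by the bracketed ambiguity code."""
    match = re.search(r"\[([A-Z])\]", sequence)
    if match is None or match.group(1) not in IUPAC:
        raise ValueError("no two-base ambiguity code found in: " + sequence)
    return IUPAC[match.group(1)]


def decode(score, allele_a, allele_b):
    """Map 0/1/2/-1 to a genotype string, given a known A/B assignment."""
    if score == -1:
        return None  # no call
    return {
        0: allele_a + allele_a,
        1: allele_a + allele_b,
        2: allele_b + allele_b,
    }[score]


# ASSUMPTION: allele A is the alphabetically first base of the pair.
# Nothing in the two files shown here confirms this ordering.
allele_a, allele_b = snp_alleles("...ACGCA[R]ACNTC...")
print(decode(0, allele_a, allele_b))
```

If the real convention is the other way round, swapping the last two arguments to decode flips every homozygous call while leaving the heterozygous ones effectively unchanged, which is why the score file on its own isn't enough.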
Are these actually Affymetrix Axiom format files, or just something custom?
Thanks for any help,
Dan.