I have some SOLiD exome sequence data, including small indel calls. For a single exome (~38 Mb target) I get ~1000 indels called, of which ~150 map to exons, and of those ~20 are present in dbSNP129.
I don't want to believe that there are ~130 genes blighted with novel indels in this exome. It seems unlikely, because when I compare multiple exomes the same gene names crop up repeatedly.
How best to sort the wheat from the chaff?
BTW, the ABI small indel tool was run with default parameters, i.e. a minimum of 2 reads to call an indel, and a maximum (normal read coverage)/(indel read coverage) ratio of < 12, because false positives are more likely at high coverage, etc.
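To make the two thresholds above concrete, here is a minimal Python sketch of how I understand them, plus a check for genes that recur across exomes. This is not the ABI tool's code: the `IndelCall` record, the function names and the `min_exomes` cutoff are all hypothetical, and I am assuming the coverage rule means the ratio must be strictly below 12.

```python
from dataclasses import dataclass
from collections import Counter


@dataclass
class IndelCall:
    """Hypothetical record for one indel call (not the ABI tool's format)."""
    gene: str
    indel_reads: int   # reads supporting the indel
    normal_reads: int  # reads at the site that do not show the indel


def passes_default_filters(call, min_indel_reads=2, max_ratio=12.0):
    """Apply the two default thresholds as described in the post (assumed reading)."""
    if call.indel_reads < min_indel_reads:
        return False
    # Assumption: the call is kept only if normal/indel coverage is below max_ratio.
    return call.normal_reads / call.indel_reads < max_ratio


def recurrent_genes(gene_sets, min_exomes=2):
    """Return genes hit by indels in at least min_exomes separate exomes."""
    counts = Counter(gene for genes in gene_sets for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_exomes}
```

Genes returned by `recurrent_genes` across unrelated exomes would be the first candidates to inspect as possible systematic artifacts rather than real variants.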