GATK gave me around 3.2 million SNPs. I overlapped them with the regions identified by RepeatMasker and found that half fall within RepeatMasker-defined repeats. I am wondering whether it is necessary to exclude the SNPs flagged by RepeatMasker, or whether I should just go with the SNPs that passed the GATK filters. What is the common practice regarding this issue?
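To make the overlap step concrete, here is a minimal sketch of that kind of check in plain Python. It is only an illustration and not necessarily how the overlap was actually computed. It assumes the RepeatMasker regions are available as BED-style intervals (0-based, half-open) and the SNPs as 1-based VCF positions. The function names `build_repeat_index`, `in_repeat`, and `split_snps` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_repeat_index(bed_intervals):
    """Merge overlapping repeat intervals per chromosome.

    bed_intervals: iterable of (chrom, start, end), assumed BED-style
    (0-based start, end exclusive).
    Returns {chrom: (sorted_starts, merged_intervals)}.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in bed_intervals:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def in_repeat(index, chrom, pos):
    """Return True if a 1-based SNP position lies inside a repeat."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    pos0 = pos - 1  # convert 1-based VCF position to 0-based
    i = bisect_right(starts, pos0) - 1
    return i >= 0 and pos0 < merged[i][1]


def split_snps(snps, index):
    """Partition (chrom, pos) SNPs into those inside and outside repeats."""
    inside, outside = [], []
    for chrom, pos in snps:
        target = inside if in_repeat(index, chrom, pos) else outside
        target.append((chrom, pos))
    return inside, outside
```

Keeping the two sets separate, rather than discarding the repeat-overlapping SNPs straight away, makes it possible to compare them (for example on depth or mapping quality) before deciding whether to exclude them.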
Thank you for your advice