Dear all,
I have obtained RAD-tag sequences from a set of 80 samples from a wild mammal, 40 from males and 40 from females, in an attempt to isolate some Y chromosome sequences and do some male-specific population genetic analyses. I've also produced a reference genome from a male individual, assembled de novo because no reference was available. What I've done so far is map all the males and all the females to the reference separately using BWA and samtools, producing two VCF files, one for all the males and one for all the females. I then compared the two VCF files using vcftools. My expectation was to find a set of SNPs that appear only in males, and those would be good candidates to be located on the Y. So far so good.
The problem is the output. It turns out that I have more SNPs that are exclusive to females than SNPs that are exclusive to males, which I wouldn't expect. And when I BLAST the contigs to which only male-exclusive SNPs mapped, only one actually hits the Y chromosome of a closely related species. All the others hit genes that are fairly conserved among mammals and are always located outside the Y chromosome.
So my question is: any ideas on what I'm doing wrong? Conceptually, this approach should work, but it does not. Also, any ideas on what else I could do to isolate Y chromosome sequences from the data I have? Any help would be very much appreciated.
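The comparison step amounts to a set difference between the variant sites in the two VCF files. Below is a minimal Python sketch of that logic, meant as an illustration rather than the vcftools run itself. The function names and the choice to key each site on (contig, position, REF, ALT) are assumptions, and genotype columns are ignored.

```python
from typing import Iterable, Set, Tuple

# A variant site keyed by contig, 1-based position, REF allele and one ALT allele.
Site = Tuple[str, int, str, str]


def read_sites(vcf_lines: Iterable[str]) -> Set[Site]:
    """Collect variant sites from the lines of an uncompressed VCF file."""
    sites: Set[Site] = set()
    for line in vcf_lines:
        # Skip blank lines and header lines ("##meta" and "#CHROM ...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic records list ALT alleles separated by commas.
        for allele in alt.split(","):
            if allele != ".":
                sites.add((chrom, pos, ref, allele))
    return sites


def sex_exclusive(
    male_lines: Iterable[str], female_lines: Iterable[str]
) -> Tuple[Set[Site], Set[Site]]:
    """Return (male-only sites, female-only sites) as set differences."""
    males = read_sites(male_lines)
    females = read_sites(female_lines)
    return males - females, females - males
```

With two files on disk (the file names here are hypothetical), this could be called as `sex_exclusive(open("males.vcf"), open("females.vcf"))`. Note that a site absent from one VCF may simply have lacked coverage or failed calling in that group, so this comparison alone does not distinguish "not variable" from "not genotyped".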
Many thanks.