Greetings.
My issue:
I need the SNP-difference(s) between clone 1 and clone 2 of a haploid eukaryote. I do not have an assembled genome of this species for mapping, but there is one that is closely related with 100% synteny (I'll call it X, here).
I have Illumina 50 bp paired-end reads. I tried mapping clone 1 and clone 2 to X separately, extracting the SNPs from each, removing the intersection of the two sets, and keeping only the SNPs unique to one clone. But there are too many SNPs (1/100 bp), and the sequence quality differs problematically between the samples (the clone 1 sample is gorgeous; clone 2 is a little iffy but has higher coverage). As a result, the SNP list for clone 2 is more than twice as long as the one for clone 1, yet the spurious SNPs have > 100 coverage and Q scores over 100 in many cases.
(Pipeline = BWA -> sampe -> SAMtools mpileup -> BCFtools vcfutils.pl -> SNPs)
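A minimal sketch of the set subtraction described above, assuming each clone's calls are available as tab-separated VCF lines. The function names `load_snps` and `private_snps` and the toy records are hypothetical, for illustration only; this is not the actual script used.

```python
def load_snps(vcf_lines):
    """Collect single-base substitutions as (chrom, pos, ref, alt) tuples."""
    snps = set()
    for line in vcf_lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # Keep SNPs only; indels and multi-allelic records are ignored here.
        if len(ref) == 1 and len(alt) == 1 and alt != ".":
            snps.add((chrom, pos, ref, alt))
    return snps


def private_snps(snps_a, snps_b):
    """Drop the shared calls and return (only in a, only in b)."""
    return snps_a - snps_b, snps_b - snps_a


# Toy records (made up) standing in for the two per-clone VCF files.
clone1_vcf = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t99\t.\t.",
    "chr1\t250\t.\tC\tT\t80\t.\t.",
]
clone2_vcf = [
    "chr1\t100\t.\tA\tG\t120\t.\t.",
    "chr1\t400\t.\tG\tA\t150\t.\t.",
]

only_clone1, only_clone2 = private_snps(load_snps(clone1_vcf), load_snps(clone2_vcf))
print(sorted(only_clone1))
print(sorted(only_clone2))
```

One caveat with this kind of subtraction: a site that is missing from one list may simply have been uncalled or filtered out in that sample, rather than matching the reference, so it would still show up as "private" to the other clone.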
I'm thinking of assembling one clone to X, exporting the result as a new reference sequence, and then mapping clone 2 to that, cutting out the middle man. I am the only one in my immediate area working on this and I am just a mapping monkey. I don't know much about how to assemble a new genome and use it as a reference, so any advice on how to do this, or on alternative ways of solving the Find the SNPs between Clone 1 and Clone 2 Problem, is much appreciated.
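To make the "new reference" idea concrete, here is a rough sketch of substituting one clone's SNP calls into X to get a clone-specific reference. It assumes substitutions only (no indels), so coordinates stay identical to X. The function name `apply_snps` and the toy sequence are hypothetical, and a real pipeline would read FASTA and VCF files rather than literals.

```python
def apply_snps(reference, snps):
    """Return a copy of `reference` with each (pos, ref, alt) SNP substituted.

    Positions are 1-based, as in VCF. Only single-base substitutions are
    handled, so the output has the same length and coordinates as the input.
    """
    seq = list(reference)
    for pos, ref, alt in snps:
        idx = pos - 1
        # Guard against applying calls to the wrong sequence or coordinate.
        if seq[idx].upper() != ref.upper():
            raise ValueError(f"reference base mismatch at position {pos}")
        seq[idx] = alt
    return "".join(seq)


# Toy fragment (made up) standing in for one contig of X.
x_fragment = "ACGTACGT"
clone1_calls = [(2, "C", "T"), (8, "T", "A")]

clone1_reference = apply_snps(x_fragment, clone1_calls)
print(clone1_reference)  # ATGTACGA
```

Mapping clone 2 against a sequence built this way would, in principle, report only the sites where it differs from clone 1, although any wrong calls in clone 1 are carried into the new reference.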