Hi
I have three tagged populations, each containing ten individuals that are not individually tagged.
We used 454 Titanium sequencing on a non-normalized cDNA library.
The aim of our study is to select SNPs for a genome scan (Fst scan) in order to detect selection.
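To make the goal concrete, here is a minimal sketch of a per-SNP Fst calculation, assuming a biallelic SNP and equal weighting of populations. The function name `simple_fst` and its input layout are hypothetical and not part of MIRA; this is the basic (Ht - Hs) / Ht form, not a sample-size-corrected estimator.

```python
def simple_fst(allele_freqs):
    """Basic Fst for one biallelic SNP.

    allele_freqs: frequency of the same allele in each population,
    e.g. [0.2, 0.5, 0.8] for three populations (hypothetical input).
    """
    if not allele_freqs:
        raise ValueError("need at least one population")
    n = len(allele_freqs)
    # Mean allele frequency across populations (equal weights assumed).
    p_bar = sum(allele_freqs) / n
    # Expected heterozygosity in the total population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        # Site is monomorphic overall: no differentiation to measure.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / n
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(simple_fst([0.2, 0.5, 0.8]))
```

With allele frequencies estimated from read counts in pooled cDNA, expression differences between individuals can bias the frequencies, so values from such a sketch would only be a first screen.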
I chose MIRA for the assembly, but I'm not sure I am using it properly. So far I have run assemblies with the default parameters of miraESTSNPs -job=denovo,est,454. I have only tested changing -AL:mrs (default = 80, tested 70 and 90) and -AL:ms (default = 15, tested 10 and 20).
For information, the expected average identity between two alleles is about 95%.
I need good coverage because, for a SNP to be validated, it has to be observed at least 10 times (20 times would be better).
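A minimal sketch of this kind of filter, assuming the bases covering one candidate position are already available as a string or list, and reading "observed at least 10 times" as "each of at least two alleles is supported by at least 10 reads". The function name and input layout are hypothetical and do not correspond to any MIRA output format.

```python
from collections import Counter

VALID_BASES = {"A", "C", "G", "T"}


def passes_min_allele_support(bases, min_count=10):
    """Return True if at least two alleles each have >= min_count reads.

    bases: iterable of single-character base calls covering one position
    (hypothetical input, e.g. extracted from an alignment column).
    """
    # Count only unambiguous base calls; gaps and Ns are ignored.
    counts = Counter(b.upper() for b in bases if b.upper() in VALID_BASES)
    supported = [allele for allele, n in counts.items() if n >= min_count]
    return len(supported) >= 2


if __name__ == "__main__":
    column = "A" * 12 + "G" * 10 + "N" * 3
    print(passes_min_allele_support(column))                # threshold of 10
    print(passes_min_allele_support(column, min_count=20))  # stricter threshold
```

Under this reading, a position needs a total depth of at least 20 reads to pass the 10-read threshold, which is well above the average coverage values reported below.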
There are a lot of parameters that can be modified, but I'm not sure which ones I should focus on for my study.
Could somebody give me some advice?
For information, here are a few stats from each run.

Default parameters + ms=10:
Num. reads assembled: 360376
Num. singlets: 4008
Num. contigs >= 500 bases: 5945
Total consensus: 5677061
Max coverage: 842
Avg. coverage: 11.53

Default parameters:
Num. reads assembled: 360240
Num. singlets: 4008
Num. contigs >= 500 bases: 9119
Total consensus: 8368209
Max coverage: 840
Avg. coverage: 8.73

Default parameters + ms=20:
Num. reads assembled: 360103
Num. singlets: 3936
Num. contigs >= 500 bases: 6001
Total consensus: 5724129
Max coverage: 795
Avg. coverage: 11.87

Default parameters + mrs=90:
Num. reads assembled: 308455
Num. singlets: 4039
Num. contigs >= 500 bases: 0
Total consensus: 0
Max coverage: 0
Avg. coverage: 0

Default parameters + mrs=70:
Num. reads assembled: 383220
Num. singlets: 7495
Num. contigs >= 500 bases: 16355
Total consensus: 14075898
Max coverage: 851
Avg. coverage: 6.27