NOTE: I tried to explain my question better here: http://seqanswers.com/forums/showthread.php?t=65681
Dear all,
I have some RAD-seq data from nematodes (2 parents, ~100 progeny) which I want to use to cluster my genomic scaffolds into linkage groups (ideally each group would be a chromosome).
The genome has been frozen and is now annotated.
I performed the following steps:
1- Quality control of the RAD-seq raw data (fastqc)
2- Demultiplexed the read files with process_radtags, from Stacks (-> one file of reads per individual)
3- Mapped the RAD-tags to the genome (bowtie2 + samtools -> BAM alignments, sorted and indexed; I then converted the BAM files to SAM)
4- Used the Stacks pipeline ref_map.pl to build loci, call variants and construct a catalog of loci. I then ran the Stacks program "genotypes", specifying "onemap" as the output format
5- With the markers in hand (in onemap format), I ran the following steps in R:
> library('onemap')
> data_onemap <- read.outcross("/PATH/TO/FOLDER", "batch_3.genotypes_5onemap.tsv")
> twopts <- rf.2pts(data_onemap)
> mark.all <- make.seq(twopts, "all")
> marker.type(mark.all) # I have a LOT of D-type markers (and a few A and B types)
> LGs <- group(mark.all, LOD=6, max.rf=0.4)
-> With these LOD and max.rf parameters I get only 1 linkage group, which contains 70% of my markers. I thought I was getting only 1 group because the level of polymorphism between my 2 parental strains is low (0.10%).
I therefore increased the LOD threshold to 19 and was able to get 5 groups. But these 5 groups contain only 18% of my markers (330 out of a total of 1814).
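To put the two LOD thresholds I tried in perspective, here is a rough base-R sketch (it does not use onemap). It assumes the simplest possible case: a backcross-like pair of markers with known phase, where all ~100 progeny are informative. The function name lod_two_point and the variable names are made up for illustration, and onemap's own two-point estimates for outcross markers of mixed types may well differ from this.

```r
# Two-point LOD score under a simplifying ASSUMPTION: backcross-like marker
# pair, phase known, every progeny informative.
#   n = number of progeny scored, k = number of recombinant progeny
# LOD = log10( L(r_hat) / L(r = 0.5) ), with r_hat = k / n
lod_two_point <- function(n, k) {
  r <- k / n  # maximum-likelihood estimate of the recombination fraction
  # count * log10(p) is 0 * -Inf = NaN when count == 0, so treat that term as 0
  term <- function(count, p) ifelse(count == 0, 0, count * log10(p))
  n * log10(2) + term(k, r) + term(n - k, 1 - r)
}

n_progeny <- 100          # roughly the size of my cross
k <- 0:50                 # recombinant counts from r = 0 up to r = 0.5
lod <- lod_two_point(n_progeny, k)

# Largest number of recombinants that still passes each threshold I tried
thresholds <- c(6, 19)
max_k <- sapply(thresholds, function(t) max(k[lod >= t]))

print(data.frame(LOD_threshold    = thresholds,
                 max_recombinants = max_k,
                 max_rf           = max_k / n_progeny))
```

Under these assumptions the highest LOD any pair can reach with 100 progeny is 100 * log10(2), about 30.1, so a threshold of 19 only keeps pairs of markers that are very tightly linked.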
Is this a problem? Do you think my pipeline to build a genetic map out of RAD-seq data is correct? Is it too weak?
Any help welcome on how to build a genetic map using RAD-seq!
Cheers!