Hi all,
We're currently working on assembling a ~600 Mb genome using PacBio reads from sperm DNA. We're using 3 different libraries with insert sizes ranging from 10 to 13 kb and have ~100x coverage in total; even with a read-length cut-off of 15 kb we still have ~50x. Our current assembly, which includes reads down to 10 kb, more than doubled the contig N50 from ~200 kb to ~500 kb after scaffolding, but it also increased the total assembly size considerably (from 500 Mb to 690 Mb).
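In case anyone wants to check numbers like these on their own data, here is a minimal Python sketch of the two quantities I'm quoting: coverage at a given read-length cut-off, and N50. This is not our actual pipeline; the function names and the toy values are made up purely for illustration.

```python
def coverage_above_cutoff(read_lengths, genome_size, min_length):
    """Depth contributed by reads of at least min_length bases."""
    kept_bases = sum(n for n in read_lengths if n >= min_length)
    return kept_bases / genome_size


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for n in ordered:
        running += n
        if running >= half_total:
            return n
    return 0  # empty input


# Toy values only (hypothetical, not our data).
toy_reads = [5_000, 12_000, 16_000, 20_000]
toy_contigs = [100, 200, 300, 400]

print(coverage_above_cutoff(toy_reads, genome_size=1_000, min_length=15_000))
print(n50(toy_contigs))
```

Lowering `min_length` can only add bases, so coverage goes up, but that says nothing about whether the extra shorter reads help or hurt contiguity.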
We have just started evaluating the assemblies, but I was expecting larger N50s given the sequencing depth. One thing I have been wondering about from the beginning is whether recombination events present in the sperm DNA are frequent enough to interfere with the assembly and, if so, whether Falcon is able to resolve these conflicts based on coverage information. I'd assume that the overlap filtering settings would have trouble removing these regions unless Falcon calculates coverage on a per-haplotype basis (i.e. coverage in haplotype context).
Unfortunately, I couldn't find any information on this. Has anyone used sperm DNA for assembly before, or does anyone know how Falcon would deal with such "pseudo-chimeric" reads from recombined loci?
Cheers,
Zapp