Hi, I'm trying to assemble some 100 bp paired-end Illumina reads using CA 7.0.

The run did not fail, but it produced no contigs; 70% of my reads did, however, end up in degenerate contigs.

Is this expected because of (a) short read lengths, (b) replicate sequences, (c) variation in the natural population, or (d) all of the above (plus other stuff not listed)?

I did an assembly of a subset of this data, and it went together quite nicely.
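
In case it's relevant, the reads went in via the usual fastqToCA -> runCA route, roughly like this (library name, insert size, and file names below are placeholders, not the exact values):

Code:
fastqToCA -libraryname LIB1 -technology illumina -insertsize 300 30 \
          -mates reads_1.fastq,reads_2.fastq > lib1.frg

runCA -d run1 -p asm -s asm.spec lib1.frg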

thanks,
Bill

specfile:
Code:
fakeUIDs      = 1  # default 0; use internally generated UIDs instead of a UID server

#  TRIMMING
vectorTrimmer  =  ca  # default ca; others figaro, umd
doOverlapBasedTrimming = 1 # default 1
obtOverlapper   =  ovl  # overlapper used for the trimming overlaps
mbtConcurrency = 4      # mer-based trimming jobs to run at once
mbtThreads = 4          # threads per mer-based trimming job

#  MERYL configuration
merylMemory   = 2000   # in MB; default 800
merylThreads  = 4      # default 1

# OVERLAPPER configuration
ovlOverlapper    = ovl      # default is ovl
merSize       = 14 # default is 22

utgErrorRate=0.03
utgErrorLimit=2.5  # Allow mismatches over and above the utgErrorRate. This helps with Illumina reads.
ovlErrorRate=0.05 # Larger than utg to allow for correction.
cnsErrorRate=0.06 # Larger than utg to avoid occasional consensus failures
cgwErrorRate=0.06 # Larger than utg to allow contig merges across high-error ends

gkpFixInsertSizes      = 1  # let gatekeeper reset insert-size estimates that look unreasonable

doDeDuplication        = 1       # remove duplicate reads during trimming
doChimeraDetection     = normal  # detect and split/trim chimeric reads

frgCorrConcurrency=4  # concurrent fragment error-correction jobs
ovlCorrConcurrency=4  # concurrent overlap error-correction jobs

unitigger     = bog    # Best Overlap Graph unitigger
utgBubblePopping = 1   # pop bubble unitigs back into the main unitigs
utgGenomeSize=1500000  # <- !!! I accidentally left this in from a different run

doResolveSurrogates=1  # place reads from surrogate unitigs into contigs where possible
doToggle = 1           # mark large surrogates as unique and rerun scaffolding