Hello,
I have been given a set of 454 data that was assembled using Newbler. It covers a repetitive region from a relative of grape, with lots of transposons.
The current assembly has misassemblies due to the repeats, and I'd like to take the original reads, clean them up and reassemble them.
There are about 40000 reads from 454 and maybe 2000 reads from Sanger sequencing of the two BACs that cover this region. I might be able to use paired sequence information for the Sanger reads, but I'm told by the sequencing facility that there is no paired sequence information to take advantage of with the 454 data.
I was thinking about trying to use RepeatMasker with the known vectors and plant repeat database and then using CAP3 or PCAP to assemble the result.
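To make the cleanup step concrete, here is a rough sketch of the kind of post-masking filter I have in mind. It assumes the masked reads are in FASTA format with masked bases written as N or X; the function names and the 50-base threshold are just placeholders I made up for illustration, not settings from any tool.

```python
from typing import Iterable, Iterator, Tuple

# Characters assumed to mark masked bases in the masked FASTA output.
MASK_CHARS = set("NnXx")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def unmasked_length(seq: str) -> int:
    """Count the bases that were not masked."""
    return sum(1 for base in seq if base not in MASK_CHARS)


def filter_reads(lines: Iterable[str], min_unmasked: int = 50) -> Iterator[Tuple[str, str]]:
    """Keep only reads with at least min_unmasked unmasked bases.

    min_unmasked is a hypothetical cutoff chosen for illustration.
    """
    for name, seq in parse_fasta(lines):
        if unmasked_length(seq) >= min_unmasked:
            yield name, seq
```

The idea would be to run something like this over the masked reads and pass only the surviving reads on to the assembler, so that reads that are almost entirely vector or repeat do not get assembled.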
(1) Does anyone know which of the publicly available assembly engines works best on 454 data?
(2) If you recommend using CAP3, which parameter settings would you modify from the default and what values would you use?
(3) Are there any other sequence cleaning utilities you'd recommend?
(4) When using RepeatMasker, is the cross_match engine better, or would you use RMBlast?
Thanks,
Steph