Hi, I have been working on a pipeline that takes bisulfite-treated reads and returns a useful methylation summary and output as simply as possible. I'm posting it here to get feedback. The best summary is the project page: http://github.com/brentp/methylcode
It's available for download
directly from the git repository: git clone git://github.com/brentp/methylcode.git
and via tarball: http://github.com/brentp/methylcode/tarball/master
You'll need:
* numpy from here: http://sourceforge.net/projects/numpy/files/
* cython from here: http://pypi.python.org/packages/sour...-0.12.1.tar.gz
* pyfasta from here: http://pypi.python.org/pypi/pyfasta/
* bowtie from here: http://bowtie-bio.sourceforge.net/index.shtml
MethylCoder uses the well-known method of converting all C's to T's in both the reads and the reference in order to map the bisulfite-treated reads. Bowtie is used to do the alignments. It requires a FASTQ file for input, but if you have raw reads, you can convert them to FASTQ, use a constant such as 'I' for the quality values, and adjust the bowtie params, and it will work fine.
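To make the two steps above concrete, here is a minimal sketch in Python. It is not MethylCoder's own code: the function names `c_to_t` and `raw_to_fastq` and the `read_N` record names are made up for illustration. It shows the C-to-T conversion applied to a sequence, and one way to wrap raw reads in FASTQ records with a constant 'I' quality string.

```python
def c_to_t(seq):
    """Replace every C with T (case preserved), as done before alignment.

    Hypothetical helper for illustration, not part of MethylCoder.
    """
    return seq.replace("C", "T").replace("c", "t")


def raw_to_fastq(reads, qual_char="I"):
    """Build FASTQ text from bare read sequences.

    Each record gets a made-up name (read_0, read_1, ...) and a quality
    line of qual_char repeated to the length of the read.
    """
    records = []
    for i, seq in enumerate(reads):
        seq = seq.strip()
        if not seq:
            continue  # skip blank lines in the raw input
        records.append("@read_%d\n%s\n+\n%s\n" % (i, seq, qual_char * len(seq)))
    return "".join(records)


if __name__ == "__main__":
    reads = ["ACGTCCGA", "TTCGAC"]
    print(raw_to_fastq(reads), end="")
    print(c_to_t(reads[0]))
```

The original (unconverted) read still has to be kept alongside the converted one, since the methylation call comes from comparing the original bases against the reference at the aligned position.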
We have been using it in the lab for quite a while. I have tested it against published analyses and other software, and the results match very closely while using less memory and less CPU time. Still, use at your own risk.
Currently, it does not handle paired-end reads. If someone needs this and provides me with a set of paired-end BS-treated reads, I will likely implement it.
I would appreciate any feedback in terms of usability or features.
This work is supported by the Fischer lab (http://epmb.berkeley.edu/facPage/dispFP.php?I=8) at UC Berkeley, but any problems are my fault. Please contact me directly with any questions or problems.