TL;DR
How do I `cimport HTSeq` in a Cython module?
--- Details ---
I am trying to use HTSeq to read quite large BAM files, but processing them in pure Python takes days. I therefore decided to use Cython and cythonize the part that reads the BAM file.
My system: macOS 10.13.3 High Sierra; Python 2.7.14 (I will port the code to 3.6 later); HTSeq 0.6.0.
I am now trying to replace `import HTSeq` with `cimport HTSeq` in the code below, but Cython cannot find an `HTSeq.pxd` file. I found that the header file is actually `src/HTSeq/_HTSeq.pxd` (full link: https://github.com/simon-anders/htse...Seq/_HTSeq.pxd).
Here is the example code:
Code:
import HTSeq

bam_file = '.../test.bam'
bam = HTSeq.BAM_Reader(bam_file)
for aln in bam:
    pass  # process alignment
The `_HTSeq.pxd` header file is also described here: http://htseq.readthedocs.io/en/master/contrib.html
So, the file name starts with an underscore and is in the htseq repo, but `pip install` does not copy it to any of the include directories. I therefore copied it manually to the root directory of my package so that `setup.py` can see it. Then I added `cimport _HTSeq as HTSeq` to my `*.pyx` file, and it compiled to an `*.so` file, but when I run the app it throws this error:
Code:
ImportError: No module named _HTSeq
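One possible reading of this error is that the compiled extension lives inside the `HTSeq` package (as `HTSeq._HTSeq`, which the `src/HTSeq/_HTSeq.pxd` path suggests) rather than as a top-level `_HTSeq` module. That is an assumption, not something confirmed here. A minimal way to check it, assuming Python 3.4+ for `importlib.util.find_spec` (so not the 2.7 setup described in this post); `locate_module` is a helper name made up for this sketch:

```python
import importlib.util


def locate_module(name):
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when a parent package of a dotted name is missing or fails to import.
        return None
    return spec.origin if spec is not None else None


# Compare the top-level name used by `cimport _HTSeq` with the packaged name.
for candidate in ("_HTSeq", "HTSeq._HTSeq"):
    print(candidate, "->", locate_module(candidate))
```

If only the second name resolves to a file, the runtime import of a top-level `_HTSeq` would be expected to fail in the way shown above.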
Any suggestions on solving this problem or on speeding up reading BAM files are appreciated. (One more trick I want to try next is to extract the chromosome information from the BAM header and process the chromosomes in parallel using multiprocessing, Cython nogil, Open MPI, or something similar; I still don't know what will work best with Cython.)
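The per-chromosome idea could be sketched roughly as follows with the standard `multiprocessing` module. This is only an outline under stated assumptions: `process_chromosome` and `run_per_chromosome` are hypothetical names, the worker is a placeholder that does not touch HTSeq or a BAM file, and the chromosome names would in practice come from the BAM header.

```python
from multiprocessing import Pool


def process_chromosome(chrom):
    """Hypothetical worker: handle all alignments on one chromosome.

    A real version would open the BAM file here and iterate only over reads
    mapped to `chrom`. This placeholder returns the length of the name so the
    sketch runs without a BAM file.
    """
    return chrom, len(chrom)


def run_per_chromosome(chroms, worker=process_chromosome, processes=1):
    """Apply `worker` to every chromosome name and collect the results in a dict."""
    if processes <= 1:
        # Serial fallback, handy for debugging the worker itself.
        return dict(worker(c) for c in chroms)
    pool = Pool(processes=processes)
    try:
        # Each chromosome is handled by one of the worker processes.
        return dict(pool.map(worker, chroms))
    finally:
        pool.close()
        pool.join()
```

With `processes` greater than 1, the call should sit under an `if __name__ == "__main__":` guard, and the worker has to be a module-level function so it can be pickled. The explicit `close()`/`join()` is used instead of a `with` block so the same code also works on Python 2.7.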
Thanks!