Hi,
I have been working with very large fastq files and was thinking of ways to speed up the operations that I regularly perform on them: counting the number of reads, finding a specific sequence or a particular header.
I remembered from the Illumina documentation for the bcl2fastq2 utility that it can now generate fastq files with BGZF compression. It got me thinking that block-level access would allow multithreaded decompression of the fastqs and therefore substantially speed up the operations that I must regularly perform on all of our generated fastqs.
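To make the idea concrete, here is a rough Python sketch of what I have in mind for the read-counting case. It relies on the fact that a BGZF file is a series of independent gzip members, each storing its own compressed size in a "BC" extra subfield, so the blocks can be split apart without decompressing anything and then handed to worker threads. The function names (iter_bgzf_blocks, count_newlines, count_reads) are made up for illustration, it assumes plain 4-line fastq records, and it reads the whole file into memory, so treat it as a proof of concept rather than a real tool.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor


def iter_bgzf_blocks(data):
    """Yield each BGZF block (one complete gzip member) contained in `data`."""
    offset = 0
    while offset < len(data):
        # Fixed 12-byte gzip header: ID1, ID2, CM, FLG, MTIME, XFL, OS, XLEN.
        id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack_from(
            "<BBBBIBBH", data, offset
        )
        if (id1, id2, cm) != (31, 139, 8) or not flg & 4:
            raise ValueError("no BGZF block at offset %d" % offset)
        # Walk the extra subfields to find 'BC', which holds (block size - 1).
        extra = data[offset + 12 : offset + 12 + xlen]
        pos, bsize = 0, None
        while pos + 4 <= len(extra):
            si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
            if (si1, si2, slen) == (66, 67, 2):
                bsize = struct.unpack_from("<H", extra, pos + 4)[0] + 1
                break
            pos += 4 + slen
        if bsize is None:
            raise ValueError("gzip member without a BC subfield (not BGZF)")
        yield data[offset : offset + bsize]
        offset += bsize


def count_newlines(block):
    """Decompress one block and count the newlines in it."""
    # wbits=31 makes zlib expect a gzip header and trailer.
    return zlib.decompress(block, 31).count(b"\n")


def count_reads(path, workers=4):
    """Count reads in a BGZF-compressed fastq, assuming 4 lines per record."""
    with open(path, "rb") as handle:
        data = handle.read()  # whole file in memory: acceptable for a sketch only
    with ThreadPoolExecutor(max_workers=workers) as pool:
        newlines = sum(pool.map(count_newlines, iter_bgzf_blocks(data)))
    return newlines // 4
```

Records can straddle block boundaries, which does not matter for counting newlines but would need extra care when searching for a sequence or a header. I also have not measured how much the threads actually gain here, since that depends on how much of the time zlib spends outside the Python interpreter lock.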
Does anybody know if there is any tool to do this? I have done a search but can't find anything that seems to fit the picture.
Thanks
--Raul