Hi all,
I just thought it might be useful to announce here that my Genozip tool can compress Ion Torrent BAM files very effectively with its --optimize-ZM option. The ZM field (flow signal values) is the main reason Ion Torrent files are hard to compress. With this option, negative flow signal values are changed to zero and positive values are rounded to the nearest 10.
Example: -20,212,427 -> 0,210,430.
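To make the rule concrete, here is a minimal Python sketch of the transformation described above. It is only an illustration, not Genozip's actual code: the function name optimize_zm is hypothetical, and the handling of ties (values ending in 5 are rounded up) is an assumption, since the rule above only says "nearest 10".

```python
def optimize_zm(values):
    """Apply the rounding rule to a list of ZM flow signal values.

    Negative values become 0; positive values are rounded to the
    nearest multiple of 10. Ties (e.g. 215) are rounded up here,
    which is an assumption made for this illustration.
    """
    result = []
    for v in values:
        if v <= 0:
            result.append(0)
        else:
            # Integer arithmetic avoids Python's round-half-to-even behaviour.
            result.append((v + 5) // 10 * 10)
    return result


print(optimize_zm([-20, 212, 427]))  # [0, 210, 430]
```

Fewer distinct values in the field is what makes it easier to compress, at the cost of no longer storing the exact original signal values.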
> wget ftp://ftp-trace.ncbi.nih.gov/Referen...rawlib.b37.bam
2021-08-13 23:53:14 (12.2 MB/s) - ‘IonXpress_020_rawlib.b37.bam’ saved [26964896443]
> genozip IonXpress_020_rawlib.b37.bam
genozip IonXpress_020_rawlib.b37.bam : Done (8 minutes 0 seconds, BAM compression ratio: 1.5)
> genozip IonXpress_020_rawlib.b37.bam --optimize-ZM -o IonXpress_020_rawlib.b37.optimized.bam.genozip
genozip IonXpress_020_rawlib.b37.bam : Done (6 minutes 34 seconds, BAM compression ratio: 2.2)
> ls -Ggh IonXpress_020_rawlib.b37*
-rw-rw-r--+ 1 26G Aug 13 23:53 IonXpress_020_rawlib.b37.bam
-rw-rw-r--+ 1 17G Aug 14 00:10 IonXpress_020_rawlib.b37.bam.genozip
-rw-rw-r--+ 1 12G Aug 14 00:17 IonXpress_020_rawlib.b37.optimized.bam.genozip
See here: https://genozip.com
Paper: https://www.researchgate.net/publica...ata_Compressor