High Efficiency Acoustic Data Compressor
Navy SBIR FY2018.1

Sol No.: Navy SBIR FY2018.1
Topic No.: N181-067
Topic Title: High Efficiency Acoustic Data Compressor
Proposal No.: N181-067-0984
Firm: Intelligent Automation, Inc.
15400 Calhoun Drive
Suite 190
Rockville, Maryland 20855
Contact: Sergey Voronin
Phone: (301) 294-5271
Web Site: http://www.i-a-i.com
Abstract: We propose a modular approach to compressing multi-channel acoustic time-domain data. For each channel, the approach first applies advanced multi-resolution denoising and outlier removal. The recovered waveforms are then compared using cross-correlation (to detect similarity up to a time lag) and dynamic time warping. This comparison step could reduce the number of retained waveforms below the number of channels by replacing many channels with time-lag and differencing information relative to a set of pillars. Next, a suitable transformation will be applied to each retained time-domain waveform to make the data amenable to lossless compression, and Burrows-Wheeler type compression with arithmetic coding as the entropy encoder will be employed to achieve a desirable compression ratio over the whole data set. This is intended to improve on a standard bzip2-like compressor (typically ranked as one of the best all-around lossless compression tools), which by default uses Huffman coding as the entropy encoder. A suitable decoder will be built to reconstruct the waveforms for each channel. More advanced denoising methods utilizing machine learning will also be explored. Our preliminary implementation indicates that a 10x near-lossless compression ratio is attainable.
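The lag-and-difference idea in the abstract can be illustrated with a minimal sketch. It assumes that a "pillar" is a retained reference waveform, uses a brute-force cross-correlation over a bounded lag window, and leaves out denoising and dynamic time warping entirely. The function names (`best_lag`, `encode_against_pillar`, `decode_from_pillar`) and the sample values are hypothetical and are not taken from the proposal.

```python
def best_lag(pillar, channel, max_lag):
    """Return the lag in [-max_lag, max_lag] that maximizes the cross-correlation."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0
        for i, p in enumerate(pillar):
            j = i + lag
            if 0 <= j < len(channel):
                score += p * channel[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def shift(signal, lag, length):
    """Delay `signal` by `lag` samples, zero-padding where no sample exists."""
    return [signal[i - lag] if 0 <= i - lag < len(signal) else 0
            for i in range(length)]


def encode_against_pillar(pillar, channel, max_lag):
    """Represent `channel` as a lag plus a residual relative to `pillar` (assumed scheme)."""
    lag = best_lag(pillar, channel, max_lag)
    predicted = shift(pillar, lag, len(channel))
    residual = [c - p for c, p in zip(channel, predicted)]
    return lag, residual


def decode_from_pillar(pillar, lag, residual):
    """Rebuild the channel from the pillar, the stored lag, and the residual."""
    predicted = shift(pillar, lag, len(residual))
    return [p + r for p, r in zip(predicted, residual)]


# Hypothetical data: the channel is the pillar delayed by 2 samples, with one sample perturbed.
pillar = [0, 0, 1, 3, -2, 5, 0, 0, 0, 0]
channel = [0, 0, 0, 0, 1, 4, -2, 5, 0, 0]
lag, residual = encode_against_pillar(pillar, channel, max_lag=4)
print(lag, residual)
print(decode_from_pillar(pillar, lag, residual) == channel)
```

When two channels are similar up to a delay, the residual is mostly zeros, which is the kind of low-entropy input that a downstream lossless coder handles well. Storing the exact residual keeps this step lossless; a near-lossless variant would quantize or threshold it.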
Benefits: This research effort will produce a set of denoising, outlier removal, and waveform compression strategies that can be applied to different kinds of time series data (for example, audio, medical, and geoseismic applications). The compression approach will be useful for a wide variety of data types, as the bzip2-type approach has been shown to be one of the best universal compressors in various benchmarks. Our proposed work replaces Huffman coding with arithmetic coding in the entropy encoding step (allowing encodings of less than 1 bit per symbol) and modifies the entropy booster scheme from the default Move-to-Front approach. This should yield better compression ratios and a strategy for fine-tuning Burrows-Wheeler type lossless compression to specific DoD data needs.
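The claim about encoding at less than 1 bit per symbol can be made concrete with a small sketch. It runs a naive Burrows-Wheeler transform followed by Move-to-Front, then compares the order-0 Shannon entropy of the result (the rate an ideal arithmetic coder approaches) with the average Huffman code length, which cannot drop below 1 bit per symbol. The rotation-sorting transform, the function names, and the sample input are illustrative assumptions; this is not the proposal's implementation, and no actual arithmetic coder is included.

```python
import heapq
import math
from collections import Counter


def bwt(data: bytes):
    """Naive BWT via sorted rotations; returns (last column, row of the original)."""
    n = len(data)
    rows = sorted(range(n), key=lambda i: data[i:] + data[:i])
    return bytes(data[(i - 1) % n] for i in rows), rows.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Invert the BWT by following a stable sort of the last column."""
    order = sorted(range(len(last)), key=lambda i: last[i])
    out, row = bytearray(), primary
    for _ in range(len(last)):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes):
    """Move-to-Front: recently seen bytes receive small indices."""
    alphabet, out = list(range(256)), []
    for b in data:
        idx = alphabet.index(b)
        out.append(idx)
        alphabet.insert(0, alphabet.pop(idx))
    return out


def entropy_bits(symbols):
    """Order-0 Shannon entropy in bits per symbol."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())


def huffman_avg_bits(symbols):
    """Average Huffman code length in bits per symbol."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return 1.0  # a lone symbol still costs one bit per occurrence
    heap = [(c, i, [s]) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths, tie = dict.fromkeys(counts, 0), len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to each merged symbol
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return sum(counts[s] * lengths[s] for s in counts) / len(symbols)


data = b"abc" * 80  # hypothetical, highly repetitive input
last, primary = bwt(data)
ranks = mtf_encode(last)
print("round trip ok:", inverse_bwt(last, primary) == data)
print("entropy bits/symbol:", entropy_bits(ranks))
print("Huffman bits/symbol:", huffman_avg_bits(ranks))
```

On this input the Move-to-Front output is almost entirely zeros, so its entropy falls well under 1 bit per symbol while the Huffman average stays at or above 1 bit. That gap is the headroom an arithmetic coder can exploit on highly skewed symbol distributions.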