Development of Stream-based Lossless Data Compression Technology
2021.07.15
【 Research Outline 】
As the volume of data from sources such as networks, video, and sensors grows rapidly, faster communication data paths are in demand. Data movement driven by Big Data applications is approaching its technological limit. Conventional data compression takes a blocking approach: it operates on a block of data that has first been stored in memory. This degrades performance when it is applied to data streams. Data compression technology therefore needs to process the data stream directly, without buffering. However, no existing data compression technology processes a data stream entirely without stalls. This research project will develop a new stream-based data compression/decompression technology that operates directly on a data stream. The technology will provide a reliable building block for modern computing systems.
1. Background at the beginning of research
The industry needs compression technology to cope with the explosive increase in the amount of data flowing through data transmission lines. Data transmission lines are standardized by specifications such as PCI Express, HDMI, and USB, and their signaling rates are reaching the GHz order. At these rates, multiple trace patterns are laid out on the printed circuit board and evaluated through repeated trials during manufacturing, which is not a scientific approach. It is extremely difficult to validate the design of a data transmission line before it is implemented. To increase the transfer volume without complicating the implementation technology, there is an urgent need for a hardware compression technology that compresses the data transferred on the line, thereby reducing the required frequency and data volume and simplifying the implementation.
At the same time, existing compression technology urgently needs to shift from blocking compression to stream compression. Until now, compression has been blocking: data is stored in memory and then compressed or decompressed. This approach degrades as the bandwidth of the data transmission line increases. Furthermore, even if the compressor is accelerated in hardware, the circuit speed must be raised every time the performance of the transmission line improves. This leads to a cat-and-mouse situation in which the blocking approach can be expected to fail. It is therefore necessary to shift to stream compression, which can achieve high data density in proportion to the performance of the transmission line. The technology must also be implementable in hardware in a scalable manner.
2. Research objective
Given the above background, the objective of this research is to develop a stream compression technology that compresses the data flowing in a data communication path in real time. The data handled in this study is continuous and arrives without stalls, so there is no opportunity to send the symbol look-up table separately. The compression side and the decompression side must therefore build the same table in real time while they compress and decompress. To apply this to a data communication path in hardware, this research addresses a real-time compression / decompression mechanism for the stream data communication path that operates at high speed with a small amount of resources and can be implemented in hardware. We aim to develop a communication protocol and a symbol look-up table update mechanism that ranks the frequency of data appearance in real time. To carry out these processes in hardware in a scalable manner, we develop a method that registers fixed-length symbol strings in a table and replaces the least used symbol when the table capacity is saturated. With such dynamic management, the effective amount of data carried by the path can be increased even if the entropy of the data stream changes. Additionally, with a hardware-based implementation, stream compression is expected to deliver an effective throughput that exceeds the peak performance of the physical medium.
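The table management described above, registering fixed-length symbol strings and replacing the least used entry once the table is full, can be sketched in software as follows. This is a minimal illustration and not the project's hardware design: the class name `FrequencyTable`, its methods, the per-entry usage counter, and the tie-breaking rule (lowest slot index wins) are all assumptions made for the example.

```python
from typing import List, Optional


class FrequencyTable:
    """Hypothetical fixed-capacity table with least-used replacement."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: List[bytes] = []  # fixed-length symbol strings
        self.counts: List[int] = []     # usage count per slot

    def lookup(self, entry: bytes) -> Optional[int]:
        """Return the slot of `entry` and bump its count, or None on a miss."""
        if entry in self.entries:
            slot = self.entries.index(entry)
            self.counts[slot] += 1
            return slot
        return None

    def register(self, entry: bytes) -> int:
        """Store `entry`, evicting the least used slot when the table is full."""
        if len(self.entries) < self.capacity:
            self.entries.append(entry)
            self.counts.append(1)
            return len(self.entries) - 1
        # Assumption: ties are broken by taking the lowest slot index.
        victim = self.counts.index(min(self.counts))
        self.entries[victim] = entry
        self.counts[victim] = 1
        return victim


table = FrequencyTable(capacity=2)
table.register(b"ab")          # fills slot 0
table.register(b"cd")          # fills slot 1
table.lookup(b"ab")            # slot 0 is now used twice
slot = table.register(b"ef")   # table is full: slot 1 is least used and is replaced
```

Because the replacement decision depends only on the sequence of lookups and registrations, two tables that are fed the same sequence stay identical, which is the property the compression and decompression sides rely on.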
3. Research outcome
We proposed a new method, LCA-DLT, which overcomes the problems of conventional data compression explained in the background above. The new method can be applied to data streams of unbounded length and can operate at high speed in hardware. We achieved good performance in both hardware and software implementations. In this compression method, as shown in Fig. 1, (1) a module that compresses two units of data into one is prepared, (2) the module dynamically creates a look-up table, (3) the compression mechanism is managed dynamically based on the look-up table, and (4) the decompression side reconstructs an equivalent look-up table and restores the data. With this technology, the conversion table does not need to be sent to the decompression side, and decompression can proceed continuously as soon as the compressed data stream is received. That is, once the first data is compressed, the compressed data is passed to the decompression side piece by piece, and the decompression side incrementally reconstructs the look-up table from the received data and decompresses it. Compression and decompression can therefore be performed on a stream.
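The central idea, that the decompression side can rebuild an equivalent look-up table from the compressed stream alone, can be illustrated with a small software sketch. This is not the LCA-DLT algorithm itself: the token format (`"M"` for a miss carrying the raw pair, `"H"` for a hit carrying a table index, `"T"` for an odd trailing symbol), the names `SyncTable`, `compress`, and `decompress`, and the round-robin replacement used here for brevity are all assumptions. Any deterministic replacement policy works as long as both sides apply it identically.

```python
from typing import Iterable, Iterator, List, Tuple

Pair = Tuple[int, int]
_END = object()  # sentinel marking the end of the input stream


class SyncTable:
    """Hypothetical pair table that changes only on data both sides have seen."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pairs: List[Pair] = []
        self.next_slot = 0  # round-robin replacement pointer

    def register(self, pair: Pair) -> None:
        if len(self.pairs) < self.capacity:
            self.pairs.append(pair)
        else:
            self.pairs[self.next_slot] = pair
            self.next_slot = (self.next_slot + 1) % self.capacity


def compress(symbols: Iterable[int], capacity: int = 4) -> Iterator[tuple]:
    """Turn each pair of symbols into one token, building the table on the fly."""
    table = SyncTable(capacity)
    it = iter(symbols)
    for a in it:
        b = next(it, _END)
        if b is _END:
            yield ("T", a)  # odd trailing symbol, sent as is
            return
        pair = (a, b)
        if pair in table.pairs:
            yield ("H", table.pairs.index(pair))  # hit: one index replaces two symbols
        else:
            table.register(pair)
            yield ("M", a, b)  # miss: raw pair, which the receiver also registers


def decompress(tokens: Iterable[tuple], capacity: int = 4) -> Iterator[int]:
    """Rebuild the same table from the tokens and restore the symbols."""
    table = SyncTable(capacity)
    for token in tokens:
        if token[0] == "T":
            yield token[1]
            continue
        if token[0] == "H":
            a, b = table.pairs[token[1]]
        else:  # "M"
            _, a, b = token
            table.register((a, b))
        yield a
        yield b


data = list(b"abababcdcdefghijabab!")
restored = list(decompress(compress(data, capacity=2), capacity=2))
```

Both functions are generators, so `decompress` starts producing symbols as soon as the first token arrives, and no table is ever transmitted. In this sketch a hit token stands for two input symbols, which corresponds to the two-to-one compression module in step (1).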
Additionally, since the compression module compresses two units of data into one, a single module can reduce the data to at most 50% of its original size. By cascading the modules, we developed an innovative compression method that can theoretically compress the data down to 1/16 = 6.25% of its original size with 4 stages. Regarding the academic significance of this research, mainstream data compression methods have so far assumed that a processor randomly accesses data held in memory. However, random access is impossible on a data stream. The novel mechanism can improve the processing performance for stream data such as sensor data. A new academic field of stream data compression can be expected in the future.
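The best-case figure for a cascade follows from simple arithmetic. As a sketch, assume that each stage emits one output symbol of the same width as each of its two input symbols and that every pair is found in the table; the symbols r_n (best-case ratio of output size to input size) and n (number of cascaded stages) are introduced here only for illustration.

```latex
r_n = \left(\tfrac{1}{2}\right)^{n},
\qquad
r_1 = \tfrac{1}{2} = 50\%,
\qquad
r_4 = \left(\tfrac{1}{2}\right)^{4} = \tfrac{1}{16} = 6.25\%
```

Real streams contain pairs that miss the table, so the achieved ratio lies between this bound and no compression at all.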
Research articles in Journals and Magazines
Shinichi Yamagiwa, Koichi Marumo, Suzukaze Kuwabara. Exception Handling Method Based on Event from Look-Up Table Applying Stream-Based Lossless Data Compression, Electronics, 240, (2021-01-21), DOI:10.3390/electronics10030240
Koichi Marumo, Shinichi Yamagiwa, Ryuta Morita, Hiroshi Sakamoto. Lazy Management for Frequency Table on Hardware-Based Stream Lossless Data Compression, Information, 63, (2016-10-31), DOI:10.3390/info7040063
Conference Proceedings
Shinichi Yamagiwa, Ryuta Morita, Koichi Marumo. Reducing Symbol Search Overhead on Stream-Based Lossless Data Compression, ICCS 2019: Computational Science, 619--626, (2019-06-08), DOI:10.1007/978-3-030-22750-0_59
Shinichi Yamagiwa, Ryuta Morita, Koichi Marumo. Bank Select Method for Reducing Symbol Search Operations on Stream-Based Lossless Data Compression, 2019 Data Compression Conference (DCC), (2019-03-26), DOI:10.1109/dcc.2019.00123
Koichi Marumo, Shinichi Yamagiwa. Time-Sharing Multithreading on Stream-Based Lossless Data Compression, 2017 Fifth International Symposium on Computing and Networking (CANDAR), (2017-11-19), DOI:10.1109/candar.2017.42
Shinichi Yamagiwa, Koichi Marumo, Hiroshi Sakamoto. Stream-Based Lossless Data Compression Hardware Using Adaptive Frequency Table Management, In proceedings of Big Data Benchmarks, Performance Optimization, and Emerging Hardware. BPOE 2015. Lecture Notes in Computer Science Vol. 9495, 133--146, (2016-01-09), DOI:10.1007/978-3-319-29006-5_11
Shinichi Yamagiwa, Hiroshi Sakamoto. A reconfigurable stream compression hardware based on static symbol-lookup table, 2013 IEEE International Conference on Big Data, (2013-10-06), DOI:10.1109/bigdata.2013.6691702