1. Symmetric image compression network with improved normalization attention mechanism.
- Author
- Tai, Shen-Chuan; Yeh, Chia-Mao; Lee, Yu-Ting; and Huang, Wesley
- Subjects
IMAGE processing, DIGITAL images, RECORDS management, SIGNAL-to-noise ratio, COMPARATIVE studies, IMAGE compression
- Abstract
Image compression plays a vital role in applications such as the storage, transmission, and sharing of digital images. We present a symmetric image compression network that leverages improved normalization and attention mechanisms to achieve superior compression performance. We focus on the limitations of conventional normalization techniques in handling diverse image characteristics. An adaptive normalization module is proposed that adjusts normalization parameters dynamically based on the content of the input image; it ensures optimal data representation and contributes to improved compression efficiency. Furthermore, a windowed attention mechanism is integrated into the compression network to focus selectively on significant image regions while suppressing noise and redundancy. By capturing and preserving important visual features, it enhances overall compression quality and reconstruction fidelity. Comparative analysis with state-of-the-art methods demonstrates the superiority of the proposed network in terms of compression ratio, reconstruction quality, and subjective visual perception. Moreover, the effects of different normalization and attention configurations on compression performance are investigated and analyzed. The experimental results validate the effectiveness of the proposed network, known as the Image Compression Attention Normalization Network. The integration of adaptive normalization and windowed attention not only enhances compression efficiency but also enables adaptability to diverse image characteristics. The results of the proposed compression scheme are as follows. The inference times for encoding and decoding are 0.152 s and 0.163 s, respectively. The Bjøntegaard delta (BD) rate shows a reduction of 74.8936%, and the BD-peak signal-to-noise ratio increases by 6.425 dB.
The computational load, measured in floating-point operations, totals 263.614G, and the model uses 68.1M parameters. In the ablation studies, windowed attention preserves critical data, while adaptive normalization adjusts how the image is processed. Despite their differences, both modules contribute to data retention and adjustment, optimizing image processing for improved results. [ABSTRACT FROM AUTHOR]
- Published
- 2024