A pyramid transformer with cross-shaped windows for low-light image enhancement.
- Authors
- Li, Canlin; Gao, Pengcheng; Song, Shun; Liu, Jinhua; Bi, Lihua
- Subjects
- *TRANSFORMER models, *CONVOLUTIONAL neural networks, *IMAGE intensifiers, *NATURAL language processing, *DEEP learning, *PYRAMIDS, *COMPUTER vision
- Abstract
Low-light image enhancement is a low-level vision task. Most existing methods are based on convolutional neural networks (CNNs). The transformer is a prominent deep learning model that has been widely adopted in fields such as natural language processing and computer vision. Compared with CNNs, transformers can capture long-range dependencies and thus make full use of global contextual information. For low-light enhancement, this capability helps the model learn the correct luminance, color, and texture. We introduce the transformer into low-light image enhancement by designing a pyramid transformer with cross-shaped windows (CSwin-P). CSwin-P consists of an encoder and a decoder, each comprising several stages, and each stage contains several enhanced CSwin transformer blocks (ECTB). The ECTB uses cross-shaped window self-attention and a feed-forward layer with a spatial interaction unit; the spatial interaction unit further captures local contextual information through a gating mechanism. CSwin-P uses implicit positional encoding, so the model is not restricted by the image size at inference time. Extensive experiments show that our method outperforms current state-of-the-art methods. [ABSTRACT FROM AUTHOR]
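To illustrate the kind of component the abstract describes, the sketch below shows a feed-forward layer whose hidden features are gated by a depthwise-convolution branch that supplies local spatial context. This is a minimal illustration assuming PyTorch; the class name `SpatialInteractionFFN`, the channel split, and the depthwise-conv gating are assumptions about the described "feed-forward layer with spatial interaction unit", not the authors' implementation.

```python
# Illustrative sketch only; not the CSwin-P authors' code.
import torch
import torch.nn as nn

class SpatialInteractionFFN(nn.Module):
    """Feed-forward layer whose hidden features are gated by a
    depthwise-convolution branch that captures local spatial context."""
    def __init__(self, dim, expansion=4):
        super().__init__()
        hidden = dim * expansion
        self.expand = nn.Conv2d(dim, hidden, kernel_size=1)
        self.act = nn.GELU()
        # Depthwise conv provides the local "spatial interaction" signal.
        self.dwconv = nn.Conv2d(hidden // 2, hidden // 2, kernel_size=3,
                                padding=1, groups=hidden // 2)
        self.project = nn.Conv2d(hidden // 2, dim, kernel_size=1)

    def forward(self, x):                    # x: (B, C, H, W)
        x = self.act(self.expand(x))
        value, gate = x.chunk(2, dim=1)      # split channels into two branches
        x = value * self.dwconv(gate)        # gating: elementwise product
        return self.project(x)

# Usage: enhance a feature map without changing its spatial shape.
feat = torch.randn(1, 64, 128, 128)
out = SpatialInteractionFFN(dim=64)(feat)    # -> (1, 64, 128, 128)
```

Because every operation here is convolutional, the layer is independent of the input resolution, which is consistent with the abstract's point that implicit positional encoding leaves the model unrestricted by image size at inference.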
- Published
- 2024