1. Diff-ReColor: Rethinking image colorization with a generative diffusion model.
- Author
  Li, Gehui; Zhao, Shanshan; Zhao, Tongtong
- Subjects
  *IMAGE reconstruction, *IMAGE denoising, *RECOMMENDER systems, *GRAYSCALE model, *INFORMATION filtering
- Abstract
This paper presents Diff-ReColor, a novel diffusion-based image colorization framework that addresses the challenges of colorizing grayscale images with high semantic fidelity and diversity. Recognizing the limitations of previous approaches, including CNN-based methods and GANs, which often produce ambiguous colorization and artifacts, we leverage recent advances in diffusion models, known for their stable training and diverse output generation. Diff-ReColor integrates an edge-conditional Denoising Diffusion Probabilistic Model (DDPM) with a Dual-Tier Attention Color Reference Encoder and a segmentation-informed sampling technique to produce colorized images that are both semantically consistent and rich in detail. The edge-conditional DDPM is specifically designed to handle intricate image details, while the segmentation-guided sampling technique preserves semantic nuances by utilizing edge and segmentation cues during colorization. Inputs are also channeled into the dual-attention encoder, where spectral attention filters semantic information that spatial attention then enhances, yielding coarse colorized images that assist the denoising process. Empirical validation on the ImageNet benchmark demonstrates the superior performance of Diff-ReColor in color richness, colorization accuracy, and semantic fidelity. The model's generalization is further highlighted on the COCO-Stuff and CelebA-HQ datasets, where it surpasses baseline models without fine-tuning. [ABSTRACT FROM AUTHOR]
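The abstract does not specify the encoder's architecture, but the general pattern it names (channel-wise "spectral" attention followed by "spatial" attention over a reference feature map, fused with the grayscale input as a coarse color hint) can be sketched as below. This is a minimal numpy illustration under stated assumptions; the function names (`dual_tier_encode`, etc.) and the additive fusion are hypothetical, not the paper's actual design.

```python
import numpy as np

def channel_attention(feat):
    # "Spectral" tier: weight each channel by a softmax over its
    # global-average-pooled response (squeeze-and-excite style).
    pooled = feat.mean(axis=(1, 2))              # (C,)
    w = np.exp(pooled - pooled.max())
    w = w / w.sum()                              # channel weights, sum to 1
    return feat * w[:, None, None]

def spatial_attention(feat):
    # "Spatial" tier: sigmoid gate per pixel, computed from the
    # channel-wise mean of the feature map.
    m = feat.mean(axis=0)                        # (H, W)
    gate = 1.0 / (1.0 + np.exp(-m))              # sigmoid in [0, 1]
    return feat * gate[None, :, :]

def dual_tier_encode(gray_feat, ref_feat):
    # Hypothetical fusion: refine the color-reference features with
    # spectral then spatial attention, and add them to the grayscale
    # features as a coarse color hint to condition the denoising.
    refined = spatial_attention(channel_attention(ref_feat))
    return gray_feat + refined
```

In the real model these operations would act on learned feature maps inside a network; the sketch only shows how the two attention tiers compose and preserve the feature-map shape.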
- Published
- 2024