1. Using histopathology latent diffusion models as privacy-preserving dataset augmenters improves downstream classification performance.
- Author
- Niehues JM, Müller-Franzes G, Schirris Y, Wagner SJ, Jendrusch M, Kloor M, Pearson AT, Muti HS, Hewitt KJ, Veldhuizen GP, Zigutyte L, Truhn D, and Kather JN
- Subjects
- Humans; Image Interpretation, Computer-Assisted / methods; Image Processing, Computer-Assisted / methods; Algorithms; Colorectal Neoplasms / pathology; Colorectal Neoplasms / diagnostic imaging
- Abstract
Latent diffusion models (LDMs) have emerged as a state-of-the-art image generation method, outperforming earlier Generative Adversarial Networks (GANs) in training stability and image quality. In computational pathology, generative models are valuable for data sharing and data augmentation. However, the impact of LDM-generated images on histopathology tasks, compared with traditional GANs, has not been systematically studied. We trained three LDMs and a StyleGAN2 model on histology tiles from nine colorectal cancer (CRC) tissue classes. The LDMs include 1) a fine-tuned version of Stable Diffusion v1.4, 2) an LDM deploying a Kullback-Leibler (KL)-autoencoder (KLF8-DM), and 3) an LDM deploying a vector-quantized (VQ)-autoencoder (VQF8-DM). We assessed the generated images through expert ratings, dimensionality reduction methods, distribution similarity measures, and their impact on training a multiclass tissue classifier. Additionally, we investigated image memorization in the KLF8-DM and StyleGAN2 models. All models produced high-quality images, with the KLF8-DM achieving the best Fréchet Inception Distance (FID) and expert rating scores for complex tissue classes. For simpler classes, the VQF8-DM and StyleGAN2 models performed better. Image memorization was negligible for both the StyleGAN2 and KLF8-DM models. Classifiers trained on a mix of KLF8-DM-generated and real images achieved a 4% improvement in overall classification accuracy, highlighting the usefulness of these images for dataset augmentation. Our systematic study of generative methods showed that KLF8-DM produces the highest-quality images with negligible image memorization.
The higher classifier performance in the generatively augmented dataset suggests that this augmentation technique can be employed to enhance histopathology classifiers for various tasks.
- Competing Interests
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: JN Kather reports a relationship with AstraZeneca Pharmaceuticals LP that includes: consulting or advisory and speaking and lecture fees. JN Kather reports a relationship with Bayer AG that includes: speaking and lecture fees. JN Kather reports a relationship with Eisai Inc that includes: speaking and lecture fees. JN Kather reports a relationship with Merck Sharp & Dohme UK Ltd that includes: speaking and lecture fees. JN Kather reports a relationship with Bristol-Myers Squibb Company that includes: speaking and lecture fees. JN Kather reports a relationship with F Hoffmann-La Roche Ltd that includes: speaking and lecture fees. JN Kather reports a relationship with Pfizer Inc that includes: speaking and lecture fees. JN Kather reports a relationship with Fresenius Kabi Germany that includes: speaking and lecture fees. JN Kather reports a relationship with Owkin France that includes: board membership and speaking and lecture fees. JN Kather reports a relationship with Panakeia, UK that includes: board membership and speaking and lecture fees. JN Kather reports a relationship with DoMoreDiagnostics that includes: board membership and speaking and lecture fees. JN Kather reports a relationship with Histofy, UK that includes: speaking and lecture fees. JN Kather reports a relationship with StratifAI GmbH that includes: board membership and equity or stocks.
(Copyright © 2024. Published by Elsevier Ltd.)
- Published
- 2024
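The abstract describes training classifiers on a mix of generated and real images. As a rough illustration of what assembling such a mixed training set could look like, here is a minimal Python sketch. It is not the authors' pipeline: the function name `mix_real_and_synthetic`, the `synthetic_per_real` ratio, the class labels, and the tile identifiers are all hypothetical, and the abstract does not state the mixing ratio that was actually used.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[str, str, str]  # (tile identifier, class label, "real" or "synthetic")


def mix_real_and_synthetic(
    real: Dict[str, List[str]],
    synthetic: Dict[str, List[str]],
    synthetic_per_real: float = 1.0,  # hypothetical ratio, not taken from the paper
    seed: int = 0,
) -> List[Sample]:
    """Build a shuffled training list that keeps all real tiles and adds,
    per class, up to `synthetic_per_real` synthetic tiles for each real tile."""
    rng = random.Random(seed)
    mixed: List[Sample] = []
    for label, real_tiles in real.items():
        # Keep every real tile for this class.
        mixed.extend((tile, label, "real") for tile in real_tiles)
        # Sample synthetic tiles of the same class without replacement,
        # capped by how many synthetic tiles are available.
        pool = synthetic.get(label, [])
        n_synthetic = min(len(pool), round(len(real_tiles) * synthetic_per_real))
        mixed.extend((tile, label, "synthetic") for tile in rng.sample(pool, n_synthetic))
    rng.shuffle(mixed)
    return mixed


# Hypothetical tile identifiers for two placeholder classes.
real_tiles = {"class_a": ["r1", "r2"], "class_b": ["r3"]}
synthetic_tiles = {"class_a": ["s1", "s2", "s3"], "class_b": []}
training_set = mix_real_and_synthetic(real_tiles, synthetic_tiles, synthetic_per_real=1.0)
```

Tagging each sample with its source keeps the real and synthetic portions separable later, for example to evaluate on real tiles only.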