Latent Transformer Models for out-of-distribution detection.

Authors :: Graham, Mark S.
Tudosiu, Petru-Daniel
Wright, Paul
Pinaya, Walter Hugo Lopez
Teikari, Petteri
Patel, Ashay
U-King-Im, Jean-Marie
Mah, Yee H.
Teo, James T.
Jäger, Hans Rolf
Werring, David
Rees, Geraint
Nachev, Parashkev
Ourselin, Sebastien
Cardoso, M. Jorge
Source :: Medical Image Analysis. Dec2023, Vol. 90, pN.PAG-N.PAG. 1p.
Publication Year :: 2023
Abstract: Any clinically-deployed image-processing pipeline must be robust to the full range of inputs it may be presented with. One popular approach to this challenge is to develop predictive models that can provide a measure of their uncertainty. Another approach is to use generative modelling to quantify the likelihood of inputs. Inputs with a low enough likelihood are deemed to be out-of-distribution and are not presented to the downstream predictive model. In this work, we evaluate several approaches to segmentation with uncertainty for the task of segmenting bleeds in 3D CT of the head. We show that these models can fail catastrophically when operating in the far out-of-distribution domain, often providing predictions that are both highly confident and wrong. We propose to instead perform out-of-distribution detection using the Latent Transformer Model: a VQ-GAN is used to provide a highly compressed latent representation of the input volume, and a transformer is then used to estimate the likelihood of this compressed representation of the input. We demonstrate this approach can identify images that are both far- and near-out-of-distribution, as well as provide spatial maps that highlight the regions considered to be out-of-distribution. Furthermore, we find a strong relationship between an image's likelihood and the quality of a model's segmentation on it, demonstrating that this approach is viable for filtering out unsuitable images.
• We develop Latent Transformer Models for unsupervised out-of-distribution detection in 3D.
• This approach combines a VQ-VAE with a transformer for likelihood evaluation.
• Our method can successfully identify both far- and near-OOD data.
• The method further provides spatial maps to localise anomalies.
• Experiments on 2D computer vision datasets help further explain the success of our approach.
[ABSTRACT FROM AUTHOR]
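The likelihood-based filtering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an autoregressive model has already assigned a probability to each discrete latent token of an image, and every name here (`sequence_log_likelihood`, `token_anomaly_map`, `is_out_of_distribution`), the toy probabilities, and the threshold value are hypothetical. A 2D latent grid is used for brevity, whereas the paper works with 3D volumes.

```python
import math
from typing import List, Sequence, Tuple


def sequence_log_likelihood(token_probs: Sequence[float]) -> float:
    """Sum the log-probabilities assigned to each latent token.

    Assumption: token_probs[i] is the probability the (hypothetical)
    autoregressive model gave to the i-th observed latent token.
    """
    return sum(math.log(p) for p in token_probs)


def token_anomaly_map(
    token_probs: Sequence[float], grid_shape: Tuple[int, int]
) -> List[List[float]]:
    """Arrange per-token negative log-likelihoods on the latent grid.

    Higher values mark latent positions the model found less likely.
    """
    rows, cols = grid_shape
    if rows * cols != len(token_probs):
        raise ValueError("grid_shape does not match the number of tokens")
    nll = [-math.log(p) for p in token_probs]
    return [nll[r * cols:(r + 1) * cols] for r in range(rows)]


def is_out_of_distribution(token_probs: Sequence[float], threshold: float) -> bool:
    """Flag an input whose log-likelihood falls below the threshold.

    Assumption: the threshold would be chosen on held-out in-distribution
    data; the value used below is purely illustrative.
    """
    return sequence_log_likelihood(token_probs) < threshold


# Toy example: four latent tokens on a 2x2 grid (made-up probabilities).
typical_image = [0.9, 0.8, 0.85, 0.9]
unusual_image = [0.9, 0.01, 0.02, 0.8]

for name, probs in [("typical", typical_image), ("unusual", unusual_image)]:
    print(name, is_out_of_distribution(probs, threshold=-2.0))
    print(token_anomaly_map(probs, (2, 2)))
```

In a pipeline of the kind the abstract describes, only inputs that pass this check would be forwarded to the downstream segmentation model, and the per-token map gives a coarse indication of where the unlikely content sits.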