
Model long-range dependencies for multi-modality and multi-view retinopathy diagnosis through transformers.

Authors :
Huang, Yonghao
Chen, Leiting
Zhou, Chuan
Yan, Ning
Qiao, Lifeng
Lan, Shanlin
Wen, Yang
Source :
Knowledge-Based Systems, Jul 2023, Vol. 271.
Publication Year :
2023

Abstract

Early eye examination based on fundus images is an effective way to prevent visual impairment caused by retinopathy. The laborious, error-prone process of interpreting fundus images, together with a shortage of ophthalmologists, has driven research toward automated retinopathy diagnosis. However, most previous studies have focused on single-modality fundus images and disregarded the integration of information from multiple views; the resulting diagnoses are unsatisfactory and inconsistent with clinical practice because lesion features are incomplete and the fundus field is only partially covered. To address this issue, we introduce multi-modality and multi-view fundus images into the automated retinopathy diagnosis pipeline. In contrast to single fundus images, the sequential relationships among multi-modality and multi-view fundus images carry essential long-range dependency information, which is vital for retinopathy diagnosis. Inspired by the recent success of transformers at capturing long-range dependencies in sequence data, in this paper we propose a transformer-based automated retinopathy diagnosis framework for pathology classification and symptom report generation that integrates multi-modality and multi-view fundus images. Specifically, we present two transformer-based networks that construct long-range dependencies within different fundus images, and we adopt two novel modules that aggregate features across modalities and views by modeling long-range dependencies among different fundus image sequences. Experiments are conducted on two in-house datasets in which each subject provides one color fundus photography (CFP) image and four-view fundus fluorescein angiography (FFA) images. The results on the retinopathy classification and report generation tasks show that our proposed method is superior to the benchmark methods, achieving a classification accuracy of 85.49% and a report generation BLEU-1 of 0.422.

• A multi-modality and multi-view framework for automated retinopathy diagnosis.
• Transformers for modeling intra-modality and intra-view long-range dependencies.
• A multi-view fusion embedding module for modeling inter-view long-range dependencies.
• A multi-modality fusion embedding module for modeling inter-modality long-range dependencies.
• Extensive experiments on three different tasks demonstrate the effectiveness of our models.
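The inter-view fusion idea from the abstract can be illustrated with self-attention over per-view embeddings. The sketch below is a minimal assumption: the abstract does not specify the fusion module's architecture, so the single-head attention, the random projection matrices `Wq`/`Wk`/`Wv`, and the mean-pooling step are all hypothetical stand-ins, not the authors' actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_views(view_feats, rng=None):
    """Fuse per-view feature vectors with single-head self-attention.

    view_feats: (n_views, d) array, one embedding per fundus view
                (e.g. the four FFA views mentioned in the abstract).
    Returns one (d,) fused embedding.
    """
    rng = rng or np.random.default_rng(0)
    n, d = view_feats.shape
    # Hypothetical learned projections, randomly initialized for the sketch.
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q, K, V = view_feats @ Wq, view_feats @ Wk, view_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))  # (n, n): inter-view dependencies
    fused = attn @ V                      # each view attends to every view
    return fused.mean(axis=0)             # pool the sequence into one vector

# Example: fuse four 64-dim view embeddings into a single 64-dim vector.
views = np.random.default_rng(1).standard_normal((4, 64))
print(fuse_views(views).shape)
```

The same pattern would extend to inter-modality fusion by treating the CFP embedding and the pooled FFA embedding as a two-token sequence.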

Details

Language :
English
ISSN :
0950-7051
Volume :
271
Database :
Academic Search Index
Journal :
Knowledge-Based Systems
Publication Type :
Academic Journal
Accession number :
163696010
Full Text :
https://doi.org/10.1016/j.knosys.2023.110544