Ophthalmology

Volume 127, Issue 1, January 2020, Pages 85-94

Original Article
Development and Validation of Deep Learning Models for Screening Multiple Abnormal Findings in Retinal Fundus Images

https://doi.org/10.1016/j.ophtha.2019.05.029

Open access under a Creative Commons license.

Purpose

To develop and evaluate deep learning models that screen multiple abnormal findings in retinal fundus images.

Design

Cross-sectional study.

Participants

For the development and testing of deep learning models, 309 786 readings from 103 262 images were used. Two additional external datasets (the Indian Diabetic Retinopathy Image Dataset and e-ophtha) were used for testing. A third external dataset (Messidor) was used for comparison of the models with human experts.

Methods

Macula-centered retinal fundus images from the Seoul National University Bundang Hospital Retina Image Archive, obtained at the health screening center and ophthalmology outpatient clinic at Seoul National University Bundang Hospital, were assessed for 12 major findings (hemorrhage, hard exudate, cotton-wool patch, drusen, membrane, macular hole, myelinated nerve fiber, chorioretinal atrophy or scar, any vascular abnormality, retinal nerve fiber layer defect, glaucomatous disc change, and nonglaucomatous disc change) with their regional information using deep learning algorithms.
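The task described above is multi-label classification: each image can carry any combination of the 12 findings. The sketch below is a minimal, hypothetical illustration of that setup in PyTorch, with a small convolutional backbone and one independent sigmoid output per finding trained with a binary cross-entropy loss. It is not the authors' actual architecture, and all names in it are illustrative.

```python
# Minimal multi-label sketch (illustrative only; not the authors' network).
import torch
import torch.nn as nn

FINDINGS = [
    "hemorrhage", "hard_exudate", "cotton_wool_patch", "drusen", "membrane",
    "macular_hole", "myelinated_nerve_fiber", "chorioretinal_atrophy_or_scar",
    "vascular_abnormality", "rnfl_defect", "glaucomatous_disc_change",
    "nonglaucomatous_disc_change",
]

class MultiFindingClassifier(nn.Module):
    def __init__(self, num_findings: int = len(FINDINGS)):
        super().__init__()
        # Small illustrative backbone; the paper's network is not reproduced here.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_findings)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x).flatten(1)
        return self.head(feats)  # raw logits, one per finding

model = MultiFindingClassifier()
logits = model(torch.randn(4, 3, 512, 512))   # batch of 4 fundus images
probs = torch.sigmoid(logits)                 # independent per-finding probabilities
loss = nn.BCEWithLogitsLoss()(logits, torch.zeros(4, len(FINDINGS)))  # multi-label loss
```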

Main Outcome Measures

The area under the receiver operating characteristic curve, together with the sensitivity and specificity of the deep learning algorithms at the operating point with the highest harmonic mean of sensitivity and specificity, was evaluated and compared with the performance of retina specialists, and visualization of the lesions was analyzed qualitatively.
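As a concrete illustration of these outcome measures, the sketch below computes, for a single finding, the area under the receiver operating characteristic curve and the sensitivity and specificity at the threshold that maximizes their harmonic mean. It uses scikit-learn and synthetic labels and scores; it is not the authors' evaluation code.

```python
# Illustrative metrics: AUC plus the operating point with the highest
# harmonic mean of sensitivity and specificity (synthetic data only).
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def auc_and_operating_point(y_true: np.ndarray, y_score: np.ndarray):
    auc = roc_auc_score(y_true, y_score)
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    sensitivity = tpr
    specificity = 1.0 - fpr
    # Harmonic mean of sensitivity and specificity at every candidate threshold.
    harmonic = 2 * sensitivity * specificity / (sensitivity + specificity + 1e-12)
    best = int(np.argmax(harmonic))
    return auc, thresholds[best], sensitivity[best], specificity[best]

# Synthetic example (illustrative only, not study data).
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)
scores = 0.6 * labels + 0.4 * rng.random(1000)
auc, threshold, sens, spec = auc_and_operating_point(labels, scores)
print(f"AUC={auc:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```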

Results

Areas under the receiver operating characteristic curves for all findings were high, at 96.2% to 99.9%, when tested on the in-house dataset. Lesion heatmaps highlighted salient regions effectively across the various findings. Areas under the receiver operating characteristic curves for diabetic retinopathy-related findings tested on the Indian Diabetic Retinopathy Image Dataset and the e-ophtha dataset were 94.7% to 98.0%. The model demonstrated performance rivaling that of human experts, especially in the detection of hemorrhage, hard exudate, membrane, macular hole, myelinated nerve fiber, and glaucomatous disc change.
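One common way to produce lesion heatmaps of the kind reported here is a class activation map, which weights the backbone's spatial feature maps by the classifier weights of the finding of interest. The sketch below continues the hypothetical MultiFindingClassifier defined under Methods and illustrates that idea; the authors' region-guidance method is not reproduced here.

```python
# Class-activation-map sketch, reusing `model` and FINDINGS from the Methods
# sketch above (illustrative only; not the authors' region-guidance method).
import torch
import torch.nn.functional as F

def finding_heatmap(model, image: torch.Tensor, finding_index: int) -> torch.Tensor:
    # Run the convolutional layers only (skip global pooling) to keep spatial detail.
    feats = image
    for layer in model.backbone[:-1]:
        feats = layer(feats)                         # (B, C, H', W') feature maps
    weights = model.head.weight[finding_index]       # (C,) weights for this finding
    cam = torch.einsum("c,bchw->bhw", weights, feats)
    cam = F.relu(cam)                                # keep positively contributing regions
    cam = cam / (cam.max() + 1e-8)                   # normalize to [0, 1]
    # Upsample back to the input resolution for overlay on the fundus image.
    return F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                         mode="bilinear", align_corners=False).squeeze(1)

heatmap = finding_heatmap(model, torch.randn(1, 3, 512, 512),
                          FINDINGS.index("hemorrhage"))
```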

Conclusions

Our deep learning algorithms with region guidance showed reliable performance for the detection of multiple findings in macula-centered retinal fundus images. These interpretable and reliable classification outputs open the possibility of clinical use as an automated screening system for retinal fundus images.

Abbreviations and Acronyms

AMD = age-related macular degeneration
AUC = area under the receiver operating characteristic curve
DR = diabetic retinopathy
IDRiD = Indian Diabetic Retinopathy Image Dataset
RNFL = retinal nerve fiber layer

See Commentary on page 95.

Supplemental material available at www.aaojournal.org.

Financial Disclosure(s): The author(s) have made the following disclosure(s): J.S.: Employee – VUNO, Inc. (Seoul, Korea).

K.-H.J.: Employee and Equity owner – VUNO, Inc. (Seoul, Korea).

S.J.P.: Equity owner – VUNO, Inc. (Seoul, Korea).

Supported by the Intelligence Information Service Expansion Project, which is funded by National IT Industry Promotion Agency (grant no.: NIPA-C0202-17-1045) and the Small Grant for Exploratory Research of the National Research Foundation of Korea, which is funded by the Ministry of Science, Information and Communications Technology, and Future Planning (grant no.: NRF-2018R1D1A1A09083241). The sponsors or funding organizations had no role in the design or conduct of this research. The authors alone are responsible for the content and writing of the article.

HUMAN SUBJECTS: No human subjects were included in this study. Images were de-identified and written consent from patients was waived by the institutional review board due to the retrospective nature of the study. All research adhered to the tenets of the Declaration of Helsinki.

No animal subjects were included in this study.

Author Contributions:

Conception and design: Son, Shin, Jung, S. J. Park

Analysis and interpretation: Son, Shin, Jung, S. J. Park

Data collection: Son, Shin, Kim, Jung, K. H. Park, S. J. Park

Obtained funding: Son, Jung, S. J. Park

Overall responsibility: Son, Shin, Jung, S. J. Park

Both authors contributed equally as first authors.