Discussion
In this study, we demonstrated the feasibility of automated DL classification of several retinal vascular diseases on UWF-CFP, with an overall accuracy of 88.4%. The cross-validation technique used in our study allowed us to take the whole dataset into account, since we were able to make a prediction for every image within the dataset, thereby minimising bias. SCR was the best-identified category on UWF-CFP, with an accuracy of 93.8%, and RVO was the second-best detected class, with an accuracy of 88.4%. The reliability of our model was supported by the high AUC-ROC values obtained: the SCR category had the highest AUC-ROC (96.7%), followed by the RVO class (91.2%), and the other classes had an AUC-ROC of about 90%. Moreover, our model obtained a specificity of more than 90% for all four classes and a high sensitivity for the SCR class (94.7%). Thus, this four-class DL model shows promise for detecting different retinal vascular diseases with high accuracy.
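The way cross-validation yields one held-out prediction per image can be illustrated with a minimal sketch. The fold count, the interleaved split and the majority-class stand-in classifier are illustrative assumptions and do not reproduce the study's configuration; `k_fold_indices`, `out_of_fold_predictions`, `fit_majority` and `predict_majority` are hypothetical names.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k disjoint test folds (interleaved, no shuffling)."""
    return [list(range(start, n_samples, k)) for start in range(k)]


def out_of_fold_predictions(samples, labels, k, fit, predict):
    """Predict every sample with a model that never saw it during training."""
    n = len(samples)
    preds = [None] * n
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        for i in test_idx:
            preds[i] = predict(model, samples[i])
    return preds


# Stand-in "classifier" (assumption): always predicts the majority training label.
def fit_majority(train_samples, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, sample):
    return model


# Toy data, not taken from the study.
samples = list(range(10))
labels = ["SCR"] * 6 + ["DR"] * 4
preds = out_of_fold_predictions(samples, labels, k=5,
                                fit=fit_majority, predict=predict_majority)
```

Because the test folds are disjoint and together cover all indices, each image receives exactly one prediction, and that prediction comes from a model trained without it.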
Several DL models have already shown high accuracy in detecting proliferative DR,26 RVO27 28 or SCR35 on UWF-CFP. However, none of the previous studies focused on classifying different retinal vascular diseases (ie, DR, RVO and SCR), with potentially similar features, using UWF-CFP. In detail, Nagasawa et al26 used 378 UWF-CFP images acquired with the Optos system to train and test their DL model to distinguish only proliferative DR from normal eyes. The authors obtained a sensitivity of 94.7%, a specificity of 97.2% and an AUC-ROC of 97% in distinguishing proliferative DR from healthy controls by using VGG-16 and Grad-CAM as a visualisation method.26 Nagasato et al28 used a dataset of 125 central RVOs and 238 healthy controls to train and test both a DL model and a support vector machine (SVM) model. The authors obtained a higher sensitivity and specificity for the DL model (sensitivity: 98.4%, specificity: 97.9%, AUC-ROC of 98%) than for the SVM model (sensitivity: 84%, specificity: 87.5%, AUC-ROC of 89.5%) for this binary classification.28 Cai et al gathered 1182 UWF-CFP images from 190 patients with SCR to build their DL model (Inception V4 architecture), aiming to automatically detect sea fan neovascularisation. The authors used two visualisation methods, Grad-CAM and SmoothGrad. The model obtained a sensitivity of 97.4%, a specificity of 97% and an AUC-ROC of 98.8% for detecting sea fan neovascularisation.35 Nevertheless, these studies used binary classifications (healthy vs retinal disease), while our model used a four-class classification system. Although our model achieved a high overall accuracy, a high sensitivity was obtained only for SCR (94.7%) and RVO (78.7%). Conversely, the sensitivities for DR and healthy controls were not high enough for an efficient screening tool.
Interestingly, as DR and RVO generate somewhat similar vascular changes at the posterior pole (ie, haemorrhages) and in the periphery (ie, non-perfusion), 13 DR images were erroneously classified as RVO.
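The difference between per-class accuracy, sensitivity and specificity in a four-class setting can be made concrete with a small one-vs-rest computation. This is a minimal sketch: the labels below are toy values chosen for illustration, not the study's data, and `one_vs_rest_metrics` is a hypothetical helper name.

```python
from collections import Counter

CLASSES = ["DR", "RVO", "SCR", "healthy"]


def one_vs_rest_metrics(y_true, y_pred, classes=CLASSES):
    """Per-class accuracy, sensitivity and specificity, treating each class as 'positive' in turn."""
    n = len(y_true)
    pairs = Counter(zip(y_true, y_pred))  # counts of (true label, predicted label)
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        fn = sum(v for (t, p), v in pairs.items() if t == c and p != c)
        fp = sum(v for (t, p), v in pairs.items() if t != c and p == c)
        tn = n - tp - fn - fp
        metrics[c] = {
            "accuracy": (tp + tn) / n,
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return metrics


# Toy labels (illustrative only): one DR image is misread as RVO,
# and one healthy image is misread as DR.
y_true = ["DR", "DR", "RVO", "RVO", "SCR", "SCR", "healthy", "healthy"]
y_pred = ["DR", "RVO", "RVO", "RVO", "SCR", "SCR", "healthy", "DR"]

metrics = one_vs_rest_metrics(y_true, y_pred)
overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the DR class has a one-vs-rest accuracy of 0.75 but a sensitivity of only 0.5, because the many true negatives from the other three classes inflate accuracy. The same effect explains how a class can show high accuracy and specificity while its sensitivity remains too low for screening.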
Consistent with recent literature, we used two visualisation methods (saliency maps and Grad-CAM++) for the model’s output, which allowed us to evaluate the areas the model relied on when making a prediction. For UWF-CFP DR images, our model relied on the haemorrhagic areas and the hard exudates to predict the class (figure 2). In RVO, the model correctly highlighted the diffuse haemorrhages (figure 2). For SCR, the model took into account the foveal reflex and nerve fibre layer (figure 2). Because the foveal reflex is more apparent in the younger population of patients with SCR (as opposed to the older patients with DR or RVO), eyes with SCR were more readily identified by the DL classifier. In other SCR cases, however, predictions also seemed to rely on peripheral vascular signs such as sea fans or peripheral non-perfusion, as seen in figure 2.
This DL algorithm could be an impactful tool in areas lacking ophthalmological care. Maa et al36 reported that telemedicine in ophthalmology could reduce costs and improve access to care. In areas with a shortage of ophthalmological care, the availability of a non-invasive, fast, non-mydriatic UWF-CFP system could allow an accurate diagnosis of the most prevalent retinal vascular diseases, followed by referral to a specialist for confirmation and management. Moreover, sickle cell disease is an inherited disorder, and in Europe and the USA most patients are concentrated in referral centres. As ophthalmology clinics are not always available in all of these referral centres, automated artificial intelligence (AI) detection could be of great interest for the diagnosis of retinal involvement.
Our study has several limitations. First, our dataset was rather small, given that 224 images were available for model construction and testing across four classes. In comparison, Nagasawa et al26 and Nagasato et al28 used comparable datasets for binary classifications. Second, the use of the Optos pseudocolour UWF-CFP may be a limitation. With the Optos system, pseudocolour images are obtained using red and green scanning lasers, with different magnification between the central and peripheral retina.37 This may artefactually enhance certain features while diminishing others. In our dataset, the pseudocolour UWF images of the retina were not individually balanced for the green and red laser images by a grader before export. Moreover, UWF-CFP images can contain artefacts that limit the discriminating power of models, such as eye contour elements (eyelids or eyelashes). Third, no objective quality assessment metrics were used. Finally, the lack of an external test dataset is a further limitation of this study.
The analysis of the prediction distribution showed that the model’s confidence differed between correct and incorrect predictions (online supplemental figure 2). Nevertheless, this difference in confidence might not be enough to clearly identify a correct prediction without prior knowledge of the ground truth. This is a widely known problem of neural networks (the calibration problem), and may be due, in our study, to the fact that some of our classes share visual information (such as RVO and DR images).
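One common way to quantify the calibration problem is the expected calibration error (ECE), which compares the average confidence with the observed accuracy within confidence bins. The sketch below is offered as an illustration only: ECE was not reported in this study, the confidences are toy values, and `expected_calibration_error` is a hypothetical helper name.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between average confidence and accuracy over equal-width bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


# Toy values (illustrative only): top-class confidence per image and
# whether the corresponding prediction was correct.
confidences = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
correct = [True, True, True, False, True, False]

ece = expected_calibration_error(confidences, correct)
```

Here the high-confidence bin is overconfident (mean confidence 0.95 against an accuracy of 0.75), which dominates the resulting ECE of about 0.15. A perfectly calibrated model would have an ECE of 0, in which case confidence alone could be used to flag uncertain predictions.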
In conclusion, UWF-CFP combined with DL may be a useful way to detect and screen for retinal vascular diseases. This technology may be a useful tool for telemedicine and in remote areas with limited access to ophthalmic care.