Editorial

Artificial intelligence in ophthalmological practice: when ideal meets reality

‘It was the best of times; it was the worst of times.’ Such words never get old, and something similar is happening in ophthalmology. Artificial intelligence (AI), using machine learning and deep learning, has created incredible chemistry with ophthalmology,1 2 with the US Food and Drug Administration (FDA) approving AI-based diagnostic technology for autonomous diabetic retinopathy screening.3 A human-centred AI study on diabetic retinopathy screening, conducted in Thailand with the Google team, highlighted many unexpected real-world problems and questions. Further work is required to establish formal cost-benefit analyses, specific workflows for live implementation in diverse healthcare settings and solutions for real-world challenges such as the lack of internet connectivity or electronic health records.4

Transforming healthcare with AI is a beautiful aspiration from which everyone expects something.1 2 Most patients with stable but chronic eye disease would rather not spend hours in crowded waiting rooms for brief and hurried consultations; clinicians wish to focus their limited energy on solving problems; engineers want their code to change the world; and device manufacturers want their devices to be the gateway to a virtually connected reality. So, is reality as good as it seems?

The most compelling application of AI in ophthalmology is to aid diagnosis. Standardised image acquisition gives ophthalmology a unique advantage for deep learning. From corneal topography to optical coherence tomography angiography, many ophthalmic imaging examinations apply objective image quality checks at the time of acquisition, which can be used both to judge the credibility of the examination results and to supply high-quality input data for model training. Theoretically, AI-assisted diagnosis applies to most common conditions in ophthalmology, and researchers and engineers in the field have made outstanding contributions to this effort. Anyone who follows developments in this field will know that AI has shown fantastic potential in subspecialties such as diabetic retinopathy, glaucoma and retinal vein occlusion.5–8 Many AI-related ophthalmological programmes are underway worldwide, and many automated eye-disease screening and analysis devices have been successfully applied in clinical practice.9 AI can be found in almost all areas of ophthalmology, from the anterior segment of the eye to the fundus.10–12 Clinicians have contributed many noteworthy labels for machine learning based on their own experience, and we look forward to the next label that will make the models even better. As of 2020, there were 94 publicly accessible, downloadable ophthalmology databases; more than half of the image data are retinal fundus photographs, and 18% of these databases are not labelled with the diseases they contain. Unfortunately, most databases lack basic demographic information (age, sex, ethnicity, etc), and inclusion and exclusion criteria are missing. Barriers to using these data include low visibility, accessibility issues and limited usability due to incomplete metadata, including the lack of critical parameters needed to assess data sources, data quality and the diversity of the populations sampled.13
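
To make the point about acquisition-time quality checks concrete, the following is a minimal, purely illustrative Python sketch of how such scores might be used to select training images; the quality_score field and the 0.7 threshold are hypothetical and not drawn from any specific device or study.

```python
# Illustrative sketch only: using acquisition-time quality scores to filter
# images before model training. The quality_score field and the 0.7 threshold
# are hypothetical, not taken from any specific device or study.

def select_training_images(records, min_quality=0.7):
    """Keep only images whose acquisition-time quality score passes a threshold."""
    return [r for r in records if r.get("quality_score", 0.0) >= min_quality]

records = [
    {"path": "fundus_001.png", "quality_score": 0.92, "label": "diabetic_retinopathy"},
    {"path": "fundus_002.png", "quality_score": 0.41, "label": "normal"},  # rejected
]
print(select_training_images(records))
```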

Human learning behaviour has not yet been fully explained at the neurological level, and the emerging interdisciplinary field of cognitive neuroscience was born to study it.14 Most AI models are disease-centric, and it is challenging to train an AI model to recognise normal fundus photographs. For training, most AI models require index cases (positive cases) versus controls (non-positive cases that usually include normal eyes and other non-index pathologies). The human brain, on the other hand, more readily registers ‘normal’ images without any pathology. This is unique to the human brain compared with the convolutional neural networks we currently use. Of course, different algorithms can compensate for this; semi-supervised learning, for example, requires only a small proportion of the data to be labelled. To unravel the intricate structure behind the data, the machine must be able to infer patterns between observations for which it has received no explicit labels. Semi-supervised learning aims to provide mechanisms for making such connections, which will be essential for achieving this goal.15 Unfortunately, many semi-supervised learning methods outperform their supervised counterparts or base learners only in specific cases,16 17 and this limitation has received relatively little academic attention.18 Furthermore, different algorithms can be fused to reinforce one another, and many algorithms have been iteratively upgraded to keep pace with the times and with growing computing power. The first development to note is the recent advance in semi-supervised neural networks, which rest on the assumption that minor variations in the input space should cause only minor variations in the output space. This assumption makes it more straightforward than before to incorporate unsupervised loss terms into the cost function, and the same flexibility accommodates more complex cost terms. Another potential remedy for the lack of robustness of semi-supervised learning methods lies in applying automated machine learning to the semi-supervised setting. Such approaches include meta-learning, neural architecture search, automatic algorithm selection and hyperparameter optimisation, which have been prominently and successfully applied to supervised learning but not yet to semi-supervised learning.15 Usability, however, remains an issue: semi-supervised learning is much less standardised than supervised learning. The KEEL software package includes a semi-supervised learning module,19 and implementations of some transductive graph-based methods exist in scikit-learn.
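
As a concrete illustration of the transductive graph-based methods available in scikit-learn, the minimal sketch below runs LabelSpreading on the standard digits dataset with 90% of the labels hidden; the dataset and the split are illustrative choices, not taken from the cited work.

```python
# Minimal sketch of transductive, graph-based semi-supervised learning using
# scikit-learn's LabelSpreading. The digits dataset and the 90% unlabelled
# split are illustrative choices only.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelSpreading

X, y = load_digits(return_X_y=True)

# Pretend most labels are unknown: scikit-learn marks unlabelled points with -1.
rng = np.random.RandomState(0)
y_train = y.copy()
unlabelled = rng.rand(len(y)) < 0.9
y_train[unlabelled] = -1

model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_train)

# Evaluate how well labels propagated to the points we hid.
accuracy = (model.transduction_[unlabelled] == y[unlabelled]).mean()
print(f"Accuracy on unlabelled points: {accuracy:.3f}")
```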

While the algorithm evolves, it must also match the right application scenario. In the experiments in Bangkok, the ambient light in the clinic obstructed the proper functioning of the diabetic retinopathy diagnostic model.20 Nurses had to readjust constantly to obtain an image quality the machine would accept, reducing their productivity in an already busy workday. As humans train the models, the models train humans. Could we also consider adding some non-clinical factors into model training? For example, could AI-enabled automated image optimisation software improve luminance, contrast and eye-camera coordination during the image acquisition stage? Such features would help improve the model’s fit, although they require higher-resolution sensors.
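
A hypothetical sketch of what such acquisition-stage optimisation might look like is given below: a crude luminance/contrast check followed by contrast-limited adaptive histogram equalisation (CLAHE) via OpenCV. The thresholds and the quality heuristic are invented for illustration; a real device pipeline would differ.

```python
# Hypothetical sketch of automated luminance/contrast optimisation at the
# image-acquisition stage, using CLAHE from OpenCV. Thresholds and the quality
# heuristic are invented for illustration only.
import cv2
import numpy as np

def enhance_fundus_image(gray: np.ndarray) -> np.ndarray:
    """Boost local contrast so a downstream grading model sees a usable image."""
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(gray)

def is_acceptable(gray: np.ndarray, min_mean=40, max_mean=220, min_std=25) -> bool:
    """Crude acquisition-time check on overall luminance and contrast."""
    return min_mean < gray.mean() < max_mean and gray.std() > min_std

# Synthetic low-contrast frame standing in for a fundus capture.
frame = (np.random.rand(512, 512) * 40 + 80).astype(np.uint8)
print("Acceptable before enhancement:", is_acceptable(frame))
enhanced = enhance_fundus_image(frame)
print("Acceptable after enhancement:", is_acceptable(enhanced))
```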

Even if AI diagnostic performance is deemed clinically acceptable in the research and development phase, as in the example above, real-world AI implementation can pose many challenges, including AI bias due to differences in capture devices, anatomical locations and populations/ethnicities, as well as generalisability, data privacy, ethics and social equity. Just as we need evidence-based medicine, algorithm training requires constantly expanding the data sample and maintaining a certain update frequency. The FDA reviews and approves the marketing of AI-based technologies under its Software as a Medical Device framework, which evaluates algorithms throughout their life cycle.21 Big data and AI may also expose the health risks of specific populations, leading to an inevitable imbalance in the distribution of health insurance resources and to social injustice. Although AI systems can often achieve ‘state-of-the-art’ performance in ‘in silico’ testing, these findings are often not replicated in the real world. In this regard, a major focus of clinical AI research is the development of systems that are (1) robust (eg, will work on different machines and in different conditions), (2) reliable (eg, can give some measure of the certainty with which they provide an output), (3) safe (eg, can detect rare but potentially serious ophthalmic conditions) and (4) fair (eg, work equally well in different populations, particularly with regard to age, gender and ethnicity).
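
As one small illustration of the ‘fair’ criterion, the sketch below audits a screening model’s sensitivity per population subgroup; the field names and records are hypothetical and intended only to show the shape of such an audit.

```python
# Illustrative sketch only: auditing a screening model's sensitivity across
# population subgroups (one facet of fairness). Field names and data are
# hypothetical.
from collections import defaultdict

def sensitivity_by_group(records, group_key="ethnicity"):
    """Per-subgroup sensitivity = true positives / all disease-positive cases."""
    tp, pos = defaultdict(int), defaultdict(int)
    for r in records:
        if r["truth"] == 1:                  # disease actually present
            pos[r[group_key]] += 1
            if r["prediction"] == 1:         # model flagged it
                tp[r[group_key]] += 1
    return {g: tp[g] / pos[g] for g in pos if pos[g]}

records = [
    {"ethnicity": "A", "truth": 1, "prediction": 1},
    {"ethnicity": "A", "truth": 1, "prediction": 0},
    {"ethnicity": "B", "truth": 1, "prediction": 1},
    {"ethnicity": "B", "truth": 0, "prediction": 0},
]
print(sensitivity_by_group(records))  # {'A': 0.5, 'B': 1.0}
```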

Technological optimism and pessimism must strike a delicate balance in healthcare, a complex relationship among society, ethics and technology that only humans can weigh. Even now, if you ask any clinician whether AI is widely adopted in clinical practice, the answer is most likely ‘no’. Medicine is an art, and there are limits to the benefits that any single technological advance can bring. The true mission of AI is to optimise the patient and doctor experience in the clinic. Only when AI is combined with clinical practice can it step into the commercial world and achieve a positive cycle of investment and return. As clinicians, we all know the aphorism ‘to cure sometimes, to relieve often, to comfort always’. Pairing technology with humanistic care may be a precondition for AI to move towards medical reality; even if AI were sufficient to solve clinical problems, it cannot understand what is happening in the clinic.