Introduction
Prognostic models are an essential component of personalised medicine, allowing health experts to predict the future course of disease in individual patients.1 Advances in computing power and an abundance of data have allowed increasingly sophisticated models to be developed. Most existing prognostic models use statistical methods such as logistic regression; these models require prior feature extraction, either manual or automatic,2 and are limited in the number of variables they can include. Feature extraction can be costly and time consuming, especially for imaging data. Deep learning avoids explicit feature extraction, allowing models to be developed without handcrafted features; for this reason, it is especially useful for imaging data. Prognostic deep learning models have been developed in several fields, primarily ophthalmology,3 cardiology4 and neurology,5 and for several modalities, including MRI, optical coherence tomography (OCT), colour fundus photography and X-ray.
Current prognostic models that use deep learning to analyse imaging data either use automatic feature extraction algorithms to extract known features or consider only a single time point. Models developed using feature extraction train algorithms on annotated images to extract relevant features, such as volumes in OCT data; those features are then fed into a traditional statistical model (see refs 3 6 7 for examples). Manual feature extraction is time consuming and requires expert readers. Yim et al8 proposed a method which automatically segments OCT layers before classification. This method outperformed human experts; however, automatic feature extraction requires annotations during training, which are not always available, particularly when the features are unknown or difficult to quantify, as is the case with colour fundus imaging.
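To make this two-stage design concrete, the following minimal sketch pairs a placeholder feature extraction step with a traditional logistic regression model. The `extract_volume_features` helper and the summary features it computes are hypothetical stand-ins; in a real pipeline, stage 1 would be a segmentation model trained on annotated scans.

```python
# Minimal sketch of the two-stage "explicit feature extraction" pipeline.
# The feature extractor below is a hypothetical placeholder, not a real
# segmentation model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_volume_features(oct_scans):
    """Stage 1: reduce each scan to a handful of handcrafted features.

    A real pipeline would run a trained segmentation model and integrate
    the segmented layer masks into volumes; here we use simple intensity
    statistics as stand-ins.
    """
    return np.array([[scan.mean(), scan.std()] for scan in oct_scans])

# Stage 2: a traditional statistical model fitted on the extracted features.
rng = np.random.default_rng(0)
scans = rng.random((100, 64, 64))        # stand-in for 100 OCT scans
labels = rng.integers(0, 2, size=100)    # stand-in progression labels
X = extract_volume_features(scans)
model = LogisticRegression().fit(X, labels)
print(model.predict_proba(X[:3]))        # predicted progression probabilities
```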
An alternative to explicit feature extraction is to use deep learning to extract features implicitly, as done by Arcadu et al9 and Babenko et al.10 Many models take the most recent available image and fine-tune a pretrained convolutional neural network (CNN), with Inception V311 being a popular choice due to its generalisability and high performance in a variety of tasks. This method, unlike the feature extraction method, may be applied to any image, even when features are not explicitly known; however, it creates a separate issue: by using only one image, these models may fail to capture temporal patterns across time points. Most recently, Yan et al12 used Inception V3 to classify single images, combined with genetic factors, to predict progression to late age-related macular degeneration (AMD). They found that images alone provided reasonable performance and that including multiple genetic factors further increased it; however, this work still considered only a single time point.
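As a concrete illustration of this single-time-point approach, the sketch below fine-tunes an ImageNet-pretrained Inception V3 for a two-class progression outcome. The binary head and dummy inputs are assumptions for illustration, not the exact setup of the cited works.

```python
# Minimal sketch of the single-time-point baseline: fine-tune a pretrained
# Inception V3 on the most recent image. The two-class progression head is
# an assumption for illustration.
import torch
import torch.nn as nn
from torchvision.models import inception_v3, Inception_V3_Weights

model = inception_v3(weights=Inception_V3_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)  # replace the 1000-class head
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, 2)

x = torch.randn(4, 3, 299, 299)                # Inception V3 expects 299x299 input
y = torch.tensor([0, 1, 0, 1])                 # dummy progression labels

model.train()
logits, aux_logits = model(x)                  # auxiliary head is active in training
loss = nn.CrossEntropyLoss()(logits, y) + 0.4 * nn.CrossEntropyLoss()(aux_logits, y)
loss.backward()
```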
Here, we develop a prognostic model to predict disease progression from longitudinal images. The proposed method is demonstrated on a dataset of 4903 eyes with AMD, taken from the Age-Related Eye Disease Study (AREDS).13 The method generalises to any longitudinal imaging data. We show that by considering the time interval between images and adopting a method from time series analysis, we can significantly improve prediction performance.
Our contributions are as follows:
Propose a novel method to predict a patient's prognosis from longitudinal images.
Introduce interval scaling, which accounts for uneven time intervals between visits (an illustrative sketch follows this list).
Demonstrate the method on the largest longitudinal AMD dataset, where it outperforms other state-of-the-art methods.
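The sketch below shows one way uneven visit intervals can be incorporated into a recurrent model: decaying the hidden state with the elapsed time before each step, in the spirit of time-aware recurrent networks. This is an assumption for illustration, not the exact interval scaling formulation used in this work; the decay constant `tau` and the per-visit CNN features are hypothetical.

```python
# Illustrative sketch only: handling uneven visit intervals by decaying the
# recurrent state with the time since the previous visit. Not the paper's
# exact formulation; tau and the feature dimensions are hypothetical.
import torch
import torch.nn as nn

class IntervalScaledRNN(nn.Module):
    def __init__(self, feat_dim=256, hidden_dim=128, tau=12.0):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.cell = nn.LSTMCell(feat_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, 2)
        self.tau = tau  # decay constant, e.g. in months

    def forward(self, feats, deltas):
        # feats:  (batch, visits, feat_dim) CNN features, one per visit
        # deltas: (batch, visits) months since the previous visit (0 for the first)
        h = feats.new_zeros(feats.size(0), self.hidden_dim)
        c = torch.zeros_like(h)
        for t in range(feats.size(1)):
            decay = torch.exp(-deltas[:, t] / self.tau).unsqueeze(1)
            h, c = self.cell(feats[:, t], (h * decay, c * decay))
        return self.head(h)  # progression logits from the final state

model = IntervalScaledRNN()
feats = torch.randn(2, 5, 256)                  # 2 eyes, 5 visits each
deltas = torch.tensor([[0., 6., 6., 12., 6.],   # uneven gaps in months
                       [0., 3., 9., 6., 18.]])
logits = model(feats, deltas)
```

The key design point is that a longer gap between visits shrinks the carried-over state, so the model weights recent images more heavily than stale ones.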