Discussion
Figure 1 shows the stereopair scatter plots for subjective refractions (SR) from the 50 optometrists for the right (red) and left (blue) eyes of the participant, together with their 95% SCPD. The measurements for the right eye (red in figure 1) were more variable, more myopic and more astigmatic (see also table 1). Each point in the figure represents the SR from a single optometrist (of the 50 involved) for the participant’s right or left eye. The boundary of the distribution ellipsoid describes the region of the 3-space (SDPS) within which an estimated 95% of the sample (ie, SR) lies, and possibly also 95% of the population from which the sample was drawn, based on assumptions such as normality of data and random sampling. In this instance, because of outliers (measurements of SR that were atypical for the sample), departure from normality and the purposive rather than random selection of the optometrists, we need to be cautious when interpreting the meaning of these SCPD for the right and left eyes. Nonetheless, the points themselves and their distributions in SDPS are important in terms of measures of central tendency (ie, means) and measures of dispersion (such as variances), and the points (measurements) alone are not subject to the abovementioned assumptions (normality and random sampling) for the ellipsoids. Quantitative measures for variances and covariances will be influenced by factors such as outliers and departure from normality. The centres (or centroids) of these distribution ellipsoids represent the sample means (–7.68–4.50×10 and –4.59–1.85×178 for the right and left eyes, respectively; see table 1). Means such as these are less sensitive to outliers than are variances and covariances, but where necessary medians can be used instead.
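As an illustration of how points and centroids of this kind can be obtained, the following is a minimal sketch in Python. It assumes the commonly used mapping from clinical notation (sphere, cylinder, axis) to three coordinates (F_I, F_J, F_K), with F_I the nearest-equivalent sphere; the function names are hypothetical and the sketch is not the software used in the study. Averaging is done on the coordinates, never directly on sphere, cylinder and axis.

```python
import math

def clinical_to_coords(sphere, cyl, axis_deg):
    """Map sphere/cylinder x axis (dioptres, degrees) to (F_I, F_J, F_K)."""
    a = math.radians(axis_deg)
    f_i = sphere + cyl / 2.0                  # scalar (spherical) coefficient
    f_j = -(cyl / 2.0) * math.cos(2.0 * a)    # ortho-antistigmatic coefficient
    f_k = -(cyl / 2.0) * math.sin(2.0 * a)    # oblique-antistigmatic coefficient
    return (f_i, f_j, f_k)

def coords_to_clinical(f_i, f_j, f_k):
    """Inverse mapping, returned in negative-cylinder form."""
    half_cyl = math.hypot(f_j, f_k)
    cyl = -2.0 * half_cyl
    sphere = f_i - cyl / 2.0
    axis = math.degrees(0.5 * math.atan2(f_k, f_j)) % 180.0
    return (sphere, cyl, axis)

def mean_refraction(refractions):
    """Average a list of (sphere, cyl, axis) tuples via their coordinates."""
    coords = [clinical_to_coords(*r) for r in refractions]
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    return coords_to_clinical(*centroid)

# Round trip of the two sample means quoted in the text.
right = coords_to_clinical(*clinical_to_coords(-7.68, -4.50, 10))
left = coords_to_clinical(*clinical_to_coords(-4.59, -1.85, 178))
```

The round trip simply checks that the mapping and its inverse agree; with real data, `mean_refraction` would be called on the 50 refractions for one eye.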
All points below the origin of the three axes in figure 1A indicate the myopic astigmatic states of the eyes concerned. The cluster of points in blue closest to the origin represents SR for the left eye. The points lying outside the ellipsoidal boundary or surface are possible outliers, and some are noted for both eyes (red for right and blue for left eye). MD and Euclidean lengths are helpful in identifying the optometrists where SR was atypical as against the remainder of the group. These would be the ones with the largest Euclidean lengths (or differences) for OD and OS. However, most of the optometrists had similar results (if we exclude a few that were atypical), but there was possibly greater difficulty with measurement of SR in the eye with the greater ametropia (ie, the right eye of the participant).
The excesses in figure 1B and the mean excesses in table 1 also confirm that, despite a few larger excesses (possible outliers), SR is mostly highly reproducible across the 50 optometrists concerned, and reproducibility also seems to improve with learning and experience by the participant. Variation in the residuals was mainly scalar (spherical) rather than antistigmatic, as also indicated in table 1. The starbursts in figure 4 and the mean Euclidean differences again indicate the high reproducibility of SR, despite the presence of moderate to severe ametropia in the eyes of the participant. As before, greater difficulty was experienced in measuring SR for the eye with greater ametropia: the mean Euclidean difference for OD (0.60 D) was almost double that for OS (0.34 D). The SD for the Euclidean differences was also greater for OD, that is, for the eye with greater ametropia.
There is minimal variability in the scalar coefficients of SR (see table 1, where S_II ≈ 0.38 D² and ≈ 0.16 D² for OD and OS, respectively), with again greater variance for SR for the eye with greater ametropia. Note also the relatively large orthoantistigmatic variance (S_JJ ≈ 0.33 D²) for OD as compared with the other antistigmatic variances. The differences in SR findings could be due to changes in the participant’s subjective responses (perhaps because of eyelid squinting or misunderstanding instructions) and to the examiners using different refractive procedures or different endpoint criteria. Fatigue and/or learning may also be factors, depending on circumstances during specific SR and, of course, relating also to the techniques of individual optometrists and their ease of measuring the SR. Since the participant was prepresbyopic and thus still able to accommodate, some optometrists might have failed to relax the participant’s accommodation completely, and this might have been a confounding or extraneous factor. Since all the visits were completed within a period of 6 months, it is unlikely that the participant’s subjective refractive states would have changed very much between the various examinations, or that other factors such as diurnal variation (SR was measured at different times of day) or environmental factors such as ambient temperature or humidity would have been very influential, although one cannot rule out some effects of this sort. While some optometrists might have used ‘maximum plus to best visual acuity’ as their endpoint, others might have chosen other options (undercorrecting or overcorrecting slightly); this is thus an unknown factor in terms of the final SR that each optometrist determined and a possible influence in this study.
Of importance here is that, besides possible outliers, the samples for SR were also not found to be normally distributed. The samples mainly show leptokurtic distributions for SR irrespective of the eye (OD or OS) concerned, and there is also marked or severe positive or negative skewing, depending on the meridian concerned (figure 2). Some of this departure from normality relates to the presence of possible outliers or to variability due to the various methods applied during measurement of SR, perhaps including differences in the application of interocular accommodative equalisation or balancing techniques (such as the Humphrey immediate contrast test or others), differences in endpoints and possibly also different experience levels of the optometrists concerned. Increasing the sample size and randomising the selection of optometrists might reduce such departure from normality, but it could also be that SR is inherently not normally distributed because of processes (such as emmetropisation or departures therefrom, and genetics) that apply to ametropia; it is understood that some samples, such as those of university students, are typically leptokurtic and skewed, with a greater prevalence of myopia.35 36
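For readers who wish to quantify skewness and kurtosis of the kind described here, a minimal Python sketch follows. It uses the simple moment-based (uncorrected) estimators, which may differ from those used for figure 2; the function name and the example values are hypothetical and are not data from the study.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based sample skewness and excess kurtosis (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0            # 0 for a normal distribution
    return skew, excess_kurtosis

# Hypothetical powers (D) along one meridian, with one atypical measurement.
powers = [-7.50, -7.75, -7.50, -7.50, -7.25, -7.50, -7.75, -7.50, -7.50, -10.00]
skew, kurt = skewness_and_excess_kurtosis(powers)
```

In this made-up sample the single atypical value produces negative skew and positive excess kurtosis (a leptokurtic distribution), which illustrates how a few outliers can drive departure from normality.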
Although an analysis such as this must first consider and understand the samples themselves, the research question here is directed towards exploring and understanding reproducibility of SR for the right and left eyes of the participant. The last section of the results (see above) analyses this aspect, mainly through the use of starbursts (figure 4) and also Euclidean differences13 and excesses (see figure 1B and C and table 1) for SR from each optometrist as compared with the means for SR for OD and OS from the 50 optometrists. The mean SR for either the right (OD) or left (OS) eye was regarded as the gold standard or criterion standard, and the SR measured by each optometrist was subtracted from the mean SR for the right or left eye, as applicable, using matrices. Euclidean differences compare the individual SR for each optometrist to the mean SR for the 50 optometrists. If an optometrist had the same result as the mean SR for the eye concerned, then the Euclidean difference would be zero; the larger the Euclidean difference, the farther that optometrist was from the mean SR for the right or left eye. So, figures 1 and 4 are relatively simple methods to understand visually SR for the different eyes and different optometrists involved, while table 1 supplies quantitative results to indicate just how different the measurements are, that is, the level of agreement or similarity and reproducibility of SR.
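The calculation of a Euclidean difference can be sketched as follows in Python. This is an assumption-laden illustration rather than the authors' implementation: each refraction is mapped to coordinates (F_I, F_J, F_K), and the Euclidean difference is taken as the length of the difference between two coordinate vectors. Under that assumption, two refractions that differ only by a sphere of 0.50 D have a Euclidean difference of 0.50 D, which is one way to read the value in sphere-equivalent terms. Function names and example refractions are hypothetical.

```python
import math
import statistics

def clinical_to_coords(sphere, cyl, axis_deg):
    """Map sphere/cylinder x axis (dioptres, degrees) to (F_I, F_J, F_K)."""
    a = math.radians(axis_deg)
    return (sphere + cyl / 2.0,
            -(cyl / 2.0) * math.cos(2.0 * a),
            -(cyl / 2.0) * math.sin(2.0 * a))

def euclidean_difference(sr, reference):
    """Length (D) of the coordinate difference between an SR and a reference SR."""
    p = clinical_to_coords(*sr)
    q = clinical_to_coords(*reference)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def summarise(refractions, reference):
    """Mean and SD of the Euclidean differences from the reference."""
    d = [euclidean_difference(r, reference) for r in refractions]
    return statistics.mean(d), statistics.stdev(d)

# Hypothetical refractions from three examiners and a hypothetical reference.
reference = (-4.50, -1.75, 180)
examiners = [(-4.50, -1.75, 180), (-4.25, -2.00, 175), (-5.00, -1.50, 5)]
mean_d, sd_d = summarise(examiners, reference)
```

An examiner whose result equals the reference contributes a difference of zero, and every other examiner contributes a positive value, so the mean of the differences cannot be negative.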
The results of this study indicate that most optometrists were within 0.60±0.65 D of the mean SR for the right eye and within 0.34±0.29 D for the left eye (here, the mean Euclidean distance and its SD). These are measures in SDPS and they are never negative; they are also not quite the same as other measures of reproducibility that might be expressed in terms of clinical notation or power vectors, although they can be thought of in sphere-equivalent terms. The smaller the mean Euclidean distance (ie, the closer it is to zero), the greater the reproducibility of SR; and the smaller the SD, the lesser the variation of the Euclidean differences. Here, the SDs (0.65 D and 0.29 D) are relatively large in relation to their means, but removal of possible outliers would decrease them. So, this study suggests that SR performed by 50 optometrists on a single participant may differ (sphere-equivalent mean Euclidean differences) by ≈0.34 to 0.60 D. The ranges of the Euclidean differences were larger for the right eye than for the left eye, meaning that reproducibility was better for SR for the left eye, with the lesser ametropia, than for the right eye. This could be due to issues such as difficulties in performing SR, or perhaps less experience among some of the optometrists, that resulted in the possible outliers (eg, optometrists 2, 5 and 6 for OD and 5, 9 and 10 for OS). These optometrists (see online supplemental figure) had larger Mahalanobis distances, and in the starbursts the longer lines or comets correspond to these observations. (These potential outliers can also be found in figure 1A outside the distribution ellipsoids for OD and OS.)
Excesses (table 1 and figure 1) are used to compare each SR for OD or OS of the participant to the means (OD or OS) for the group of 50 optometrists. Of course, we do not know the true SR for OD or OS for the participant and, thus, the averages for SR for the group are used as our best estimates of the true SR for OD and OS. These means are thus the references to which individual subjective refractions are compared, and the smaller the residual (excess or difference), the closer the SR measured by the optometrist is to the mean SR. Although there were some optometrists with larger excesses, where the SR was quite different from the mean SR for the eye (OD or OS) concerned, the mean excesses in table 1 were almost zero. Thus, on average, there was not much of a difference between the SR of individual optometrists and the mean SR for either OD or OS, and consequently the 95% confidence ellipsoids (most obvious in figure 1C) are very small and positioned very close to O D (or 0 D in conventional terms).
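A minimal Python sketch of the excess calculation is given below. It assumes that each SR has already been expressed as a coordinate triple (F_I, F_J, F_K) and that an excess is the sample mean minus the individual SR, following the subtraction described in this discussion. The function names and the three example triples are hypothetical and are not data from the study.

```python
def centroid(coords):
    """Component-wise mean of a list of (F_I, F_J, F_K) triples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def excesses(coords):
    """Excess of each SR relative to the sample mean (mean minus individual)."""
    mean = centroid(coords)
    return [tuple(m - x for m, x in zip(mean, c)) for c in coords]

# Hypothetical coordinate triples (D) for three examiners.
sample = [(-4.00, 0.50, 0.10), (-4.50, 0.75, -0.10), (-5.00, 1.00, 0.00)]
exc = excesses(sample)
mean_excess = centroid(exc)
```

With the reference defined as the sample mean, the mean excess is zero apart from rounding, so it is the spread of the individual excesses, rather than their mean, that carries the information about reproducibility.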
MacKenzie,11 in the UK, investigated the reproducibility of spherocylinder prescriptions from a healthy young man as provided by 40 experienced optometrists. His study also used univariate and multivariate methods for dioptric power to evaluate the reproducibility. He concluded that optometrists may differ by ≈0.78 D in their stigmatic component (F_I or M) and by approximately 0.50 D in cylinder (F_c) in 95% of repeated measures. The current study mainly used Euclidean differences,13 which can be either sphere equivalent or cylinder equivalent,13 37–39 to obtain the 95% of repeated measures. Sphere equivalency was used herein as this is possibly easier to understand in terms of clinical applications, and we found similar results to MacKenzie. In another study, by Shah et al,40 six eyes from three groups of standardised patients (basically patients trained to be expert observers) with healthy eyes were each examined by three or, in some cases, four optometrists. Both the first and second groups of standardised patients had no cylindrical correction in their right eyes and either relatively small amounts of astigmatism (F_c: −0.25 D) or no astigmatism at all in their left eyes. The spherical ametropia ranged from −3.75 to −4.00 D with a mean of −3.94 D. The third group had no distance prescription. Based on the reproducibility limit data obtained, they concluded that any two optometrists could differ by ≤0.75 D in their spherical equivalent refraction and by between 0.25 D and 0.61 D in their cylindrical components (F_c) in 95% of repeated measures.
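Statements of the form "two optometrists could differ by up to x D in 95% of repeated measures" are commonly based on a reproducibility limit of 1.96 × √2 × SD (about 2.8 × SD). The Python sketch below shows that arithmetic. It assumes approximately normal data and treats the SD across examiners as the reproducibility SD; the cited studies may have estimated their limits differently. The function name and the example values are hypothetical.

```python
import math
import statistics

def reproducibility_limit(values):
    """95% limit for the difference between two examiners: 1.96 * sqrt(2) * SD."""
    sd = statistics.stdev(values)      # SD across examiners for one eye
    return 1.96 * math.sqrt(2.0) * sd

# Hypothetical spherical equivalents (D) from three examiners for one eye.
spherical_equivalents = [-4.25, -4.50, -4.75]
limit = reproducibility_limit(spherical_equivalents)
```

The factor √2 arises because the difference between two independent measurements has twice the variance of a single measurement. Because the limit depends on normality, it should be read with caution for samples that are skewed or leptokurtic.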
The findings of the present study are also mostly in agreement with those of Bullimore et al,8 who reported reproducibility limits of 1.10 D for spherical equivalent refraction and 1 D for cylinder (F_c). They used power vectors M, J_0 and J_45 rather than power matrices to evaluate their results, but their study design was based on examination of 86 participants by two examiners, so comparisons of their results and ours (one participant only but 50 examiners) should be made with caution as the research designs were obviously quite different. Zadnik et al7 estimated the 95% limits of agreement for SR to be ±0.63 D. However, their findings are not very comparable with the results of our study since they were based on an analysis of power measured along the vertical meridian of each eye. Rosenfield and Chiu10 investigated the repeatability of subjective and objective refractions by one examiner on 12 participants on five separate occasions and showed that the 95% limits of agreement for sphere (F_s) and cylinder (F_c) powers were ±0.29 D and ±0.16 D, respectively. However, their study assessed repeatability (repeated measures by the same examiner on different occasions; in their case, two examiners analysed separately) rather than reproducibility (different examiners as compared with one another) as investigated in the current study.
While there are several studies2 5–11 that provide insight into the reproducibility of SR, the findings of those studies are based on SR measurements collected from two, three or even five examiners, and in some instances, students were used as participants. Another limitation was that examiners were not always masked to the results of previous SR or spectacle prescriptions. Although the present study investigated the reproducibility of the SR of a symptomatic participant using SR from 50 experienced optometrists, this participant cannot be considered representative of the diverse population of South Africa or of the range of ametropias possible. Other possible limitations relate to the methods used for measurement of SR, which might have varied across the different optometrists, whose years of experience after graduation varied from 5 to 20 years. The environments or practices where the SR were measured would also have differed in terms of lighting, and different charts and test distances might also have applied. Mostly phoropters would have been used, but sometimes trial frames and lenses might have been used for the full SR or to check the phoropter-based SR. Retinoscopy and/or autorefraction might also have been incorporated before doing SR. Endpoints for SR might have differed across the optometrists. All SR were measured without cycloplegia and, thus, changes in ocular accommodation may have had unknown influences on SR as determined and analysed here. Additionally, it is likely that there were learning effects as the participant experienced multiple eye examinations, and this could have affected the reproducibility of SR.
Strengths of this study include that a comprehensive investigation and analysis were performed using appropriate methods for analysis of refractive state, and that it is probably the first study of this type in the South African context. Reproducibility of SR is not well understood, and this study advances our knowledge and understanding of this rather complicated but intriguing topic.
Recommendations
Future studies could include more participants and perhaps fewer optometrists and possibly be performed both with and without cycloplegia. A wider range of ametropias could be useful in developing greater understanding of the area of interest involved. Younger and older participants than the one involved here would also be very useful for further study. This study was based in only a single geographic region and similar studies could be done elsewhere. A standard protocol for SR could be developed and used by the different examiners involved, and greater control of clinical environments could be used in future studies of this type should the intention be to limit specific factors that might affect reproducibility. Learning effects also should be taken into consideration where multiple eye examinations are contemplated.