A Comprehensive Assessment of the Precision and Agreement of Anterior Corneal Power Measurements Obtained Using 8 Different Devices

Purpose To comprehensively assess the precision and agreement of anterior corneal power measurements using 8 different devices. Methods Thirty-five eyes from 35 healthy subjects were included in this prospective study. In the first session, a single examiner measured each subject, in random order, with the RC-5000 (Tomey Corp., Japan), KR-8000 (Topcon, Japan), IOLMaster (Carl Zeiss Meditec, Germany), E300 (Medmont International, Australia), Allegro Topolyzer (Wavelight AG, Germany), Vista (EyeSys, TX), Pentacam (Oculus, Germany) and Sirius (CSO, Italy). Measurements were repeated in a second session 1 to 2 weeks later. Repeatability and reproducibility of corneal power measurements were assessed based on the intrasession and intersession within-subject standard deviation (Sw), repeatability (2.77Sw), coefficient of variation (COV), and intraclass correlation coefficient (ICC). Agreement was evaluated by 95% limits of agreement (LoA). Results All devices demonstrated high repeatability and reproducibility of the keratometric values (2.77Sw<0.36D, COV<0.3%, ICC>0.98). Repeated-measures analysis of variance with Bonferroni post test showed statistically significant differences (P<0.01) among mean keratometric values of most instruments; the largest differences were observed between the EyeSys Vista and Medmont E300. Good agreement (i.e., 95%LoA within ±0.5D) was found between most instruments for flat, steep and mean keratometry, except for EyeSys and Medmont. Repeatability and reproducibility of vectors J0 and J45 were good, with ICCs higher than 0.9, except for J45 with the Medmont and Pentacam. The 95% LoAs of J0 and J45 were all within ±0.31 for every pair of devices. Conclusions The 8 devices showed excellent repeatability and reproducibility. The results obtained using the RC-5000, KR-8000, IOLMaster, Allegro Topolyzer, Pentacam and Sirius were comparable, suggesting that they could be used interchangeably in most clinical settings. 
Caution is warranted with the measurements of the EyeSys Vista and Medmont E300, which should not be used interchangeably with other devices due to lower agreement. Trial Registration ClinicalTrials.gov NCT01587287

The present study sought to prospectively determine the intrasession repeatability and intersession reproducibility of anterior corneal curvature measurements by using the 8 commercially available instruments mentioned above and evaluate differences in mean corneal power measurements for each of the different instruments in order to check the agreement between devices.

Subjects and Methods
The protocol for this trial and supporting CONSORT checklist are available as supporting information; see Checklist S1 and Protocol S1.

Subjects
Thirty-five young adult subjects (10 males and 25 females) with a mean age of 24.60 ± 1.64 years (range, 21 to 28 years) and a mean manifest spherical equivalent refraction of −4.15 ± 2.06 diopters (range, −1.0 to −9.0 diopters) were recruited for this prospective study. All procedures followed the Declaration of Helsinki, and the protocol was reviewed and approved by the Research Review Board at Wenzhou Medical College. Written informed consent was obtained from all subjects before inclusion in the study. All subjects had a best corrected distance visual acuity (BCVA) equal to or better than 20/25 to allow for adequate fixation. The exclusion criteria were 1) intellectual disability or somatic dysfunction, 2) previous ocular surgery, 3) history of ocular pathology, 4) contact lens wear, and 5) dry eye (significant subjective dry eye symptoms, Schirmer I test results of less than 5.0 mm, tear film break-up time shorter than 5 seconds and positive corneal fluorescein staining). All of these conditions can result in abnormal measurements. Each subject underwent a full ophthalmic examination including visual acuity, auto- and subjective refraction, slit-lamp examination, non-contact tonometry, corneal power measurements with the 8 devices presented above and fundus examination.

Instruments
The Tomey RC-5000 (software version 1.2.6) and the Topcon KR-8000 (software version Release 2E) autorefractors are designed based on the optical principle represented by the relationship between the size of an object and the size of the image of that object reflected from a surface. Assuming the cornea is a convex mirror, the automated keratometer instantly records the size and computes the radius of curvature while focusing the reflected corneal image (infrared illuminated mires) onto an electronic photosensitive device (infrared detectors). Both devices acquire radius of curvature measurements in the flat and steep meridians on a 3.0-mm diameter field of the central cornea.
The IOLMaster (software version 5.4) also works on the principle of reflection of a luminous pattern of mires by the anterior surface at the center of the cornea. To calculate corneal curvature, it reflects six points of light, arranged in a 2.5 mm-diameter hexagonal pattern, from the air/tear film interface; the reflected pattern depends on the corneal curvature.
The Medmont E300 (software version 5.1.0) is a Placido disk-based videokeratoscope that utilizes an arc-step reconstruction algorithm and incorporates a range finder. [5,22] It determines the distance from the corneal apex to the instrument's camera and automatically captures images. It has 32 Placido rings and measures 9 600 data points per scan. Each image captured is awarded a score out of 100 based on centering, focus and movement. The images were selected and saved when good focus and alignment were attained. A score higher than 75 was considered good. The device acquires radius of curvature measurements in the flat and steep meridians on a 3.0-mm diameter field of the central cornea.
The Allegro Topolyzer (software version 1.59) and EyeSys Vista (software version 3.11) are also Placido disk-based videokeratoscopes. The former contains 22 rings and generates high-resolution data of the corneal surface with 22 000 data points; the latter is a portable corneal topographer that is incorporated into the iTrace system with integrated software. It contains 26 Placido rings and measures 9 360 points. Both devices present keratometric data in three corneal zones: a central zone with a 3-mm diameter, a paracentral zone with a 5-mm diameter, and a peripheral zone with a 7-mm diameter. In this study, the 3-mm zone readings were chosen for improved correlation with the central optical zone and the areas of measurement covered by other instruments.
The most recent version of the Pentacam-HR rotating Scheimpflug camera system (software version 1.17r89) was used in this study. It captures 138,000 true elevation points using a high-resolution, 1.45 mega-pixel camera. The automatic release mode was used to reduce the number of operator-dependent variables. In less than 2 seconds, the rotating camera obtains 25 slit images of the anterior segment. Only scans with an "Examination Quality Specification" of "OK" were chosen for analysis.
The Sirius is a new device that combines a single Scheimpflug camera and a Placido disk to measure and image the anterior eye segment, including the cornea, anterior chamber, iris, pupil, and lens. It can acquire 25 Scheimpflug frames and one keratoscopy reading in less than 1 second. It is capable of measuring anterior and posterior tangential (instantaneous) curvature, sagittal (axial) curvature, altimetry, refractive power, equivalent refractive power, corneal thickness, and visual quality (spot diagram, point-spread function and optical transfer function). Anterior corneal measurements are performed by the Sirius using a proprietary method of merging the Placido and Scheimpflug data. The corneal power was calculated by averaging the axial curvature from the 4th to the 8th Placido ring. [6] Only scans meeting the "image acquisition quality" criteria of Scheimpflug image coverage ≥90%, centration ≥90%, and keratoscopy coverage ≥80% were chosen for analysis, using the available software version 1.0. Both Scheimpflug camera systems acquire radius of curvature measurements in the flat and steep meridians on a 3.0-mm diameter field of the central cornea.
All instruments convert the curvature measurements obtained from the anterior corneal surface into a total corneal dioptric value using the thin lens formula $P = (n_1 - n_0)/r$, where $n_0$ is the refractive index of air (1.0000), $n_1$ is the refractive index of the cornea (1.3375), and $r$ is the radius of curvature (measured in mm and expressed in meters to obtain diopters).
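As a minimal sketch of the conversion described above, the following Python function turns an anterior radius of curvature into corneal power. The function name and the explicit millimeter-to-meter conversion are illustrative assumptions; the two index values are those quoted in the text.

```python
N_AIR = 1.0000      # refractive index of air, as quoted in the text
N_CORNEA = 1.3375   # refractive index of the cornea, as quoted in the text


def corneal_power(radius_mm: float) -> float:
    """Return corneal power in diopters for an anterior radius given in mm.

    Hypothetical helper: the radius is converted to meters so that
    (n1 - n0) / r is expressed in diopters.
    """
    if radius_mm <= 0:
        raise ValueError("radius must be positive")
    return (N_CORNEA - N_AIR) / (radius_mm / 1000.0)
```

With these indices, a radius of 7.5 mm corresponds to approximately 45.0 D.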

Procedures
The present study's definitions of reproducibility, repeatability and agreement were based on those adopted by the British Standards Institute and the International Organization for Standardization. [23][24][25] The testing sequence of the measurements with these devices was randomly chosen to avoid methodological bias, using the random sample generator of MedCalc Statistical Software version 10.0.1.0 (MedCalc Software, Inc., Mariakerke, Belgium). The measurements were collected at least 3 hours after subjects woke from sleep. The subjects were asked to avoid substantial reading prior to the measurements. [26] All measurements were conducted between 10 am and 5 pm to minimize variations in the results. Only the right eye of each subject was selected, and cycloplegic drugs were not used. During the first session, three sets of measurements with all the devices were performed by a single experienced examiner (X.Z.) for all subjects according to the manufacturers' instructions. The examiner and subject were masked to the results of the previous measurements obtained from each device. Subjects were instructed to blink completely just before each measurement. The subjects were asked to sit back after each repeat measurement, and the device was realigned before each measurement. The time between repeated scans by the observer was kept as short as possible, and the measurements with the different instruments were taken consecutively, without significant time intervals. Measurements were repeated in the second session scheduled 1 to 2 weeks later, at almost the same time of day as the first session, by the same examiner using the same protocol (i.e., 3 measurements with each device), so that intersession reproducibility could also be determined. The mean of the 3 measurements of the first session was calculated for each non-contact keratometry device to assess the agreement among the 8 methods.

Sample Size Estimation
Sample size calculation was performed a priori using PS Power and Sample Size Calculation Software (version 3.014, Vanderbilt University, Tennessee, USA). Based on the result of a recent study of corneal power measurements obtained by different devices, the pooled SD of the differences in keratometry between devices was approximately 0.12 diopters (D). [7] Using a two-sided significance level (α) of 5% and a power (1 − β) of 99%, a sample size of 29 eyes was required to detect a difference of 0.10 D between instruments.
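The order of magnitude of this estimate can be checked with a short sketch. The function below uses the textbook normal approximation for a paired comparison, which is an assumption made here for illustration and not the algorithm of the PS software; the function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def paired_sample_size(sd_diff: float, delta: float,
                       alpha: float = 0.05, power: float = 0.99) -> int:
    """Normal-approximation sample size for detecting a mean paired difference.

    sd_diff: SD of the between-device differences (D)
    delta:   smallest difference to detect (D)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    n = ((z_alpha + z_beta) * sd_diff / delta) ** 2
    return math.ceil(n)


# Inputs quoted in the text: SD 0.12 D, difference 0.10 D, alpha 5%, power 99%.
n_eyes = paired_sample_size(sd_diff=0.12, delta=0.10)
```

With these inputs the approximation returns 27 eyes. Calculations based on the t distribution generally give a somewhat larger number for small samples, which would be in line with the 29 eyes reported above, although that correspondence is not verified here.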

Statistical Analysis
Statistical analysis was performed using SPSS software for Windows version 13 (SPSS Inc., Chicago, IL, U.S.) and Microsoft Office Excel. A P value of less than 0.05 was considered to be statistically significant. The distributions of the datasets were checked for normality using Kolmogorov-Smirnov tests. The results indicated that the data were normally distributed (P > .05). For each measurement, the flat (Kf) and steep (Ks) corneal power values, the average power of Kf and Ks (Kave), and the axes of Kf and Ks were noted. The corneal astigmatism was converted into a vector representation, J0 (cylinder at 0-degree meridian) and J45 (cylinder at 45-degree meridian), calculated according to previously published formulas. [27] These values were calculated for the 3 separate measurements in each session and then averaged to assess reproducibility and comparability.
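The formulas themselves are not reproduced in this text. As an illustration only, the sketch below assumes the widely used power-vector notation, $J_0 = -(C/2)\cos 2\alpha$ and $J_{45} = -(C/2)\sin 2\alpha$, where $C$ is the cylinder in diopters and $\alpha$ its axis. How the cylinder and axis were derived from Kf, Ks and their axes in this study is not stated, so the inputs and the function name are assumptions.

```python
import math


def power_vectors(cylinder_d: float, axis_deg: float) -> tuple[float, float]:
    """Return (J0, J45) for a cylinder (diopters) at a given axis (degrees).

    Assumed convention: J0 = -(C/2) cos(2a), J45 = -(C/2) sin(2a).
    """
    axis_rad = math.radians(axis_deg)
    j0 = -(cylinder_d / 2.0) * math.cos(2.0 * axis_rad)
    j45 = -(cylinder_d / 2.0) * math.sin(2.0 * axis_rad)
    return j0, j45
```

Under this convention a cylinder of −1.00 D at axis 180 maps almost entirely onto J0, and the same cylinder at axis 45 maps onto J45.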

Intrasession Repeatability and Intersession Reproducibility
To determine the intrasession repeatability of each device, the within-subject standard deviation (Sw), test-retest repeatability (TRT), the within-subject coefficient of variation (COV), and intraclass correlation coefficients (ICC) were calculated for the three repeated measurements obtained during the first and second sessions. [28] TRT was defined as 2.77Sw, which gives an interval within which 95% of the differences between measurements are expected to lie. The COV was calculated as the ratio of the Sw to the overall mean. A lower COV is associated with higher repeatability.
The advantage of COV values is that they can be compared between data sets with different units or widely different means. The disadvantage is that when the mean value is near zero, the COV is sensitive to small changes in the mean, limiting its usefulness. Therefore, we did not calculate the COV for vectors J0 and J45, whose mean values are close to zero. [8,9] The ICCs (ranging from 0 to 1) measure the consistency of data sets of repeated measurements. The closer the ICC is to 1, the better the measurement consistency. To assess intersession reproducibility, the mean of the three readings from each session was first calculated for each device, and then the intersession Sw, 2.77Sw, COV and ICCs were calculated.
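The indices defined above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the SPSS procedure used in the study: Sw is taken from a one-way analysis of variance with equal numbers of repeats, and the ICC is computed in its one-way random-effects, single-measurement form, since the text does not specify which ICC model was used. The function name and dictionary keys are hypothetical.

```python
import math
from statistics import mean, variance


def repeatability_stats(readings: list[list[float]]) -> dict[str, float]:
    """Compute Sw, TRT (2.77*Sw), COV (%) and a one-way ICC.

    `readings` holds one inner list per subject, each with the same number
    of repeated measurements (k >= 2).
    """
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two subjects")
    k = len(readings[0])
    if k < 2 or any(len(r) != k for r in readings):
        raise ValueError("each subject needs the same number (>= 2) of repeats")

    subject_means = [mean(r) for r in readings]
    grand_mean = mean(subject_means)

    # Within-subject mean square: average of the per-subject sample variances.
    ms_within = mean(variance(r) for r in readings)
    # Between-subject mean square from the one-way ANOVA decomposition.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)

    sw = math.sqrt(ms_within)
    return {
        "Sw": sw,
        "TRT": 2.77 * sw,
        "COV_percent": 100.0 * sw / grand_mean,
        # Assumed ICC form (one-way random effects, single measurement).
        "ICC": (ms_between - ms_within) / (ms_between + (k - 1) * ms_within),
    }
```

Calling `repeatability_stats` on a list with one inner list of three readings per eye gives the intrasession indices for one device; passing the two session means per eye gives the intersession equivalents.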

Comparison Among Devices
Repeated-measures analysis of variance (ANOVA) with Bonferroni correction was used to identify pairs that were significantly different. Bland-Altman analysis was performed to evaluate the agreement between devices. This involved calculating the 95% limits of agreement (LoA) as the mean difference ± 1.96 SD. A narrower 95% LoA indicates superior agreement between techniques.
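A minimal sketch of the limits-of-agreement calculation for one pair of devices follows. The function and argument names are hypothetical, and the sample SD of the paired differences is assumed.

```python
from statistics import mean, stdev


def limits_of_agreement(device_a: list[float],
                        device_b: list[float]) -> tuple[float, float, float]:
    """Return (mean difference, lower LoA, upper LoA) for paired readings."""
    if len(device_a) != len(device_b) or len(device_a) < 2:
        raise ValueError("need at least two paired readings")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread
```

Each list would hold one value per eye, for example the mean of the three first-session readings for that device, so that the two lists are paired eye by eye.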

Intrasession Repeatability
During the first and second sessions, the 2.77Sw values of repeated Kf and Ks measurements were lower than 0.36 D. With all devices, the COV was lower than 0.3% and the ICC higher than 0.98 (Tables 1 and 2). The COV of Kave was lower than 0.26%, and the ICC higher than 0.99 in both sessions (Table 3). For vectors J0 and J45, during the first session, the 2.77Sw values were lower than 0.36, and the ICCs were higher than 0.94, except for J45 on the Medmont (0.747) and J45 on the Pentacam (0.85). The second session displayed a similar tendency; the 2.77Sw values were lower than 0.27, and the ICCs were higher than 0.92, except for J45 on the Medmont (0.844) and J45 on the Pentacam (0.86) (Tables S1 and S2).

Intersession Reproducibility
Tables S3-7 show that the reproducibility of corneal power measurements was excellent for all devices. The differences between the two sessions were lower than 0.06 D for each device. The intersession reproducibility parameters demonstrated a trend similar to that of the intrasession repeatability assessments. The 2.77Sw values of repeated Kf and Ks measurements were lower than 0.35 D; the COV was lower than 0.28%, and the ICCs were higher than 0.99 for all devices (Tables S3 and S4). The ICC was also ≥0.99 for Kave (Table S5). Although the ICCs for power vectors J0 and J45 were lower than those for Kf and Ks, they were still higher than 0.9 except for J45 of the IOLMaster (0.898) and J45 of the Medmont (Tables S6 and S7).

Comparison between Devices
Tables 4, 5, 6 and Tables S8-9 summarize the comparisons between devices. There were statistically significant differences in Kf between any two paired devices except for the Topcon-Topolyzer, Topcon-Pentacam, IOLMaster-Sirius, and Topolyzer-Pentacam comparisons (Table 4). For Ks, the differences were not significant between the Tomey and Topolyzer, Tomey and Pentacam, Tomey and Sirius, Topolyzer and Pentacam, and Pentacam and Sirius. As shown in Tables 4 to 6, the Kf, Ks and Kave mean values of the Medmont were the largest, while the Kf, Ks and Kave readings obtained by the EyeSys were the smallest.
As regards Kave (Table 6), Tomey and Topolyzer, Tomey and Pentacam, IOLMaster and Sirius, and Topolyzer and Pentacam were not significantly different. Tables S8 and S9 showed significant differences in both J0 and J45: for the former, such differences were limited to the Tomey, Topcon and IOLMaster devices; for the latter, they involved all instruments except the EyeSys.
As regards agreement, Table 4 shows that the 95% LoAs for Kf were <0.5 D for all pairs of instruments, with the exception of the EyeSys and Medmont corneal topographers, whose agreement with the other devices was lower.
Agreement among the 8 devices was lower for Ks, as shown in Table 5: the 95% LoAs were equal to or smaller than 0.5 D for almost all instruments, except for the EyeSys and Medmont. Among these paired comparisons, the largest 95% LoA was obtained for the EyeSys-Medmont comparison (−0.88 to −0.22 D).
In the Bland-Altman analysis of Kave (Table 6), the 95% LoAs were equal to or larger than 0.5 D for comparisons involving the Medmont and EyeSys corneal topographers and lower than 0.5 D for the other comparisons. The 95% LoAs of J0 and J45 were all ≤ ±0.31 for every pair of devices.

Discussion
Accurate measurements of corneal power and astigmatism are a crucial requirement in this era of refractive cataract surgery: the former is needed by all formulas calculating IOL power, the latter is needed when planning toric IOL implantation or surgical correction of astigmatism by excimer laser. In this prospective study, we assessed the intrasession repeatability, intersession reproducibility, and agreement of corneal powers obtained from 8 different devices. To our knowledge, no previous study has assessed the precision and interchangeability of keratometry on such a large number of instruments. Moreover, while several studies have assessed the repeatability of one or more instruments in measuring corneal power, in only a very few cases has the repeatability of astigmatism measurements been assessed by means of vector analysis.
All devices demonstrated excellent intrasession repeatability and intersession reproducibility in measuring Kf, Ks and Kave (ICC ≥0.98 for all). The power vector M corresponds to the SimK. The repeatability and reproducibility of vectors J0 and J45 were slightly lower, but still reasonably good. As regards repeatability, the ICCs of J0 and J45 ranged, respectively, from 0.925 (EyeSys) to 0.994 (Tomey and IOLMaster) and from 0.747 (Medmont) to 0.982 (Tomey). As regards reproducibility, the ICCs of J0 and J45 ranged, respectively, from 0.917 (Medmont) to 0.990 (Topolyzer) and from 0.803 (Medmont) to 0.971 (Topolyzer).
In comparing the 8 devices, the means of the differences were similar, which suggests a good degree of concordance among them. However, in some cases (EyeSys and Medmont) agreement was only fair, so caution is recommended when using these devices interchangeably with the others.

IOLMaster
Our data confirm the excellent intrasession repeatability of corneal power measurements by the IOLMaster, as previously reported by other authors. [7,10] As in the study by Shirayama et al., [7] the IOLMaster showed the lowest COV in comparison to the other devices tested. We also observed good intersession reproducibility of keratometry by the IOLMaster, with ICCs for Kf and Ks even higher than those reported by Shammas et al. [29] Previous studies found that the IOLMaster provided slightly steeper corneal power readings than manual keratometry, automated keratometry, Placido disc corneal topography, Scheimpflug imaging, and Scheimpflug imaging combined with Placido disc corneal topography. [7,11,30] This is in good agreement with our findings, as we observed that IOLMaster corneal power measurements were higher than those provided by all instruments except the Medmont. Steeper corneal power values by the IOLMaster may be related to the more central corneal curvature readings of this instrument, which takes measurements over a diameter of approximately 2.5 mm. It is known that the central corneal curvature is steeper than the peripheral curvature in a normal prolate cornea. [7,11,12]
The IOLMaster also offered high repeatability for astigmatic vector analysis as well as good, although slightly lower, reproducibility. To our knowledge, this is the first study to evaluate repeatability and reproducibility of J0 and J45 measurements by the IOLMaster.

Tomey RC-5000 and Topcon KR-8000
In this study both autokeratometers offered excellent repeatability and reproducibility, thus confirming the data previously reported in children by Huynh et al. [11] for another autokeratometer (RK-F1, Canon, Japan) based on the same principle. In the present study, even higher degrees of reliability were found for the measurement of Kf and Ks by the RC-5000 and KR-8000, suggesting that keratometry measurements in adults were more repeatable than those in children, perhaps because children have difficulty in maintaining proper posture during the image-acquiring procedures and have a short attention span.
Repeatability of both autokeratometers was slightly higher than that of the EyeSys corneal topographer, a result that had already been reported for a previous version of the Topcon autokeratometer. [31]
Repeatability and reproducibility of astigmatic vectors were high.

Medmont E300
The E300 has already been found to be a highly accurate and repeatable corneal topographer. [5,22] Our results are in good agreement with previous investigations for corneal power measurements. In contrast, the repeatability and reproducibility of the E300 for J45 were lower than those of the other instruments in this study.
The E300 produced the highest Kf, Ks and Kave among the whole set of instruments. This is consistent with previous studies demonstrating that the E300 gives significantly steeper corneal power values than other devices. [4,13] As a consequence, agreement with instruments providing the lowest Kf and Ks (such as EyeSys) was moderate.

EyeSys Vista
In this study the EyeSys topographer showed good repeatability and reproducibility of corneal curvature measurements, although the results were slightly lower than those of most instruments. High repeatability of the EyeSys for corneal curvature measurements had already been reported. [14,15]
Agreement between the EyeSys, which provided the flattest keratometry values, and the other instruments was fair. The 95% LoAs between the EyeSys and the other 7 devices were all larger than ±0.50 D. Clearly, this range does not allow this instrument to be used interchangeably with other devices. These results are consistent with the findings of previous studies. Stefano et al. [32] compared the keratometry values obtained by the EyeSys with those obtained by the Pentacam. Although they found a high correlation between the measurements obtained with both devices, the 95% LoAs, which ranged from −1.06 to 1.26 D and −0.87 to 0.85 D for Kf and Ks, respectively, were too large for the two instruments to be considered interchangeable. Similarly, Tsilimbaris et al. [33] compared the measurements obtained using the EyeSys and the Javal keratometer and found that there was no significant difference between the instruments. However, the 95% LoA ranged from −0.87 to 0.93 D, which also showed that the two instruments were in poor agreement. Other studies have also reported similar findings when measurements were taken with manual keratometry and Placido-based topography. [16,17] These devices were in poor agreement, although a good correlation between different keratometric methods was observed. [28]

Topolyzer
To date, no study has reported on the precision of keratometry measurements obtained by the Topolyzer, although it has been described as an effective and safe tool in topography-guided corneal excimer laser surgery to correct myopia, hyperopia, and mixed astigmatism. [34,35] The same model marketed by Oculus (Keratograph, Oculus, Germany) has been used to assess corneal wavefront aberrations. [36] Our results represent the first confirmation that the Topolyzer displays excellent reliability in measuring corneal power (ICCs ≥0.971) and astigmatism (ICCs >0.97 for both J0 and J45). The current study is also the first to demonstrate good agreement between the Topolyzer and other devices. The 95% LoA values of K were lower than 0.5 D in most cases, except for comparisons with the EyeSys Vista or Medmont E300.

Sirius
According to our data, measurements by the Scheimpflug camera combined with Placido corneal topography (Sirius) showed good repeatability and reproducibility. Results for keratometry are quite similar to those previously reported for intrasession repeatability of SimK by the same device. [6] Results for astigmatism decomposition components had never been reported.
The Sirius is a relatively new instrument and only a few studies have evaluated its ability to measure corneal curvature and power. [18] Savini et al. [18] compared the anterior segment measurements provided by 3 Scheimpflug topographers and 1 Placido corneal topographer in 25 subjects. Although the mean SimK was significantly different among the 4 instruments, post-test analysis did not reveal any statistically significant difference between the Pentacam and Sirius. In agreement with these findings, our study did not find a statistically significant difference between the two instruments for the steep K and found a statistically but not clinically significant difference for the flat K. The 95% LoA between the Pentacam and Sirius were slightly larger in the study by Savini et al. (−0.59 to 0.59 D) than in ours (−0.28 to 0.16 D). This discrepancy may be related to the different ages of the two samples, as the mean age of Savini's sample (57.9 ± 21.2 years) was higher than the mean age of our sample: young subjects have better fixation and tear film stability than older patients.

Pentacam
The Pentacam offered high repeatability and reproducibility in measuring corneal curvature, thus confirming the findings of previous studies that investigated its repeatability and agreement with respect to other instruments. [5,9,19] The values recently reported by McAlinden et al. [19] for Kf and Ks, as provided by the Pentacam HR, and those described by Shankar et al., [20] using the original Pentacam (not HR), are very close to ours. A comparison for astigmatism is difficult as both studies did not evaluate vector analysis. [19,20] The latter was carried out by Read et al., [5] who reported repeatability values similar to ours, although in their study the performance of Pentacam was slightly worse than that of the Medmont, whereas in our sample the opposite was true.
Comparison of the Pentacam to other instruments showed good agreement (LoA <0.5 D) and small mean differences in Kf and Ks in most cases. Recently, several authors compared corneal curvature measurements by the Pentacam to those of other instruments. In 2011 Savini et al. [18] reported no difference in the mean corneal power (SimK) of the Pentacam and Sirius (see above). In 2009 the same authors did not find any statistically significant difference among the Pentacam and two Placido disc corneal topographers, but the 95% LoA were large enough to be considered clinically significant. [37]

Limitations
There are several limitations of the present study. First, the results are based on a relatively small number of eyes, although this number is equal to or higher than those used in previous studies. [4,7,8,10,14,15,17,22] Second, our study is limited to young and healthy subjects with normal corneas and good fixation; the understanding and cooperation of these subjects were very good, and the keratometry scan images were of excellent quality. In older patients, in patients with corneal abnormalities, or in eyes that have undergone laser in situ keratomileusis or corneal surface ablation surgery, the results may be different and could include additional variability. Further research is required to comprehensively assess the validity and precision of the corneal power measurements obtained by different keratometric devices in such cases. Third, our study is limited to the intraobserver repeatability and intersession reproducibility of corneal power measurements by these devices. The variability of the measurement system caused by different observers deserves further investigation. Finally, from a practical point of view, although some instruments showed no statistically significant inter-device differences and good agreement, which may suggest that their measurements can be used interchangeably in IOL power calculation, we still suggest optimizing the constants of IOL power calculation formulas when changing from one instrument to another. More studies are needed to report these constants.

Conclusion
In summary, our data showed that anterior corneal curvature measurements obtained from 8 different devices present very good repeatability and reproducibility. The results obtained using the Tomey RC-5000, Topcon KR-8000, IOLMaster, Allegro Topolyzer, Pentacam and Sirius were well correlated and comparable, suggesting that they could be used interchangeably in most clinical settings. However, caution is warranted when using measurements obtained by the EyeSys Vista and the Medmont E300; it is inadvisable to use these two devices interchangeably with the other devices.

Checklist S1 CONSORT Checklist.