Repeatability and Reproducibility of Eight Macular Intra-Retinal Layer Thicknesses Determined by an Automated Segmentation Algorithm Using Two SD-OCT Instruments

Purpose: To evaluate the repeatability, reproducibility, and agreement of thickness profile measurements of eight intra-retinal layers determined by an automated algorithm applied to optical coherence tomography (OCT) images from two different instruments.
Methods: Twenty normal subjects (12 males, 8 females; 24 to 32 years old) were enrolled. Imaging was performed with a custom-built ultra-high resolution OCT instrument (UHR-OCT, ∼3 µm resolution) and a commercial RTVue100 OCT instrument (∼5 µm resolution). An automated algorithm was developed to segment the macular retina into eight layers and quantify the thickness of each layer. The right eye of each subject was imaged twice by the first examiner using each instrument to assess intra-observer repeatability and once by the second examiner to assess inter-observer reproducibility. The intraclass correlation coefficient (ICC) and coefficients of repeatability and reproducibility (COR) were analyzed to evaluate reliability.
Results: The ICCs for the intra-observer repeatability and inter-observer reproducibility of both SD-OCT instruments were greater than 0.945 for the total retina and all intra-retinal layers, except the photoreceptor inner segments, which ranged from 0.051 to 0.643, and the outer segments, which ranged from 0.709 to 0.959. The CORs were less than 6.73% for the total retina and all intra-retinal layers. The total retinal thickness measured by the UHR-OCT was significantly thinner than that measured by the RTVue100. However, the ICCs for agreement of the thickness profiles between the UHR-OCT and RTVue100 were greater than 0.80 except for the inner segment and outer segment layers.
Conclusions: Thickness measurements of the intra-retinal layers determined by the automated algorithm are reliable when applied to images acquired by the UHR-OCT and RTVue100 instruments.


Introduction
Evaluation of intra-retinal layer thickness plays an important role in the diagnosis and monitoring of various retinal diseases. For example, thinning of the retinal nerve fiber layer (RNFL) is often noted in glaucoma and myopia [1,2], and thinning of the ganglion cell complex (GCC) occurs during the development of glaucoma [3]. The thickness of the outer nuclear layer (ONL) in the fovea was reported to correlate with visual acuity in eyes with central serous chorioretinopathy [4]. In early age-related macular degeneration, when visual field defects are present, the photoreceptor outer segment (OS) layer thins and the retinal pigment epithelium (RPE) thickens [5]. Segmentation of intra-retinal layers is important not only in ophthalmology but also in neurology. In addition to the RNFL, the thicknesses of the deeper retinal layers are reported to change in multiple sclerosis, Parkinsonian syndromes, and less frequently in disorders such as neuromyelitis optica and Wilson's disease [6,7,8,9,10].
Optical coherence tomography (OCT) is a noninvasive, noncontact diagnostic tool that can provide in vivo cross-sectional images of the retina with high resolution [11]. It has become an essential tool for diagnosing and monitoring the development of various retinal pathologies [12,13,14]. Spectral domain OCT (SD-OCT) is the most readily available OCT system for retinal imaging and has a faster scan speed and higher axial resolution than time domain OCT. Currently, most of the commercially available SD-OCT instruments have a resolution of approximately 5 µm [15,16]. Ultra-high resolution OCT (UHR-OCT), with an axial resolution of approximately 3 µm or less, has the ability to image retinal ultrastructure [17,18,19]. Because the segmentation software of most commercial systems is limited to measuring the thicknesses of only a few layers, such as the total retina and the RNFL, several automated algorithms for segmenting intra-retinal layers have been proposed to quantitatively evaluate the thickness of more of the layers that can be imaged with advanced SD-OCT imaging techniques [16].
The use of automated algorithms has undoubtedly enhanced the quantitative diagnosis of ophthalmic disease. To the best of our knowledge, however, few studies have tested the reliability of intra-retinal layer thickness measurements determined by automated segmentation of OCT images. Knowing the level of reliability of such measurements is very important for clinical applications. Thus, the goal of this study was to investigate the repeatability and reproducibility of thickness measurements determined by an automated segmentation algorithm applied to images of eight intra-retinal layers acquired by a custom-built UHR-OCT instrument and a commercially available RTVue100 OCT (Optovue, Fremont, CA, USA) instrument.

Subjects
This study followed the tenets of the Declaration of Helsinki and was approved by the Ethics Committee of Wenzhou Medical University. Twenty normal subjects (12 males and 8 females, mean ± standard deviation age: 25.1 ± 2.0 years, range: 24 to 32 years) were included, and each signed an informed consent form. The inclusion criteria were as follows: no history of ocular or systemic disease, 20/20 or better visual acuity, refractive error between −2.00 diopters (D) and +0.50 D, no history of intraocular pressure higher than 21 mmHg, and a normal appearance of the macula.

Instruments and Image Acquisition
Retinal OCT imaging was performed with two SD-OCT instruments configured as shown in Table 1. Briefly, the UHR-OCT used a superluminescent diode (SLD: T840; SuperLum Diodes Ltd., Moscow, Russia) [12]. For imaging the posterior segment of the eye, it was adapted onto a slit-lamp system with the installation of an ocular lens (60 D; Volk Optical, Mentor, OH, USA) on the sample arm. The field of the scan was set to approximately 15° to 20°. The power of the incident light was set to 750 µW, which is well below the safety standard of the American National Standards Institute (ANSI Z136). The calibrated scan depth was 1.48 mm in air. To calibrate the scan width for retinal imaging, a model eye (OEMI-7, Ocular Instruments, Bellevue, WA, USA) with a grid implanted on the fundus was used. Each lattice was 1 mm in width. A horizontal B-scan was performed crossing the grid, and the number of pixels corresponding to one lattice was recorded.
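The scan-width calibration described above amounts to a simple pixel-to-millimeter scale factor. The sketch below illustrates that arithmetic only; the function names and the example pixel positions are hypothetical and are not taken from the study.

```python
def pixels_per_mm(grid_line_columns, lattice_width_mm=1.0):
    """Average number of pixels spanned by one lattice of the model-eye grid.

    grid_line_columns: pixel columns at which consecutive grid lines appear
    in the horizontal B-scan (hypothetical input, e.g. picked by hand).
    """
    if len(grid_line_columns) < 2:
        raise ValueError("need at least two grid-line positions")
    cols = sorted(grid_line_columns)
    spacings = [b - a for a, b in zip(cols, cols[1:])]
    return (sum(spacings) / len(spacings)) / lattice_width_mm


def pixels_to_mm(n_pixels, scale):
    """Convert a lateral distance in pixels to millimeters."""
    return n_pixels / scale


# Made-up positions: grid lines found every 170 pixels.
scale = pixels_per_mm([100, 270, 440, 610])
print(scale)                      # pixels per millimeter
print(pixels_to_mm(1020, scale))  # scan length in millimeters
```

Averaging over several lattices, rather than measuring a single one, is a design choice of this sketch that reduces the effect of a one-pixel error in locating any grid line.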

Procedure
Before imaging, all eyes received an ocular examination including visual acuity testing, autorefraction, intraocular pressure measurement, and ophthalmoscopic examination. After enrollment, all eyes were imaged without mydriasis. Two repeated measurements were performed within a short time on the same day by a single examiner (XL) using both SD-OCT instruments to test the intra-observer repeatability. The OCT imaging was also performed one time by another examiner (LL) using each OCT instrument on the same day to test inter-observer reproducibility. The order of OCT examinations was chosen randomly for each subject. During OCT imaging, the subjects were asked to move their head away from the headrest after each image acquisition, and after five minutes they were asked to reposition their head for the following measurement [20]. One author (XL), who did not know which images were taken by which observer or from which subjects, processed the images.

Measurements of Macular Thickness of Intra-retinal Layers
Custom software for automatic segmentation was developed to measure the thicknesses of eight intra-retinal layers on 2-D images produced by each OCT instrument. The image segmentation algorithm mainly employed graph theory and a shortest-path search based on dynamic programming, as described in a previous study [16]. Nine boundaries of the intra-retinal layer structures were detected (Fig. 1), including (1) internal limiting membrane (ILM); (2) nerve fiber layer/ganglion cell layer (NFL/GCL); (3) inner plexiform layer/inner nuclear layer (IPL/INL); (4) inner nuclear layer/outer plexiform layer (INL/OPL); (5) outer plexiform layer/outer nuclear layer (OPL/ONL); (6) external limiting membrane (ELM); (7) inner segment/outer segment of receptors (IS/OS); (8) outer segment/retinal pigment epithelium (OS/RPE); (9) retinal pigment epithelium/choroid (RPE/choroid). Each of these nine boundaries was detected sequentially by a two-step segmentation procedure. First, a graph based on node cost assignments was built. The node costs were mainly based on the intensity gradient values along the vertical direction and other features, such as the edge direction, which depended upon the boundary of interest. Second, the layer boundary was extracted by a shortest-path search applied to the graph using a dynamic programming algorithm. Then, all detected boundaries were overlaid on the OCT images and verified by visual inspection performed by one of the authors (XL). A semi-automated approach was implemented in the algorithm to correct the segmentation errors that occurred in regions that had extremely low reflectivity or almost no structural information [21,22]. Figure 2 illustrates the detailed sequence in the boundary segmentation process. Each OCT image was first pre-processed to reduce the background noise using median and Gaussian filtering techniques. This step helped to improve the performance of the segmentation algorithm.
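The two-step procedure (node costs, then a shortest-path search by dynamic programming) can be illustrated with a minimal sketch. This is not the authors' software: it assumes the image is a list of rows with row 0 on the inner side, uses only the negated vertical intensity gradient as node cost (the edge-direction and other features mentioned in the text are omitted), and restricts the path to a hypothetical `max_jump` rows between neighboring A-scans.

```python
def vertical_gradient(image):
    """Intensity change relative to the pixel above (row 0 is the inner side)."""
    rows, cols = len(image), len(image[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows):
        for c in range(cols):
            grad[r][c] = float(image[r][c] - image[r - 1][c])
    return grad


def shortest_path_boundary(cost, max_jump=1):
    """Row index per column of the left-to-right path with minimum total cost."""
    rows, cols = len(cost), len(cost[0])
    acc = [[0.0] * cols for _ in range(rows)]   # accumulated path cost
    back = [[0] * cols for _ in range(rows)]    # best predecessor row
    for r in range(rows):
        acc[r][0] = cost[r][0]
    for c in range(1, cols):
        for r in range(rows):
            lo = max(0, r - max_jump)
            hi = min(rows - 1, r + max_jump)
            best = min(range(lo, hi + 1), key=lambda p: acc[p][c - 1])
            acc[r][c] = cost[r][c] + acc[best][c - 1]
            back[r][c] = best
    # Backtrack from the cheapest end node in the last column.
    r = min(range(rows), key=lambda p: acc[p][cols - 1])
    path = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        path.append(r)
    path.reverse()
    return path


def segment_dark_to_bright(image, max_jump=1):
    """Trace a dark-to-bright boundary: strong positive gradient means low cost."""
    cost = [[-g for g in row] for row in vertical_gradient(image)]
    return shortest_path_boundary(cost, max_jump)


# Synthetic image: a step edge at row 3 that drops to row 4 after two columns.
edge = [3, 3, 4, 4, 4]
image = [[10 if r >= edge[c] else 0 for c in range(5)] for r in range(6)]
print(segment_dark_to_bright(image))
```

For a bright-to-dark boundary, the same search could be run on the positive gradient instead of its negation; limiting the rows searched to a band around a previous boundary corresponds to the restricted search region described in the text.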
The ILM and the boundary between the RPE and choroid layers were first segmented so that other boundaries could be segmented in a limited search region to save computation time. The ILM was defined as the first highly reflective increase from the inner side of the retinal image. It was most often well demarcated, easily detected, and followed by a sector of high reflectivity. Based on these features, the initialized boundary was determined by the first peak on each sampling line from the inner side of the retinal structure. Then the boundary was refined by finding the shortest path in a limited region based on the initialized boundary. The RPE layer was located on the outermost side of the retina and was one of the most hyperreflective layers within each retinal OCT image. Thus, we searched for the brightest pixel in each A-scan line below the ILM layer on the pre-processed image as an estimated boundary between the RPE and choroid layers. Then the shortest path search was applied on the graph to refine these two boundaries based on their estimated layer locations. Once these two boundaries were segmented, the process for other boundaries was repeated recursively by limiting the search space based on the previous segmentation to detect a new layer boundary.
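The initial estimates of these two boundaries can be sketched per A-scan as below. This is an illustration under assumptions, not the published implementation: the `threshold` that decides which peak counts as "highly reflective" is a hypothetical parameter, the fallback to row 0 is invented for robustness, and the subsequent shortest-path refinement is not shown.

```python
def estimate_ilm_and_rpe(image, threshold):
    """Rough ILM and RPE/choroid rows for each A-scan (column).

    image[r][c] is the pre-processed intensity; row 0 is the inner side.
    ILM: first local intensity peak at or above `threshold`.
    RPE/choroid: brightest pixel below the ILM estimate.
    """
    rows, cols = len(image), len(image[0])
    ilm_rows, rpe_rows = [], []
    for c in range(cols):
        a_scan = [image[r][c] for r in range(rows)]
        ilm = 0  # assumed fallback when no peak reaches the threshold
        for r in range(rows):
            rises = r == 0 or a_scan[r] > a_scan[r - 1]
            falls = r == rows - 1 or a_scan[r] >= a_scan[r + 1]
            if a_scan[r] >= threshold and rises and falls:
                ilm = r
                break
        below = range(ilm + 1, rows)
        rpe = max(below, key=lambda r: a_scan[r]) if len(below) > 0 else ilm
        ilm_rows.append(ilm)
        rpe_rows.append(rpe)
    return ilm_rows, rpe_rows


# Two made-up A-scans, transposed into image rows.
col_a = [0, 1, 8, 5, 3, 4, 9, 2]
col_b = [0, 0, 1, 7, 4, 3, 10, 6]
image = [list(px) for px in zip(col_a, col_b)]
print(estimate_ilm_and_rpe(image, threshold=5))
```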

Statistical Analysis
All statistical analyses were performed with the Statistical Package for the Social Sciences (SPSS) software (v17.0 for Windows; SPSS Inc., Chicago, IL, USA). Descriptive statistics were reported as means ± standard deviations. The intra-observer repeatability was measured with two OCT images obtained by the same operator, and the inter-observer reproducibility was measured with two OCT images obtained by two different operators. The overall mean thickness, the coefficients of repeatability and reproducibility (COR), and intraclass correlation coefficients (ICC) were calculated to evaluate the repeatability and reproducibility of the thickness measurements. The overall mean thicknesses of the eight intra-retinal layers along the central macular 6-mm scan length were determined as the average of the first and second measurements by the same examiner. The COR was defined as the standard deviation of the differences between the two measurements divided by the mean value of the two measurements. The ICC was determined based on a mixed-model analysis of variance proposed by Bartko and Carpenter [23]. The paired t-test, ICC, 95% limits of agreement (LoA), and Bland-Altman plots [24] were used to evaluate the agreement of thickness measurements between the two SD-OCT instruments. The 95% LoA was defined as the mean difference ± 1.96 standard deviations [25]. P-values < 0.05 were considered statistically significant.
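The COR and 95% LoA definitions given above translate directly into a few lines of code. The sketch below is illustrative only (the study used SPSS): it assumes the sample standard deviation, expresses the COR as a percentage, and, because the exact ICC form is not stated beyond the citation, uses the two-way mixed, single-measure consistency form ICC(3,1) as an assumption. The thickness values are made up.

```python
import statistics


def coefficient_of_repeatability(first, second):
    """SD of the paired differences divided by the overall mean, in percent."""
    diffs = [a - b for a, b in zip(first, second)]
    overall_mean = statistics.mean(list(first) + list(second))
    return 100.0 * statistics.stdev(diffs) / overall_mean


def limits_of_agreement(a, b):
    """95% limits of agreement: mean difference +/- 1.96 SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return mean_diff - 1.96 * sd, mean_diff + 1.96 * sd


def icc_consistency(first, second):
    """ICC(3,1) for two measurements per subject (assumed form, see lead-in)."""
    n, k = len(first), 2
    values = list(first) + list(second)
    grand = statistics.mean(values)
    subject_means = [(a + b) / 2 for a, b in zip(first, second)]
    session_means = [statistics.mean(first), statistics.mean(second)]
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_sessions = n * sum((m - grand) ** 2 for m in session_means)
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_error = ss_total - ss_subjects - ss_sessions
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


# Made-up layer thicknesses (µm) for four subjects, measured twice.
first = [100, 110, 120, 130]
second = [102, 108, 122, 128]
print(coefficient_of_repeatability(first, second))
print(limits_of_agreement(first, second))
print(icc_consistency(first, second))
```

The same `limits_of_agreement` function applies to the inter-instrument comparison by passing the UHR-OCT values as `a` and the RTVue100 values as `b`.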

Results
The automated algorithm successfully segmented eight intra-retinal layers in the macular images obtained by the UHR-OCT and RTVue100 instruments. The algorithm also determined the thickness profiles of each layer along the 6-mm horizontal (Fig. 3) and vertical (Fig. 4) scans obtained by each instrument. There were errors in the segmentation boundary for a few images of lower quality. For example, the algorithm mistakenly identified the OPL/ONL interface in Figure 5A. A similar failure occurred for the RNFL/GCL boundary in Figure 5C. The semi-automated approach successfully corrected the segmentation errors (Fig. 5B and 5D). Visual inspection confirmed that the boundary detection for the eight intra-retinal layers was valid in all images acquired by the two OCT instruments.
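A thickness profile follows from two adjacent boundaries as the per-A-scan distance between them. The sketch below shows that step under assumptions: boundaries are given as one row index per A-scan, and `axial_um_per_pixel` is a hypothetical axial scale in tissue (its value depends on the instrument's scan depth, pixel count, and assumed refractive index, none of which are specified here).

```python
def thickness_profile(upper, lower, axial_um_per_pixel):
    """Layer thickness (µm) at each A-scan from its upper and lower boundary rows."""
    if len(upper) != len(lower):
        raise ValueError("boundaries must cover the same A-scans")
    return [(lo - up) * axial_um_per_pixel for up, lo in zip(upper, lower)]


def mean_thickness(profile):
    """Average thickness along the scan."""
    return sum(profile) / len(profile)


# Made-up boundary rows for three A-scans and an assumed 3 µm per pixel.
profile = thickness_profile([10, 11, 12], [30, 31, 34], 3.0)
print(profile)
print(mean_thickness(profile))
```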
Table 2 and Table 3 show the repeatability and reproducibility of thickness measurements for intra-retinal layers measured with the UHR-OCT and RTVue100, respectively. There were no significant differences between the two thickness measurements obtained by the same examiner for either instrument. The ICCs obtained for the intra-observer repeatability and inter-observer reproducibility tests of both SD-OCT instruments were greater than 0.945 for the total retina and all intra-retinal layers except the IS layer, which ranged from 0.051 to 0.643, and the OS layer, which ranged from 0.709 to 0.959. The coefficients of repeatability and reproducibility were less than 6.73% for the total retina and all intra-retinal layers. The overall mean thicknesses of intra-retinal layers measured by the UHR-OCT were compared with those measured by the RTVue100 (Table 4). There were significant differences between the UHR-OCT and RTVue100 measurements of most macular intra-retinal layer thicknesses except the RNFL and RPE. The total retinal thickness measured by the UHR-OCT was significantly thinner than that measured by the RTVue100. The ICCs for agreement of the thickness profiles between the UHR-OCT and RTVue100 were greater than 0.80 except for the IS and OS layers. Bland-Altman plots (Fig. 6) were also used to test the agreement of the thicknesses measured with these two SD-OCT devices. The Bland-Altman plots and the 95% LoA results showed good agreement between the two SD-OCT instruments for all of the intra-retinal layers except the ONL and IS.

Discussion
It is very important in the clinical routine to know the repeatability and reproducibility of measurements. This information enables the clinician to evaluate if observed changes are due to fluctuations in the methods or if they are valid changes in the structures. This is especially true for the measurements of the intra-retinal layers, which are important morphometric parameters in the diagnosis of retinal and neurological diseases and the monitoring of the progression of these disorders. In this current study, we applied our automated algorithm to images from a custom-built UHR-OCT instrument and commercially available RTVue100 OCT instrument to yield the thickness profiles of eight intra-retinal layers. The intra-observer and inter-observer test results indicated that both instruments produced highly repeatable and reproducible measurements for most of the intra-retinal layers. The results of inter-instrument comparisons of the thickness measurements suggested that the thicknesses of intra-retinal layers obtained by the methods were not interchangeable between the two different SD-OCT instruments.
There are many studies showing the repeatability and reproducibility of retinal measurements for normal subjects. However, due to limitations of the image processing software on commercial OCT instruments, most of them focus only on the RNFL or the total retinal thickness [15,20,26,27]. The aim of our study was to report the repeatability and reproducibility of thickness measurements for eight intra-retinal layers determined by an automated algorithm applied to images from two different SD-OCT instruments. For both intra- and inter-observer comparisons, the ICCs were high for all layers except the IS. This indicates that, as in previous findings, the thickness measurements for most of the intra-retinal layers were repeatable either in different visits or by different examiners [26,27,28]. The repeatability of measurements in the present study is consistent with that in previous reports. Debuc et al. reported the repeatability and reproducibility of thickness measurements for six intra-retinal layers using custom-developed automated software on time domain OCT images [29]. They found ICCs to be greater than 0.75 except for the OPL and OS/RPE. Their results were consistent with ours, and the fuzzy boundaries of the ELM and ... In the current study, we found that the repeatability and reproducibility of the UHR-OCT instrument were better than those of the RTVue100. Additionally, the repeatability of the UHR-OCT with an axial resolution below 2 µm [30] was better than that of our UHR-OCT with an axial resolution of 3 µm. These results indicate that the axial resolution of OCT may contribute to the repeatability of retinal thickness measurements. This is consistent with Ge et al. [25], who reported that OCT instruments with higher axial resolution have better repeatability in measurements of central corneal thickness and epithelial thickness.
Table 3. Repeatability and reproducibility of thickness measurements for eight intra-retinal layers measured by the RTVue100.
The repeatability of the IS thickness measurements in the current study was not as good as that reported by Wang et al. [30], even though they used manual segmentation in their study. Besides the axial resolution, other factors, such as the image quality and image size, may also contribute to the repeatability.
The thickness measurements of most of the intra-retinal layers and the total retina differed between the UHR-OCT and RTVue100. This result was consistent with previously reported studies in which the retinal thickness measurements differed significantly depending upon the OCT system used. For example, Seibold et al. [31] compared RNFL thickness measurements taken with three different SD-OCT instruments and a time-domain OCT instrument. RNFL thicknesses were significantly different among the four instruments, and they could not be used interchangeably. Similarly, Grover et al. found that the central subfield thickness measured by two different SD-OCT instruments differed by almost 70 µm [32]. Moreover, Wolf-Schnurrbusch et al. [33] compared central retinal thickness measurements in healthy eyes taken by six different commercially available OCT instruments. The six OCT systems each provided different results. Their results imply that the different OCT systems cannot be used interchangeably for the measurement of macular thickness.
There are some limitations to the present study. One is the unequal ratio of males to females. Ooto et al. [34] found that the mean thicknesses of the INL and the OPL+ONL were significantly greater in men, and the mean RNFL thickness was greater in women. However, the purpose of our study was to evaluate the reliability of the newly developed segmentation algorithm in measuring the thickness profiles of eight intra-retinal layers. Thus, gender is unlikely to influence the outcome of our study. In future studies we will pay attention to gender differences. Another limitation is that the study was conducted on normal subjects only. Diseased retinal structures may vary substantially among patients, and this is likely to increase the frequency of segmentation errors. Thus, the repeatability and reproducibility values may be reduced in diseased retinas [35,36]. In future studies, we will apply our new method to a variety of retinal diseases to evaluate the clinical significance of any changes in the repeatability and reproducibility of the intra-retinal measurements.
In conclusion, thickness measurements of the intra-retinal layers have good repeatability and reproducibility when determined by the automated algorithm applied to images acquired by the UHR-OCT and RTVue100 instruments. Use of this algorithm will be helpful in the diagnosis of retinal diseases and the evaluation of disease progression.

Figure 6. Bland-Altman plots of thickness measurements determined with the automated segmentation algorithm on UHR-OCT and RTVue100 images. Only the images along the horizontal meridian were analyzed. The horizontal full lines represent the mean of thickness differences, and the horizontal dashed lines represent the mean differences ± 1.96 standard deviations. doi:10.1371/journal.pone.0087996.g006