Skip to main content
Browse Subject Areas

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Real-Time Automatic Segmentation of Optical Coherence Tomography Volume Data of the Macular Region

  • Jing Tian,

    Affiliation Bascom Palmer Eye Institute, University of Miami, Miami, Florida, United States of America

  • Boglárka Varga,

    Affiliation Department of Ophthalmology, Semmelweis University, Budapest, Hungary

  • Gábor Márk Somfai,

    Affiliations Bascom Palmer Eye Institute, University of Miami, Miami, Florida, United States of America, Department of Ophthalmology, Semmelweis University, Budapest, Hungary

  • Wen-Hsiang Lee,

    Affiliation Bascom Palmer Eye Institute, University of Miami, Miami, Florida, United States of America

  • William E. Smiddy,

    Affiliation Bascom Palmer Eye Institute, University of Miami, Miami, Florida, United States of America

  • Delia Cabrera DeBuc

    Affiliation Bascom Palmer Eye Institute, University of Miami, Miami, Florida, United States of America


Optical coherence tomography (OCT) is a high speed, high resolution and non-invasive imaging modality that enables the capturing of the 3D structure of the retina. The fast and automatic analysis of 3D volume OCT data is crucial taking into account the increased amount of patient-specific 3D imaging data. In this work, we have developed an automatic algorithm, OCTRIMA 3D (OCT Retinal IMage Analysis 3D), that could segment OCT volume data in the macular region fast and accurately. The proposed method is implemented using the shortest-path based graph search, which detects the retinal boundaries by searching the shortest-path between two end nodes using Dijkstra’s algorithm. Additional techniques, such as inter-frame flattening, inter-frame search region refinement, masking and biasing were introduced to exploit the spatial dependency between adjacent frames for the reduction of the processing time. Our segmentation algorithm was evaluated by comparing with the manual labelings and three state of the art graph-based segmentation methods. The processing time for the whole OCT volume of 496×644×51 voxels (captured by Spectralis SD-OCT) was 26.15 seconds which is at least a 2-8-fold increase in speed compared to other, similar reference algorithms used in the comparisons. The average unsigned error was about 1 pixel (∼ 4 microns), which was also lower compared to the reference algorithms. We believe that OCTRIMA 3D is a leap forward towards achieving reliable, real-time analysis of 3D OCT retinal data.


Real-time processing and quantitative analysis of retinal images, which has always been of great interest to clinicians, is highly desirable. Quantitative image analysis of the retinal tissue is widely used in the diagnosis and early detection of major blinding diseases, such as glaucoma and age related macular degeneration [1, 2]. Furthermore, many systemic diseases, such as diabetes, can be monitored through the vasculature of the retina [3]. In the last decade, optical coherence tomography (OCT) has emerged as a powerful imaging modality that could provide high-resolution and high-speed cross-sectional images of the retina non-invasively [4]. Recent advancements in OCT imaging have facilitated to capture 3D retinal structures in a few seconds with an axial resolution of ∼ 2 microns [5]. The analysis of OCT volume data requires extensive processing which represents a real challenge nowadays. Taking into account the increased amount of patient-specific imaging data in 3D environments, computational complexity is one of the important factors, which have to be taken into account when considering the number of operations needed to process each 3D dataset. Quality evaluation is not the only criterion to evaluate and compare segmentation methods if we are concerned with real-time (or near real-time) processing. Particularly, in case of real-time segmentation, computational complexity is one of the criteria which has to be carefully considered. The key is to implement an algorithm that can give satisfying segmentation results with low computation load. This is a demandingly needed solution in front of the increased development of cloud technologies and big data analysis, which could impact clinical decision-making.

Segmentation of the retinal layers is one of the first steps to interpret the volumetric OCT data. In general, commercial OCT systems are equipped with proprietary software with limited capabilities for automation and full segmentation of the various cellular layers of the retinal tissue [6]. Many important quantitative features, such as the thickness of the outer nuclear layers, remain unexploited due to the lack of a full retinal segmentation algorithm in most of the commercial devices. Manual segmentation is used to obtain the primary research data in many studies. However, such input from the human graders is time consuming and suffers from inter-observer/intra-observer errors and hence is not suitable for large-scale studies.

The development of an automatic segmentation algorithm for OCT volume data is challenging due to the presence of heavy noise, blood vessels and various pathologies. Unfortunately, there is no single method that could work equally well for segmentation tasks. According to the dimensionality of the input features, we roughly categorized the methods into A-scan based, B-scan based and volume based. Below is a brief overview of the published work related to automatic segmentation of OCT data. A more comprehensive review could be found in [7].

  • A-scan based methods
    A-scan based methods identify the boundary locations as the intensity peaks or valleys in each A-scan and link the feature points to form a smooth and continuous boundary using different models [812]. Advanced denoising methods were usually required and the performance of the algorithms were not robust in all of the images [810, 12]. Recent development by Fabritius et al. detected the ILM and RPE in the volume within 17 and 21 seconds respectively [11] with an error less than 5 pixels in 99.7% of the scans.
  • B-scan based methods
    Commonly seen image segmentation methods, such as thresholding [13], active contour [14], pattern recognition [15, 16] and shortest-path based graph search [17], were also used to detect the boundaries in the B-scans from volume OCT data. The thresholding method proposed by Boyer et al. [13] relied on the absolute value of the intensity, hence the performance of the solution was case dependent and not applicable to other OCT devices. Active contour was customized to detect retinal layers in 20 rodent OCT images by Yazdanpanah el al. [14] and is yet to prove its clinical usefulness in human retina. Pattern recognition based approach introduced by Fuller el al. [15] took advantage of support vector machine (SVM) to estimate the retinal thickness in healthy subjects and patients with macular degeneration, but the accuracy of the detection was low (6 pixels) and the processing time was 10 minutes in training and 2 minutes in running. A random forest classifier approach was employed to estimate the position of retinal layer boundaries with an accuracy of 4.3 microns in 9 boundaries [16]. A graph-based automatic algorithm that could segment eight retinal layers with a thickness error of about 1 pixel has also proven to be robust in images with pathologies [17].
  • Volume based methods
    Volume scan of the retina is now commonly available in the commercial OCT devices. Recent algorithms exploited the spatial dependency in the adjacent frame to detect retinal surface from 3D data. For example, Garvin et al. [18] and Dufour et al. [19] detected various retinal surfaces from 3D OCT volume by finding the minimum cost feasible surface in the constructed graph as proposed by Li et al. [20]. Two stand-alone softwares developed by Garvin et al. [21] and Dufour et al. [22] are free for research use and are refered to as IOWA Reference Algorithm and Dufour’s software in the rest of the paper.

In this paper, a fast and accurate automatic algorithm that could segment 3D macular OCT data is presented and is refered to as “OCTRIMA 3D”. The acronym OCTRIMA has been previously used to label OCT retinal image analysis developed by our group [12] using a different formulation for time domain OCT. Besides aiming for high accuracy and robustness, we have also greatly reduced the processing time to improve the clinical usefulness. The OCTRIMA 3D is a B-scan based method using the shortest-path based graph search framework proposed by Chiu et al. [17] and is optimized using the inter-frame spatial dependency. The main contributions of our work are:

  1. The introduction of inter-frame flattening to reduce the curvature in the fovea and further improve the robustness of the algorithm;
  2. The time complexity is greatly reduced by using inter-frame or intra-frame information which limits the search region;
  3. A better distinction is attained for the boundaries closely located by applying biasing and masking techniques in the same search region;
  4. A reduced number of nodes in the graph by down-sampling pixels in the lateral direction of the search region.
As a result, the processing speed for each frame has been greatly improved by about 10 times as compared to the previous work by Chiu et al. [17] and the processing of the whole OCT volume data (644 × 496 × 51 voxels) can be finished within 26 seconds. Up to the authors’ best knowledge, the speed of OCTRIMA 3D outperformed the existing works reported so far [17, 21, 22]. The segmented boundaries in each B-scan were combined to form smooth surfaces and were compared with three state-of-the-art graph search based segmentation algorithms [17, 21, 22]. In addition, segmentation results were compared to a ground-truth, which is the manual delineation of retinal boundaries. Two graders provided the manual labelings and inter-observer differences are used as benchmark to evaluate the accuracy. The results showed that OCTRIMA 3D is more close to the manual labeling as compared to the IOWA reference algorithm and Dufour’s Algorithm. The accuracy of OCTRIMA 3D is similar to that reported by Chiu et al. [17] but our implementation is much faster. Experiments to compare with manual labeling were conducted on 100 OCT B-scans from 10 healthy subjects and the average unsigned error obtained for eight surfaces was about 1 pixel. The absolute detection error on each surface is found to be significantly smaller than the inter-observer difference (p < 0.001).


This section is organized as follows: first, the boundary detection framework is described in Section 1; more implementation details for OCTRIMA 3D are presented in Section 2; Volumetric scans from Spectralis SD-OCT (Heidelberg Engineering GmbH, Heidelberg, Germany) are used to evaluate the performance of OCTRIMA 3D as described in Section 3; using the manual labeling as the ground truth, the algorithm is also compared with the IOWA Reference Algorithm [21], Dufour’s algorithm [22] and Chiu et al. [17] as discussed in Section 4; the performance metrics are presented in Section 5.

1. Shortest-Path Based Graph Search for Boundary Detection

In this study, a total of eight retinal layer boundaries that were clearly visible were targeted for analysis. These boundaries are illustrated in Fig 1 and their corresponding notations are summarized in Table 1.

Fig 1. Exemplary OCT B-scan from Spectralis SD-OCT showing eight intraretinal layer boundaries labeled as , .

Note that boundaries are delineated with red, yellow, magenta, white, cyan, green black and blue solid lines, respectively and the notations are summarized in Table 1. A parafoveal scan is chosen in order to show all the layers that are segmented by OCTRIMA 3D.

Table 1. Notations for eight target boundaries, n denotes the frame number.

The framework proposed by Chiu et al. [17] to segment retinal boundaries in each frame was used in this study. For completeness, the model is briefly presented in this section.

The problem of boundary detection in a given normalized gradient image g is modeled as finding the minimum weight path or the shortest-path in graph G = (V,E), where V is a set of vertices and E is a set of weighted undirected arcs. The weight of edges are positive numbers and zero-weight indicate non-connected edges. To make the end point initialization fully automatic, two additional columns are added on both ends of the gradient image and the gradient value of the two virtual columns are set to 1. Each pixel in the conjunct gradient image gc is represented by a vertex and each vertex in the graph is only connected with the eight nearest pixels on the sides and corners. The weights of the arcs are calculated based on the gradient value as (1) where a and b denotes two distinct elements of V and wmin is a small stabilization constant [17]. The pixels with higher values in the gradient image have smaller weights on the connecting arcs and hence have better chances of being selected. The most prominent boundary is detected as the minimum weight path from the first to the last vertex in V using Dijkstra’s Algorithm [23].

The framework using the shortest-path based graph search is able to detect only one boundary for each graph. For multiple boundaries, careful search region refinement is needed. For example, the connectivity-based segmentation is employed in [17] for search region refinement when segmenting intraretinal layers. In terms of graph constriction, it means that the connecting arcs outside of the search region have to be removed before the shortest-path search. The algorithm in [17] was tested on 100 OCT B-scans obtained from 10 healthy subjects. The average thickness error reported for various retinal layers was about 1 pixel and the average processing time for each frame was 9.74 seconds [17].

In this work, OCTRIMA 3D, a real-time automatic algorithm to segment eight retinal boundaries in OCT volume data is developed based on the aforementioned framework. We proposed a new framework to detect each boundary using the shortest-path graph search as shown in Fig 2. Particularly, in order to detect each boundary, flattening is first performed to reduce the curvature of the target boundary. Second, a reference boundary is used in the alignment process to facilitate the flattening procedure. Then, the flattened image is convolved with edge kernels to calculate the gradient image. Of note, the selection of the edge kernel depends on the orientations of the target boundary. Next, the search region is refined using the location of the previous detected boundary in the current frame or the previous frame. In this paper, the term search region or region of interest (ROI) refers to the rectangle area in the OCT image that could possibly contain the target boundary. Biasing or masking are needed when more than one boundary are located in the same ROI. In the last step, a new graph is constructed using the down-sampled gradient image in the ROI and only the pixels in the search region are included in the vertices set. Finally, the result of the shortest-path search method is interpolated to obtain the target boundary.

Fig 2. OCTRIMA 3D framework to detect each intraretinal layer boundary using the shortest-path based graph search approach.

1. 1 Flattening.

Flattening is defined as the step that shifts A-scans up and down to make the reference boundary flat and is a commonly used preprocessing step in OCT segmentation tasks [16, 17]. As fewer nodes leads to a lower total weight, the graph search algorithm prefers geometric short path. Hence, horizontal boundaries with less curvature are better delineated in the image. The reference boundary is normally parallel to the target boundary and is detected in the earlier steps. The IS-OS boundary in the current frame is usually chosen as the reference boundary to remove the effect of retinal curvature. However, the ILM in the fovea region still has big curvature after flattening with the IS-OS border and may cause the boundary detection algorithm fail as shown in Fig 3(a) and 3(b).

Fig 3. The shortest-path based graph search methodology prefers a geometric straight line and may fail to delineate the ILM boundary in the central region of the fovea.

The ILM boundary can be detected correctly when the flattening operation uses the ILM border from a previous frame. Note that (a) and (c) are the raw OCT scans at the fovea and the resulting flattened image, respectively. (b) and (d) shows the results of the ILM boundary detection in (a) and (c).

In this work, we have improved the flattening of the ILM boundary by exploiting the smoothness of the retinal surface between adjacent B-scans. More specifically, the reference boundary for the ILM is chosen as the corresponding boundary in the previous frame to improve the robustness of the algorithm. After flattening, the ILM boundary has reduced curvature in the central region of the fovea and it can be correctly detected as shown in Fig 3(c) and 3(d). In this paper, the flattening process is named as inter-frame flattening if the reference boundary is on the previous frame. Otherwise, the flattening process is called intra-frame flattening.

1.2 Edge Map.

The gradient value of each pixel indicates the possibility of belonging to a boundary and the calculation of the gradient image is essential in the graph search algorithm. Because the speckle induced high gradient values are randomly located and not connected, the graph search algorithm could easily distinguish them from the actual boundary. Advanced denoising techniques, such as anisotropic diffusion or nonlinear complex diffusion [9, 10, 12, 13], are not needed. As the boundaries in retinal layers are usually horizontal after the flattening step, we only consider two orientations, dark-to-bright and bright-to-dark.

To detect the boundaries with dark-to-bright transition, the gradient image is calulated as (2) where I is the B-scan image and the convolution kernel is defined as (3) and (4) After convolution, the gradient values which are less than 0 are set to zeros and all the gradient values are normalized to the range between 0 and 1. We assumed that the gradient value near the retinal boundary followed the step edge model [24] and the convolution kernel was designed as a matched filter. The size of the kernel is determined experimentally so that the speckle noise is reduced by averaging with the neighboring pixels without losing much details in the horizontal direction.

For the boundaries with bright-to-dark transition, kernel kb2d = −kd2b is used.

1.3 Search Region Refinement.

In order to detect multiple boundaries with the same orientation, the search region could be limited to a different region of interest (ROI). Both the intra-frame and the inter-frame search region refinement were used to locate the ROI.

For the intra-frame search region refinement, the search region is defined according to the relative position of the boundaries within the same image. For example, taking into account that the IPL-INL boundary is always located between the ILM and IS-OS boundaries, the search region of was selected as the rectangle area between and .

For the inter-frame search region refinement, the boundary location of the frame n − 1 was used to refine the search region in the frame n as illustrated in Fig 4. Our assumption is that the difference of the ILM’s axial position between adjacent frames are less than 10 pixels. Hence, we could limit the search region of to [Zh,Zl], where (5)

Fig 4. Illustration of the search region refinement using the inter-frame dependency approach.

Taking into account that the ILM boundary is delineated in the frame n − 1, the search region of could be limited to be .

In our implementation, only the pixels in the ROI were used as the vertices of the graph G. However, the original framework proposed by Chiu et al. [17] used every pixel of the B-scan to construct the graph and only removed the vertices out of the ROI for each boundary detection task.

As it will be discussed in Section 1.5, the time complexity of the algorithm is a function of the number of vertices. Hence, the processing speed is improved when compared with [17] as the number of vertices is greatly reduced in OCTRIMA 3D.

1.4 Biasing and Masking.

Biasing and masking are introduced to detect multiple boundaries located in the same ROI. For example, the IPL-INL border and OPLo are closely located and both are characterized by a dark-to-bright transition with low contrast, therefore it is difficult to distinguish them with the graph search automatic algorithm. In this circumstance, biasing and masking are applied to delineate these two boundaries using the relative position of the boundaries as illustrated in Fig 5. This technique is explained as follows:

  • Biasing
    It is assumed that OPLo is always below the IPL-INL border and that both boundaries are between the IS-OS and ILM boundaries (green solid line in Fig 5). The ROI is the area between the IS-OS and the lowest point of the ILM as indicated by the red dashed rectangle in Fig 5 The flattened bright-to-dark gradient image in the search region is first multiplied with a bias map Bl defined as (6) where M is the number of rows in the region of interest and (z, x) denotes the position in the axial and lateral directions. After biasing, the lowest boundary, i.e. the OPLo, has better contrast than the other layers. By using the shortest-path based graph search, the OPLo could be detected automatically.
  • Masking
    Masking refers to the element-by-element multiplication between the gradient image in the search region and a mask image when detecting the second lowest boundary, i.e. the IPL. To create the mask image, the pixels that are lower than the previously detected boundary are set to zeros and the other pixels are set to ones. After the element-by-element multiplication step, the gradient values in the pixels below the lowest boundary are zeros and hence the algorithm could detect the second lowest boundary using shortest-path based graph search.

Fig 5. Illustration of the biasing and masking operations for the boundary detection of the IPL-INL (red) and OPLo (blue).

The search region or ROI (red dotted rectangle) is the area between IS-OS and ILM (green solid lines). After element-by-element multiplication with lower bias map, the OPLo is more prominent in the gradient image and can be detected easily. A binary mask is generated to set all the pixels below OPLo to zeros and the second lowest boundary, IPL-INL, is detected using the shortest-path graph search.

As the gradient values in the search region are reduced after the biasing and masking operation, the intensities of the gradient image in each column of the ROI are normalized to [0, 1] to improve the contrast when biasing and masking is needed.

1.5 Shortest-Path Based Graph Search.

Once the flattening, search region refinement, and biasing and masking procedures are performed, the pixels’ intensity values in the ROI gM × N indicate the likelihood for detecting a potential boundary. The detection of the specific boundary is formulated as finding the shortest path as described earlier. The constructed graph is highly sparse and every vertex has eight connecting arcs only. Using Dijkstra’s Algorithm [23], the time complexity of the graph search method is O(log(∣V∣)*∣E∣), where ∣V∣ and ∣E∣ are the number of nodes and arcs [23]. In the context of our boundary detection framework, ∣V∣ = MN and ∣E∣ = 8MN. Hence the time complexity is O(log(MN)*MN).

In order to improve the processing time further, we have down-sampled the gradient image by a factor of 2 in the lateral direction. Because the retinal layers are smooth between adjacent columns, the reduction in the lateral resolution results in a great improvement in the processing speed without affecting accuracy much. The boundary location in the raw image is linearly interpolated from the detection results in the down-sampled image and further smoothed with a moving average filter.


This section describes the implementation details of OCTRIMA 3D for the detection of eight retinal boundaries. The overview of the method is shown in Fig 6.

Fig 6. The overview of OCTRIMA 3D framework.

The boundaries labeled using blue and red fonts have the dark-to-bright and bright-to-dark transitions, respectively.

2.1 Detection of the IS-OS boundary.

The IS-OS border is the most prominent and flat boundary in the retinal OCT B-scans of healthy subjects and it is detected on the dark-to-bright gradient images. The boundary detection strategies for the first frame and subsequent frames are different. For the first frame, the search region is the whole gradient image and a bias map Bl is multiplied with the gradient image to eliminate the interference from the ILM which is a high contrast boundary. The result of the shortest-path based graph search is . For the subsequent frames, the detection result of the ILM boundary in the previous frame is used for inter-frame flattening and inter-frame search region refinement.

2.2 Detection of the ILM boundary.

The ILM () border is another high contrast boundary on the dark-to-bright gradient image. Its detection method is described as follows:

For the first frame, the detected IS-OS () boundary is used as the reference boundary for flattening. Intra-frame search region refinement defines the area above as the search region for the ILM border. The result of the shortest-path based graph search is . In order to detect the ILM boundary in the subsequent frames (, n > 2), the inter-frame flattening (as illustrated in Fig 3) is applied to reduce the curvature of the ILM border in the central region of the fovea. The inter-frame search region refinement is used to reduce the processing time.

2.3 Detection of the OS-RPE and RPE-CH boundaries.

Intra-frame flattening is performed for the OS-RPE and RPE-CH boundaries detection using as the reference boundary. The edge kernels for the OS-RPE and RPE-CH boundaries are kb2d and kd2b, respectively. The search region is the rectangle area with a height of 40 pixels below the flattened IS-OS edge in the current frame. The RPE-CH () boundary is detected using the shortest-path based graph search on the bright-to-dark gradient image. The masking operation is applied to set all the gradient values on the pixels below to zeros. The only boundary in the search region of bright-to-dark images is the OS-RPE () border which can be detected easily.

2.4 Detection of the IPL-INL, INL-OPL and OPLo boundaries.

The intra-frame flattening is performed using as the reference boundary for the detection of the IPL-INL / INL-OPL / OPLo. The bright-to-dark edge kernel is used to detect the IPL-INL () and OPLo () boundaries. Intra-frame search region is defined as the area between the flattened IS-OS border and the lowest point of the ILM boundary in the current frame. As the separation between the IPL-INL and OPLo is small and they are in the same search region, masking and biasing operations are performed as described in Section 1.4. As for the INL-OPL () border’s detection, the dark-to-bright edge kernel is used to filter the flattened image. The search region is the same as the one used in the detection of the IPL-INL and OPLo. To make sure the INL-OPL border is always between the IPL-INL and OPLo, the gradient value on the pixels that are above the IPL-INL and below the OPLo are set to zeros by using the masking operation. The shortest-path graph search could delineate the INL-OPL boundary easily.

2.5 Detection of the RNFLo boundary.

The disruption of the RNFL’s outer boundary (RNFLo) caused by the presence of the blood vessels on the retina poses a challenge for the graph search algorithm. In our study, we detect the A-scans affected by the blood vessels by segmenting the enface map of the OCT volume data. As shown in Fig 7(a), the blood vessels have altered the distribution of A-scans in two ways: (1) the intensities of the pixels just below the ILM layer are higher than the surroundings A-scans and; (2) the intensity of the pixels between the IS-OS and RPE-CH boundaries is lower than in the neighboring pixels. The enface map is defined as: (7) where In denotes the nth B-scan/frame in the volume, , and . The A-scan that is affected by the blood vessels would have higher value of En,x and the location of blood vessels is determined by thresholding the enface image with the empirically selected value of −0.1. An example enface map is given in Fig 7(b).

Fig 7. Detection of A-scans affected by retinal blood vessels.

(a)The shadowing effect from the retinal blood vessels is more pronounced near the RPE (yellow rectangle, note the hypo reflective regions) and less pronounced near the ILM (green rectangle). (b) An example of the enface map of a macular volume OCT. (c) The result of RNFLo detection (blue solid line). The regions where the A-scans are affected by the blood vessel shadowing is highlighted with gray rectangles.

Once the location of the blood vessels is determined, the outer boundary of the RNFL () is detected as follows: First, the raw image is flattened with the ILM boundary as the reference boundary () and the kb2d is used as the filtering kernel. Then the search range is defined between the flattened ILM edge and the highest point of the IPL border. To overcome the disruption caused by the blood vessels, the gradient values on the A-scans affected by the blood vessels are set to ones. Hence, the weight of the arcs in the blood vessel region is equal to wmin, which is similar to the algorithm proposed in [17]. The result of the outer RNFL detection and vessel shadow location are illustrated in Fig 7(c).

After the eight intraretinal boundaries are segmented, the detected boundaries form 8 surfaces Bi,i = 1,2…8 defined as (8) where is the location of ith boundary on mth column and nth frames.

3. Clinical Data

We conducted our study on 10 Spectralis SD-OCT (Heidelberg Engineering GmbH, Heidelberg, Germany) volume data sets from 10 healthy adult subjects. The Institutional Review Board of the University of Miami approved the study. The research adhered to the tenets set forth in the declaration of Helsinki and written informed consent was obtained from each subject. The healthy subjects were selected based on a best-corrected visual acuity of at least 20/25, a history of no current ocular or systematic disease, and a normal appearance of the macula when examined with contact lens biomicroscopy.

Each subject was scanned using IR+OCT scanning mode with a 30° area setting. The captured volumes contained 61 images with the dimensions of 768 × 496 pixels (width× height). The axial resolution was 3.9 microns and the transversal resolution varied from 10 to 12 microns. The inter B-scan spacing was from 120 microns to 140 microns. To reduce the speckle noise and enhance the image contrast, every B-scan was the average of five aligned images using the TruTrack active eye tracking technology [25] (ART = 5). We exported the volume scans from Spectralis SD-OCT device using the built-in xml export format, which consisted of 61 JPG images and an xml file specifying the volume scanning details.

In addition, experiments were also conducted on 100 SDOCT images obtained with the Bioptigen device (Bioptigen Inc, Morrisville, North Carolina, USA) images from 10 subjects. The data and the manual labelings were kindly provided by Chiu et al. and the details can be found in [17].

Besides the OCT data from healthy subjects, two B-scans from subjects with pathologies were also used to explore the potential of OCTRIMA 3D in pathological cases. One scan was obtained from a patient with diabetic macular edema (DME) captured at the Bascom Palmer Eye Institute, University of Miami. The other B-scan was from an eye with dry age-related macular degeneration (drye-AMD) downloaded from Dufour’s software package’s website [22].

4. Experimental Setup

OCTRIMA 3D was implemented using Matlab R2014a on a computer with the CPU of Intel Core i7-2600@ 3.4 GHz 3.4 GHz. Prior to the 8 boundaries segmentation procedure, the ILM and RPE-CH borders were segmented from frames 21 to 40 and the point with the smallest distance was detected as the fovea. No training was needed in this work. The OCT A-scans outside the 6mm × 6mm (lateral × azimuth) area and centered at the fovea were cropped to remove low signal regions.

In order to evaluate the performance of OCTRIMA 3D, we compared our segmentation results with three existing graph-based segmentation approaches and the results from a manual grader using the following three experiments:

  • Comparison between Dufour’s algorithm, IOWA Reference Algorithm and OCTRIMA 3D: Dufour’s software is able to read the xml built-in format and detect 6 surfaces from the volumetric data automatically. The segmented surfaces were saved into a.csv file.
    In order to be able to use the Iowa reference algorithm, all scanned data in the xml format was converted to.vol raw format by our customized program using the following steps: 1. read the template.vol file from [26] using the Matlab script [27]. 2. Convert the intensity values in the xml format to the raw format by taking the fourth power [28]. 3. Replace the image data in the template.vol file with the converted raw data obtained in the second step. The converted.vol file was loaded into the IOWA Reference Algorithm and segmented by the “10 Layer Segmentation of Macular OCT” function. Eleven surfaces were segmented fully automatically and saved into a surface.xml file.
    The detected six surfaces from Dufour’s software are corresponding to ILM, RNFLo, IPL-INL, OPLo, IS-OS and RPE-CH in OCTRIMA 3D and are equivalent to surface 1, 2, 4, 6, 7 and 11 in the IOWA Reference algorithm, respectively.
    We compared the 6 segmented surfaces from the IOWA Reference Algorithm, Dufour’s software and OCTRIMA 3D in the ETDRS regions using the manual labeling from an expert grader as the ground truth qualitatively and quantitatively. Comparison was made by paired t-test and the level of significance was set at 0.001.
  • Comparison with the algorithm developed by Chiu et al. [17]: To validate OCTRIMA 3D, we segmented the same set of SDOCT images obtained with the Bioptigen device (Bioptigen Inc, Morrisville, North Carolina, USA) with the algorithm by Chiu et al. The manual labelings from their study reported in [17] were used as the ground truth for comparison. This comparison was performed to mainly assess the potential operational time’s improvement.
  • Comparison with manual graders’ results: In order to estimate the accuracy of OCTRIMA 3D, 100 OCT B-scans from 10 healthy subjects were used in the manual labeling experiment. A subset of 10 images were randomly selected from every volumetric data of a patient for labeling and at least two of these frames contained the fovea. Tracking boundaries of the retinal layers manually is a time-consuming process. In this study, we designed a software tool using Matlab 2014a for manual labeling. Particularly, once the observer or grader clicked on the points along each border, the manual tracing resulting from linear interpolation between the clicked points was taken as the final ground truth for comparison. The grader could also move, add and delete the clicked points to modify the boundary tracings. The labeling task was performed with extreme carefulness by two observers, Observer 1 and Observer 2. On average, it took about 30 minutes to label one frame. The delineated results from Observer 1 were taken as the ground truth and the inter-observer difference were used as a benchmark to evaluate the accuracy. Comparison was made by paired t-test and the level of significance was set at 0.001.
  • Potential application on retinal images showing pathological features: Detection of pathological retinal structures is a difficulty of countless everyday clinical importance. Two OCT B-scans from patients with retinal pathologies as described in the Section 3 were used to explore the potential of extending OCTRIMA 3D to segment volume data showing pathological feautures.

5. Performance Metrics

The following performance metrics were defined to objectively measure the difference between the detection results () and ground truth (denoted as ):

  • The signed error (SE) between the automatic detection and ground truth are defined by (9) where μ and σ denote the mean and standard deviation of the matrices, respectively. The value of mean signed error (MSE) and standard deviation of signed error (SSE) indicate the bias and variablity of the detection results.
  • The mean of the unsigned errors (MUE) which measures the absolute difference between the automatic detection results and manual labeling is defined by (10)
  • The 95th percentile unsigned errors, denoted as E95, is the highest value of the unsigned error after removing the top 5% of the biggest values. It measures the upper bound of unsigned error except the extreme cases.
The run time for the eight boundaries segmentation procedure of the whole volume is measured to calculate the time complexity of the algorithm.


  • Results of the comparison between Dufour’s Software, the IOWA Reference Algorithm and OCTRIMA 3D
    The output csv file from Dufour’s software was imported using Matlab. To interpret the results of the 10 layer segmentation procedure from the IOWA reference algorithm, the surface.xml file was read with a customized Matlab script. The segmentation procedures using OCTRIMA are illustrated in S1 Video. The processing time for the volume on the 6mm × 6mm(lateral × azimuth) area using our algorithm was 26.15 seconds while the processing time for Dufour’s software and the Iowa Reference Algorithm was about 60 seconds and 75 seconds, respectively. The unsigned detection errors obtained for six retinal surfaces are shown in Fig 8(a)–8(f). The average unsigned errors are shown in Table 2. As it can be seen, the unsigned error for OCTRIMA 3D in all the surfaces is significantly smaller than Dufour’s Software and IOWA Reference Algorithm (p < 0.001). The ILM, IS-OS and RPE-CH surfaces were more reliably delineated than the other three surfaces. As an example, the delineated boundaries on an OCT B-Scan are shown in Fig 9.
  • Results of comparison between the algorithm by Chiu et al. [17] and OCTRIMA 3D
    The results of our algorithm on OCT images obtained from the Bioptigen device (Bioptigen Inc, Morrisville, North Carolina, USA) were compared to the Chiu et al. algorithm and the results are shown in Table 3. Every OCT B-scan was processed independently and the average processing time was 1.15 seconds. The processing time for Chiu et al’s algorithm was reported as 9.74 seconds using a computer with a CPU of Intel Core 2 Duo at 2.53 GHz. Our implementation showed a significant improvement in the processing time. As expected (see Fig 10), the results from OCTRIMA 3D agreed well with the Chiu et al. algorithm and the main discrepancy was on the vessel shadow regions. The slight difference in accuracy between OCTRIMA 3D and Chiu et al’s algorithm is subjective to different observers. In particular, the manual labelings provide by Chiu et al. is very smooth. In OCTRIMA 3D, the delineated boundaries trace the small bumps and hence a slightly increase of error is observed.
  • Results of comparison between manual labelings from two observers and OCTRIMA 3D
    The difference between OCTRIMA 3D and manual labeling is shown in Table 4.
    Note that without segmentation bias correction, the bias of the results from OCTRIMA 3D were less than 1 pixel for all the boundaries. The average absolute error, which was measured with MUE, was in the range of [0.72, 1.7] pixel. The boundary 1, 6 and 8 had better performance than the other 5 remaining boundaries. OCTRIMA 3D detection unsigned error was significantly lower than the inter-observer difference (p < 0.001) for all of the eight boundaries. The upper bound of detection errors was between 1.86 to 4.36 pixels.
  • Results of OCTRIMA 3D for segmenting retinal images showing pathological features The segmentation results of OCTRIMA 3D are shown in Fig 11 and Fig 12. As it is shown in Fig 11, OCTRIMA 3D outperformed the built-in algorithm from Spectralis SD-OCT in detecting the ILM and RPE-CH boundaries. The built-in software of Spectralis has obvious detection errors and missing areas for the RPE-CH layer while OCTRIMA 3D was able to detect the ILM and RPE-CH boundaries accurately without any adjustment to the current algorithm. In the particular case of the AMD eye (see Fig 12(a)), the retinal layers were disrupted and posed a challenge to the automatic segmentation softwares. The segmentation results of Dufour’s software was shown in Fig 12(b). The IS-OS boundary detection failed in the left most and center area of the B-scan. In comparison, the OCTRIMA 3D was able to segment the layer correctly except for the IS-OS and OS-RPE boundaries in the drusen area where the surfaces were not flat. Therefore, the delineation of the IS-OS boundary was corrected by removing the flattening step and enhancing the edge map. After refinement, the IS-OS boundary could be detected precisely as shown in Fig 12(c). This particular improvement is an indication that OCTRIMA 3D could be further optimized to quantify morphological or pathological features on images that are not quite flat. However, the performance of OCTRIMA 3D on segmenting the OCT images with various pathologies is going to be investigated more thoroughly in the future.
Fig 8. Comparison of unsigned segmentation errors on six surfaces between Dufour’s algorithm (left column), the IOWA reference algorithm (middle column) and OCTRIMA 3D (right column) in the ETDRS regions.

The graph bar scale indicates the error magnitude in microns. The mean unsigned segmentation errors are reported in Table 2.

Table 2. Comparison of average absolute detection error in unit of pixels and microns between Dufour’s algorithm, IOWA Reference Algorithm and OCTRIMA 3D in ETDRS region.

Fig 9. The comparison between Dufour’s Software (magenta solid line), IOWA reference algorithm (blue solid line) and OCTRIMA 3D (red solid line) using manual labeling as the ground truth (green solid line).

Fig 10. The comparison between OCTRIMA 3D (red solid line) and the algorithm by Chiu et al.

(blue solid line) using manual labeling as the ground truth (green solid line).

Table 3. Comparison results between OCTRIMA 3D and the algorithm by Chiu et al. on Bioptigen OCT images. The error is quantified with (MSE± SSE, MUE, E95) in unit of pixels.

Table 4. Comparison results between OCTRIMA 3D and manual labelings from two graders.

The manual labeling from Observer 1 is taken as the ground truth and the inter-observer difference is reported as a benchmark to evaluate the accuracy. The difference is evaluated using (MSE± SSE, MUE, and E95) in unit of pixels.

Fig 11. Algorithms performance in the B-scan obtained from the patient with diabetic macular edema.

(a) The raw OCT B-scan. (b) The boundaries delineated by the built-in Spectralis SD-OCT software for the ILM and RPE-CH. The yellow arrows are indicating the boundary detection errors by the built-in software of the Spectralis device. (c) The boundaries delineated by OCTRIMA 3D for the ILM and the RPE-CH.

Fig 12. The segmentation results obtained for the B-scan in the eye with dry age-related macular degeneration using Dufour’s software and the OCTRIMA 3D algorithm.

The legend of the boundaries is the same as Fig 1. (a) THe raw OCT B-scan. (b) The segmentation result of Dufour’s software. The IS-OS delineation failed at the left most and center area of the B-scan. (c) The initial segmentation results of OCTRIMA 3D detected retinal boundaries reliably except for the IS-OS in the drusen area (green doted line). By adjusting the flattening step, the IS-OS is delineated correctly (green solid line).


This paper presents a graph-based automatic algorithm, OCTRIMA 3D, to segment cellular layers of the retina on macular scans from OCT volume data. The graph-based segmentation method solved the boundary detection problem by employing well-solved graph models, such as max-flow/min cut or shortest-path search and was found to be robust against the speckle noise and vessel disruption [17, 21]. In this work, shortest-path based graph search was used to detect eight boundaries in the macular region of OCT volume data. No training was needed in this work. Bias correction is not performed in our experiments as the systematic error is minimum in the dataset analyzed. Both accuracy and speed were evaluated in the comparison with manual labelings and two state of the art graph-based segmentation methods [17, 21]. The mean and standard deviation of the signed errors, the mean of the unsigned errors and the 95 percentile were reported to quantify the accuracy of OCTRIMA 3D. The detection errors of eight boundaries by OCTRIMA 3D were significantly lower (p < 0.001) than inter-observer difference in 100 Spectralis SD-OCT images from 10 subjects. The processing time for the whole OCT volume of 496 × 644 × 51 voxels (captured by Spectralis SD-OCT) is around 26 seconds and the average unsigned error is about 1 pixel. The Iowa reference algorithm is based on the minimum cost surface search on the graph constructed from 3D volume data. It is robust even when the boundary is missing in one of the frames due to vessel disruption or low signal strength. The detection result is smooth across the whole volume. However, the smoothness between frames could be a disadvantage in the following scenarios: (a) when there are motion artifacts in the OCT volume data and (b) when there are bumps or sudden curvature changes in the retinal structure. The first scenario is usually not found in the commercially available OCT devices due to the built-in motion correction algorithm. However, the algorithm may not work well in custom-built OCT devices without the implementation of motion correction algorithms. The smoothness constraint of the retina surface in the Iowa algorithm limited the capability of the algorithm to trace the small bumps and sudden curvature accurately. The Dufour’s software improves the Iowa Reference algorithm by adopting trained soft-constraint. The accuracy is greatly improved as illustrated in Fig 8 and Table 2.

The shortest-path based graph search method presented in [17] by Chiu et al. is the most related to our work. From the experiments, it was found that the processing speed of OCTRIMA 3D is greatly improved. The resulting improvement was mainly obtained as follows: 1. In Chiu et al.’s work, two graphs (dark-to-bright and bright-to-dark) were constructed from all the pixels in the OCT image for detecting eight boundaries. Most of the pixels were outside the search region resulting in a redundant process for the graph performance. In our methodology, a new graph for each boundary with a minimum number of vertices is constructed, facilitating a great reduction for the processing time; 2. Connectivity-based segmentation and several heuristic techniques were applied to define the search region for each boundary by Chiu et al. However, the heuristic techniques have limited robustness as images from different OCT modalities have different contrast and resolution. In OCTRIMA 3D, the connectivity-based segmentation was replaced with the masking and biasing operations; 3. The published work of Chiu et al. [17] did not consider the information from adjacent frames. In our implementation, the flattening and search region refinement step made use of inter-frame similarities.

Despite the promising results, our study has a few limitations. First, the processing time of the software is not only affected by the computational complexity, but also depends on the CPU processing power, programing language and number of tasks, which were not under control in the comparision between softwares. For example, the processing time of Chiu et al. algorithm is measured on a computer with differenct CPU power. The number of surfaces segmented by Iowa Reference software, Dufour’s software and OCTRIMA 3D were 11, 6 and 8, respectively. The programing languages used to deploy the algorithms were also different for the three software we compared. Therefore, the processing time of the softwares we measured is only a indicator of the real time capabilities. Second, the ground truth used the manual delineation of the retinal boundaries, which may be prone to inter-observer errors. Therefore, the manual labeling process was performed very carefully. On average, the labeling of eight boundaries on one OCT B-scan took 30 minutes to complete. Third, the current assumption is that the retinal surface is rather a flat surface and there are no big changes of boundary locations between frames (in the datasets analyzed/compared). This assumption works well for all of healthy cases. However, authors are aware that this assumption should be optimized in the presence of pathological features that may alter the retinal surface.

The most usual simplification approaches toward having real-time implementation require reducing both the number of operations and amount of data as well as the use of a simple or simplified algorithm. In this study, the processing time for the whole OCT volume has been greatly reduced while preserving the same amount of volume data.

Conclusion and Future Work

In conclusion, a fast and accurate automatic segmentation algorithm, OCTRIMA 3D, has been developed to detect eight boundaries in macular scans from OCT volume data. OCTRIMA 3D methodology was developed based on the shortest-path graph search method proposed by Chiu et al. [17] and extended to 3D by making use of inter-frame similarities. The processing time for the whole volume was about 26 seconds and the average of unsigned detection error was about 1 pixel (about 4 microns). The processing time could be further improved by making use of parallel computing.

Overall, OCTRIMA 3D provides a fast, accurate and robust solution for the analysis of OCT volume data in real-time which could improve the usefulness of OCT devices in daily clinical routine. Future work will include the segmentation of peripapillary scans and corresponding tests of the algorithm robustness. As macular scans often seem more uniform than the peripapillary scans, we expect that some adjustments in OCTRIMA 3D may be needed. An evaluation of a larger OCT volume dataset of diseased eyes is also planned to investigate whether the method could be used as an aid to diagnose and monitor various retinal pathologies in a clinical setting.

Supporting Information

S1 Video. The video illustrates the use of OCTRIMA 3D software to segment Spectralis OCT 3D data.


S1 Data. The compressed folder contains the data from 10 subjects.

For each subject, the *.mat file contains raw images used for segmentation, the results of OCTRIMA 3D, the manual labelings from Observer 1 and Observer 2.



This study was supported in part by a NIH Grant No. NIH R01EY020607, a NIH Center Grant No. P30-EY014801, by an unrestricted grant to the University of Miami from Research to Prevent Blindness, Inc., and by an Eotvos Scholarship of the Hungarian Scholarship Fund. The authors gratefully acknowledge Sina Farsiu, Ph.D. and Stephanie Chiu, Ph.D at Duke University for facilitating their complete study dataset and manual markings. We would also like to thank Milan Sonka, Ph.D. from the University of Iowa, for facilitating the online interactions with the IOWA Reference algorithm. Thanks to Sandra Pineda, B.S. for her assistance with the recruitment of healthy subjects and clinical coordination.

Author Contributions

Conceived and designed the experiments: JT GMS DCD. Performed the experiments: JT BV. Analyzed the data: JT. Contributed reagents/materials/analysis tools: DCD WHL WES. Wrote the paper: JT GMS DCD.


  1. 1. Quigley HA, Nickells RW, Kerrigan LA, Pease ME, Thibault DJ, Zack DJ. Retinal ganglion cell death in experimental glaucoma and after axotomy occurs by apoptosis. Investigative ophthalmology & visual science. 1995;36(5):774–786.
  2. 2. Schuman SG, Koreishi AF, Farsiu S, Jung Sh, Izatt JA, Toth CA. Photoreceptor layer thinning over drusen in eyes with age-related macular degeneration imaged in vivo with spectral-domain optical coherence tomography. Ophthalmology. 2009;116(3):488–496. pmid:19167082
  3. 3. Zhao Y, Chen Z, Saxer C, Xiang S, de Boer JF, Nelson JS. Phase-resolved optical coherence tomography and optical Doppler tomography for imaging blood flow in human skin with fast scanning speed and high velocity sensitivity. Optics letters. 2000;25(2):114–116. pmid:18059800
  4. 4. Huang D, Swanson EA, Lin CP, Schuman JS, Stinson WG, Chang W, et al. Optical coherence tomography. Science. 1991;254(5035):1178–1181. pmid:1957169
  5. 5. Drexler W, Liu M, Kumar A, Kamali T, Unterhuber A, Leitgeb RA. Optical coherence tomography today: speed, contrast, and multimodality. Journal of biomedical optics. 2014;19(7):071412–071412. pmid:25079820
  6. 6. Han IC, Jaffe GJ. Evaluation of artifacts associated with macular spectral-domain optical coherence tomography. Ophthalmology. 2010;117(6):1177–1189. pmid:20171740
  7. 7. DeBuc DC. A review of algorithms for segmentation of retinal image data using optical coherence tomography. Image Segmentation. 2011;1:15–54.
  8. 8. Hee MR, Izatt JA, Swanson EA, Huang D, Schuman JS, Lin CP, et al. Optical coherence tomography of the human retina. Archives of ophthalmology. 1995;113(3):325–332. pmid:7887846
  9. 9. Koozekanani D, Boyer K, Roberts C. Retinal thickness measurements from optical coherence tomography using a Markov boundary model. Medical Imaging, IEEE Transactions on. 2001;20(9):900–916.
  10. 10. Baroni M, Fortunato P, La Torre A. Towards quantitative analysis of retinal features in optical coherence tomography. Medical engineering & physics. 2007;29(4):432–441.
  11. 11. Fabritius T, Makita S, Miura M, Myllylń R, Yasuno Y. Automated segmentation of the macula by optical coherence tomography. Optics express. 2009;17(18):15659–15669. pmid:19724565
  12. 12. Cabrera Fernández D, Salinas HM, Puliafito CA. Automated detection of retinal layer structures on optical coherence tomography images. Optics Express. 2005;13(25):10200–10216. pmid:19503235
  13. 13. Boyer KL, Herzog A, Roberts C. Automatic recovery of the optic nervehead geometry in optical coherence tomography. Medical Imaging, IEEE Transactions on. 2006;25(5):553–570.
  14. 14. Yazdanpanah A, Hamarneh G, Smith B, Sarunic M. Intra-retinal layer segmentation in optical coherence tomography using an active contour approach. In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2009. Springer; 2009. p. 649–656.
  15. 15. Fuller AR, Zawadzki RJ, Choi S, Wiley DF, Werner JS, Hamann B. Segmentation of three-dimensional retinal image data. Visualization and Computer Graphics, IEEE Transactions on. 2007;13(6):1719–1726.
  16. 16. Lang A, Carass A, Hauser M, Sotirchos ES, Calabresi PA, Ying HS, et al. Retinal layer segmentation of macular OCT images using boundary classification. Biomedical optics express. 2013;4(7):1133–1152. pmid:23847738
  17. 17. Chiu SJ, Li XT, Nicholas P, Toth CA, Izatt JA, Farsiu S. Automatic segmentation of seven retinal layers in SDOCT images congruent with expert manual segmentation. Optics express. 2010;18(18):19413–19428. pmid:20940837
  18. 18. Garvin MK, Abràmoff MD, Wu X, Russell SR, Burns TL, Sonka M. Automated 3-D intraretinal layer segmentation of macular spectral-domain optical coherence tomography images. Medical Imaging, IEEE Transactions on. 2009;28(9):1436–1447.
  19. 19. Dufour PA, Ceklic L, Abdillahi H, Schroder S, De Dzanet S, Wolf-Schnurrbusch U, et al. Graph-based multi-surface segmentation of OCT data using trained hard and soft constraints. Medical Imaging, IEEE Transactions on. 2013;32(3):531–543.
  20. 20. Li K, Wu X, Chen DZ, Sonka M. Optimal surface segmentation in volumetric images-a graph-theoretic approach. Pattern Analysis and Machine Intelligence, IEEE Transactions on. 2006;28(1):119–134.
  21. 21. of IOWA U. The Iowa Reference Algorithms (Retinal Image Analysis Lab, Iowa Institute for Biomedical Imaging, Iowa City, IA), URL:; 2011. Available from:
  22. 22. OCT Segmentation Application (2013). OCT Segmentation Application (ARTORG Center, University of Bern, Switzerland).
  23. 23. Dijkstra EW. A note on two problems in connexion with graphs. Numerische mathematik. 1959;1(1):269–271.
  24. 24. Jayaraman. Digital Image Processing. Jayaraman, editor. McGraw-Hill Education (India) Pvt Limited; 2011. Available from:
  25. 25. TruTrack Active Eye Tracking, Minimizes Motion Artifact,;.
  26. 26. Abramoff M, Garvin M, Lee K, Sonka M. Heidelberg Macular.vol; 2014. Available from:
  27. 27. Mayer M. Read Heidelberg Engineering (HE) OCT raw files; 2011. Web. Available from:
  28. 28. Girard MJ, Strouthidis NG, Ethier CR, Mari JM. Shadow removal and contrast enhancement in optical coherence tomography images of the human optic nerve head. Investigative ophthalmology & visual science. 2011;52(10):7738–7748.