Reliable Identification of Deep Sulcal Pits: The Effects of Scan Session, Scanner, and Surface Extraction Tool

Sulcal pit analysis has provided novel insights into brain function and development. The purpose of this study was to evaluate the reliability of sulcal pit extraction with respect to the effects of scan session, scanner, and surface extraction tool. Five subjects were scanned 4 times at 3 MRI centers and another 5 subjects were scanned 3 times at 2 MRI centers, including 1 test-retest session. Sulcal pits were extracted on the white matter surfaces reconstructed with both the Montreal Neurological Institute and Freesurfer pipelines. Within each subject, we estimated the similarity of the presence of sulcal pits, which has a maximum value of 1, and their spatial difference. The tests showed high similarity of sulcal pit presence and low spatial difference. The similarity was more than 0.90 and the spatial difference was less than 1.7 mm in most cases across different scan sessions or scanners, and more than 0.85 and about 2.0 mm across surface extraction tools. The reliability of sulcal pit extraction was affected more by image processing-related factors than by scan session or scanner factors. Moreover, the similarity of sulcal pit distribution appeared to be largely influenced by the presence or absence of sulcal pits on shallow and small folds. We suggest that our sulcal pit extraction from MRI is highly reliable and could be useful for clinical applications as an imaging biomarker.


Introduction
The first sulci form while radial migration of neurons forms the cerebral cortex. Early gyrogenesis has been hypothesized to be influenced more by genetic than by environmental factors because of the relatively invariant spatial distribution of early sulci, which may be due to a human-specific predetermined protomap of functional areas [1,2,3,4,5]. Functional magnetic resonance imaging (MRI) studies have suggested that early-developing sulci are predictive of functional area locations and spatially covary with them [6,7,8]. It is therefore important to identify those first folds and analyze them together with functional features in order to understand the anatomical and functional development of the human brain.
The first folds can be estimated in mature brains because they are thought to develop into the deepest local regions of sulci, called the sulcal pits. We have extracted and analyzed sulcal pits from MRI data and found support for this hypothesis about their biological meaning. We analyzed the spatial distribution of sulcal pits on the cortical surface and found hemispheric asymmetry in the superior temporal regions, which may be related to the left lateralization of language function [2]. We also reported a significant relationship between the presence of sulcal pits and intellectual ability estimated with intelligence quotient (IQ) scores [9]. In the high verbal IQ group, a sulcal pit was more frequently present in the left posterior inferior frontal sulcus (near Broca's area) and the right posterior inferior temporal sulcus, regions reported to be involved in language function [10]. Other recent studies presented a graph-based sulcal pattern comparison method using sulcal pits and applied it to twins and polymicrogyria patients [11,12]. We showed that the similarity of the 3D position of the sulcal pits was much higher in twin pairs than in unrelated pairs, supporting the hypothesis that the spatial distribution of sulcal pits might be under tight genetic control [11]. Our previous studies suggest that sulcal pits have the potential to be an important new neuroimaging biomarker of brain function and development.
Methodologically, a sulcal pit is defined as the point having a local maximum depth on the inner cortical surface (white matter surface) [2], and therefore the extraction of sulcal pits is affected by the gray/white matter surface reconstruction. Gray/white surface reconstruction is in turn affected by image characteristics, such as gray/white matter intensity and signal-to-noise ratio, and by image processing. MR intensity properties can be influenced by subject-related factors, such as hydration status [13] or blood pressure, and by instrument-related factors, such as field strength, scanner hardware, imaging magnetic gradients, or pulse sequence [14,15,16,17,18]. Differences in image processing, such as the software package or the parameters chosen for the processing, may also contribute to variability in sulcal pit extraction even when the identical image is used. For example, the cortical surface reconstruction algorithms used in the Montreal Neurological Institute (MNI) [19,20], Freesurfer (FS) [21,22], and BrainVISA [23] pipelines result in different geometric accuracy and different surface mesh characteristics [24]. Therefore, these factors may affect surface reconstructions and sulcal pit extraction. Although we have shown that our method detected appropriate sulcal pits and their spatial distribution across subjects [2], intra-subject reliability and reproducibility have not been evaluated. It is important to determine the reliability of sulcal pit extraction under different scanning conditions and with different image processing factors in order to determine whether sulcal pits have a potential role as an imaging biomarker.
Therefore, the purpose of this study was to investigate the effects of scan session, scanner, and surface extraction tool on the reliability of sulcal pit extraction. Healthy young subjects were scanned several times on different scanners and their images were processed to extract sulcal pits on the cortical surface. We reconstructed cortical surfaces with 2 different surface extraction pipelines, the MNI [19,20] and FS [21,22] pipelines, which have been widely used for cortical thickness and shape analyses. Sulcal pit maps acquired from different images or different surface extraction pipelines were compared with each other within the same subject by measuring the similarity of the presence of sulcal pits and their spatial difference.

Figure 1. Example of intra-subject cortical surface alignment. Eight white matter surfaces (4 MNI and 4 FS surfaces) are rendered (A) and overlapped with a reference volume image (B) after a linear registration with 6 parameters. They are aligned well in a common space, but the local shape of the gray/white matter boundary differs slightly. doi:10.1371/journal.pone.0053678.g001

[…] and the other 5 subjects were scanned 3 times at 2 centers (SMC and EUMC), including 1 test-retest session at SMC (SMC1 and SMC2). The two test-retest sessions were approximately 3 weeks apart and all other sessions fell within this period. As in previous studies of MRI-based morphometric structural reliability [15,16], a 3-week interval would include sources of variability related to subject factors, such as hydration status or blood pressure, and to instrument factors, such as scanner drift, which may be minimized when the test-retest interval is several minutes to ~1 day. We obtained written consent from each patient, and the Institutional Review Boards of the Samsung Medical Center, Ewha Womans University Mokdong Medical Center, and Asan Medical Center approved the study protocol.

Data acquisition
Four or 3 sets of MRI data were acquired on scanners from the same manufacturer using the same sequence (Philips 3T Achieva scanners, T1-weighted 3D-TFE (Turbo Field Echo) sequence). The 3 MRI centers used identical parameters for isotropic 0.5 mm acquisition: sagittal slice thickness = 1.0 mm, overcontiguous slices with 50% overlap, no gap, TR = 9.9 ms, TI = 1,245 ms, TE = 4.60 ms, bandwidth = 142.3 Hz/pixel, flip angle = 8°, matrix size of 240×240 pixels, reconstructed to 480×480 over a FOV of 240 mm, voxel size = 0.5×0.5×0.5 mm. High-resolution structural images are sensitive to head motion even within a single volume. We inspected all raw images and confirmed that there was no head motion.

Image processing and cortical surface extraction using the MNI and FS pipelines
In the MNI pipeline, the native images were normalized to a standardized stereotaxic space using a linear transformation and corrected for intensity nonuniformity [25,26]. Images were then classified into white matter, gray matter, cerebrospinal fluid, and background using an advanced neural net classifier [27]. The hemispherical surfaces (the gray/white matter boundary and the gray matter/cerebrospinal fluid boundary) were automatically extracted, consisting of 40,962 vertices [19,20].
The FS pipeline includes removal of non-brain tissue [28], stereotaxic space transformation, tissue segmentation, intensity normalization [25], tessellation of the gray/white matter boundary, automated topology correction [29,30], and surface deformation following intensity gradients to optimally place the gray/white matter and gray matter/cerebrospinal fluid boundaries [21].

Extraction of sulcal pits on the cortical surface
A sulcal pit is the deepest local point in a sulcal catchment basin, and can be identified by using a sulcal depth map on the cortical surface. We used the white matter surface (gray/white matter boundary) and the 3D Euclidean depth map to extract sulcal pits [2,9]. The 3D Euclidean sulcal depth maps were generated by measuring the Euclidean distance from each vertex in the cortical surface to the nearest voxel on the cerebral hull [2,31]. We used a watershed algorithm based on a depth map to extract sulcal pits on triangular meshes. To prevent overextraction of the pits, we first reduced noisy depth variations by surface-based heat kernel smoothing with a full-width half-maximum value of 10 mm [32]. Subsequently, we performed segment merging in the watershed algorithm using the area of the catchment basin, the distance between the sulcal pits, and the ridge height. If the area of one of two or more catchment basins was smaller than a threshold (30 mm²) when they met at a ridge point, the smaller catchment basin below the threshold was merged into the adjacent catchment basin with the deepest pit and its sulcal pit was removed. If the distance between two pits was less than a 15 mm threshold, the shallower pit was also merged into the deeper one. Finally, merging was executed when the ridge was lower than a threshold of 2.5 mm. The methodological procedure is explained in more detail in our previous study [2].
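The merging rules above can be illustrated with a simplified watershed on a vertex graph. This is only a sketch under stated assumptions and not the implementation described in [2]: the function name `extract_pits`, the use of straight-line distance between pits, the per-vertex area input, and the toy mesh (a chain of vertices 5 mm apart) are all hypothetical, and depth smoothing is assumed to have been applied beforehand. For simplicity, only the basin with the shallower pit is tested against the area threshold.

```python
import math

def extract_pits(depth, neighbors, coords, vertex_area,
                 area_thr=30.0, dist_thr=15.0, ridge_thr=2.5):
    """Simplified watershed with merging; returns the surviving pit vertices."""
    label = [-1] * len(depth)   # basin id assigned to each vertex
    parent = {}                 # union-find over basin ids
    pit = {}                    # basin id -> deepest vertex of that basin
    basin_area = {}             # basin id -> area accumulated so far (mm^2)

    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    # Flood from the deepest vertex to the shallowest one.
    for v in sorted(range(len(depth)), key=lambda i: -depth[i]):
        touching = {find(label[u]) for u in neighbors[v] if label[u] != -1}
        if not touching:
            # No deeper labeled neighbor: v is a local depth maximum, a new pit.
            parent[v], pit[v], basin_area[v], label[v] = v, v, vertex_area[v], v
            continue
        ranked = sorted(touching, key=lambda b: -depth[pit[b]])
        keep = ranked[0]        # adjacent basin holding the deepest pit
        label[v] = keep
        basin_area[keep] += vertex_area[v]
        for other in ranked[1:]:
            # v is a ridge point between `keep` and `other`.
            too_small = basin_area[other] < area_thr
            too_close = math.dist(coords[pit[other]], coords[pit[keep]]) < dist_thr
            too_flat = depth[pit[other]] - depth[v] < ridge_thr
            if too_small or too_close or too_flat:
                parent[other] = keep
                basin_area[keep] += basin_area[other]
    return sorted(pit[b] for b in parent if find(b) == b)

# Toy example: 10 vertices on a line, 5 mm apart, 20 mm^2 each (made-up values).
depth = [1.0, 6.0, 12.0, 6.0, 2.0, 7.0, 13.0, 11.8, 12.5, 5.0]
coords = [(5.0 * i, 0.0, 0.0) for i in range(10)]
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)]
pits = extract_pits(depth, neighbors, coords, [20.0] * 10)
```

In this toy profile the shallow bump at vertex 8 is absorbed by the deeper pit at vertex 6, while the pit at vertex 2 survives because its basin is large enough, far enough away, and separated by a high ridge.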

Intra-subject cortical surface alignment
Surface vertex correspondence must be established to compare the results of sulcal pit extraction. In the MNI and FS pipelines, corresponding regions between subjects are determined using their own surface-based registration methods with sphere-to-sphere matching, in which the vertices of each subject are nonlinearly registered to a template surface [33,34,35]. However, our reliability test requires intra-subject rather than inter-subject comparison. Nonlinear registration between surfaces of the same subject could introduce extra noise into the definition of correspondence. In addition, the performance and accuracy of any specific surface registration algorithm could be biased by the different mesh properties of the MNI and FS surfaces. Instead, we adopted a volume linear registration to perform the intra-subject surface alignment [15]. We randomly chose 1 reference target volume for each subject, and the other 3 or 2 volumes were registered to the target volume using the MNI linear registration tool with 6 parameters. The transformation matrix of the registration was then applied to the corresponding surface. Finally, 8 or 6 surfaces from each individual subject (MNI and FS surfaces from 4 or 3 images) were registered to a common space. To check the registration result, we overlaid all surfaces on a reference volume image in the same space (Fig. 1). This confirmed that their global position and orientation were matched and that the linear registration was sufficient and robust for intra-subject surface alignment. Alignment results of 8 surfaces for the other 4 subjects are provided in supplementary data (Fig. S1). However, since the surfaces were reconstructed from different images or tools, the local shape of the gray/white matter boundary was slightly different, as shown in Fig. 1. This would lead to differences in the presence and location of sulcal pits.
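Applying a 6-parameter (rigid-body) transformation to surface vertices can be sketched as follows. The function names and the rotation order (about x, then y, then z) are assumptions made for illustration; the convention actually used by the MNI registration tool is not stated here, and in practice the matrix would come from the volume registration rather than from hand-picked angles.

```python
import math

def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rigid_transform(rx, ry, rz, tx, ty, tz):
    """Rotation matrix (angles in radians) and translation from 6 parameters."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rot_x = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    rot_y = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    rot_z = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    # Assumed order: rotate about x first, then y, then z.
    return matmul3(rot_z, matmul3(rot_y, rot_x)), (tx, ty, tz)

def apply_to_vertices(vertices, rotation, translation):
    """Map every vertex p to R p + t."""
    return [tuple(sum(rotation[i][k] * p[k] for k in range(3)) + translation[i]
                  for i in range(3))
            for p in vertices]

# Two made-up vertices, rotated 90 degrees about z and shifted by (1, 2, 3) mm.
vertices = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
rotation, translation = rigid_transform(0.0, 0.0, math.pi / 2, 1.0, 2.0, 3.0)
moved = apply_to_vertices(vertices, rotation, translation)
```

Because the transformation is rigid, distances between vertices are unchanged, so the mesh geometry (and hence the depth map computed on it) is not distorted by the alignment.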

Evaluation of reliability for sulcal pit extraction
After the global surface alignment in the common space, we compared the maps of sulcal pits between different surfaces of the same subject across the entire cortex. Because the surface shape at the gray/white matter boundary differed, the sulcal depth maps were not identical to each other. The subsequent processing, including the surface-based smoothing and the merging in the watershed algorithm, might also cause inconsistent identification of the local maximum depth points. Our reliability test therefore first examined the variability of the presence of sulcal pits. Second, if two pits existed in the same anatomical region, we measured how much their locations differed. Given a pair of surfaces, $S_A$ and $S_B$, containing sulcal pits $P_A = \{a_1, a_2, \ldots\}$ and $P_B = \{b_1, b_2, \ldots\}$ respectively, we estimated the similarity of the presence of sulcal pits and their spatial difference by detecting the sulcal pits that exist in the same sulcal catchment basin. We assumed that if two pits from $P_A$ and $P_B$ are close to each other, with a spatial difference of less than a threshold of 10 mm, they are matched and identified as lying in the same catchment basin. Two pits from different sulcal basins could in principle be matched, but the probability of this is low. As described above, noisy sulcal pits extracted from small catchment basins and those located too close to other pits were merged and removed; therefore, by construction, the area within 10 mm of a sulcal pit did not encroach on the areas of other sulcal pits. Fig. 2 shows an example map of geodesic distance from the sulcal pits, used as seed points, on an individual cortical surface. The areas within 10 mm of the sulcal pits are separated from each other.
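A geodesic distance map seeded at the sulcal pits can be approximated by shortest paths along mesh edges. The sketch below uses Dijkstra's algorithm, which is an assumption made for illustration and not necessarily the method of [35,36]; edge-path distances slightly overestimate true surface geodesics. The function name and the toy chain mesh (vertices 4 mm apart) are hypothetical.

```python
import heapq
import math

def geodesic_from_seeds(coords, neighbors, seeds):
    """Edge-path distance from every vertex to its nearest seed vertex."""
    dist = [math.inf] * len(coords)
    heap = []
    for s in seeds:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale queue entry, a shorter path was already found
        for u in neighbors[v]:
            nd = d + math.dist(coords[v], coords[u])
            if nd < dist[u]:
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist

# Toy mesh: 6 vertices on a line, 4 mm apart, with a single pit at vertex 0.
coords = [(4.0 * i, 0.0, 0.0) for i in range(6)]
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)]
dist_to_pit = geodesic_from_seeds(coords, neighbors, seeds=[0])
within_10mm = [v for v, d in enumerate(dist_to_pit) if d < 10.0]
```

Passing all pits of a surface as `seeds` gives, for each vertex, the distance to the nearest pit, which is the quantity thresholded at 10 mm in the matching step.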
We first projected $P_A$ onto $S_B$ using the nearest Euclidean distance and then measured the geodesic distances from $P_B$ on $S_B$ (Fig. 3A). The geodesic distances were computed and assigned to the vertices constituting the surface model [35,36]. When the nearest point from $P_A$ on $S_B$ was located not on a vertex but on the plane of a triangle, we interpolated the known distance values of the three vertices using barycentric coordinates on the triangle [37]. A more detailed schematic illustration is provided in Fig. 3B. Among $P_A$, we counted the number of sulcal pits matched with $P_B$, $N(P_{A \to B})$, whose shortest geodesic distances from $P_B$ were less than 10 mm. For the pits from different surfaces that were regarded as matched within 10 mm, we also recorded their geodesic distances and calculated the mean value $D(P_{A \to B})$. Next, $P_B$ was projected onto $S_A$, and $N(P_{B \to A})$ and $D(P_{B \to A})$ were measured in the same manner. We computed the ratios of $N(P_{A \to B})$ and $N(P_{B \to A})$ to the total numbers of sulcal pits $N(P_A)$ and $N(P_B)$ respectively. The similarity of the presence of sulcal pits ($M_1$) and their spatial difference ($M_2$) between $S_A$ and $S_B$ were finally defined as follows:

$$M_1(S_A, S_B) = \frac{1}{2}\left(\frac{N(P_{A \to B})}{N(P_A)} + \frac{N(P_{B \to A})}{N(P_B)}\right)$$

$$M_2(S_A, S_B) = \frac{1}{2}\left(D(P_{A \to B}) + D(P_{B \to A})\right)$$

where $M_1$ and $M_2$ are symmetric matrices for the comparison of all pairs among 8 or 6 surfaces. A simple example of these measurements is shown in supplementary data (Fig. S2).
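The two measures can be sketched in a few lines, under the assumption that the similarity is the average of the two matched-pit ratios and the spatial difference is the average of the two mean matched distances. The function names and the input distances are made up for illustration; each input list holds, for every projected pit, the geodesic distance to the nearest pit on the other surface. A small helper shows the barycentric interpolation of vertex distance values for a pit that projects inside a triangle.

```python
import math

def interpolate_barycentric(weights, vertex_values):
    """Interpolate per-vertex values at a point with barycentric weights (sum to 1)."""
    return sum(w * x for w, x in zip(weights, vertex_values))

def directed_match(distances, threshold=10.0):
    """Matched-pit ratio and mean matched distance for one projection direction."""
    matched = [d for d in distances if d < threshold]
    ratio = len(matched) / len(distances)
    mean_dist = sum(matched) / len(matched) if matched else math.nan
    return ratio, mean_dist

def pit_similarity(dist_a_to_b, dist_b_to_a, threshold=10.0):
    """Symmetric similarity (max 1) and spatial difference (mm) of two pit maps."""
    ratio_ab, mean_ab = directed_match(dist_a_to_b, threshold)
    ratio_ba, mean_ba = directed_match(dist_b_to_a, threshold)
    return 0.5 * (ratio_ab + ratio_ba), 0.5 * (mean_ab + mean_ba)

# Made-up distances (mm): one of the four pits of A has no counterpart on B.
dist_a_to_b = [1.0, 2.0, 12.0, 3.0]
dist_b_to_a = [1.5, 2.5, 2.0]
m1, m2 = pit_similarity(dist_a_to_b, dist_b_to_a)

# A projected pit inside a triangle whose vertices carry distances 2, 4 and 8 mm.
d_interp = interpolate_barycentric((0.5, 0.25, 0.25), (2.0, 4.0, 8.0))
```

Averaging both projection directions makes the measure independent of which surface is treated as the source, so the resulting pairwise matrices are symmetric.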

Effects of different scan session, scanner, and surface extraction tool
We constructed $n \times n$ matrices $M_1$ and $M_2$ ($n$ = 8 or 6) and performed $(n^2 - n)/2$ comparisons for each subject. The matrix of all pairwise comparisons of sulcal pit maps is shown in Fig. 4. In these matrices, we evaluated the effects of different scan session, scanner, and surface extraction tool on the similarity of the presence of sulcal pits and their spatial difference. A detailed explanation of the test for the 8×8 matrix is as follows:
[…]
The difference of sulcal pit maps was measured between the MNI and FS surfaces in the 4 pairs that shared the same scan session and scanner.
(d) Effects of different scan session or scanner, and surface extraction tool
The sulcal pit maps of all other 12 pairs, acquired from different scan sessions or scanners and with different surface extraction tools, were compared. As shown in Fig. 4, each case is marked in the cells of the matrix. For cases b, c, and d there were several comparisons, so we measured the mean values for each case.
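The grouping of the pairwise comparisons for the 8-surface case can be enumerated as below. This is a sketch under assumptions: the reading of case (a) as same tool and same scanner (test-retest) and case (b) as same tool and different scanner is inferred from the study design, and the label `THIRD` for the third center is a placeholder. Only the counts for the tool-only case (4 pairs) and the combined case (12 pairs) are stated in the text.

```python
from collections import Counter
from itertools import combinations

# Scan label -> center; SMC was scanned twice (test-retest).
scans = {"SMC1": "SMC", "SMC2": "SMC", "EUMC": "EUMC", "THIRD": "THIRD"}
tools = ["MNI", "FS"]
surfaces = [(tool, scan) for tool in tools for scan in scans]

def category(surface_1, surface_2):
    """Assign a pair of surfaces to one of the four comparison cases."""
    (tool_1, scan_1), (tool_2, scan_2) = surface_1, surface_2
    if tool_1 == tool_2:
        # Same tool: either a repeat session on one scanner or different scanners.
        return "a" if scans[scan_1] == scans[scan_2] else "b"
    # Different tools: the same image, or a different session/scanner as well.
    return "c" if scan_1 == scan_2 else "d"

counts = Counter(category(s1, s2) for s1, s2 in combinations(surfaces, 2))
```

The four groups together account for all (8² − 8)/2 = 28 pairs, and the same function could be used to average the similarity and spatial difference values within each group.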

Sulcal depth and area of sulcal catchment basin on matched and unmatched sulcal pits
We performed a supplementary analysis to find the structural characteristics of the regions where sulcal pit presence was not consistent. In the process of measuring the similarity of sulcal pit presence, matched and unmatched sulcal pits from each pair of surfaces were differentiated. Measuring the depths and sulcal basin areas of unmatched sulcal pits can provide morphological information on the regions showing variable sulcal pit presence. We computed the means of these measurements in each subject and compared them with the same measurements of matched sulcal pits. The differences between matched and unmatched sulcal pits were examined with a paired t-test.

Figure 7. The map of the 96 cluster regions (48 regions for the left and right hemispheres) and the frequency of sulcal pits from the distribution of the pits in 148 normal subjects. This is reproduced from our previous study [2]. The frequency of sulcal pits is represented as a percentage (number of pits/148 × 100). doi:10.1371/journal.pone.0053678.g007
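The paired comparison can be sketched as follows. The depth values are synthetic and serve only to show the computation; they are not data from this study. The function name is hypothetical, and only the t statistic and degrees of freedom are computed, with the p-value left to a statistics package.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two equal-length samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Synthetic per-subject mean depths (mm) of matched and unmatched pits.
matched_depth = [12.0, 13.0, 12.5, 14.0]
unmatched_depth = [9.0, 9.5, 10.0, 10.5]
t_stat, dof = paired_t(matched_depth, unmatched_depth)
```

The test is paired because each subject contributes one mean for matched pits and one for unmatched pits, so between-subject differences in overall sulcal depth cancel in the per-subject differences.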

Results
We display the sulcal pit maps from 8 different surfaces for an individual brain. When we visually compare the location and existence of the sulcal pits, their distributions are highly similar to each other, although the presence of sulcal pits is variable in some regions, as marked in Fig. 5. We also projected all sulcal pits onto one cortical surface and constructed a pit representation for each subject indicating the frequency of sulcal pit presence across the 4 MNI and 4 FS surfaces (Fig. 6). The same representation is provided for the other 5 subjects, who were scanned 3 times, in Fig. S3. Our quantitative reliability evaluation shows the similarities of the presence of sulcal pits and their spatial differences in both hemispheres for each subject with respect to different scan session, scanner, and surface extraction tool, which are presented in Table 1 […]

The mean depth of matched sulcal pits was more than 12 mm in both hemispheres, whereas the mean depth of unmatched sulcal pits was less than 10 mm. The difference across subjects was evaluated with a paired t-test and was statistically significant (P < 0.0001). The sulcal catchment basin areas of matched sulcal pits were also significantly larger than those of unmatched sulcal pits in both hemispheres (P < 0.0001). All data and statistical results are shown in Table 3.

Discussion
We acquired 4 images on 3 different scanners or 3 images on 2 scanners (including 1 test-retest session) in 10 subjects, using scanners of the same manufacturer and field strength and the same sequence with identical parameters. We constructed sulcal pit maps on the white matter surfaces extracted with both the MNI and FS pipelines, and compared them with each other. We globally aligned cortical surfaces using a volume linear registration to avoid biases that might be caused by a nonlinear surface registration, and carried out pairwise comparisons by projecting sulcal pits from one surface onto another. Although there are minute differences in local shape between surfaces, their global positions matched well (Fig. 1 and S1). Therefore, sulcal pits can be projected to corresponding areas on another surface. The effects of scan session, scanner, and surface extraction tool were investigated for each subject.
Sulcal pit extraction from MRI was highly reliable across different scan sessions and scanners, showing high similarity of the sulcal pit presence and low spatial difference. The similarity of the presence of sulcal pits was more than 0.90 and their spatial difference was only about 1.70 mm. Typically, the edge distance between a vertex and its neighboring vertex was 1.75 to 1.85 mm and 0.83 to 0.85 mm for the MNI and FS surfaces respectively. Hence, the spatial difference of sulcal pits was quite small when considering the mesh resolution of the surface model. These results suggest that the locally deepest point in a sulcus detected from MRI is not random but is a reliable structural feature and our algorithm generates reliable sulcal pit maps on the cortical surface. In the process of sulcal pit extraction, we smoothed the depth map on the surface and performed a merging process which removes noisy pits according to several criteria and parameters. This process likely plays an effective role in increasing the reliability of sulcal pit detection. We could also have examined the reliability of sulcal pit distribution with different parameters. However, we have evaluated and optimized the parameters for sulcal pit extraction in our previous study [2], and have continuously shown significant relationships between our sulcal pit map and brain function [2,9,11,12]. Thus, this study aimed at considering the reliability of sulcal pits extracted with our optimized and fixed parameters.
We also detected reliable sulcal pits with the MNI and FS image analysis tools using the same images, with a similarity of more than 0.85 and a spatial difference of about 2.00 mm. However, the reliability test across different tools showed lower similarity and higher spatial difference than the test across different scan sessions and scanners. In addition, the similarity and spatial difference did not change significantly when both the surface extraction tool and the scan session or scanner were different. The surface extraction tool may affect the reliability of sulcal pit extraction more than the scan session or scanner because cortical surface extraction requires several specific procedures, including intensity nonuniformity correction, intensity normalization, stereotaxic space transformation, tissue segmentation, surface modeling, and so forth. Although the MNI and FS pipelines adopt the same tool for some processes, such as the N3 algorithm for intensity nonuniformity correction [25], most of their remaining steps use different tools and algorithms. Therefore, there are several different image processing steps that may have caused the variability in the presence of sulcal pits and their spatial localization. We hypothesize that the method for surface modeling might be one of the most critical steps affecting sulcal pit extraction, although further investigation is required to understand and evaluate the effect of each processing step. The MNI and FS pipelines use entirely different approaches to cortical surface modeling. In the MNI pipeline, the white matter surface is extracted by deforming an initial spherical mesh onto the gray/white matter boundary using the Constrained Laplacian-based Automated Segmentation with Proximities algorithm [19,20]. In the FS pipeline, a surface tessellation is constructed by using triangles to represent the faces separating white matter voxels [21].
The number of triangles and the properties of the surface mesh differ between the MNI and FS surfaces, and their local shapes are not identical. These differences may affect the generation of the sulcal depth map and the reproducibility of sulcal pit extraction. Moreover, when we visually inspect individual maps of the sulcal pit frequency and compare the maps between the MNI and FS surfaces, the presence and distribution of the sulcal pits seem more variable between tools in the medial areas than in the lateral areas (Fig. 6 and S3). Nevertheless, our results confirm that we can still expect high reliability of sulcal pit extraction from the gray/white surface. We additionally investigated differences in the number of pits between the MNI and FS surfaces with a paired t-test. We analyzed the sulcal depth and the area of the sulcal catchment basin for matched and unmatched sulcal pits to identify the structural characteristics of the regions where sulcal pit presence was not consistent. Our results showed that the sulcal depth was significantly shallower and the area of the sulcal catchment basin was significantly smaller in the regions where sulcal pit presence or absence varied across scan sessions, scanners, or surface extraction tools. Sulcal pit presence was more consistent in the deep and large sulcal folds. These results can be confirmed in the individual representations of the sulcal pit frequency across different images (Fig. 6 and S3). Sulcal pits on shallow and small folds are candidates for merging in our algorithm. Sometimes they may be merged and removed, but at other times they may not, owing to minute changes in the modeling of cortical surface shape. However, once sulcal pits were identified, they showed invariant spatial localization, as shown in our results for the spatial difference.
We have published a group map of sulcal pits constructed from 148 normal adult brains and defined 96 cluster regions from the distribution of the pits in the group [2]. The map showing the cluster regions and the frequency of sulcal pits as a percentage is reproduced from our previous study (Fig. 7). To further understand sulcal pit identification, we visually compared the intra-subject reproducibility of the pits with the group map of the sulcal pit frequency. It is of interest that intra-subject reproducibility of sulcal pit extraction seems relatively low in the cingulate sulcal region compared to other regions, and the frequency of sulcal pits in the group is also low there (Fig. 6 and 7). This may be because the cingulate sulcus is shallow and its depth profile is not dynamic, which increases uncertainty in identifying the sulcal pit location, and because it is also variable across subjects. In the inferior frontal and inferior temporal sulcal regions, the frequency of sulcal pits in the group is low, but sulcal pits are identified with high reproducibility in individual subjects (Fig. 6 and 7). We suggest that the low frequency in those regions in the group is not due to unreliable sulcal pit extraction, but due to high inter-subject variability of sulcal patterns.
In conclusion, the extraction of deep sulcal pits and their distribution appears to be highly reliable across scan session, scanner type, and extraction tool. The reliability of sulcal pit extraction in shallow regions with small catchment basins, although still high across scan session, scanner type, and extraction tool, is lower. It may be necessary to consider an appropriate way of analyzing and interpreting results for the sulcal pits in those regions. Given the high reliability and reproducibility of sulcal pits across different scan sessions, scanners, and surface extraction pipelines, we suggest that extraction of sulcal pits could provide stable landmarks for studying structure-function relationships in the human brain, and could be useful for various clinical application studies as an imaging biomarker. Multicenter and longitudinal neuroimaging studies, such as the Alzheimer's Disease Neuroimaging Initiative [38,39], are becoming an increasingly important element of clinical research for diagnosing and evaluating neurological impairments. One of the challenges is to understand and minimize image variability caused by non-disease-related factors. Our results suggest that sulcal pits may be useful in multicenter or longitudinal studies, and also for comparing results of cross-sectional studies performed with different image processing pipelines. In future work, it would be important to investigate other instrument-related effects on sulcal pit extraction, such as scanner manufacturer, field strength, or pulse sequence.