Mindboggling morphometry of human brains

Mindboggle (http://mindboggle.info) is an open source brain morphometry platform that takes in preprocessed T1-weighted MRI data and outputs volume, surface, and tabular data containing label, feature, and shape information for further analysis. In this article, we document the software and demonstrate its use in studies of shape variation in healthy and diseased humans. The number of different shape measures and the size of the populations make this the largest and most detailed shape analysis of human brains ever conducted. Brain image morphometry shows great potential for providing much-needed biological markers for diagnosing, tracking, and predicting progression of mental health disorders. Very few software algorithms provide more than measures of volume and cortical thickness, while more subtle shape measures may provide more sensitive and specific biomarkers. Mindboggle computes a variety of (primarily surface-based) shapes: area, volume, thickness, curvature, depth, Laplace-Beltrami spectra, Zernike moments, etc. We evaluate Mindboggle’s algorithms using the largest set of manually labeled, publicly available brain images in the world and compare them against state-of-the-art algorithms where they exist. All data, code, and results of these evaluations are publicly available.

Cortical labels displayed in the ROYGBIV interactive online brain image viewer. The anatomical labels included in the DKT cortical labeling protocol [22] used to label the Mindboggle-101 data are displayed on a left cortical surface. These two panels show the current state of our prototype for a browser-based interactive visualization of the left hemisphere of a human brain

[69] and [70]) and AddNeuroMed [71] data for an international Alzheimer's disease challenge [72] (https://www.synapse.org/Synapse:syn2290704/wiki/60828). Teams performed statistical analyses on Mindboggle shape measures to try to determine which brains had Alzheimer's disease or mild cognitive impairment or were healthy, and to try to estimate a cognitive measure (mini-mental state exam score). The Results section presents an analysis of some of these data.

For running individual functions on surface meshes, the only inputs to the software are outer cortical

Mindboggle performs is to convert FreeSurfer volume and surface formats to NIfTI and VTK for further processing. All volume images in this study have a resolution of 1 x 1 x 1 mm³ per voxel (volume element). All surface-based shape measures are computed on the "pial surface" (cortical-cerebrospinal fluid boundary) by default, since it is sensitive to differences in cortical thickness.

Step 2: Optionally combine FreeSurfer and ANTs gray/white segmented volumes and fill with labels:

This optional step of the pipeline will be skipped in the future when methods for tissue class segmentation of T1-weighted MR brain images into gray and white matter improve. FreeSurfer and ANTs make different kinds of mistakes while performing tissue class segmentation (Fig 2). After visual inspection of the gray/white matter boundaries in over 100 EMBARC (http://embarc.utsouthwestern.edu/, https://clinicaltrials.gov/ct2/show/NCT01407094) brain images processed by FreeSurfer, we found that at least 25 brains had significant overcropping of the brain, particularly in ventral regions such as lateral and medial orbitofrontal cortex and inferior temporal lobe, due to poor surface mesh reconstruction in those regions. This corroborates Klauschen's observation that FreeSurfer underestimates gray matter and overestimates white matter [77]. We also found that ANTs tends to include more cortical gray matter than FreeSurfer, but at the expense of losing white matter that extends deep into gyral folds, and sometimes includes non-brain tissue such as transverse sinus, sigmoid sinus, superior sagittal sinus, and bony orbit.

The combine_2labels_in_2volumes function overlays FreeSurfer white matter atop ANTs cortical gray, by taking the union of cortex voxels from both binary files as gray matter, the union of the non-cortex voxels from the two binary files as white matter, and assigning intersecting cortex and non-cortex voxels as non-cortex. While this strategy often preserves gray matter bordering the outside of the brain, it still suffers from over-inclusion of non-brain matter, and sometimes replaces true gray matter with white matter in areas where surface reconstruction makes mistakes.

outside of the brain that the ANTs segmentation mistakenly includes as gray matter. To reconcile some of these discrepancies, Mindboggle currently includes an optional processing step that combines the segmentations from FreeSurfer and ANTs. This step essentially overlays the white matter volume enclosed by the magenta surface in the middle panel atop the gray/white segmented volume in the right panel.
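The union-and-overlay rule described above for combining the two segmentations can be illustrated with a minimal NumPy sketch. It is not Mindboggle's combine_2labels_in_2volumes implementation: the function name combine_gray_white, the boolean-array inputs, and the output label values GRAY and WHITE are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical output label values, chosen only for this illustration.
GRAY, WHITE = 2, 3


def combine_gray_white(fs_cortex, fs_noncortex, ants_cortex, ants_noncortex):
    """Combine two gray/white segmentations given as binary arrays.

    Gray matter is the union of the two cortex masks, white matter is the
    union of the two non-cortex masks, and voxels claimed by both unions
    are assigned to non-cortex.
    """
    cortex = np.asarray(fs_cortex, dtype=bool) | np.asarray(ants_cortex, dtype=bool)
    noncortex = np.asarray(fs_noncortex, dtype=bool) | np.asarray(ants_noncortex, dtype=bool)

    combined = np.zeros(cortex.shape, dtype=np.uint8)
    combined[cortex] = GRAY
    # Non-cortex is written last so that it wins wherever the unions intersect.
    combined[noncortex] = WHITE
    return combined
```

Writing the non-cortex label last is what makes intersecting voxels resolve to non-cortex, which mirrors the overlay of white matter atop cortical gray described in the text.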

Step 3: Compute volumetric shape measures for each labeled region:
• volume
• thickness of cortical labels (thickinthehead)

As mentioned in the Introduction, the most common shape measures computed for brain image data are volume and cortical thickness for a given labeled region of the brain. Volume measurements are

To avoid surface reconstruction-based problems with the cortical thickness measure, we built a function called thickinthehead that computes a simple thickness measure for each cortical region from a brain image volume without relying on surface data (Fig 3). The thickinthehead function first saves a brain volume that has been segmented into cortex and non-cortex voxels into separate binary files, then resamples these cortex and non-cortex files from, for example, 1 mm³ to 0.5 mm³ voxel dimensions to better represent the contours of the cortex. Next it extracts outer and inner boundary voxels of the cortex by morphologically eroding the cortex by one (resampled) voxel bordering the outside of the brain and bordering the inside of the brain (non-cortex). Then it estimates the middle cortical surface area by the average volume of the outer and inner boundary voxels of the cortex.

Finally, it estimates the thickness of a labeled cortical region as the volume of the labeled region divided by the middle surface area of that region. The thickinthehead function calls the ImageMath, Threshold, and ResampleImageBySpacing functions in ANTs.

Mindboggle's thickinthehead algorithm estimates cortical thickness for each brain region without relying on cortical surface meshes by dividing the volume of a region by an estimate of its middle surface area. Clockwise from lower left: 3-D cross-section and sagittal, coronal, and axial slices. The colors represent the inner and outer "surfaces" of cortex created by eroding gray matter bordering white matter and eroding gray matter bordering the outside of the brain. The middle surface area is estimated by taking the average volume of these inner and outer surfaces.
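The volume-over-middle-area idea behind thickinthehead can be sketched with SciPy's morphology routines. This is only an illustration under stated assumptions, not the ANTs-based implementation described above: the function name thickness_per_label and its arguments are hypothetical, the resampling step is omitted, and each boundary voxel is assumed to contribute one voxel face of area.

```python
import numpy as np
from scipy import ndimage


def thickness_per_label(labels, noncortex, cortex_labels, voxel_size=1.0):
    """Estimate thickness per label as volume / mean(inner, outer boundary area).

    labels: integer volume of region labels.
    noncortex: binary volume marking non-cortex brain voxels.
    cortex_labels: label values that count as cortex.
    """
    labels = np.asarray(labels)
    noncortex = np.asarray(noncortex, dtype=bool)
    cortex = np.isin(labels, cortex_labels)
    brain = cortex | noncortex

    # Outer boundary: cortex voxels lost when the whole brain is eroded by one voxel.
    # border_value=1 keeps the array edges from acting as an artificial boundary.
    outer = cortex & ~ndimage.binary_erosion(brain, border_value=1)
    # Inner boundary: cortex voxels lost when everything but non-cortex is eroded.
    inner = cortex & ~ndimage.binary_erosion(~noncortex, border_value=1)

    result = {}
    for label in cortex_labels:
        region = labels == label
        volume = region.sum() * voxel_size ** 3
        # Assumption: one boundary voxel is converted to one voxel face of area.
        n_boundary = 0.5 * ((region & outer).sum() + (region & inner).sum())
        mid_area = n_boundary * voxel_size ** 2
        result[label] = volume / mid_area if mid_area > 0 else float("nan")
    return result
```

For a flat slab of cortex four voxels thick, lying between non-cortex and the outside of the brain, this sketch returns a thickness of four voxel widths, which is the behavior the volume-to-area ratio is meant to capture.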

Step 4: Compute shape measures for every cortical surface mesh vertex:

(Fig 4). Area can be used to normalize other values computed within a given region such as a gyrus or sulcus [86].

Mindboggle computes surface area for each surface mesh vertex as the area of the Voronoi polygon

small neighborhood (Fig 5), which works best for low resolution or for local peaks, but can be sensitive

Depth is an important measure characterizing the highly folded surface of the human cerebral cortex.

Since much of the surface is buried deep within these folds, an accurate measure of depth is useful for

We are aware of three predominant methods for measuring depth of points on the surface of the cerebral cortex, where depth is the distance from a given point on the brain surface to an outer reference surface of zero depth (the portions of the brain surface in contact with the outer reference surface are gyral crowns or crests). The first is Euclidean depth, the distance along a straight path from the point on the brain to the outer reference surface. A straight path has the undesirable property that it will cross through anything, which can make a highly folded surface indistinguishable from a slightly folded surface that fills the same volume. The second is geodesic depth, the shortest distance along the surface of the brain from the point to where the brain surface makes contact with the outer reference surface.

The above implementations of travel depth use a convex hull (Fig 2 in Supplement 1), as do most measures of cortical depth such as the adaptive distance transform [100], while other algorithms do not define a zero-depth reference surface but rely instead on convergence of an algorithm, such as the depth potential map [101]. The shape of the brain is concave in places, resulting in some gyral crowns that do not touch the convex hull. For example, in Fig 3 in Supplement 1, the gyri of the medial temporal lobe are assigned positive depth, resulting in an unreasonably high depth for the folds of that region. Since the convex hull is not suitable for application to brain images, or for surfaces with global concavities, we define and construct a different reference surface that we call the wrapper surface (Fig 5 in

image with a probe of radius r, then erode it with the same probe. This operation is also known as morphological closing, and it is important to carefully set the probe radius. If the radius is too large, the wrapper surface will be similar to the convex hull, and if the radius is too small, the wrapper surface will be too close to the original surface and the travel depth will be close to zero even inside folds. We used

Mindboggle's travel depth algorithm assigns a depth value to every vertex in a mesh, is faster and more accurate than voxel-based approaches, assigns more reasonable path distances that are less sensitive to surface irregularities and imaging artifacts than geodesic distances, and is faithful to the topology of the

values. Bottom left: individually colored folds from the same brain. The red surface shows that folds can be broadly connected, depending on the depth threshold, and therefore do not map one-to-one to anatomical region labels. Top right: The same folds with individually colored anatomical labels. These labels can be automatically or manually assigned (as in the case of this Mindboggle-101 subject).
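The morphological closing used earlier to define the wrapper surface (dilate with a probe of radius r, then erode with the same probe) can be sketched on a binary volume with SciPy. This is a minimal illustration, not Mindboggle's travel depth code: the helper names ball and wrapper_volume are hypothetical, the probe is a voxelized sphere, and the radius is expressed in voxels.

```python
import numpy as np
from scipy import ndimage


def ball(radius):
    """Binary spherical structuring element (the "probe"), radius in voxels."""
    r = int(np.ceil(radius))
    z, y, x = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1]
    return x ** 2 + y ** 2 + z ** 2 <= radius ** 2


def wrapper_volume(brain_mask, radius):
    """Morphological closing: dilation followed by erosion with the same probe."""
    # Pad with background so the probe never runs into the array border.
    pad = int(np.ceil(radius)) + 1
    padded = np.pad(np.asarray(brain_mask, dtype=bool), pad, mode="constant")
    closed = ndimage.binary_closing(padded, structure=ball(radius))
    return closed[pad:-pad, pad:-pad, pad:-pad]
```

The closed volume always contains the original mask and fills in grooves narrower than the probe, so its boundary touches gyral crowns while bridging over folds. Choosing the radius trades off between these two behaviors, as the text notes.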

Bottom right: Individually colored sulci. Mindboggle uses the anatomical labels to segment folds into sulci, defined as folded portions of cortex whose opposing banks are labeled with sulcus label pairs in the DKT labeling protocol [22]. Each label pair is unique to one sulcus and represents a boundary between two adjacent gyri, so sulcus labels are useful to establish correspondences across brains. Portions of folds that are missing are not defined as sulci by the DKT labeling protocol.

A fundus is a branching curve that runs along the deepest and most highly curved portions of a fold (Fig

Since folds are defined as deep, connected areas of a surface, and since folds may be connected to each other in ways that differ across brains, there usually does not exist a one-to-one mapping between folds of one brain and those of another. To address the correspondence problem, we need to find just those portions of the folds that correspond across brains. To accomplish this, Mindboggle segments folds into sulci, which do have a one-to-one correspondence across non-pathological brains (right side of Fig 7). Mindboggle defines a sulcus as a folded portion of cortex whose opposing banks are labeled with one or more sulcus label pairs in the DKT labeling protocol. Each label pair is unique to one sulcus and represents a boundary between two adjacent gyri, and each vertex has one gyrus label. The extract_sulci function assigns vertices in a fold to a sulcus in one of two cases. In the first case, if a vertex has a label that is in only one label pair in the fold, it is assigned that label pair's sulcus if it can be connected through vertices with one of the pair's labels to the boundary between the two labels. In the second case, the segment_regions function propagates labels from a label boundary to vertices whose labels are in multiple label pairs in the fold. Once sulci are defined, the segment_by_region function uses sulcus labels to segment fold fundi into sulcal fundi, which, like sulci, are features with one-to-one correspondence across non-pathological brains.
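The first of the two assignment cases above can be sketched in plain Python. This is a deliberately simplified illustration and not the extract_sulci implementation: the function name assign_unambiguous_sulci and its data layout are hypothetical, the connectivity check to the label boundary is omitted, and the label propagation of the second case is not attempted.

```python
def assign_unambiguous_sulci(vertex_labels, sulcus_label_pairs):
    """Assign a fold vertex to a sulcus when its label is in exactly one label pair.

    vertex_labels: gyrus label for each vertex of one fold.
    sulcus_label_pairs: list indexed by sulcus ID; each entry is a list of
        (label, label) pairs that define that sulcus.
    Returns one sulcus ID per vertex, or -1 where the label belongs to no pair
    or to several pairs present in the fold.
    """
    fold_labels = set(vertex_labels)

    # Keep only label pairs whose two labels both occur in this fold.
    label_to_pairs = {}
    for sulcus_id, pairs in enumerate(sulcus_label_pairs):
        for a, b in pairs:
            if a in fold_labels and b in fold_labels:
                for label in (a, b):
                    label_to_pairs.setdefault(label, set()).add(
                        (sulcus_id, frozenset((a, b)))
                    )

    sulci = []
    for label in vertex_labels:
        candidates = label_to_pairs.get(label, set())
        if len(candidates) == 1:
            sulci.append(next(iter(candidates))[0])
        else:
            sulci.append(-1)  # ambiguous or unmatched: left for the second case
    return sulci
```

Vertices marked -1 here are the ones that the text hands to label propagation, since their labels appear in more than one label pair within the fold or in none.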

Step 7: Compute shape measures for each cortical surface label or sulcus:

To calculate the distance between the descriptors of two shapes, Reuter describes several approaches, e.g., Lp-norm, Hausdorff distance, and weighted distances. One of the more prominent and simple distance measures is the Euclidean distance (L2 norm) of the first N smallest (non-zero) eigenvalues, where N is called the truncation parameter. To account for the linearly increasing magnitude of the eigenvalues (Weyl's law), Reuter recommends dividing each value by its area and its index (done by default in Mindboggle). As an alternative, the Weighted Spectral Distance (WESD) [107] is included in Mindboggle (but not used by default). It computes the Lp-norm of a weighted difference between the vectors of the N smallest eigenvalues. This approach forms a pseudo-metric and also avoids domination of higher components on the final distance, making it insensitive to the truncation parameter N (with a decreasing influence as N gets larger). Additionally, the choice of p (for the Lp-norm) influences how sensitive the metric is to finer as opposed to coarser differences in the shape; as p increases, WESD becomes less sensitive to differences at finer scales.

[111] efficient 3-D implementation of Zernike moments in Matlab, and helped us test our Python implementation to ensure they give consistent results. The number of descriptors grows quadratically with order, so order 20 yields 121 descriptors while order 35 yields 342, for example. Values are generally less than or equal to one, with values much greater than one indicating instability in the calculation, which could be due to the way the mesh is created or due to calculating at an order that is too high given the resolution or size of the object.
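Two of the quantities described above can be sketched in a few lines of NumPy: the Euclidean distance between truncated, normalized spectra, and the number of Zernike descriptors at a given order. The function names are hypothetical and this is not Mindboggle's code. The normalization follows the wording of the text (divide each eigenvalue by the area and by its index), and the descriptor count assumes the usual 3-D Zernike indexing in which a descriptor exists for each pair (n, l) with l ≤ n ≤ order and n - l even.

```python
import numpy as np


def normalize_spectrum(eigenvalues, area):
    """Divide each eigenvalue by the surface area and by its 1-based index."""
    values = np.asarray(eigenvalues, dtype=float)
    index = np.arange(1, values.size + 1)
    return values / area / index


def spectral_distance(spectrum_a, area_a, spectrum_b, area_b, n):
    """Euclidean (L2) distance between the first n normalized eigenvalues.

    Both spectra are assumed to be sorted in ascending order with the zero
    eigenvalue already removed; n is the truncation parameter.
    """
    a = normalize_spectrum(spectrum_a[:n], area_a)
    b = normalize_spectrum(spectrum_b[:n], area_b)
    return float(np.linalg.norm(a - b))


def zernike_descriptor_count(order):
    """Number of (n, l) pairs with l <= n <= order and n - l even."""
    return sum(n // 2 + 1 for n in range(order + 1))


print(zernike_descriptor_count(20))  # 121
print(zernike_descriptor_count(35))  # 342
```

The two printed counts reproduce the figures quoted in the text for orders 20 and 35.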

Mindboggle has been and continues to be subjected to a variety of evaluations (https://osf.io/x3up7/) and applied in a variety of contexts. In this section, we compare related shape measures (3.1), evaluate fundus extraction algorithms (3.2), and evaluate the consistency of shape measures between scans (3.3). We also demonstrate Mindboggle's utility in measuring shape differences between left and right hemispheres (3.4), and in measuring brain shape variation (3.5).

We compared shape measures with one another in a representative individual from the Mindboggle-101 data set (Fig 10) and for the entire data set (Fig 11, Fig 12, and Fig 13) to emphasize to the reader that shape measures are not independent of one another and that care must be taken when comparing differently defined shape measures or when using one as a proxy for another.

higher shape values than travel depth, and may exaggerate depth, such as in the insula (also clearly evident in Fig 6 and Fig 11).

We are aware of only one study directly comparing FreeSurfer with manual cortical thickness measures, where the manual estimates were made in nine gyral crowns of a post-mortem brain, selected for their low curvature and high probability of having been sampled perpendicular to the plane of section [113]. We compared thickinthehead, FreeSurfer, and ANTs cortical thickness estimates in different populations, including the Mindboggle-101 subjects (Fig 13)

Table 2a gives the average across the 41 subjects of the fractional shape differences between MRI scans for each of the 31 left cortical regions, and for each shape measure, and Table 2b gives a statistical summary of the differences. In general, the values are low enough to suggest high inter-scan shape

This table gives a statistical summary of the shape differences between two scans of the same brain for 41 brains. The "mean" column is the average of the mean values in Table 1a, while the other columns

Measuring shape differences between left and right hemispheres
To measure interhemispheric shape differences, we computed the fractional shape difference per cortical region as in the preceding section, replacing inter-scan differences with interhemispheric differences (https://osf.io/dp4zy/), and using all 101 Mindboggle-101 brains. Table 3a gives the average across the 101 subjects of the fractional shape differences between hemispheres for each of the 31 cortical regions, and for each shape measure, and Table 3b gives a statistical summary of the differences. The values are much higher than the corresponding inter-scan differences in the previous section, suggesting that shape differences between hemispheres are greater than shape differences between MRI scans of the same hemisphere.

We organized the data in a nested fashion: brain hemisphere is nested within subject, and subject is nested within laboratory. In addition to the five shape measurements and the three nested classification factors, the data also include three covariates: sex (male, female), age (integer variable), and handedness (left, right; we relabeled two ambidextrous subjects as left-handed). Given the grouped nature of the data, we used linear mixed models for the statistical modeling of the data. We fitted 24 distinct linear mixed models for each shape measure and brain region combination to assess the importance of each of the covariates (sex, handedness, and age as fixed effects) and nested classification factors (laboratory, subject, and brain hemisphere as random effects). For each shape measure, we decomposed the total variance into the variance between laboratories, between subjects within a laboratory, between brain hemispheres within a subject, and within brain hemispheres.

For each shape measure and brain region combination, we used the Bayesian Information Criterion

nesting level. For all shape measures and brain regions, the bulk of the variability was concentrated in the residual, not in the hemisphere ("side"), subject, or laboratory (top of Fig 14).

Mindboggle to brain and non-brain data. Here we will very briefly summarize this article's results and

brain, with some exceptions (such as entorhinal volume). We found that, for the shape measures and populations we studied, shape differences between hemispheres were greater than shape differences between MRI scans of the same hemisphere, and that the variability within each brain hemisphere was higher than the variability between brain hemispheres in a participant or between participants. Finally, we reported which brain regions were significantly correlated with changes in ADNI-MEM cognitive score after a three-year interval as part of an international Alzheimer's challenge.

The Mindboggle software will continue to be subjected to evaluations of its algorithms as well as of its

analyzing interactions among shape measures to find higher order morphological relationships with brain shape differences.

There are many ways to enhance Mindboggle's functionality and applicability to pathological brains.

Taking advantage of different and multiple types of images, atlases, labels, features, and shape measures is a clear way to expand and improve Mindboggle, and the software was built using the Nipype framework specifically to enable modular and flexible inclusion of different algorithms. Even the inputs to Mindboggle can change to take advantage of promising new algorithms that combine surface reconstruction with whole-brain segmentation in a way that is more robust to white-matter abnormalities