Deep convolutional neural networks for segmenting 3D in vivo multiphoton images of vasculature in Alzheimer disease mouse models

The health and function of tissue rely on its vasculature network to provide reliable blood perfusion. Volumetric imaging approaches, such as multiphoton microscopy, are able to generate detailed 3D images of blood vessels that could contribute to our understanding of the role of vascular structure in normal physiology and in disease mechanisms. The segmentation of vessels, a core image analysis problem, is a bottleneck that has prevented the systematic comparison of 3D vascular architecture across experimental populations. We explored the use of convolutional neural networks to segment 3D vessels within volumetric in vivo images acquired by multiphoton microscopy. We evaluated different network architectures and machine learning techniques in the context of this segmentation problem. We show that our optimized convolutional neural network architecture, which we call DeepVess, yielded a segmentation accuracy that was better than both the current state-of-the-art and a trained human annotator, while also being orders of magnitude faster. To explore the effects of aging and Alzheimer's disease on capillaries, we applied DeepVess to 3D images of cortical blood vessels in young and old mouse models of Alzheimer's disease and wild type littermates. We found little difference in the distribution of capillary diameter or tortuosity between these groups, but did note a decrease in the number of longer capillary segments ($>75\mu m$) in aged animals as compared to young, in both wild type and Alzheimer's disease mouse models.


Introduction
The performance of organs and tissues depends critically on the delivery of nutrients and removal of metabolic products by the vasculature. Blood flow deficits due to disease-related factors or aging often lead to functional impairment [1]. In particular, the brain has essentially no energy reserve and relies on the vasculature to provide uninterrupted blood perfusion [2].
Multiple imaging modalities can be used to study vascular structure and dynamics, each offering tradeoffs between the smallest vessels that can be resolved and the volume of tissue that can be imaged. Recent work with several modalities, including photoacoustic microscopy [3], optical coherence tomography [4], and multiphoton microscopy (MPM) [5], enables individual capillaries to be resolved in 3D over volumes approaching 1 mm³ in living animals. The analysis of such images is one of the most critical and time-consuming tasks of this research, especially when it has to be done manually.
For example, in our own work we investigated the mechanisms leading to reduced brain blood flow in mouse models of Alzheimer's disease (AD), which required extracting topology from capillary networks, each with ∼1,000 vessels, from dozens of animals. The manual tracing of these networks required ∼40× the time required to acquire the images, greatly slowing research progress [6]. The labor involved in such tasks limits our ability to investigate the vital link between capillary function and many different diseases.
In this paper, we consider the segmentation of vessels, a core image analysis problem that has received considerable attention [7,8]. As in other segmentation and computer vision problems, in recent years deep neural networks (DNNs) have offered state-of-the-art performance [9]. DNN approaches often rely on formulating the problem as supervised classification (or regression), where a neural network model is trained on some (manually) labeled data. For a survey on deep learning in medical image analysis, see a recent review by Litjens et al. [9].
October 19, 2018
Here, we explore the use of a convolutional neural network (CNN) to segment 3D vessels within volumetric in vivo MPM images. We conduct a thorough study of different network architectures and machine learning techniques in the context of our segmentation problem. We apply the final model, which we call DeepVess, to image stacks of cortical blood vessels in mouse models of AD and wild type littermates. Our experimental results show that DeepVess yields segmentation accuracy that is better than the current state of the art and a trained human annotator, while being orders of magnitude faster.

Related work
Blood vessel segmentation is one of the most common and time-consuming tasks in biomedical image analysis. This problem can be approached in either 2D or 3D, depending on the specifics of the application and analytic technique. The most established blood vessel segmentation methods are developed for 2D retinography [10] and 3D CT/MRI [8].
Among segmentation methods, region-based methods are well-known for their simplicity and low computational cost [11]. For example, Yi et al. [12] developed a 3D region growing vessel segmentation method based on local cube tracking. In related work, Mille et al. [13] used a 3D parametric deformable model based on the explicit representation of a vessel tree to generate centerlines. In recent years, these traditional segmentation methods have become less popular and are considered to be limited in comparison to deep learning methods, because they require handcrafted filters, features, or logical rules and often yield lower accuracy.
Today, in problems that are closely related to ours, various deep learning techniques dominate the state of the art. For instance, in a recent Kaggle challenge for diabetic retinopathy detection within color fundus images, deep learning was used by most of the 661 participant teams, including the top four teams. Interestingly, those top four methods surpassed the average human accuracy. Subsequently, Gulshan et al. [14] adopted the Google Inception V3 network [15] for this task and reached the accuracy of seven ophthalmologists combined. For retinal blood vessel segmentation, Wu et al. [16] used a CNN-based approach to extract the entire connected vessel tree. Fu et al. [17] proposed adding a conditional random field (CRF) to post-process the CNN segmentation output. They further improved their method by replacing the CRF with a recurrent neural network (RNN), which allows them to train the complete network in an end-to-end fashion [18]. Further, Maninis et al. [19] addressed retinal vessel and optic disc segmentation problems using one CNN and could surpass the human expert.
There are 3D capillary image datasets in mice [11] and humans [20] that were segmented using traditional segmentation methods and have illustrated the scientific value of such information, but few such datasets are available.
To the best of our knowledge, there are only two studies that used deep learning for our problem: vascular image analysis of multiphoton microscopy (MPM) images. The first one is by Teikari et al. [21], who proposed a hybrid 2D-3D CNN architecture to produce state-of-the-art vessel segmentation results in 3D microscopy images. The main limitation of their method was the use of 2D convolutions and 2D conditional random fields (CRFs), which restrict the full exploitation of the information along the third dimension. In addition, they trained their network based on static in vitro images rather than in vivo images. Segmentation of in vitro images is often less challenging as they do not include motion artifacts. The second study was conducted by Bates et al. [22], where the authors applied a convolutional long short-term memory RNN to extract 3D vascular centerlines of endothelial cells. Their approach was based on the U-net architecture [23], which is a well-known fully convolutional network [24] widely used for biomedical image segmentation. Bates and colleagues achieved state-of-the-art results in terms of centerline extraction; nevertheless, they reported that certain vessels in the images were combined in the automatic segmentation. Finally, we consider the 3D U-Net [25], which is the volumetric version of the U-net architecture [23] and is regarded by many as state-of-the-art for microscopy image segmentation problems.

Data and Methods
The proposed vasculature segmentation method for 3D in vivo MPM images, DeepVess, consists of (i) pre-processing to remove in vivo physiological motion artifacts due to respiration and heartbeat, (ii) applying a 3D CNN for binary segmentation of the vessel tree, and (iii) post-processing to remove artifacts such as network discontinuities and holes.

Animals
All animal procedures were approved by the Cornell University Institutional Animal Care and Use Committee and were performed under the guidance of the Cornell Center for Animal Resources and Education. We used double transgenic mice (B6.Cg-Tg(APPswe,PSEN1dE9)85Dbo/J, referred to as APP/PS1 mice) that express two human proteins associated with early onset AD, a chimeric mouse/human amyloid precursor protein (Mo/HuAPP695swe) and a mutant human presenilin1 (PS1-dE9). This is a standard model of AD that typically develops amyloid-beta plaque deposition around 6 months of age [26]. Littermate wild type (WT) mice (C57BL/6) served as controls.
Animals were of both sexes and ranged in age from 18 to 31 weeks for young mice and from 50 to 64 weeks for the old mice (6 WT and 6 AD at each age, for a total of 24 mice).

In vivo imaging of cortical vasculature
We used a locally-designed multiphoton microscope [27] for in vivo imaging of the brain vasculature. Glass-covered craniotomies were prepared over parietal cortex, as described previously [28,29]. We waited at least three weeks after the surgery before imaging to give time for the mild surgically-induced inflammation to subside. Windows typically remained clear for as long as 20 weeks. This technique allows us to map the architecture of the vasculature throughout the top 500 µm of the cortex. Briefly, the blood plasma of an anesthetized mouse was labeled with an intravenous injection of Texas Red labeled dextran (70 kDa, Life Technologies). The two-photon excited fluorescence intensity was recorded while the position of the focus of a femtosecond laser pulse train was scanned throughout the brain, providing a three-dimensional image of the vasculature [27].
Imaging was done using 800-nm or 830-nm, 75-fs pulses from a Ti:Sapphire laser oscillator (MIRA HP, pumped by a Verdi-V18, or Vision S, Coherent). Lasers were scanned by galvanometric scanners and focused into the sample using a 1.0 NA, 20X water-immersion objective lens (Carl Zeiss, Inc.). Image stacks were acquired with 645/45 nm (center wavelength/bandwidth) bandpass filters. The ScanImage software package [30] was used to control the whole system. Image stacks were taken with a range of magnifications resulting in lateral voxel sizes from 0.45 to 1.71 µm/pixel, but always 1 µm in the axial direction.

Expert annotation
We implemented a protocol to facilitate the manual 3D segmentation task using ImageJ, an open-source image processing software package [31] [...]

[...] to a breath occurring partway through the raster scanning for an MPM image. In this study, we adopted the non-rigid, non-parametric diffeomorphic demons image registration tool implemented based on the work of Thirion [32] and Vercauteren et al. [33]. Our approach is to register each slice to the previous slice, starting from the first slice as the fixed reference. The diffeomorphic demons algorithm aims to match the intensity values between the reference image and the deformed image, where the cost is computed as the mean squared error. The smoothness prior on the deformation field is implemented via an efficient Gaussian smoothing of gradient fields, and invertibility is ensured via concatenation of small deformations. A Gaussian kernel with a standard deviation of 1.3 was chosen based on our empirical tests with MPM images.

Next, in our pre-processing steps, the 1-99% range of the image intensities in the input image patch was linearly mapped to between -1 and 1, and the extreme 1% of voxels at each end were clipped at -1 and 1. This step, we found, helps with generalizing the model to work well with images taken from other MPM platforms. Finally, the images were resampled to have a voxel size of 1 µm³.
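The intensity normalization step can be illustrated with a minimal NumPy sketch. It is not the DeepVess implementation: the function name `normalize_patch` is hypothetical, and using `np.percentile` to locate the 1% and 99% intensity bounds is an assumption about how the range was determined.

```python
import numpy as np


def normalize_patch(patch, low_pct=1.0, high_pct=99.0):
    """Map the [low_pct, high_pct] percentile range of `patch` to [-1, 1].

    Voxels outside that range are clipped to -1 or 1.
    """
    patch = np.asarray(patch, dtype=np.float64)
    lo, hi = np.percentile(patch, [low_pct, high_pct])
    if hi <= lo:
        # Constant patch: no usable dynamic range (assumed fallback).
        return np.zeros_like(patch)
    scaled = 2.0 * (patch - lo) / (hi - lo) - 1.0
    return np.clip(scaled, -1.0, 1.0)
```

Because the bounds are taken from the patch itself, the output range does not depend on the absolute gain or offset of the microscope, which is one plausible reason such a step would help across imaging platforms.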

Convolutional Neural Network Architectures
Our aim in this work is to design a system that takes an input stack of images (in 3D) and produces a segmentation of vessels as a binary volume of the same size. For this task, as we elaborate below, we explored different CNN architectures using validation performance as our guiding metric. Our baseline CNN architecture starts with a 3D input image patch (tile), which has 33 × 33 × 5 voxels (in x, y, and z directions). The first convolution layer uses a 7 × 7 × 5 voxel kernel with 32 features to capture 3D structural information within the neighborhood of the targeted voxel. The output of this layer, 32 nodes of 27 × 27 × 1 voxel images, enters a max pooling layer with a 2 × 2 kernel and 2 × 2 strides. Another convolution layer with a 5 × 5 × 1 kernel and 64 features, followed by a similar max pooling layer, is then applied before the application of the fully connected dense layer with 1024 hidden nodes and dropout [34] with a probability value of 50%. The output is a two-node layer, which represents the probability that the voxel at the center of the input patch belongs to tissue vs. vessel.
The CNN takes an input 3D patch and produces a segmentation label for the central voxel. All the convolution layers have a bias term and rectified linear unit (ReLU) as the element-wise nonlinear activation function.
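To make the patch-wise formulation concrete, the sketch below extracts the 33 × 33 × 5 input patch centered on a given voxel. The function name `extract_patch`, the (x, y, z) axis order, and the reflect padding used near the volume border are all assumptions made for illustration; the text does not say how border voxels were handled.

```python
import numpy as np


def extract_patch(volume, center, size=(33, 33, 5)):
    """Return the patch of shape `size` centered on voxel `center`.

    `size` must be odd along every axis so that a central voxel exists.
    """
    half = [s // 2 for s in size]
    # Pad so that patches near the border are still full-sized (assumed policy).
    padded = np.pad(volume, [(h, h) for h in half], mode="reflect")
    # After padding, original index c sits at c + h, so the patch starts at c.
    slices = tuple(slice(c, c + s) for c, s in zip(center, size))
    return padded[slices]
```

A classifier of this kind would be evaluated once per voxel (or per small output region), which is why the choice of patch size directly sets the field of view available for each decision.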
Different kernel sizes for the 3D convolution layers were explored in our experiments. Note that each choice in the architecture parameters (including the kernel size) corresponded to a different input patch size. As the validation results summarized in Table S2 indicate, the best performing baseline architecture had a 3D convolution layer with a 7 × 7 × 7 voxel kernel and an input patch size of 33 × 33 × 7. Based on this result we chose an input patch size of 33 × 33 × 7 as the optimal field of view (FOV) for segmentation. We then explored the effect of the number of convolutional and max pooling layers. As summarized in Table S3, the best architecture had three 3D convolution layers with a 3 × 3 × 3 voxel kernel, a max pooling layer, followed by two convolution layers with a 3 × 3 voxel kernel, a max pooling layer, and a fully connected neural network with a 1024-node hidden layer and FOV of 33 × 33 × 7. Finally, we investigated the performance for different output patch sizes, ranging from 1 voxel to 5 × 5 × 5 voxels, and found that performance was improved further when the output is the segmentation of the central 5 × 5 × 1 patch and not just a single voxel. A larger output area has the advantage of accounting for the structural relationship between adjacent voxels in their segmentation. The optimal CNN architecture scheme is shown in Fig 1.
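The layer sizes quoted above can be checked with a little shape arithmetic. The sketch below is only one reading that reproduces the stated numbers: it assumes unpadded, stride-1 convolutions and 2 × 2 in-plane max pooling that rounds up, neither of which is stated in the text, and it follows the text's count of three 3 × 3 × 3 layers followed by two 3 × 3 layers. The helper names are hypothetical.

```python
import math


def conv_valid(shape, kernel):
    """Output size of an unpadded, stride-1 convolution."""
    return tuple(n - k + 1 for n, k in zip(shape, kernel))


def pool_ceil(shape, window):
    """Output size of pooling with stride equal to the window, rounding up."""
    return tuple(math.ceil(n / w) for n, w in zip(shape, window))


def trace_optimal(fov=(33, 33, 7)):
    """Follow the spatial size of a patch through the optimal architecture."""
    shape = fov
    for _ in range(3):                      # three 3 x 3 x 3 convolutions
        shape = conv_valid(shape, (3, 3, 3))
    shape = pool_ceil(shape, (2, 2, 1))     # first max pooling
    for _ in range(2):                      # two 3 x 3 (in-plane) convolutions
        shape = conv_valid(shape, (3, 3, 1))
    shape = pool_ceil(shape, (2, 2, 1))     # second max pooling
    return shape


print(conv_valid((33, 33, 5), (7, 7, 5)))   # baseline first layer: (27, 27, 1)
print(trace_optimal())                      # (5, 5, 1)
```

Under these assumptions the 33 × 33 × 7 field of view collapses to 5 × 5 × 1 before the fully connected layer, which matches the output region size reported for the optimal architecture.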

Performance metrics
There are different performance metrics to compare agreement between an automated segmentation method and a "ground truth" (GT) human annotation. In the context of binary segmentation, the foreground (F) will be the positive class, and the negative class will be the background. True positive (TP) is the number of voxels labeled as foreground by both the automatic segmentation and the GT; true negative (TN), false positive (FP), and false negative (FN) can be defined in a similar fashion.
Based on these, we can compute sensitivity and specificity. For example, sensitivity is the percentage of GT foreground voxels that are labeled correctly by the automatic segmentation (ASeg). Mathematically, we have:

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}.$$

The Dice coefficient (DC), Jaccard index (JI), and modified Hausdorff distance (MHD) are another set of commonly used segmentation performance metrics. JI is defined as the ratio between the number of voxels labeled as foreground by both GT and ASeg to the total number of voxels that are called foreground by either GT or ASeg. DC is very similar to JI, except that it values TP twice as much as FP and FN:

$$JI = \frac{TP}{TP + FP + FN}, \qquad DC = \frac{2\,TP}{2\,TP + FP + FN}.$$

JI and DC are useful metrics when the number of the foreground voxels is much less than background and the detection accuracy of the foreground voxels is more important compared to background voxel detection, which is the case for 3D imaging of vasculature.
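The overlap metrics can be computed directly from two binary volumes, as in the following NumPy sketch. The function name `overlap_metrics` is hypothetical, and the code assumes that both foreground and background voxels are present so that no denominator is zero.

```python
import numpy as np


def overlap_metrics(gt, seg):
    """Sensitivity, specificity, Dice, and Jaccard for two binary volumes."""
    gt = np.asarray(gt, dtype=bool)
    seg = np.asarray(seg, dtype=bool)
    tp = np.count_nonzero(gt & seg)      # foreground in both
    tn = np.count_nonzero(~gt & ~seg)    # background in both
    fp = np.count_nonzero(~gt & seg)     # foreground only in ASeg
    fn = np.count_nonzero(gt & ~seg)     # foreground only in GT
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "jaccard": tp / (tp + fp + fn),
    }
```

Neither Dice nor Jaccard uses TN, which is why they remain informative when background voxels vastly outnumber vessel voxels.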
On the other hand, MHD [35] quantifies accuracy in terms of distances between boundaries, which might be appropriate when considering tubular structures. For each boundary point $a$ in image $A$, the closest Euclidean distance to any boundary point $b$ inside image $B$ is computed: $d(a, B) = \min_{b \in B} \lVert a - b \rVert$. This is then averaged over all $N_a$ boundary points in $A$: $d(A, B) = \frac{1}{N_a} \sum_{a \in A} d(a, B)$ [36]. MHD is then defined as:

$$\text{MHD}(A, B) = \max\{ d(A, B),\ d(B, A) \}.$$

Note that in the segmentation setting, $A$ and $B$ can represent the foreground boundaries in the automatic and GT segmentations, respectively. Finally, we can compute the MHD on centerlines instead of boundaries, a metric we call MHD-CL.
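A brute-force NumPy sketch of the modified Hausdorff distance follows. The function names are hypothetical, and the inputs are assumed to be arrays of boundary (or centerline) point coordinates with one row per point; extracting those points from a segmentation is not shown.

```python
import numpy as np


def directed_mean_distance(a, b):
    """Mean over points in `a` of the distance to the nearest point in `b`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    diff = a[:, None, :] - b[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))
    return dists.min(axis=1).mean()


def modified_hausdorff(a, b):
    """Modified Hausdorff distance: the larger of the two directed means."""
    return max(directed_mean_distance(a, b), directed_mean_distance(b, a))
```

The pairwise distance matrix needs memory proportional to the product of the two point counts, so for large boundaries a spatial index such as a k-d tree would be a more practical choice.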

Training and implementation details
In training our segmentation algorithms, we used the cross-entropy loss function, defined as:

$$L = -\sum_i \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right],$$

where $y_i$ is the GT label and $p_i$ is the model's output as the probability of the target voxel $i$ belonging to the foreground. We trained our model using Adam stochastic optimization [37] with a learning rate of $10^{-4}$ for 100 epochs during architecture exploration and a learning rate of $10^{-6}$ for 30,000 epochs during the fine tuning of model parameters for the proposed architecture, with a mini-batch size of 1000 samples. The fine tuning took one month on one NVIDIA TITAN X GPU. We implemented our models in Python using TensorFlow™ [38].
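For reference, the binary cross-entropy between GT labels and predicted foreground probabilities can be written in a few lines of NumPy. This is a sketch rather than the training code: the function name is hypothetical, the loss is summed over voxels (whether the original loss was summed or averaged is an assumption), and probabilities are clipped by a small `eps` to avoid taking the logarithm of zero.

```python
import numpy as np


def cross_entropy(y, p, eps=1e-12):
    """Binary cross-entropy summed over voxels.

    y: GT labels (0 = background, 1 = foreground).
    p: predicted foreground probabilities, same shape as y.
    """
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return float(-np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```

An uninformative prediction of 0.5 for every voxel costs log 2 per voxel, which gives a convenient baseline when reading training curves.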

Post-processing
CNN segmentation results contain some segmentation artifacts such as holes inside the vessels, rough boundaries, or isolated small objects. In order to remove these artifacts, the holes within the vessels were filled. This was followed by application of a 3D mean filter with a 3 × 3 × 3 voxel kernel and the removal of small foreground objects, e.g., smaller than 100 voxels. This result was used to compare to the gold standard.
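The three post-processing steps can be sketched with `scipy.ndimage`. This is an illustration under stated assumptions, not the DeepVess code: the function name `postprocess` is hypothetical, hole filling is done in 3D, the mean-filtered volume is re-binarized at 0.5, and connected components use SciPy's default connectivity. None of these details are given in the text.

```python
import numpy as np
from scipy import ndimage


def postprocess(seg, min_size=100):
    """Fill holes, smooth with a 3 x 3 x 3 mean filter, drop small objects."""
    seg = np.asarray(seg, dtype=bool)
    # 1. Fill enclosed holes inside the vessels.
    filled = ndimage.binary_fill_holes(seg)
    # 2. Mean filter, then threshold back to a binary mask (assumed 0.5).
    smoothed = ndimage.uniform_filter(filled.astype(np.float32), size=3) > 0.5
    # 3. Remove connected foreground objects smaller than `min_size` voxels.
    labels, _ = ndimage.label(smoothed)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_size
    keep[0] = False  # label 0 is background
    return keep[labels]
```

Note that the mean filter erodes sharp corners and thin protrusions, so the size threshold is applied to the smoothed objects rather than to the raw CNN output.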

Analysis of experimental data sets
To characterize the cortical vasculature of the experimental animals, we identified capillary segments by calculating centerlines from the segmented image data. Our centerline extraction method includes dilation and thinning operations, in addition to some centerline artifact removal steps. The binary segmentation image was first thinned using the algorithm developed by Lee et al. [39]. The result was then dilated using a spherical kernel with a radius of 5 voxels to improve the vessel connectivity, which was followed by mean filtering with a 3 × 3 × 3 voxel kernel and removing holes from each cross section. Next, a thinning step was applied again to obtain the new centerline result. The original segmented image was dilated using a spherical kernel with a radius of 1 voxel to act as the mask for the centerlines with the goal of improving the centerline connectivity. A vessel is defined as a segment between two bifurcations. The following rules were applied to the resulting centerlines repeatedly until no further changes could be made:
1. Remove any vessels with one end not connected to the network (i.e., dead end) and with length smaller than 11 voxels.
2. Remove single voxels connected to a junction.
3. Remove single voxels with no connections.
4. Remove vessel loops with length of one or two voxels.
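The iterative nature of these rules can be illustrated for rule 1 with a small pure-Python sketch. The representation is hypothetical: each vessel is a tuple `(node_a, node_b, length_in_voxels)`, and an end counts as a dead end when no other vessel shares its node. The sketch applies the rule literally (exactly one free end) and does not merge vessels when a bifurcation disappears, which a full implementation would need to handle.

```python
from collections import Counter


def prune_dead_ends(segments, min_length=11):
    """Repeatedly drop short vessels that have exactly one unconnected end."""
    segments = list(segments)
    while True:
        # Count how many vessel ends meet at each node.
        degree = Counter()
        for a, b, _ in segments:
            degree[a] += 1
            degree[b] += 1
        kept = [
            (a, b, length)
            for a, b, length in segments
            if not (length < min_length
                    and (degree[a] == 1) != (degree[b] == 1))
        ]
        if len(kept) == len(segments):
            return kept  # no further changes
        segments = kept
```

Removing one spur can expose another, which is why the rule has to be repeated until the network stops changing.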

Results
We conducted a systematic evaluation of several network architecture parameters in order to optimize segmentation accuracy of images of mouse cortex vasculature from MPM. We emphasize that this exploration was all based on performance on the validation dataset and the final results presented reflect the model accuracy on an independent test dataset. The detailed performance results for some of the tested architectures are reported in Tables S2 and S3. The optimal architecture, DeepVess, [...]. Furthermore, we implemented two state-of-the-art methods [21,25], and an improved version of the method of Teikari et al. [21], where we changed the 2D convolutional kernels into 3D kernels and inserted a fully connected neural network layer at the end, based on the suggestion in the discussion of their paper. Table 1 summarizes the comparison between the performance of our optimal architecture on the test dataset, with and without the post-processing step, and that of two state-of-the-art methods and a second human annotator, who provides a measure of the inter-human variability.
These results, as well as Fig S1, demonstrate that DeepVess outperforms both the state-of-the-art methods [21,25] and the trained human annotator on the test dataset in terms of sensitivity, Dice index, Jaccard index, and boundary modified Hausdorff distance (MHD). The proposed method does not outperform the benchmarks in specificity because of the relatively higher rate of false positive voxels. Yet we note that the slightly lower specificity is still very high (above 98%).

Capillary alteration caused by aging and Alzheimer's disease
Strong correlations between vascular health, brain blood flow and AD suggest that mapping the microvascular network is critical to the understanding of cognitive health in aging [40]. To explore this question, we imaged the cortical vascular networks in young and old mouse models of AD (young AD and old AD) and their young and old wild type littermates (young WT and old WT). Imaged volumes ranged from 230 × 230 to 600 × 600 µm² in x-y and 130 to 459 µm in the z direction. We imaged 6 animals per group, with at least 3000 capillary segments analyzed for each group.

Fig 2. Slice-wise Dice index of DeepVess vs. manual annotation by a trained person and the state-of-the-art methods [21,25] compared to the gold standard of the expert human annotation. The central red mark is the median, and the top and bottom of the box are the third and first quartiles, respectively. The whiskers indicate the range of data. DeepVess has a higher median value in comparison to Teikari et al. [21], Çiçek et al. [25], and the human annotator (Wilcoxon signed-rank test, $p = 2.98 \times 10^{-23}$, $p = 2.59 \times 10^{-32}$, and $p = 2.8 \times 10^{-28}$, respectively).
The resulting 3D stacks of images were preprocessed, segmented with DeepVess, and post-processed as discussed in the previous sections. Centerlines were extracted and individual vessel segments were identified. To analyze capillaries while excluding arterioles and venules, only vessel segments less than 10 µm in diameter were included [6,41,42]. Three metrics were selected to characterize the vascular network. There were subtle shifts (∼0.25 µm) in the diameter distribution between groups, but no clear trend across old/young or WT/AD. However, we observed a systematic decrease in the number of longer (>75 µm) capillaries in older animals as compared to young in both WT and AD mice. We compared these metrics between the groups using the Kruskal-Wallis test followed by Bonferroni multiple comparison correction [43] (Table S1).

Discussion
The segmentation of 3D vasculature images is a laborious task that slows down the progress of biomedical research and constrains the use of imaging in clinical practice. There has been significant research into tackling this problem via image analysis methods that reduce or eliminate human involvement. In this work, we presented a CNN approach, which surpasses the state-of-the-art vessel segmentation methods [21,25] as well as a trained human annotator. The proposed algorithm, DeepVess, segments a 3D in vivo vascular MPM image with ten million voxels in ten minutes on a single NVIDIA TITAN X GPU, a task that takes 30 hours for a trained human annotator to complete manually. DeepVess implements pre- and post-processing tools to deal with in vivo MPM images that suffer from different motion artifacts. DeepVess is freely available at https://github.com/mhaft/DeepVess and can be used immediately by researchers who use MPM for vasculature imaging. Also, our model can be fine-tuned further with training samples for other 3D vasiform structures or other imaging modalities.
In order to characterize the performance of DeepVess, we compared the automated segmentation to an expert manual segmentation [...] as altered diameters [5]. However, the images often suffer from poor signal to noise and motion artifacts. An additional challenge is that unlabeled, moving red blood cells in the vessel lumen cause dark spots and streaks that move over time. Disease models are often especially challenging because inflammation and tissue damage can further degrade imaging conditions. The segmentation method developed in this work provides robust and efficient analysis, which enabled us to quantify and compare capillary diameters and other vascular parameters from in vivo cortex images across multiple animals, with varying age as well as across WT mice and AD models.

Many studies have shown anatomical and physiological differences in microvasculature associated both with age and AD, such as changes in composition of large vessel walls' smooth muscles [45], increased collagen VI in microvascular basement membranes and their thickening in AD [46], and age-associated reduction of microvascular plasticity and the ability of the vessels to respond appropriately to changes in metabolic demand [47]. For the vascular parameters of segment length, diameter, and tortuosity considered here, previous work has shown that AD mouse models have increased tortuosity in cortical penetrating arterioles as compared to WT mice [48,49]. Our analysis of capillaries excluded these vessels. [...] young and old animals or between wild type and AD mouse models. There was a decrease in the number of long capillary segments in the aged animals compared to young in both the wild type and AD groups. These findings may not generalize across all ages and mouse models of AD. Sonntag et al. [1] argue that changes in vasculature due to aging might be non-linear and multi-phasic. For instance, two studies showed that the capillary density increases during adulthood and then declines in more advanced age [50,51].

Several previous studies have characterized the average diameters of cortical capillaries in mice, as summarized in Table 2. A wide range of imaging approaches is used in these various studies, and data from both live animal and postmortem analysis are included. It is possible that some of these differences emerge when tissues are processed rather than measured in vivo as was done here. Studies based on sectioned tissue sample the 3D vascular architecture differently, so it is difficult to make direct comparisons between datasets. Measures of capillaries depend on the definition of capillaries. Here it was based on a threshold diameter of 10 µm, which could explain some of the variability in the literature. Not surprisingly given the differences in approach and sample preparation, there is significant disagreement between reported average diameters. Some differences may, however, reflect differences in vasculature across strains and ages of animals. Therefore, the proposed fully automated objective segmentation of 3D in vivo images of the vasculature can be used to reduce the variability due to sample preparation and imaging/analysis approach, allowing such strain and age differences to be elucidated clearly.

While DeepVess offers very high accuracy in the problem we consider, there is room for further improvement and validation, in particular in the application to other vasiform structures and modalities. For example, other types of (e.g., non-convolutional) architectures such as long short-term memory (LSTM) can be examined for this problem. Likewise, a combined approach that treats segmentation and centerline extraction methods together, such as the method proposed by Bates et al. [22], in a single complete end-to-end learning framework might achieve higher centerline accuracy levels.

Conclusions
Here, we presented DeepVess, a 3D CNN segmentation method together with essential pre- and post-processing steps, to fully automate the vascular segmentation of 3D in vivo MPM images of murine brain vasculature. DeepVess promises to expedite biomedical research on the differences in angioarchitecture and the impact of such differences by removing the laborious, time-consuming, and subjective manual segmentation task from the analysis pipelines and eliminating subjective image analysis results. We hope the availability of our open source code and reported results will facilitate and [...]

Fig 1. The optimal 3D CNN architecture. The field of view (FOV), i.e. the input patch size, is 33 × 33 × 7 voxels and the output is the segmentation of the 5 × 5 × 1 patch (region of interest, ROI) at the center of the patch. The convolution kernels are 3 × 3 × 3 voxels for all the layers and ReLU is used as the element-wise nonlinear activation function. The first three convolution layers have 32 channels and are followed by pooling. The second three convolution layers have 64 channels. The output of convolution layers is 5 × 5 × 1 voxels with 64 channels, which is fed to a fully connected neural network with a 1024-node hidden layer. The final result has 5 × 5 × 1 voxels with two channels representing the probability of the foreground and background label associations.
We next examined the quality of the vessel centerlines derived from the different segmentations. Using the centerline modified Hausdorff distance (CL MHD) as a centerline extraction accuracy metric, DeepVess (CL MHD [DeepVess] = 3.03) is substantially better than the state-of-the-art methods (CL MHD [Teikari et al.] = 3.72, CL MHD [Çiçek et al.] = 6.13). But there is still room for improvement in terms of automatic centerline extraction, as none of the automatic methods yielded scores as good as the trained human annotator (CL MHD [human annotator] = 2.73). In MPM, the variation in the signal to noise as a function of imaging depth leads to changes in image quality between image slices. The performance of a segmentation method should therefore be assessed by analyzing slices separately. Fig 2 illustrates the boxplot of slice-wise Dice index (DI) values from the x-y planes within the 3D MPM image dataset. DeepVess had a higher DI in comparison to the results of Teikari et al. and the trained annotator. However, there was more variation compared to the other two results, which implies the possibility and need for further improvements.
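A slice-wise Dice index of the kind described here can be computed with a short NumPy sketch. The function name `slicewise_dice` is hypothetical, the z axis is assumed to be the last array axis, and slices in which both segmentations are empty are reported as NaN (an assumed convention, since the Dice index is undefined there).

```python
import numpy as np


def slicewise_dice(gt, seg, axis=2):
    """Dice index of each x-y slice of two binary volumes."""
    gt = np.moveaxis(np.asarray(gt, dtype=bool), axis, 0)
    seg = np.moveaxis(np.asarray(seg, dtype=bool), axis, 0)
    scores = []
    for g, s in zip(gt, seg):
        denom = g.sum() + s.sum()
        overlap = np.logical_and(g, s).sum()
        scores.append(2.0 * overlap / denom if denom else np.nan)
    return np.array(scores)
```

Plotting the resulting values against depth, or summarizing them in a boxplot, shows whether accuracy degrades in the deeper, noisier slices.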

For each capillary segment, we calculated the diameter averaged along the length (Fig 3.A), the length (Fig 3.B), and the tortuosity, defined as the length divided by the Euclidean distance between the two ends (Fig 3.C). The strong agreement between the measurements based on DeepVess and the manual measurements by Cruz-Hernandez et al. [6] confirms that the proposed pipeline yields unbiased and accurate metrics to analyze capillary segments. The distributions of capillary diameter, length, and tortuosity varied little between young and old mice or between WT and AD genotype (Table S1).
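The length and tortuosity definitions can be made concrete with a short sketch. It assumes, hypothetically, that each capillary centerline is available as an ordered list of 3D points in µm; the function names are illustrative only.

```python
import math


def segment_length(points):
    """Path length along an ordered list of centerline points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def tortuosity(points):
    """Path length divided by the straight-line distance between the ends."""
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        return math.inf  # closed loop: tortuosity is unbounded
    return segment_length(points) / chord
```

For example, a straight segment has a tortuosity of exactly 1, while an L-shaped path from (0, 0, 0) through (3, 0, 0) to (3, 4, 0) has a length of 7 µm over a 5 µm chord, giving 1.4.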

Fig 3. Comparison of capillaries between young and old mice with WT and AD genotype. The relative probability and cumulative distribution function (CDF) of the (A) diameters, (B) length, and (C) tortuosity based on all capillaries aggregated within each of the four groups. We compared these metrics between the groups using the Kruskal-Wallis test followed by Bonferroni multiple comparison correction [43] (Table S1).

Using a large database [...]

Fig 5. Comparison of DeepVess and the gold standard human expert segmentation results in image planes as shown in Fig 4. Imaging is generally higher quality at planes closer to the sample surface. (Left column) Image intensity shown with gray scale after motion artifact removal. The dark spots within the vessels are red blood cells that do not take up the injected dye. (Middle column) Comparison between CNN (red) and the expert (green) segmentation results overlaid on images. Yellow shows agreement between the two segmentations. (Right column) Shannon entropy, which is a metric of CNN segmentation uncertainty computed with 50% dropout at test-time [44]. The boundaries of vessels with high entropy values, shown in warmer colors, demonstrate the uncertainty of CNN results at those locations. Scale bar is 50 µm.
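An uncertainty map of this kind can be sketched as follows. The caption does not describe how the entropy was derived from the dropout runs, so the recipe here is an assumption: the foreground probability is averaged over several stochastic forward passes with dropout left on, and the binary Shannon entropy of that mean is reported in bits. The function name and the array layout (passes along the first axis) are hypothetical.

```python
import numpy as np


def dropout_entropy(prob_samples, eps=1e-12):
    """Binary Shannon entropy (bits) of the mean foreground probability.

    prob_samples: array of shape (T, ...) with the foreground probability
    from T stochastic forward passes (dropout active at test time).
    """
    p = np.clip(np.mean(prob_samples, axis=0), eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
```

The entropy peaks at 1 bit where the passes disagree or hover around 0.5, which is expected along vessel boundaries, and falls toward 0 where every pass gives the same confident label.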

Fig S1. Jaccard as a measure of the model accuracy. The DeepVess results surpass the human annotator result on all three of the training, validation, and test datasets. The human annotator and DeepVess results are shown in dashed and solid lines, respectively. The differences between the three datasets' results for both DeepVess and the human annotator are due to variability of the MPM image quality. The constant difference between the DeepVess and human annotator results confirms the avoidance of overfitting.

Table 1. Comparison of our proposed CNN architecture, manual annotation by a trained person, and two state-of-the-art methods [21,25] to the gold standard of the expert human annotation. The CNN surpasses both the human annotator and the two state-of-the-art methods in terms of sensitivity as well as Dice index, Jaccard index, and boundary modified Hausdorff distance (MHD), which are the three metrics that are widely used in segmentation.
Slice-wise Dice Index of DeepVess vs. manual annotation by a trained person and the state-of-the-art methods To explore this question, we imaged the cortical vascular networks in

Table 2. Comparison of measured mouse capillary diameters from different studies.

Table S3. The results of investigating different architectures.

Table S4. Results of our proposed CNN architecture and the state-of-the-art methods [21,25], compared to the gold standard of the expert human annotation on the second independent dataset from a different mouse and voxel size with lower SNR. The CNN surpasses both of them in terms of sensitivity, Dice index, Jaccard index, and boundary modified Hausdorff distance (MHD).