Detection of axonal synapses in 3D two-photon images

Studies of structural plasticity in the brain often require the detection and analysis of axonal synapses (boutons). To date, bouton detection has been largely manual or semi-automated, relying on a step that traces the axons before detecting the boutons. If tracing the axon fails, the accuracy of bouton detection is compromised. In this paper, we propose a new algorithm that does not require tracing the axon to detect axonal boutons in 3D two-photon images taken from the mouse cortex. To find the most appropriate techniques for this task, we compared several well-known algorithms for interest point detection and feature descriptor generation. The final algorithm has the following main steps: (1) a Laplacian of Gaussian (LoG) based feature enhancement module to accentuate the appearance of boutons; (2) a Speeded Up Robust Features (SURF) interest point detector to find candidate locations for feature extraction; (3) non-maximum suppression to eliminate candidates that were detected more than once in the same local region; (4) generation of feature descriptors based on Gabor filters; (5) a Support Vector Machine (SVM) classifier, trained on features from labelled data, to distinguish between bouton and non-bouton candidates. Our method achieved a Recall of 95%, Precision of 76%, and F1 score of 84% on a new dataset that we make available for assessing bouton detection. On average, Recall and F1 score were significantly better than those of the current state-of-the-art method, while Precision was not significantly different. In conclusion, we demonstrate that our approach, which is independent of axon tracing, can detect boutons to a high level of accuracy, and improves on the detection performance of existing approaches. The data and code (with an easy-to-use GUI) used in this article are available from open source repositories.


Introduction
Imaging in neuroscience is widely used to study dynamic processes such as structural and functional neuronal plasticity. Three-dimensional, high-resolution images of neurons were acquired in the following experiments. Mice were anesthetized with isoflurane, an inhalation anaesthetic, and secured to a fixed support under the microscope. To prevent the eyes from dehydrating, Lacri-Lube (Allergan) was applied. Body temperature (37˚C) was regulated with an underlying heat pad, and rehydration was administered with isotonic saline solution (i.p.) as required during long imaging sessions. Depth of anesthesia was closely monitored with a video camera and by regularly checking reflexes (toe pinch) and respiratory rate. An Olympus 4×, 0.13 numerical aperture (NA) objective was used to identify characteristic blood vessel patterns, in order to relocate axons imaged in previous sessions. An Olympus 40×, 0.80 NA water-immersion objective was used to acquire the images (512 × 512 pixels; 0.147 μm per pixel in the x, y plane and 1 μm steps in z; point spread function characterised by full width at half maximum (FWHM) values of 0.45 × 0.45 × 2.5 μm in x, y, z). Either a water-repellent pen or Vaseline (pure petroleum jelly) was applied around the cranial window to stabilize the meniscus for the 40× objective. A pulsed 910 nm laser beam was used, never exceeding 70 mW at the back focal plane. Each imaging session typically lasted 60-90 min, during which time up to 40 image stacks were collected.

Algorithm details
Overview of bouton detection method. To make use of the varicosity of boutons we decided on the following architecture. Our bouton detection algorithm begins with a feature enhancement step, applied to the mean intensity projections of two-photon stacks, which strengthens bouton-specific signals. Following this, an interest point detector is used to identify candidate boutons, and a feature vector is constructed for each region of interest (ROI). With the local maxima of boutons increased by the blob feature enhancement module, the interest point detection module identifies candidate boutons, which significantly improves the scores (Precision, Recall, F1 score) and computation speed. In addition, a custom local maxima suppression algorithm is used to move candidate boutons to their local maxima. Candidate boutons are then classified as boutons or non-boutons using an SVM based on the extracted features. Lastly, we extend the detected 2D locations to 3D. In the process of choosing the most appropriate interest point detector and feature descriptor, we compared several well-known approaches and chose the best methods for our algorithm. A summary of the main steps of the algorithm is shown in the accompanying figure.

Feature enhancement module. The purpose of the Feature Enhancement module is to enhance structures corresponding to boutons (Fig 3). Because the varicosity of the boutons yields a blob-like image structure, we decided to use a negative Laplacian of Gaussian (LoG) convolution kernel with a set radius, as it has been shown to enhance such types of structure well [24]. The Gaussian kernel with a scale parameter, σ, is defined as

G(x, y; σ) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))

where x and y denote spatial position in the imaging plane, and σ is the standard deviation of the Gaussian function G(x, y; σ). The LoG kernel may be thought of as a "dark blob" detector and is expressed as

LoG(x, y; σ) = G_xx(x, y; σ) + G_yy(x, y; σ) = ((x² + y² − 2σ²) / (2πσ⁶)) exp(−(x² + y²) / (2σ²))

at a certain scale-space representation LoG(x, y; σ), where G_xx and G_yy denote the partial second-order derivatives in x and y, respectively. By converting to polar coordinates and considering only the radial distance, ρ (because LoG(x, y; σ) is circularly symmetric), the value of σ that gives the maximum response for an image, I(x, y), containing light blobs with a radius, r, is such that the derivative of the response with respect to σ vanishes at ρ = r. This leads to:

σ = r / √2

Convolution of the image with the LoG kernel increases the sensitivity of the bouton detection by accentuating local extrema corresponding to blob-like structures in the image. In our case, we use the prior value σ = 4, which reflects the average diameter of a bouton (average of 20 boutons from this dataset, σ = 4.56 pixels; 512 × 512 resolution; 0.147 μm per pixel). Convolving the image with −LoG(x, y; σ) suppresses the intensity of image structures significantly smaller than σ. By moving to a coarser scale image, bouton detection specificity is increased with the disappearance of small structures that could become false positive instances of boutons in the next step of processing (Fig 2). This is particularly important, due to the scale-invariance of the succeeding interest point module for detecting candidate boutons. Moreover, the smoothing effect of the LoG kernel increases the stability of the subsequent interest point detection by reducing shot noise, and increasing the image response of blob-like structures. One disadvantage of this is that it may lead to duplicate detections of the same bouton. To address this, a custom non-maximum suppression step is used.

Fig 3. Example of the bouton detection method step by step. A, A mean intensity projection image from the 3D axon dataset. B, The same image convolved with the LoG mask. C, In this example, interest points were detected using SURF (green "+" signs). D, Following SVM classification, the final proposed boutons are plotted on the mean projection image (white "+" signs). (https://doi.org/10.1371/journal.pone.0183309.g003)
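As a concrete illustration, the −LoG enhancement step can be sketched in pure NumPy using the paper's σ = 4; this is a minimal sketch (function names and the zero-padded convolution are our own choices), not the authors' released implementation:

```python
import numpy as np

def neg_log_kernel(sigma, size=None):
    """Negative Laplacian-of-Gaussian kernel; sigma = 4 matches the
    average bouton size reported in the paper."""
    if size is None:
        size = int(6 * sigma) + 1          # cover roughly +/- 3 sigma
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    log = (r2 - 2 * sigma**2) / (2 * np.pi * sigma**6) * np.exp(-r2 / (2 * sigma**2))
    return -log                            # bright blobs become positive peaks

def enhance(image, sigma=4):
    """Same-size, zero-padded convolution with the -LoG kernel.
    The kernel is symmetric, so correlation and convolution coincide."""
    k = neg_log_kernel(sigma)
    half = k.shape[0] // 2
    padded = np.pad(image, half, mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(k.shape[0]):
        for dx in range(k.shape[1]):
            out += k[dy, dx] * padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out
```

After this step, blob-like structures of roughly the bouton scale stand out as local maxima, which is what the interest point detector exploits next.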
Interest point detection module. The purpose of using an interest point detector is to find candidate locations for the boutons within the image. This allows the remainder of the bouton detection process to focus computationally expensive routines on separating candidate boutons from non-bouton structures, which is far more efficient than pixel-wise application of a feature descriptor (see Feature Descriptor module). We selected three widely used interest point detectors (Harris, SURF keypoint detector and SIFT keypoint detector), focusing on those that are more likely to be appropriate for detecting the blob-like nature of boutons. We explain the reasons for these choices below, evaluating their performance in a later section.
The Harris corner detector relies on the principle that at a corner, the image intensity within a local window will change considerably when the window is shifted in different directions [25]. Because a bouton is a Gaussian-like object [26] that changes its intensity along all radial directions, it displays some attributes that might be considered to be corner-like. The Harris detector makes use of local partial derivatives in intensity over space, computing a 2 × 2 Harris matrix [25]. The eigenvalues of the matrix yield a corner response at all locations in the image, which is then thresholded to produce candidate bouton locations. We used the Matlab implementation of the Harris detector in this work.
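The Harris response described above can be sketched in a few lines of NumPy; this is a conceptual sketch (helper names and the small box smoothing window are our own simplifications), not the Matlab implementation used in the paper:

```python
import numpy as np

def harris_response(img, k=0.04, win=3):
    """Harris corner response from the 2x2 structure tensor of image
    gradients, smoothed over a small box window."""
    iy, ix = np.gradient(img.astype(float))

    def box_smooth(a):
        # 'same'-size box filtering via edge padding and shifted sums
        pad = win // 2
        p = np.pad(a, pad, mode="edge")
        out = np.zeros_like(a)
        for dy in range(win):
            for dx in range(win):
                out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
        return out / win**2

    sxx = box_smooth(ix * ix)
    syy = box_smooth(iy * iy)
    sxy = box_smooth(ix * iy)
    det = sxx * syy - sxy**2      # product of the two eigenvalues
    trace = sxx + syy             # sum of the two eigenvalues
    return det - k * trace**2     # large only where both eigenvalues are large
```

Thresholding this response keeps corner-like (and, for Gaussian-like boutons, blob-like) locations while rejecting edges, which score negatively.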
The SIFT interest point detector forms one part of the SIFT keypoint detector [27], which is in extremely wide usage in object recognition, image stitching and video tracking [28,29]. The interest point locations of SIFT are identified by finding local maxima in a scale-space representation of the original image; this representation is obtained by blurring the image with Gaussian masks at different spatial scales. We used the VLFEAT implementation of SIFT [30].
Finally, the SURF keypoint detector [31] uses box filter approximations of second order spatial Gaussian functions in order to construct a local Hessian matrix; the determinant of this matrix can be used to find locations of intensity minima, maxima and saddle points. The approach has been suggested [31] to be good at dealing with slightly elongated structures, making it suitable for detecting the possible locations of boutons, particularly since they sit on or very near to axons. We used the Matlab implementation of SURF.
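SURF approximates the entries of this Hessian with box filters over an integral image; as a conceptual sketch of the underlying quantity (using finite differences instead of box filters, with our own helper name), the determinant-of-Hessian response can be computed as:

```python
import numpy as np

def hessian_det(img):
    """Determinant of the local Hessian of image intensity; maxima of
    this response indicate blob-like structures."""
    gy, gx = np.gradient(img.astype(float))
    gyy, gyx = np.gradient(gy)   # second derivatives in y, and mixed
    gxy, gxx = np.gradient(gx)   # mixed, and second derivative in x
    return gxx * gyy - gxy * gyx
```

For a bright Gaussian-like bouton, both diagonal second derivatives are strongly negative at the centre, so their product (minus the small mixed term) peaks there.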
Non-maximum suppression module. In this step, we propose a way to eliminate candidate bouton locations that lie close together, using non-maximum suppression [32]. This is necessary because of the statistics of the image data, and the tendency of the SURF interest point detector to fire several times around circular structures, which are common in microscopy data. By removing multiple detections of the same bouton automatically, the user does not need to remove the over-detections manually.
In this paper, non-maximum suppression is done differently to that of the Canny edge detector [32], as it is applied to candidate bouton centres rather than candidate edge locations. To place candidate boutons at their local maxima, we iterate over all candidate bouton locations and relocate each to the pixel with the highest intensity in its local area (a window of W1 pixels). In order to save time in manual user intervention, we then remove all candidate boutons in the local area (a window of W2 pixels), apart from the one at the location with the highest pixel intensity. We found that the best values for W1 and W2 were 20 and 10 pixels, respectively.
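The two-stage suppression above can be sketched as follows (a minimal NumPy sketch with our own function name; the greedy keep-the-brightest strategy is an assumption consistent with the description, not the authors' exact code):

```python
import numpy as np

def refine_and_suppress(image, points, w1=20, w2=10):
    """(1) Move each candidate (row, col) to the brightest pixel within a
    w1-pixel neighbourhood; (2) within any w2-pixel neighbourhood, keep
    only the candidate at the brightest location."""
    h, w = image.shape
    refined = []
    for (r, c) in points:
        r0, r1 = max(0, r - w1 // 2), min(h, r + w1 // 2 + 1)
        c0, c1 = max(0, c - w1 // 2), min(w, c + w1 // 2 + 1)
        window = image[r0:r1, c0:c1]
        dr, dc = np.unravel_index(np.argmax(window), window.shape)
        refined.append((r0 + dr, c0 + dc))
    # greedily keep the brightest candidate, suppressing nearby duplicates
    refined.sort(key=lambda p: image[p], reverse=True)
    kept = []
    for p in refined:
        if all(max(abs(p[0] - q[0]), abs(p[1] - q[1])) > w2 for q in kept):
            kept.append(p)
    return kept
```

Duplicate SURF firings around the same bouton collapse to a single point at the intensity maximum, so no manual clean-up of over-detections is needed.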
Feature descriptor module. The purpose of using a feature descriptor is to capture the appearance of image regions around candidate bouton locations; these descriptors are then used as observation vectors to classify the candidate interest points as boutons or non-boutons. In this section, we briefly describe 3 feature descriptors that were evaluated for this task; one of these is custom designed for the bouton detection task.
Standard region descriptors. The more widely used feature descriptors for local image appearance (including SIFT [27] and HOG [33,34]) consist of a standard series of steps, based around estimating the local intensity gradient field around a reference location (such as an interest point): they differ mainly in the local operators used to estimate the intensity gradient field, the way that the gradient field is encoded into the descriptor, and the way that the distributions of gradient orientation are normalised. We used the most widely employed standard implementations of HOG (Matlab) and SIFT-based descriptors (VLFEAT [30]). HOG yields a 144-element descriptor which describes the gradient field in a number of square pixel regions arranged around the interest point. SIFT yields a 128-element rotation-invariant description of a centre-weighted local image gradient field, again computed around the interest point. We reduced the length of both descriptors to 12 elements by using minimum Redundancy, Maximum Relevance [35,36]. This process selects the most informative and least redundant subset of features, speeding up processing and improving stability. Because reducing dimensionality in this way changes the normalisation of the remaining elements of the vector, we re-normalise the reduced vectors using the best of L1, L2 and L∞ normalisation on the remaining elements within the validation data set.
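The re-normalisation step after mRMR selection is simple; a minimal sketch of the three candidate norms (the function name is our own):

```python
import numpy as np

def renormalise(v, norm="l2"):
    """Re-normalise a reduced descriptor with one of the three norms
    compared on the validation set: L1, L2 or L-infinity."""
    v = np.asarray(v, dtype=float)
    if norm == "l1":
        s = np.abs(v).sum()
    elif norm == "l2":
        s = np.sqrt((v**2).sum())
    elif norm == "linf":
        s = np.abs(v).max()
    else:
        raise ValueError(f"unknown norm: {norm}")
    return v / s if s > 0 else v
```

The best-performing norm is chosen per descriptor on the validation set before training the classifier.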
A custom descriptor. SIFT descriptors and the HOG descriptor are designed for general use in visual recognition. We hypothesized that a custom-designed descriptor, providing a joint encoding of spatial and directional structure, could improve the ability of a classifier to distinguish boutons from non-boutons. Accordingly, we built a Gabor-based feature descriptor that was designed to perform well for the bouton detection problem.
Gabor filters give large responses at intensity edges and within regions of an image where the texture matches that of the filter. They have successfully been used for edge detection [37] and texture segmentation [38,39]. However, the parameters of spatial Gabor filters can also be tuned so that they yield strong responses to blob-like structures. The patterns of intensity around boutons also tend to contain some directional structure (e.g. the axons they lie on). Thus, without performing explicit axon detection, a descriptor that takes the visual structure surrounding boutons into account can be constructed from patterns of directional Gabor filters. The two-dimensional Gabor functions were selected using the standard definition for symmetric 2D Gabor functions:

g(x, y) = exp(−((x − x₀)² / (2σx²) + (y − y₀)² / (2σy²))) cos(2π(w_x₀(x − x₀) + w_y₀(y − y₀)))

where σx and σy are the standard deviations of the elliptical Gaussian for x and y, respectively. The centre of the receptive field in the spatial domain is (x₀, y₀) and (w_x₀, w_y₀) is the optimal spatial frequency of the filter in the frequency domain. The Gabor filters module uses several Gabor filters with varying spatial frequency and standard deviations to construct a robust bouton discriminatory feature vector x = [x₁, . . ., x_V]. Element d of the feature vector x is defined as:

x_d = Σᵢ Σⱼ f(i, j) g_θd(i, j)

where f is an image patch of size I × J pixels, meant to capture the entirety of an individual bouton, with its centre at a single interest point, and g_θd is the Gabor filter at angle θ_d. The inner products between an image patch and the Gabor filters at angles θ = nπ/6, {n = 1, 2, 3, . . ., 12} make up the feature vector x with V = 12 features (Fig 4). The image patch size is determined by the standard deviation used in the interest point detection module, such that I = J = 6σ + 1; because σx and σy are approximately equal to 4, f consists of 25 × 25 pixels.

SVM classification module. The classification module is trained to identify the true positives among candidate bouton regions (obtained from the interest points) using the extracted features (Fig 3G).
For this detection task, we use an SVM, which is a sparse kernel machine that maximises the margin between points of different classes in a high-dimensional feature space. SVMs have been found to be easy to train, are widely implemented and are relatively fast at classifying. Our SVM uses a polynomial discriminant function to maximise the margin between the two classes of candidate boutons. The regularisation coefficient, C > 0, penalises the misclassification of non-boutons as boutons and vice versa. To train and validate the SVM to correctly classify interest points, 80 images were used; 900 patches were randomly selected for training and validation. The collected training image patches of negative and positive instances of boutons were resized to 25 × 25 pixels to match the size of the average bouton. All the training images contain a part of an axon.
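As a concrete illustration of the Gabor-based descriptor described above, the 12-element feature vector fed to the classifier can be sketched as follows; the isotropic envelope and the spatial frequency value are our own illustrative assumptions, while the 25 × 25 support, σ ≈ 4 and the θ = nπ/6 angles follow the text:

```python
import numpy as np

def gabor(theta, size=25, sigma=4.0, freq=0.1):
    """Symmetric (cosine-phase) 2D Gabor filter at orientation theta.
    The envelope is isotropic here for simplicity; freq is illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)        # rotated coordinate
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * freq * xr)

def gabor_descriptor(patch):
    """12-element feature vector: inner products of a 25x25 patch with
    Gabor filters at theta = n*pi/6, n = 1..12."""
    return np.array([np.sum(patch * gabor(n * np.pi / 6))
                     for n in range(1, 13)])
```

Each descriptor is then an observation vector for the SVM, which separates bouton from non-bouton patches.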
3D point detection module. The 3D point detection module takes the final 2D points detected by the algorithm on the mean intensity projection image and extends them to 3D (i.e. determines which slice they lie on).
For each bouton detected in 2D, a 3D point is found by:

z = argmax_k Σᵢ₌₁..N f(xᵢ, yᵢ, z_k)

in which f(xᵢ, yᵢ, z_k) is the intensity of a voxel in 3D space, k is the z coordinate, (xᵢ, yᵢ) are the in-plane coordinates, and N is the number of pixels in a patch. For each bouton detected, the algorithm iterates over all slices in the stack, and sums the pixel intensities within a 25 × 25 patch centred on the point. The slice with the highest overall value is chosen as the z location.
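This slice search is a one-line argmax in practice; a minimal sketch (our own function name, assuming the stack is indexed as (z, y, x)):

```python
import numpy as np

def find_slice(stack, x, y, patch=25):
    """For a bouton detected at (x, y) on the mean projection, return the
    z index whose patch x patch window around (x, y) has the largest
    summed intensity. Windows are clipped at the image borders."""
    half = patch // 2
    _, h, w = stack.shape
    r0, r1 = max(0, y - half), min(h, y + half + 1)
    c0, c1 = max(0, x - half), min(w, x + half + 1)
    return int(np.argmax(stack[:, r0:r1, c0:c1].sum(axis=(1, 2))))
```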
Method validation. We compared the performance of the algorithm to manual labelling of boutons, done by 3 expert neurobiologists, and to another automated tool, EPBscore, a Matlab-based software tool available for the detection and analysis of axonal boutons. We used a total of 100 images: 80 for training and validation, and 20 for testing. The images chosen for testing had a few axons per image, and considerable noise (Fig 1). Each image had a size of 512 × 512 × Z voxels, Z ∈ [15, 50]. We used 900 image patch examples of boutons and non-boutons (450 each) for training (720 patches) and validation (180 patches). A total of 345 boutons (an average of 17 per image) were present in the test images.
To compute the metrics (Precision, Recall, F1 score and AUC), we calculated the number of True Positives (TP), True Negatives (TN), False Positives (FP) and False Negatives (FN) present in each image (Fig 5). Precision is computed using TPs and FPs (Precision = TP / (TP + FP)), Recall is computed using TPs and FNs (Recall = TP / (TP + FN)), and the F1 score combines Precision and Recall (F1 = 2 × Precision × Recall / (Precision + Recall)).
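These metrics follow directly from the counts; a minimal helper (our own function name):

```python
def detection_metrics(tp, fp, fn):
    """Precision, Recall and F1 from TP, FP and FN counts, with guards
    against division by zero when no detections or no boutons exist."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Note that TN does not enter these three metrics; it is only needed for ROC-based measures such as AUC.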

Comparison of feature descriptors and interest point detectors
We compared several well-known feature descriptors (Fig 6, left), including HOG, SIFT, and Gabor kernels. We trained the SVM using the same datasets on HOG, SIFT, or Gabor features. For a fair comparison with the Gabor features, we extracted the 12 most significant features for each by using minimum redundancy feature selection [35,36], and normalized the features using the best normalization method for each descriptor (L1, L2, L∞). We also optimized the SVM hyperparameters (e.g. kernel function, polynomial order (d), cost (C), and γ) and used the best result for each descriptor. Comparison between the descriptors is shown in Fig 6. We compared various performance measures including Precision, Recall, F1 score, and Area Under Curve (AUC).

Fig 6. Graphs comparing the performance of the descriptors and interest point detectors at 10³ SVM class thresholds. We chose Gabor and SURF as our descriptor and interest point detector, as they had better performance than the other methods (all with separately optimized hyperparameters). The precision-recall graphs seem to have an unusual curvature; however, this can be explained by the nature of the dataset. In this axon dataset, where the number of TPs (i.e. boutons) is relatively small compared to the size of the image, it is to be expected that there will always be some FP detections when TP points are also detected. As such, there will never be a case in which precision = 1, as there will always be some FPs detected as well.
The proposed Gabor-based descriptor had the best performance compared to both HOG and SIFT (Fig 6, left). We also compared several interest point detectors (SURF, Harris and SIFT), as an improvement in the interest point detection can have a big influence on the final performance of the algorithm. SURF had significantly better Recall and F1 scores (Fig 6,  right). We therefore chose SURF as our interest point detector, and Gabor as our feature descriptor. In a later section, we compare our method against the existing semi-automated alternative for bouton detection (within EPBscore [16]).

Optimization of hyperparameters
To optimize the performance of the algorithm, we compared the Precision-Recall and Receiver Operating Characteristic (ROC) curves for different descriptors and interest point detectors (Fig 6) on a validation set. After scaling the SVM scores to the range [−1, 1], we found that the threshold (defining a point on the ROC) that gave the best results was −0.0399, giving a mean F1 score of 0.840 (Precision = 0.765; Recall = 0.952). We optimized our algorithm after choosing the optimal normalization paradigm for the features, and over different SVM hyperparameters, including kernel function (polynomial and Gaussian), polynomial order, cost (C), and γ.
Our best performing SVM model has a polynomial kernel (order = 3) and a regularisation term that penalises the misclassification of non-boutons and boutons.
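The threshold search described above can be sketched as a sweep over candidate thresholds on validation scores; this is a pure-Python sketch (the function name and the greedy sweep are our own, and the scores/labels are placeholders):

```python
def best_threshold(scores, labels):
    """Sweep decision thresholds over classifier scores scaled to [-1, 1]
    and return the threshold maximising the F1 score on a validation set.
    labels: 1 for bouton, 0 for non-bouton."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and not l)
        fn = sum(1 for s, l in zip(scores, labels) if s < t and l)
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```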

Comparison to EPBscore
In this section, we compare our algorithm (using SURF and Gabor features as our interest point detector and feature descriptor, respectively) to EPBscore. When analysing data from neurons, it is important to include as many synapses (i.e. boutons) as possible in the analysis. Hence, we used the measure of Recall (the proportion of true boutons that were correctly detected). We found that our algorithm detected boutons (TPs) extremely well (Fig 7, Table 1), with a Recall of 0.952 ± 0.15, compared to EPBscore, which had a Recall of 0.306 ± 0.21 (p < 10⁻⁵, Kolmogorov-Smirnov (KS) test). The Precision, which is an indication of the number of false bouton detections, was also measured. There was no significant difference (p = 0.135, KS test) in the Precision of our algorithm and EPBscore. We then computed the F1 score, which takes into account both Precision and Recall. Our detector is significantly better (p < 10⁻⁵, unpaired two-tailed t-test), with a 0.433 higher average F1 score than EPBscore, demonstrating that our algorithm performs much better than the current state-of-the-art method used to analyse synaptic boutons.

Application to a published dataset

We applied our algorithm to a published dataset from De Paola et al. [23], in which they performed a morphological analysis of axons and quantified synaptic dynamics. The data was collected using a two-photon microscope in the mouse barrel cortex. The genetic background was Thy-1 transgenic mice on a c57/bl6 background, expressing cytoplasmic GFP (more details of the experimental method in [23]). We analysed 22 images, containing 348 boutons.
The algorithm performed well, even without being tuned to this dataset. An example of its performance is shown in Fig 7C. Precision (p = 0.04, unpaired two-tailed t-test), Recall (p = 0.002, unpaired two-tailed t-test) and F1 (p = 0.004, unpaired two-tailed t-test) scores were all significantly better than EPBscore (Fig 7D).

Discussion
We used an algorithm based on interest point detection, feature extraction, and classification using an SVM. In our chosen method, first a negative LoG mask was convolved with the mean projection image, which produced an image with enhanced blob-like objects (i.e. boutons). Then a SURF point detector was applied to the pre-processed LoG image, extracting interest points for the classifier. These image patches (25 × 25 pixels) were convolved with 12 Gabor filters to generate a feature vector for each interest point. The feature vectors were then processed by the SVM classifier, which gave the final detected 2D bouton locations, which can be extended to 3D locations.
We found that our algorithm is significantly better (p < 10⁻⁵) than the current state-of-the-art method (EPBscore) on the same dataset. It detects nearly all boutons that were present in the images, with a mean Recall of 0.95 ± 0.06. The limitation of the algorithm is that, in our tests, it sacrifices a small amount of precision (i.e. more FPs) for a better recall. Most of the FPs were due to noise, spines or intersection points in the axons. However, an algorithm that is independent of axon or neuron tracing is advantageous, as it is faster, and reliably produces likely bouton candidates, unlike methods based on neuronal tracing which can be slow, manual, and produce variable results that depend on the neuron trace. To remove the extra FPs, we include a simple Graphical User Interface (GUI) that allows a user to easily remove these over-detected points, or to add undetected boutons. The GUI also records these FPs, so that they can be used in future training. The algorithm presented in this paper is likely to work best for images acquired using a similar protocol to the dataset provided here. Indeed, we demonstrated that the algorithm worked well on another published dataset [23], which had similar conditions (Fig 7C and 7D). It is also likely to work well for other image datasets with similar blob-like objects, because of the data this algorithm was trained on, as well as the steps taken in the algorithm to detect initial points. However, it is possible that other imaging protocols with considerably different conditions might lead to different contrasts, resolutions, and noise levels, which might require changes to some of the modules, or additional training for optimal results.
In addition, the use of interest point detectors also provides scale estimates that could be used to support the detection of boutons at different magnification factors, either through training the classifier with the scale estimate, or using the scale estimate to tune the Gabor parameters. We suggest this for future work.
The strategy for detecting boutons on the mean intensity projection of a stack, and then tracing back through slices to determine the z coordinate, works well for the acquisition protocol used in this and similar studies, and requires low computational effort. However, there is the small possibility that axons which cross in the third dimension (z), and have boutons in a nearby x, y location, can lead to incorrect bouton detections. The solution to this problem lies in creating a fully three-dimensional bouton detector which is able to operate on the 3D data by taking into account the slice spacing and slice thickness of the stack. Future work will involve the use of intensive training using surrogate data in order to learn a 3D bouton detector. This might employ a deep learning architecture, currently a very popular class of methods in computer vision, which has also been applied to biomedical image analysis [40][41][42][43][44]. Deep learning methods are often trained end-to-end, i.e. the features that are learned during training are not hand crafted, and are optimal for the particular task at hand. This kind of approach might yield improved performance over methods with hand-crafted features. It comes, however, at the expense of requiring sufficiently large numbers of datasets with labelled boutons. The benefit of the current approach is that the parameters are sufficiently few in number to enable hand optimization.
We will also extend the analysis to time-series images in order to analyse the changes in synapses over time, analysis often required in the study of structural plasticity of neurons.

Conclusion
In this paper we proposed an algorithm for the detection of axonal boutons in 3D two-photon microscopy images. We found that using SURF keypoints and Gabor features gives the best results after comparing several well-known keypoint detectors and feature descriptors, and that the algorithm provides significant improvements over the currently available methods. Most importantly, our method makes advances in automated bouton detection without tracing the axons first, which can be an inaccurate and a computationally expensive step. We showed that despite removing this step, a high Recall of 95% is achieved, therefore detecting more true boutons than the existing method [16]. Increasing the TPR is an important factor for the analysis of axonal boutons in neuroscience research, as it can significantly increase the number of bouton samples that can be analysed. Since this type of data is often limited and hard to acquire, it is important to detect all boutons present in the image to get more data for the statistical analysis. This usually requires a significant and time-consuming amount of user-intervention, but by using the approach presented in this paper, it may be more easily achieved, as the vast majority of boutons are found by the algorithm.