Automated classification of synaptic vesicles in electron tomograms of C. elegans using machine learning

Synaptic vesicles (SVs) are a key component of neuronal signaling and fulfil different roles depending on their composition. In electron micrograms of neurites, two types of vesicles can be distinguished by morphological criteria, the classical “clear core” vesicles (CCV) and the typically larger “dense core” vesicles (DCV), with differences in electron density due to their diverse cargos. Compared to CCVs, the precise function of DCVs is less defined. DCVs are known to store neuropeptides, which function as neuronal messengers and modulators [1]. In C. elegans, they play a role in locomotion, dauer formation, egg-laying, and mechano- and chemosensation [2]. Another type of DCVs, also referred to as granulated vesicles, are known to transport Bassoon, Piccolo and further constituents of the presynaptic density in the center of the active zone (AZ), and therefore are important for synaptogenesis [3]. To better understand the role of different types of SVs, we present here a new automated approach to classify vesicles. We combine machine learning with an extension of our previously developed vesicle segmentation workflow, the ImageJ macro 3D ART VeSElecT. With that we reliably distinguish CCVs and DCVs in electron tomograms of C. elegans NMJs using image-based features. Analysis of the underlying ground truth data shows an increased fraction of DCVs as well as a higher mean distance between DCVs and AZs in dauer larvae compared to young adult hermaphrodites. Our machine learning based tools are adaptable and can be applied to study properties of different synaptic vesicle pools in electron tomograms of diverse model organisms.


Introduction
Caenorhabditis elegans is a well-studied model organism. Its small nervous system, consisting of 302 neurons [4,5] allows for studies of the connectome in its entirety [5][6][7]. A focus of a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 To circumvent these shortcomings we previously developed 3D ART VeSElecT, a workflow for automated reconstruction of vesicles [22]: This ImageJ/Fiji tool allows for automated reconstruction of synaptic vesicle pools in electron tomograms, but is not able to differentiate between different types of vesicles. To solve this issue, we here present a python-based machine learning approach that trains vesicle classifiers, and a straightforward ImageJ/Fiji tool that applies a trained classifier to previously detected vesicles.
We show that our technique can differentiate between DCV and CCV in two developmental states, young adult hermaphrodites and dauer larvae. The application of these image analysis and machine learning tools for electron tomograms is in principle useful to study differences in synaptic vesicle pools of many other model organisms.

Manual classification shows three distinct vesicle types
To compare vesicle properties in two different developmental states of C. elegans, dauer larvae and young adult hermaphrodites, we used high-pressure freezing and subsequent freeze substitution to immobilize and fix animals in a near-to-native state. Using electron tomography on~250 nm cross-sections of the worm, we obtained high resolution 3D images that allow for investigation of the ultrastructure of the nematode's NMJs. We used this information to compare synaptic vesicle pool characteristics, aiming for an automated approach for vesicle classification.
Upon visual examination of electron microscopic datasets of the NMJs of C. elegans dauer samples, we noted an increased fraction of DCV in dauer larvae compared to young adult hermaphrodites. To systematically quantify this, vesicles were automatically extracted from tomograms, using an improved version of our 3D ART VeSElecT workflow (see S1 Fig), and then manually assigned for vesicle types by two experts. In uncertain cases, a third expert was consulted. Since closer examination of tomograms revealed that a definite assignment of some vesicles to one of the two classes (CCV and DCV) was difficult, we initially established a third, hypothetic group of non-determinable (ND) vesicles. Non-assignable (NA) vesicles are a subgroup of ND that we could ultimately not classify as CCV or DCV. Fig 1 shows a representative selection of clear core, non-determinable and dense core vesicles from 8 exemplary tomograms (4 dauer larva tomograms and 4 hermaphrodite tomograms).
Together, we included 1072 segmented vesicles from the 7 young adult hermaphrodite tomograms, and 1108 segmented vesicles from the 8 dauer larva tomograms for analyses (Table 1). When we included all identified vesicle populations from all tomograms, the percentage of manually labeled DCVs differed slightly for dauer larvae (13%) and adult hermaphrodite (10%) datasets. After filtering the datasets, comparing only the vesicle population from the same type of NMJ with the bouton fully reconstituted in the tomograms (putative cholinergic NMJs with central cut through the AZ area and at least 100 vesicles), the dauer larvae NMJs exhibited a higher DCV ratio of 16% versus 9% in young adult hermaphrodites.

Characterization based on image features
In order to perform a quantitative comparison of vesicles and for clarification about the ND vesicle class, we extracted a number of image features based on qualitative observations. Since DCVs appear larger, darker and further apart from the active zone compared to CCVs, we used mean radius (r), mean gray value (gv) of the vesicle volume, and distance of the vesicle center to the manually annotated center of the active zone (distAZ) as primary features (Fig 2). In addition, the standard deviation of the gray value distribution of a vesicle (GVSD) is a measure for the contrast between membrane and core that is independent of the brightness of the vesicle. While GVSD was determined on the central slice of each vesicle for better discrimination, the other features were calculated using the entire vesicle volume excluding the membrane.
Statistical analysis and comparison of ND vesicles with CCV and DCV suggests that synapses show no third type of vesicles and that NDs are in most cases CCVs (see S2 Fig). Hence, for further analysis, we proceeded with the assumption of two existing vesicle classes, CCVs and DCVs (Table 1). We tried to apply simple thresholding and manually varied the weights of the linear combination of vesicle radius, distAZ and gv to separate DCVs from CCVs, but with limited success (see S2 Fig and S1 Text). Therefore, a machine learning approach was developed and applied to automatically determine the optimal weighting of the features, as described below. To generate training data, two experts independently reassigned ND vesicles as either DCV or CCV. Their label was only changed if they turned out to be DCV, as we Automated classification of synaptic vesicles in electron tomograms of C. elegans programmed classifiers to treat the label ND as CCV. In difficult cases, a third expert was consulted. Still non-assignable vesicles (NA) were excluded from training by labelling them as error (see Table 1: n = 12,~1.1% in dauer larvae; n = 14,~1.3% in young adult hermaphrodites).

Machine learning can distinguish vesicle types
To automatically classify vesicles using the previously calculated image features, we evaluated different machine-learning algorithms. We compared a support vector machine (SVM), random forest (RF), and k nearest neighbor (KNN) for their performance on differentiating between CCVs and DCVs using four image-based characteristics described above: radius (r), mean gray value (gv), standard deviation of the gray value (GVSD) and distance to the active zone (distAZ) (see Fig 3). 1994 manually labelled vesicles from 15 different tomograms (7 young adult hermaphrodite and 8 dauer larval tomograms) were used as training data. Adult and dauer stages were combined into a single training set to reflect the biological variability and to focus the classifier on the task of identifying vesicle types rather than developmental stages. The features were extracted as in the first two steps of our classification macro (see Fig  3 and Methods section 7.3), saved as a table, and processed in a python script using the scikitlearn machine learning library [29]. Table 2 shows the results of the comparison of the different algorithms. The overall accuracy (first column) was not a good indicator for classifier performance, as CCV greatly outnumber DCV. We instead used precision DCV , recall DCV and Fscore DCV defined in the methods section 7.6 (Eqs 2-4) to specifically score the ability of classifiers to recognize DCVs. The comparison between linear and nonlinear SVM, random forest and KNN revealed that the SVM algorithm performs best on our classification problem. The linear SVM shows the lowest error rates in all four categories (see Methods): Accuracy, precision DCV , recall DCV and F-score DCV . KNN shows second best results that were about 0-3 percent worse compared to the linear SVM, and slightly better than nonlinear SVM. For the random forest variants, t = 1500 trees performed similarly or better than t = 10, except for recall DCV . Random forest (t = 1500) shows overall third best results with 3 percent worse results for recall DCV and about 1 percent worse results for accuracy, precision DCV and F-score DCV compared to KNN. Resulting values of the SVM hyperplane delimiting DCVs from CCVs are shown in Table 3. We also performed a systematic comparison of classifier performance over a much wider range of input parameters, summarized in S2 Table. The final trained SVM classifier used the following weighting of the four features: The weight for distAZ is smaller than for the other three parameters, which indicates that it is less important. Still, it contributes to the classifier, since the performance of the SVM using Automated classification of synaptic vesicles in electron tomograms of C. elegans only three features without taking distAZ into account drops to precision DCV = 0.91 ± 0.09, recall DCV = 0.89 ± 0.1, and F-score DCV = 0.90 ± 0.08 on the same training set.
In summary, the SVM algorithm in its linear form without kernelization outperformed all other classification methods including non-linear SVM and random forest (t = 1500) classification. Additionally, using a linear SVM has the advantage that the resulting classifier can easily be incorporated into a classification workflow by extracting the coefficients after successful training. For this purpose, a new vesicle classification macro in ImageJ/Fiji was developed to be consecutively applied after the modified 3D ART VeSElecT registration macro. The workflow of the classification macro is shown in Fig 3 and is described in detail in Materials and Methods.

Application to C. elegans dauer and adult hermaphrodite NMJs
We used the ground truth classification to quantify differences in the vesicle pool of C. elegans dauer and adult hermaphrodite NMJs in electron tomograms. Additionally, visual results of 3D ART VeSElecT and the classification macro for one exemplary tomogram of each analyzed developmental state are presented in Fig 4 to provide an intuitive impression of DCV and CCV distribution.
The 3D models show that DCVs most often localize on the rim of the CCV cluster. CCVs themselves gather around the dense projections, electron dense structures attached to the cell membrane that indicate the active zone where synaptic vesicle fusion occurs. For further analysis of vesicle characteristics (radius, gray value, GVSD and distAZ), ground truth results are shown as violin plots in Fig 5. The predictions of the SVM resulted in the same statistical significance as the manual ground truth regarding most differences between vesicle types and developmental stages (S6 Fig).
Fig 5A shows highly significant differences in CCV and DCV size in both developmental states. In the following, the radius always refers to our original measured results, which is the "inner radius" of all vesicles, as described in [22]. To make vesicle size comparable to other published results, the measured radius must be multiplied by two to receive the inner diameter and increased by 9 nm to include vesicle membrane thickness [22]. This estimated diameter is only an approximation and cannot be directly compared to manually measured data due to the different way how both are measured [22]. CCVs of dauer larvae show a median inner radius of 9.0 nm, whereas DCVs have a median inner radius of 14.7 nm, which results in a significant difference of 5.7 nm. However young adult hermaphrodites' median inner radius of DCVs is 16.7 nm, hence they have a~5.4 nm Automated classification of synaptic vesicles in electron tomograms of C. elegans bigger radius than CCVs with a median inner radius of 10.3 nm. Noticeably, both vesicle types are slightly but significantly bigger in young adult hermaphrodites compared to dauer larvae. The radius of hermaphrodite CCVs is 1.3 nm larger compared to dauer larvae, whereas the radius of hermaphrodite DCVs is 2 nm larger compared to dauer larvae.
In all cases, DCVs have significantly darker normalized gray value intensities in comparison to CCVs, as shown in Fig 5B. Gray values were normalized on a per-tomogram basis by subtracting the minimum and dividing by the difference of maximum and minimum of all vesicles in the same tomogram to average out brightness variations between tomograms. Young  Automated classification of synaptic vesicles in electron tomograms of C. elegans adult hermaphrodites show a normalized gv of 0.17 for CCVs and 0.45 for DCVs, whereas dauer larvae have a normalized gv of 0.23 for CCVs and 0.56 for DCVs. Hence, vesicles of dauer larvae tomograms possess significantly darker vesicles (both CCVs and DCVs) in comparison to hermaphrodite tomograms. In contrast, GVSD (gray value standard deviation in the central slice of each vesicle) does not differ significantly between developmental states, but shows highly significant differences of~0. 22 in GVSD values of CCV to DCV for both developmental states ( Fig 6D).
The normalized distance of CCVs to the active zone distAZ is smaller (0.42 for young adult hermaphrodites and 0.4 for dauer larvae) than that of DCVs (young adult hermaphrodites: 0.64, dauer larvae: 0.72) in both larval stages ( Fig 6C). The normalized distances of DCVs to the AZ, of young adult hermaphrodites in comparison to dauer larvae, show a highly significant difference of 0.08 (normalized). For better visualization of this observation, distances are visualized as histogram plots in Fig 6. Distances were normalized on a per-tomogram basis by subtracting the minimum and dividing by the difference of maximum and minimum of all vesicles in the same tomogram to average out cell size variations between tomograms. Fig 6A and 6C illustrate that normalized distances of CCVs of both developmental states are almost identical, whereas distances of DCVs differ significantly in both plots. DCVs show a systematic discrepancy in their localization: DCVs in young adult hermaphrodites cluster closer to the active zone than DCVs of dauer larvae. Histograms in Fig 6B and 6D shows results of distAZ in real distance measured in nm. Interestingly, this statistic shows that vesicle pools in dauer larvae have a generally smaller maximal distAZ in young adult hermaphrodites. The largest distance of dauer larvae vesicles from the active zone is about 550 nm whereas the largest distance of vesicles in young adult hermaphrodites is about 700 nm. It should be noted that distAZ as derived from tomograms should be considered as an estimate of the actual distance of vesicles to the active zone, since a single tomographic stack may not contain the full information on the actual position and shape of the active zone.

Error rates and cross validation
To generate the ground truth data, a consensus label was created (N = 15 tomograms, 8 dauer larva and 7 hermaphrodite tomograms): First, labels were assigned independently by two biologists. Differing labels were re-examined. In case of no agreement, a third examiner was employed. NA vesicles (~1%) were excluded from SVM training. The initial assignment by individuals was compared with this consensus label to determine the manual error rate (Table 4).
Manual annotation shows higher scores than the SVM. Manual assignment of vesicle types shows 98% precision DCV , whereas the SVM reaches 95% precision DCV . The results for recall DCV is 96% for manual assignment and 90% for the SVM. The F-score is the harmonic mean of precision and recall (see Materials and Methods) and shows a mean result for the error rate of 97% for manual and 92% for automated classification. Results for precision and recall are additionally shown as scatterplot in Fig 7. The SVM classifier was cross-validated against each tomogram using all other tomograms as training data. To examine the transferability of the vesicle classification workflow, we tested the pretrained classifier to two tomograms of the unc-59(e261) septin mutant of C. elegans [22]. We observe a precision DCV of 0.78 and 0.8 and a recall DCV of 0.88 and 0.94, respectively, on these EM data without any additional training of the classifier. When we adjusted the image brightness to be more similar to those used for training, the performance increased further (Fig 8).

Discussion
Automated methods for image analysis are of great importance to reduce manual labor in the long term and prevent biased analysis of data, hence making double blind analysis unnecessary. This is confirmed by our own experience, where one re-analysis of the same tomogram by the same expert revealed one differential interpretation of CCV and DCV and 6 other nonmatching assignments on n = 134 vesicles in total. Such deviations in manual analysis may have a great influence on the outcome. For this reason, we had two experts manually assign all vesicles independently. In case of inconsistency, both persons came to a mutual agreement or consulted a third expert in the decision process. Still unassignable vesicles were excluded from analysis. Since this evaluation process is very elaborate and time-consuming, a long-term automated solution is inevitable. As far as we know there is no other approach for the specific task (1) Compares two independent manual assignments to the consensus label (ground truth) (2) Compares the SVM assignment to the ground truth https://doi.org/10.1371/journal.pone.0205348.t004 Automated classification of synaptic vesicles in electron tomograms of C. elegans Automated classification of synaptic vesicles in electron tomograms of C. elegans of automated vesicle classification in electron tomograms. We present here a new methodological approach that is applied subsequent to a modified version of our previously published Fiji macro 3D ART VeSElecT [22]. First, a Fiji macro was written to read out important characteristics. Those were, together with manual labelling, used to train machine learning classifiers in python. Extracted machine learning parameters of an SVM completed the new Fiji macro to classify vesicles and visualize classes through 3D color labelling. Our tool can thus be used for automated vesicle classification. We used our ground truth label to identify differences between two types of vesicles in two different developmental states of C. elegans, young adult hermaphrodites and dauer larvae. We then use these data to train an automated classification workflow to show that resulted algorithm can reproduce these manually obtained biological results. Manual analysis revealed a hypothetical third type of vesicles, called non-determinable vesicles (ND) which are most likely CCVs with no perfectly clear inner core. This seems reasonable, since they resemble CCV characteristics in radius and distance to the active zone and only differ significantly in their gray value, which lies among gv intensities of DCVs and CCVs (S2 Fig). This observation can be explained by either biological or technical reasons. On the one hand, medium-gray vesicles could be DCVs with partially released cargo, e.g. after kissand-run [30][31][32]. Overall this does not explain why NDs have the same size as CCVs, since DCVs are known to be considerably larger [23]. On the other hand, vesicles may appear darker in electron micrographs because of lower quality of tomograms at some locations. Lacking precision of the tomogram overlay of the two single tilted tomograms to receive one double tilted tomogram could result in darker, more blurry vesicles with no perfectly clear core.
The comparison of DCV number in dauer larvae and young adult hermaphrodites revealed differences in the ratio of DCVs (9% in young adult hermaphrodites and 16% in dauer larvae), when comparing only cholinergic synapses with apparent dense projections and at least 100 vesicles in the NMJ. Furthermore, we found that DCV ratio correlates with cell size change throughout the tomogram (S4 Fig). A large cell size change indicates inclusion of decentral regions of the synaptic bouton in the tomogram. We assume that inclusion of more remote areas in the tomogram is more likely in dauer larvae, since the overall size of the synaptic bouton is smaller (mean diameter ± SD of cross section: 452 ± 85 nm) compared to young adult hermaphrodites (mean diameter ± SD of cross section: 542 ± 97 nm). Therefore, standard sized 250 nm sections, typically prepared for electron tomography, naturally have a higher chance of catching decentral regions in dauer larvae NMJs. For a more comprehensive understanding, serial section tomography for visualization of complete synaptic boutons would be necessary.
Additionally, we found that young adult hermaphrodites show significantly larger vesicles (CCVs and DCVs) and diameters of synapses in comparison to dauer larvae. This observation could either be a biological phenomenon of dauer larvae or a side effect of the sample preparation. Dauer larvae have a distinct cuticle that shields the animal from adverse environmental conditions, like heat and drought. This could lead to adverse effects during fixation (freeze substitution) of the sample, even though we did not observe any sign of deficient sample fixation in dauer larvae. Divergent synaptic vesicle sizes have already been found in NMJs of other organisms, like rats [33] and zebrafish [23]. For zebrafish a dependency of vesicle size on the maturation stage of NMJs was assumed that might be related to the functional transition of these synapses to control different swimming behavior [23]. Differences in synaptic vesicle sizes during normal development have not been observed for C. elegans, to our knowledge. However, divergent vesicle sizes could also be related to different moving behavior of dauer larvae that are well known for their unique nictation behavior [9]. Since manual measurements are often inaccurate, smaller deviations in vesicle sizes are only observable in extensive analysis, therefore automated approaches are inevitable. Our previously published macro 3D ART VeSElecT proved to be successful for distinguishing vesicle size differences [22]. This approach in combination with our here presented extended version for automated classification may be the key to reveal so far undistinguishable differences in the future.
Interestingly, although nervous cells in dauer larvae are typically smaller, segregation of DCVs and CCVs is more distinct. Our interpretation is that DCVs in dauer larvae locate further away from the active zone on the rim of the synaptic vesicle pool compared to young adult hermaphrodites. Differences in this characteristic of synaptic architecture could be explained by a shift towards a higher ratio of DCVs in a non-fusion competent resting state in dauer compared to young adult hermaphrodites. Moreover, the location of DCVs closer to the cell margin in dauer compared to young adult hermaphrodites could indicate the need for faster peptide release, and their increased number could compensate for lower volume.
For automated classification of vesicles, different machine learning algorithms were tested. The SVM showed best results in comparison to random forest or k-nearest neighbor classification. Furthermore, the SVM works regardless of dimensionality, can classify non-linearly separable classes by using the kernel trick, enables an easy integration of the trained linear classifier into the macro and generally can be used for assignment into more than two classes [34].
Cross-validation showed very good results for automated classification of vesicles. Interestingly, dauer larval tomograms showed better results than young adult hermaphrodite tomograms, which could be related to the differing gray value intensities of the tomograms. Hence high contrast vesicles were easier to automatically recognize and showed more distinct results for GVSD and gv. Another reason could be the more pronounced positional segregation of the two vesicle types for dauer larvae.
We suggest that our pre-trained SVM can be used for DCV and CCV classification in other, similar tomograms, and provide the Fiji classification macro including the trained parameters. In addition, users can train an SVM on their own data using the python script we provide, and integrate the parameters into the Fiji macro, which is quickly achievable via an input dialog.
Our ImageJ classification macro and the python script, for retraining of the classifier, a tutorial and test datasets are available at: https://www.biozentrum.uni-wuerzburg.de/bioinfo/ computing/3dart-veselect/.

Conclusion
We developed a new automated approach to classify vesicles and to quantify their properties from electron tomograms. We combined machine learning with an extension of our previously developed vesicle segmentation workflow 3D ART VeSElecT to reliably distinguish clear core from dense core vesicles using image-based features. We apply this method to electron tomograms of C. elegans NMJs in young adult hermaphrodites and dauer larvae. Using our ground truth data, we find an increased fraction of dense core vesicles (~16% vs~9%) in tomograms matching certain characteristics (cholinergic synapses with > 100 vesicles of a central slice through the AZ area), as well as significantly reduced vesicle size and a higher distance of dense core vesicles to the active zone in dauer larvae compared to young adult hermaphrodites. Our approach is not limited to C. elegans and can be easily adapted to study differences in synaptic and other vesicles in electron tomograms in various settings and many other model organisms, e.g. in Danio rerio [22].

Electron microscopy
Animals. C. elegans wild-type (Bristol N2) were used and maintained using standard methods [4]. For the experiments, three different young adult hermaphrodite worms and two different dauer larvae were used.
Purification of dauer larvae. For segregation of dauer larvae from other states of the C. elegans life cycle, treatment with sodium dodecyl sulfate (SDS) buffer was used [35]. Firstly, worms were washed from the agar plates using M9 medium [36]. The worm solution was then transferred into 50 ml tubes and centrifuged for 5 min at 2000g at 4˚C. Afterwards, the supernatant was removed and the pellet was dissolved in 50 ml 1% SDS solution and incubated on a shaker for 15 min. SDS was then removed by centrifuging (2000g, 5 min, 4˚C) and subsequent discarding of the supernatant while retaining 5 ml of solution. The tube was then refilled with distilled water and the procedure repeated 2-3 times. For separation of dauer larvae from dead material, the solution was pipetted into a pit on an unseeded agar plate. Once the solution had dried, living dauer larvae were able to crawl away from the pit. After 1-3 hours, emigrated worms were washed off with M9 medium.
High-pressure freezing, freeze-substitution and electron tomography. HPF, FS and electron tomography were performed as previously described ( [19], [22]). Tomograms of young adult hermaphrodites were from three animals in three different HPF experiments, and Dauer tomograms were from two animals in a single HPF experiment. Pixel size during acquisition and after reconstruction was between 0.6 and 0.8 nm, and sample thickness was between 150 and 250 nm. All tomograms were dual tilt series and were recorded from at least +65˚to -65˚but not exceeding +70˚to -70˚.

Improvements to 3D ART VeSElecT
We extended our tool 3D ART VeSElecT to calculate various image features of detected vesicles to be used for classification. The workflow is shown in S1 Fig, with changes compared to the previous version highlighted by a black box. The overall workflow of 3D ART VeSElecT remained as previously published [22]. One change in the preprocessing step is the determination of the radius (mean distance between object center and surface voxels in 3D) instead of volume, since the radius or diameter is typically used when comparing vesicle size. Furthermore, automated calculation of the 3D cell volume was added. Since semi-automated selection of the cell outline by the user was already included in the macro, this selection is now basis for calculating the analyzed volume in 3D (which corresponds to cell volume if the selection is taken along the cell membrane, as described in the 3D ART VeSElecT tutorial).

Classification macro
The workflow of the new Fiji macro for classification of DCV and CCV in electron tomograms of synapses comprises four main steps: 1) preparation, 2) reading out features, 3) processing of features followed by labelling of vesicles, and 4) output. In the preparation step, StackDuplica-teScale and StackSegmented, which result from the 3D ART VeSElecT recognition macro, are read into the classification macro. StackDuplicateScale is the scaled original tomogram, Stack-Segmented contains the registered vesicles. First, the macro opens a preparation window for changing the names of read-in stacks (StackDuplicateScale and StackSegmented) and setting parameters for standardization and adaption of SVM weights. In case of adaptation of the SVM algorithm to another user's needs, values can easily be changed through the input window. In a second step, four characterizing vesicle features are computed from the images: radius, gv, GVSD and distAZ. Radius is determined as the mean of the distance between vesicle center and surface voxels, and gray value (gv) is the mean of all voxels, both excluding the vesicle membrane as described in [22]. GVSD is a value for standard deviation of gray values in the central slice of each vesicle, since CCVs have a higher gv variance than DCV. It is calculated on a blurred version of the image volume to enhance the contribution of dark membrane voxels (Fig 3) and uses only the central slice of the vesicle which has the highest contrast, whereas r and gv both are determined in 3D. The distance to the active zone (distAZ) was determined by taking the Euclidean distance in 3D from the center of the vesicle to the manually annotated center of the AZ. In the third step, features are processed, which includes applying an offset to gv (if the darkest vesicle is lighter than a threshold), standardization of parameters, weighting features based on SVM weights, and finally the assignment of labels according to the identified class (CCV = magenta, DCV = green). In the last step, the output is created, including a logfile and visual output shown in new windows. The logfile includes all measured characteristics (e.g. features, classification labels (CCV = 1, DCV = 0), number of vesicles in each class). The visual output shows, besides the input stacks ("StackSegmented" and "StackDuplicateScale"), an additional image stack, named "label" which corresponds to the StackSegmented but displays color-coded vesicles according to their classification. Furthermore a Composit stack, named "CompositLabel" (overlay of original tomogram and the "label" stack) is created.

Comparison of different types of machine learning classifiers
We compared a support vector machine (SVM), random forest (RF, using t = 10, t = 1500; t indicates the number of trees) and k nearest neighbor (KNN, k = 10 neighbors) for their performance on differentiating between CCVs and DCVs based on four image-based characteristics: radius (r), mean gray value (gv), standard deviation of the gray value (GVSD) and distance to the active zone (distAZ). Random forest uses decision trees to classify samples: Comparable to a flowchart, samples are divided based on their features. A random forest utilizes the majority classification of t decision trees each grown on randomly selected training data and split using the best randomly selected feature. Generalization error is lower than for individual decision trees. Performance, but also runtime increases with t. KNN classifies samples based on the majority of their k nearest training samples in feature space. Linear SVM (C = 1) showed the best average classification results (see Table 2). C determines the penalty for falsely classified training data points and therefore indicates bias-variance trade-off, since bias decreases with C [37]. The SVM will be described more accurately below. Parameters of the SVM (Table 3) were determined using all input tomograms as training data and transferred to a Fiji-Macro (Fig 3) to enable classification downstream of the 3D ART VeSElecT registration macro.

Support vector machine (SVM)
A linear support-vector machine (SVM) divides two classes by a hyperplane. As such, the SVM is a binary classifier dealing with linear classification problems. A one-versus-all approach and kernelization can extend classification to non-linear problems and into more than two classes. Using a nonlinear radial basis function kernel (RBF) did not improve SVM performance, see results. With the aim of minimizing generalization error, the optimization objective of the SVM is to maximize the margin between the hyperplane and the closest data points of the two classes. Those eponymous support-vectors lie on the positive and negative hyperplane, which are equidistant to the separating hyperplane [37]. As the linear SVM is parametric, the equation of the separating hyperplane can be transferred independently of the training-data (see Results, Table 3 and Eq 1), with xi being the features and ω i the assigned feature-weights. All data points with P n i¼1 w i x 00 i > 0 are assigned to class 1 (CCV), whereas all data points with P n i¼1 w i x 00 i < 0 are assigned to class -1 (DCV). applies classifiers from the scikit-learn library [29]. The manual labels are transformed into a feature array: Particles marked "D" (DCV) are relabeled "-1". Particles marked "E" (error) are removed. All other markings are relabeled "1", including "C" (CCV) and "N" (not determinable), since non-determinable particles show more similarities to CCV (S2 Fig).

Implementation of machine learning algorithms
For random forest, we used the RandomForestClassifier in scikit-learn. Subsets of the same size as the original training set were drawn with replacement (bootstrap), resulting in about 2/ 3 of the training set used to build each tree. Features were randomly permuted at each split. Voting was weighted by probability estimates, such that the predicted class was the one with highest mean probability across all trees. For SVM, we used the SVC classifier in scikit-learn with a linear or radial basis function (RBF) kernel, and one-vs.-rest decision function. For KNN, we used the KNeighborsClassifier in scikit-learn with 10 neighbors, uniform weights, and Euclidean distance metric (p = 2).
Classifiers are evaluated by cross-validation: Data from one CSV file are used as validation data, all other files are concatenated and used as training data (there are as many training-validation combinations as input CSV files). For each combination, features are standardized using mean and standard deviation (sd) of the training data and passed to the classifiers, namely linear SVM, random forest and KNN. An evaluation function determines accuracy of all vesicles and precision, recall and F-score of DCV only (see Eqs (2)-(4)). Evaluation results are returned as pandas.DataFrame [38] and as CSV file. Equations: tp: true positive; fp: false positive; fn: false negative. The runtime of the training script for the different classifiers was 5 s for t = 10 trees and 140 s for t = 1500 trees. The ImageJ classification macro takes 33s for the calculation, plus the time required to manually click on the active zone. All computations were performed on a 16-core AMD Opteron 1.7 GHz server with 128 GB RAM running Ubuntu Linux 16.04. In comparison, manual classification of vesicles takes an experienced user about 1 h per tomogram. The time required to set up the python environment and prepare the training data depends on the level of experience of the user, but overall there should still be at least a factor of 10 (one order of magnitude) of speed-up with reasonably large data sets of a few tomograms, as in our case. In addition, automated workflows as reported here increase reproducibility compared to manual assignment.  [22], with changes framed in black. The macro allows the segmentation of vesicles and includes steps for the preprocessing, separation of the foreground, removal of interfering components and segmentation of individual vesicles. 1) Preprocessing: The pixel per nm resolution for the image-stack is set by user input. To allow a reproducible routine for different image resolutions and the determination of real vesicle sizes the image-stack is scaled to 1 nm/pixel in x-and y-direction using bilateral scaling. For the z-dimension, no rescaling was applied and the distance between the stack images was set to the actual voxel depth, in order to maintain the proper spatial dimensions without generating additional interpolated images. Afterwards the contrast is enhanced via histogram stretching, and the stack is then smoothed with a mean filter (radius = 3.45 px) to lower the image noise. Additionally, a 2D bilateral filter is applied for noise-reduction and edge-preservation. 2) Separation of foreground: Thresholding based on mean brightness is performed for every stack image individually. Holes within the vesicles are filled in an additional processing step. 3) Segmentation: In order to extract the round vesicles, a watershed segmentation is performed in a two-step process: first a 2D segmentation is done, then a 3D Distance map is created and smoothed with a 3D Gaussian blur of 3.45 in order to remove artifacts on the edges. Based on the created distance map, a 3D watershed segmentation is performed, which results in a map of separated particles with different colors. In a final step all particles are registered with the 3D RoiManager, and particles that don't match predefined characteristics (volume, sphericity and ellipse fitting) are automatically removed before the 4) output of the registration marks the end of the workflow. In the yellow box only one small change was made: instead of volume, the inner radius is now used for size determination. Since it is known that DCVs appear in higher concentration at the rim of the AZ [19], we compared the cell size of each neuromuscular junction (NMJ) in x and y direction in the first and last picture of each image stack. To obtain the overall cell size change, we added both differences up. This measurement gives a clue of how centrally the NMJ has been cut. For a cut near the center of the NMJ, we expect a very low cell size change rate, whereas for a more remote cut through the AZ (although they demonstrably include dense projections) we expect a higher change of the cell size since synapses tend to narrow at regions far-off the AZ. Tomograms of adult hermaphrodites are plotted in dark red. Results of dauer larvae are plotted in teal green. Tomograms of the selected set (see S1 Table, only putative cholinergic synapses with > 100 vesicles and apparent dense projections) are visualized as circles, NMJs with divergent characteristics are shown as different symbols (see legend). This shows that the number of DCVs included in a cell seems to be proportional to the cell size change. As an example, 3 of the dauer larvae tomograms show a DCV ratio of 18-21% but also have a high change of cell size. Hence, although apparent dense projections can be seen in the tomograms, the sample also includes decentral regions, were higher DCV abundance is expected. (TIF) First four columns are the four features, the fifth column is the manual labels. The python script rewrites manual labels into a binary label-array: "D" (DCV) is rewritten to -1, "E" (error) is removed, and everything else is rewritten as "1" (CCV). (TIF) S1 Table. Overview of the final results for each tomogram. All selected tomograms (> 100 vesicles, cholinergic, central cut) are marked in yellow. Other tomograms are colored after exclusion criterion in light gray (< 100 vesicles), dark gray (decentral cut plus < 100 vesicles) or blue (GABAergic NMJ). (TIF) S2 Table. Systematic comparison of classifier performance over a wide range of input parameters. For the random forest classifier, we use different values for the number of trees, apply the Gini criterion, no bootstrapping, and different feature numbers. For KNN, we vary the number of neighbors, the algorithm, and distance metric. For the linear SVM, we applied a one-vs-one decision function and probability estimates. For the nonlinear SVM, we tried radial basis function (RBF), polynomial and sigmoidal kernels. All other parameters were identical to those reported in the Methods section. (DOCX) S1 Text. Vesicle classification by thresholding a combination of features. Here we describe why classifying vesicles by manually thresholding a linear combination of features is not sufficient and why a machine learning approach was used instead.