Ensemble Tractography

Tractography uses diffusion MRI to estimate the trajectory and cortical projection zones of white matter fascicles in the living human brain. There are many different tractography algorithms and each requires the user to set several parameters, such as curvature threshold. Choosing a single algorithm with specific parameters poses two challenges. First, different algorithms and parameter values produce different results. Second, the optimal choice of algorithm and parameter value may differ between different white matter regions or different fascicles, subjects, and acquisition parameters. We propose using ensemble methods to reduce algorithm and parameter dependencies. To do so we separate the processes of fascicle generation and evaluation. Specifically, we analyze the value of creating optimized connectomes by systematically combining candidate streamlines from an ensemble of algorithms (deterministic and probabilistic) and systematically varying parameters (curvature and stopping criterion). The ensemble approach leads to optimized connectomes that provide better cross-validated prediction error of the diffusion MRI data than optimized connectomes generated using a single-algorithm or parameter set. Furthermore, the ensemble approach produces connectomes that contain both short- and long-range fascicles, whereas single-parameter connectomes are biased towards one or the other. In summary, a systematic ensemble tractography approach can produce connectomes that are superior to standard single parameter estimates both for predicting the diffusion measurements and estimating white matter fascicles.


Introduction
Tractography uses diffusion-weighted magnetic resonance imaging (diffusion MRI) data to identify specific white matter fascicles as well as the connections these fascicles make between cortical regions [1][2][3][4][5][6]. Specifying the pattern of connections between brain regions ("connectome") is a fundamental goal of neuroscience [7][8][9]. One of the major goals of tractography is to establish a model of the complete collection of white matter tracts and connections ("structural connectome", also referred to as "tractogram") in the human brain. Hereafter, we refer to structural connectomes estimated using tractography as "connectomes" or "connectome models".
For any tractography method, investigators must set parameter values. Key tractography parameters include the maximum and minimum streamline length, seed selection, the stopping criteria for terminating a streamline, and the minimum radius of curvature allowed for building each streamline. Differences in parameter values yield differences in streamlines [32][33][34][35][36][37][38][39]. The parameter dependency of tractography has been observed in both local and global tractography algorithms [34].
In common practice, investigators choose an algorithm and set fixed parameter values in the hope of optimizing streamlines for general use. However, recent studies [40,41] demonstrated that no algorithm or parameter values are optimal across all conditions. Specifically, Chamberland and colleagues [41] show that the best choice depends on a variety of factors such as the specific region of white matter or the specific tract being studied. For example, Fig 1 compares two tracts and shows how the best parameter value differs. Tracts between nearby regions on the cortical surface have short association fibers with relatively high curvature (U-fiber; left panels in Fig 1). To identify U-fibers investigators must set parameters that allow tracts with high curvature (top panels in Fig 1). In contrast, the major fascicles of the brain, such as the Inferior Longitudinal Fasciculus (ILF) or the Superior Longitudinal Fasciculus (SLF), have relatively long and straight cores. Better estimates of the core of these tracts are obtained by sampling streamlines with relatively low curvature (middle panels in Fig 1). Additional factors affecting the optimal parameter choice for streamline generation may include diffusion MRI acquisition parameters (e.g., b-value, voxel size and angular resolution). In general, no single parameter value may capture the full range of streamlines globally in every brain.
In the machine learning and statistical classification literature, it has been shown that for large and heterogeneous data sets combining multiple types of classifiers improves performance over single classifier methods (Ensemble methods [42][43][44], see [45] for a review). The human white matter provides similar challenges, because it contains large sets of heterogeneous fascicles different in length, volume and curvature. Given the complexity of human white matter, ensemble methods incorporating a range of tractography algorithms and parameters may be a valuable approach for improving tractography performance. The idea of incorporating tracts from multiple sources in the initial construction of a connectome has been suggested in earlier publications [27,31].
We describe an ensemble method, which we call Ensemble Tractography (ET), to reduce problems arising from single algorithm and parameter selection. We illustrate the method with an example that addresses the parameter selection problem. First, we create a set of connectomes, each generated using a different parameter setting. These are called single parameter connectomes (SPCs). We then combine streamlines from multiple SPCs into a new candidate connectome, and we use Linear Fascicle Evaluation (LiFE [46]) to optimize this connectome and eliminate redundant streamlines. We call the result the Ensemble Tractography Connectome (ETC). We report two key findings. ETCs (1) include streamlines that span a wider range of curvatures as compared to any of the SPCs, including both short- and long-range fibers (bottom panel in Fig 1), and (2) ETCs predict the diffusion signal more accurately than any SPC.
Fig 1. Short- and long-range fascicles supported by different parameter selections. The two columns compare short-range fascicles (left, U-fiber) connecting V3A/B and V3d and long-range fascicles (right, the inferior longitudinal fasciculus; ILF) segmented from different connectome models. The images show extremely different estimates using a low minimum radius of curvature threshold (a, 0.25 mm) and high threshold (b, 2 mm). a. The 0.25 mm results show a dense set of short-range fascicles, but a thin set of long-range fascicles. b. Conversely the 2 mm results show sparse short-range fascicles and dense long-range fascicles. c. Ensemble Tractography generates connectomes including both short- and long-range fascicles. Streamline colors in c indicate different parameter settings used to generate the streamlines (blue, 0.25 mm; green, 0.5 mm; red, 1 mm; yellow, 2 mm; light blue, 4 mm). Results are shown from one left hemisphere (subject 1, STN96 data set; see Material and Methods).
To support reproducible research, the algorithm implementation and example data sets are made available at an open website (http://purl.stanford.edu/qw092zb0881).

Results
We evaluated ET with respect to one key parameter, the streamline curvature threshold. Here we describe an example ET architecture, and in S1 Text (Section 5), we discuss alternative architectures. Fig 2 shows the schematic flowchart of the example ET architecture. We analyzed ET using diffusion data from 10 hemispheres. In each hemisphere, we generated five candidate SPCs (minimum radius of curvature = 0.25, 0.5, 1, 2 and 4 mm [18]). Each SPC candidate was composed of 160,000 streamlines. We combined SPC streamlines to create a candidate ensemble connectome. Finally, we used LiFE to optimize the candidate ETC. Below we compare the properties of each of the five optimized SPCs with the optimized ETC.
The images in the bottom panels of Fig 1 show the streamlines in the optimized ETC. The ETC model includes many U-fiber streamlines, similar to the 0.25 mm SPC. The estimated ILF contains the same branching pattern that extends into the occipital lobe as the 2 mm SPC. The color of each ETC streamline indicates its SPC origin. The ETC estimates of the U-fibers include streamlines mainly from the SPC that permits high curvature (0.25 mm). The optimized ILF includes streamlines mainly from SPCs with lower curvature (1 to 4 mm). The ETC includes streamlines from all of the SPCs.

The curvature parameter is not only a bound
Nominally, the curvature parameter is a bound: streamlines should not have higher curvature than the specified level [18]. In practice, however, we find that the bound impacts many properties of the candidate connectome.
We illustrate the effect of the curvature threshold on each SPC in the occipital white matter of the 10 hemispheres in STN96 dataset (Fig 3; see Materials and Methods; S2 Fig depicts white matter regions used for the analysis). For each of the bounds we tested, the candidate and optimized connectome curvatures form compact, single-peaked distributions; the peak increases monotonically as the minimum radius of curvature increases (see S3 Fig for distribution in candidate connectomes). When the curvature bound is high (small radius of curvature), the candidate connectome streamlines tend to have a relatively high mean curvature. When the curvature bound is low (high radius of curvature), the candidate connectome tends to have a relatively low mean curvature.
Thus, the curvature parameter is not simply a threshold; it influences the distribution of streamline curvatures in the optimized and candidate connectomes. For this reason, setting a lenient bound on the curvature (i.e., a low value of the minimum radius of curvature) does not yield a good representation of long, straight fascicles (Fig 1). Conversely, setting a strict bound on the curvature (i.e., a high value of the minimum radius of curvature) eliminates U-fibers from the candidate connectome. We confirmed that the lenient bound on the curvature does not produce many straight streamlines using another tractography algorithm implemented in a different software package (PICo [11]; S4 Fig, S1 Text, Section 1).
To reduce the curvature bias present in each SPC, the candidate connectome for the ETC combines samples from multiple SPCs whose parameters span a significant curvature range (thick orange line; Fig 3). Hence, the ETC strategy is effective in the sense that ETCs include streamlines with a broader range of curvatures. The optimized ETC includes more streamlines than any of the optimized SPCs (Fig 4a). Importantly, nearly twice as many streamlines from the candidate ETC survive the LiFE process and contribute to the diffusion signal predictions.
Typically streamlines generated using whole brain tractography do not pass through all of the voxels in the white matter. For very simple algorithms, such as deterministic tracking based on diffusion tensors [10], as many as 17% of the white matter voxels contain no streamlines (see S8c Fig). We show that ETC streamlines pass through a larger percentage of white matter voxels than any of the individual SPCs (Fig 4b). The streamlines in SPCs (based on CSD and probabilistic tractography methods [18]) cover up to 95% of the white matter, whereas streamlines in the ETC cover up to 98% of the white matter. Because in reality the entire white matter volume contains streamlines, this result suggests that ET recovers more information from the diffusion data. The failure to find streamlines in about 2% of the voxels shows that we continue to miss some fascicles.
While the number of ETC streamlines is nearly twice the number in any SPC, the white matter coverage is only about 3 percent greater. It follows that the number of streamlines per white matter voxel in the ETC is larger than the number in any of the SPCs. Whereas the mean number of streamlines per voxel in the SPCs is around 13, the mean in the ETC is nearly 18. Fig 4c shows a histogram that counts the number of streamlines in each voxel, comparing the 2 mm SPC and the ETC. Notice that many of the voxels (77.9% of voxels on average) have more streamlines in the ETC.
The larger number of streamlines within each voxel implies that the ETC streamlines can predict more complex diffusion orientation distribution functions. S5 Fig shows an example crossing-fascicle voxel in which the ETC predicts the diffusion signal significantly better than the SPC. This is because each streamline can point in a slightly different direction and thus potentially predict diffusion in more directions. Coupled with the greater coverage across white matter voxels, the ETC should be able to provide a better prediction of the diffusion signal.
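The two quantities compared here, white matter coverage and streamlines per voxel, can be illustrated with a minimal sketch. The function name `coverage_stats`, the representation of streamlines as lists of integer voxel coordinates, and the choice to average density over covered voxels only are assumptions made for illustration; the text does not state how the averages were computed.

```python
from collections import Counter

def coverage_stats(streamlines, white_matter_voxels):
    """Return (fraction of WM voxels touched, mean streamlines per touched voxel)."""
    counts = Counter()
    for nodes in streamlines:
        # Count each streamline once per voxel, even if several nodes fall in it.
        for voxel in set(nodes):
            if voxel in white_matter_voxels:
                counts[voxel] += 1
    covered = len(counts)
    coverage = covered / len(white_matter_voxels)
    mean_density = sum(counts.values()) / covered if covered else 0.0
    return coverage, mean_density

# Toy example (hypothetical data): a 1 x 1 x 5 strip of white matter
# and three short streamlines.
wm = {(0, 0, z) for z in range(5)}
toy = [
    [(0, 0, 0), (0, 0, 1), (0, 0, 1)],  # two nodes in the same voxel
    [(0, 0, 1), (0, 0, 2)],
    [(0, 0, 2), (0, 0, 3)],
]
coverage, density = coverage_stats(toy, wm)
print(coverage, density)
```

In this toy case four of the five voxels are touched, and adding more streamlines to already-covered voxels would raise the density without changing the coverage, which is the pattern reported for the ETC.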

ETC connectome accuracy
Next, we compare SPC and ETC connectome accuracy (Fig 5). Accuracy is evaluated as the ratio of root mean square error between model and data to the test-retest reliability (R rmse [46][47][48]; see Eq 3 in Materials and Methods). A further analysis tests whether increasing the size of the candidate SPC reduces the advantage of the ETC over the SPC (see S1 Text, Section 2). In this comparison, we matched the size of the candidate SPC to that of the ETC (800,000 streamlines; BigSPC model; see S1 Text, Section 2). The optimized BigSPC supports as many streamlines as the ETC (S6b Fig).

The optimal parameters vary between white matter pathways
The best parameter value differs between pathways (Fig 1). We compared the accuracy of six connectome models in the voxels defined by the best U-fiber (Fig 6a, left, ETC U-fiber) and ILF (Fig 6b, left, ETC ILF) within the same hemisphere of the same subject. Among the SPC models, the 0.25 mm curvature threshold produces the best performance in the U-fiber voxels, whereas the 4 mm SPC performs better than the others in the ILF voxels (Fig 6b). This shows that the best SPC differs between white matter pathways and brain volumes. In both the U-fiber and the ILF, the ETC model performs similarly to or better than the best SPC model (Fig 6).

ETC performance evaluated in the total white matter volume
Testing the ETC performance in the total white matter volume is computationally demanding, because the matrix size in LiFE increases with ET (see the recent paper [49] for the computational load of LiFE). For example, if we combine five whole-brain SPCs of 2 million streamlines each, the candidate ETC size is 10 million streamlines. In order to generate a whole-brain ETC model, we used the ETC-preselection method (see S1 Text, Section 5). Briefly, we selected the streamlines with the highest weight (those contributing most to predicting the diffusion signal) from each SPC to build the candidate ETC. This ETC-preselection method reduces the size of the candidate ETC, but still produces better prediction accuracy than any SPC (S10 Fig).
Using the ETC-preselection method, we optimized the whole-brain ETCs in five brains (Fig 7). We compared properties of the preselected ETC with those of the SPCs. Consistent with results in occipital white matter (Figs 4 and 5), the whole-brain ETC supports a larger number of streamlines (Fig 7a), covers a larger portion of white matter (Fig 7b) and predicts the diffusion signal better than any of the SPCs (Fig 7c). Fig 7d shows maps of measured (Data 1 and 2) and predicted diffusion signal for a single diffusion direction using two connectome models (SPC 0.25 mm and ETC with preselection). The result suggests that the ET approach is also effective for whole-brain connectome analysis.

Robustness across datasets
We evaluated ET also using data from the Human Connectome Project (HCP90 [50]; see Materials and Methods). Consistent with results obtained on the STN96 data set, ET included a wider range of curvatures (S7b

Ensemble tractography across different algorithms and parameters
In addition to the ET method described above, we also used the ET method to create candidate connectomes that include streamlines from different algorithms (Tensor deterministic, CSD deterministic and CSD probabilistic in MRtrix [18]; see S1 Text, Section 3). The optimized connectomes from the ensemble of these algorithms had better prediction accuracy, higher streamline counts and greater white matter coverage (S8 Fig). We also observed that the ETC generated using an ensemble of Fiber Orientation Distribution (FOD) amplitude cutoff parameters had better prediction accuracy as compared with SPCs (S9 Fig; S1 Text, Section 4). Hence, we find substantial evidence across different diffusion datasets, tractography methods and parameter sets that ET improves the connectome model.
The ET method reduces the parameter and algorithm dependency by creating candidate connectomes whose tracts are generated using a range of parameters and algorithms. We illustrated ET for the case of sweeping out the curvature parameter in the MRtrix algorithm. We show that any single choice of the curvature parameter biases the distribution of candidate streamlines (Figs 3, S3, S4, and S7b), and that different parameter values are better suited for different types of fascicles (Figs 1 and 6). The candidate connectome is created as an ensemble, and the LiFE method is used to select an optimized connectome from the ensemble candidate connectome.
We have three principal findings. First, the optimized ensemble tractography connectome predicts diffusion signals better than any tested single parameter connectome. Second, the ensemble tractography connectome includes more unique streamlines and generates a denser representation than any single parameter connectome. Third, the ensemble tractography connectome includes streamlines with different degrees of curvature and length, and represents valuable anatomical features of the human white matter such as long- and short-range fibers.
Fig 7. Whole brain ETC performance. a. Optimized connectome size of SPCs and ETC with preselection (ETCpre; see S1 Text, Section 5) using whole-brain white matter. b. White matter coverage. c. Comparison of R rmse across connectome models covering whole-brain. Error bar depicts ±1 s.e.m. across five individual brains. Conventions are identical to those in Fig 4. d. Maps of measured and predicted diffusion signal in a typical coronal brain slice for a single diffusion direction (subject 1, STN96 dataset). Colors indicate the normalized anisotropic diffusion signal for a single diffusion direction (red: higher signal, blue: lower signal). We plot the measured diffusion signal from two independent sessions as well as the diffusion signal prediction from two connectome models (SPC 0.25 mm and ETCpre).

Alternative ET architectures
There is an enormous space of possible methods for creating candidate ETCs. The method for creating ensembles will need to evolve over many experiments from different laboratories. This paper presents one simple ET architecture that we found to be effective and efficient: simply adding all streamlines from each parameter setting and optimizing the ETC. One of the disadvantages of the ETC method presented in this paper is the computational demand required in building large candidate sets. In the following we discuss alternative architectures that we considered. S1 Text (Section 5) proposes one alternative ET method, ETC-preselection. In this method, we chose the 20% of streamlines contributing most to the diffusion signal prediction from each of the individually optimized SPCs to build a new candidate ETC. The advantage of this method is that the size of the new candidate ETC becomes equal to that of the original candidate SPCs. The disadvantage of this method is that we must evaluate (using LiFE) each SPC individually and also the ETC. Our results show that ETC-preselection performs significantly better than SPCs, and only slightly worse than ETC without preselection (S10 Fig). Preselection is particularly useful for whole-brain models including large streamline sets (Fig 7), but not necessarily the best for connectome models with smaller size.
As it is impossible to evaluate all possible ET algorithms in an initial paper, we describe the method and provide an open-source implementation (francopestilli.github.io/life/; github.com/brain-life/life/) to the community for exploration of the many possible options.
Bastiani and colleagues [34] analyzed how parameter and tractography algorithms influence connectomes and network properties. Their paper and others motivate the need for a means of deciding which solutions are best supported by the data [46,51-55] (see also [56]). Several other groups also noted that the best parameter differs between different white matter pathways [40,41].
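The preselection step can be sketched as follows. This is a minimal illustration, not the published implementation: the function names, the use of plain Python lists for the streamline weights, and the toy weight values are all hypothetical. It assumes that "contributing most" means ranking streamlines by their LiFE weight within each SPC.

```python
def preselect(weights, fraction=0.2):
    """Indices of the top `fraction` of streamlines, ranked by weight."""
    n_keep = int(round(len(weights) * fraction))
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:n_keep])

def build_preselected_candidate(spc_weights, fraction=0.2):
    """Pool the preselected streamlines as (spc_label, streamline_index) pairs."""
    candidate = []
    for label, weights in spc_weights.items():
        candidate.extend((label, i) for i in preselect(weights, fraction))
    return candidate

# Toy weights (hypothetical) for five SPCs with 10 candidate streamlines each.
spc_weights = {
    radius: [((7 * i + k) % 10) / 10 for i in range(10)]
    for k, radius in enumerate(["0.25", "0.5", "1", "2", "4"])
}
candidate = build_preselected_candidate(spc_weights)
print(len(candidate))  # 5 SPCs x 2 streamlines each
```

With five SPCs and a 20% fraction, the pooled candidate has the same number of streamlines as one original candidate SPC, which is the size property described above.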
BlueMatter [27] used streamlines generated by three different algorithms (STT [20], TEND [21], ConTrack [16]) to create a candidate connectome. An important difference is that the BlueMatter algorithm could only be run on a supercomputer (BlueGene/L, a 2048-processor supercomputer with 0.5 TB of memory), while the current ET algorithm using LiFE runs on a personal computer [49]. This advance enables investigators to systematically combine streamlines from many different parameters and algorithms and adopt ensemble tractography into their daily work flow. This paper is the first systematic exploration to sweep out several key parameters (curvature, stopping criterion) in tractography and demonstrate the advantage of ensemble methods in terms of anatomy (Fig 1) and prediction accuracy for diffusion signal (Figs 5 and 7).
A number of groups compared tractography with an independent measurement, such as invasive tract tracing or manganese enhanced MRI in macaques or mice [39,40,57-60]. For example, Thomas et al. [22] collected a diffusion data set in one macaque and compared the results of several single parameter connectomes with tracer measurements from a different macaque. This comparison has several limitations. First, the tracer measurements depend upon factors including the tracer type (e.g., anterograde or retrograde) and the selection of planes and injection sites; hence, they can differ substantially (e.g. [61,62]). When the methods disagree, it is often best to assemble a conclusion from multiple studies. Second, comparisons in a particular data set do not guarantee validation in a different experiment. For example, we cannot use high-resolution human adult brain fMRI data acquired in a 7T scanner to support conclusions made from lower resolution fMRI data in children acquired using a 1.5T scanner. Each methodology requires means for stating both the conclusions and the strength of the support for those conclusions. It is best to integrate fully justified findings derived by a variety of methods rather than discarding one method or another.
Others have proposed to evaluate tractography by defining ground truth using synthetic phantoms [31,63-66]. Some investigators have pointed out the logical limitations of this approach [5]. We agree that there are limitations to using phantoms for testing tractography but that in some cases synthetic phantoms can be valuable for analyzing computational methods. Unfortunately, none of the currently available phantoms can be used for our current work. This is because most phantoms have been generated using either single tractography parameters [67] or simple fiber configurations [63]. Close and colleagues [68] provide software for generating numerical phantoms that can simulate complex fiber organization. However, their method was not proposed to evaluate tractography performance by comparison with ground truth. This makes it impossible for us to use the current phantoms to test whether multiple-tractography approaches such as ET are superior at resolving multiple types of fiber configurations simultaneously.
The potential value of creating connectomes from a collection of tractography methods was mentioned by both Sherbondy et al. [27] and Lemkaddem et al. [31]. Here, we provide a specific, open-source, implementation, and we begin a systematic analysis of this methodology. The analyses show that ET based on sweeping out the curvature parameter has the specific benefit of creating connectomes with both short-and long-range fascicles. In addition, the ET method produces more fascicles, larger coverage, and a better cross-validated prediction error.

Future work
In this paper, we described the advantage of combining multiple tractography parameters and algorithms in order to improve the accuracy of connectome models. We used several example parameters and algorithms as targets for ET applications, and there are likely to be other beneficial combinations of algorithms and parameters, which will be tested in future work. For example, we could combine connectomes by sweeping out two different parameters, or combine connectomes generated by different software packages that implement different algorithms, or combine connectomes generated using the different seeding strategies tested in the literature [38,65,69]. Although it is impossible to test every pattern of combinations in this paper, we made the LiFE software open (http://francopestilli.github.io/life/; https://github.com/brain-life/life/) to help other researchers test different ET architectures. Future studies by multiple research groups will clarify the optimal ET architecture in both model accuracy and computational efficiency.
Current tractography uses a fixed set of parameters to generate each streamline. However, several fascicles, such as many within the optic radiation, include both curving and straight sections [71][72][73][74]. When this is known a priori, it may be more accurate to change the tractography parameter along a single fascicle, allowing high and low curvature in the relevant portions of the tract. LiFE and ET will provide the opportunity to evaluate the model accuracy of new tractography tools in terms of the prediction accuracy on the diffusion signal.

Extending the range of tractography
It is widely agreed that diffusion MRI contributes useful information about the large and long-range fasciculi in the human brain [75][76][77][78]. Meanwhile, the existence of the U-fiber system has been supported [79,80], but not extensively studied in the literature, presumably because of the limitations in tractography parameter selections. The optimized ETCs extend tractography to include both long- and short-range fascicles in a single connectome, improving on the optimized SPCs which include one or the other. The higher model accuracy and the inclusion of both short- and long-range fibers support the conclusion that the optimized ETC improves on any SPC. The preliminary ET results are encouraging, but they will surely benefit from further optimization.
Tracer studies are not well-suited to identifying long-range pathways in the human brain. Even in animal models, with more than a century of history, recent tracer measurements challenge conventional thinking about long-range pathways. Reports describing many new found projections demonstrate that the field is active and evolving [62,81,82].
The progress in human tractography complements the strengths of tracer studies in animal models. Ultimately, combining insights from these technologies will provide a more complete view of human brain anatomy and function.

MR data acquisition and pre-processing
We used two magnetic resonance diffusion imaging datasets. The STN96 dataset was acquired at the Stanford Center for Neurobiological Imaging (CNI); the HCP90 dataset was acquired by the Human Connectome Consortium [50].
STN96 data set: Diffusion-weighted MRI acquisition. The main analyses were conducted for the STN96 dataset. These have also been used in other papers [46][47][48][49]. STN96 was collected from five human subjects (five males; age range 27-40, mean age 32.6 years old). Informed written consent was obtained from all subjects. The experimental procedures were approved by the Stanford University Institutional Review Board.
A dual spin echo diffusion-weighted sequence [83] was used. The diffusion MRI data were acquired for 96 different directions at a spatial resolution of 1.5 mm³ (isotropic), two averages in k-space (i.e., NEX = 2). The b-value was 2000 s/mm² and TE was 96.8 msec. Ten non-diffusion weighted images (b = 0) were acquired at the beginning of each scan. Two scans were performed.
MR images were corrected for subject motion using a rigid body alignment algorithm [84]. We also used the measurements of the B0 magnetic field for post-hoc correction of EPI spatial distortion (https://github.com/kendrickkay/preprocessfmri). The dual spin echo sequence minimizes the eddy-current artifact [83]. Hence, eddy current correction was not applied. All preprocessing steps have been implemented in Matlab as part of the mrVista software distribution (https://github.com/vistalab/vistasoft).
HCP90 data set. The HCP90 data set was acquired at multiple b-values (1000, 2000 and 3000 s/mm²). Measurements from the 2000 s/mm² shell were extracted from the original data set and used for analyses because the implementation of LiFE that we used only accepts single-shell diffusion MRI data [46]. Processing methods for HCP data have been described elsewhere [85,86].

Selection and evaluation of white-matter connectomes
Candidate connectome generation. The total white-matter volume was initially identified from the tissue type segmentation using FreeSurfer [87], edited manually ([88]; http://www.itksnap.org/pmwiki/pmwiki.php), and finally resampled at the resolution of the diffusion data. Portions of the white-matter volume were used as seed regions for fiber tracking. S2 Fig depicts the occipital white matter regions (10 hemispheres) used for the main analyses in the STN96 dataset. Whereas most of the analyses on the STN96 dataset were focused on the occipital white matter, we also used the total white matter volume for testing the generality of the findings (see Fig 7).
The candidate connectome was created using fiber tracking in MRtrix 0.2 [18]. We used a constrained-spherical deconvolution (CSD [89]) and probabilistic tracking (step size: 0.2 mm; maximum length: 200 mm; minimum length: 10 mm; FOD amplitude stopping criterion: 0.1; vector specifying the initial direction: 20 deg). We set the maximum number of spherical harmonics to 8 (L max = 8). We used the entire total white matter mask as seed, and seed voxels were randomly chosen from the mask for producing individual streamlines. Tracking was terminated when a streamline reached outside the white matter mask. The minimum radius of curvature was set to different values in different candidate connectomes, comprising the ensemble.
In both datasets, we initially performed whole-brain tracking to generate 2 million streamlines for each parameter settings. For the analysis using occipital white matter, we clipped the streamlines at the boundary of white matter Region of Interest (ROI) described in S2 Fig. For the STN96 dataset, each subject had two scans; one was used to create the candidate connectomes and the second was used for cross-validation (see "Evaluation of model accuracy" below).
Connectome model optimization and evaluation. We optimized connectome models using LiFE (Linear Fascicle Evaluation [46], https://francopestilli.github.io/life/; https://github. com/brain-life/life/). Briefly, LiFE uses the candidate connectome to create a linear model that predicts the measured diffusion signal. From the linear model, LiFE derives a weight describing each streamline's contribution to predicting the data. The weight is estimated using a non-negative least-square optimization method (SBB [90]). The model accuracy is assessed by using the model to predict a diffusion data set. The evaluation is global in that the error is measured for the entire set of streamlines and the entire diffusion MRI data set. The processing of one occipital connectome model (160,000 streamlines) requires 64.7 minutes on the computer we used to analyze STN96 dataset (16 processing core with 32GB Random Access Memory). The computational load of LiFE on standard notebook computer is described elsewhere [49].
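The idea of assigning non-negative weights to candidate streamlines and discarding those with zero weight can be illustrated on a toy problem. The sketch below is not LiFE and does not use the SBB solver; it substitutes a plain projected gradient descent, and the matrix `M`, the signal `y`, the function name and the 1e-6 pruning threshold are hypothetical values chosen only so the answer is easy to check by hand.

```python
def nnls_projected_gradient(M, y, n_iter=2000):
    """Minimize ||y - M w||^2 subject to w >= 0 by projected gradient descent."""
    n_rows, n_cols = len(M), len(M[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of M^T M,
    # so 1 / frob2 is a safe (if conservative) step size.
    frob2 = sum(v * v for row in M for v in row)
    step = 1.0 / frob2
    w = [0.0] * n_cols
    for _ in range(n_iter):
        residual = [sum(M[r][c] * w[c] for c in range(n_cols)) - y[r]
                    for r in range(n_rows)]
        grad = [sum(M[r][c] * residual[r] for r in range(n_rows))
                for c in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[c] - step * grad[c]) for c in range(n_cols)]
    return w

# Toy model: 3 measurements (rows), 3 candidate streamlines (columns).
M = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0]]
y = [1.0, 2.0, 3.0]  # generated by the weights [1, 0, 2]
weights = nnls_projected_gradient(M, y)
# Streamlines whose weight is effectively zero are pruned from the model.
kept = [c for c, w in enumerate(weights) if w > 1e-6]
print([round(w, 3) for w in weights], kept)
```

In this toy case the second candidate streamline is not needed to explain the signal, so its weight is driven to zero and it is removed, which mirrors how redundant streamlines are eliminated from a candidate connectome.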
We evaluated two types of connectomes: Single parameter connectome (SPC): A connectome model generated using a single curvature parameter. We generated five connectome models using five different curvature parameters (minimum radius of curvature = 0.25, 0.5, 1, 2 and 4 mm). These curvature parameters correspond to angle thresholds of 47.2 deg, 23.1 deg, 11.5 deg, 5.7 deg and 2.9 deg, respectively, at the step size we used (0.2 mm; see S11 Fig for the relation between minimum radius of curvature and angle). In each SPC model, we used 160,000 streamlines as the candidate connectome for the occipital white matter region used in each analysis.
Ensemble tractography connectome (ETC): A connectome model generated from multiple curvature parameters. The candidate connectome streamlines derive from the five SPC models, and each SPC includes 160,000 streamlines as described above. Thus, the candidate ETC connectome includes 800,000 streamlines. Fig 2 shows the flowchart of the ETC. Alternatives to the ETC that include preselection are described in S1 Text, Section 5.
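The correspondence between the minimum radius of curvature and the angle threshold quoted for the SPCs can be checked numerically. The sketch below assumes the chord relation angle = 2 asin(step / (2 R)) for a step of fixed length on a circle of radius R; this formula is an assumption that reproduces the listed values, and the function name is hypothetical.

```python
import math


def angle_threshold_deg(radius_mm, step_mm=0.2):
    """Maximum angle between successive steps of length step_mm on a
    circle of radius radius_mm (assumed chord relation)."""
    return math.degrees(2.0 * math.asin(step_mm / (2.0 * radius_mm)))


radii_mm = [0.25, 0.5, 1.0, 2.0, 4.0]
angles = [round(angle_threshold_deg(r), 1) for r in radii_mm]
print(angles)

# Size of the candidate ETC: five SPCs of 160,000 streamlines each.
n_candidate_etc = len(radii_mm) * 160_000
print(n_candidate_etc)
```

Smaller radii permit sharper turns per step, which is why the 0.25 mm setting is the most lenient of the five.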
Evaluation of model accuracy. Model accuracy is evaluated by comparing the prediction error of the LiFE model with the test-retest reliability of the data. Specifically, we evaluated the model prediction error using cross-validation to control for over-fitting [46,91]. We compute this error in a series of simple calculations [46].
First, we calculate the model root mean squared error (RMSE), M_rmse, as:

$$M_{rmse} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_2(\theta_i) - m(\theta_i)\right)^2} \tag{1}$$

where m(θ_i) is the diffusion modulation predicted by the connectome model at each measured diffusion direction θ_i, S_2(θ_i) is the diffusion-modulation signal measured in a second, independent set of diffusion data not used for tractography, and N is the number of measured diffusion directions. Second, we calculate the test-retest reliability, D_rmse, from the repeated measurements:

$$D_{rmse} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_1(\theta_i) - S_2(\theta_i)\right)^2} \tag{2}$$

The signals S_1(θ_i) and S_2(θ_i) are two diffusion-weighted measurements in the same subject. Finally, model accuracy is analyzed as the ratio of the prediction error to the test-retest reliability, R_rmse:

$$R_{rmse} = \frac{M_{rmse}}{D_{rmse}} \tag{3}$$

A value of R_rmse = 1 indicates that the optimized connectome model predicts the second data set as accurately as the first data set does (test-retest reliability). We evaluated the accuracy of each connectome model using R_rmse (Eq 3) to describe how well the model predicts an independent dataset (cross-validation) relative to the noise in the STN96 dataset (test-retest reliability). The theoretical lower bound of R_rmse is 0.707 [48]. The HCP data set does not include a second independent scan. Hence, for this data set, we used the RMSE between the diffusion signal prediction and the first diffusion data set to evaluate connectome model accuracy (S7d Fig). This number has no absolute significance, but it can be used to compare the relative performance of models fit to the same data set. More technical details about LiFE have been published [46,47,49].
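The three quantities can be sketched in a few lines. The code below assumes that M_rmse is the RMSE between the model prediction and the second scan, that D_rmse is the RMSE between the two scans, and that R_rmse is their ratio; the simulated signal, the noise level and the sample count are hypothetical and chosen only so that the ratio is numerically stable. It also illustrates why the lower bound is close to 0.707: a model that predicts the noiseless signal exactly has an error of one noise standard deviation, whereas the difference between two noisy scans has a standard deviation that is larger by a factor of the square root of 2.

```python
import math
import random


def rmse(a, b):
    """Root mean squared error between two equally long signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def rmse_ratio(prediction, scan1, scan2):
    """Ratio of the model error on scan 2 to the test-retest error."""
    m_rmse = rmse(prediction, scan2)   # model vs. independent data
    d_rmse = rmse(scan1, scan2)        # scan 1 vs. scan 2
    return m_rmse / d_rmse


random.seed(0)
n = 20000                              # inflated far beyond 96 directions
truth = [math.sin(0.1 * i) for i in range(n)]
scan1 = [t + random.gauss(0.0, 0.2) for t in truth]
scan2 = [t + random.gauss(0.0, 0.2) for t in truth]

ratio_noiseless_model = rmse_ratio(truth, scan1, scan2)   # near 1/sqrt(2)
ratio_scan1_as_model = rmse_ratio(scan1, scan1, scan2)    # exactly 1
print(ratio_noiseless_model, ratio_scan1_as_model)
```

Using scan 1 itself as the prediction gives a ratio of exactly 1, which is the sense in which a model with R_rmse = 1 predicts the second scan as well as a repeated measurement does.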
Measuring mean streamline curvature. We computed the streamline curvature distribution in each connectome model. First, we fit a spline function to individual streamlines. We then computed the extrinsic curvature (C) from the individual spline curves at individual step points:

$$C = \frac{\sqrt{(x''y' - x'y'')^2 + (x''z' - x'z'')^2 + (y''z' - y'z'')^2}}{(x'^2 + y'^2 + z'^2)^{3/2}}$$

where x', y', z' and x'', y'', z'' are the first and second derivatives, respectively, of the x, y, z coordinates at each node in a streamline. We computed the mean curvature across all nodes in the streamline as follows:

$$\bar{C} = \frac{1}{N}\sum_{i=1}^{N} C_i$$

where N is the number of nodes along the streamline and C_i is the curvature at node i. The mean radius of curvature was defined as the inverse of the mean curvature:

$$\bar{R} = \frac{1}{\bar{C}}$$

We computed the mean radius of curvature for all streamlines and plotted the distribution in Figs 3 (STN96 data set) and S7b (HCP90 data set). This is the same computation used in MRtrix to generate the streamlines given a certain parameter [18].
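A minimal sketch of the mean radius of curvature computation follows. It makes two assumptions that differ from the procedure described above: derivatives are estimated with central finite differences at evenly spaced nodes instead of from a fitted spline, and only interior nodes are used. The function names are hypothetical. Because the curvature formula does not depend on how the curve is parameterized, unit spacing between nodes can be assumed.

```python
import math


def node_curvatures(points):
    """Extrinsic curvature at each interior node of an evenly sampled
    3D streamline, using central finite differences."""
    curvatures = []
    for i in range(1, len(points) - 1):
        p0, p1, p2 = points[i - 1], points[i], points[i + 1]
        x1, y1, z1 = [(c - a) / 2.0 for a, c in zip(p0, p2)]            # first derivative
        x2, y2, z2 = [a - 2.0 * b + c for a, b, c in zip(p0, p1, p2)]   # second derivative
        numerator = math.sqrt((x2 * y1 - x1 * y2) ** 2
                              + (x2 * z1 - x1 * z2) ** 2
                              + (y2 * z1 - y1 * z2) ** 2)
        denominator = (x1 * x1 + y1 * y1 + z1 * z1) ** 1.5
        curvatures.append(numerator / denominator)
    return curvatures


def mean_radius_of_curvature(points):
    """Inverse of the mean curvature; infinite for a straight streamline."""
    curvatures = node_curvatures(points)
    total = sum(curvatures)
    return math.inf if total == 0.0 else len(curvatures) / total


# Check on a circular arc of radius 2 mm sampled every 0.01 rad.
radius = 2.0
arc = [(radius * math.cos(0.01 * k), radius * math.sin(0.01 * k), 0.0)
       for k in range(200)]
estimated_radius = mean_radius_of_curvature(arc)
print(estimated_radius)
```

On the circular arc the estimate agrees with the true radius to within the finite-difference error, which shrinks with the square of the angular spacing between nodes.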

Tract identification
We identified several tracts within each optimized connectome to compare how different connectomes represent anatomical features of the white matter fascicles. All figures of brain anatomy and fascicles were made using Matlab Brain Anatomy (www.github.com/francopestilli/mba). Inferior Longitudinal Fasciculus. We identified the ILF in one subject in the STN96 dataset (subject 1, left hemisphere) and one subject in the HCP90 dataset (subject 6, left hemisphere). We used the AFQ toolbox [38] to identify the ILF from the connectome models. Briefly, AFQ defines waypoint ROIs in each individual subject by non-linear transformation of waypoint ROIs in the MNI template brain, which are drawn on the basis of an anatomical prescription [75]. The ILF is identified as the streamlines passing through the two waypoint ROIs. We excluded streamlines with length ≥ 3 sd and with position ≥ 3 sd away from the mean position of the ILF [76].
U-fiber in occipital cortex. We identified a U-fiber system (a fascicle set travelling parallel to a cortical sulcus; [79]) in occipital cortex in one subject in the STN96 dataset (subject 1, left hemisphere) and one subject in the HCP90 dataset (subject 6, left hemisphere). We manually defined two waypoint ROIs to identify U-fibers from the connectome models (the location of the ROIs is shown in S1 Fig). In all connectome models, we selected the streamlines having endpoints in both of these ROIs as U-fibers. We excluded topological outliers based on length and position, using the same criterion as for the ILF. An example result is shown in Fig 1. In subject 1 of the STN96 dataset, comparison with visual field maps [92,93] showed that this U-fiber connects V3A/B and V3d.
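The two selection rules used in this section differ in one detail: the ILF is selected by streamlines passing through both ROIs, whereas U-fibers are selected by streamlines ending in both ROIs. The sketch below illustrates the difference on toy data. It is not the AFQ implementation; the representation of ROIs as sets of integer voxel indices, the 1 mm voxel size and all function names are assumptions.

```python
import math


def voxel_of(point, voxel_mm=1.0):
    """Integer voxel index containing a point given in mm."""
    return tuple(math.floor(c / voxel_mm) for c in point)


def passes_through(streamline, roi):
    """True if any node of the streamline lies in the ROI."""
    return any(voxel_of(p) in roi for p in streamline)


def select_by_waypoints(streamlines, roi_a, roi_b):
    """Indices of streamlines that pass through both ROIs (ILF-style rule)."""
    return [i for i, s in enumerate(streamlines)
            if passes_through(s, roi_a) and passes_through(s, roi_b)]


def select_by_endpoints(streamlines, roi_a, roi_b):
    """Indices of streamlines with one endpoint in each ROI (U-fiber-style rule)."""
    selected = []
    for i, s in enumerate(streamlines):
        first, last = voxel_of(s[0]), voxel_of(s[-1])
        if (first in roi_a and last in roi_b) or (first in roi_b and last in roi_a):
            selected.append(i)
    return selected


roi_a = {(0, 0, 0)}
roi_b = {(3, 0, 0)}
streamlines = [
    [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5), (2.5, 0.5, 0.5), (3.5, 0.5, 0.5)],   # ends in both ROIs
    [(-1.5, 0.5, 0.5), (0.5, 0.5, 0.5), (3.5, 0.5, 0.5), (4.5, 0.5, 0.5)],  # passes through both
    [(0.5, 0.5, 0.5), (0.5, 1.5, 0.5)],                                      # touches roi_a only
]
waypoint_hits = select_by_waypoints(streamlines, roi_a, roi_b)
endpoint_hits = select_by_endpoints(streamlines, roi_a, roi_b)
print(waypoint_hits, endpoint_hits)
```

The endpoint rule is the stricter of the two, so it suits short fascicles that start and stop near the cortex, while the waypoint rule suits long fascicles that continue beyond the ROIs.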

Fascicle evaluation for whole-brain connectome
We evaluated model accuracy for whole-brain connectomes. To do so, we generated five candidate SPCs of 2 million streamlines each, using different curvature thresholds (minimum radius of curvature from 0.25 mm to 4 mm). We then used LiFE to assign a weight to each streamline. Next, we selected the 400,000 streamlines with the highest weights from each SPC (preselection method; see S1 Text, Section 5). This resulted in a candidate ETC connectome containing 2 million streamlines. Finally, we optimized this ETC using LiFE. Processing one whole-brain connectome model with 2 million streamlines requires 28.4 hours on a computer with 16 processing cores and 32 GB of random access memory.
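The preselection step amounts to a top-k selection within each SPC followed by concatenation. The following sketch uses toy weights; the data layout (a dictionary from a hypothetical SPC label to a list of per-streamline weights) and the function names are assumptions, not part of LiFE.

```python
def top_k_indices(weights, k):
    """Indices of the k largest weights, returned in ascending index order."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(order[:k])


def preselect(spc_weights, k):
    """Candidate ETC as (SPC label, streamline index) pairs: top k per SPC."""
    return [(label, i)
            for label, weights in spc_weights.items()
            for i in top_k_indices(weights, k)]


# Toy example: two SPCs with five streamlines each, keeping two from each.
toy_weights = {
    "0.25mm": [0.0, 0.9, 0.1, 0.5, 0.0],
    "4mm":    [0.3, 0.0, 0.0, 0.7, 0.2],
}
candidate = preselect(toy_weights, k=2)
print(candidate)

# Size check for the whole-brain case: 5 SPCs x 400,000 streamlines.
n_whole_brain = 5 * 400_000
print(n_whole_brain)
```

Keeping an equal number of streamlines from each SPC holds the size of the candidate ETC at that of a single SPC, so the final optimization has the same computational load as one whole-brain SPC fit.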

Fascicle evaluation along the ILF
The ILF extends outside the occipital white matter region used for the main analysis (S2 Fig). To evaluate the connectome models along these fascicles, we fitted LiFE selectively to the white matter voxels containing these tracts. To do so, we (1) identified the ILF in the candidate connectome of every connectome model using the identification method described above, (2) concatenated all streamlines identified as ILF across the connectome models, and (3) extracted the voxels through which any of these streamlines pass. This yielded a white matter region covering the ILF. The LiFE analysis of the ILF was limited to this portion of white matter in all connectome models tested.

Distribution of the radius of curvature in the candidate connectomes of four whole-brain connectomes, each generated using a different angle threshold (5.7, 11.5, 23.1, 47.2 deg) with the PICo algorithm [11] in the Camino toolbox (see S1 Text, Section 1). We also observed that the connectome using a lenient bound on the curvature (e.g. 47.2 deg angle threshold) does not produce straight streamlines with a large radius of curvature. Plot conventions are identical to S3 Fig.

a. Measured and predicted diffusion signal from one example voxel (from Subject 5, STN96 dataset). The horizontal axis depicts the diffusion gradient directions (arbitrary order) and the vertical axis depicts the magnitude of the demeaned diffusion signal in each direction. Black lines depict the measured diffusion signal (solid line, scan 1; dotted line, scan 2), whereas colored lines depict the predicted diffusion signal (top panel, ETC; bottom panel, SPC 2 mm). Whereas the ETC predicts the diffusion signals, the SPC 2 mm fails. R_rmse in each plot indicates the R_rmse of the ETC and SPC 2 mm models in the voxel. b. Spatial distribution of measured and predicted diffusion signal. The horizontal and vertical axes depict the magnitude of the demeaned diffusion signal in the X and Z directions.
Individual data points describe the measured or predicted demeaned diffusion signal in one of 96 diffusion-weighted directions. The plot indicates that the ETC successfully predicts the complex diffusion signal distribution derived from crossing fascicles. c. Scatter plot showing the correlation between measured and predicted diffusion signal. The horizontal axis depicts the prediction of the demeaned diffusion signal by the ETC (top panel) and the SPC 2 mm (bottom panel). The vertical axis depicts the diffusion signal measured in the diffusion dataset not used for tractography (cross-validation, see Materials and Methods). While the ETC predictions showed a substantial correlation with the signal in the independent dataset (r = 0.837), the prediction by the SPC 2 mm does not correlate with the diffusion signal (r = -0.015). (EPS)

S6 Fig. Comparison between ETC and SPC using large candidate connectome size. a. Flow diagram of the BigSPC model. We generated the same number of streamlines as in the ETC using only a single parameter (minimum radius of curvature = 2 mm), and optimized it using LiFE (see S1 Text, Section 2). b. Optimized connectome size. The BigSPC supports a number of streamlines comparable to the ETC. c. White matter coverage. The ETC covers a larger portion of white matter than the BigSPC. d.

Ensemble Tractography across algorithms (see S1 Text, Section 3). Using three different tractography algorithms in MRtrix (DT_STREAM: tensor deterministic; SD_STREAM: CSD deterministic; and SD_PROB: CSD probabilistic; [18]), we generated three Single Algorithm Connectome (SAC) candidates, each containing 120,000 streamlines in occipital cortex. In the ETC model, we simply combined all SAC streamlines into the ETC candidate connectome. We used LiFE to optimize the SACs and the ETC. b. Optimized connectome size of the four connectome models. c. White matter coverage. d. Accuracy of the ETC. The ETC predicts the diffusion signal better than the SACs.
Conventions are identical to those in S6 Fig.

2) in MRtrix, we generated four SPCs containing 160,000 streamlines in occipital cortex. In the ETC model, we simply combined the four SPCs to generate the candidate connectome. We used LiFE to optimize each connectome model. b. Optimized connectome size of the four connectome models. c. White matter coverage. d. Accuracy of the ETC. The ETC predicts the diffusion signal better than the SPCs. Conventions are identical to those in S6 Fig.

Flow diagram of the ETC-preselection ('ETCpre'; see S1 Text, Section 5). Using LiFE, we first optimize each SPC and select the streamlines that contribute to the diffusion signal prediction in each SPC. We then combine those preselected streamlines to create the candidate ETCpre connectome, and optimize it using LiFE again. See S1 Text, Section 5 for details. b. Optimized connectome size of the SPCs, ETCpre and ETC. The optimized ETCpre supports a larger number of streamlines than the SPCs, although the candidate connectome size is identical. c. White matter coverage. The ETCpre covers wider regions of white matter than the SPCs. d. Accuracy of the ETCpre. The ETCpre predicts the diffusion signal better than the SPCs. Accuracy is slightly lower than that of the ETC without preselection. Conventions are identical to those in S6 Fig.