
Machine learning to predict mesenchymal stem cell efficacy for cartilage repair

  • Yu Yang Fredrik Liu,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, United Kingdom

  • Yin Lu,

    Roles Conceptualization, Data curation, Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing

    Affiliation Bioprocessing Technology Institute, Agency for Science Technology and Research (A*STAR), Singapore, Singapore

  • Steve Oh,

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation Bioprocessing Technology Institute, Agency for Science Technology and Research (A*STAR), Singapore, Singapore

  • Gareth J. Conduit

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Project administration, Resources, Software, Supervision, Validation, Writing – review & editing

    Affiliation Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, United Kingdom



Inconsistent therapeutic efficacy of mesenchymal stem cells (MSCs) in regenerative medicine has been documented in many clinical trials. Precise prediction of the therapeutic outcome of an MSC therapy based on the patient’s condition would provide a valuable reference for clinicians deciding on treatment strategies. In this article, we performed a meta-analysis of MSC therapies for cartilage repair using machine learning. A small database was generated from published in vivo and clinical studies. The unique features of our neural network model in handling missing data and calculating prediction uncertainty enabled precise prediction of post-treatment cartilage repair scores with a coefficient of determination of 0.637 ± 0.005. From this model, we identified defect area percentage, defect depth percentage, implantation cell number, body weight, tissue source, and the type of cartilage damage as critical properties that significantly impact cartilage repair. A dosage of 17–25 million MSCs was found to achieve optimal cartilage repair. Further, critical thresholds at 6% and 64% of cartilage damage in area, and 22% and 56% in depth, were predicted to significantly compromise the efficacy of MSC therapy. This study demonstrated, for the first time, machine learning of patient-specific cartilage repair after MSC therapy. This approach can be applied to identify and investigate more critical properties involved in MSC-induced cartilage repair, and adapted for other clinical indications.

Author summary

Cartilage damage affects the quality of life of hundreds of millions of people, causing chronic joint pain and disability. Cartilage has poor regenerative capacity. Only minor damage can improve on its own or with passive treatments, while more severe damage often requires surgery. In recent decades, stem cell therapy has become a promising treatment option to reduce pain and repair cartilage. However, with complex mechanisms and various factors involved, efficient and consistent cartilage regeneration remains elusive. Our neural network learns from clinical trials and animal studies to predict therapeutic outcomes, along with the confidence level of each prediction, based on the patient’s condition. This machine learning approach provides an important reference and significant insights into the optimization of treatment strategies.


Articular cartilage is a critical tissue with multifaceted mechanical functions. It withstands compression, absorbs shock, and enables smooth articulation at the joints. Cartilage injury is unfortunately common due to tears, accidents, and arthritis, and often leads to joint pain, stiffness, and inflammation. Cartilage disorders affect millions of people worldwide, including 52.2 million adults in the US [1] and more than 10 million in the UK [2]. In particular, osteoarthritis alone affects more than 200 million people globally [3]. Adult cartilage has limited self-repair capacity due to its avascular nature [4], so treatments are often necessary to accelerate repair and relieve pain during joint motion. Besides conservative treatments and conventional surgical options, such as microfracture and autologous chondrocyte implantation (ACI), mesenchymal stem cell (MSC) therapy has also been widely investigated for the management of cartilage damage in recent decades [5].

Although significant success has been achieved for MSC therapy in cartilage repair, the efficacy of the therapy has been inconsistent. This is likely attributable to the complex cellular mechanisms and dynamic interplay across the different populations of cells involved in stem cell assisted tissue repair. MSC therapy is also complicated by the heterogeneity of cells, culture conditions, delivery methods, and recipients’ conditions, which are all highly variable in current clinical trials and laboratory studies. Thus, a disconnect between the in vitro, pre-clinical, and clinical performance of MSCs has been broadly observed [5], which has so far rendered the analysis of MSCs’ therapeutic efficacy largely retrospective rather than predictive. As a result, there is a lack of guidelines on MSC therapy strategy to promote optimal therapeutic efficacy.

Setting guidelines for MSC therapy requires identification of the critical properties that affect MSCs’ therapeutic efficacy most significantly. To achieve this, a quantitative assessment of the significance of each individual property is needed. However, this is impractical with conventional controlled biomedical experiments, where one or at most a few properties can be interrogated at a time. To overcome this challenge, we use machine learning to capture multi-property correlations and exploit all of the information in a database. A machine learning model makes predictions based on its training dataset, and each algorithm has a basic set of parameters for fitting multidimensional functions that can be changed to improve its accuracy [6]. Deep learning methods are able to predict multiple output properties simultaneously [7].

In this paper, we performed a meta-analysis of MSC therapy for cartilage repair. The data we analyzed were generated by different researchers using different experimental designs; as a result, the properties considered in one study may not always be addressed in another, which has led to a database containing “missing information” in some of its entries. Many machine learning methods do not analyze entries with incomplete information, which often results in a shrinking database and compromised predictive performance. We adapt a neural network formalism [8–12] with a unique capacity to “fill” the missing data by learning the correlations across multiple properties and recursively imputing precise estimates. Furthermore, our machine learning method computes the uncertainty of predictions arising from experimental noise and computational extrapolation, which allows the neural network model to focus on the most confident predictions.

The coefficient of determination (R2) of our machine learning model in predicting MSC therapy outcome was 0.637 ± 0.005 in a cross-validation test. Through machine learning, we identified defect area percentage, defect depth percentage, implantation cell number, body weight, tissue source, and cartilage damage type as critical therapy properties for cartilage repair. In particular, an optimal dosage range of 17–25 million cells was identified for achieving the best therapeutic outcome. We also predicted that the optimal therapy outcome was most likely to be achieved in patients with cartilage defects of less than 6% in area and 22% in depth of the knee cartilage. Larger defects significantly dampen the efficacy of MSC therapy.

The capacity to predict MSCs’ therapeutic outcome using machine learning holds great clinical significance in suggesting critical therapy input properties to maximize the therapeutic benefits. Further development of this technology could extend its application to other diseases and cell types, and point towards substantial improvements in cell therapy efficacy and consistency.


Data sets

We collected data from 36 published articles on PubMed [13–48] to train and validate our machine learning models. Some articles comprised more than one type of cartilage injury model or treatment condition. In total, 15 clinical trial conditions and 29 animal model conditions (1 goat, 6 pigs, 2 dogs, 9 rabbits, 9 rats, and 2 mice) on osteochondral injury or osteoarthritis were included, in which MSCs were transplanted to repair the cartilage tissue. We documented each case as an entry in a database. We considered the cell-related and treatment-target-related factors as input properties, including species, body weight, tissue source, cell number, cell concentration, defect area, defect depth, and type of cartilage damage. The therapeutic outcomes were considered as output properties, which were evaluated using integrated clinical and histological cartilage repair scores, including the International Cartilage Repair Society (ICRS) scoring system, the O’Driscoll score, the Pineda score, the Mankin score, the Osteoarthritis Research Society International (OARSI) scoring system, the International Knee Documentation Committee (IKDC) score, the visual analog score (VAS) for pain, the Knee Injury and Osteoarthritis Outcome Score (KOOS), the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), and the Lysholm score. In this study, these scores were linearly normalized to a number between 0 and 1, with 0 representing the worst damage or pain and 1 representing completely healthy tissue. The entries were then combined to form the database.
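The linear normalization of heterogeneous scores onto a common 0 to 1 scale can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: the function name `normalize_score`, the clamping step, and the example score ranges are all assumptions introduced here. Scales on which a larger number means a worse outcome are handled by passing the worst raw value as the larger endpoint.

```python
def normalize_score(raw, worst, best):
    """Linearly map a raw score onto [0, 1].

    0 corresponds to the worst damage or pain, 1 to completely healthy tissue.
    `worst` and `best` are the raw-scale values of those two extremes; for a
    scale where a higher number is worse, pass worst > best.
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    value = (raw - worst) / (best - worst)
    # Clamp, in case a reported value falls slightly outside the nominal range
    # (an assumption of this sketch, not something stated in the text).
    return min(1.0, max(0.0, value))


# Hypothetical score ranges, for illustration only.
print(normalize_score(70, worst=0, best=100))  # higher-is-better scale
print(normalize_score(3, worst=10, best=0))    # lower-is-better scale
```

Both calls map onto the same 0 to 1 scale, so scores from different systems become directly comparable as a single output property.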

Neural network formalism

We now define the neural network formalism that was used to capture the functional relations between all properties and to predict these relations for new therapies. We first describe the core neural network and its critical feature of estimating the uncertainty in its predictions, and then turn to the second novel aspect, the handling of missing data.

Each entry x = (x1, …, xI) to the neural network is a vector of length I, with the first I − 1 variables being the distinct treatment conditions (including species, body weight, tissue source, cell number, cell concentration, defect area, defect depth, and type of cartilage damage), and the final Ith variable being the therapeutic outcome. We intended to find a function f that satisfies the fixed-point equation f(x) ≡ x for all entries in the database. The trivial solution to this fixed-point equation is the identity operator, f(x) = x, but this solution does not allow us to impute data using the function f. We search for a solution to the fixed-point equation that is orthogonal to the identity operator, and allow the function to predict a given component of x from some or all other components. The output (y1, …, yI) is a vector of length I, with the first I − 1 variables being the predicted treatment conditions (if unknown), and the final Ith variable being the therapeutic outcome. A linear superposition of hyperbolic tangents was chosen to model the function f,

$$ f: (x_1, \ldots, x_I) \mapsto (y_1, \ldots, y_I) \qquad (1) $$

with

$$ \eta_{hj} = \tanh\Big( \sum_{i=1}^{I} A_{ihj} x_i + B_{hj} \Big) $$

and

$$ y_j = \sum_{h} C_{hj} \eta_{hj} + D_j , $$

where the sum over h runs over the hidden nodes.

The artificial neural network (ANN) with one layer of hidden nodes is shown in Fig 1. Each hidden node ηhj performs a tanh operation on a superposition of input properties xi with variational parameters Aihj and Bhj for 1 ≤ i ≤ I. Each property yj for 1 ≤ j ≤ I is predicted separately as a superposition of all hidden nodes with variational parameters Chj and Dj. There are exactly as many given properties as predicted properties, since all types of properties are treated equally by the ANN. We set Ajhj = 0 so that the network predicts yj without knowledge of the true quantity xj. A hyperbolic tangent activation function was used to constrain the magnitude of ηhj, giving the weights Chj sole responsibility for the amplitude of the output response. The variational parameters were selected to minimize the mean square error of predictions on the training data.
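The forward pass described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the authors' implementation: the helper names `init_params` and `forward`, the uniform random initialization range, and the nested-list storage of Aihj, Bhj, Chj, and Dj are assumptions made here. The sketch shows how setting Ajhj = 0 makes the prediction yj independent of the input xj.

```python
import math
import random


def init_params(n_props, n_hidden, seed=0):
    """Random parameters, indexed A[i][h][j], B[h][j], C[h][j], D[j]."""
    rng = random.Random(seed)
    A = [[[rng.uniform(-1, 1) for _ in range(n_props)]
          for _ in range(n_hidden)] for _ in range(n_props)]
    B = [[rng.uniform(-1, 1) for _ in range(n_props)] for _ in range(n_hidden)]
    C = [[rng.uniform(-1, 1) for _ in range(n_props)] for _ in range(n_hidden)]
    D = [rng.uniform(-1, 1) for _ in range(n_props)]
    # Enforce A_jhj = 0: output j never sees its own input x_j.
    for j in range(n_props):
        for h in range(n_hidden):
            A[j][h][j] = 0.0
    return A, B, C, D


def forward(x, params):
    """Predict every property y_j from the other properties of x."""
    A, B, C, D = params
    n_props, n_hidden = len(x), len(B)
    y = []
    for j in range(n_props):
        total = D[j]
        for h in range(n_hidden):
            # Hidden node: tanh of a linear combination of the inputs.
            eta = math.tanh(
                sum(A[i][h][j] * x[i] for i in range(n_props)) + B[h][j]
            )
            total += C[h][j] * eta
        y.append(total)
    return y


params = init_params(n_props=3, n_hidden=4)
print(forward([0.1, 0.2, 0.3], params))
print(forward([0.1, 0.2, 0.9], params))  # last output is unchanged
```

Changing only the last input leaves the last output untouched while the other outputs move, which is what allows the same network to be used for imputing any single property from the rest.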

Fig 1. Neural network to model data.

The graph illustrates how different inputs xi are used to calculate the outputs for y1 (top) and y2 (bottom); similar graphs can be drawn for all other yj to compute all the predicted properties. Linear combinations (grey lines) of the given properties (red) are taken by the hidden nodes (blue), where a non-linear tanh operation is applied, and a linear combination (grey lines) of the hidden nodes returns the predicted property (green).

The ANN has to be trained on a provided data set. The parameters {Aihj, Bhj, Chj, Dj} are initialized with random values and varied following a random walk. The new values are accepted if the new function f(x) models the fixed-point equation f(x) ≡ x better, which is quantitatively measured by the error function,

$$ \delta = \sqrt{ \frac{1}{N} \sum_{x \in X} \sum_{j=1}^{I} \left[ f_j(x) - x_j \right]^2 } \qquad (2) $$

where X denotes the training set of N entries and f_j(x) is the jth component of f(x). This form is also known as the root-mean-square error (RMSE) cost function. The optimization is equivalent to the minimization of the RMSE cost function, and a steepest descent approach is used.

In order to measure the uncertainty in the ANN’s prediction, we train a number of models simultaneously, and treat their average as the overall prediction and their standard deviation as the uncertainty. The pseudocode is shown in Algorithm A; at least 100 trained models were used to evaluate the uncertainty. This concept is similar to uncertainty estimation in ensemble models, with the underlying model being a neural network, and the resulting uncertainty accounts for both the experimental uncertainty in the underlying data and the uncertainty in extrapolating from the training data [49, 50].
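The averaging step can be illustrated with a short sketch. It assumes each trained model is available as a Python callable; the helper names `ensemble_predict` and `make_model`, the toy linear models standing in for trained networks, and the use of the sample (rather than population) standard deviation are assumptions of this sketch, not details taken from Algorithm A.

```python
import statistics


def ensemble_predict(models, x):
    """Return (prediction, uncertainty) from an ensemble of models.

    The prediction is the mean over models and the uncertainty is the
    standard deviation over models, evaluated at the same input x.
    """
    preds = [model(x) for model in models]
    return statistics.fmean(preds), statistics.stdev(preds)


def make_model(slope):
    """Toy stand-in for one trained network: a line through the origin."""
    return lambda x: slope * x


# 100 toy models whose slopes scatter around 1, mimicking the spread
# between independently trained networks.
models = [make_model(1.0 + 0.01 * k) for k in range(-50, 50)]

mean, sigma = ensemble_predict(models, 2.0)
print(mean, sigma)
```

Where the ensemble members agree, the reported uncertainty is small; where they diverge, for example when extrapolating away from the training data, it grows.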

Handling incomplete data

Sometimes a database may contain entries with incomplete input information due to experimental design or data acquisition problems. The possibility of such missing data is higher in a meta-analysis, where results from studies with acceptable differences in design and purpose are pooled to form a database. In our database, for example, the osteochondral defect studies commonly reported the area of the defect as a critical measure of the severity of injury [14, 17]. However, this information was not always presented in osteoarthritis studies due to difficulties in precisely measuring a defect area with complicated geometry [25, 48]. This leads to “missing data” in the entries. We noticed that underlying correlations may exist across the different properties, and that these can be distilled by a neural network to “fill in” the missing information. A typical neural network formalism requires each property to be either an input or an output of the network, and all inputs must be provided to compute a valid output. In contrast, our neural network takes the known treatment conditions and the therapeutic outcome (if known) as inputs, and then outputs predictions for the unknown treatment conditions and the therapeutic outcome. Following the flowchart in Fig 2, the neural network is then applied iteratively to cycle the predictions of the unknown treatment conditions and therapeutic outcome until self-consistency is reached, in the manner of an expectation-maximization algorithm [51].

Fig 2. Data imputation algorithm for the vector x.

After checking the missing properties of entries, we set x0 = x, and replace all the missing data by averages from the training data set. We then iteratively compute xn+1 as a function of xn and f(xn) until we reach convergence after n iterations.

The algorithm is shown in Fig 2. For any unknown properties, we first set the missing values to the average of the values present in the data set. With estimates for all values of the neural network inputs, we then recursively apply the following equation until convergence:

$$ x^{n+1} = \gamma x^{n} + (1 - \gamma) f(x^{n}) \qquad (3) $$

where n denotes the iteration step and f(x^n) is a prediction for x obtained from the neural network. The converged result is then returned instead of f(x^n). The function f remains fixed on each iteration of the cycle. A softening parameter, γ ∈ [0, 1], is used to combine the results with the existing predictions, and a γ > 0 serves to prevent oscillations and divergences of the predictions. Typically, we set γ to 0.5. The performance against the percentage of missing data in the database is shown in S1 Text (Fig. B).

Thus we were able to utilize the full information in the database, derive a more robust model and enhance the quality of predictions.
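The imputation cycle described in this section can be sketched as a damped fixed-point iteration. The sketch assumes the update mixes the previous estimate and the network prediction with weights γ and 1 − γ, consistent with γ acting as a softening parameter. The function name `impute`, the use of `None` to mark missing values, the convergence tolerance, and the choice to hold known values fixed during the cycle are assumptions made here, and the toy function `f` merely stands in for a trained network.

```python
def impute(x, f, column_means, gamma=0.5, tol=1e-8, max_iter=1000):
    """Fill the missing (None) components of x by iterating a fixed model f.

    Missing values start at the column averages and are then repeatedly
    replaced by gamma * previous + (1 - gamma) * f(previous) until the
    largest change falls below tol.
    """
    missing = [i for i, v in enumerate(x) if v is None]
    current = [column_means[i] if v is None else v for i, v in enumerate(x)]
    for _ in range(max_iter):
        pred = f(current)  # f stays fixed during the cycle
        updated = list(current)
        for i in missing:  # assumption: known values are left untouched
            updated[i] = gamma * current[i] + (1 - gamma) * pred[i]
        change = max((abs(updated[i] - current[i]) for i in missing),
                     default=0.0)
        current = updated
        if change < tol:
            break
    return current


# Toy stand-in for a trained network: the second property is twice the first.
def f(v):
    return [v[0], 2.0 * v[0]]


print(impute([1.5, None], f, column_means=[0.0, 0.0]))
```

In this toy case the missing value starts at the column average of 0 and converges towards 3.0, the value implied by the known property; with γ = 0.5 the remaining error halves on every iteration.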


The model was initially fitted on a training dataset, which is a set of entries from our database. The fitted model was then used to predict the outputs in a second, validation dataset, to provide an unbiased evaluation of the model. To assess the performance of the model, we adopt the coefficient of determination (R2) metric, evaluated on the validation dataset. In this work, we use only the therapeutic outcome as the variable for R2:

$$ R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} $$

where y_i is the therapeutic outcome from the ith case (patient or animal), \hat{y}_i is the corresponding prediction, and \bar{y} is the mean of the observed outcomes. The value of R2 ranges from negative infinity to 1 and is a measure of the fit to the perfect identity line \hat{y} = y, where R2 = 1 means a perfect fit and R2 = 0 corresponds to making the most naive prediction, that all values equal the average of the data. To confirm the accuracy of the neural network prediction and avoid overfitting, R2 was calculated within the leave-one-out cross-validation framework. We removed one entry from the database at a time, trained the model on the remaining entries, and presented the inputs of the unseen entry to predict its output. After repeating this for every entry, we gathered all the predicted properties and computed R2 against the actual experimental properties.
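The leave-one-out procedure and the R2 metric can be sketched as follows. A one-variable least-squares line is used purely as a stand-in for the neural network, and the data are invented for illustration; the names `r_squared`, `leave_one_out_predictions`, and `fit_line` are assumptions of this sketch.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def leave_one_out_predictions(inputs, outputs, fit):
    """Predict each entry from a model trained on all the other entries."""
    preds = []
    for k in range(len(inputs)):
        train_x = inputs[:k] + inputs[k + 1:]
        train_y = outputs[:k] + outputs[k + 1:]
        model = fit(train_x, train_y)   # the held-out entry is never seen
        preds.append(model(inputs[k]))
    return preds


def fit_line(xs, ys):
    """Least-squares straight line; a toy stand-in for the neural network."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


# Invented data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.1]

preds = leave_one_out_predictions(xs, ys, fit_line)
print(r_squared(ys, preds))
```

Because every prediction is made for an entry the model never saw during fitting, the resulting R2 penalizes overfitting in a way that an R2 computed on the training data would not.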

Other machine learning methods

We compare our neural network algorithm with a variety of other machine learning approaches in S1 Text (Fig. A(i)). Random Forest (RF) [52] is a popular method that builds an ensemble of decision trees to predict individual results. However, decision trees require all of their inputs to be present during training, so RF models cannot be built from incomplete entries unless those entries are dropped. We therefore used our imputation algorithm to fill the database before training the RF, which recorded the second-best R2 value of 0.554, compared with 0.637 from our neural network method. We also tested the K-Nearest Neighbor (KNN) and Multiple Linear Regression (MLR) methods [53]; for KNN, 3 nearest neighbors with Euclidean distance was chosen as the optimal setting.

Another popular method for analyzing sparse databases is matrix factorization, where the matrix of condition and treatment values is approximately factorized into two lower-rank matrices that are then used to predict the therapeutic outcome for a new patient. We used the modern Collective Matrix Factorization (CMF) [54] implementation for comparison, and the hyperparameter alpha for the CMF model was chosen heuristically as 0.99. The resulting R2 value is -0.003. A possible reason is that the CMF method assumes linearity in the interaction of latent factors, and so fails to capture some complex non-linear interactions.

We also use the leave-one-out cross-validation to determine other hyperparameters of the neural network in S1 Text (Fig. A).

Selection of input properties

Selecting the most appropriate input properties for the neural network is challenging in our meta-analysis because, as discussed above, the available properties vary across studies, and the same or related properties may be reported in different ways. The input properties were categorized into two types, factual and derived. The factual properties were: species, implantation cell number, defect area, defect depth, type of cartilage damage, body weight, and tissue source. The tissue source can be further classified into bone marrow (BM), adipose tissue (AD), synovial fluid (SF), Wharton’s jelly (WJ), synovial tissue (ST), and umbilical cord blood (UCB). The derived properties emerged from our biological intuition and may not have been used in the previous studies; they include defect area percentage, defect depth percentage, and cell concentration.

We first trained a neural network that took only one input property and predicted the cartilage repair score. This allowed us to probe the performance of each individual property in Fig 3. It is possible for two or more properties of MSC therapy to have little individual impact on cartilage repair, yet allow the model to capture an important correlation when used in combination. For example, both implantation cell concentration and defect volume have low R2 values (-0.04 and 0.12 respectively), but the implantation cell number, which is the product of the two former properties, gives an R2 value of 0.41.

Fig 3. Accuracy of the neural network.

R2 values of the neural network model trained with individual property (bar) and the combination of best performing individual properties (red line). The shaded red area represents the uncertainty of each R2 value.

The full set of factual and derived properties was provided as inputs to train the neural network. A correlation test was performed between all properties to make sure that no pairs of input properties were closely correlated, finding that both Pearson’s correlation and Spearman’s rank correlation coefficients were smaller than 0.53. Each individual property’s correlation with the cartilage repair score was computed and sorted in descending order in Fig 3. The most correlated property is defect depth percentage, with an R2 value of 0.55, followed by defect area percentage (0.42), cell number (0.41), and body weight (0.30). The tissue types BM and AD and the type of cartilage damage are less correlated, while the cell concentration and the other tissue types (SF, WJ, ST, and UCB) have negative R2 values. The top four properties give an R2 value of 0.625 ± 0.012, and the combination of all seven properties with positive R2 gives the maximum R2 of 0.637 ± 0.005. Overfitting was observed as a decrease in R2 with more than seven descriptors; this happens when the model matches the training dataset but fails on unseen data drawn from the validation dataset. Fewer descriptors did not provide a sufficient basis set, so we chose the first seven descriptors, each of which individually yields a positive R2 value; this captured more correlations between clinical properties without overfitting and provided higher quality uncertainty prediction. The tissue types BM and AD were consolidated into a single tissue type property, giving a total of six different input properties.


With the six identified critical input properties, the neural network used for our machine learning model achieved an R2 of 0.637 ± 0.005 with blind cross-validation. The neural network also delivered the prediction uncertainty in terms of the absolute error between the predicted value and the actual value, as plotted in Fig 4. The random errors associated with the model followed a normal distribution, so they should capture the true uncertainty well. Of the 44 entries in total, 18 lie outside the one standard deviation region. We will exploit all of this knowledge in the next subsection.

Fig 4. Histogram of errors for predictions using our neural network.

The dotted red line is a fitted normal distribution. Each bin contains four data points. Overpredicted means that the predicted value is better than the post-treatment cartilage repair score. Underpredicted means that the predicted value is worse than the post-treatment cartilage repair score.


With access to the uncertainties in Fig 4, we can gain further insight from the neural network predictions. In particular, we can discard predictions carrying large uncertainty and trust only those with smaller uncertainty. The idea is illustrated in Fig 5A, where we select four of the points from Fig 5B: the one with the largest uncertainty, which has the highest likelihood of deviating from the true value and so should give the largest error, and three others at lower quartiles of uncertainty. This allows us to focus on the most confident predictions at the expense of reporting fewer predictions, e.g. by discarding the data point with the largest uncertainty (yellow bar) and recalculating the sum of squares for the R2 value. The quality of the remaining neural network predictions then increases, as the root-mean-square error between the predicted and actual values decreases when a smaller fraction of predictions is accepted and validated, as shown in Fig 5B. Here, 100% of data validated means that we predicted and validated against every entry in our database, and all of these values contribute to the final R2; 75% of data validated means that we calculate R2 using only the 75% of the data with the smallest uncertainties in their predictions. The best R2 value of 0.743 was reported with 82% of the data validated, and R2 reached a plateau when less than 70% of the data were validated. Validating fewer data can lead to significant noise and is less applicable in the real world, where we wish to impute as much as possible; we therefore focus on the >50% regime. The result confirms that the neural network is able to accurately and truthfully inform us about the uncertainties in its predictions, so that the confidence of predictions is correlated with their accuracy.
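Restricting validation to the most confident predictions can be sketched as follows. The data are invented for illustration, and the function names are assumptions of this sketch; it is also assumed here that the mean entering the total sum of squares is recomputed on the retained subset, which the text does not specify.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def r_squared_most_confident(actual, predicted, uncertainty, fraction):
    """R2 over the given fraction of entries with the smallest uncertainty."""
    order = sorted(range(len(actual)), key=lambda k: uncertainty[k])
    n_keep = max(2, round(fraction * len(actual)))  # need >= 2 points for R2
    keep = order[:n_keep]
    return r_squared([actual[k] for k in keep],
                     [predicted[k] for k in keep])


# Invented example: the last prediction is both poor and highly uncertain.
actual      = [0.2, 0.4, 0.6, 0.8]
predicted   = [0.25, 0.35, 0.6, 0.3]
uncertainty = [0.05, 0.05, 0.02, 0.4]

for fraction in (1.0, 0.75):
    print(fraction,
          r_squared_most_confident(actual, predicted, uncertainty, fraction))
```

In this toy case, dropping the single most uncertain prediction raises R2 substantially, which is the behavior expected whenever the reported uncertainty tracks the actual error.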

Fig 5. Model performance after imputation.

(A) shows an example of making predictions for just four data points. The y-axis is the prediction from the machine learning model, and the x-axis delineates four different sample predictions ordered by their uncertainty. The colored dots represent the predicted values, and the colored bars (magenta, turquoise, green, and yellow) show their uncertainties, which are also predicted by the machine learning method. The violin plot represents the probability density distribution for the predicted outcomes, the red dots are the true values (unknown to the machine learning model) used for validation, and the grey arrows measure the difference between the predicted and true values. The sum of squares (SS) value is then normalized to calculate the R2 value. (B) shows the R2 value against the percentage of data validated, with the data points color-coded by their uncertainty ranking. The blue line is the trend line fitted to the data points. The turquoise, green, and yellow points in (A) are the points at 50%, 75%, and 100% in (B).

We note that this post-processing increases the accuracy of the reported predictions: once a model has been trained, the desired level of confidence can be specified and used to return only sufficiently accurate predictions. The projected cartilage repair score, along with the confidence level of the prediction, is provided once the patient’s condition has been set as the input to the model. This will allow clinicians to focus on the treatments most likely to lead to success, and trials to focus on the most illuminating regions of the input property space.

Identifying anomalous results

With the computed uncertainties of prediction, we identified entries with particularly high deviations between the predicted and experimental results. These can then be re-examined and corrected to improve the training dataset. Most predictions of our model were expected to lie within one standard deviation (±1) of the experimental results, as shown in Fig 4. The 18 entries that lie outside the one standard deviation region are shown in Table 1. Three of them were from clinical trials, and the other 15 were from animal studies. A positive number of standard deviations means that our neural network overpredicts the cartilage repair score, and a negative number means that it underpredicts. We analyzed the over- and underpredicted repair scores as follows.
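Flagging anomalous entries by their number of standard deviations can be sketched as follows. The sketch assumes the deviation is the predicted value minus the experimental value, divided by the predicted uncertainty of that entry, which matches the stated sign convention (positive for overprediction). The entry labels and numbers are invented, and the function names are assumptions of this sketch.

```python
def standardized_deviation(predicted, actual, sigma):
    """Number of standard deviations by which the prediction misses.

    Positive: the model overpredicts the repair score.
    Negative: the model underpredicts it.
    """
    return (predicted - actual) / sigma


def flag_outliers(entries, threshold=1.0):
    """Return (label, deviation) for entries beyond +/- threshold sigma."""
    flagged = []
    for label, predicted, actual, sigma in entries:
        z = standardized_deviation(predicted, actual, sigma)
        if abs(z) > threshold:
            flagged.append((label, z))
    return flagged


# Hypothetical entries: (label, predicted score, actual score, uncertainty).
entries = [
    ("study A", 0.80, 0.50, 0.10),  # overpredicted by about 3 sigma
    ("study B", 0.50, 0.52, 0.10),  # within one sigma
    ("study C", 0.30, 0.60, 0.20),  # underpredicted by about 1.5 sigma
]

for label, z in flag_outliers(entries):
    print(label, round(z, 2))
```

Entries flagged in this way are candidates for re-examination, since a large deviation may indicate either a data issue or an influential factor that is not among the model inputs.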

Table 1. Entries whose predictions lie more than one standard deviation above (greater than 1) or below (less than -1) the experimental results, indicating that the prediction deviates from the experiment.

In general, our model predicted 80% of the clinical trials with an error smaller than one standard deviation, which was better than that of 48% for the animal studies. Three human clinical trial outcomes were underpredicted. In two of the cases, the researchers performed additional surgical procedures besides MSC implantation to repair the damaged cartilage. De Windt et al. implanted debrided autologous chondrocytes together with MSCs in their procedure [18]. The interaction between MSCs and chondrocytes was not considered as an input property in the current neural network, but might promote the cartilage repair. Koh et al. performed microfracture surgery before MSC implantation [22]. The recruitment of autologous MSCs from the subchondral bone to the defect cartilage area by the microfracture surgery was likely the cause of the underpredicted outcome from the neural network.

The most underpredicted entry, at -2.34 standard deviations from the actual experimental outcome, appeared in the study from Ando et al., where an MSC-based tissue scaffold was implanted into chondral defects in porcine models [29]. Similarly, Li et al. encapsulated MSCs in microspheres prior to transplantation into rabbit osteochondral defects [34], which yielded a deviation of -1.30 standard deviations. In both cases, the use of a scaffold likely induced pre-differentiation of MSCs towards the chondrogenic lineage, and the production of extracellular matrix proteins before transplantation might have greatly promoted the repair efficacy. Xue et al. also delivered MSCs to their rat model in a tissue-engineered scaffold made from poly (lactide-co-glycolide) (PLGA)/nano-hydroxyapatite (NHA), but the MSCs possibly remained undifferentiated [37]. This resulted in a smaller underprediction by the neural network, at -1.03 standard deviations. Another underprediction, at -1.62 standard deviations, was seen in the study from Ma et al. [32], where an autologous graft was transplanted together with the MSCs. In this study, the mosaicplasty might have contributed significantly to the repair, which was not analyzed as an input to the neural network.

For overpredicted repair scores, Katayama et al. reported an MSC treatment efficacy for rabbit cartilage defects [33] at a much lower level than the neural network prediction, 2.88 standard deviations away. Although the isolation and subculture of MSCs were performed using standard protocols, the authors did not report sufficient quality control of the cells before the treatment. The uncertainty in cell purity and quality might have resulted in the suboptimal repair.

Revisiting these inaccurately predicted cases has allowed us to gain further insight into the therapeutic efficacy of MSCs in cartilage repair. The majority of the less accurate predictions occurred in animal trials where special delivery methods or manipulations of the MSCs had been implemented. These findings imply a potential impact of these novel therapy input properties on cartilage repair, although they are not readily applied in clinical trials. We also found that not all of the less accurately predicted cases were associated with a special delivery strategy or cell modification, and the underlying causes were not obvious. It is reasonable to believe that the potency of MSCs, the secretome profile, and the surgical procedures might all have a significant impact on the therapeutic outcome. Including this information as input properties in the database would enable the neural network to improve its prediction accuracy.

Influence of properties

The patients’ pre-treatment conditions and therapeutic strategies were encoded within the input properties for the model to make predictions. The relative strength of each property in predicting the cartilage repair score, defined as the change in R2 on removing that property, is plotted in Fig 6. Pre-treatment conditions such as defect area percentage, defect depth percentage, and body weight play important roles in the treatment outcome, whereas treatment strategy properties, such as the implantation cell number and the tissue source, impact the outcome to a lesser extent. We now study these input properties in descending order of importance.

Fig 6. Illustration of the relative strength of properties used in our model.
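The importance measure behind Fig 6, the change in R² when a property is removed, can be sketched as a simple ablation loop. The example below is a minimal illustration under stated assumptions: an ordinary least-squares fit stands in for the neural network, R² is computed in-sample, the model is refitted after each removal, and the property names and toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_predict(rows, y):
    """Least-squares fit with intercept (stand-in model); returns in-sample predictions."""
    a = [[1.0] + list(r) for r in rows]
    k = len(a[0])
    xtx = [[sum(r[i] * r[j] for r in a) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(a, y)) for i in range(k)]
    w = solve(xtx, xty)
    return [sum(wi * ri for wi, ri in zip(w, r)) for r in a]


def r2(y, pred):
    """Coefficient of determination."""
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def ablation_importance(rows, names, y):
    """Drop in R² when each property is removed and the model is refitted."""
    full = r2(y, fit_predict(rows, y))
    result = {}
    for i, name in enumerate(names):
        reduced = [[v for j, v in enumerate(r) if j != i] for r in rows]
        result[name] = full - r2(y, fit_predict(reduced, y))
    return result


# Toy data (hypothetical): the target depends strongly on the first property.
area = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
other = [0.5, 0.1, 0.4, 0.2, 0.6, 0.3]
rows = [[a, o] for a, o in zip(area, other)]
y = [1.0 - a - 0.1 * o for a, o in zip(area, other)]

importance = ablation_importance(rows, ["defect_area_pct", "other_property"], y)
print(importance)
```

Ranking the properties by these values gives a bar chart of the same kind as Fig 6. With correlated properties the drops are not additive, so the ranking is more informative than the absolute numbers.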

Defect area and depth percentage.

We first investigated the two most important properties, defect area percentage and defect depth percentage. A surface plot is shown in Fig 7, where the cartilage repair score has been normalized against the full range of scores in the database. Although most of the training data have a defect area percentage below 30% and a defect depth percentage above 40%, our neural network model extrapolated the cartilage repair of patients with indications beyond the range of conditions in the database. The neural network can do this owing to its ability to handle missing data over the full range of conditions (0-100%). In general, the cartilage repair score drops as the percentages of defect area and depth increase, reflecting the difficulty of achieving full recovery with MSC therapy in patients with severe cartilage damage.

Fig 7. Surface plot of the normalized cartilage repair score based on defect area percentage and defect depth percentage.

The trajectory of changing area or depth is shown by white arrows.

The study showed that critical thresholds of damage exist for effective cartilage repair, similar to the case of volumetric muscle loss [55]. In cartilage repair models, a “critical size” osteochondral defect, one that cannot effectively repair by itself, has been widely used. In most cases, such critical sizes were set at estimated default values for different animal models. Some studies have attempted to determine experimentally the critical size of the defect, in terms of depth and diameter, in specific animal models [56]. Our machine learning model predicted such “critical size” defects: we observed a rapid decrease in the normalized cartilage repair score as the defect area percentage increases from 6% to 35%. Another fast drop was observed at 64%, consistent with the expectation of minimal repair when more than 70% of the cartilage area is damaged. These sharp drop-offs indicate the presence of multiple “critical sizes” that constrain cartilage repair to different levels after MSC therapy. These quantitative cartilage repair predictions, based on the patients’ defect conditions, provide useful references for clinicians deciding on a therapy.
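One way to read such drop-offs from a predicted curve is to scan it for segments whose downward slope exceeds a chosen limit and to record where each steep segment begins. The sketch below assumes this finite-difference approach; the text does not state how the thresholds were extracted, and the function name, the slope limit, and the score values are hypothetical, with the grid merely shaped to echo the thresholds discussed above.

```python
def steep_drop_onsets(xs, scores, slope_limit):
    """Return the x positions at which a segment steeper than slope_limit begins."""
    onsets = []
    in_drop = False
    for i in range(len(xs) - 1):
        slope = (scores[i + 1] - scores[i]) / (xs[i + 1] - xs[i])
        steep = slope < -slope_limit
        if steep and not in_drop:
            onsets.append(xs[i])  # first point of a new steep segment
        in_drop = steep
    return onsets


# Hypothetical normalized repair scores along the defect area percentage axis.
area_pct = [0, 6, 35, 64, 70, 100]
score = [1.00, 0.98, 0.50, 0.45, 0.10, 0.05]

# Assumed limit: a loss of more than 0.01 normalized score per percentage point.
onsets = steep_drop_onsets(area_pct, score, slope_limit=0.01)
print(onsets)
```

On a finer grid the same scan can be applied along each white-arrow trajectory of Fig 7 in turn; the result depends on the slope limit, which has to be chosen by inspection.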

Body weight.

As shown in Fig 6, body weight is also an important input property in our neural network: heavier species tend to have a better therapeutic outcome. However, this may be attributable to the large inter-species weight differences in the database. The lack of intra-species weight information in the database makes further analysis difficult. This could be a valuable topic for further investigation.

Implantation cell number.

The next most important input property is the implantation cell number. Fig 8 shows a near-linear increase in the cartilage repair score for implantation cell numbers below 17 million. The normalized cartilage repair score is above 0.9 between 17 and 25 million cells, and is maintained at around 0.8 in the 25 to 75 million range. A further increase in the implantation cell number results in a sudden drop of the normalized cartilage repair score to below 0.7.

Fig 8. Impact of implantation cell number on the cartilage repair.

The left y-axis shows the predicted patient’s cartilage repair scores, normalized to the range of scores with which patients have been evaluated in clinical trials. The right y-axis shows the number of studies (blue histogram) in our database that use a given cell number.
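The dose window quoted for Fig 8 corresponds to the stretch of the curve around its maximum where the normalized score stays above a threshold. A minimal sketch of that reading follows; the function name and the curve values are hypothetical, and only the 0.9 threshold and the 17 and 25 million grid points come from the text.

```python
def optimal_window(doses, scores, threshold):
    """Contiguous dose range around the best score where score >= threshold."""
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None  # the curve never reaches the threshold
    lo = hi = best
    while lo > 0 and scores[lo - 1] >= threshold:
        lo -= 1
    while hi < len(scores) - 1 and scores[hi + 1] >= threshold:
        hi += 1
    return doses[lo], doses[hi]


# Hypothetical dose-response curve (cell number in millions, normalized score).
doses = [1, 5, 10, 17, 20, 25, 30, 50, 75, 100]
scores = [0.10, 0.30, 0.55, 0.91, 0.95, 0.92, 0.82, 0.80, 0.79, 0.65]

window = optimal_window(doses, scores, threshold=0.9)
print(window)
```

The window is only as fine as the dose grid, so the end points should be read together with the histogram of studies per cell number shown on the right axis of Fig 8.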

The determination of MSC dose for therapy remains intuitive in current practice. A wide range of implantation cell numbers has been found in the literature, from a few thousand to 10 billion, with the majority falling between 1 and 100 million [5]. Besides the implantation cell number, these cells were also transplanted at a vast range of concentrations in different animal studies and clinical trials, from a thousand to a billion cells per millilitre of the delivery agent [5]. Conflicting results on the dose-dependent influence of cells on cartilage repair have been reported. On one hand, higher cell number and concentration have been associated with better chondrogenesis and cartilage repair [57–62]. The high cell density likely recapitulated the mesenchymal condensation process that occurs during embryonic development of cartilage, and promoted MSC differentiation towards the chondrogenic lineage [63]. On the other hand, native cartilage is an ECM-rich avascular tissue with low cell density. Studies have pointed out the limits to cell saturation and survival [64], and a high dose of transplanted MSCs was likely to increase the risk of synovitis and synovial proliferation [57, 65].

In this study, we untangle the long-standing controversy through a machine learning approach and recommend an optimal dose of 17-25 million MSCs for human therapy. This conclusion is partly supported by a dose-dependent Phase II clinical trial of MSCs in osteoarthritis patients [48], unseen by the machine learning model, in which MSC doses larger than 25 million resulted in a decline in the patients’ cartilage repair scores. This overturns the long-standing protocol of using fewer than 2 million cells for implantation.

Tissue source.

The tissue sources of MSCs, bone marrow (BM) and adipose tissue (AD), have been combined to form a new property in our model. These two sources of MSCs are the most widely used and studied, mainly because BM and AD are highly accessible. The abundance of MSCs obtainable from BM and AD also means that these cells have greater potential to be produced at large scale for allogenic uses. BM and AD MSCs occur frequently in our database, and the machine learning results suggest that both are beneficial to the treatment. However, more studies are needed to reach a conclusion on the effects of other tissue sources, including synovial fluid (SF), Wharton’s jelly (WJ), synovial tissue (ST), and umbilical cord blood (UCB). Their individual performances were tentatively analyzed based on the current database and are displayed in Fig 3.

Type of cartilage damage.

The least important property in this machine learning model is the type of cartilage damage. Although fundamental differences exist between the causes and pathologies of osteochondral defects and osteoarthritis, the mechanisms of cartilage repair through MSC therapy may share many commonalities in both cases, such as differentiation of the MSCs into chondrocytes at the damage site, secretion of regenerative factors, and immune regulation.


In this study, we have developed a neural network model that exploits the inter-property and property-property correlations to predict the therapeutic efficacy of MSC transplantation for cartilage repair based on animal results and human clinical trials. We started with cartilage injury models where different MSCs were given and measures of their performance were recorded. We characterized the cartilage repair score and filled in the missing information using the neural network while training the model. The assessment of a new patient would provide the input information for the model to predict human clinical trial outcomes and the recommended properties. Clinicians would be given the uncertainty in each prediction, along with the confidence level, to decide on the most suitable therapy for treatment.

We reported an optimal implantation cell number of 17-25 million to treat patients with cartilage damage, and quantitatively demonstrated how the key factors, including the number of cells implanted, defect area, and defect depth, could affect post-transplantation healing. In particular, the neural network can systematically estimate the confidence level of each prediction, which makes it possible to base decisions on reliable results and to expedite trials. The predictive power of our model enables personalized therapy: we predicted the optimal therapeutic outcome based on an individual patient’s disease conditions, including defect area percentage, defect depth percentage, and body weight. For patients with severe cartilage damage beyond the threshold for effective repair, other treatment strategies should be considered. Together, the predictions from our model would serve as important references for clinicians and scientists designing better MSC therapy strategies for cartilage repair, and their findings can in turn be used to further optimize the model. The technology can also be adapted to MSC therapies for other medical indications, and to address other biomedical questions.
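A per-prediction confidence level of this kind is commonly obtained from an ensemble: several models are trained on resampled copies of the data, and the spread of their predictions serves as the uncertainty. The sketch below illustrates that general idea with a bootstrap ensemble of straight-line fits. It is not the algorithm described in S1 Text; the linear model stands in for the neural network, and all names and data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares slope and intercept for one input property."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_predict(xs, ys, x_new, n_models=200, seed=0):
    """Mean and standard deviation of predictions from bootstrap-resampled fits."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples with a single distinct x
        by = [ys[i] for i in idx]
        slope, intercept = fit_line(bx, by)
        preds.append(slope * x_new + intercept)
    return statistics.fmean(preds), statistics.stdev(preds)


# Toy data (hypothetical): one input property and a normalized repair score.
xs = [10, 20, 30, 40, 50, 60]
ys = [0.92, 0.81, 0.72, 0.58, 0.52, 0.39]

mean_near, sigma_near = bootstrap_predict(xs, ys, x_new=35)   # inside the data
mean_far, sigma_far = bootstrap_predict(xs, ys, x_new=120)    # extrapolation
print(f"x=35:  {mean_near:.3f} +/- {sigma_near:.3f}")
print(f"x=120: {mean_far:.3f} +/- {sigma_far:.3f}")
```

The spread grows for inputs far from the training data, which is the behaviour a clinician would rely on when deciding whether a prediction is trustworthy enough to act on.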

There is open access to the data and codes at

Supporting information

S1 Text. We provide additional details, including the algorithm to calculate uncertainties and figures that validate the hyperparameters for our machine learning method.



  1. Barbour KE, Helmick CG, Theis KA, Murphy LB, Hootman JM, Brady TJ, et al. Prevalence of doctor-diagnosed arthritis and arthritis-attributable activity limitation—United States, 2010–2012. MMWR Morbidity and mortality weekly report. 2013;62(44):869.
  2. Government response to the overview of Arthritis. Department of Health, retrieved from; 2019.
  3. Lin AC, Seeto BL, Bartoszko JM, Khoury MA, Whetstone H, Ho L, et al. Modulating hedgehog signaling can attenuate the severity of osteoarthritis. Nature medicine. 2009;15(12):1421. pmid:19915594
  4. Sophia Fox AJ, Bedi A, Rodeo SA. The basic science of articular cartilage: structure, composition, and function. Sports health. 2009;1(6):461–468.
  5. Goldberg A, Mitchell K, Soans J, Kim L, Zaidi R. The use of mesenchymal stem cells for cartilage repair and regeneration: a systematic review. Journal of orthopaedic surgery and research. 2017;12(1):39.
  6. Michie D, Spiegelhalter DJ, Taylor CC, Campbell J. Machine learning. Neural and Statistical Classification. 1994;13.
  7. Kocev D, Džeroski S, White MD, Newell GR, Griffioen P. Using single-and multi-target regression trees and ensembles to model a compound index of vegetation condition. Ecological Modelling. 2009;220(8):1159–1168.
  8. Conduit B, Jones NG, Stone HJ, Conduit GJ. Design of a nickel-base superalloy using a neural network. Materials & Design. 2017;131:358–365.
  9. Conduit B, Jones NG, Stone HJ, Conduit GJ. Probabilistic design of a molybdenum-base alloy using a neural network. Scripta Materialia. 2018;146:82–86.
  10. Verpoort P, MacDonald P, Conduit GJ. Materials data validation and imputation with an artificial neural network. Computational Materials Science. 2018;147:176–185.
  11. Santak P, Conduit G. Predicting physical properties of alkanes with neural networks. Fluid Phase Equilibria. 2019;501:112259.
  12. Whitehead T, Irwin B, Hunt P, Segall M, Conduit G. Imputation of Assay Bioactivity Data Using Deep Learning. Journal of chemical information and modeling. 2019;59(3):1197–1204.
  13. Park YB, Ha CW, Lee CH, Yoon YC, Park YG. Cartilage regeneration in osteoarthritic patients by a composite of allogeneic umbilical cord blood-derived mesenchymal stem cells and hyaluronate hydrogel: results from a clinical trial for safety and proof-of-concept with 7 years of extended follow-up. Stem cells translational medicine. 2017;6(2):613–621.
  14. Park Y, Ha C, Kim J, Han W, Rhim J, Lee H, et al. Single-stage cell-based cartilage repair in a rabbit model: cell tracking and in vivo chondrogenesis of human umbilical cord blood-derived mesenchymal stem cells and hyaluronic acid hydrogel composite. Osteoarthritis and cartilage. 2017;25(4):570–580. pmid:27789339
  15. Ha CW, Park YB, Chung JY, Park YG. Cartilage repair using composites of human umbilical cord blood-derived mesenchymal stem cells and hyaluronic acid hydrogel in a minipig model. Stem Cells Translational Medicine. 2015;4(9):1044–1051.
  16. Park YB, Song M, Lee CH, Kim JA, Ha CW. Cartilage repair by human umbilical cord blood-derived mesenchymal stem cells with different hydrogels in a rat model. Journal of Orthopaedic Research. 2015;33(11):1580–1586.
  17. Wu KC, Chang YH, Liu HW, Ding DC. Transplanting human umbilical cord mesenchymal stem cells and hyaluronate hydrogel repairs cartilage of osteoarthritis in the minipig model. Tzu-Chi Medical Journal. 2019;31(1):11.
  18. de Windt TS, Vonk LA, Slaper-Cortenbach IC, van den Broek MP, Nizak R, van Rijen MH, et al. Allogeneic mesenchymal stem cells stimulate cartilage regeneration and are safe for single-stage cartilage repair in humans upon mixture with recycled autologous chondrons. Stem Cells. 2017;35(1):256–264. pmid:27507787
  19. Vega A, Martín-Ferrero MA, Del Canto F, Alberca M, García V, Munar A, et al. Treatment of knee osteoarthritis with allogeneic bone marrow mesenchymal stem cells: a randomized controlled trial. Transplantation. 2015;99(8):1681–1690. pmid:25822648
  20. Koh YG, Jo SB, Kwon OR, Suh DS, Lee SW, Park SH, et al. Mesenchymal stem cell injections improve symptoms of knee osteoarthritis. Arthroscopy: The Journal of Arthroscopic & Related Surgery. 2013;29(4):748–755.
  21. Koh YG, Choi YJ. Infrapatellar fat pad-derived mesenchymal stem cell therapy for knee osteoarthritis. The Knee. 2012;19(6):902–907.
  22. Koh YG, Kwon OR, Kim YS, Choi YJ, Tak DH. Adipose-derived mesenchymal stem cells with microfracture versus microfracture alone: 2-year follow-up of a prospective randomized trial. Arthroscopy: The Journal of Arthroscopic & Related Surgery. 2016;32(1):97–109.
  23. Lee KB, Hui JH, Song IC, Ardany L, Lee EH. Injectable mesenchymal stem cell therapy for large cartilage defects—a porcine model. Stem cells. 2007;25(11):2964–2971.
  24. Zhang Y, Wang F, Chen J, Ning Z, Yang L. Bone marrow-derived mesenchymal stem cells versus bone marrow nucleated cells in the treatment of chondral defects. International orthopaedics. 2012;36(5):1079–1086.
  25. Pers YM, Rackwitz L, Ferreira R, Pullig O, Delfour C, Barry F, et al. Adipose mesenchymal stromal cell-based therapy for severe osteoarthritis of the knee: A phase I dose-escalation trial. Stem cells translational medicine. 2016;5(7):847–856. pmid:27217345
  26. Fodor PB, Paulseth SG. Adipose derived stromal cell (ADSC) injections for pain management of osteoarthritis in the human knee joint. Aesthetic surgery journal. 2015;36(2):229–236.
  27. Davatchi F, Sadeghi Abdollahi B, Mohyeddin M, Nikbin B. Mesenchymal stem cell therapy for knee osteoarthritis: 5 years follow-up of three patients. International journal of rheumatic diseases. 2016;19(3):219–225.
  28. Akgun I, Unlu MC, Erdal OA, Ogut T, Erturk M, Ovali E, et al. Matrix-induced autologous mesenchymal stem cell implantation versus matrix-induced autologous chondrocyte implantation in the treatment of chondral defects of the knee: a 2-year randomized study. Archives of orthopaedic and trauma surgery. 2015;135(2):251–263. pmid:25548122
  29. Ando W, Tateishi K, Hart DA, Katakai D, Tanaka Y, Nakata K, et al. Cartilage repair using an in vitro generated scaffold-free tissue-engineered construct derived from porcine synovial mesenchymal stem cells. Biomaterials. 2007;28(36):5462–5470. pmid:17854887
  30. Ando W, Fujie H, Moriguchi Y, Nansai R, Shimomura K, Hart DA, et al. Detection of abnormalities in the superficial zone of cartilage repaired using a tissue engineered construct derived from synovial stem cells. European cells & materials. 2012;24:292–307.
  31. Jo CH, Lee YG, Shin WH, Kim H, Chai JW, Jeong EC, et al. Intra-articular injection of mesenchymal stem cells for the treatment of osteoarthritis of the knee: a proof-of-concept clinical trial. Stem cells. 2014;32(5):1254–1266. pmid:24449146
  32. Ma X, Sun Y, Cheng X, Gao Y, Hu B, Wen G, et al. Repair of osteochondral defects by mosaicplasty and allogeneic BMSCs transplantation. International journal of clinical and experimental medicine. 2015;8(4):6053. pmid:26131203
  33. Katayama R, Wakitani S, Tsumaki N, Morita Y, Matsushita I, Gejo R, et al. Repair of articular cartilage defects in rabbits using CDMP1 gene-transfected autologous mesenchymal cells derived from bone marrow. Rheumatology. 2004;43(8):980–985. pmid:15187242
  34. Li Y, Cheng H, Cheung K, Chan D, Chan B. Mesenchymal stem cell-collagen microspheres for articular cartilage repair: cell density and differentiation status. Acta biomaterialia. 2014;10(5):1919–1929.
  35. Zhu S, Chen P, Wu Y, Xiong S, Sun H, Xia Q, et al. Programmed application of transforming growth factor β3 and Rac1 inhibitor NSC23766 committed hyaline cartilage differentiation of adipose-derived stem cells for osteochondral defect repair. Stem cells translational medicine. 2014;3(10):1242–1251. pmid:25154784
  36. Nishimori M, Deie M, Kanaya A, Exham H, Adachi N, Ochi M. Repair of chronic osteochondral defects in the rat: a bone marrow-stimulating procedure enhanced by cultured allogenic bone marrow mesenchymal stromal cells. The Journal of bone and joint surgery British volume. 2006;88(9):1236–1244.
  37. Xue D, Zheng Q, Zong C, Li Q, Li H, Qian S, et al. Osteochondral repair using porous poly (lactide-co-glycolide)/nano-hydroxyapatite hybrid scaffolds with undifferentiated mesenchymal stem cells in a rat model. Journal of Biomedical Materials Research Part A: An Official Journal of The Society for Biomaterials, The Japanese Society for Biomaterials, and The Australian Society for Biomaterials and the Korean Society for Biomaterials. 2010;94(1):259–270.
  38. Dahlin RL, Kinard LA, Lam J, Needham CJ, Lu S, Kasper FK, et al. Articular chondrocytes and mesenchymal stem cells seeded on biodegradable scaffolds for the repair of cartilage in a rat osteochondral defect model. Biomaterials. 2014;35(26):7460–7469. pmid:24927682
  39. Yan X, Cen Y, Wang Q. Mesenchymal stem cells alleviate experimental rheumatoid arthritis through microRNA-regulated IκB expression. Scientific reports. 2016;6:28915.
  40. Chiang ER, Ma HL, Wang JP, Liu CL, Chen TH, Hung SC. Allogeneic mesenchymal stem cells in combination with hyaluronic acid for the treatment of osteoarthritis in rabbits. PloS one. 2016;11(2):e0149835.
  41. Toghraie F, Chenari N, Gholipour M, Faghih Z, Torabinejad S, Dehghani S, et al. Treatment of osteoarthritis with infrapatellar fat pad derived mesenchymal stem cells in Rabbit. The Knee. 2011;18(2):71–75. pmid:20591677
  42. Choi S, Kim JH, Ha J, Jeong BI, Jung YC, Lee GS, et al. Intra-articular injection of alginate-microencapsulated adipose tissue-derived mesenchymal stem cells for the treatment of osteoarthritis in rabbits. Stem cells international. 2018;2018.
  43. Zhang X, Yamaoka K, Sonomoto K, Kaneko H, Satake M, Yamamoto Y, et al. Local delivery of mesenchymal stem cells with poly-lactic-co-glycolic acid nano-fiber scaffold suppress arthritis in rats. PloS one. 2014;9(12):e114621. pmid:25474102
  44. Murphy JM, Fink DJ, Hunziker EB, Barry FP. Stem cell therapy in a caprine model of osteoarthritis. Arthritis & Rheumatism: Official Journal of the American College of Rheumatology. 2003;48(12):3464–3474.
  45. Black LL, Gaynor J, Gahring D, Adams C, Aron D, Harman S, et al. Effect of adipose-derived mesenchymal stem and regenerative cells on lameness in dogs with chronic osteoarthritis of the coxofemoral joints: a randomized, double-blinded, multicenter controlled trial. Veterinary Therapeutics. 2007;8(4):272. pmid:18183546
  46. Black LL, Gaynor J, Adams C, Dhupa S, Sams AE, Taylor R, et al. Effect of intraarticular injection of autologous adipose-derived mesenchymal stem and regenerative cells on clinical signs of chronic osteoarthritis of the elbow joint in dogs. Veterinary therapeutics: research in applied veterinary medicine. 2008;9(3):192–200.
  47. Papadopoulou A, Yiangou M, Athanasiou E, Zogas N, Kaloyannidis P, Batsis I, et al. Mesenchymal stem cells are conditionally therapeutic in preclinical models of rheumatoid arthritis. Annals of the rheumatic diseases. 2012;71(10):1733–1740. pmid:22586171
  48. Gupta PK, Chullikana A, Rengasamy M, Shetty N, Pandey V, Agarwal V, et al. Efficacy and safety of adult human bone marrow-derived, cultured, pooled, allogeneic mesenchymal stromal cells (Stempeucel): preclinical and clinical trial in osteoarthritis of the knee joint. Arthritis research & therapy. 2016;18(1):301.
  49. Heskes T. Practical confidence and prediction intervals. Advances in neural information processing systems. 1997; p. 176–182.
  50. Papadopoulos G, Edwards PJ, Murray AF. Confidence estimation methods for neural networks: A practical comparison. IEEE transactions on neural networks. 2001;12(6):1278–1287.
  51. McLachlan G, Krishnan T. The EM algorithm and extensions. vol. 382. John Wiley & Sons; 2007.
  52. Liaw A, Wiener M, et al. Classification and regression by randomForest. R news. 2002;2(3):18–22.
  53. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830.
  54. Lee DD, Seung HS. Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems; 2001. p. 556–562.
  55. Anderson SE, Han WM, Srinivasa V, Mohiuddin M, Ruehle MA, Moon JY, et al. Determination of a critical size threshold for volumetric muscle loss in the mouse quadriceps. Tissue Engineering Part C: Methods. 2019;25(2):59–70.
  56. Gotterbarm T, Breusch S, Schneider U, Jung M. The minipig model for experimental chondral and osteochondral defect repair in tissue engineering: retrospective analysis of 180 defects. Laboratory animals. 2008;42(1):71–82.
  57. Lee JC, Min HJ, Park HJ, Lee S, Seong SC, Lee MC. Synovial membrane–derived mesenchymal stem cells supported by platelet-rich plasma can repair osteochondral defects in a rabbit model. Arthroscopy: The Journal of Arthroscopic & Related Surgery. 2013;29(6):1034–1046.
  58. Ho STB, Hutmacher DW, Ekaputra AK, Hitendra D, Hui JH. The evaluation of a biphasic osteochondral implant coupled with an electrospun membrane in a large animal model. Tissue Engineering Part A. 2009;16(4):1123–1141.
  59. Hui T, Cheung K, Cheung W, Chan D, Chan B. In vitro chondrogenic differentiation of human mesenchymal stem cells in collagen microspheres: influence of cell seeding density and collagen concentration. Biomaterials. 2008;29(22):3201–3212.
  60. Saw KY, Hussin P, Loke SC, Azam M, Chen HC, Tay YG, et al. Articular cartilage regeneration with autologous marrow aspirate and hyaluronic acid: an experimental study in a goat model. Arthroscopy: The Journal of Arthroscopic & Related Surgery. 2009;25(12):1391–1400.
  61. Charles Huang CY, Reuben PM, D’Ippolito G, Schiller PC, Cheung HS. Chondrogenesis of human bone marrow-derived mesenchymal stem cells in agarose culture. The Anatomical Record Part A: Discoveries in Molecular, Cellular, and Evolutionary Biology: An Official Publication of the American Association of Anatomists. 2004;278(1):428–436.
  62. Erickson IE, Kestle SR, Zellars KH, Farrell MJ, Kim M, Burdick JA, et al. High mesenchymal stem cell seeding densities in hyaluronic acid hydrogels produce engineered cartilage with native tissue properties. Acta biomaterialia. 2012;8(8):3027–3034. pmid:22546516
  63. Ghone NV, Grayson WL. Recapitulation of mesenchymal condensation enhances in vitro chondrogenesis of human mesenchymal stem cells. Journal of cellular physiology. 2012;227(11):3701–3708.
  64. Zhao Q, Wang S, Tian J, Wang L, Dong S, Xia T, et al. Combination of bone marrow concentrate and PGA scaffolds enhance bone marrow stimulation in rabbit articular cartilage repair. Journal of Materials Science: Materials in Medicine. 2013;24(3):793–801. pmid:23274630
  65. Oshima Y, Harwood FL, Coutts RD, Kubo T, Amiel D. Variation of mesenchymal cells in polylactic acid scaffold in an osteochondral repair model. Tissue Engineering Part C: Methods. 2009;15(4):595–604.