
A statistically compiled test battery for feasible evaluation of knee function after rupture of the Anterior Cruciate Ligament – derived from long-term follow-up data

  • Lina Schelin ,

    Contributed equally to this work with: Lina Schelin, Eva Tengman, Patrik Ryden, Charlotte Häger

    Affiliations Department of Community Medicine and Rehabilitation, Physiotherapy, Umeå University, Umeå, Sweden, Department of Statistics, Umeå School of Business and Economics, Umeå University, Umeå, Sweden

  • Eva Tengman ,

    Contributed equally to this work with: Lina Schelin, Eva Tengman, Patrik Ryden, Charlotte Häger

    Affiliation Department of Community Medicine and Rehabilitation, Physiotherapy, Umeå University, Umeå, Sweden

  • Patrik Ryden ,

    Contributed equally to this work with: Lina Schelin, Eva Tengman, Patrik Ryden, Charlotte Häger

    Affiliation Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, Sweden

  • Charlotte Häger

    Contributed equally to this work with: Lina Schelin, Eva Tengman, Patrik Ryden, Charlotte Häger

    Affiliation Department of Community Medicine and Rehabilitation, Physiotherapy, Umeå University, Umeå, Sweden




Clinical test batteries for evaluation of knee function after injury to the Anterior Cruciate Ligament (ACL) should be valid and feasible, while reliably capturing the outcome of rehabilitation. There is currently a lack of consensus as to which of the many available assessment tools for knee function should be included. The present aim was to use a statistical approach to investigate the contribution of frequently used tests, avoid redundancy, and filter them down to a proposed comprehensive yet feasible test battery for long-term evaluation after ACL injury.


In total 48 outcome variables related to knee function, all potentially relevant for a long-term follow-up, were included from a cross-sectional study where 70 ACL-injured (17–28 years post injury) individuals were compared to 33 controls. Cluster analysis and logistic regression were used to group variables and identify an optimal test battery, from which a summarized estimator of knee function representing various functional aspects was derived.


As expected, several variables were strongly correlated, and the variables also fell into logical clusters with higher within-cluster correlation (max ρ = 0.61) than between-cluster correlation (max ρ = 0.19). A test battery of just four variables was extracted, assessing one-leg balance, isokinetic knee extension strength and hop performance (one-leg hop, side hop). These variables were mathematically combined into an estimator of knee function, which acceptably classified ACL-injured individuals and controls. This estimator, derived from objective measures, correlated significantly with self-reported function, e.g. Lysholm score (ρ = 0.66; p<0.001).


The proposed test battery, based on a solid statistical approach, includes assessments which are all clinically feasible, while also covering complementary aspects of knee function. Similar test batteries could be determined for earlier phases of ACL rehabilitation or to enable longitudinal monitoring. Such developments, established on a well-grounded consensus of measurements, would facilitate comparisons of studies and enable evidence-based rehabilitation.


Rupture of the anterior cruciate ligament (ACL) is a common injury, especially in individuals who participate in sports [1, 2]. Treatment involves either physiotherapy in combination with reconstructive surgery, or physiotherapy alone. Regardless of treatment, individuals still often suffer from varying extents of impaired knee function, both in the short-term [3, 4] and long-term perspective despite completing rehabilitation [5–8]. Such reduced knee function may be manifested by, for instance, instability, pain, swelling, decreased range of motion, joint stiffness, reduced physical capacity or decreased activity level in everyday tasks, but particularly with regard to sports and recreational activities. Consequently, attempts to determine knee function often combine several assessment tools covering different aspects of knee function based mainly on clinical examination, knee-specific scores and functional tests. The latter are aimed at capturing indicators of physical capacity, e.g. muscular strength, balance, motor coordination etc. There is, however, still no consensus on which outcome measures to use, which makes comparisons across studies difficult and leads to a lack of evidence for specific interventions. In the clinic, self-reported questionnaires and examiner-administrated knee scores such as the International Knee Documentation Committee 2000 subjective form (IKDC) [9], Knee injury and Osteoarthritis Outcome Score (KOOS) [10] or Lysholm questionnaire [11] are commonly used, and often in combination with a strength measurement and a hop task. Regarding functional assessments, different test batteries have been suggested [12–14]. A test battery in this context refers to a set of functional tests. A test battery consisting of three commonly used hop tests (vertical hop, one-leg hop for distance, and side hop) has shown a high ability to discriminate between the injured and non-injured leg of individuals with ACL injury [12].
Another test battery consisting of four hop tests (one-leg hop for distance, 6-m timed hop, triple hop for distance and crossover hop for distance) has also been demonstrated to be reliable and valid [14, 15]. Yet another test battery, consisting of knee-extension, knee-flexion and leg-press tests, discriminates between strength of the injured and the non-injured leg [13]. The full potential of such test batteries is not always achieved, since the specific test results are most often evaluated separately. The statistical methodology for a research question related to a single outcome variable is often straightforward. Typically two (or more) groups are compared with respect to a single variable using a statistical test [e.g. [5, 6, 8]]. Such tests are sometimes suitable to answer research questions, but single outcome variable analysis might not reveal all of the information contained in the data. It is, for example, possible to find significant differences between two groups when studying two variables simultaneously, while a separate analysis for each of them would not reveal any significant group differences. Hence, it would be desirable to analyze several variables simultaneously.

Correlation analysis and cluster analysis can be used to understand relationships between variables. Examples of statistical methods for dimension reduction are factor analysis, principal component analysis (PCA) and logistic regression. Such methods may be used to combine information from several tests into a more valid estimator of knee function. However, inclusion of many variables in one model may make interpretation difficult. Alternatively, building a model based on a selected subset of variables may result in a model that is easier to interpret. This could be achieved by applying a statistical approach that can determine which knee tests are necessary and which are redundant. In the present paper, a statistical approach was implemented to define a comprehensive and feasible test battery (Fig 1) that would be more discriminative than each of the included single subtests applied separately. To the best of our knowledge, such an approach has not been attempted with regard to knee function assessment.

Fig 1. Illustration of the data structure and the statistical approach.

First, correlation analysis combined with cluster analysis is applied to better understand the relationship between all outcome variables. Potential test batteries are then investigated using logistic regression and subsequently evaluated based on their misclassification rate and on their feasibility. The combined outcomes of the final test battery result in an estimator of knee function, again using logistic regression. Finally, this new variable (estimator of knee function) is analyzed using traditional statistical approaches such as Spearman rank correlation and Wilcoxon rank sum test.

We have utilized data from our long-term follow-up to implement the proposed statistical approach and suggest a test battery to evaluate long-term knee function after ACL injury. The statistical approach used to identify potential test batteries is based on logistic regression models. The models in question should be able to discriminate between different knee function abilities. This could be quantified by considering the model's misclassification rate, defined as the proportion of incorrectly classified individuals, with and without ACL injury, when using the model. In our case, a low misclassification rate implies that the test battery better discriminates between injured individuals and healthy-knee controls. Since there may be a bilateral decrease of knee function following a unilateral injury [16–18], test batteries evaluated in both injured individuals and healthy-knee controls may provide additional information about which functional tasks to include in the test battery.

Due to the difficulties in defining optimal knee function from a rehabilitation perspective, it would be advantageous to measure all potential aspects of knee function. However, for practical reasons the test battery often needs to be relatively small and feasible (i.e. running the test should be relatively quick and not require extraordinary/specialized equipment). This aspect is also considered when compiling the proposed test battery. Finally, the variables in the final test battery are combined to an estimator of knee function using logistic regression.

In addition to having a model that facilitates an estimation of the overall knee function, it also seems highly important to understand the relationship between all outcome variables, e.g. to identify groups of variables that are correlated to each other. This problem may be addressed by using correlation analysis combined with cluster analysis, as we present later. Highly correlated variables might contribute with similar information, i.e. little information is lost if a group of strongly correlated variables is represented by just one of these variables.

Further, in the case of knee function, previous studies have found low or moderate correlations between patient-reported outcome scores and variables from functional tests [19–22]. This suggests that single functional tests are generally not able to measure overall knee function. A compiled index derived from a representative test battery would also be more likely to correspond with self-reported function.

The aim of this paper was to investigate the possibility of applying such a statistical approach to a large set of knee assessments, thereby detecting highly correlated variables and filtering them down to a comprehensive yet feasible clinical test battery consisting of only a few tests, to be used in long-term evaluation after ACL injury. If proven useful, the suggested method could be applied to propose test batteries appropriate for acute and sub-acute phases of ACL rehabilitation, as well as for monitoring and evaluation of other disorders in many clinical fields.



The KACL20-study (Knee injury—Anterior Cruciate Ligament after more than 20 years [7, 23]) is a long-term follow-up with a cross-sectional design, where 70 individuals who had suffered unilateral ACL injury, on average 23 (17–28) years ago, were compared to 33 healthy-knee controls matched for age and sex. Basic individual characteristics are found in Table 1, and detailed outcome aspects related to physical activity, hop performance, and knee strength have been reported elsewhere [7, 23]. All participants were presented with written and oral information about the study and gave their written informed consent according to the declaration of Helsinki. The project was approved by the Regional Ethical Review Board in Umeå, Sweden (Dnr. 07-155M and Dnr. 08-211M).

Outcome variables

The variables were obtained from a large set of knee tests, questionnaires and scores considered to have good measurement properties, which are commonly used in research and clinics for evaluation of ACL rehabilitation [12, 24]: we chose nine functional tests (including hop tests, strength measurements and balance tests), four self-reported questionnaires, and three examiner-administrated scores, resulting in a total of 48 outcome variables. Brief descriptions of all 48 variables and information about their feasibility are found in Table 2. The different hop tasks and the one-leg balance have been comprehensively described in earlier papers [7, 25]. The variables obtained from functional tests were recorded in a movement laboratory; U-motion lab Umeå University. Participants performed the one-leg hop for distance (OLH), one-leg vertical hop (VH), rise from chair (RC), side hop (SH) and one-leg balance (B) on both the injured (i) and the non-injured (c) leg. For healthy-knee controls both the non-dominant leg (i) and the dominant leg (c) were included. For each exercise both absolute measurements (e.g. maximal hop distance on each leg) and relative measurements, such as the Limb Symmetry Index (LSI), were considered. The strength variables were obtained from peak isokinetic measurements where knee flexion torque (representing hamstrings, H) and knee extension torque (representing quadriceps, Q) in concentric and eccentric contractions were measured on both legs (for details see Tengman et al. b 2014 [23]). All strength variables were quantified in relation to the body weight (Nm/kg) of the individual. LSI and the ratio between hamstrings and quadriceps peak torque (H:Q ratio) were also calculated.
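The relative measurements mentioned above can be sketched in a few lines. The text does not spell out the LSI formula, so this sketch assumes the common definition (injured leg divided by non-injured leg, times 100); the function names and the example values are hypothetical and are not taken from the study.

```python
def limb_symmetry_index(injured: float, non_injured: float) -> float:
    """LSI in percent. ASSUMPTION: injured / non-injured * 100 (not stated in the text)."""
    if non_injured == 0:
        raise ValueError("non-injured value must be non-zero")
    return 100.0 * injured / non_injured


def hq_ratio(hamstrings_torque: float, quadriceps_torque: float) -> float:
    """Ratio of hamstrings to quadriceps peak torque (both in the same unit)."""
    if quadriceps_torque == 0:
        raise ValueError("quadriceps torque must be non-zero")
    return hamstrings_torque / quadriceps_torque


def per_body_weight(torque_nm: float, body_weight_kg: float) -> float:
    """Peak torque expressed relative to body weight (Nm/kg)."""
    return torque_nm / body_weight_kg


# Hypothetical one-leg hop distances in metres (injured leg, non-injured leg).
lsi = limb_symmetry_index(1.2, 1.5)
# Hypothetical peak torques in Nm for a 75 kg individual.
ratio = hq_ratio(90.0, 150.0)
quadriceps_rel = per_body_weight(150.0, 75.0)
```

Under the assumed definition, an LSI below 100 means the injured leg performed worse than the non-injured leg.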

Table 2. A brief description of the 48 outcome variables included in the analysis.

All individuals answered several knee-specific and more general questionnaires, including KOOS [10], the Physical Activity Scale (PAS) [12, 26], the International Physical Activity Questionnaire (IPAQ) and the 36-Item Short Form Health Survey (SF-36) [27]. For KOOS and SF-36 each sub score was considered as one variable. Lysholm score, Tegner activity scale [11] and Beighton score were examiner administrated. See Table 2 for the complete list of all variables. In addition to the variables described above, some background variables were observed, including age, sex, and clinical history (i.e. ACL-injured or healthy-knee control).

Feasibility index

A clinical test battery should be feasible. We therefore asked ten expert physiotherapists to independently rank all included functional assessments and questionnaires according to time requirement and to which extent specific equipment is needed. The ranking from the physiotherapists is presented in Table 2 as a T:E-index, where T stands for time and E for equipment. Regarding time, a ranking of 1, 2 or 3 corresponds to less than 15 minutes, 15–30 minutes or more than 30 minutes respectively. The estimated time demand includes the needed time for preparation, execution and data registration.

Regarding equipment, a ranking of 1 corresponds to basic equipment always being available, while a ranking of 2 implies advanced equipment or licenses. The T:E-indexes in Table 2 were obtained as the median of the answers given by the physiotherapists. The aim with the T:E-index was to allow comparisons between test batteries regarding feasibility.
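As a minimal sketch of how a T:E-index could be derived from the raters' answers, the snippet below takes the median of ten time rankings and ten equipment rankings for one test. The rankings are hypothetical and only illustrate the calculation; they are not the physiotherapists' actual answers.

```python
from statistics import median

# Hypothetical rankings from ten raters for a single functional test.
# Time: 1 = under 15 min, 2 = 15-30 min, 3 = over 30 min.
time_ranks = [1, 1, 2, 1, 1, 2, 1, 1, 1, 2]
# Equipment: 1 = basic equipment, 2 = advanced equipment or licenses.
equipment_ranks = [1, 1, 1, 1, 2, 1, 1, 1, 1, 1]


def te_index(time_ranks, equipment_ranks):
    """Return the (T, E) pair as the medians of the raters' rankings."""
    return median(time_ranks), median(equipment_ranks)


t, e = te_index(time_ranks, equipment_ranks)
```

With an even number of raters the median is the mean of the two middle answers, so a half-step value such as 1.5 is possible when the raters are evenly split.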

Statistical analysis

In order to better understand the relationship between all outcome variables, we use correlation analysis combined with cluster analysis, and to statistically derive potential test batteries we use logistic regression models. To evaluate the models, i.e. the test batteries, we consider each model's misclassification rate. The statistical analysis is summarized in Fig 1 and the methodology details are presented below.

Correlation analysis combined with hierarchical cluster analysis was used to identify highly correlated variables. First, Spearman's rank correlation was used to calculate the correlation between all pairs of variables, denoted $\rho_{ij}$, where $i$ and $j$ index the variables. Next, a dendrogram (i.e. a tree describing the relative distance between the variables) was obtained using hierarchical cluster analysis [28] with Ward linkage and a distance matrix $D$ whose elements were one minus the absolute correlation, i.e. $d_{ij} = 1 - |\rho_{ij}|$. The cluster analysis resulted in a dendrogram where highly correlated variables were grouped in clusters. Each cluster corresponds to one branch of the dendrogram.
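The distance matrix described above can be sketched as follows. This is a plain-Python illustration on made-up data, not the authors' R code: the variable names and values are hypothetical, and the Ward clustering step itself is left to a statistics package (for example R's `hclust`).

```python
from math import sqrt


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def distance_matrix(columns):
    """columns: dict of name -> observations. Returns names and d_ij = 1 - |rho_ij|."""
    names = list(columns)
    matrix = [[1.0 - abs(spearman(columns[a], columns[b])) for b in names] for a in names]
    return names, matrix


# Hypothetical observations for three variables on five individuals.
data = {
    "hop": [1.0, 1.2, 1.4, 1.6, 1.8],
    "strength": [2.0, 2.1, 2.5, 2.4, 3.0],
    "time": [9.0, 8.0, 7.0, 6.0, 5.0],
}
names, d = distance_matrix(data)
```

Because the distance uses the absolute correlation, a perfectly negatively correlated pair (here `hop` and `time`) has distance 0 and would be clustered together just like a perfectly positively correlated pair.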

Test batteries were obtained by selecting 1–30 variables from the functional assessments defined in Table 2. Here, different selection strategies were considered. For small batteries (1–4 variables) all possible combinations were investigated, and for all larger batteries (5–20 variables) 10,000 randomly sampled sets of variables were considered. In addition, we also considered the complete test battery when including all 30 functional test variables.
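The selection strategy can be sketched as a generator of candidate batteries, together with a count of how many it yields. The sketch assumes that the randomly sampled sizes are the four named in the Fig 3 caption (5, 10, 15 and 20 variables), which is the reading that reproduces the "about 72 000" batteries reported later; the constant and function names are illustrative.

```python
from itertools import combinations
from math import comb
import random

N_VARIABLES = 30                  # functional test variables (from the text)
EXHAUSTIVE_SIZES = range(1, 5)    # all combinations of 1-4 variables
SAMPLED_SIZES = (5, 10, 15, 20)   # ASSUMPTION: sizes taken from the Fig 3 caption
SAMPLES_PER_SIZE = 10_000


def count_batteries():
    """Number of candidate batteries under the assumptions above."""
    exhaustive = sum(comb(N_VARIABLES, k) for k in EXHAUSTIVE_SIZES)
    sampled = SAMPLES_PER_SIZE * len(SAMPLED_SIZES)
    return exhaustive + sampled + 1  # +1 for the complete 30-variable battery


def candidate_batteries(variables, rng):
    """Yield every candidate battery as a tuple of variable names."""
    for k in EXHAUSTIVE_SIZES:
        yield from combinations(variables, k)
    for k in SAMPLED_SIZES:
        for _ in range(SAMPLES_PER_SIZE):
            # Random sets are drawn independently, so repeats are possible.
            yield tuple(rng.sample(variables, k))
    yield tuple(variables)


variables = [f"v{i}" for i in range(N_VARIABLES)]  # placeholder variable names
total = count_batteries()
```

Under these assumptions the count is 30 + 435 + 4060 + 27405 exhaustive combinations, plus 40 000 sampled sets, plus the complete battery.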

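For each test battery, logistic regression was used to model knee function as a function of the variables in the respective test battery. This was done as follows. Let $Y$ denote the binary variable reflecting the clinical history of the patient, that is, 1 for healthy controls and 0 for ACL-injured. Let $(X_{b1}, \ldots, X_{bk})$ denote the variables used in the $b$th battery. Logistic regression [29], using the above explanatory variables but no interaction terms, was used to model the probability $w_b$ that the patient is healthy, by

$$w_b = \frac{\exp(\beta_{b0} + \beta_{b1} x_{b1} + \cdots + \beta_{bk} x_{bk})}{1 + \exp(\beta_{b0} + \beta_{b1} x_{b1} + \cdots + \beta_{bk} x_{bk})},$$

where $\beta_{b0}$ is the intercept and $\beta_{bi}$ is the coefficient of the observed value $x_{bi}$ of $X_{bi}$.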

Note that $w_b$ may be interpreted as an estimator of the individual's relative knee function, where 0 is bad and 1 is good. The coefficient $\beta_{bi}$ should be interpreted in the following way: a one-unit increase in the variable $x_{bi}$, holding all other variables at fixed values, multiplies the odds of being a healthy control by $\exp(\beta_{bi})$, i.e. changes the odds by $100(\exp(\beta_{bi}) - 1)$%.
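The interpretation of a coefficient can be checked numerically. In a logistic model the odds w/(1 - w) equal the exponential of the linear predictor, so raising one variable by one unit multiplies the odds by the exponential of its coefficient. The intercept, coefficients and observations below are hypothetical and serve only to illustrate this property.

```python
from math import exp


def knee_function(intercept, coefs, x):
    """Logistic model: probability of being a healthy control given observations x."""
    eta = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + exp(-eta))


def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)


# Hypothetical intercept and coefficients for a two-variable battery.
b0, b = -3.0, [1.5, 0.8]

x = [1.2, 1.0]        # hypothetical observations
x_plus = [2.2, 1.0]   # first variable increased by one unit, second held fixed

odds_ratio = odds(knee_function(b0, b, x_plus)) / odds(knee_function(b0, b, x))
percent_change = 100.0 * (odds_ratio - 1.0)
```

Here `odds_ratio` agrees with `exp(1.5)` up to rounding, whatever starting values are chosen for `x`.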

Each battery was evaluated by considering the corresponding model’s misclassification rate and how feasible the included variables are in the clinic (see below). The commonly used misclassification rate was estimated using leave-one-out cross validation, and should be interpreted as the probability of being classified into the wrong group. The models could include both significant and insignificant variables.
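Leave-one-out cross validation of the misclassification rate can be sketched as below. This is an illustration rather than the authors' implementation: the analysis was done in R, whereas this sketch fits the logistic model with plain gradient ascent, and the data, step count, learning rate and 0.5 cut-off default are assumptions chosen for the example.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, steps=2000, lr=0.1):
    """Gradient ascent on the log-likelihood; returns [intercept, coef_1, ...]."""
    beta = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for row, target in zip(X, y):
            err = target - sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def predict(beta, row):
    """Estimated probability of belonging to group 1 (healthy control)."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], row)))


def loocv_misclassification(X, y, cutoff=0.5):
    """Refit without each individual in turn and classify the left-out one."""
    errors = 0
    for i in range(len(X)):
        beta = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        predicted = 1 if predict(beta, X[i]) >= cutoff else 0
        errors += predicted != y[i]
    return errors / len(X)


# Hypothetical, clearly separated one-variable data (0 = injured, 1 = control).
X = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 1, 1, 1]
rate = loocv_misclassification(X, y)
```

Each individual is classified by a model that never saw that individual, which is why the resulting rate is a less optimistic estimate than the error measured on the data used for fitting.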

The correlations between the estimated knee function w from the final test battery and other variables were calculated using Spearman's rank correlation and, for two-group comparisons, Wilcoxon's rank sum test was used. All statistical analyses were performed using the software R, version 2.15.2.

Compilation of test battery

Altogether about 72 000 test batteries, representing different combinations of the included test variables, were selected and evaluated. All test batteries with a misclassification rate lower than 0.2 were investigated further. This cut-off value was chosen arbitrarily to define a subset of reasonable size for further investigation. The feasibility of those test batteries was estimated by the sum of the variables’ feasibility indexes. For example, a battery including the variable OLH-i from the one-leg hop for distance test and the variable Qc-i from the concentric contraction representing quadriceps has an aggregated feasibility index of 7 according to Table 2. A battery with a low index is regarded as highly feasible. Further, if a test battery includes the variable OLH-i, the variable OLH-c will be available without increasing the feasibility. For the identified functional tests, we considered all possible combinations of variables available for these tests. A condition for the final test battery, based on clinical relevance, was that it would mainly consist of variables related to the injured leg. A relevant final test battery was compiled using these established criteria, and in combination with existing clinical evidence.
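The aggregated feasibility index can be sketched as a sum over the distinct functional tests that a battery requires, so that adding a second variable from an already included test costs nothing extra. The (time, equipment) values below are hypothetical placeholders chosen so that they reproduce the two sums quoted in the text (7 and 11); the actual values are those in Table 2. The helper names are illustrative.

```python
# HYPOTHETICAL (time, equipment) rankings per functional test; see Table 2 for real values.
TE_INDEX = {
    "OLH": (1, 1),  # one-leg hop for distance
    "SH": (1, 1),   # side hop
    "B": (1, 1),    # one-leg balance
    "Qc": (3, 2),   # quadriceps concentric strength
}


def functional_test_of(variable):
    """Map a variable name such as 'OLH-i' to its functional test 'OLH'."""
    return variable.split("-")[0]


def feasibility(battery):
    """Sum T + E over the distinct functional tests needed for the battery."""
    tests = {functional_test_of(v) for v in battery}
    return sum(sum(TE_INDEX[t]) for t in tests)


def screen(batteries_with_rates, max_rate=0.2):
    """Keep batteries below the misclassification cut-off, most feasible first."""
    kept = [(b, r) for b, r in batteries_with_rates if r < max_rate]
    return sorted(kept, key=lambda item: (feasibility(item[0]), item[1]))
```

For example, `feasibility(["OLH-i", "Qc-i"])` and `feasibility(["OLH-i", "OLH-c", "Qc-i"])` give the same value, because both batteries require the same two tests.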


The data from the KACL20-study used in the present paper included data from both healthy-knee controls and ACL-injured individuals. When applying correlation analysis combined with hierarchical cluster analysis, the variables fell into five major clusters that represented clinically meaningful dimensions of knee function. Generally, the pairwise correlation within the clusters was significantly higher than between the clusters (p-value = 0.005), see Fig 2. Each cluster broadly represents a different dimension of knee function. Cluster I: the Hop performance and knee strength included all absolute variables from the functional tests with the exception of the variables from RC and B. Cluster II: the Perceived knee function included most of the self-reported questionnaires and examiner-administrated scores related to perceived knee function, including the five sub scores of KOOS, Lysholm, SF36-bp and SF36-pf. Cluster III: Knee function reflected in activity and health was the most diverse group and included variables related to activity (Tegner, PAS) and general health (SF36), RC (RC-c, RC-i, RC-LSI), and B (B-c, B-i). Cluster IV: the Knee strength ratio and Cluster V: the Limb asymmetry were rather closely related and contained all the relative functional test variables with the exception of RC-LSI. The average absolute correlations between variables within clusters I–V were 0.55 (SD = 0.11), 0.61 (SD = 0.15), 0.14 (SD = 0.13), 0.39 (SD = 0.08), and 0.29 (SD = 0.13), respectively. The average absolute correlation between variables from different clusters was 0.12 (SD = 0.05).

Fig 2. Results from Ward hierarchical cluster analysis based on Spearman correlation.

The analysis resulted in five clusters: the Hop performance and knee strength cluster is associated with absolute measurements of functional tests and knee strength measures; the Perceived knee function cluster is linked with scores and questionnaires; the Knee function reflected in activity and health cluster contains a mixture of variables of different character; the Knee strength ratio and the Limb asymmetry clusters were mainly associated with relative measurements between legs (LSI) in functional tests.

Potential test batteries, including only variables from functional tests, were obtained by selecting 1–30 variables from the clusters. The misclassification rate, defined as the proportion of incorrectly classified individuals with and without ACL injury, when using all 30 strength and functional test variables, was 0.40. The median misclassification rates for test batteries with 1–20 variables varied between 0.29 (15 variables) and 0.36 (3 variables). Interestingly, the test batteries with the lowest (and also highest) misclassification rates were found among test batteries with 3–5 variables, suggesting that a battery with few variables may more accurately reflect knee function (Fig 3).

Fig 3. Misclassification rates for different sizes of test batteries.

Misclassification rates for about 72 000 test batteries of different sizes, representing different combinations of the included test variables. The size of the test battery is the number of included variables. The misclassification rate should be as low as possible. The results for combinations consisting of 5, 10, 15 and 20 variables are based on 10000 random samples. The horizontal line indicates our threshold (0.2) for the highest acceptable misclassification rate.

We identified the following tests as typically connected to a low misclassification rate; one-leg hop for distance (OLH), side hop (SH), one-leg balance (B), rise from chair (RC), and quadriceps concentric (Qc) and eccentric (Qe) knee strength. Therefore, the misclassification rates for all combinations of variables related to these functional tests were additionally investigated. Table 3 shows some of the most interesting test batteries and illustrates how the misclassification rate and the total feasibility-index (indicating time and equipment requirements) change when variables are added to the model.

Table 3. Misclassification rates and the T:E-index for a selected subset of test batteries.

The analyses resulted in several models with misclassification rates below 0.2 and feasibility indexes below or equal to 11, some of which are shown in Table 3. Among them we selected a test battery with the variables OLH-i, SH-i, B-c, and Qc-i obtained from the functional assessments OLH, SH, B, and Qc. The battery has only four variables and three of them are related to the injured leg. The feasibility index for the battery was 11 and the estimated misclassification rate was 0.17 (Table 3). Based on the KACL20-data, the suggested battery resulted in the following model for estimating the patient's overall knee function:

Interpret the new outcome variable, w, as an estimator of the individuals’ knee function where 0 represents very low function and 1 indicates very good knee function. The distributions of the variables in the battery were similar between the ACL-injured and the healthy-knee controls, while the corresponding distributions for the estimated knee function w were more distinct, see Fig 4.

Fig 4. Distributions of the estimator of knee function and the test battery variables.

The distribution of each of the variables included in the estimator of knee function (w) for each of the two groups, i.e. individuals with an ACL injury and healthy-knee controls. For the estimator of knee function, values close to 1 indicate a good knee function, and values close to 0 indicate the opposite. Quadriceps concentric strength was measured in Nm/kg, the one-leg hop for distance in meters, the one-leg balance in number of floor supports, and the side hop in number of side hops.

Interestingly, the estimator w was positively correlated with perceived knee function taken from Lysholm score (ρ = 0.66, p-value < 0.001), all subscales of KOOS (ρ = 0.36–0.60, p-values < 0.001), Tegner activity scale (ρ = 0.26, p = 0.008), and three sub scores (pf, bp, and gh) of SF36 (ρ = 0.23–0.47, p-values < 0.017) and negatively correlated with SF36-mh (ρ = -0.22, p-value = 0.028). No significant difference between sexes was found (p-value = 0.6). Further, there was no correlation between w and age (ρ = -0.12, p-value = 0.25).


The aim of this paper was to suggest a solid statistical selection process to derive a comprehensive yet feasible clinical test battery with different functional aspects to be used in rehabilitation after ACL injury. The test battery may be used to characterize knee function following an ACL injury. In this specific case, for the purpose of suggesting a test battery for long-term follow-up after ACL injury, we used data from the KACL20-study to investigate which combination of variables optimally distinguished ACL-injured individuals from healthy-knee controls, while still being feasible and clinically relevant. We extracted a test battery with four variables related to functional tests that may be used as a complement to questionnaires and scores in the long-term perspective after injury. It is a true challenge to define “good knee function” and many dimensions need to be considered. Previously reported test batteries have used data without healthy-knee controls, i.e. they compared knee function between the injured and the non-injured leg [12, 14, 15]. However, several studies show that there might be decreased bilateral function after a unilateral ACL injury [16–18]. Therefore, it is essential that a test battery can also reliably discriminate between persons with an ACL injury and healthy-knee controls, since knee function may vary substantially across individuals, whether injured or not. This is accomplished by the suggested test battery, which distinguishes those with good knee function from those with less good knee function.

In the clinical setting and in research several different functional tests are used. Many of these test results are highly correlated, as corroborated in the present study (see Fig 2), and thus may provide similar information. Our study investigated which of the nine functional tests used in the KACL20-study, and their related variables, were able to optimally distinguish persons with bad knee function from those with good knee function. Even if adequate tests are used, it is often difficult to interpret the combined information from several tests. In the construction of the new variable w, the statistical analysis identified two hop tasks from the same cluster, but from different branches (cf. Fig 2). They are thus correlated, but represent different aspects of knee function. The OLH is an explosive maximal hop test for distance, which is performed in a forward direction and is assumed to be the most common test used in research and clinics after ACL injury [30, 31]. The SH, on the other hand, is a multiple hop test reflecting endurance, measured as the maximum number of hops performed in a medio-lateral direction. Thus, the OLH and the SH have different purposes and represent different coordinative function and performance. In addition, these tests challenge dynamic knee joint stability differently, where most likely the SH exerts higher demands on rotational stability (cf. [32, 33]) and may be more critical when evaluating capacity in the clinic [7]. For the SH, we have recently demonstrated that the test challenges knee stability substantially (data obtained from the same study population as in the present paper [32]). Both the OLH and SH tests are considered to have high reliability [12] and are considered highly relevant for evaluation of knee function after ACL injury [34].

In addition, the quadriceps concentric strength (Qc) and the one-leg balance test (B) were identified in the selected test battery. Regarding knee muscle strength, a review article by Palmieri-Smith et al. concludes that despite rehabilitation, knee muscle weakness is one of the main dysfunctions following ACL injury [35]. Isokinetic concentric quadriceps strength testing is frequently used in knee rehabilitation, is associated with self-reported knee function [36] and has high reliability [37]. Balance further adds yet another dimension, where studies indicate a reduced ability following ACL injury for the injured as well as the non-injured leg [16, 25]. The combination of the selected tests implies a range across various clinically relevant physical dimensions of knee function and thus seems appropriate.

In the present study, logistic regression is used to combine the information from the variables in the test battery, since the model, expressed in the contributing variables, is relatively easy to interpret. The results showed that the estimator of knee function, w, to a high extent discriminates individuals with ACL injury from healthy-knee controls. Other statistical approaches such as factor analysis or principal component analysis could alternatively be used to summarize information from several variables into a few factors or principal components. These factors or principal components may then be used in further analysis. However, our primary focus was to propose a test battery consisting of a few variables that are clinically observable and relevant. Since the different test batteries were evaluated using logistic regression, it was natural to use this approach to combine the available information. Once the dimension of the problem has been reduced, traditional statistical analyses for univariate data can be performed.

Our final choice of w was based on the calculations of misclassification rates, combined with expert reasoning regarding feasibility, where the latter is crucial for outcome measures to be used in the clinic. Interestingly, the misclassification rate typically decreased when variables were added to the model, up to models with five to six variables; then it increased again. The misclassification rate depends on the choice of cut-off for classification, here set to 0.5. It might not be the optimal cut-off, but it can still be used for comparing test batteries. The misclassification rate for the final test battery was 0.17, meaning that 17% of the individuals were misclassified using the chosen model. It would be possible to include additional variables related to the non-injured leg and thereby reduce the misclassification rate to 0.15, i.e., to classify two additional individuals correctly. Even though the T:E-index does not increase, a model with fewer variables is preferable due to interpretability. Moreover, calculation of the total T:E-index for a test battery was based on the assumption that a functional test is always performed on both legs, which is standard practice in research and in the clinic, where the non-injured leg is used as a reference leg for comparisons. When data for a control group is available for comparisons, the performance of the non-injured leg may not be as necessary to observe. Our final estimator of knee function mainly included the variables discussed above that were related to the injured leg (i). However, for the one-leg balance test, the non-injured leg (c) is used. Indeed, as discussed above, both the injured and non-injured leg display balance deficits after injury [16, 25].
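The dependence of the misclassification rate on the classification cut-off can be illustrated with a small sketch. The estimator values and group labels below are hypothetical; individuals with w at or above the cut-off are classified as healthy-knee controls (label 1), all others as ACL-injured (label 0).

```python
def misclassification_rate(w_values, labels, cutoff=0.5):
    """Proportion of individuals whose predicted group differs from their true group."""
    wrong = sum((w >= cutoff) != bool(label) for w, label in zip(w_values, labels))
    return wrong / len(labels)


# Hypothetical estimator values and true groups (0 = ACL-injured, 1 = control).
w_values = [0.10, 0.30, 0.45, 0.55, 0.60, 0.90]
labels = [0, 0, 1, 0, 1, 1]

rate_default = misclassification_rate(w_values, labels)            # cut-off 0.5
rate_lower = misclassification_rate(w_values, labels, cutoff=0.4)  # cut-off 0.4
```

In this toy example lowering the cut-off reclassifies one control correctly, so the two rates differ even though the estimator values are unchanged. Comparing batteries at one fixed cut-off, as done in the text, keeps the comparison consistent.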

The distributions for each of the variables included in w were similar between the two groups, as seen in Fig 4. This was not the case for the estimator of knee function, where a clear difference in distribution was shown, demonstrating the capacity to discriminate between injured individuals and healthy-knee controls. Nevertheless, some misclassification could be expected considering the difficulty in establishing objective criteria for who actually has good knee function, as is a common experience of clinical experts. A wide range of knee function across individuals is expected, which may particularly be the case a long time after injury with increasing age and deconditioning, and is also the case for non-injured individuals. In the KACL20 data set, which included individuals mainly in their forties, there were individuals who had been successfully rehabilitated and displayed knee function that was as good as that of age-matched healthy-knee controls. Moreover, some of the controls showed results similar to injured individuals, and were therefore classified as injured. Indeed, as shown in Fig 3, a variation in knee function is present within both groups.

Lysholm and KOOS scores are well established and commonly used in ACL rehabilitation. Both scores have high reliability and validity [38, 39]. Our proposed estimator of physical knee function, w, correlated positively with both scores, indicating that it agrees with the individuals' self-reported knee function. We also investigated the potential influence of age and sex, which are individual factors that have been shown to influence some of the outcome variables. For instance, Tegner activity level and KOOS differ between sexes and are negatively influenced by age [40–42]. Physical capacity, including balance, strength and hop ability, is likewise reduced with increasing age and lower for women than for men [43, 44]. In contrast, our study did not show any correlation of estimated knee function with age, and no difference between sexes. This may be because our material is based on a long-term follow-up of knee function after ACL injury and thus covers ages between 35 and 63 years. These older age groups might be more homogeneous than the younger athletes mainly tested in the above-mentioned studies, although many of the individuals with ACL injury in our data were athletes prior to injury. The controls were matched for age and sex but, although strived for, there was no matching of physical activity level.

The tests were performed by cohorts in a unique long-term follow-up with comparatively extensive testing. However, for our aims this is a limitation, since a larger reduction in knee function might be expected shortly after an injury. Thus, the suggested test battery needs to be further validated in other cohorts. In addition, the statistical approach should be applied to data obtained from shorter-term follow-ups after injury to verify the usefulness of the proposed test battery for other stages of rehabilitation. The data used from the KACL20-study include nine different common functional tests as well as established scores and questionnaires adopted in rehabilitation after ACL injury, and hence seem well suited for our aims. Even so, there are many other existing tests that could have been used, e.g., triple-jump and running-eight tests. Recent research has also identified the lack of measures of quality of movement; such indicators could be crucial factors for the prediction of outcome in knee rehabilitation [30]. Kinematic and kinetic variables aimed at capturing movement quality during coordination tasks could provide such measures [45, 46] but are less feasible in the clinic. Nevertheless, research using laboratory-based evaluation could be used to identify and validate the most important clinical outcome measures. For instance, Di Stasi et al. used kinematic and kinetic assessments during gait to validate a clinical test battery in relation to return-to-sport criteria [47]. Kinematically-derived variables may also be used to characterize how challenging different functional tests are with regard to dynamic knee stability and to compare function across groups of individuals [32, 48]. However, kinematic and kinetic analyses usually generate large numbers of variables, and there is a need to reduce these to the most representative parameters. In this context the present statistical approach would be of particular value, especially in large data sets obtained from various functional tasks. Thus, the model may be used to identify appropriate variables rather than arbitrarily selecting them.

In addition to kinematic and kinetic recordings of functional tests, proprioception and laxity may influence knee function [49], and it would be desirable to include such measures to ensure that as many dimensions of physical knee function as possible are considered. Altogether, our proposed test battery comprises four different test variables that are reasonably feasible and, when combined, shown to reliably discriminate knee function, at least in the long term after injury to the knee. An advantage of this test battery over previously proposed ones [12–15] is that it includes both functional coordination tests and more direct knee muscle strength measurements. Further testing of the measurement properties (e.g., validity, reliability, sensitivity) of the suggested test battery in other long-term study populations, more than one year after ACL injury, is warranted.


The present study shows that with a solid statistical approach, we were able to construct a comprehensive and yet feasible test battery for evaluation of knee function after ACL injury which is appropriate in the long-term perspective. Our estimator of knee function combined several aspects, and could be said to more coherently represent true knee function than a single variable is able to. Consensus regarding clinical functional test batteries for various stages of rehabilitation, along with a general health score and a knee-specific health score, would ensure evidence-based assessment of knee function in patients following an ACL injury and enable reliable monitoring of knee function throughout the different phases of rehabilitation. Further, it would make it possible to carry out powerful retrospective and prospective studies over longer timespans post-injury while facilitating comparisons across studies.


The authors would like to acknowledge Lisbeth Brax Olofsson and Monica Edström for partaking in the data collection.

Author Contributions

  1. Conceived and designed the experiments: LS ET PR CH.
  2. Performed the experiments: LS ET PR CH.
  3. Analyzed the data: LS ET PR CH.
  4. Contributed reagents/materials/analysis tools: LS ET PR CH.
  5. Wrote the paper: CH ET LS PR.


  1. Nordenvall R, Bahmanyar S, Adami J, Stenros C, Wredmark T, Fellander-Tsai L. A population-based nationwide study of cruciate ligament injury in Sweden, 2001–2009: incidence, treatment, and sex differences. Am J Sports Med. 2012;40(8):1808–13. pmid:22684536
  2. Register SA. Annual Report 2011 2011.
  3. Ageberg E, Thomee R, Neeter C, Silbernagel KG, Roos EM. Muscle strength and functional performance in patients with anterior cruciate ligament injury treated with training and surgical reconstruction or training only: a two to five-year followup. Arthritis and rheumatism. 2008;59(12):1773–9. pmid:19035430
  4. Frobell RB, Roos HP, Roos EM, Roemer FW, Ranstam J, Lohmander LS. Treatment for acute anterior cruciate ligament tear: five year outcome of randomised trial. BMJ. 2013;346:f232. pmid:23349407
  5. Meuffels DE, Favejee MM, Vissers MM, Heijboer MP, Reijman M, Verhaar JA. Ten year follow-up study comparing conservative versus operative treatment of anterior cruciate ligament ruptures. A matched-pair analysis of high level athletes. Br J Sports Med. 2009;43(5):347–51. pmid:18603576
  6. Meunier A, Odensten M, Good L. Long-term results after primary repair or non-surgical treatment of anterior cruciate ligament rupture: a randomized study with a 15-year follow-up. Scand J Med Sci Sports. 2007;17(3):230–7. pmid:17501866
  7. Tengman E, Brax Olofsson L, Nilsson KG, Tegner Y, Lundgren L, Hager CK. Anterior cruciate ligament injury after more than 20 years: I. Physical activity level and knee function. Scand J Med Sci Sports. 2014;24(6):e491–500. pmid:24673102
  8. von Porat A, Roos EM, Roos H. High prevalence of osteoarthritis 14 years after an anterior cruciate ligament tear in male soccer players: a study of radiographic and patient relevant outcomes. Ann Rheum Dis. 2004;63(3):269–73. pmid:14962961
  9. Irrgang JJ, Anderson AF, Boland AL, Harner CD, Kurosaka M, Neyret P, et al. Development and validation of the international knee documentation committee subjective knee form. The American journal of sports medicine. 2001;29(5):600–13. pmid:11573919
  10. Roos EM, Roos HP, Lohmander LS, Ekdahl C, Beynnon BD. Knee Injury and Osteoarthritis Outcome Score (KOOS)—development of a self-administered outcome measure. J Orthop Sports Phys Ther. 1998;28(2):88–96. pmid:9699158
  11. Tegner Y, Lysholm J. Rating systems in the evaluation of knee ligament injuries. Clin Orthop Relat Res. 1985;(198):43–9. pmid:4028566
  12. Gustavsson A, Neeter C, Thomee P, Silbernagel KG, Augustsson J, Thomee R, et al. A test battery for evaluating hop performance in patients with an ACL injury and patients who have undergone ACL reconstruction. Knee Surg Sports Traumatol Arthrosc. 2006;14(8):778–88. pmid:16525796
  13. Neeter C, Gustavsson A, Thomee P, Augustsson J, Thomee R, Karlsson J. Development of a strength test battery for evaluating leg muscle power after anterior cruciate ligament injury and reconstruction. Knee Surg Sports Traumatol Arthrosc. 2006;14(6):571–80. pmid:16477472
  14. Noyes FR, Barber SD, Mangine RE. Abnormal lower limb symmetry determined by function hop tests after anterior cruciate ligament rupture. Am J Sports Med. 1991;19(5):513–8. pmid:1962720
  15. Reid A, Birmingham TB, Stratford PW, Alcock GK, Giffin JR. Hop testing provides a reliable and valid outcome measure during rehabilitation after anterior cruciate ligament reconstruction. Physical therapy. 2007;87(3):337–49. pmid:17311886
  16. Negahban H, Mazaheri M, Kingma I, van Dieen JH. A systematic review of postural control during single-leg stance in patients with untreated anterior cruciate ligament injury. Knee Surg Sports Traumatol Arthrosc. 2014;22(7):1491–504. pmid:23644752
  17. Roberts D, Friden T, Stomberg A, Lindstrand A, Moritz U. Bilateral proprioceptive defects in patients with a unilateral anterior cruciate ligament reconstruction: a comparison between patients and healthy individuals. J Orthop Res. 2000;18(4):565–71. pmid:11052492
  18. Urbach D, Nebelung W, Weiler HT, Awiszus F. Bilateral deficit of voluntary quadriceps muscle activation after unilateral ACL tear. Med Sci Sports Exerc. 1999;31(12):1691–6. pmid:10613416
  19. Neeb TB, Aufdemkampe G, Wagener JH, Mastenbroek L. Assessing anterior cruciate ligament injuries: the association and differential value of questionnaires, clinical tests, and functional tests. J Orthop Sports Phys Ther. 1997;26(6):324–31. pmid:9402569
  20. Reinke EK, Spindler KP, Lorring D, Jones MH, Schmitz L, Flanigan DC, et al. Hop tests correlate with IKDC and KOOS at minimum of 2 years after primary ACL reconstruction. Knee Surg Sports Traumatol Arthrosc. 2011;19(11):1806–16. pmid:21445595
  21. Sernert N, Kartus J, Kohler K, Stener S, Larsson J, Eriksson BI, et al. Analysis of subjective, objective and functional examination tests after anterior cruciate ligament reconstruction. A follow-up of 527 patients. Knee Surg Sports Traumatol Arthrosc. 1999;7(3):160–5. pmid:10401652
  22. Wilk KE, Romaniello WT, Soscia SM, Arrigo CA, Andrews JR. The relationship between subjective knee scores, isokinetic testing, and functional testing in the ACL-reconstructed knee. J Orthop Sports Phys Ther. 1994;20(2):60–73. pmid:7920603
  23. Tengman E, Brax Olofsson L, Stensdotter AK, Nilsson KG, Hager CK. Anterior cruciate ligament injury after more than 20 years. II. Concentric and eccentric knee muscle strength. Scand J Med Sci Sports. 2014;24(6):e501–9. pmid:24684507
  24. Lysholm J, Tegner Y. Knee injury rating scales. Acta Orthop. 2007;78(4):445–53. pmid:17965996
  25. Stensdotter A, Tengman E, Brax Olofsson L, Häger C. Deficits in single-limb stance more than 20 years after ACL injury. European Journal of Physiotherapy. 2013;15(2):78–85.
  26. Grimby G. Physical activity and muscle training in the elderly. Acta Med Scand Suppl. 1986;711:233–7. pmid:3535411
  27. Sullivan M, Karlsson J, Ware JE Jr. The Swedish SF-36 Health Survey—I. Evaluation of data quality, scaling assumptions, reliability and construct validity across general populations in Sweden. Soc Sci Med. 1995;41(10):1349–58. pmid:8560302
  28. Stensdotter AK, Tengman E, Hager C. Altered postural control strategies in quiet standing more than 20 years after rupture of the anterior cruciate ligament. Gait Posture. 2016;46:98–103. pmid:27131185
  29. Hosmer DW, Lemeshow S. Applied Logistic Regression: Wiley; 2004.
  30. Engelen-van Melick N, van Cingel RE, Tijssen MP, Nijhuis-van der Sanden MW. Assessment of functional performance after anterior cruciate ligament reconstruction: a systematic review of measurement procedures. Knee Surg Sports Traumatol Arthrosc. 2013;21(4):869–79. pmid:22581194
  31. Tegner Y, Lysholm J, Lysholm M, Gillquist J. A performance test to monitor rehabilitation and evaluate anterior cruciate ligament injuries. The American journal of sports medicine. 1986;14(2):156–9. pmid:3717488
  32. Grip H, Tengman E, Hager CK. Dynamic knee stability estimated by finite helical axis methods during functional performance approximately twenty years after anterior cruciate ligament injury. J Biomech. 2015;48(10):1906–14. pmid:25935685
  33. Tengman E, Grip H, Stensdotter A, Hager CK. Anterior cruciate ligament injury about 20 years post-treatment: A kinematic analysis of one-leg hop. Scand J Med Sci Sports. 2015;25(6):818–27. pmid:25728035
  34. Thomee R, Kaplan Y, Kvist J, Myklebust G, Risberg MA, Theisen D, et al. Muscle strength and hop performance criteria prior to return to sports after ACL reconstruction. Knee surgery, sports traumatology, arthroscopy: official journal of the ESSKA. 2011.
  35. Palmieri-Smith RM, Thomas AC, Wojtys EM. Maximizing quadriceps strength after ACL reconstruction. Clinics in sports medicine. 2008;27(3):405–24, vii-ix. pmid:18503875
  36. Moisala AS, Jarvela T, Kannus P, Jarvinen M. Muscle strength evaluations after ACL reconstruction. Int J Sports Med. 2007;28(10):868–72. pmid:17357967
  37. Pua YH, Bryant AL, Steele JR, Newton RU, Wrigley TV. Isokinetic dynamometry in anterior cruciate ligament injury and reconstruction. Ann Acad Med Singapore. 2008;37(4):330–40. pmid:18461219
  38. Salavati M, Akhbari B, Mohammadi F, Mazaheri M, Khorrami M. Knee injury and Osteoarthritis Outcome Score (KOOS); reliability and validity in competitive athletes after anterior cruciate ligament reconstruction. Osteoarthritis Cartilage. 2011;19(4):406–10. pmid:21255667
  39. Briggs KK, Lysholm J, Tegner Y, Rodkey WG, Kocher MS, Steadman JR. The reliability, validity, and responsiveness of the Lysholm score and Tegner activity scale for anterior cruciate ligament injuries of the knee: 25 years later. The American journal of sports medicine. 2009;37(5):890–7. pmid:19261899
  40. Briggs KK, Steadman JR, Hay CJ, Hines SL. Lysholm score and Tegner activity level in individuals with normal knees. The American journal of sports medicine. 2009;37(5):898–901. pmid:19307332
  41. Frobell RB, Svensson E, Gothrick M, Roos EM. Self-reported activity level and knee function in amateur football players: the influence of age, gender, history of knee injury and level of competition. Knee Surg Sports Traumatol Arthrosc. 2008;16(7):713–9. pmid:18350275
  42. Paradowski PT, Bergman S, Sunden-Lundius A, Lohmander LS, Roos EM. Knee complaints vary with age and gender in the adult population. Population-based reference data for the Knee injury and Osteoarthritis Outcome Score (KOOS). BMC Musculoskelet Disord. 2006;7:38. pmid:16670005
  43. Yoon TS, Park DS, Kang SW, Chun SI, Shin JS. Isometric and isokinetic torque curves at the knee joint. Yonsei Med J. 1991;32(1):33–43. pmid:1877253
  44. Ageberg E, Zatterstrom R, Friden T, Moritz U. Individual factors affecting stabilometry and one-leg hop test in 75 healthy subjects, aged 15–44 years. Scand J Med Sci Sports. 2001;11(1):47–53. pmid:11169235
  45. Fox AS, Bonacci J, McLean SG, Spittle M, Saunders N. What is Normal? Female Lower Limb Kinematic Profiles During Athletic Tasks Used to Examine Anterior Cruciate Ligament Injury Risk: A Systematic Review. Sports Med. 2014;44(6):815–32. pmid:24682949
  46. Hewett TE, Myer GD, Ford KR, Heidt RS Jr., Colosimo AJ, McLean SG, et al. Biomechanical measures of neuromuscular control and valgus loading of the knee predict anterior cruciate ligament injury risk in female athletes: a prospective study. The American journal of sports medicine. 2005;33(4):492–501. pmid:15722287
  47. Di Stasi SL, Logerstedt D, Gardinier ES, Snyder-Mackler L. Gait patterns differ between ACL-reconstructed athletes who pass return-to-sport criteria and those who fail. Am J Sports Med. 2013;41(6):1310–8. pmid:23562809
  48. Sole G, Tengman E, Grip H, Hager CK. Knee kinematics during stair descent 20 years following anterior cruciate ligament rupture with and without reconstruction. Clin Biomech (Bristol, Avon). 2016;32:180–6.
  49. Roberts D, Ageberg E, Andersson G, Friden T. Clinical measurements of proprioception, muscle strength and laxity in relation to function in the ACL-injured knee. Knee Surg Sports Traumatol Arthrosc. 2007;15(1):9–16. pmid:16791634