A new strength assessment to evaluate the association between muscle weakness and gait pathology in children with cerebral palsy

Aim. The main goal of this validation study was to evaluate whether lower limb muscle weakness and plantar flexor rate of force development (RFD) were related to altered gait parameters in children with cerebral palsy (CP), when weakness was assessed with maximal voluntary isometric contractions (MVICs) in a gait-related test position. As a subgoal, we analyzed the intra- and intertester reliability of this new strength measurement method.
Methods. Part 1: Intra- and intertester reliability were determined with the intra-class correlation coefficient (ICC2,1) in 10 typically developing (TD) children (age: 5–15). We collected MVICs in four lower limb muscle groups to define maximum joint torques, as well as plantar flexor RFD. Part 2: Validity of the strength assessment was explored by analyzing the relations of lower limb joint torques and RFD to a series of kinematic and kinetic gait features, the GDI (gait deviation index), and the GDI-kinetic in 23 children with CP (GMFCS I-II; age: 5–15) and 23 TD children (age: 5–15) with Spearman's rank correlation coefficients.
Results. Part 1: The best reliability was found for the torque data (Nm), with the highest ICC2,1 (0.951) for knee extension strength (inter) and the lowest (0.693) for dorsiflexion strength (intra). For plantar flexor RFD, the most reliable window size was 300 milliseconds (ICC2,1: 0.828 (inter) and 0.692 (intra)). Part 2: The children with CP were significantly weaker than the TD children (p < 0.001). Weakness of the dorsiflexors and plantar flexors was associated with delayed and decreased knee flexion angle during swing, respectively. No other significant correlations were found.
Conclusion. While our new strength assessment was reliable, intra-joint correlations between weakness, RFD, and gait deviations were low. However, we found inter-joint associations, reflected by a strong association between plantar flexor and dorsiflexor weakness, and decreased and delayed knee flexion angle during swing.


Introduction
Cerebral palsy (CP) is the most common physical disability in childhood with a prevalence of 2-3 in 1000 live born infants [1,2]. Children with CP have varying motor deficits, including neuromuscular symptoms, such as a shortened muscle-tendon unit, spasticity, lack of selective muscle control, and muscle weakness [2][3][4]. These symptoms adversely affect normal development of functional activities such as walking [3]. Many treatment modalities are focusing on these neuromuscular symptoms, aiming to improve gait in children with CP [3]. Therefore, a good insight into the interaction between these neuromuscular symptoms and gait may have a significant impact on the clinical decision-making process. Since muscle weakness, as one of these symptoms, is considered to be a major interfering factor on gait [5], there has been a clinical interest in the association between muscle weakness and pathological gait features. The relationship between muscle weakness and gait deviations in children with CP has been analyzed by several researchers, but their results are difficult to compare and sometimes contradictory [6][7][8][9][10][11][12][13][14][15]. The main problems are the methodological differences, due to the variety of applied strength measurement devices, test positions during the weakness assessments, and selected parameters extracted from the weakness assessment and from 3D gait analysis (Supplementary materials: S1 Table).
A maximal voluntary isometric contraction (MVIC) measured with a hand-held dynamometer is a rather simple, relatively cheap, and easily accessible way to assess muscle weakness in children with CP, and the overall reliability is considered to be good [16][17][18][19][20][21] (Supplementary materials: S2 Table). However, in all these reliability studies, the strength of the assessor had an influence on the measurement outcomes, and compensatory movements of the participants during the measurements could not be excluded [16][17][18][19][20][21]. Further, in previous studies analyzing the effect of muscle weakness on gait, the test positions of the weakness assessment appear to have been selected independently from the joint angles (and thus muscle lengths) observed during gait [6][7][8][9][10][11][12][13][14][15]. By selecting a test position that mimics the averaged joint angles of gait, the relationship between MVIC outcomes and gait parameters may be improved. Also, due to the changes in motor control and muscle morphology [4,22], not only the maximal net joint torques [6][7][8][11][12][13][15], but also the rate of force development (RFD) could be a relevant parameter that influences functional performance [23]. This is especially the case for specific gait phases that are characterized by high angular velocities, such as the push-off around the ankle joint.
Most of the extracted kinematic and kinetic gait parameters in previous literature were linked to specific gait cycle phases during which the evaluated muscles were assumed to be active [6][9][10][11][12][13][14][15]. The rationale behind these study designs was that weakness of a certain muscle group would be related to an altered joint angle or net joint torque delivered by that muscle group. However, a reduced ankle torque while walking at self-selected speed is frequently achieved by reducing the external lever arm, i.e. keeping the ground reaction force closely aligned to the joint center the muscle is acting on. These compensations for weakness at specific joints also cause kinematic and kinetic changes in other joints. Therefore, a gait deviation index describing the entire kinematic and kinetic gait pattern might be another interesting parameter to explore when analyzing the relationship between pathological gait and muscle weakness [24][25][26].
To summarize, hand-held dynamometry seems appropriate to quantify weakness of lower limb muscle groups, but compensation mechanisms and influence of assessor strength were not taken into consideration in previous studies. Further, the test positions used during the weakness assessments were not related to gait. Finally, the selected parameters from both weakness assessments and gait appear to be incomplete. Therefore, the main goal of this study was to determine the validity of the new strength assessment by assessing the relationship between muscle weakness and the altered gait parameters in children with CP, when the above-mentioned limitations were minimized. As a subgoal, reliability of our new strength measurement was also analyzed. Muscle weakness was assessed when the participants and the dynamometer were fixed in a custom-made chair. The test position was based on the average joint angles of gait. The outcomes of the weakness assessment were compared with a series of kinematic and kinetic gait features, as well as gait deviation indices, and walking speed. A detailed overview of these study goals and our hypotheses can be found in Table 1.

Part 1: Reliability of the strength assessment
Subjects. We conducted a power analysis to determine the sample size. Based on the classification of Fleiss [27], the minimal ICC value (ρ0) was set at 0.50 (fair to good) and the maximal ICC value (ρ1) at 0.90 (excellent). With an α error probability of 0.05 and a power (1 - β error probability) of 0.80, this resulted in a minimal sample size of nine participants per reliability assessment (intra and inter) [28].
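The sample size reported above can be reproduced with a short sketch. It assumes the approximation of Walter, Eliasziw and Donner for reliability studies, two measurements per child (k = 2), and a one-sided α. The text does not state which formula or software was used, so the function name and these choices are illustrative assumptions only.

```python
from math import ceil, log
from statistics import NormalDist


def icc_sample_size(rho0, rho1, k=2, alpha=0.05, power=0.80):
    """Approximate number of subjects needed to show ICC > rho0 when the true ICC is rho1.

    Assumed formula (Walter, Eliasziw and Donner approximation), one-sided alpha:
        n = 1 + 2 * (z_alpha + z_beta)^2 * k / (ln(C0)^2 * (k - 1))
    with C0 = (1 + k*theta0) / (1 + k*theta1) and theta = rho / (1 - rho).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    theta0 = rho0 / (1.0 - rho0)
    theta1 = rho1 / (1.0 - rho1)
    c0 = (1.0 + k * theta0) / (1.0 + k * theta1)
    n = 1.0 + 2.0 * (z_alpha + z_beta) ** 2 * k / (log(c0) ** 2 * (k - 1))
    return ceil(n)


# Values taken from the text: rho0 = 0.50, rho1 = 0.90, alpha = 0.05, power = 0.80.
n_subjects = icc_sample_size(0.50, 0.90)
```

Under these assumptions the unrounded estimate is a little over eight, which rounds up to the nine participants stated in the text.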
We recruited 10 TD children, between five and 15 years old without any neurological or neuromuscular problems (Table 2). This study was approved by our local ethics committee (Commissie Medische Ethiek KU Leuven; S56041) and written informed consent was obtained from next of kin, caretakers, or guardians on behalf of the children in accordance with the Declaration of Helsinki. Children aged 12 years or older, signed the informed consent forms themselves as well.
To limit the influence of assessor strength and to decrease compensation mechanisms, the dynamometer and the children were fixed in the chair. The children were secured with a strap over the pelvis and upper legs, and during all measurements the arms were crossed in front of the chest. For the dorsiflexion MVIC, the foot was placed in a heel cuff. A total of three testers, a physical therapist (PT) and two physical therapy master students (PTs1 and PTs2), participated in the reliability study (Table 2).
Forces of both lower limbs were measured with a telemetric MicroFet® 2 hand-held dynamometer (Hogan Health Industries, West Jordan, UT, USA), further referred to as the dynamometer. We determined segment lengths of the lower leg (fibula head to lower border of the lateral malleolus) and the foot (projection of the lateral malleolus on the lateral border of the foot to distal metatarsal head V), and we placed the dynamometer at 75% of this segment length (S2 and S3 Figs). Each measurement consisted of one test trial and three actual trials, each lasting three to five seconds. The resting period between trials was at least ten seconds. Children had visual feedback during the measurements and were verbally instructed and encouraged in a standardized manner. We applied a correction for gravity for the two MVICs that were influenced by gravity (knee flexion and plantar flexion), by subtracting the gravitational torque in rest position from the MVIC outcomes [30].

Table 1. Study goals and hypotheses.
To determine the validity of the new strength assessment by assessing the relationship between muscle weakness and the altered gait parameters in children with CP. The children were positioned in a more gait-related test position, while limiting the influence of assessor strength and compensation mechanisms. Additionally, we determined the reliability of our new weakness assessment. First, intra- and intertester reliability were determined (part 1). Secondly, the validity of the new strength measurement was explored by analyzing the association between the weakness outcomes and several gait parameters (part 2).
Part 1 - Reliability of the strength assessment
Hypotheses regarding the reliability of the new strength assessment:
1. By fixing the children and the dynamometer, the influence of assessor strength and compensation mechanisms will be decreased, resulting in similar reliability outcomes for intra- and intertester measurements.
2. Averaging the MVIC outcomes and RFD values will give better reliability results than using the absolute maximum values.
Part 2 - Validity of the strength assessment
Hypotheses regarding the validity of the strength assessment:
3. Children with CP have lower maximal net joint torques and plantar flexor RFD during the MVICs.
4. High correlations are found between MVIC outcomes and the gait parameters when MVICs are measured with the new strength measurement method.
5. The highest correlations are found between plantar flexion RFD and power generation at the ankle, and between MVIC outcomes and the GDI and GDI-kinetic.

Data analysis.
All parameters from the MVIC and RFD analyses were calculated with a custom-written Matlab script. First, we resampled the force data to 100 Hz and extracted the absolute maximal and mean force (N) over three MVIC trials. Subsequently, force normalized to bodyweight (N/kg), torque (Nm), and torque normalized to bodyweight (Nm/kg) were calculated to allow comparison with previous research. Plantar flexor RFD was calculated as Δforce/Δtime (N/s), with a pre-fixed window size of minimally 100 ms and maximally 700 ms [31]. The onset of the measurement was determined automatically, by calculating when the force curve showed an uninterrupted increase in force based on the standard deviation of the force curve. To determine which window size would give the most reliable outcome, plantar flexor RFD was calculated for each window size from 100 ms to 700 ms in increments of 100 ms.
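The RFD calculation described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab script: the onset rule (first sample above the resting mean plus three standard deviations), the length of the resting baseline, and all function names are assumptions made for the example. Only the 100 Hz sampling rate and the 100 to 700 ms windows come from the text.

```python
from statistics import mean, stdev

FS_HZ = 100  # sampling rate after resampling, as stated in the text


def detect_onset(force, baseline_samples=50, n_sd=3.0):
    """Return the index of the first sample above baseline mean + n_sd * SD.

    The threshold rule and baseline length are assumptions; the text only
    says that onset detection was based on the standard deviation.
    """
    baseline = force[:baseline_samples]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(baseline_samples, len(force)):
        if force[i] > threshold:
            return i
    raise ValueError("no contraction onset found")


def rfd(force, onset, window_ms):
    """Rate of force development (N/s) over a fixed window starting at onset."""
    n = int(round(window_ms * FS_HZ / 1000))
    if onset + n >= len(force):
        raise ValueError("force trace too short for this window")
    return (force[onset + n] - force[onset]) / (window_ms / 1000.0)


# Synthetic trial: 0.5 s of noisy rest, then a linear ramp of 2 N per sample (200 N/s).
trial = [0.0, 0.1] * 25 + [2.0 * (i + 1) for i in range(100)]
onset_index = detect_onset(trial)

# RFD for every window size from 100 ms to 700 ms in steps of 100 ms.
rfd_by_window = {w: rfd(trial, onset_index, w) for w in range(100, 800, 100)}
```

On real MVIC data the force rise is not linear, so the value depends on the window, which is why the reliability of each window size was evaluated separately.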
Statistical analysis. To test hypotheses 1 and 2, intra- and intertester reliability were determined for the net force and torque as well as the plantar flexor RFD. The intra-class correlation coefficient (ICC2,1) based on a two-way random effect model with absolute agreement and a 95% confidence interval (CI) was calculated in SPSS (SPSS Inc., Chicago, IL) [32]. The standard error of measurement (SEM) was calculated as √MSe, in which MSe is the mean squared error from the two-way ANOVA, representing the degree of inaccuracy between the two measurements. The minimal detectable difference (MDD) was calculated as SEM × 1.96 × √2. Both SEM and MDD were expressed as a percentage of the overall mean (% SEM and % MDD, respectively) to allow comparison of our results with other studies. Additionally, the F-ratios were extracted from the ANOVA to determine the presence of a systematic error [32]. The F-ratio was calculated as MSm/MSr, in which MSm represents the systematic variance and MSr the unsystematic variance due to unspecified, random causes. Based on the degrees of freedom in this study, a systematic error would be present when F(1,9) ≥ 5.12 [32,33].

Part 2: Validity of the strength assessment
Subjects. A total of 23 children with CP between five and 15 years old, planned for routine clinical gait analysis, were invited to participate if they: 1) were diagnosed with bilateral or unilateral CP without signs of dyskinesia, 2) had Gross Motor Function Classification System (GMFCS) level I or II, 3) had no Botulinum Toxin-A treatment within six months prior to the assessments, and 4) had no history of lower limb surgery. Twenty-three TD children of a similar age, without any neurological or neuromuscular problems, were recruited. Three TD children who participated in the reliability measurements of part 1 also took part in this part of the study. General subject information of both groups is summarized in Table 3. More detailed subject information can be found in S3-S5 Tables.
All children were tested at CMAL-Pellenberg and this study was approved by a local ethics committee (Commissie Medische Ethiek KU Leuven; S56041) and written informed consent was obtained from next of kin, caretakers, or guardians on behalf of the children in accordance with the Declaration of Helsinki. Children aged 12 years or older, signed the informed consent forms themselves as well.
Data collection. Gait kinematics and -kinetics were collected by means of 3D motion analysis. Markers were located according to the lower body Plug-in-Gait model and marker trajectories of 3D gait analyses were collected using a 10 to 15-camera VICON system (Vicon-UK, Oxford, UK), sampled at 100 Hz. Two force plates (AMTI, Watertown, MA, USA), embedded in the walkway registered force at 1500 Hz. All children walked barefoot on a 10-meter walkway at a self-selected, comfortable speed and as fast as possible without running. The latter could be considered a high demand task, potentially highlighting markers of weakness. We used Nexus software (Nexus 1.8.4. Vicon-UK, Oxford, UK) to define gait cycles and estimate the orientation of the pelvis and the joint angles of the ankle, knee and hip over the three anatomical planes, as well as the joint moments and power of the ankle, knee and hip.
MVICs were collected as described in part 1.
Data analysis. From the kinematic curves, we extracted: sagittal ankle angle at initial contact, sagittal knee angle at initial contact, maximal knee flexion during stance, maximal hip extension angle during mid-stance, maximal dorsiflexion angle during swing, and maximal knee flexion angle and timing of maximal knee flexion during swing. Additionally, we extracted five kinetic features from the internal net joint torques and the joint power: maximal ankle torque during loading response, maximal knee extension torque during mid-stance, maximal knee flexion torque during stance, maximal plantar flexion torque during push-off, and maximal power generation at the ankle during push-off. All parameters were extracted per gait cycle and averaged for all included gait cycles per walking speed, for each participant. All kinetic parameters were normalized to bodyweight.
For the children with CP, the gait deviation indices GDI and GDI-kinetic [25,26] were calculated for both walking speeds, with the 23 TD children of part 2 of this study used as the control group. Both the GDI and GDI-kinetic are measures of overall gait pathology, based on the 'scaled distance between a pathological gait pattern and the average normal gait pattern' [25,26]. A GDI or GDI-kinetic of 100 or higher represents a typical gait pattern. Each 10-point decrement in GDI or GDI-kinetic from 100 indicates a gait pattern that is one standard deviation further away from the average TD gait pattern. Walking speed in meters/second was extracted from the gait data and normalized to a nondimensional value to avoid the effect of leg length [36].
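The two scalings in this paragraph can be illustrated with a short sketch. The reading of the index follows the text directly (10 points per standard deviation below 100). The speed normalization v / √(g · leg length) is an assumption, chosen because it is a common nondimensional form; the text only cites [36] and does not give the formula. Function names and example values are hypothetical.

```python
from math import sqrt

G = 9.81  # gravitational acceleration in m/s^2


def sd_below_typical(index_score):
    """Number of standard deviations a GDI or GDI-kinetic score lies below 100.

    Scores of 100 or higher count as typical and return 0.0.
    """
    return max(0.0, (100.0 - index_score) / 10.0)


def nondimensional_speed(speed_m_s, leg_length_m):
    """Dimensionless walking speed, assuming the form v / sqrt(g * leg length)."""
    return speed_m_s / sqrt(G * leg_length_m)


# Hypothetical child: GDI of 75, walking at 1.2 m/s with a leg length of 0.80 m.
example_sd = sd_below_typical(75.0)              # 2.5 SD from the TD average
example_speed = nondimensional_speed(1.2, 0.80)  # roughly 0.43
```

Dividing by √(g · leg length) removes the units, so children of different stature can be compared on the same scale.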
From the MVICs, average net joint torque normalized to bodyweight over three trials was calculated. For the plantar flexor RFD, a window size of 300 ms was applied (RFD300), from which the absolute maximal value of the three trials was used for further analyses.
Statistical analysis. Since not all data were normally distributed, non-parametric tests were applied in SPSS (SPSS Inc., Chicago, IL). Group differences were tested by means of the Mann-Whitney U-test, for which a Bonferroni correction was applied, resulting in a critical p-value of 0.005. The relationship between MVIC outcomes and gait parameters was assessed by means of Spearman's rank correlation coefficients, for which the classification of Altman was used to interpret the results [34].
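As a sketch of the two statistical steps above, the code below computes a Bonferroni-corrected threshold and a Spearman rank correlation with average ranks for ties. The analysis in the study was run in SPSS; this is only an illustration. The number of comparisons (10) is inferred from 0.05 / 0.005 and is not stated in the text, and the data values are invented.

```python
from math import sqrt
from statistics import mean


def bonferroni_alpha(alpha, n_tests):
    """Critical p-value after Bonferroni correction."""
    return alpha / n_tests


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Assumed: 10 group comparisons, which reproduces the critical p-value of 0.005.
critical_p = bonferroni_alpha(0.05, 10)

# Invented example: plantar flexion MVIC (Nm/kg) vs. maximal knee flexion in swing (degrees).
mvic = [0.5, 0.8, 1.1, 0.9, 1.4]
knee_flexion = [30.0, 35.0, 42.0, 33.0, 50.0]
rho = spearman_rho(mvic, knee_flexion)
```

Because the coefficient is computed on ranks, it is insensitive to the skewed distributions that motivated the non-parametric approach.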

Part 1: Reliability of the new strength assessment
The results of the reliability analyses are reported in Tables 4-6. Torque showed the highest overall reliability for all assessments, with intra-tester ICC2,1 values between 0.681 (dorsiflexors) and 0.934 (knee flexors), and intertester ICC2,1 values between 0.878 (plantar flexors) and 0.947 (knee extensors). When torque was normalized to bodyweight, the ICC2,1 values decreased and the confidence intervals (CI) widened, resulting in ICC2,1 values between 0.399 (dorsiflexors) and 0.872 (knee flexors) for intra-tester measurements, and between 0.220 (knee extensors) and 0.647 (knee flexors) for intertester assessments. No clear difference was found between the use of the absolute maximum value and the average of three trials, but in general, the reliability was better for the averaged data (intra- and intertester), with a maximal ICC2,1 value of 0.951 (knee extension torque; intertester) and a minimal ICC2,1 value of 0.186 (knee extension normalized force; intertester).
Systematic errors were found for both absolute maximum and averaged values for the knee flexor MVICs, with F-values between 5.330 (averaged torque data) and 8.932 (maximal normalized force) for the intra-tester measurements, and F-values between 6.139 (averaged torque data) and 8.134 (maximal normalized force) for the intertester measurements. The plantar flexor MVICs also showed systematic errors for the intertester measurements with F-values between 17.095 (maximal normalized force) and 45.478 (average force). These systematic errors indicated that the second measurement outcome was always higher than the first.
For plantar flexor RFD, the use of a fixed window size of 500 ms produced the highest ICC2,1 (0.752), smallest confidence interval (CI) (0.283-0.932) and lowest % SEM (21.210) and % MDD (58.792) for the intra-tester measurements, when the absolute maximal value was used. A window size of 400 ms had the highest ICC2,1 value (0.836) for the intertester assessments. No clear difference was found between the use of the absolute maximal plantar flexor RFD and the average of the three trials. However, for the intra-tester assessment, the absolute maximal value gave slightly better results for the window sizes higher than 200 ms, with ICC2,1 values between 0.692 (RFD300) and 0.752 (RFD500). For the intertester measurements when using the absolute maximal value, ICC2,1 values were generally higher when the window size was higher than 300 ms, with ICC2,1 values between 0.796 (RFD600) and 0.836 (RFD400). For the intertester measurements, systematic errors were found for all window sizes higher than 300 ms (for both absolute maximal values and averaged values), with F-values ranging between 4.615 (averaged value of RFD600) and 16.701 (maximal value of RFD500), indicating that the second assessment was significantly higher than the first.

Part 2: Validity of the strength assessment
Group differences in gait parameters and strength data between the children with CP and the TD children are reported in Table 7. At self-selected walking speed, the children with CP showed an increased knee flexion angle at initial contact, a decreased dorsiflexion torque during loading response, a decreased hip extension angle during stance, and a lower maximal net plantar flexion torque and power generation at the ankle during push-off (all p < 0.001). The median values and interquartile ranges for the children with CP were: 25.5˚ (16.1) for knee angle at initial contact, -0.01 Nm/kg (0.03) for ankle torque during loading response, 4.8˚ (11.6) for hip angle during stance, and 1.08 Nm/kg (0.26) and 2.
The results of the correlation analysis for the children with CP are presented in Table 8. No clear intra-joint associations (for example between plantar flexion MVIC and ankle torque, or knee extension MVIC and knee angle at initial contact) were found, since correlation coefficients were always lower than 0.41. However, we did find two inter-joint relationships at self-selected walking speed: stronger plantar flexors were associated with an increased maximal knee flexion angle during swing (r = 0.61), and weaker dorsiflexors were related to a delayed timing of maximal knee flexion angle during swing (r = -0.44). At the higher walking speed this latter association disappeared, while the relationship between plantar flexion MVIC and maximal knee flexion during swing remained (r = 0.64).
Walking speed was correlated with several altered gait features. A lower self-selected walking speed was associated with an increase in knee flexion at initial contact (r = -0.42). Children with CP who were able to walk faster at the test condition of higher walking speed, showed a higher power generation at the ankle (r = 0.50) and had a more typical kinematic gait pattern, as was represented by the fair correlation between walking speed and the GDI (r = 0.41). However, when walking at the higher walking speed, the kinetic gait pattern of the children with CP deviated further from the TD children, which was reflected by the negative correlation between walking speed and the GDI-kinetic (r = -0.49).

Part 1: Reliability of the strength assessment
Our hypotheses regarding the reliability of the new strength assessment were confirmed. By fixing the children and the dynamometer, differences between intra- and intertester reliability were small when using force or torque as outcome values, suggesting that the influence of the assessor on the outcome parameters could be avoided. Normalizing the strength data to bodyweight decreased the ICC2,1, resulted in wider confidence intervals (CIs), and increased the differences between intra- and intertester ICC2,1 values, while % SEM and % MDD remained similar. This confirms that the ICC depends on the variability of the data [32], since ICC2,1 values decreased when normalizing to bodyweight, even though the overall reliability remained the same. The best reliability was found for the averaged torque data, with the highest ICC2,1 value for knee extension strength (inter) and the lowest for dorsiflexion strength (intra) (0.951 and 0.693, respectively). Overall, our ICC2,1 results were in line with previously reported repeatability data for torque [17,20]. In our study, the ICC2,1 values were slightly lower for intratester reliability when compared to the results reported by Hebert et al., whereas our intertester ICC2,1 values were generally higher [20]. Willemse et al. showed that averaging two or three trials over one or two sessions resulted in improved reliability when compared to using only one (maximal) value [21]. This was also the case in our study, although the differences between using the absolute maximum and the mean of three trials were small. Unfortunately, a more detailed comparison of our reliability data with the results of previously reported studies is difficult, because the reported reliability results were often incomplete (supplementary materials S2 Table).
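The point that the ICC depends on data variability while the SEM does not can be made concrete with a small sketch. It computes ICC2,1 from the mean squares of a two-way ANOVA (the standard two-way random, absolute agreement, single measure form), together with SEM = √MSe, MDD = SEM × 1.96 × √2 and the F-ratio for systematic error, as defined in the methods. The two data sets are synthetic and the function name is hypothetical; the study itself used SPSS.

```python
from math import sqrt


def reliability_stats(data):
    """Return (ICC2,1, SEM, MDD, F) for a subjects x measurements table."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between measurements
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    icc = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    sem = sqrt(mse)
    mdd = sem * 1.96 * sqrt(2)
    f_ratio = msc / mse  # systematic over unsystematic variance
    return icc, sem, mdd, f_ratio


# Same measurement error in both sets; only the spread between subjects differs.
errors = [0.5, -0.5, 0.5, -0.5, 0.5]
wide = [[t + e, t - e] for t, e in zip([10, 20, 30, 40, 50], errors)]
narrow = [[t + e, t - e] for t, e in zip([28, 29, 30, 31, 32], errors)]

icc_wide, sem_wide, mdd_wide, f_wide = reliability_stats(wide)
icc_narrow, sem_narrow, mdd_narrow, f_narrow = reliability_stats(narrow)
```

In this synthetic case the SEM and MDD are identical for both sets, while the ICC drops markedly for the set with the smaller between-subject spread, which mirrors the effect of normalizing to bodyweight described above.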
The systematic errors in our study were most likely the result of a learning effect during both the intra- and intertester measurements. This suggests that one test trial might not have been enough practice for the TD children. However, fatigue needs to be avoided, so the number of repetitions should be limited.

Table 8. Spearman's rho correlation coefficients between strength measurement outcomes and the gait parameters that differed from the TD children. Moderate or higher correlations are printed in bold.
Although torque systematically showed higher ICC values, we considered normalized torque values more suitable for future studies, due to the known strong relationship between bodyweight and strength [37]. Since % SEM and % MDD were similar for both outcome units, it may be assumed that the lower ICC2,1 values were mainly related to the decreased data variability due to the applied normalization. This was checked with the between-subjects mean square (as a ratio of the measurement mean) [32], which was indeed higher for the torque data than for the normalized torque values. Therefore, for future studies we recommend using normalized torque data, preferably averaged over multiple trials. Averaging the three trials not only gave the highest reliability outcomes, it was also considered to be more representative of the muscle activity needed during gait, since gait is a repetitive movement.
Using the fixed window size of 400 ms resulted in the highest ICC2,1 value (0.836) for RFD of the plantar flexors when multiple testers were used, and a window size of 500 ms showed the highest intratester ICC2,1 value (0.752). This contradicts the results reported by Haff et al., who did not find an increase in ICC values when the window size was higher than 200 ms [31]. However, their reliability analysis was performed on college volleyball players, for whom it can be assumed that RFD is higher than for children, possibly resulting in a smaller optimal window size. Intertester reliability of the RFD showed systematic errors when the window size was higher than 300 ms, suggesting that the second measurement was always higher than the first. Similar to the MVICs, this might have been caused by a learning effect. Therefore, to avoid the influence of a possible learning effect, we opted for a window size of 300 ms instead of 400 ms or 500 ms, and we recommend the same for future studies.

Part 2: Validity of the strength measurement
We verified that the children with CP were weaker than the TD children in this study, thereby confirming our third hypothesis. The differences between children with CP and TD children regarding the MVIC outcomes are well known [13,38], and they are most likely the result of changes in motor control and in muscle and tendon structure in CP [4,22,39,40].
Our final hypotheses were rejected due to the absence of moderate or higher (r > 0.41) intra-joint correlations between gait and muscle weakness. Muscle weakness of the plantar flexors and dorsiflexors was related to a decreased and delayed knee flexion angle during swing, respectively. Dorsiflexion weakness has been associated with a decreased dorsiflexion angle in swing, and in order to maintain foot clearance and avoid tripping (or falling), an increased and prolonged knee flexion is often observed during the swing phase [41]. This might explain the moderate correlation between dorsiflexion weakness and delayed knee flexion in swing found in the current study. However, this finding needs to be interpreted with some caution, since the correlation was absent at the higher walking speed, and no relationship between dorsiflexion weakness and decreased dorsiflexion angle during swing was found. Further, the dorsiflexion angle in swing in the children with CP was not significantly different from that of the TD children. Stronger plantar flexors are known to promote a better push-off and subsequently a fluent knee flexion motion during swing [42,43]. The children with CP indeed had a lower maximal plantar flexion torque and ankle power generation during push-off, but this was not associated with weakness in the plantar flexors nor with reduced plantar flexor RFD. A possible explanation could be the contribution of the passive structures to the internal net joint torques and power generation at the ankle during gait. The ability to store energy in the Achilles tendon during the second ankle rocker and release energy during push-off has been well recognized [44]. Previous studies on the muscle and tendon structure in children with CP reported the Achilles tendon to be longer than in TD children, but with a smaller cross-sectional area [40,45].
These alterations are likely compensatory for reduced muscle compliance, resulting in a higher tendon compliance and consequently an altered ability to store and release energy [45]. These findings may explain the lower maximal net joint torques during gait, and the subsequent reduced knee flexion motion in swing [4,46]. Further, during gait, (passive) connective tissues also contribute to the net joint torques and power generation via myofascial force transmission [44,47,48], while this might be less the case during MVICs. Kaya et al. found evidence that myofascial force transmission plays a role during an isometric contraction. They measured force directly at the tendon of the semitendinosus and analyzed force, joint stiffness, and range of force exertion when the semitendinosus was activated individually and when three knee flexors (gracilis, semitendinosus, and semimembranosus) were activated simultaneously [49]. They reported an increase in force when the three knee flexors were activated in conjunction, but no change in stiffness or range of force exertion [49]. Their findings indicate that myofascial force transmission does have an impact on net joint torque during an MVIC, but how this relates to myofascial force transmission during a dynamic task, such as gait, needs to be explored in future studies.
Another explanation for the low correlations between MVIC outcomes and kinematic and kinetic gait deviations could be the discrepancy in contraction types between the two measurements. During an MVIC, the muscle is contracting isometrically, whilst during gait the muscles are changing length and force. Due to the lengthening of a muscle during gait, spasticity might have an influence on gait kinematics and kinetics as well. However, muscle weakness is considered to have a more extensive influence on gait kinematics and kinetics than spasticity [9,12].
Not only spasticity, but also other neuromuscular and skeletal symptoms, including altered selective muscle control, muscle contractures and bony deformities, are associated with gait kinematics and kinetics in children with CP [3]. Previous research, in a study population of 200 children with unilateral or bilateral CP, indicated only poor to moderate correlations between pre-determined gait features and passive range of motion (r ≤ 0.51), spasticity (r ≤ 0.50) and selective motor control (r ≤ 0.50) [9]. Within the group of children with CP in this study, the correlations between clinical symptoms and gait features at self-selected walking speed ranged from poor to good. For passive range of motion, the highest correlation was found between passive hip extension mobility and knee angle at initial contact (r = -0.61). Increased spasticity of the knee flexors was associated with increased knee flexion angle at initial contact (r = 0.73), and decreased selective control of the plantar flexors with decreased power generation at push-off (r = 0.75). Moreover, the clinical symptoms are inter-correlated [50]. A more involved child frequently has a higher level of spasticity, lower selective muscle control, and more muscle weakness than a less involved child [50,51]. Our study sample size did not have sufficient power for detailed analyses of the interaction between various clinical symptoms and their combined relation to the gait deviations in CP.
Several studies used musculoskeletal models to analyze the effect of muscle weakness on gait kinematics and kinetics [5,52]. Van der Krogt et al. found that normal gait is hard to maintain when the hip abductors, hip flexors, and plantar flexors are weak [5]. Similar results were reported by Steele et al., who determined that weakness of the hip abductors and the plantar flexors contributes to crouch gait in children with CP [52]. Using (subject-specific) musculoskeletal models to analyze the underlying neuromuscular deficits that contribute to muscle weakness, and their interaction with gait, could further enhance our understanding of the relationship between muscle weakness and gait deviations in children with CP.
We measured MVICs in a standardized test position, whereas Ateş et al. and Yucesoy et al. determined that the optimal muscle length for delivering force differs per individual in CP [53–55]. They intraoperatively assessed the maximal isometric force of three different spastic knee flexors (gracilis, semitendinosus, and semimembranosus) in children with CP. Force was measured directly at the tendon and at different joint angles [53–55]. Their results indicate intersubject variability in the optimal muscle length for generating isometric force in the three knee flexors [53–55]. This is in line with the wide range of joint torques we found for the knee flexion MVICs (minimum torque: 0.04 Nm/kg; maximum torque: 0.98 Nm/kg). Additionally, they determined that the angle at which these three knee flexors generate their peak force differs per muscle, thereby allowing for a wider range of motion over which knee flexion torque can be generated [55]. This is very useful during dynamic tasks such as gait, and indicates that the maximal knee flexion torque measured with an MVIC could be an underestimation. These findings imply that there might not be one optimal test position to assess muscle weakness, but that test positions should ideally be muscle and subject specific.
Finally, walking speed has a large influence on gait kinematics and kinetics [56]. The TD children in this study walked faster than the children with CP at both walking speeds, which appears to have been the main reason behind the differences in kinematic and kinetic parameters [56]. This was confirmed by the moderate correlations between self-selected walking speed and the knee angle at initial contact (r = -0.42), and between the higher walking speed and power generation at the ankle (r = 0.50), the GDI (r = 0.41), and the GDI-kinetic (r = -0.49). The latter negative correlation, between the higher walking speed and the GDI-kinetic, suggests that the children with CP used an altered gait strategy to increase their walking speed when compared to the TD children in this study. Riad et al. found a shift in power generation from the ankle to the hip during gait in children with CP, indicating an alternate strategy at pre-swing, from an ankle push-off to a hip pull-off [57]. Analysis of the power generation at the hip of the participants in this study revealed that the children with CP were indeed prone to increase the hip pull-off, as opposed to the ankle push-off, to increase walking speed.
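As an illustration of how a rank correlation such as those reported above is obtained, the following minimal Python sketch computes Spearman's rank correlation coefficient as the Pearson correlation of tie-averaged ranks. It is not the analysis code used in this study. The function names (`average_ranks`, `pearson`, `spearman_rho`) and the six paired values are hypothetical and chosen only for demonstration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example values, NOT data from this study.
walking_speed = [0.9, 1.1, 1.0, 1.3, 1.2, 0.8]  # m/s
ankle_power = [1.2, 1.9, 1.5, 2.4, 1.8, 1.4]    # W/kg
rho = spearman_rho(walking_speed, ankle_power)
print(f"Spearman rho = {rho:.2f}")
```

Because the coefficient depends only on ranks, it captures monotonic rather than strictly linear relations, which suits gait and strength parameters that are not normally distributed.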
One of the limitations of part 1 of this study was that both sides (left and right) were included in the reliability analysis. Also, the reliability study was only performed on TD children and should be extended with data from children with CP in future studies.
Further, during the weakness assessment, we selected a test position with joint angles that were representative of the averaged joint angles during gait found in children with CP, not TD children. We verified that the averaged joint angles for the TD children participating in part 2 were similar to the angles extracted from our retrospective dataset of children with CP (hip: 18.4° ± 6.5°; knee: 25.5° ± 3.9°; ankle: 2.7° ± 4.7°) [29].
Additionally, we combined children with unilateral and bilateral CP, even though the natural history of certain clinical symptoms, such as level of spasticity and muscle weakness, is expected to differ between the two patient groups. However, a separate data analysis revealed no significant differences (Mann-Whitney U test) in passive range of motion, muscle strength, GDI, or GDI-kinetic between children who were bilaterally or unilaterally involved.
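For readers unfamiliar with the test, a group comparison of this kind can be sketched as follows. This is a minimal Python illustration of a two-sided Mann-Whitney U test using the normal approximation with tie and continuity corrections, not the analysis code used in this study. The function name `mann_whitney_u` and the GDI values are hypothetical.

```python
from math import erfc, sqrt


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p), where U is the smaller of the two U statistics.
    Assumption: the normal approximation with tie and continuity
    corrections is used; for very small groups an exact test is preferable.
    """
    n1, n2 = len(a), len(b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    pooled = list(a) + list(b)
    n = n1 + n2
    order = sorted(range(n), key=lambda i: pooled[i])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        # Extend j over the run of observations tied with position i.
        while j + 1 < n and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        t = j - i + 1
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean rank of the tied run
        i = j + 1
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        raise ValueError("all observations are identical")
    z = max(abs(u1 - mu) - 0.5, 0.0) / sigma  # continuity-corrected
    p = erfc(z / sqrt(2))                     # two-sided p-value
    return min(u1, u2), p


# Hypothetical GDI scores, NOT data from this study.
gdi_unilateral = [82.1, 78.4, 90.3, 85.0, 76.9, 88.2]
gdi_bilateral = [80.5, 74.2, 86.7, 79.9, 83.3, 77.0, 81.6]
u_stat, p_value = mann_whitney_u(gdi_unilateral, gdi_bilateral)
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
```

With subgroups as small as those in this study, an exact p-value is generally preferred over the approximation shown here; statistical packages such as SciPy (`scipy.stats.mannwhitneyu`) offer one.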

Conclusion
The reliability of our new weakness assessment was found to be good. However, based on the weak intra-joint correlations between MVIC outcomes and the altered kinematic and kinetic gait parameters, validity is considered to be poor. We did find a strong association of dorsiflexor and plantar flexor weakness with delayed and decreased peak knee angle during swing. These inter-joint associations indicate that the relationship between muscle weakness and altered gait is complex. Also, the interdependency of muscle weakness with other neuromuscular symptoms, such as decreased passive range of motion, reduced selective motor control, and spasticity, should be taken into consideration in future studies.

1 Only fair (r ≥ 0.21) or higher correlations are listed. 2 Only the differences between group 1 and 2 are reported, indicating differences between the stronger vs the weaker children with CP. 3 Median (min-max), instead of mean ± SD. An increase in a value is indicated with a ↑ and a decrease with a ↓. For instance, when looking at pelvic range of motion, Ross & Engsberg found that when the aggregated strength values of the tested muscles (aggF) decreased, pelvic ROM increased (↓ aggF ↑). For each study, only significant results are reported, unless the same parameter, such as cadence, was also tested in another study that found significant results. (DOCX)

S2 Table. Results of intra- or intertester reliability studies on the use of the HHD in CP and/or TD children between 5-15 years of age. Only studies employing the make test and using ICCs as reliability metrics are summarized. When the left and right sides were tested separately, data were averaged for clarity. In case of multiple test protocols or reliability assessments, only the results of the underlined tests and the results of the highest reported ICCs (indicated with a *) have been reported. If possible, parameters were calculated when missing from the paper.