Intra- and inter-session reliability of rapid Transcranial Magnetic Stimulation stimulus-response curves of tibialis anterior muscle in healthy older adults

Objective. The clinical use of Transcranial Magnetic Stimulation (TMS) as a technique to assess corticospinal excitability is limited by the time required for data acquisition and by measurement variability. This study aimed to evaluate the reliability of Stimulus-Response (SR) curves acquired with a recently proposed rapid protocol on the tibialis anterior muscle of healthy older adults.
Methods. Twenty-four neurologically intact adults (age: 55–75 years) were recruited for this test-retest study. During each session, six SR curves, 3 at rest and 3 during isometric muscle contractions at 5% of maximum voluntary contraction (MVC), were acquired. Motor Evoked Potentials (MEPs) were normalized to the maximum peripherally evoked response; the coil position and orientation were monitored with an optical tracking system. Intra- and inter-session reliability of motor threshold (MT), area under the curve (AURC), MEPmax, stimulation intensity at which the MEP is mid-way between MEPmax and MEPmin (I50), slope at I50, MEP latency, and silent period (SP) were assessed in terms of Standard Error of Measurement (SEM), relative SEM, Minimum Detectable Change (MDC), and Intraclass Correlation Coefficient (ICC).
Results. The relative SEM was ≤10% for MT, I50, latency and SP both at rest and at 5%MVC, while it ranged between 11% and 37% for AURC, MEPmax, and slope. MDC values were overall quite large; e.g., MT required a change of 12%MSO at rest and 10%MSO at 5%MVC to be considered a real change. Inter-session ICCs were >0.6 for all measures except slope at rest and MEPmax and latency at 5%MVC.
Conclusions. Measures derived from SR curves acquired in <4 minutes are affected by measurement errors similar to those found with longer protocols, suggesting that the rapid method is at least as reliable as the traditional methods. Because it was specifically designed to include older adults, this study provides normative data for future studies involving older neurological patients (e.g. stroke survivors).


Introduction
Neuroplasticity is an important marker for motor recovery during neurorehabilitation. One way to measure plasticity is by assessing corticospinal excitability (CSE) using Transcranial Magnetic Stimulation (TMS). TMS is a non-invasive, painless, and well-established technique to evaluate CSE in studies of motor learning and neurorehabilitation [1,2]. When TMS is administered over the cortical motor area, a motor evoked potential (MEP) is extracted from the targeted muscle's electromyogram [3]. The MEP has been the primary measure used to quantify CSE. One way to use the MEP is to draw the input-output relationship between stimulation intensity and the size of the MEP, i.e. the Stimulus-Response (SR) curve [4,5].
The SR curve provides comprehensive information about the excitability of the nervous system [5,6]. SR curves are traditionally acquired by delivering stimuli at predefined stimulation intensities, often between 90% and 150% of the resting motor threshold (MT), with an inter-stimulus interval (ISI) of >4 s, so that the acquisition time for a whole SR curve is typically >8 minutes [5,7-9]. SR curves take into account the response of both neurons with lower threshold, which are in the directly stimulated core region, and those that are activated at higher threshold, either because they are intrinsically less excitable or because they are far from the site where the stimulus is delivered [3]. As a result, the SR curve allows the investigation of changes in excitability of different neuronal populations, in contrast to stimulation at a single intensity [10]. However, despite its limitations, stimulation at a single intensity is still the most widely used method to explore excitability, with the mean of the responses as the outcome measure.
When assessing CSE changes in longitudinal studies [11-15], the reliability of TMS-related measures is of primary importance in order to ensure that any observed change is above the trial-to-trial variability of the measure itself. A recent systematic review describing the reliability of TMS outcome measures of primary motor cortex excitability in healthy subjects concluded that the evidence base is insufficient and is negatively affected by problems with methodological design and statistical analysis [16]. Beaulieu et al. (2017) also pointed out the importance of reporting the Minimal Detectable Change (MDC) for generalization to future work. The MDC represents the minimum difference required to determine whether a significant change has occurred in an individual. The lack of appropriate statistical assessment and a general misunderstanding of the concept of reliability have been underlined by Schambra et al. (2015), who proposed guidelines for the rigorous testing of TMS outcome reliability [17]. They clarify the two main subtypes of reliability: the measurement error (or absolute reliability), which assesses the agreement between repeated measurements in an individual and is mainly used for longitudinal evaluative purposes, and the so-called reliability MP (or relative reliability; 'MP' standing for measurement property), which assesses how well an individual can be distinguished from the others and might be useful for diagnostic purposes.
Variability of TMS-related measures is due to both endogenous and exogenous sources [18-20]. Spontaneous physiological fluctuation in excitability at both cortical and spinal level is a primary cause of endogenous variation in TMS-related measures [21]. Because of this endogenous variability, reducing the length of the acquisition protocol is of utmost importance. Concerning exogenous variability, the most important of the many sources are: i) age and gender; ii) visual attention level [22]; iii) the time of day at which the experiment is performed (related to cortisol levels) [23]; iv) the contraction level of the target muscle [19]; and v) the position and orientation of the coil over the target cortical area, together with stimulation intensity [19,20,24]. Whilst consensus has been reached on the effect of most of these factors, there are conflicting results regarding the effect of age. For example, Pitcher et al. (2003) showed that older adults required greater stimulus intensities to reach maximal motor output in the corticospinal projection to intrinsic hand muscles and were characterized by higher trial-to-trial variability than young participants, especially at near-threshold stimulation intensities [25]. However, a subsequent study did not observe aging-related changes in corticospinal stimulus-response curve characteristics in a population of exclusively male subjects [26]. Therefore, since no final conclusion about the effect of age on CSE can be found in the literature, there is a need for age-matched normative data to be used as a reference for changes in CSE in patients suffering from neurological disease (e.g. stroke).
To minimise variability and allow CSE to be quantified across different intensities, it would be advantageous to acquire the data for the SR curve rapidly. To shorten the acquisition time, Mathias and colleagues studied the minimum ISI that avoids inhibitory or facilitatory interactions between the responses to two consecutive stimuli [9]. They showed that the ISI can be reduced down to 1.4 s without inducing a depression of CSE, which is a well-known effect of 1 Hz repetitive TMS [27]. They also observed that 60 stimuli are sufficient to construct a representative curve, thus demonstrating that reliable SR curves can be acquired in less than 2 minutes. Besides reducing variability, a shorter acquisition time is crucial for transferring TMS-related measures from research to clinical practice, since it can also increase the patient's compliance, thus limiting dropout.
Whereas most studies have explored MEP variability in upper limb and especially hand muscles, lower limb muscles, e.g. the tibialis anterior (TA) or soleus muscle, are less commonly studied. Of the 34 studies selected in Beaulieu et al.'s 2017 systematic review [16], only 10 focused on lower limb muscles. This may be due to the generally moderate reliability found for MEPs in lower limb muscles of healthy participants [6,18,28], whilst in neurological patients (i.e. stroke, multiple sclerosis and incomplete spinal cord injury) even poorer results are obtained [29-31]. Nonetheless, the TA has a crucial role in the recovery of walking (i.e., to overcome the drop-foot phenomenon that is typical after, e.g., stroke), and reliable information on CSE can be crucial for clinical decision making. Importantly, among all studies exploring the reliability of MEPs in lower limb muscles, only Cacchio and colleagues assessed the reliability of whole SR curves in healthy adults [6], even though the SR curve allows the excitability of different neuronal populations to be investigated at the same time. The aim of this study is to investigate the inter- and intra-session reliability of measures derived from SR curves acquired rapidly from the TA muscle in healthy older adults, age-matched to stroke survivors. Both absolute and relative reliability, as well as MDC for TMS measures, are assessed.

Participants and study design
Healthy community-dwelling participants aged between 55 and 80 years with no previous history of neurological injury were recruited for the study. Exclusion criteria were any recognised contraindication to TMS [1]; presence of metal implants or a cardiac pacemaker; history of epilepsy or migraine; neurological or systemic diseases; use of antidepressants and/or anxiolytics; and any lower extremity injury in the three months prior to the first experimental session [32].
The subjects participated in two experimental sessions in a test-retest design. The two sessions were separated by 4-7 days, in agreement with textbooks on psychometric properties, which recommend assessing reliability on different days no more than 2 weeks apart [16,33].
Participants were asked to get sufficient sleep (>6 h), avoid coffee and minimize alcohol consumption the day and night before the experiment. During each session, six SR curves were acquired on the TA muscle of the dominant leg, three at rest and three during isometric muscle contractions at 5% of the maximal voluntary contraction (MVC). The dominant leg was identified by asking the subjects which leg they would use to kick a ball [34].
The research protocol was approved by the central ethical committee of Fondazione Salvatore Maugeri (number: 931 CE, date of approval: 10/03/2014) and conducted in accordance with the Declaration of Helsinki. All participants provided written informed consent to participate.

Apparatus
EMG. Surface self-adhesive Ag/AgCl electrodes (Kendall™, Covidien) were placed in a bipolar configuration over the TA muscle. EMG signals were acquired by a multi-channel signal amplifier (Porti 32™, TMS International) and sampled at 2048 Hz.
TMS. A biphasic TMS stimulator (Magstim Rapid2, The Magstim Company, Dyfed, UK) with a double-cone coil was used to elicit MEPs. The coil position and orientation over the target cortical motor area were monitored in real time using custom frameless stereotaxic C++ software [35] interfacing with an optical tracking system (Polaris Vicra, Northern Digital Inc.). The software allowed the coil position and orientation to be retrieved between sessions and helped the operator maintain the correct coil position and orientation within each session by providing feedback about any errors with respect to the predefined stimulation site and coil orientation. Furthermore, it prevented unnecessary stimuli when the coil was not properly placed over the hotspot. A Graphical User Interface (GUI) developed in Matlab was used to deliver the TMS stimuli, to display the SR curve online and to store the data [36].
Peripheral nerve stimulation. A current-controlled stimulator (RehaStim™, HASOMED GmbH) was used to evoke the maximal muscle response (Mmax).
Muscle force level. A load cell (Tekkal, Milan, Italy) was used to measure the force produced during the isometric contractions of the TA muscle. The force was visually displayed to the participants to help them maintain 5%MVC during the acquisition of active SR curves.
The experimental setup is displayed in Fig 1.

Experimental procedure
Participants were comfortably seated in a quiet room on an armchair with the knee and ankle angles of the dominant leg fixed at about 100° and 90°, respectively. A custom-built wooden support maintained the correct position of the foot (see Fig 1).
At the beginning of each session supramaximal electrical stimuli were delivered to the peroneal nerve to evoke the Mmax. The recorded Mmax was used to normalize the MEPs collected on the same day in order to reduce the test-retest variability due to electrode replacement and to ensure a valid statistical comparison among participants [37,38]. Care was taken to place the electrodes consistently in the two sessions.
During the first session, the optimal stimulation site (hotspot) was identified: the coil was moved in small steps over the TA cortical motor area in order to find the position and orientation which evoked the maximal MEPs in the TA muscle with the lowest stimulation intensity. Once found, the coil position and orientation were saved in the frameless stereotaxic software [35]. On the second day, the frameless stereotaxic software was used to reposition the coil with the same location and orientation with respect to the head as on the first day. With the coil firmly placed over the hotspot, six SR curves were collected using the rapid acquisition protocol described in [9], three with the TA muscle at rest and three during an isometric contraction at 5%MVC. For each SR curve, a train of stimuli was delivered with an ISI of 3 s; the stimulation intensity varied pseudo-randomly on a pulse-by-pulse basis within a range that could be adjusted online. The operator could adjust the minimum and maximum stimulation intensity according to the SR curve displayed online in order to identify the threshold at one end and the plateau at the other. The operator manually stopped the acquisition 3-5 stimuli after the curve had reached a steady state (i.e., it did not change with successive stimuli).
Fig 1. Experimental setup. The coil position and orientation on the skull were monitored by an optical tracking system. An electrical stimulus was delivered to the peroneal nerve to elicit the maximum peripheral muscular response. The operators were provided with two visual feedbacks (VF): one GUI helped in maintaining the correct coil position and orientation (top right) and a second one visually displayed the SR curve while it was acquired (middle right). A load cell was used to monitor the force level produced during active SR curves, which was displayed to the participant (bottom right): the two red lines indicate the target range the participant was asked to maintain, the blue line shows the acquired force level.
An ISI of 3s, intermediate with respect to those tested on upper limb muscles (1.4-4s) in [9], was selected in order to maximize the comfort of the participants. Indeed, higher stimulation intensities are required to evoke MEPs from the TA muscle. Furthermore, this value of ISI gave the subject enough time to recover the correct level of muscle contraction after the TMS stimulus during active SR curves acquisition.
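The pulse-by-pulse randomisation and the resulting acquisition time can be illustrated with a minimal sketch. It assumes a uniform draw of integer intensities within the current range, which the text does not specify; the function names, the 30-80 %MSO range and the seed are hypothetical and serve only as an example.

```python
import random

ISI_S = 3.0  # inter-stimulus interval used in this study (seconds)

def draw_intensities(n_pulses, lo, hi, seed=None):
    """Draw n_pulses integer intensities (%MSO) from [lo, hi], both included.

    Assumption: a uniform pseudo-random draw; the protocol only states that
    the intensity varied pseudo-randomly within an adjustable range.
    """
    rng = random.Random(seed)
    return [rng.randint(lo, hi) for _ in range(n_pulses)]

def acquisition_time_min(n_pulses, isi_s=ISI_S):
    """Duration of one SR curve in minutes for a fixed ISI."""
    return n_pulses * isi_s / 60.0

# Hypothetical range of 30-80 %MSO and 70 pulses per curve.
intensities = draw_intensities(70, lo=30, hi=80, seed=0)
duration_min = acquisition_time_min(len(intensities))
```

With 70 stimuli and an ISI of 3 s, the sketch gives 3.5 minutes per curve.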
As all participants were naïve to TMS, at the beginning of the first session an additional short SR curve (30-40 stimuli) was acquired in order to familiarise the participants with the method. This curve was discarded from the analysis.

Data analysis
The EMG signal was extracted from 150 ms before to 300 ms after each TMS stimulus and high-pass filtered (5th-order Butterworth filter, cut-off frequency of 5 Hz). The root mean square (RMS) of the EMG in the 100 ms before the TMS stimulus, referred to as "background EMG", was computed to monitor the state of the muscle before stimulation: individual MEPs were excluded from the subsequent analysis if their respective background EMG fell outside the mean ± 3 SD computed over the complete dataset of each SR curve.
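The background-EMG exclusion rule can be sketched as follows. This is only an illustration under stated assumptions: the sample standard deviation is used (the text does not say whether the sample or population SD was taken), the function names are hypothetical, and the numbers are made up.

```python
import math
import statistics

def rms(samples):
    """Root mean square of a sequence of EMG samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def keep_mask(background_rms, n_sd=3.0):
    """Flag trials whose background EMG lies within mean +/- n_sd * SD.

    background_rms holds one pre-stimulus RMS value per MEP of a single
    SR curve. Assumption: sample SD (statistics.stdev).
    """
    mean = statistics.mean(background_rms)
    sd = statistics.stdev(background_rms)
    lo, hi = mean - n_sd * sd, mean + n_sd * sd
    return [lo <= value <= hi for value in background_rms]

# Made-up example: 19 quiet trials and one trial with high pre-activation.
background = [1.0] * 19 + [10.0]
mask = keep_mask(background)  # only the last trial is flagged for exclusion
```

Note that with few trials a single outlier inflates the SD, so the rule becomes more permissive for short curves.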
The MEP size was computed as the peak-to-peak value (MEPpp) of the EMG signal in a 60 ms window starting 20 ms after the TMS stimulus.
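A minimal sketch of the peak-to-peak computation is given below. It assumes an epoch that starts 150 ms before the stimulus and is sampled at 2048 Hz, as described above; the rounding of times to sample indices and the helper names are assumptions made for the example, and the synthetic epoch is not real data.

```python
FS_HZ = 2048             # EMG sampling rate reported in the Apparatus section
EPOCH_START_MS = -150.0  # epoch begins 150 ms before the TMS stimulus

def ms_to_index(t_ms, fs_hz=FS_HZ, epoch_start_ms=EPOCH_START_MS):
    """Convert a time relative to the stimulus (ms) to a sample index."""
    return round((t_ms - epoch_start_ms) * fs_hz / 1000.0)

def mep_peak_to_peak(epoch, start_ms=20.0, length_ms=60.0):
    """Peak-to-peak amplitude in the window [start_ms, start_ms + length_ms]."""
    i0 = ms_to_index(start_ms)
    i1 = ms_to_index(start_ms + length_ms)
    window = epoch[i0:i1 + 1]
    return max(window) - min(window)

# Synthetic epoch: flat signal, a stimulus artifact at 0 ms, and a biphasic
# deflection inside the 20-80 ms window.
epoch = [0.0] * ms_to_index(300.0)
epoch[ms_to_index(0.0)] = 5.0  # artifact, outside the MEP window
epoch[360] = 0.5               # positive peak
epoch[400] = -0.3              # negative peak
mep_pp = mep_peak_to_peak(epoch)
```

Starting the window 20 ms after the stimulus keeps the stimulus artifact out of the peak-to-peak estimate.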
To construct the SR curve, MEPpp values were first normalised to the peak-to-peak amplitude of the Mmax and then plotted as a function of the stimulation intensity; finally, all data were modelled using a four-parameter Boltzmann sigmoid (Equation I) [9], where MEPmin and MEPmax are the minimum and maximum asymptotes of the function, I50 is the percentage of maximal stimulator output (%MSO) at which the MEP is mid-way between MEPmin and MEPmax, and S is the slope at I50. The goodness of fit was evaluated by means of the coefficient of determination R². Curves with R² < 0.75 were discarded.
Corticospinal excitability was assessed in terms of:
1. Motor Threshold (MT), i.e. the minimum stimulus that evokes an MEP in the muscle; it reflects the membrane excitability of the neurons in the cortical region of the target muscle [39]. It was computed as the x-intercept of the tangent to the sigmoid function at the point of maximal slope, i.e. I50 (see Equation I) [7]. It is expressed as %MSO.
2. Area Under Recruitment Curve (AURC), computed as the integral under the sigmoid function. This parameter provides a global estimate of corticospinal excitability and is suggested to characterize the corticospinal projections to a wide range of muscles. An increase of the area indicates an increase of excitability [40].
3. MEPmax, which reflects the maximum corticospinal response of the cortical neurons evoked by the stimulation [2].
4. I50, i.e. the stimulation intensity at which the MEP is mid-way between MEPmin and MEPmax. It is expressed as %MSO.
5. Slope (S), expressed as %MSO⁻¹. It has been suggested to give information about the neurophysiological strength of intracortical and corticospinal projections [39,41].
6. Latency, i.e. the time period between the TMS stimulus and the MEP onset. This parameter was computed only for MEPs elicited by stimuli with an intensity higher than the one corresponding to 80% of the difference between the maximum and minimum plateaus of the SR curve. Changes in MEP latency may reflect variations in central motor conduction time [6,42].
7. Silent period (SP), i.e. the period of EMG activity suppression following a supra-threshold TMS stimulus. It was computed only during muscle contractions and for the same MEPs selected to compute latency, as the time interval between the end of the MEP and the return of the background voluntary activity (i.e., >70% of the mean background EMG) [41]. SP is believed to reflect inhibitory mechanisms at the motor cortex level mediated by GABA-B receptors [41,42].
While MT, AURC, MEPmax, I50, and slope were derived from the sigmoid function of the SR curve, latency and silent period were derived from individual MEPs.
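How the curve-derived measures follow from the fitted sigmoid can be sketched under one common Boltzmann parameterisation. This form, with a width parameter k in %MSO, is an assumption and may differ from the exact equation used in [9]; all function names and numerical values are hypothetical.

```python
import math

def boltzmann(i, mep_min, mep_max, i50, k):
    """Assumed form: MEP(i) = min + (max - min) / (1 + exp((i50 - i) / k))."""
    return mep_min + (mep_max - mep_min) / (1.0 + math.exp((i50 - i) / k))

def slope_at_i50(mep_min, mep_max, k):
    """Derivative of the assumed sigmoid at i50 (its steepest point)."""
    return (mep_max - mep_min) / (4.0 * k)

def motor_threshold(mep_min, mep_max, i50, k):
    """Intensity where the tangent at i50 crosses MEP = 0 (x-intercept)."""
    y50 = 0.5 * (mep_min + mep_max)
    return i50 - y50 / slope_at_i50(mep_min, mep_max, k)

def aurc(mep_min, mep_max, i50, k, i_lo, i_hi, n_steps=1000):
    """Trapezoidal integral of the sigmoid between i_lo and i_hi."""
    h = (i_hi - i_lo) / n_steps
    ys = [boltzmann(i_lo + j * h, mep_min, mep_max, i50, k)
          for j in range(n_steps + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

def intensity_at_fraction(frac, i50, k):
    """Intensity at which the curve has covered frac of (max - min)."""
    return i50 + k * math.log(frac / (1.0 - frac))

# Hypothetical fit: MEPs normalised to Mmax, i50 = 50 %MSO, k = 5 %MSO.
mt = motor_threshold(0.0, 1.0, 50.0, 5.0)          # 40 %MSO
area = aurc(0.0, 1.0, 50.0, 5.0, 30.0, 70.0)       # about 20
i80 = intensity_at_fraction(0.8, 50.0, 5.0)        # cut-off for latency/SP
```

The integration limits of the AURC are not stated in the text, so they are left as explicit arguments here.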

Statistical methods
An a priori power analysis showed that 22 was the minimum sample size required to establish that a reliability coefficient of 0.80 was significantly different from a minimally acceptable reliability coefficient of 0.50, considering α = 0.05 and 1-β = 0.80 [43]. A total of 24 participants were recruited to allow for a 10% drop-out rate.
After verifying the homoscedasticity of each dataset by means of the Breusch-Pagan test [44], the variance components of the observed measurements were estimated using the restricted maximum likelihood method and a random effects model, as follows [17]:
$$\sigma^2_{\mathrm{observed}} = \sigma^2_{\mathrm{subjects}} + \sigma^2_{\mathrm{tests\ or\ days}} + \sigma^2_{\mathrm{residual}}$$
where $\sigma^2_{\mathrm{subjects}}$ is the between-subject variance, $\sigma^2_{\mathrm{tests\ or\ days}}$ is the variance between the three measurements collected on the first day (intra-session) or the variance between days (inter-session), and $\sigma^2_{\mathrm{residual}}$ is the error term which represents all other unexplained sources of variability. Please note that the variance components used to derive inter-session reliability were estimated twice, once considering the average of the three daily measurements of each subject and once considering only the first measurement collected on each day.
The measurement error was estimated by the Standard Error of Measurement (SEM), which can be derived from the variance components as follows [17]:
$$\mathrm{SEM} = \sqrt{\sigma^2_{\mathrm{tests\ or\ days}} + \sigma^2_{\mathrm{residual}}}$$
The SEM takes into account both systematic ($\sigma^2_{\mathrm{tests\ or\ days}}$) and random ($\sigma^2_{\mathrm{residual}}$) error. The relative SEM (SEMrel%) was also computed by normalizing the SEM to the measurement mean, as follows:
$$\mathrm{SEM}_{\mathrm{rel}}\% = \frac{\mathrm{SEM}}{\mathrm{mean}} \times 100$$
From the SEM computed between sessions, the MDC, i.e. the smallest change in score that is likely to reflect a true change rather than a measurement error, was estimated as follows [45]:
$$\mathrm{MDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$$
where $\sqrt{2}$ accounts for the variance associated with two independent sessions and 1.96 is the z-value corresponding to the 95% confidence level.
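The three quantities defined above reduce to a few lines of arithmetic once the variance components are available. The sketch below assumes the components have already been estimated; the function names and the input values are hypothetical and are not taken from the study tables.

```python
import math

def sem(var_tests_or_days, var_residual):
    """Standard Error of Measurement from systematic and random variance."""
    return math.sqrt(var_tests_or_days + var_residual)

def sem_rel_percent(sem_value, mean_value):
    """SEM expressed as a percentage of the measurement mean."""
    return 100.0 * sem_value / mean_value

def mdc95(sem_value):
    """Minimum Detectable Change at the 95% level for two sessions."""
    return 1.96 * math.sqrt(2.0) * sem_value

# Hypothetical variance components (in squared measurement units) and mean.
example_sem = sem(var_tests_or_days=4.0, var_residual=5.0)  # 3.0
example_rel = sem_rel_percent(example_sem, mean_value=30.0)  # 10 %
example_mdc = mdc95(example_sem)                             # about 8.32
```

Because 1.96 × √2 ≈ 2.77, the MDC is always a little less than three times the SEM, which is why even a relative SEM near 10% translates into a large individual-level MDC.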
Relative intra- and inter-session reliability of TMS-related measures was estimated by the Intraclass Correlation Coefficient (ICC), using the ICC(2,1) formula (model 2, two-way random effects, single measures) [17,45]. To take into account possible systematic differences, absolute agreement was selected. For intra-session reliability, the three measurements collected on the first day were considered, while for inter-session reliability, as before, the average daily measurements and only the first measurement collected on each day were considered separately. ICC values >0.70 are usually interpreted as acceptable reliability MP [17].
Repeated-measures ANOVA and paired t-tests were used to evaluate possible systematic errors between the three datasets collected on the first day (intra-session) and between the average daily measurements or only the first measurement collected on the two days (inter-session), respectively.
The measurement properties were computed separately for datasets acquired at rest and during muscle contractions (referred to as "active").
Differences among the three datasets of the two sessions in terms of background EMG and force level were also investigated by means of repeated-measures ANOVA. A paired t-test was used to assess differences in terms of Mmax between the two sessions. Descriptive group data are reported as mean ± standard deviation unless otherwise noted.
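As a rough illustration of an absolute-agreement, single-measure ICC, the sketch below estimates the variance components from two-way ANOVA mean squares. This method-of-moments estimate is a swapped-in simplification: the study used restricted maximum likelihood, which can give different values, especially when a moment estimate turns out negative. The function names and the scores are illustrative and are not data from this study.

```python
def two_way_mean_squares(scores):
    """Mean squares for subjects (rows), sessions (columns) and error.

    scores is a list of rows, one per subject, each with one value per
    test or day; no missing values are allowed.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    return ss_rows / (n - 1), ss_cols / (k - 1), ss_err / ((n - 1) * (k - 1))

def variance_components(scores):
    """Moment estimates of (subjects, tests-or-days, residual) variances."""
    n, k = len(scores), len(scores[0])
    ms_rows, ms_cols, ms_err = two_way_mean_squares(scores)
    return (ms_rows - ms_err) / k, (ms_cols - ms_err) / n, ms_err

def icc_agreement(scores):
    """Subject variance divided by the sum of all three components."""
    v_subj, v_sess, v_res = variance_components(scores)
    return v_subj / (v_subj + v_sess + v_res)

# Illustrative 6 subjects x 4 sessions matrix.
ratings = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
           [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
icc = icc_agreement(ratings)  # about 0.29
```

The large between-session variance in this example lowers the ICC, which shows why absolute agreement penalises systematic shifts between sessions.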
The statistical analysis was performed with IBM SPSS Statistics v23 software.

Results
Twenty-four healthy participants (12 males and 12 females) aged between 55 and 75 years old were recruited. Participants' details are provided in Table 1. Two participants (age 64 and 69 years) did not return for the second session, while the MT at rest was >100%MSO for one subject (64 years) and therefore MEPs at rest could not be evoked. Thus, the intra-session reliability analysis was based on 23 and 24 subjects at rest and 5%MVC, respectively, while for the inter-session reliability analysis 21 and 22 subjects were considered for the passive and active conditions, respectively.
Each SR curve was acquired by delivering an average of 70±2 stimuli at rest and 67±3 during muscle contractions. Thus, the overall duration of each SR curve acquisition was about 3.5 minutes. Either no MEP or a single MEP (average of 0.09%±0.11%) was excluded because of the background EMG. The SR curves had a coefficient of determination R² = 0.85±0.08, with values always greater than 0.75.
The Mmax was not significantly different between the two sessions (5.3±2.0 at day 1 and 5.8±2.6 at day 2, p = 0.388). The participants were able to maintain the TA contractions at 5%MVC as required, exploiting the visual feedback of the force level. Indeed, no significant differences were found among the six active SR curves in terms of either force level (F = 0.75, p = 0.595) or background EMG (F = 0.55, p = 0.735).
Example datasets acquired from a single subject (female, 63 years old, right dominant leg) are shown in Figs 2 and 3. Active SR curves (panels (b)) show lower motor thresholds, steeper slopes, higher MEPmax, and higher AURC compared to the passive ones (panels (a)).
All datasets were found to be homoscedastic and therefore no transformation was needed to compute the measurement properties.

Intra-session reliability
The results of the intra-session reliability analysis are reported in Table 2. Systematic errors were found only for the silent period (repeated-measures ANOVA, p<0.01). All the ICCs were significant (p<0.001). The measurement error for MT was <4%MSO both at rest and at 5%MVC. Similarly, the measurement errors for MEPmax, slope, and latency were similar between the rest and active conditions, at about 0.08, 2%MSO⁻¹, and 2 ms, respectively. For the AURC, the SEM increased from 1.7 at rest to 3.3 during muscle contractions. I50 showed the opposite behaviour: its SEM decreased from 5.5%MSO at rest to 3.8%MSO at 5%MVC. The relative SEM was always below or about 10% except for the AURC at rest (16%) and for MEPmax and slope in both conditions (MEPmax: 26% and 12% at rest and at 5%MVC; slope: 37% and 29%). A relative SEM <10% was previously proposed as a cut-off for high measurement stability [17,46]. Considering this cut-off, all measurements but AURC at rest and MEPmax and slope in both conditions were characterized by a low relative measurement error within the same session. ICC values were >0.70 for all parameters but slope at rest (ICC = 0.54) and SP (ICC = 0.67), suggesting a good reliability MP, i.e. a good ability of the measurement to distinguish between subjects in a sample.

Inter-session reliability
The results of the inter-session reliability analysis based on the mean values of the 3 curves acquired during each testing session are shown in Table 3. The ICC of the slope at rest was not significant (p = 0.121); all the others were significant (p<0.02). The measurement errors for AURC and MEPmax were higher than those obtained within a single session. These measures, as well as the slope, showed a relative SEM >10%, indicating a low measurement stability. All the other measures exhibited a relative SEM about or below 10%. The MDC values were overall quite large; for example, MT required a change of at least 12%MSO at rest and 10%MSO at 5%MVC to be considered a real change above the measurement error. A good reliability MP was found for MT and I50, both at rest and during muscle contractions, AURC and MEPmax at rest, slope during muscle contractions, and SP, with ICC values >0.70.
To evaluate whether it was possible to further reduce the acquisition time, the inter-session reliability analysis was also performed considering only the first curve acquired at rest and at 5%MVC during the two sessions. Results are reported in Table 4. All the ICCs were significant (p<0.03), with the only exception of the slope at rest (p = 0.671). Both reliability measurement properties at rest were overall worse and in the majority of cases did not reach acceptable values. MT and I50 at 5%MVC maintained a relative SEM <10%, indicating a good measurement stability, and ICC values >0.7.

Discussion
This study assessed the reliability of TMS-related measures collected from the TA muscle at rest and at 5%MVC in a population of 24 healthy older adults (mean age of 62 years). These measures, although acquired with the rapid protocol for SR curves proposed in [9], were similar in magnitude to those previously obtained on the TA muscle of healthy subjects [6,18,29,30,47]. Comparisons were not possible for the AURC and I50, since no previous studies evaluated these outcomes on the TA muscle. The measurement errors between sessions were overall quite large. However, all measures but AURC, MEPmax, and slope could be considered acceptable, with a relative measurement error ≤10% [17]. The lowest relative measurement errors were found for MT and I50 both at rest and at 5%MVC. MT was characterized by MDC values of 12.2%MSO (at rest) and 10.2%MSO (at 5%MVC), comparable to the 9.3%MSO for resting MT found in [6,29]. Therefore, we concluded that the MT is reliable even when derived from the SR curve instead of using the traditional method (MT is usually defined as the lowest stimulation intensity inducing an MEPpp >50 μV in at least 5 out of 10 consecutive trials [6,18,29]).
MEPmax showed a measurement error above the cut-off of 10%, as already observed in [6]: we obtained a SEMrel of 36% at rest and 28% at 5%MVC, compared to the 16% found at rest in [6]. The slope showed a higher measurement error than previously reported: in [6] the authors observed a relative SEM of 8.2% for the TA at rest, while we found a relative SEM of 25.1% and 37.1% at rest and at 5%MVC, respectively.
Instead of considering each single curve parameter individually, AURC has been proposed as a global indicator of cortical excitability [40] and therefore as the most clinically meaningful outcome to be used in longitudinal studies. Our study assessed for the first time its measurement properties for lower limb muscles: high values of SEMrel were found between sessions (30% at rest and 26% at 5%MVC), indicating that large changes are needed to exceed the measurement error. This high measurement error reflected the high errors already found for MEPmax and slope. When compared to the literature, higher and lower MDC values were found for latency (7-8 ms versus <2 ms found in [6,29]) and SP (38 ms versus 60 ms in [6]), respectively. However, comparisons must be made with caution since previous studies have assessed these outcomes at different stimulation intensities, and the stimulation intensity has been seen to influence the magnitude of these measures: SP increases and latency decreases with increased stimulation intensity [48,49]. Furthermore, previous studies acquired active SR curves at contraction levels higher than 5%MVC (20%MVC in [6] and 10%MVC in [29]). Although this low background activity made comparison with the literature more difficult, it was chosen to reduce the risk of muscle fatigue, which is particularly relevant for healthy older adults and, even more, for neurological patients (e.g. stroke survivors).
Overall, we observed that the individual changes needed to exceed the MDC values are quite large; for example, MT during slight muscle contractions should change by >10%MSO after an intervention to be considered a real change above the measurement error (MDC = 10.3%MSO at 5%MVC, as reported in Table 3). However, such a change is rarely observed [17]. Therefore, as already suggested in [17] and confirmed in [16], MDC values are better used to identify changes within a homogeneous group of subjects than to track individual changes. Indeed, in the case of groups, the MDC value is divided by the square root of the sample size and is therefore strongly reduced, even for small samples.
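The effect of group size on the MDC can be illustrated with a short calculation. The sketch takes the resting MT MDC of 12.2%MSO reported above and the 21 subjects of the inter-session analysis at rest; the function name is hypothetical and the result is an illustration, not a value reported in the study tables.

```python
import math

def group_mdc(mdc_individual, n_subjects):
    """MDC for a group mean: individual MDC divided by sqrt(sample size)."""
    return mdc_individual / math.sqrt(n_subjects)

# Resting MT: individual MDC of 12.2 %MSO, group of 21 subjects.
mt_rest_group_mdc = group_mdc(12.2, 21)  # about 2.7 %MSO
```

Under these assumptions, a group of 21 subjects would need a mean MT change of less than 3%MSO to exceed the measurement error, compared with about 12%MSO for a single individual.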
Compared to absolute reliability, relative reliability has been more commonly investigated in the literature. Several studies have estimated ICC values for TMS outcomes in the TA muscle of healthy subjects [6,18,29,30,47]. Good relative reliability (ICC >0.70) was generally found for resting MT, slope, latency, silent period, and MEP amplitude; only one study [30] observed a moderate reliability for latency (ICC of 0.55-0.71 during isometric muscle contractions of 10% to 60%MVC) and silent period (ICC of 0.16-0.40). Similar results were observed in our study for MT (inter-session ICC of 0.84 and 0.77 at rest and at 5%MVC, respectively) and SP (ICC of 0.71), while a moderate reliability as in [30] was obtained for latency (ICC of 0.66 and 0.54 at rest and at 5%MVC, respectively). A lower reliability with respect to the literature was found for the slope at rest (inter-session ICC of 0.28 versus 0.78 found in [6]); however, during slight muscular contraction a good relative reliability was regained (ICC of 0.87). For MEPmax, the same value of ICC was found in our study and in [6] at rest (ICC of 0.71). Concerning the AURC, ICC ranged from 0.88 at rest to 0.60 at 5%MVC, and to our knowledge no other studies have evaluated its reliability MP so far. Based on our results, one could conclude that all TMS measures but AURC and MEPmax during muscle contraction, latency, and slope at rest could be used to discriminate between subjects for staging or diagnosis.
As already observed [6,16], lower measurement errors were found when outcomes were acquired within the same day (intra-session, Table 2) than some days apart (inter-session, Table 3), in particular for AURC and MEPmax. This is mainly due to the different sources of variability which affect intra- and inter-session reliability. The measurement error within the same day is mainly due to the physiological fluctuations of the excitability at cortical and spinal levels. By reducing the acquisition time to 3-4 minutes, we expected to decrease the effect of long-term exogenous variability, and so to reduce the intra-session variability. However, this result was not achieved. The higher measurement error between different sessions is most likely due to the methodological sources of variability, such as EMG electrode replacement and hotspot repositioning [16]. The use of an optical tracking system and custom-made software to maintain the same coil position and orientation within and between sessions did not reduce the measurement variability, as already observed [50].
As expected, averaging the parameters over 3 curves increases the reliability with respect to single-curve parameters, since the average reduces the effect of random errors. Therefore, when studying small changes in corticospinal excitability, usually found in healthy participants following brief motor learning paradigms, three SR curves may be needed to optimally quantify changes in excitability. However, in patients where changes in CSE are expected to be large, a single curve will be sufficient to detect changes.
Our study has some limitations. Firstly, an ISI<3s could have been investigated in order to further reduce the acquisition time of the SR curves. Secondly, the study did not collect active SR curves at contraction levels >5%MVC (e.g. 10-20%MVC) and this limited the possibility to compare our results with previous findings.

Conclusion
This study showed that, despite a shorter data collection time and an experimental protocol designed to minimise measurement variability, TMS measures acquired with this method by stimulating the area of the motor cortex representing the TA muscle of healthy older adults are comparable to those obtained with traditional methods and are affected by a large measurement error. Therefore, our results support the use of TMS measures to detect changes significantly above the measurement error in groups of subjects, rather than individual changes. In this way, when used in longitudinal studies aimed at investigating neuroplasticity linked to motor rehabilitation, TMS measures might aid our understanding of how to augment the effect of motor rehabilitation and identify the optimal treatment plans for its effects to persist and translate into improvements in daily life activities. Because it was specifically designed to include older adults, this study provides normative data for future studies involving older neurological patients (e.g. stroke survivors).

Supporting information
S1 File. MT, AURC, slope, I50, MEPmax, latency, and SP computed for each of the 24 participants are reported for the passive (sheet 1) and active (sheet 2) test conditions. Each column corresponds to data extracted from a single SR curve (three at Day1 and three at Day2).