Intra-session and inter-rater reliability of spatial frequency analysis methods in skeletal muscle

Spatial frequency analysis (SFA) is a quantitative ultrasound (US) method originally developed to assess intratendinous tissue structure. This method may also be advantageous in assessing other musculoskeletal tissues. Although SFA has been shown to be a reliable assessment strategy in tendon tissue, its reliability in muscle has not been investigated. The purpose of this study was to examine the reliability of spatial frequency parameter measurement for a large muscle group within a healthy population. Ten participants with no history of lower extremity surgery or hamstring strain injury volunteered. Longitudinal B-mode images were collected at three different locations across the hamstring muscles. Following a short rest, the entire imaging procedure was repeated. B-mode images were processed by manually drawing a region of interest (ROI) about the entire muscle thickness. Four spatial frequency parameters of interest were extracted from the image ROIs. Intra- and inter-rater reliabilities of the extracted SFA parameters were assessed. Test-retest reliability of the image acquisition procedure was assessed between repeat trials. Intraclass correlation coefficients showed high intra- and inter-rater reliability (ICC(3,1) > 0.9 for all parameters) and good to moderate test-retest reliability (ICC(3,1) > 0.50) between trials. No differences in parameter values were observed between trials across all muscles and locations (p > 0.05). The high reliability metrics suggest that SFA will be useful for future studies assessing muscle tissue structure, and may have value in assessing muscular adaptations following injury and during recovery.


Introduction
Imaging modalities such as ultrasound (US) are typically used in conjunction with non-imaging clinical tests to aid in diagnoses of various musculoskeletal (MSK) conditions [1,2]. Measures of quantitative US, such as echo-intensity or first-order statistical analysis of individual image pixels [3][4][5][6][7], have been correlated to muscle quality and function in both animal models and humans [6][8][9][10][11].
Ultrasound is a particularly useful imaging modality because it can reveal tissue organization at a greater resolution than other modalities [12,13]. For example, the same speckle pattern that limits apparent spatial resolution in US can indirectly show tissue health in tendon [14][15][16]. Spatial frequency analysis (SFA) is a quantitative US method, which leverages the coherent imaging properties of US by analyzing the cumulative reflected image pattern. The prominent spatial frequencies within an image show the overall organization of the underlying tissue hierarchy. These spatial frequency characteristics have successfully differentiated pathological and healthy tendons and correlated collagen organization to tendon properties in vivo [14,16]. Therefore, SFA builds upon other grayscale analysis methods by characterizing the two-dimensional speckle pattern of MSK tissue images such as tendon and muscle, rather than just analyzing individual pixels.
To date, SFA has only been applied to tendons. It is of interest to determine if SFA methods may be reliably used in other MSK tissues such as muscle. By measuring the reliability of SFA for muscle, future studies investigating differences in muscle architecture due to injury or during rehabilitation may be conducted with more confidence. The objectives of this study were to examine the use of SFA for assessment of muscle tissue, specifically by 1) determining the intra- and inter-rater reliability of extracted SFA parameters and 2) determining the test-retest reliability of SFA parameters within the hamstring muscles in a healthy population.

Materials and methods
An a priori power analysis was performed using R software [17] based upon the methodology proposed by Zou [18] using a hypothesized value of 0.75, a null hypothesis value of 0, alpha of 0.05, power of 0.80, and number of ratings of each subject of 2. A total sample size of N = 10 subjects was determined. Ten participants from the university community were recruited to participate in this study. Inclusion criteria were: 18-35 years of age, regularly participating (three or more days per week) in exercise or recreational sport; no history of lower extremity surgery or hamstring strain injury; and not being currently pregnant. This study was approved by the Health Sciences Institutional Review Board at the University of Wisconsin-Madison and all participants provided written informed consent.

Ultrasound imaging
Participants were positioned prone on an exam table with their hips and knees supported in neutral position and then asked to remain relaxed with no muscular contraction. In an effort to standardize imaging locations between participants, thigh length from the ischial tuberosity to the midpoint between the femoral condyles was measured and recorded. Skin marks were then made on the participant at 33%, 50%, and 67% of the thigh length from the ischial tuberosity, which corresponded to approximately proximal, mid-belly, and distal regions of the hamstring muscle, respectively. These locations were determined based upon pilot testing to ensure that images were collected at these different regions with minimal tendon infiltration, and are consistent with previous investigations [19][20][21][22].
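The site-marking step above amounts to simple proportional arithmetic. A minimal Python sketch follows, assuming a hypothetical helper name (`imaging_sites`) and a hypothetical 40 cm thigh length that is not taken from the study data:

```python
def imaging_sites(thigh_length_cm):
    """Distances (cm) from the ischial tuberosity to each imaging site.

    The fractions (33%, 50%, 67% of thigh length) are those stated in the
    protocol; the function name and rounding are illustrative assumptions.
    """
    fractions = {"proximal": 0.33, "mid-belly": 0.50, "distal": 0.67}
    return {name: round(frac * thigh_length_cm, 1)
            for name, frac in fractions.items()}


# Hypothetical participant with a 40 cm thigh length (not study data).
sites = imaging_sites(40.0)
```

For the hypothetical 40 cm thigh, the marks would fall 13.2 cm, 20.0 cm, and 26.8 cm distal to the ischial tuberosity.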
All images were obtained using the same machine (Aixplorer, Supersonic Imaging, Weston, FL) and sonographer with over 18 years of US experience (5+ years in MSK US). A linear array transducer (2-10 MHz) was used with the following parameters: imaging depth of 5 cm, dual transmit foci depth of 2 and 3 cm (corresponding to approximately the center of the muscle [23]), and gain of 38%, as this was determined from preliminary image acquisitions to result in clear images without image saturation. All ultrasound settings were kept constant for all image acquisitions [6,24].
Ultrasound gel was liberally applied at each imaging site. To ensure that the targeted hamstring (HS) muscle was imaged, a transverse view was first visualized after ensuring the ultrasound probe was placed at the appropriate location with respect to the ischial tuberosity. Longitudinal B-mode images were then captured for each hamstring muscle (biceps femoris long head, BFlh; semitendinosus, ST; semimembranosus, SM) at each of the three locations along both thighs. Biceps femoris short head (BFsh) was excluded from the analysis primarily because BFsh is deep to BFlh, thereby making imaging difficult [25]. After image acquisitions were completed at all locations from both limbs, the participant sat on the exam table for 60 seconds before lying back down. The same imaging procedure was then repeated for each limb.

Image analysis
The static B-mode images were saved on the local computer and extracted for subsequent analysis. Longitudinal images were processed using custom MATLAB algorithms (Mathworks, Natick, MA). For all image analyses, a polygonal region of interest (ROI) was drawn about the central portion of the muscle of interest with the superficial and deep boundaries of the ROI drawn approximately 3-4 mm from the aponeuroses [26].
All possible 96 x 96 pixel sub-images ("kernels") within the ROI, which correspond to a square with 6.6 mm sides, were analyzed in the spatial frequency domain. The kernel size was determined from pilot data of previously collected hamstring images by observing the cumulative image pattern of several fascicles within the kernel. A 2D Fourier Transform was applied to each kernel after zero-padding to 128 x 128 samples to increase frequency sampling. A 2D highpass filter (-3 dB cut-off at approximately 1.0 mm⁻¹) was then applied to attenuate low spatial frequency artifacts. The kernels were permitted to overlap and the spatial frequency parameters [14,27] were extracted and averaged over all kernels of the ROI. Thus, spatial frequencies were analyzed in both the axial and lateral directions and a single value was obtained for each parameter for the entire ROI (Table 1, Fig 1).
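The per-kernel processing described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the study's MATLAB code. It uses a naive DFT on a small synthetic kernel (24 x 24 pixels zero-padded to 32 x 32) so that it runs without external libraries. It replaces the -3 dB high-pass filter with an ideal cut-off at 1.0 mm⁻¹. It also assumes working definitions of the parameters, which may differ in detail from those in Table 1: PSFR as the radial frequency of the spectral peak, Mmax as the peak amplitude, Sum as the total amplitude, and Mmax% as their ratio. All function names are hypothetical. The pixel size is the one implied by a 96-pixel kernel spanning 6.6 mm.

```python
import cmath
import math

PIXEL_MM = 6.6 / 96  # pixel size implied by a 96-pixel kernel spanning 6.6 mm


def dft_1d(samples, n_out):
    """Naive DFT of `samples`, implicitly zero-padded to n_out points."""
    return [sum(v * cmath.exp(-2j * math.pi * k * i / n_out)
                for i, v in enumerate(samples))
            for k in range(n_out)]


def magnitude_spectrum(kernel, n_pad):
    """|2D DFT| of a square kernel zero-padded to n_pad x n_pad.

    The result is indexed as [axial bin][lateral bin].
    """
    lateral = [dft_1d(row, n_pad) for row in kernel]           # along columns
    axial = [dft_1d([row[l] for row in lateral], n_pad)        # along rows
             for l in range(n_pad)]
    return [[abs(axial[l][k]) for l in range(n_pad)] for k in range(n_pad)]


def sfa_parameters(kernel, n_pad, pixel_mm=PIXEL_MM, cutoff=1.0):
    """Assumed working definitions of PSFR, Mmax, Sum, and Mmax%."""
    mag = magnitude_spectrum(kernel, n_pad)
    df = 1.0 / (n_pad * pixel_mm)  # frequency bin spacing in mm^-1

    def signed(k):
        # Map DFT bin index to a signed frequency index.
        return k if k < n_pad // 2 else k - n_pad

    m_max, psfr, total = 0.0, 0.0, 0.0
    for k in range(n_pad):
        for l in range(n_pad):
            radius = df * math.hypot(signed(k), signed(l))
            if radius < cutoff:
                continue  # ideal high-pass: an assumption, not the -3 dB filter
            total += mag[k][l]
            if mag[k][l] > m_max:
                m_max, psfr = mag[k][l], radius
    return {"PSFR": psfr, "Mmax": m_max, "Sum": total,
            "Mmax%": 100.0 * m_max / total}


# Synthetic kernel: bands repeating every 4 pixels in the axial direction.
N_KERNEL, N_PAD = 24, 32
kernel = [[math.cos(math.pi * r / 2)] * N_KERNEL for r in range(N_KERNEL)]
params = sfa_parameters(kernel, N_PAD)
```

For this synthetic pattern the recovered PSFR equals the band frequency, 1/(4 x 0.06875 mm), or about 3.6 mm⁻¹. With the pixel size assumed here, zero-padding a 96-pixel kernel to 128 samples would give a frequency bin spacing of 1/(128 x 0.06875 mm), or about 0.114 mm⁻¹, compared with about 0.152 mm⁻¹ without padding.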

Reliability procedure
The reliability assessment protocol was divided into three parts. First, the intra-rater reliability was performed to assess the reliability of extracting spatial frequency parameters from ROIs drawn on the same images on two different days by the same rater (S.K.C.) (Fig 2) [26]. A total of 10 images per muscle and location combination were randomly selected for this analysis (90 images total). Second, the inter-rater reliability was performed to compare the extracted spatial frequency parameters from ROIs drawn by two different raters on the same image (Fig 3). A total of 30 images from the mid-belly location were randomly selected for analysis. The mid-belly location was chosen since most HS imaging studies assessing architectural measures are typically performed at the mid-belly of the muscle [19][28][29][30]. The use of this location is also  [26]. Therefore, the SFA method implemented at the mid-belly was performed as a means of comparison to previously reported protocols. The raters were instructed to draw ROIs within the central region in each image. Third, the reliability of the entire image acquisition procedure (test-retest) and SFA method was assessed between the repeat trials at the mid-belly of each muscle from each limb of each participant, resulting in 60 images analyzed (Fig 4).

Statistical analysis
A two-way, mixed effects intraclass correlation (ICC) for absolute agreement (ICC(3,1)) was used to assess the intra-rater reliability of each spatial frequency parameter (Table 1) between the same images. To assess agreement in the ROI segmentation of the muscles between raters, the ROI was converted into a binary image where the enclosed ROI was white and the remainder of the image was black. The area of each polygonal ROI and a Sørensen-Dice similarity coefficient were calculated in MATLAB to compare the pixels of the converted binary images [31]. This was calculated as $\mathrm{dice}(A,B) = 2\,|A \cap B| / (|A| + |B|)$, where A and B are the sets of pixels in the first and second images, respectively, and |A| and |B| are the cardinalities of A and B, respectively. The area of the pixels contained within the ROIs of each rater was calculated and compared. Two-way random, single rater ICCs for consistency (ICC(2,1)) were calculated to determine the reliability between ROI areas and the extracted spatial frequency parameters from the ROIs drawn by different raters. A two-way, mixed effects ICC for absolute agreement (ICC(3,1)) was used to determine the test-retest reliability of the entire image acquisition protocol. The standard error of measurement (SEM, SEM%) of each spatial frequency parameter was calculated as $\mathrm{SEM} = \sqrt{MS_E}$, where $MS_E$ was the mean square error term from the ANOVA table, as this has previously been suggested to allow for more consistent interpretation of SEM values across studies [32,33]. The SEM% was defined as $\mathrm{SEM\%} = (\mathrm{SEM} / \mathrm{Mean}(\text{Trial 1}, \text{Trial 2})) \times 100\%$ [34]. Paired t-tests were performed to compare parameter values between trials. The level of reliability was defined as poor (ICC < 0.5), moderate (0.5 < ICC < 0.75), good (0.75 < ICC < 0.9), or excellent (ICC > 0.9) for both intra-session and inter-rater reliability measures [35,36]. All analyses were performed in IBM SPSS Statistics for Windows, Version 25.0 (IBM Corporation, Armonk, NY).
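As a rough illustration of the agreement statistics described in this section, the following Python sketch computes a Dice coefficient for two binary masks and, from a subjects-by-ratings table, the two-way ANOVA mean squares, single-measure ICCs in consistency and absolute-agreement form, and the SEM. It is not the SPSS or MATLAB analysis used in the study. The function names are hypothetical, the ICC expressions are the standard single-measure formulas, and the SEM is assumed to be the square root of the mean square error. The ratings are the classic Shrout and Fleiss worked example (6 targets, 4 judges), not study data.

```python
import math


def dice(mask_a, mask_b):
    """Sørensen-Dice coefficient of two equally sized binary masks."""
    a = {(r, c) for r, row in enumerate(mask_a) for c, v in enumerate(row) if v}
    b = {(r, c) for r, row in enumerate(mask_b) for c, v in enumerate(row) if v}
    return 2 * len(a & b) / (len(a) + len(b))


def reliability(table):
    """Two-way ANOVA reliability summary for an n-subjects x k-ratings table."""
    n, k = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between ratings
    ss_total = sum((x - grand) ** 2 for row in table for x in row)

    ms_r = ss_rows / (n - 1)
    ms_c = ss_cols / (k - 1)
    ms_e = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    sem = math.sqrt(ms_e)  # assumed form: SEM = sqrt(MS_E)
    return {
        # Single-measure ICC, consistency form.
        "icc_consistency": (ms_r - ms_e) / (ms_r + (k - 1) * ms_e),
        # Single-measure ICC, absolute-agreement form.
        "icc_agreement": (ms_r - ms_e)
        / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n),
        "sem": sem,
        "sem_percent": 100.0 * sem / grand,
    }


# Two small illustrative masks: 4 pixels each, 2 pixels shared.
overlap = dice([[1, 1, 0], [1, 1, 0]], [[0, 1, 1], [0, 1, 1]])

# Shrout and Fleiss worked example: 6 targets rated by 4 judges.
ratings = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
           [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
stats = reliability(ratings)
```

With two ratings per subject (k = 2), the same function applies to a test-retest table in which each row holds the Trial 1 and Trial 2 values of one parameter, and `sem_percent` then divides by the mean of both trials.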

Results
The ten participants (6 males, 4 females) had a mean ± standard deviation (SD) age of 23.4 ± 1.2 years, body mass 69.9 ± 9.3 kg, height 174.2 ± 6.7 cm, and BMI 23.0 ± 3.0 kg/m². The sizes of the ROIs drawn by the same rater on the same image were all within a mean 4.8 ± 7.8% absolute difference in the number of kernels across all muscles and locations. The intra-rater ICC reliability measures for the extracted spatial frequency parameters were excellent (ICC(3,1) > 0.90) for all muscle locations (Table 2).
The reliability of the entire procedure at the mid-belly was good to moderate (ICC(3,1) > 0.50) across all parameters and muscles with the exception of Mmax in BFlh (ICC(3,1) = 0.45) (Table 3). The SEM% for all parameters across all locations was less than 15% (Table 3, range 6.3 to 14.9%). Mean and SD values for each parameter for each trial are shown in Table 4. There were no differences in parameter values between trials across all muscles and locations (p = 0.06-0.95) (Table 4).

Discussion
This study is the first to investigate the use of SFA for assessment of muscle tissue. The intra- and inter-rater reliability of the SFA image analysis method and subsequent parameter extraction were excellent. The test-retest procedure reliability was good to moderate.
The intra-rater reliability of the extracted spatial frequency parameters was nearly unity. Furthermore, the reliability was not dependent upon the muscle (BFlh, ST, or SM) or the location along the muscle length. This indicates that when the same rater performs the SFA procedure on similar images, similar results can be expected. This is an important factor when adapting this method for assessment of muscle structure between different muscles and within different locations along the length of the same muscle. The high ICC values for the intra-rater reliability are consistent with some findings using first-order gray-scale statistics and backscatter analysis in healthy subjects [6,23,37,38] and the SFA method in Achilles tendons [39].
Similarly, the reliability of extracted spatial frequency parameters between raters was excellent and is consistent with other SFA investigations in the supraspinatus [40]. This is particularly noteworthy since the repeatability of the ROI segmentation between raters had less agreement compared to ROIs drawn by the same rater (ICC = 0.79 vs 0.98, respectively). Despite the difference in ROI segmentation between raters (difference between raters was approximately 17%), similar spatial frequency parameter values resulted. It should be noted that the similarity between images was quite high (Dice coefficient = 0.90), so it is unknown what level of differences in ROI similarity would lead to unacceptable levels of reliability. However, the findings in this investigation highlight the strength of the SFA method in that it is more robust to variances in individual pixel intensities and ROI differences between raters [23].
When accounting for the entire procedure, the test-retest reliability was the most variable across muscles and SFA parameters. However, the majority of parameters had good reliability (ICC(3,1) > 0.80) between trials (Table 3), which is consistent with investigations using tendon [34]. It should be noted that the sonographer for the current study was given no instruction to try and capture the exact same images between trials, but was only instructed to capture images at the same locations. Despite the known variability in image capture with US, the reliability of the SFA method and subsequent extracted parameters was good. The Mmax parameter in the BFlh and SM muscles had the lowest repeatability between trials, which is not surprising. This parameter is a measure of the maximum frequency of the most prominent banded pattern in the image and is therefore the most sensitive to out-of-plane alignment. It is possible that the pennate structure of both BFlh and SM muscles compared to the ST [20,41], could make image acquisition more variable. However, the current test-retest results have similar measures of agreement, especially for the Sum parameter (ICC(3,1) > 0.81 for all muscles), as another study investigating the test-retest reliability (ICC(1,1) = 0.74-0.90) of echo-intensity measures in the hamstrings [42].
To date, this SFA method has been limited in application to patellar or Achilles tendons and one study of the supraspinatus tendon [15,16,34,40,43,44]. Thus, some initial considerations had to be made when adapting SFA to muscle tissue. The first corresponded to the parent ROI selection. In this investigation, each rater was instructed to draw the ROI in the central portion of the image, approximately 0.5 cm from the lateral edges of the images to prevent artefact and avoid resolution drop-off on the lateral sides [37]. Additionally, the superficial and deep boundaries of the ROI were selected approximately 3-4 mm from the superficial and deep aponeuroses [26]. This distance was chosen to capture the tissue that best represented the muscle architecture and minimize changes in muscle architecture as the fascicles tend to curve near the aponeuroses [26,45]. Secondly, previous investigations in tendon have used a kernel size of 32 x 32 pixels corresponding to 2 x 2 mm square. However, the tendons in these investigations had different expected fascicle organization (i.e. different dominant spacing seen in the speckle pattern) and different mean thicknesses ranging from 3-5 mm [15,34]. The thickness of the hamstring muscles ranges anywhere from 2-4 cm in previous cadaveric dissection [19,20,41] and US studies in young, active individuals [46,47]. After reviewing several training images, a kernel size of 96 x 96 pixels, which corresponded to a square with 6.6 mm sides, was deemed an appropriate size to capture the cumulative image pattern of several fascicles within the kernel. This kernel size is consistent with a previous investigation of frequency analysis of echo-intensity in US images of the quadriceps, which used a single ROI of 64 x 64 pixels corresponding to a square with 6.4 mm sides [9]. However, the SFA method used in the current study differs from this previous study in that spatial frequencies were analyzed in both axial and lateral directions compared to only a single direction [9].
The SFA method leverages information obtained from the frequency domain as it relates to the overall tissue structure. The hierarchical structure of skeletal muscle is well documented, and is composed of muscle fibers grouped together in fascicles. The anisotropic structure of the muscle due to the fibers and their surrounding perimysium connective tissue, therefore, results in an observed speckle pattern of longer correlation length in the direction of the fascicles compared to the normal direction [14]. This is observed in normal sonographic images as parallel striations of hypoechoic muscle fibers and hyperechoic perimysium [48]. Analysis of the spatial frequency spectrum allows for quantification of the speckle pattern as it relates to the light-dark banding pattern observed in longitudinal B-mode images of healthy muscle.
In this study, we reported four different spatial frequency parameters extracted from the Fourier spectrum (Fig 1, Table 1). To date, most previous studies have investigated the peak spatial frequency radius (PSFR) as a means to classify tendinopathy [15,16,43]. This parameter measures the dominant spacing of the reflected banded pattern. For muscle, PSFR would therefore be expected to indicate the dominant spacing between parallel reflections of muscle fibers and perimysium. Thus, PSFR may prove useful in detecting changes in muscle due to hypertrophy, swelling, localized edema, or mechanical disruption of the perimysium [49][50][51]. The other SFA parameters have not been as widely investigated. The Mmax and Mmax% parameters are complementary in describing the strength of the banded pattern. That is, if Mmax is large, then the banded pattern should be very prominent compared to image background (speckle). Therefore, Mmax and Mmax% might prove useful in detecting deviations from the normal prominence of the striated fascicles in the case of injury or pathology [1,52]. For example, in the case of muscle injury where the perimysium is disrupted, the prominent banding pattern would have an amplitude closer to the background, resulting in a lower Mmax% (i.e. isotropy throughout the imaged tissue). The Sum parameter is most similar to other measures of gray-scale analysis such as echo-intensity, but is the sum total of frequency amplitudes in both axial and lateral directions rather than mean pixel intensity. Although the PSFR has been primarily studied to this point, other parameters described in this investigation may provide added significance and complement PSFR when assessing tissue disruption in the case of tendinopathies or muscle strain injuries.
As this is the first study to use the SFA method in skeletal muscle, the clinical utility of this method in muscle has not yet been fully determined. We believe the current work is an important first step in a potential clinical tool. In order to elucidate possible uses, future studies should investigate any modulation of muscle tissue structure, which impacts functional capacity of the tissue [53]. For example, in the case of hamstring strain injuries, physicians typically assess pain in the posterior thigh and determine if any functional loss is present. Clinical assessments may be corroborated by imaging modalities [54]. In the case of US, the area of maximum tenderness may be imaged and reviewed to determine the presence and extent of any deviation from the normal anatomical structure [52,55,56]. If imaging is repeated following treatment, changes in the size of the injury and echogenicity of the B-mode image will typically be determined. However, changes in muscle architecture will not be quantified. Analysis of B-mode images with SFA may therefore provide objective determination of tissue organization as a result of injury and treatment.
Study limitations should be noted when interpreting the results of this investigation. Previous investigations using SFA have been performed mainly on superficial structures such as the Achilles and patellar tendons [14,15,27,34] or where the depth of the transmit focus point was not reported when assessing the supraspinatus tendon [40]. It is unknown how altering machine settings may influence SFA parameters. Other investigations in quantitative US analyses have highlighted the necessity of calibrated imaging systems for comparison across studies [37,57]. For this reason, US settings were kept constant between all subjects to provide standardized imaging. This may not be feasible in clinical investigations using these methods nor provide the best images between subjects of different body compositions. Future studies should investigate the influence of different machine settings on SFA parameters. Additionally, only young, active individuals were recruited for this study. It is unknown if these measures of reliability also apply to other populations, such as those with pathology or elderly individuals [38].

Conclusions
Spatial frequency analysis is a quantitative US method which has been previously used to assess intra-tendinous structure. This study has adapted the SFA method and assessed its reliability in the assessment of muscle tissue structure. The intra- and inter-rater reliability of SFA was excellent, with good to moderate test-retest agreement. This study indicates that SFA is a reliable method in assessing muscle tissue structure, as characterized by spatial frequency parameters. This method may prove to be useful in future studies in assessing differences in muscle structure following injury, pathology, or throughout rehabilitation.