Machine learning to extract muscle fascicle length changes from dynamic ultrasound images in real-time

Background and objective Dynamic muscle fascicle length measurements through B-mode ultrasound have become popular for the non-invasive physiological insights they provide regarding musculoskeletal structure-function. However, current practices typically require time-consuming post-processing to track muscle length changes from B-mode images. A real-time measurement tool would not only save processing time but would also help pave the way toward closed-loop applications based on feedback signals driven by in vivo muscle length change patterns. In this paper, we benchmark an approach that combines traditional machine learning (ML) models with B-mode ultrasound recordings to obtain muscle fascicle length changes in real time. To gauge the utility of this framework for ‘in-the-loop’ applications, we evaluate the accuracy of the extracted muscle length change signals against time series derived from a standard, post-hoc automated tracking algorithm. Methods We collected B-mode ultrasound data from the soleus muscle of six participants performing five defined ankle motion tasks: (a) seated, constrained ankle plantarflexion; (b) seated, free ankle dorsi/plantarflexion; (c) weight-bearing calf raises; (d) walking; and (e) a mix of these tasks. We trained ML models by pairing muscle fascicle lengths obtained from standardized automated tracking software (UltraTrack) with the respective B-mode ultrasound image input to the tracker, frame by frame. We then conducted hyperparameter optimizations for five different ML models using a grid search to find the best-performing parameters for a combination of high correlation and low RMSE between ML- and UltraTrack-processed muscle fascicle length trajectories.
Finally, using the globally best model/hyperparameter settings, we comprehensively evaluated training-testing outcomes within subject (i.e., train and test on the same subject), cross subject (i.e., train on one subject, test on another), and within/direct cross task (i.e., train and test on the same subject, but on different tasks). Results The support vector machine (SVM) was the best-performing model, with an average r = 0.70 ± 0.34 and average RMSE = 2.86 ± 2.55 mm across all direct training conditions, and an average r = 0.65 ± 0.35 and average RMSE = 3.28 ± 2.64 mm when optimized for all cross-participant conditions. Comparisons between ML- and UltraTrack-tracked (i.e., ground truth) muscle fascicle length versus time data indicated that ML tracking reliably captures the salient qualitative features in ground truth length change data, even when correlation values are on the lower end. Furthermore, in the direct training calf raises condition, which is most comparable to previous studies validating automated tracking performance during isolated contractions on a dynamometer, our ML approach yielded an average correlation of 0.90, in line with other accepted tracking methods in the field. Conclusions By combining B-mode ultrasound and classical ML models, we demonstrate that it is possible to achieve real-time tracking of human soleus muscle fascicles across a number of functionally relevant contractile conditions. This novel sensing modality paves the way for muscle-physiology-in-the-loop applications that could be used to modify gait via biofeedback or unlock novel wearable device control techniques enabling restored or augmented locomotion performance.


Crop:
Our crop directly limited the image matrix size with simple loops to focus on the soleus. For this, we manually measured the general area in which the soleus could be seen to obtain the cropping coordinates, and kept the resulting matrix sizes constant for all images in each set (1400 images each). Because the soleus moves throughout the task while the crop remains fixed, in some tasks the crop may have captured adjacent features in addition to the soleus. Although this simple approach yielded workable results, we believe more robust cropping and feature extraction techniques could improve them.
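As a minimal sketch, the fixed crop amounts to simple array indexing; the coordinates and frame size below are hypothetical placeholders, since the actual values were measured manually for each image set:

```python
import numpy as np

# Hypothetical crop coordinates (pixels); the real values were measured
# manually from the region where the soleus was visible.
ROW_START, ROW_END = 120, 360
COL_START, COL_END = 80, 560

def crop_soleus(frame: np.ndarray) -> np.ndarray:
    """Return the fixed soleus region; held constant for every frame in a set."""
    return frame[ROW_START:ROW_END, COL_START:COL_END]

frame = np.zeros((480, 640), dtype=np.uint8)  # placeholder B-mode frame
cropped = crop_soleus(frame)
print(cropped.shape)  # (240, 480)
```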

Down-sampling:
Using the "block_reduce" function from scikit-image, we first sparsely tested different value combinations for the "block_size" parameter while maintaining default machine learning (ML) model parameters. Our inputs to "block_size" were of the form (#, #). After preliminary testing, the "heaviest" down-sampling value combinations seemed the most promising, so we chose four down-sampling rates accordingly, where Rate 1 is the "heaviest" and Rate 4 the "lightest". Heavier down-sampling yields a smaller resulting matrix, while lighter down-sampling retains more of the original image information. We ran our hyperparameter sweep across all direct and cross-subject tasks, training combinations, and ML models for all four down-sampling rates. We averaged all these results and chose the "best" down-sampling rate based on the same metrics described in the main text: scaling RMSE to the range 0 to 1, then summing its inverse with the correlation to obtain the best performer (Supp Table 1).
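For illustration, one down-sampling step with "block_reduce" might look as follows; the (#, #) block sizes shown are assumptions for the sketch, not the exact rates we swept:

```python
import numpy as np
from skimage.measure import block_reduce

# Illustrative block sizes only; Rate 1 = "heaviest", Rate 4 = "lightest".
RATES = {1: (8, 8), 2: (6, 6), 3: (4, 4), 4: (2, 2)}

cropped = np.random.rand(240, 480)  # placeholder cropped frame
reduced = block_reduce(cropped, block_size=RATES[1], func=np.mean)
print(reduced.shape)  # (30, 60): each 8x8 block replaced by its mean
```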

Format:
To facilitate matrix management, we used the "flatten" function to reduce the dimensionality of these matrices while preserving all of their information. We also changed data types as needed throughout the implementation code.
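A minimal sketch of this step, assuming NumPy's "flatten" and the down-sampled matrix size from the earlier examples:

```python
import numpy as np

reduced = np.random.rand(30, 60).astype(np.float32)  # down-sampled frame
features = reduced.flatten()  # 1-D feature vector for the ML model
print(features.shape)  # (1800,)

# No information is lost: the original matrix can be recovered exactly.
assert np.array_equal(features.reshape(30, 60), reduced)
```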

Real-Time:
Even though the paper focuses on our pseudo-real-time implementation for testing and quantifying machine learning (ML) model capabilities, we have also implemented our architecture in real time. As evidence, we have included a short video of its preliminary performance. We have also included videos showing what the current standards look like (hand-tracking, UltraTrack) and a summary of our benchmarks for correlation, RMSE, and processing time (Supp Table 3).
Notes:
-Hand-tracking (S1 Video): We, as many other labs, hand-track within the UltraTrack architecture. Although there are other ways of hand-tracking muscle fascicles, they all revolve around going frame-by-frame to ensure accuracy. Still, it is important to note that due to the common artifacts observed in B-mode ultrasound images, it is hard for any individual to be consistently accurate, and even harder for measurements to be consistent between experimenters [Kwah, et al. 2013; Van Hooren, et al. 2020].
-UltraTrack (S2 Video): The most basic form of UltraTrack tracking can be performed in just a few minutes (depending on trial length), yet these basic results are often filled with heavy drift. The key-frame correction feature greatly helps mitigate this, yet its performance depends on the user being able to identify "key-frames" in which the fascicle returns to the same length throughout the trial. This becomes especially hard during dynamic tasks where there is no clear base length to return to. Hence, results may vary greatly depending on experimenter skill, experience, and willingness to spend time finding these "key-frames".
-Real-time (S3 Video): It took 20-30 minutes on average to go from attaching the probe to the participant to having the ML model ready to measure in real time, including processing the training data via UltraTrack. Our video shows both direct training on free ankle dorsi/plantarflexion, and cross-task training by tracking walking using the same free-ankle training data (note that free ankle was the worst performing condition for cross task, Table 2c). We used the optimized SVM model. Our current setup runs on a Gigabyte Aero 15 XA laptop with an Intel i7-9750H CPU, RTX 2070 GPU, and 144 Hz 1080p display. Because our real-time implementation relies on taking screenshots of the live B-mode ultrasound feed, it is hard to measure the latency accumulated between the ultrasound probe, the transducer, the laptop processing, and the laptop screen. Nonetheless, as can be seen in the video, this accumulated latency remains very small (~0.02 seconds by our estimates). The observed output frequency is around 50 Hz, and there is no filtering or smoothing of the output curve; each output value is an isolated estimate of the image currently on screen. Note that if the different system components (ultrasound probe, computer, etc.) were optimized for this application, performance could improve. Hence, we hope this work encourages future developments.
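The per-frame chain underlying this real-time loop (crop, down-sample, flatten, predict) can be sketched as below. The crop coordinates, block size, and feature dimension are hypothetical, and the SVR is fit on random data only to make the sketch self-contained; in practice the model is fit on UltraTrack-derived fascicle lengths paired with processed training frames:

```python
import time
import numpy as np
from skimage.measure import block_reduce
from sklearn.svm import SVR

# Stand-in SVM regressor fit on random data (illustrative only).
rng = np.random.default_rng(0)
X_train = rng.random((100, 1800))    # 100 flattened training frames
y_train = rng.random(100) * 10 + 35  # synthetic fascicle lengths (mm)
model = SVR().fit(X_train, y_train)

def process_frame(frame: np.ndarray) -> float:
    """Crop, down-sample, flatten, and predict for one live frame."""
    cropped = frame[120:360, 80:560]  # hypothetical fixed crop
    reduced = block_reduce(cropped, block_size=(8, 8), func=np.mean)
    return float(model.predict(reduced.flatten()[None, :])[0])

frame = rng.random((480, 640))  # placeholder screenshot of the live feed
t0 = time.perf_counter()
length_mm = process_frame(frame)
dt = time.perf_counter() - t0
print(f"estimate: {length_mm:.1f} mm in {dt * 1e3:.1f} ms")
```

Each call is an independent estimate of the frame currently on screen, matching the unfiltered, per-image output described above.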