Shoe feature recommendations for different running levels: A Delphi study

Providing runners with footwear that matches their functional needs has the potential to improve footwear comfort, enhance running performance and reduce the risk of overuse injuries. It is currently not known how footwear experts make decisions about different shoe features and their properties for runners of different levels. We performed a Delphi study in order to understand: 1) definitions of different runner levels, 2) which footwear features are considered important and 3) how these features should be prescribed for runners of different levels. Experienced academics, journalists, coaches, bloggers and physicians who examine the effects of footwear on running were recruited to participate in three rounds of a Delphi study. Three runner level definitions were refined throughout this study based on expert feedback. Experts were also provided a list of 20 different footwear features. They were asked which features were important and what the properties of those features should be. Twenty-four experts, most with 10+ years of experience, completed all three rounds of this study. These experts came to a consensus for the characteristics of three different running levels. They indicated that 12 of the 20 footwear features initially proposed were important for footwear design. Of these 12 features, experts came to a consensus on how to apply five footwear feature properties for all three different running levels. These features were: upper breathability, forefoot bending stiffness, heel-to-toe drop, torsional bending stiffness and crash pad. Interestingly, the experts were not able to come to a consensus on one of the most researched footwear features, rearfoot midsole hardness. These recommendations can provide a starting point for further biomechanical studies, especially for features that are considered important but have not yet been examined experimentally.

Need key words at the end of the abstract.
>>Thank you for the reminder. We have included the following key words: Individualized footwear, running biomechanics, runner abilities, footwear experts, midsole hardness
Introduction:
Line 54: Delete the fully written parenthetical citation; it should just be a reference number.
>>This citation has been replaced with the appropriate number.
Line 67: Same here, please delete the written citation; it should just be a reference number.
>>This citation has been replaced with the appropriate number.
Line 69-70: Reword this to not be a numbered list. Within the intro, it should just be written sentences.
>>We have removed the numbers from the sentence and updated the text to the following: "On the other hand, there has been little scientific attention on footwear features such as outsole traction or forefoot flares which could indicate: the prescription of these features to different runner levels is trivial, or that these features are not considered important by footwear professionals, or little is known on how to prescribe these features." (line 78-81)
Line 71: You state, "it is close to impossible for running footwear professionals to provide evidence-based recommendations for footwear properties for runners of different levels." But then you go on to say you are performing a Delphi to find the best recommendations from the experts. I think this is contradictory. I think you should focus more on how there is not clarity on professional recommendations for footwear for different running skills or groups.
>>We have moved the last comment from the introduction to here as we addressed these two comments together. The corresponding statements now read: "In summary, there is a need to better understand how footwear research experts make decisions about different footwear features and their properties" (Lines 81-83)
I think the third to last and second to last paragraphs can be amalgamated into one paragraph. Further, the second to last paragraph ends abruptly and a better conclusion is needed to set up the purpose paragraph.
>>We have combined the two paragraphs and updated the phrasing so that it is more focused on "how there is not clarity in professional recommendations": "Modern running shoes are complex systems. They incorporate many different features (e.g., crash-pads, heel counters, flares, midsole hardness) and each of these features can be included, excluded, and/or tuned individually to modify the characteristics of the final running shoe system (e.g., cushioning, stability, heel-to-toe transition, energy return

101)
You need to give inclusion/exclusion criteria for who was considered an expert for this study.
Novice versus Recreational runner definition: In the novice group, you state that they run no more than 20 km/week, but in the recreational group, they run 10-50 km/week? How do you delineate someone that runs 15-20 km/week? Is this based off of times per week (0-3 v 1-5)? Please clarify.
High Caliber runners: I see the same thing here, they run 30 km+/week. Please clarify.
>>Thank you for clarifying this. We have included the following description to clarify the overlapping mileage: "The proposed characteristics provide guidelines for runner classification. As such, there is overlap in the running distance per week between the different running levels in order to accommodate runners that train less and have a better running performance." (line 128-130)
Line 211: Can you explain further why the 'don't know' questions were not included in round 3?
>>We have expanded upon our explanation with the following:
General comment: It is not recommended to use bullet points within the results. Please edit accordingly.
>>We have eliminated the bullet points and updated the text to the following: "The respondents' rating of the running level definitions improved as the Delphi study progressed. The median score given to the running level definitions increased each round and the interquartile range decreased as 88% of respondents rated the running level definitions between 7 and 10 in the third-round as opposed to 69% in the first-round (see Fig. 3). The changes to the running level definitions for the second-round were: increased "novice" running experience to one year (from six months) and increased "recreational" running experience to greater than one year (from six months), increased "high caliber" running habits to >4 sessions/week (from >3 sessions/week) and >50 km/week (from >30 km/week), specified the running performance as males between the ages of 18 to 34, replaced "stress management" with enjoyment for running motivation for all levels, re-order the "high caliber" running motivation from 1) Improve general health, 2) Stress management, 3) Competition to: 1) Competition, 2) Improve general health, and 3) Enjoyment, and re-order the priorities for footwear design for "High caliber" from: 1) Improve performance, 2) Improve comfort, 3) Reduce injury risk, to: 1) Improve performance, 2) Reduce injury risk, 3) Improve comfort. Subsequent changes to the running level definitions were to ensure that the high caliber and recreational runner 5 km and 10 km times were indicative of the respective marathon times. These updates resulted in the final updated runner level definitions in Table 3."
Line 315-17: This is a future research, implications, and/or conclusion sentence and should be moved.
>>While we understand that this concluding sentence pertains to future research, in our opinion it is a major point of discussion. We have kept this sentence as it wraps together the discussion summary paragraph.
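The rating summary quoted above (median score, interquartile range, and the share of respondents rating between 7 and 10) can be illustrated with a minimal Python sketch. The function name `summarize_ratings`, the choice of the inclusive quartile method, and the example ratings are assumptions made for illustration only; they are not the study data or the analysis actually used.

```python
from statistics import median, quantiles


def summarize_ratings(ratings, low=7, high=10):
    """Return the median, interquartile range, and share of ratings in [low, high]."""
    # Quartiles via the "inclusive" method (an assumption; other methods differ slightly).
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    in_band = sum(low <= r <= high for r in ratings)
    return {
        "median": median(ratings),
        "iqr": q3 - q1,
        "share_in_band": in_band / len(ratings),
    }


# Hypothetical ratings on a 10-point scale, NOT the study data.
round_one = [4, 5, 6, 6, 7, 7, 8, 8, 9, 10]
round_three = [6, 7, 7, 8, 8, 8, 9, 9, 9, 10]

for name, ratings in (("round 1", round_one), ("round 3", round_three)):
    print(name, summarize_ratings(ratings))
```

With these made-up ratings, the median rises and the interquartile range narrows from the first to the third round, which is the pattern the quoted paragraph describes.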
Line 369: Suggest deleting the term ground truth and just state that this should serve as valuable information, etc.
>>We have eliminated "ground truth" and updated the sentence to the following:

Reviewer #2: Overall this manuscript fills an obvious void in the literature and aims to assist researchers, clinicians, coaches, and running enthusiasts with shoe prescriptions, while also informing future running shoe research. This work is generally well written and free from fundamental flaws; however, several minor revisions to the proposed article will undoubtedly improve this already great work.
>>Thank you for your kind comments. Your suggestions have improved the manuscript.
1. The words "the participants" are overutilized throughout the manuscript. Varied diction will help to maintain reader interest and attention.
>>We have updated the manuscript so that there is more varied diction.
2. As this is a study employing Delphi techniques no statistical analyses are necessary and furthermore, no analyses were actually conducted. The "Statistical Analysis" section is therefore unnecessary and the subsequent descriptive statistics can simply be presented in the "Results" as well as
>>We have checked the entire manuscript and ensured that "property" and "feature" were used correctly.
4. "Appendix A" utilizes the term "categories" as opposed to "properties" further illustrating the previous point.
>>We have updated "categories" to "Property categories" throughout the Appendix and updated the file name of Appendix A to: "S1 Appendix A -Shoe Feature Descriptions and Properties"
5. Additional headings for the "Footwear Properties" in the "Methods" and "Results" sections would assist readers navigating between parts of the manuscript.
>>We have clarified the use of the Likert scale by including the following in our methods: "The list of important features was verified if over 75% of the second-round participants answered with a seven or higher on the 10-point scale." (line 196-197)
10. Fig 2 is very helpful, but a threshold of >50% is provided when the text describes using a 51% threshold.
>>We have updated Fig. 2 and replaced ">50%" with "≥51%".
11. While minor, the software used to produce images was not stated.
(6) and (13). The initial list of 31 was reduced to 23 features by removing or joining related features that were reflected in other features or similar in their function, respectively (e.g., remove midfoot midsole hardness and only retain forefoot and rearfoot midsole hardness). Pilot testing with four footwear science experts (not included in the main study) indicated that 23 features resulted in a questionnaire that would require more than an hour to complete and could potentially lead to a high drop-out rate. Therefore, we limited the number of footwear features to 20, by removing features for which pilot participants indicated low relevance (e.g. upper overlays or varus alignment). In return, the option was added for experts of the main study to suggest footwear features that should be added to the questionnaire." (line 169-180)
13. The inclusion of 2 aims and 3 purposes is somewhat confusing. I recommend removing the aims from your "Introduction" as they do not match the "Methods" and "Results" sections as obviously.
>>We have removed the two aims from the introduction.
14. Please ensure that permissions for any adapted images (i.e.
15. A limitation that seems somewhat overlooked is that the definitions of runner levels changed throughout iterations. As these definitions changed, so too may have respondents' recommended properties. While the 3 repetitions and consensus measures may help to quell these concerns, it seems important to consider the implications of these interconnected moving targets.
>>We have added the following to the limitations as you suggested: "The recommended footwear feature properties may have been influenced by a dynamic definition of the runner levels, which changed slightly throughout the study. These changing definitions, however, seemed to have little effect on expert opinions on the footwear feature properties as the verifying consensus level was generally higher than the original consensus level (Table 4,." (line 403-408)
16. If possible, I would like to know more about your "Additional Delphi Questions" results in the discussion. I read some of the statements in your raw data set and found the additional insights very compelling. You do a good job of introducing some of the identified themes in your "Discussion" but I feel that a bit more would elevate the current manuscript.
>>We have integrated some expert feedback in the running level definitions discussion paragraph (line 434-451). Please see our response below #20 and #21.
17. Tables 5 and 6 both seem to provide complementary results. Is there a way to combine them or make them more exclusive from one another?
>>We have eliminated Table 6 and added a column for "% Participant in agreement with consensus" to Table 5.

Reviewer #3: General
The paper is well written and the study uses appropriate methodology for reaching consensus regarding standards for classifying runners as well as for recommendations for running footwear.
>>Thank you for your compliments and suggestions.
One major concern that I have is that while the data was collected anonymously, the country and region of the country is provided in the raw data. This information, along with the acknowledgment to specific participants, makes it quite easy to identify the responses of many of the participants in the raw data. The country and region data collected in the survey needs to be deleted to de-identify the data and preserve anonymity of the participants' responses.
>>We have de-identified the raw data by removing the country and region for each participant.
Another concern I have is the use of a manuscript in review as a major reference for this study. The Hoitz et al. manuscript that is listed as in review is not available to the reviewers of the current manuscript. As such it is difficult to discern how the current manuscript contributes to the literature. Moreover, depending on when or if the Hoitz et al. manuscript is accepted, it may not be available to the readers of the current manuscript. It would be acceptable to reference a manuscript that has been accepted and is in press.
>>We have removed the citation in question (as the mentioned manuscript is still in review) and replaced it with the following: Sun, X, Lam, WK, Zhang X., Wang J, & Fu W (2020). Systematic Review of the Role of Footwear Constructions in Running Biomechanics: Implications for Running-Related Injury and Performance. Journal of Sports Science and Medicine, 19, 20-37
Minor
Line 111: the phrase "reached out to" is awkward, perhaps "contacted" or similar
>>We have updated the phrasing as recommended. (line 112)
Table 3 or discussion of runner classification. While consensus was reached on runner classification, was consensus reached on how to classify runners who may meet standards across categories (e.g. run at novice speed but with the habit or experience of recreational runners). For example, for a runner to be in a category do they have to meet 4 of the 5 categories or … ?
>>While we did not specify how many criteria had to be fulfilled in order to decide the runner's category at the beginning of the survey, we acknowledge your points and added the following to the limitation section: "A limitation of the consensus process for the running level definitions was that we did not specify to the experts how many of the categories a runner must match to be considered a "novice", "recreational", or "high caliber" runner. As such, the definitions may lead to minor variations when different footwear experts categorize runners." (lines 408-411)
Table 6. I re-read the methods paragraph describing the manner of reaching consensus multiple times, lines 181-194. I also read the results paragraph regarding shoe properties, lines 283 to 293, multiple times. However, it is not clear to me which specific variables qualified to be presented in table 6.
>>We have eliminated Table 6 and added the "% Participants in agreement with consensus" column from Table 6 to Table 5.
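To make the two thresholds quoted in this response letter concrete, the following Python sketch shows one way they could be computed. It assumes that the "over 75%" rule applies to the share of ratings of seven or higher on the 10-point scale, and that the "≥51%" threshold applies to the share of experts choosing the same property for a feature. The function names and the property labels ("soft", "medium", "hard") are hypothetical, and the responses are made up; this is an illustration, not the procedure or data of the study.

```python
from collections import Counter

IMPORTANCE_SHARE = 0.75   # "over 75%" of ratings must be seven or higher
CONSENSUS_SHARE = 0.51    # ">=51%" must choose the same property (assumed reading)


def is_important(ratings, cutoff=7, share=IMPORTANCE_SHARE):
    """True if strictly more than `share` of the ratings are >= cutoff."""
    high = sum(r >= cutoff for r in ratings)
    return high / len(ratings) > share


def consensus(responses, share=CONSENSUS_SHARE):
    """Return (most chosen property, share choosing it, consensus reached?)."""
    top_property, top_count = Counter(responses).most_common(1)[0]
    agreement = top_count / len(responses)
    return top_property, agreement, agreement >= share


# Hypothetical answers from 24 experts, NOT the study data.
ratings = [7] * 19 + [5] * 5
responses = ["soft"] * 13 + ["medium"] * 8 + ["hard"] * 3

print(is_important(ratings))   # 19 of 24 ratings are seven or higher
print(consensus(responses))    # 13 of 24 experts chose "soft"
```

The agreement share returned by `consensus` corresponds to the kind of figure reported in a "% Participants in agreement with consensus" column.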