Development of the Physical Literacy Environmental Assessment (PLEA) tool

Physical literacy is becoming increasingly popular in sport, recreation, physical education and physical activity settings and programming. We developed an environmental assessment tool to evaluate the extent to which child and youth activity programs implement physical literacy across four domains: environment, programming, leaders and staff, and values and goals. The Physical Literacy Environmental Assessment (PLEA) tool was developed in three phases. First, the PLEA tool was created, content validity was established, and physical literacy leaders were consulted. In the second phase, the PLEA tool was completed and tested by 83 child and youth programs, and it was validated with individual physical literacy assessments of children in programs that scored in the top 10% and bottom 10% on the PLEA tool. Third, a national consultation was conducted, and program leaders provided feedback on the PLEA tool. In Phase 1, the PLEA tool was modified and shortened from 41 to 29 indicators based on feedback from physical literacy content leaders. In Phase 2, participants in programs that scored in the top 10% had significantly higher scores on the upper body object control domain of PLAYfun (p = 0.018) and significantly higher PLAYself scores (p = 0.04) than participants in programs that scored in the bottom 10%. In Phase 3, over 80% of program leaders indicated that the PLEA tool was useful and relevant to their areas of practice. The completed PLEA tool is a 20-item environmental assessment tool that evaluates the degree to which child and youth programs implement physical literacy across four domains: environment, programming, leaders and staff, and values and goals. The application and validity of the PLEA tool beyond the child and youth physical education, sport, dance and recreation sectors, such as in early years programs, should be investigated.


Introduction
Physical education, sports clubs, and recreation programming provide opportunities for children to engage in physical activity. According to Canadian parents and children, over 75% of 5- to 19-year-olds participate in organized physical activity or sports [1,2]. Almost 100% of Canadian elementary students take part in curricular physical education. In high school, 72% of students in Canada take part in physical education [2], as it becomes optional. Ethics approval was obtained from the Integrated Research Ethics Board and local school boards, as necessary. Other participating organizations did not require additional ethics approval.

Phase 1: Content validation
The objective of Phase 1 was to establish content validity [14]. The Partnership for Active Hamilton local partners were asked to identify any existing tools related to physical literacy that could be used as a framework for the development of the PLEA tool. Partners included representation from recreation organizations, sport clubs, school boards, researchers and public health. Physical literacy leaders, including researchers and professionals as identified by the Partnership for Active Hamilton, were consulted to establish key concepts and indicators to be included. Partners were contacted by email and given this prompt to generate indicators: "We would like your input as to what elements, information, or considerations of physical literacy do you feel are important and essential to sport-or physical activity-related programming for children and youth. In other words, what would a program have to include (that could be measured) to be considered physical literacy 'friendly'?" The first version of the PLEA tool was created and included the identified elements and information suggested.
Next, physical literacy content leaders were consulted regarding the content relevance of each indicator. Fifteen leaders, including physical literacy researchers and professionals, were contacted and asked to rate each indicator as extremely relevant, somewhat relevant, somewhat irrelevant, or totally irrelevant. They were also prompted for additional feedback: "you may also suggest we combine items or some other arrangement to result in the most logical checklist". Lastly, the PLEA tool was distributed to 5 local program leaders, identified by convenience sampling with the assistance of the Partnership for Active Hamilton. The leaders were asked to complete the PLEA Tool about their child and youth physical activity programs and to provide general comments on the PLEA tool.

Phase 2: Construct validation
The objective of Phase 2 was to establish construct validity, which refers to whether a test measures what it is intended to measure. One type of construct validity, convergent validity, assesses the degree to which a measure is associated with another measure of the same construct (e.g., physical literacy) [15]. The PLEA tool was further tested and validated in children's physical activity, physical education, sport and recreation programs in Hamilton, ON.
Program recruitment. Local school boards, recreation departments, community organizations, sports clubs and camps were contacted by email or phone and invited to participate in the project. Once program leaders completed and returned a consent form, they were sent the PLEA tool to complete. Eligibility criteria included: 1) program operated within Hamilton; 2) program was physical activity, sport, physical education or movement based; 3) program was delivered to participants 7-to-18 years of age.
PLEA tool administration. Program leaders completed the PLEA tool and returned it to our research team by email, fax or it was collected by a member of our research team. Program leaders indicated if their programs met or did not meet each of the 29 indicators included in the PLEA Tool. In addition, program leaders were asked: "If checked, how can your program improve further in this indicator?" or "If not checked, what can your program do to meet this indicator? Comment on any barriers to meeting this indicator". A PLEA tool score was generated for each program as the sum of indicators selected (maximum score = 29).
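The scoring rule described above (one point for each indicator a program meets, out of a maximum of 29 in this version of the tool) can be sketched as follows; the indicator responses shown are hypothetical:

```python
def plea_score(indicators_met):
    """PLEA tool score: the number of indicators a program reports
    meeting (maximum 29 in the Phase 2 version of the tool)."""
    return sum(1 for met in indicators_met if met)


# Example: a hypothetical program meeting 2 of 4 indicators scores 2.
print(plea_score([True, False, True, False]))  # prints 2
```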
PLEA tool construct validation. Programs that scored in the highest 10% and lowest 10% on the PLEA tool were invited to participate in the construct validation phase. This phase included individual assessments of participants' physical literacy. Research assistants visited each of these programs to distribute parent consent forms and a parent questionnaire. Research assistants returned to the programs 1-2 weeks later to conduct individual physical literacy assessments. Participants with parental consent were asked to complete an assent form.
Physical literacy assessments. Physical literacy was assessed with the Physical Literacy Assessment for Youth (PLAY) tools. The PLAY tools represent a series of workbooks designed to assess the multiple domains of physical literacy. The PLAY tools were designed for children 7 years and older. In combination, the PLAYfun, PLAYself and PLAYparent tools provide a multi-perspective assessment of a participant's physical literacy [7].
PLAYfun has very good inter-rater reliability (0.87) and construct validity [16], examined by studying variations in scores due to age and sex. PLAYfun and an obstacle course of motor competence demonstrated moderate-to-large correlations, showing convergent validity [17]. Criterion validity has not been established for the PLAY tools because a gold standard for the measure of physical literacy has not been identified in the literature. PLAYfun is an assessment of 18 movement skills within five domains: running, locomotor, object control (upper body), object control (lower body), and balance, stability and body control; it was administered with the same methods as Cairney et al. [16]. All PLAYfun assessments were administered and scored by the same investigator (HATC).
The participants then completed the PLAYself questionnaire, a 22-item self-evaluation of a child's perception of their own physical literacy [7]. The PLAYself questionnaire includes four subsections: environment, physical literacy self-description, relative rankings of literacies (literacy, numeracy, physical literacy) and fitness. The PLAYself score was calculated by adding up the totals of the subsections and dividing by 27, as outlined in the PLAYself workbook [7]. Parents completed the PLAYparent questionnaire to assess the parent's perception of their child's level of physical literacy, including questions about the child's ability, confidence, and participation. PLAYparent provided researchers with an additional perspective from outside the child's current activity program and identified positive and negative factors that affected the child's ability to lead a healthy lifestyle. The PLAYparent is divided into five subsections: physical literacy visual analogue scale, cognitive domain, environment, motor competence (locomotor and object control) and fitness [7]. The PLAYparent questionnaire was scored by summing the parents' responses and multiplying by 2.63 to give a total out of 150, as outlined in the PLAYparent workbook [7]. To date, no psychometric properties of the PLAYself or PLAYparent have been reported.
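The two scoring rules above can be sketched as code. Only the arithmetic (PLAYself: subsection totals summed and divided by 27; PLAYparent: responses summed and multiplied by 2.63 for a total out of 150) comes from the workbook descriptions in the text; the example inputs are hypothetical:

```python
def playself_score(subsection_totals):
    """PLAYself: sum of the four subsection totals, divided by 27,
    as described in the PLAYself workbook."""
    return sum(subsection_totals) / 27


def playparent_score(responses):
    """PLAYparent: sum of the parent's responses multiplied by 2.63,
    yielding a total out of 150, as described in the PLAYparent workbook."""
    return sum(responses) * 2.63
```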

Statistical analyses
Statistical analyses were conducted using STATA Version 14SE for Mac (Statacorp, College Station, Texas). Significance was set at p<0.05 for all analyses. Descriptive summary statistics were computed, and continuous variables are expressed as means ± standard deviation. Normality of outcome variables (PLAYfun, PLAYself, and PLAYparent) was assessed with the Shapiro-Wilk Test and verified visually using histograms. Normality of the residuals of the models was assessed with the Shapiro-Wilk Test for Normality, the Skewness/Kurtosis Test for Normality, and visual inspection of P-P plots, Q-Q plots and histograms.
Mixed multi-level modelling was used to determine differences in PLAYfun, PLAYself and PLAYparent between programs that scored high versus low on the PLEA tool. Multi-level modeling was used to account for clustering of participants within the same programs. Models were adjusted for participants' age and sex. High or low scores on the PLEA tool were treated as the fixed effects and program was treated as the random effect. An independent variance-covariance structure was used.
Effect sizes were also calculated using Cohen's d for the PLAYfun, PLAYself and PLAYparent scores. Cohen's d was calculated as the average of the programs that scored in the top 10% on the PLEA tool minus the average of the programs that scored in the bottom 10% on the PLEA tool, divided by the standard deviation of the outcome variable for the sample. Cohen suggests d = 0.2 is a small effect size, d = 0.5 is a medium effect size and d = 0.8 is a large effect size [18].
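The Cohen's d calculation described above can be sketched as follows. This is a minimal illustration of the stated formula only; the text does not specify whether the sample or population standard deviation was used, so the sample standard deviation is assumed here:

```python
import statistics


def cohens_d(top_scores, bottom_scores):
    """Cohen's d as described above: mean of the top-10% group minus
    mean of the bottom-10% group, divided by the standard deviation
    of the outcome for the whole sample (sample SD assumed)."""
    whole_sample = list(top_scores) + list(bottom_scores)
    return (statistics.mean(top_scores) - statistics.mean(bottom_scores)) / statistics.stdev(whole_sample)
```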

Phase 3: National consultation phase
Administration of the PLEA tool. Phase 3 of PLEA tool development was an online national consultation, administered with Research Electronic Data Capture (REDCap). Participants read and completed an online consent form before viewing and completing the PLEA tool, and then a feedback survey about the PLEA Tool.
Participants completed the updated 27-item PLEA tool by selecting if they met or did not meet each indicator. If they did not meet the indicator, the following follow-up questions were displayed: 1) "What can your program do to meet this indicator?"; and 2) "What barriers prevent your program from meeting this indicator?" If they met the indicator, the following follow-up question was displayed: "How can your program improve further in this area?" Participants then completed a short questionnaire about the usefulness and relevance of the PLEA tool (see Table 4 for questions).
Participants and recruitment. The PLEA tool was distributed by email through relevant organizations (e.g., Sport for Life, national and provincial sport organizations, provincial physical education organizations, etc.). The goal was to reach coaches, teachers, public health professionals and physical activity leaders. Sport for Life assisted researchers with recruitment by contacting certain organizations. The researchers or Sport for Life personnel provided text and a survey link to organizations to include in their newsletter or email to members. The organizations then communicated their number of email list recipients or social media reach to researchers. The PLEA tool and questionnaire were communicated by email to approximately 45,055 recipients and shared on social media (Facebook and Twitter) with an estimated reach of 12,118 followers.
Thematic analysis. In Phase 3, responses to the follow-up questions (listed above) were coded and categorized by two independent reviewers (HC and SL). HC first reviewed all responses and developed categories for each question (e.g., program planning, leader training). SL was provided with the categories and asked to code all responses into one of the provided categories. The categories from both reviewers were then compared. When both reviewers agreed on the same category or categories for a response, no further discussion was needed. HC and SL discussed any conflicting categories and came to a consensus on the most appropriate category for each response. These responses were used to update the wording of relevant indicators.

Removal of indicators.
To reduce redundancy in the PLEA tool, we a priori decided that if ≥90% of programs met an indicator, it was unlikely that the particular indicator was differentiating between programs. These indicators were removed from the PLEA tool.
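The a priori pruning rule above can be sketched as a simple filter: any indicator met by at least 90% of programs is dropped. The indicator names and response data below are hypothetical:

```python
def prune_indicators(responses, threshold=0.90):
    """Return the indicators retained after removing any indicator met
    by at least `threshold` (default 90%) of programs, per the a priori
    rule described above. `responses` maps indicator name -> list of
    per-program yes/no answers."""
    return [name for name, met in responses.items()
            if sum(met) / len(met) < threshold]


# Example: an indicator met by 9 of 10 programs (90%) is removed,
# while one met by 1 of 4 programs (25%) is retained.
responses = {
    "equipment available": [True] * 9 + [False],
    "outdoor access": [True, False, False, False],
}
print(prune_indicators(responses))  # prints ['outdoor access']
```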

Phase 1
In Phase 1, the Partnership for Active Hamilton partners (listed above) and the authors of this research identified 12 related, existing tools, questionnaires and surveys (See S1 Appendix). The indicators proposed by physical literacy leaders were grouped into four themes to create the first version of the PLEA tool. The first version included 41 indicators. The environment theme had 13 indicators, the programming theme had 12 indicators, and the leaders and staff and values and goals themes each had 8 indicators. Three of the 15 (20%) invited content leaders responded.
The PLEA tool was modified based on the physical literacy content leaders' responses. In the environment domain, 2 indicators were removed as content leaders indicated they were both "somewhat irrelevant" to the questionnaire. The indicator "equipment is available for unstructured and structured play" was modified to "space, facility and equipment are available for unstructured and structured play". In the programming domain, 3 indicators were removed, 1 was added and 1 was modified. "Encourages participation in a wide variety of physical activities" was added based on content leaders' comments. In this domain, 2 indicators were removed as leaders indicated they were both "somewhat irrelevant" to the questionnaire. In the leaders and staff domain, 3 indicators were removed, 1 was added and 3 were modified. The indicator "program leaders are trained in effective teaching strategies" was replaced with "program leaders develop and execute plans for effective teaching strategies". Three indicators were modified to be clearer to those using the PLEA tool. In the values and goals domain, 2 indicators were removed and 1 was modified. The indicator "emphasis is on learning and improvement" was modified to "emphasis is on learning and improvement and personal achievement". Two indicators were removed in this domain as content leaders indicated they were both "somewhat irrelevant" to the questionnaire.
The five local program leaders that completed the updated PLEA tool did not suggest any further changes to the PLEA tool and this version was used in Phase 2.

Phase 2
In Phase 2, 135 organizations and programs in Hamilton were contacted to participate; 89 (66%) programs submitted consent forms and 83 (61%) programs completed the PLEA tool. Thirteen of these organizations completed the PLEA tool for multiple individual programs (e.g., different sites for an organization's after-school programs). Programs that participated in Phase 2 included: 12 after-school programs, 2 dance programs, 3 fitness programs, 10 physical education curriculum delivery, 22 recreation programs, 6 school sport programs, and 28 sport programs. In total, 32 different sports and physical activities were represented in Phase 2. Summary scores for the PLEA Tool and each domain are included in Table 1. Fig 1 reports the frequency of PLEA scores.
Parents of 144 participants provided consent for participants to complete the construct validation phase. One of the high scoring programs was scheduled for assessments but due to changes in their facility availability, assessments with participants were not completed. As such, 8 high scoring programs and 7 low scoring programs participated in the validation phase. Scores are presented as mean ± SD (minimum to maximum score). All high scoring programs achieved a maximum score on the PLEA Tool (29 out of 29). High scoring programs represent programs that scored in the top 10% and low scoring programs represent programs that scored in the bottom 10% on the PLEA Tool.

Seventy-two (70% female) participants were in the high scoring programs and 72 (60% female) in the low scoring programs. The higher percentage of females is attributed to 2 female-only programs and higher female enrollment in the remaining coeducational programs. Eight of the 144 participants who provided consent (5.5%) were absent for data collection, and 136 participated in the PLAYfun assessment. Two participants did not complete all items on the PLAYfun assessment because they chose to terminate the assessment early or skip an item, so 134 completed assessments were included in the analysis. One hundred and twenty-eight participants completed the PLAYself questionnaire and 97 parents completed the PLAYparent questionnaire. The ages of participants were 9.3 ± 1.5 years in programs that scored in the top 10% and 10.0 ± 1.6 years in programs that scored in the bottom 10% on the PLEA tool. The dependent variables were normally distributed: PLAYfun (p = 0.998), PLAYself (p = 0.128) and PLAYparent (p = 0.086). Based on inspection of a histogram, P-P plot and Q-Q plot, model residuals were normally distributed. Table 2 displays the outcomes for participants in programs that scored in the top 10% versus those in programs that scored in the bottom 10% on the PLEA tool. The high scoring programs included 4 after-school programs, 1 physical education class, 2 recreation programs and 2 sport programs, while the low scoring programs included 1 physical education class, 3 recreation programs and 3 sport programs.
Children in the programs that scored in the top 10% on the PLEA tool demonstrated significantly higher PLAYfun object control (upper body) skills and PLAYself scores than those in the programs that scored in the bottom 10%.
Updates to PLEA tool following Phase 2. Based on feedback from participants in Phase 2, the PLEA tool was reviewed and updated. Following Phase 2, the question "How relevant is this indicator to your program?" was removed. One indicator, "staff and leaders help motivate and encourage participants to try their best and achieve their goals", was removed because 100% of participating programs responded "yes" to it, indicating that it was not differentiating programs from one another. The language in several indicators was modified to ensure each indicator was as clear as possible. For example, "staff and leaders are trained in safety protocols" was modified to "staff and leaders are trained in safety protocols designed to minimize the risk of injury of participants" to capture the importance of injury prevention and safety management. In addition, "program leaders develop and execute plans for effective teaching strategies" was changed to "program leaders are given time to develop and execute plans for effective teaching strategies" to capture that programs provide time for leaders to develop plans, rather than relying on leaders who may use their own time to do so.
The first two indicators of the PLEA tool were combined into one indicator. The first indicator was "access to indoor opportunities for activity" and the second was "access to outdoor opportunities for activity". This information was modified into one indicator: "access to more than one environment for activity". After program leaders selected if they met or did not meet the indicator, they selected which environments were available for their program (ice/snow, land, water, air, indoors, outdoors). After all modifications at the end of Phase 2, the PLEA tool had 27 indicators.

Phase 3
Three hundred and thirty-three program leaders participated in Phase 3, including 213 leaders who completed the PLEA tool and all follow-up questions, and 120 program leaders who partially completed the PLEA tool and follow-up questions. Program leaders represented multiple sectors and multiple provinces and territories (see Table 3). The most common sectors represented were recreation, sports and education. Program leaders could indicate if they represented more than one sector. The greatest number of program leaders were from Ontario, British Columbia and Alberta. For the thematic analysis, HC and SL were in agreement for 81.5% of responses. Responses were categorized into 1-3 categories based on the comments provided. Program leaders provided feedback on the PLEA tool (see Table 4). For example, 82% of program leaders were satisfied or very satisfied with the PLEA tool as a way of helping them understand if and how their programs implement the principles of physical literacy, and 72% of program leaders identified that they were likely or very likely to use the PLEA tool.

Discussion
Child and youth physical education, sport and recreation sectors are becoming increasingly interested in physical literacy and how it can be implemented in their programming. As a result, it is imperative that program leaders have the appropriate tools to evaluate how their programs support the development of physical literacy through people, programs, facilities and values. While some assessment tools are available to measure an individual's physical literacy [6][7][8], no assessment tool existed that could evaluate how programs across multiple sectors implement elements of physical literacy. Therefore, we developed, tested, validated and evaluated the PLEA tool, an evaluation tool for child and youth physical education, sport, and physical activity programs that assesses how programs support the development of physical literacy. The PLEA tool assesses programs across the four domains of environment, programming, leaders and staff, and values and goals. It includes indicators that reflect the elements most commonly identified in definitions of physical literacy: motor competence, motivation, confidence and knowledge to engage in lifelong physical activity [4,5]. The revised PLEA tool is a 20-item checklist of program indicators related to physical literacy within a program's environment, programming, leaders and staff, and values and goals. For each indicator, a program leader indicates whether their program currently meets or does not meet that indicator. Upon completion of the PLEA tool, program leaders are issued a score for each domain and for the overall PLEA tool. The PLEA tool was specifically designed for program evaluation; however, it would also be useful as a checklist when planning and delivering a physical activity-based program, including its environment, programming, leaders and staff, and values and goals.

Phase 1: PL content leaders consultation, pilot and content validation
In Phase 1, the PLEA Tool was created and modified based on physical literacy content leaders' opinions and piloted with local program leaders. The first version of the PLEA tool aimed to be as inclusive as possible to ensure important items were not excluded. The next step was to establish content validity of the indicators. It was necessary to determine the relevance of each item, and whether items needed to be added, removed or modified. The physical literacy content leaders provided this critical feedback, and their suggestions ensured the PLEA tool was asking about specific elements of physical literacy and that the indicators were relevant across multiple sectors. Content validity is based on the judgement of experts and their view of the items included in the instrument. This method is common practice in developing or updating health measures [19]. The establishment of content validity gave us confidence that the PLEA tool was ready to move to Phase 2.

Phase 2: Construct validation
In Phase 2, construct validity was determined by assessing whether participants in programs that scored in the top 10% on the PLEA tool had higher individual physical literacy scores than participants in programs that scored in the bottom 10% on the PLEA tool. When the PLEA tool was developed, no criterion, gold-standard measure existed. In the absence of a gold standard measure, construct validation can be carried out with two extreme groups [19]: programs that scored in the top 10% versus the bottom 10% on the PLEA tool. Our results showed that only one domain of PLAYfun (object control, upper body) and PLAYself were higher among participants in the programs that scored in the top 10% on the PLEA tool. The higher total PLAYself scores reported by participants in high scoring programs may be attributed to higher object control skills, which can influence perceived sport competence and lead to increased physical activity participation [20,21]. For PLAYself, the individual domain scores were all higher in participants who attended the high scoring programs, but the domain differences were not significant. There is evidence that children's self-motivation, self-efficacy, perceived athletic/sport competence and perceived self-worth are associated with their physical activity participation [21][22][23][24]. The PLAYparent scores did not differ between children attending high and low scoring programs, suggesting possible responder bias in parents' perceptions of their children's physical literacy or in parents' choice of activity programs for their children. Phase 2 had limitations and challenges that help explain these results.
Eighty-three child and youth physical activity, sport, and physical education programs completed the PLEA tool, including 32 different activities and sports from multiple sectors. The range of participating programs and activities in Phase 2 of the development of the PLEA tool supports the external validity of the PLEA tool [25]. In this case, the programs that participated were similar to the target population of the PLEA tool, which includes child and youth sport, physical education and physical activity programs. All programs that scored in the top 10% on the PLEA tool identified that they met 100% of the indicators, suggesting possible responder bias and a ceiling effect. The scores were more variable for the bottom 10% of programs, ranging from 7 to 23 out of 29, with an average score of 17.7 ± 5.1. The comparison of individual physical literacy between participants in the programs that scored in the top and bottom 10% on the PLEA Tool is a limitation, as the physical literacy of participants in the middle is unknown. In addition, we had fewer than 10 programs in each of the high and low scoring groups, limiting the generalizability of the results to all programs. Unfortunately, it was not feasible to complete additional assessments due to limitations in time and personnel.
There were also several challenges in the validation portion of Phase 2. For the validation phase, programs were invited if they scored in the top or bottom 10% on the PLEA Tool, and were not matched based on participants' age or gender, or the program's sport or activity. Seven programs that scored in the bottom 10% on the PLEA tool and were invited to participate in the individual assessments were unable to participate. As such, a roll-down method was used, and the subsequent low scoring programs that were invited to participate do not represent the true lowest scores. It was not feasible for programs with short sessions to participate because the assessments could not be scheduled during the session. One program that scored in the top 10% on the PLEA tool agreed to take part in the individual assessments, but due to facility and scheduling changes, the assessments could not be completed. It was also not possible to complete the assessment at one water-based program that did not have adequate land space close by. Another limitation was the inability to control whether children attended other physical activity programs, the quality of those programs, or how long and how frequently they attended all physical activity programs.

Phase 3: National consultation
In Phase 3, 213 program leaders completed the PLEA tool and feedback questions, and 120 program leaders partially completed the PLEA tool and feedback questions. The partially completed questionnaires may be explained by the length of the PLEA tool in this phase. To collect qualitative data to potentially inform changes to the PLEA tool, each of the 27 indicators included follow-up questions, which would not be included in the final version of the PLEA tool. The added time required to complete that form of the PLEA tool may have deterred some program leaders. Over twenty participants commented that the PLEA tool was too long or suggested the follow-up questions be removed. These comments were considered, and the final version of the PLEA tool now has 20 indicators and no follow-up questions. The final version of the PLEA tool can be completed in 10 to 15 minutes.
In Phase 3, program leaders almost equally represented education, recreation, not-for-profit organizations and sport. The PLEA tool is most relevant to program leaders for evaluating programs and making changes for future program sessions. The PLEA Tool may also be appropriate for public health professionals, government officials and researchers to plan and evaluate physical activity programming and interventions. The question about where participants live and practice was at the end of the survey, so only the locations of participants who completed the entire PLEA Tool are known. Participation was limited in the territories. The largest representation of participants was from Ontario, Alberta and British Columbia. Only 4 participants from Quebec participated, and this may have been because the PLEA tool was not available in French. It was not feasible to translate the PLEA tool into French and then translate open-ended responses back to English for interpretation. Several national organizations were invited to circulate the PLEA tool to members (such as coaching groups). Many of these organizations' mandates outline that information circulated to members must be available in English and French. The final PLEA tool has been translated into French. The broad representation of participants from multiple sectors is a strength of Phase 3.
In Phase 3, overall feedback was very positive. Over 80% of program leaders indicated that they were satisfied or very satisfied with the PLEA tool, that the PLEA tool was important or very important for program planning, program delivery, and program evaluation, and that it was relevant or very relevant to their areas of practice. Just over 70% of participants indicated they were likely to use the PLEA tool, and this may depend on the participant's profession or role. For example, some participants may be interested in the PLEA Tool and its outcomes but may not use the PLEA tool in their everyday work.
After Phase 3, only minor changes to the PLEA tool were made. Seven indicators (≥90% of programs met the indicator) were removed because they did not differentiate between high and low scoring programs. This change also addressed the comments that the PLEA tool was too long. Additional feedback from participants was used to re-word several indicators to ensure they were clearer and captured the appropriate information about programs. The final PLEA tool has 20 indicators.