Exploring Game Performance in the National Basketball Association Using Player Tracking Data

Recent player tracking technology provides new information about basketball game performance. The aim of this study was to (i) compare the game performances of all-star and non all-star basketball players from the National Basketball Association (NBA), and (ii) describe the different basketball game performance profiles based on the different game roles. Archival data were obtained from all 2013-2014 regular season games (n = 1230). The variables analyzed included the points per game, minutes played and the game actions recorded by the player tracking system. To accomplish the first aim, the performance per minute of play was analyzed using a descriptive discriminant analysis to identify which variables best predict the all-star and non all-star playing categories. The all-star players showed slower velocities in defense and performed better in elbow touches, defensive rebounds, close touches, close points and pull-up points, possibly due to optimized attention processes that are key for perceiving the required appropriate environmental information. The second aim was addressed using a k-means cluster analysis to create maximally different groupings of performance profiles. Afterwards, a descriptive discriminant analysis identified which variables best predict the different playing clusters. The results identified different player performance profiles, particularly related to the game roles of scoring, passing, defensive and all-round game behavior. Coaching staffs may apply this information to different players, while accounting for individual differences and functional variability, to optimize practice planning and, consequently, the game performances of individuals and teams.


Introduction
One of the most recent advances in assessing basketball performance is player-tracking technology [13,14]. This technology uses computer vision systems designed with algorithms capable of measuring the positions of players with a sampling rate around 25 frames per second [15]. Of course, kinematic variables such as distance, velocity or acceleration may be derived from these data, and sampling frequencies might improve in future [16]. Currently, the tracking technology is being used with data obtained from notational analysis providing combined information about sports performance; for example, by analyzing the distance covered by players when the team is attacking and when the same team is defending. Research in basketball using positional-derived variables however is limited at present to small samples of young basketball players examining physical demands [17], effects of defensive pressure on movement behavior [18], and how tactical performances are affected by activity workload [19].
These new tracking data open up possibilities that advance understanding of game performance by embracing a more holistic approach to analyzing sports behavior. For example, movement patterns (kinematics) from tracking data complement variables from the physiological (e.g., work rate), technical (e.g., actions) and tactical (e.g., individual/team behavioural patterns) domains leading to a more complete description and understanding of sports behavior in its entirety. As noted, an issue to address in this study using the large amounts of tracking data at hand concerns different basketball game performance profiles for different players and teams. That is, to categorize individual player performances into like groupings for use as baseline reference for the future development and preparation of players. The aim of the present study then is twofold: (i) to compare basketball game performances from the all-star and non all-star players, and (ii) to identify and describe the different basketball game performance profiles based on different game roles in the NBA.
Regarding the first aim, it was hypothesized that all-star players will outperform the non all-stars in game statistics. Therefore, the player performances on an actions-per-minute of play basis were compared, with the aim of identifying performance variables that discriminate between the two separate groups of players. It is expected that all-star players should outperform the non all-star players in their performance statistics, particularly in scoring and passing related variables, as these important variables are thought to place higher demands on anticipatory processes [20][21][22]. For the second aim, it was hypothesized that player performance profiles will present similarities and dissimilarities that can be used to identify different groups of players based on playing position. This aim is accomplished by using actions-per-game, in order to identify different groups of player performances, regardless of minutes of play in the games, thereby identifying those performance variables that discriminate between different player groupings.
Finally, it is important to describe the data within these performance-based groupings according to the players (all-star vs. non all-star) and playing positions. For example, some groups might have strong presence from all-star players and other groups might comprise both all-star and non all-star players from specific positions. This information can be useful when used in planning representative tasks in practice sessions, thereby fine-tuning playing behaviors in competition by using representative tasks in training [23,24]. In fact, players are often divided in practice into smaller groups according to specific positions as well as their playing standard. Non-starting players, for example, lack the same amount of playing time as starting players, and this competitive playing deficit likely affects their responses to competition throughout the season [21,25]. It follows that a detailed description of these different performance profiles using available objective measures would serve as an appropriate performance baseline for optimizing practice planning and, ultimately, for improving game performance.

Sample and variables
Archival data were obtained from open-access official NBA records for 1230 games played during the 2013-2014 regular season (available at http://stats.nba.com; these records contained both non-tracking and tracking data). A total of 30 teams played 82 games each between October 29, 2013 and April 16, 2014. The gathered database had records of game performances from 548 players. Players who transferred between teams were counted as two different records.
The variables analyzed included the points per game, minutes played and the following game actions, as defined by the NBA and the company responsible for the player tracking process (SportsVU, Northbrook, IL, USA):
• Pull-up shots: any jump shot outside 10 feet where a player took one or more dribbles before shooting. Gathered variables include pull-up points per game (PPG) or minute (PPM), field-goal percentage (FG%) and 3-point field-goal percentage (3FG%).
• Catch and shoot: any jump shot outside of 10 feet where a player possessed the ball for two seconds or less and took no dribbles. Gathered variables include catch and shoot PPG or PPM, FG% and 3FG%.
• Close shots: any jump shot taken by a player on any touch that starts within 12 feet of the basket, excluding drives. Gathered variables include close PPG or PPM and FG%.
• Drives: any touch that starts at least 20 feet from the hoop and is dribbled within 10 feet of the hoop, excluding fast breaks. Gathered variables include drives PPG or PPM and FG%.
• Passing-variables: the total number of passes a player makes and the scoring opportunities that come from those passes, whether they lead directly to a teammate scoring a basket (assists) or free throw (free-throw assists), or if they set up an assist for another teammate (secondary assists). Gathered variables also include total assists opportunities and total points created by assists.
• Touches-variables: the number of times a player touches and possesses the ball (touches per game), where those touches occur on the court (front, close or elbow), how long the player possessed the ball (time of possession), and the number of points per touch or per half-court touch. Gathered variables also include blocks, steals and the opponent field goals made at the rim while being defended.
• Speed and distance: variables that measure the distance covered (expressed in miles) and the average speed of all movements (expressed in miles per hour) by a player while attacking or defending.
• Rebounds: the number of rebounds secured (rebounds), the times when the player was within the vicinity (3.5 feet) of a rebound (chances), the number of rebounds a player recovers compared to the number of rebounding chances available (percentage chances) as well as whether the rebound was uncontested by an opponent (uncontested). These variables were gathered for both defensive and offensive rebounds.
• Free-throw percentage: the number of free throws made divided by the number of free throws attempted.
Video footage from the entire court was unavailable, making assessment of the NBA tracking data impossible. The NBA non-tracking data (e.g., assists, steals or defensive rebounds), however, were assessed for reliability as follows. Two games were selected at random and analyzed conjointly through systematic observation by two experts. The minimum Cohen's κ value across all variables exceeded 0.91, demonstrating high inter-rater reliability [26] between the NBA non-tracking data and the two experts.
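To make the reliability statistic concrete, the following is a minimal sketch of how Cohen's κ can be computed for two sets of categorical codes of the same events. The function name, the event labels and the ten coded events are hypothetical illustrations, not data or software from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of events")
    n = len(rater_a)
    # Observed agreement: share of events coded identically by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for ten events (assumed for illustration only).
official = ["assist", "steal", "rebound", "assist", "rebound",
            "steal", "assist", "rebound", "rebound", "assist"]
observer = ["assist", "steal", "rebound", "assist", "rebound",
            "steal", "assist", "rebound", "assist", "assist"]
kappa = cohens_kappa(official, observer)
```

In this toy example the two codings disagree on one event out of ten, so the observed agreement is 0.90 while κ is lower, because part of that agreement would be expected by chance.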

Data analysis
Variables expressed as counts per game were divided by average minutes played. Records were screened for univariate outliers (cases outside the range Mean ± 3SD) and distributions were tested, together with the advised assumptions for each subsequent inferential analysis [27]. To identify which variables best predict the player category (i.e., all-star vs. non all-star), the performance per minute of play was analyzed using a descriptive discriminant analysis. Structure coefficients greater than |0.30| were interpreted as meaningful contributors for discriminating between the two groups [27]. Validation of discriminant models was conducted using the leave-one-out method of cross-validation [28]. Also, a k-means cluster analysis was performed on the entire sample with the aim of creating and describing maximally different groups of game performance profiles. The cubic clustering criterion, together with Monte Carlo simulations, was used to identify the optimal number of clusters, thereby avoiding the use of subjective criteria. This statistical technique requires that all cases have no missing values in any of the variables introduced in the model; a total of 339 cases met this condition (62%). Afterwards, a descriptive discriminant analysis was performed to identify which of the variables best predict the playing clusters.
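The first two preprocessing steps described above (per-minute normalization and Mean ± 3SD screening) can be sketched as follows. This is an illustrative sketch only: the function names and the sample values are assumptions, and the sample standard deviation is used because the text does not state which SD estimator was applied.

```python
from statistics import mean, stdev


def per_minute(count_per_game, minutes_per_game):
    """Convert a per-game count into a per-minute rate."""
    if minutes_per_game <= 0:
        raise ValueError("minutes_per_game must be positive")
    return count_per_game / minutes_per_game


def flag_univariate_outliers(values, n_sd=3.0):
    """Return indices of values outside mean +/- n_sd * SD (sample SD)."""
    m, sd = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - m) > n_sd * sd]


# Hypothetical player: 8 touches per game in 32 minutes per game.
rate = per_minute(8.0, 32.0)

# Hypothetical per-minute rates for 20 records; the last one is extreme.
rates = [0.40 + 0.01 * i for i in range(19)] + [2.0]
outliers = flag_univariate_outliers(rates)
```

Note that with very small samples a single case cannot lie more than (n - 1)/√n standard deviations from the mean, so this screening rule is only informative for samples of a reasonable size, such as the several hundred records analyzed here.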
One-way independent measures ANOVA was used to compare the variables not selected in the discriminant models (i.e., points scored per game and minutes played). Tukey post-hoc homogeneous subsets were used to describe post-hoc results. Statistical significance was set at 0.05 and calculations were performed using JMP statistics software package (release 11.0, SAS Institute, Cary, NC, USA) and SPSS software (release 22.0, SPSS Inc., Chicago, IL).
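For readers unfamiliar with the test, the F statistic of a one-way independent measures ANOVA can be sketched as below. The analyses in this study were run in JMP and SPSS; this plain-Python function, its name and the three example groups are assumptions made purely for illustration, and the p-value and Tukey post-hoc steps are not reproduced.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more cases than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (gm - grand_mean) ** 2
                     for g, gm in zip(groups, group_means))
    # Within-group sum of squares: spread of cases around their own group mean.
    ss_within = sum((x - gm) ** 2
                    for g, gm in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical minutes played for three clusters of three players each.
minutes = [[10.0, 12.0, 14.0], [20.0, 22.0, 24.0], [30.0, 32.0, 34.0]]
f_stat, df_b, df_w = one_way_anova_f(minutes)
```

A large F relative to its degrees of freedom indicates that the group means differ by more than the within-group variability would suggest; post-hoc comparisons then identify which groups differ.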

Comparing all-star and non all-star players
The means and standard deviations from the variables according to the all-star vs. non all-star categories are presented in Table 1. The most important variables for differentiating all-star and non all-star performances per minute of play were identified using discriminant analysis. The obtained function was statistically significant (p < 0.001) with a canonical correlation of 0.59 (Λ = 0.65) and reclassification of 97.2%. The structure coefficients (SC) from the function reflected emphasis on elbow touches (SC = 0.43), defensive rebounds (SC = 0.35), close touches (SC = 0.34), close points (SC = 0.33), pull-up points (SC = 0.33) and speed in defense (SC = -0.33) (see Table 1). There were six cases misclassified (60.0% accuracy) in the all-star group and seven cases misclassified (97.8% accuracy) in the non all-star group; therefore, the obtained mathematical model shows high accuracy in classifying the players into their original groups. Figs 1 and 2 present the distribution of the discriminant scores in each group of players. The all-star players presented higher mean scores when compared to non all-star players (3.04 ± 1.45 and -0.13 ± 0.87, respectively).
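As a brief consistency check on the reported statistics: in a two-group discriminant analysis there is a single discriminant function, and in that case Wilks' Λ equals one minus the squared canonical correlation. The symbol R_c for the canonical correlation is notation introduced here for illustration.

```latex
\Lambda = 1 - R_c^{2} = 1 - (0.59)^{2} = 1 - 0.3481 = 0.6519 \approx 0.65
```

The reported values of 0.59 and 0.65 therefore agree, with about 35% of the variance in discriminant scores attributable to the difference between the two player categories.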

Describing different game performance profiles
The cubic clustering criterion (CCC) along with Monte Carlo simulations was used to identify the optimal number of clusters. The largest value (CCC = 252.6) was obtained for a model of seven clusters. Therefore, a k-means cluster analysis was performed to create and describe seven maximally different groups of performance profiles per game. The means and standard deviations from the variables according to the cluster solutions are presented in Table 2. The discriminant analysis revealed four statistically significant functions (p < 0.001); however, the first two accounted for 94.7% of the total variance, with canonical correlations of 0.98 and 0.88, respectively. The reclassification of the cases in the original groups was very high (96.2%). The structure coefficients from the functions are presented in Table 2. The first function had stronger emphasis on total distance covered in offense (SC = 0.83) and defense (SC = 0.80), whereas the second function emphasized performance in passing-related variables (see Table 2). Table 3 presents the differences between clusters in points scored per game, minutes played and distance from each case (player) to the cluster centroid. Clusters 2 and 4 had more playing minutes and points per game. Clusters 1 and 5 were the most homogeneous, as indicated by smaller distances to the group centroid. In addition, player distributions in the seven clusters were contrasted against player category, presence in the NBA first team, and specific court position of players. The all-star players were grouped in clusters 2, 3 and 4. The NBA first team was grouped in clusters 2 and 4. Fig 3 presents the territorial map of the cases and created clusters within the space defined by the first and second discriminant functions. Players from clusters 4 and 2 exhibited better overall performances; however, players from cluster 6 also performed well in variables related to function 2.
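The clustering step and the distance-to-centroid measure reported in Table 3 can be illustrated with a minimal k-means sketch (Lloyd's algorithm). This is not the JMP procedure used in the study and does not reproduce the CCC-based choice of seven clusters; the function name, the fixed starting centroids and the six two-variable records are hypothetical, and in practice variables on different scales would usually be standardized first.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm; `centroids` gives the starting cluster centres."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each case joins its nearest centroid (Euclidean).
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical records: (distance covered in miles, touches per game).
players = [(2.4, 60.0), (2.5, 65.0), (2.3, 58.0),
           (1.0, 20.0), (1.1, 22.0), (0.9, 18.0)]
labels, centres = kmeans(players, centroids=[players[0], players[3]])

# Distance from each case to its own cluster centroid (cf. Table 3).
dist_to_centroid = [math.dist(p, centres[lab])
                    for p, lab in zip(players, labels)]
```

Smaller average distances to the centroid indicate a more homogeneous cluster, which is the sense in which clusters are compared for homogeneity in the results above.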

Discussion
This study aimed to compare game performances of all-star and non all-star basketball players and to identify and describe different basketball game performance profiles in the NBA. In general terms, key performance indicators were identified that discriminate all-star players from non all-stars and, also, the different groupings of performance profiles in competition.

Comparing all-star and non all-star players
As expected, all-star players outperformed non all-star players in performance statistics, particularly in defensive rebounds, close touches and close points, pull-up points and assists. (Note. These results may be confounded in that the distinction between all-star and non all-star players is determined by sportswriters and broadcasters. That said, discrimination between these prejudged player groups is reflected in some game performance variables as reported in this study.) As noted previously, the variables obtained from the tracking systems allow court locations to be used to better understand several game statistics. Therefore, these results increase knowledge of basketball game behavior by identifying key performance variables and by reducing prior emphasis on the importance of distance covered and velocity. The reclassification obtained was very high and hence affirms the accuracy of the mathematical model.
The close touches and points were identified as key variables, suggesting that all-star players performed consistently better than non all-star players within 12 feet of the basket. These court locations are highly concentrated with teammates and opponents, with frequent physical contact between players. These complex actions require high anticipatory skills [29] and all-star players outperform non all-stars in producing these complex skills under extremely adverse conditions [20][21][22]. Also, related to these findings, all-star players demonstrated the ability to score pull-up points, again showing how well these players perceive environmental information and adapt their behavior accordingly [30,31], as they strive to reach a better position from which to score (oftentimes using one or more dribble actions before shooting, for example).
Several studies from basketball [32], football [33] and futsal [34] analyzing space-time dynamics of player dyads inform how the formation of playing patterns is influenced by scoring targets (i.e., baskets and goals). This higher ability to perceive the environment requires a developed attention span [35,36], perhaps evidenced in the higher number of assists, given that assists constitute passes to a teammate leading directly to a subsequent field goal. The distance covered and average speeds were not discriminant variables between the all-star and non all-star players. Until the availability of recent technology, reliable time-motion data from basketball games were difficult to acquire and, as such, low accuracy in the measures reported and/or small sample sizes have long been a concern [37]. The present results, however, provide measures of distance and velocity from an entire NBA season that are considered reliable [13,14], despite the 25 Hz sampling frequency limitation [16]. Although discriminant analysis only emphasized velocity in defense, there seems to be a tendency for all-star players to cover slightly shorter distances at lower average velocities. This might be important in that it is consistent with previous observations on the enhanced attunement of players to perceive affordances [38,39]. Thus, all-star players may well make fewer mistakes when deciding when and where to run in both offense and defense, possibly taking shorter paths to reach their destinations. These fewer mistakes in a game might well result in shorter distances covered by these players. In addition, these considerations might also suggest that all-star players are more efficient, having lower energy demands placed on them during a game. In fact, research suggests that motor efficiency, achieved through intensive training, leads to improved perception, focus, anticipation, planning and fast responses [40].
The finding of lower defensive velocities for all-star players may reinforce this observation, but may also suggest that these players focus their efforts more on offensive performances, as these are more complex and depend more upon their high-level expertise [22,41].

Describing different game performance profiles
The results reported different performance profiles for different player groupings. There were seven different groups identified by the analysis, obtaining very high reclassifications of the cases (96.2%). These groupings, based on total distance covered in the season and performance per game, might be used in developing specific playing profiles that, taking into consideration the influence of individual differences and functional variability, may serve as baseline to facilitate optimizing practice planning and game performance.
Clusters 2, 3 and 4 performed best in the discriminant variables from function 1 (78.3% of total variance) and they contained all of the all-star players. These players participated in more than 30 minutes per game and scored many points per game (from 12.8±3.4 to 17.8±6.3). As an effect of these higher playing times, the most discriminant variables of this function were the distances covered either in defense or offense. Other discriminant variables included participation in offense (touches and front court touches) and passing-related variables (passes, assists, secondary assists, assists opportunities and points created by assists). There are also unique traits from each cluster that could be used to optimize the training process. For example, due to their high playing times in game competition, players from cluster 2 are likely highly conditioned players; however, they should also be of most concern to coaches when planning recovery time between games [42]. Conversely, players from cluster 4 comprised all guards or shooting guards with extremely high values for time of possession, touches per game or passing-related variables. This is key information for coaches to optimize representative task designs that enable players to perceive adequate environmental information and to subsequently act accordingly [25,30,43]. Finally, players from cluster 3 demonstrated less possession time and fewer touches, despite the higher minutes of play, which suggests a predominantly defensive role for these players. Defensive tasks are particularly related to player fitness variables, as high-level defensive performances seem to require higher energy demands [44].
In addition, the worst performance variables in function 1 belong to players from cluster 1, as they exhibited lower playing times (12.6±5.0) distributed equally on playing position. In fact, the most unclear player positions (and missing values), in reference to players that can play in several different positions, were grouped in this cluster. Therefore, these results might be suggesting a profile of an all-round player that can be used in a game to serve multiple purposes, or a profile of a very specialized player (e.g. in shooting or rebounding). Together with workload compensation of reduced playing time, coaching staffs can modify the tasks to optimize the performance produced by these all-round players or specialists.
When adding the results from the second discriminant analysis function, clusters 4 and 6 emerge as active performers in the analyzed variables, such as time in possession, touches, passing, pull-up points and drives per game. These results confirm the guards profile (in cluster 4) identified previously, and also for players in cluster 6. In fact, there is an important requirement to adjust the tasks required of these players in order to fine-tune the environmental information necessary for information pick-up in game play [30,45]. From the same perspective, players from cluster 3, identified as defensive-related, demonstrate less activity in these variables, consistent with their roles in the game.

Conclusions
In summary, this study provided analysis of an NBA regular season using player-tracking variables and notation data. It was found that all-star players performed consistently better than non all-star players within 12 feet of the basket, possibly a result of optimized attention processes that are key for perceiving the required appropriate environmental information for action production. In addition, different groupings were identified based on playing performance, particularly in relation to the roles of scoring, passing, defensive and all-round duties. These findings can be used to optimize preparation for individual player groupings and, ultimately, improve game performances of the players and teams.