Measuring individual worker output in a complementary team setting: Does regularized adjusted plus minus isolate individual NBA player contributions?

Shankar Ghimire; Justin A. Ehrlich; Shane D. Sanders

doi:10.1371/journal.pone.0237920

Abstract

Adjusted plus minus (APM) measures have redefined our understanding of player value in basketball and hockey, both of which are team games featuring player productivity spillovers. APM measures use seasonal play-by-play data to estimate individual player contributions. If a team's overall score margin success is figuratively represented by a pie, APM measures are well designed to slice the pie and attribute individual contributions accordingly. However, they do not account for the possibility that better players can increase the overall size of the pie and thus increase the size of the slice (overall APM value) for teammates. Herein, we use NBA player-season Real Plus Minus (RPM) data, RPM being a leading APM measure, for all recorded player-seasons from 2013–19, together with player lineup data, to test whether RPM is related to teammate quality. We run sets of linear fixed effects regression models to explain variation in RPM across player-seasons. We also employ a two-stage least squares (2-SLS) method as a robustness check. Both empirical approaches address potential endogeneity in the relationship of interest. We find strong evidence that RPM is related to on-court teammate quality. Despite adjusting for teammate and opponent quality, RPM does not control for complementarity effects. As such, RPM is not suited for out-of-sample prediction.

Citation: Ghimire S, Ehrlich JA, Sanders SD (2020) Measuring individual worker output in a complementary team setting: Does regularized adjusted plus minus isolate individual NBA player contributions? PLoS ONE 15(8): e0237920. https://doi.org/10.1371/journal.pone.0237920

Editor: Corrado Andini, University of Madeira, PORTUGAL

Received: March 9, 2020; Accepted: August 5, 2020; Published: August 25, 2020

Copyright: © 2020 Ghimire et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data are publicly available from: https://stats.nba.com/

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

I. Introduction

Adjusted plus minus (APM) measures have redefined our understanding of player value in basketball and hockey, both of which are team games featuring player productivity spillovers [1]. In basketball, a player’s APM value represents his marginal effect on the score margin per 100 possessions as compared to a league-average player. It has become a leading measure for comprehensive player analysis [2–8]. The measure utilizes ridge regression in an effort to isolate individual contributions to average team score margin differentials per 100 possessions [9]. We extend the analysis by using ESPN’s estimated values as explanatory variables in a set of fixed effects and two-stage least squares (2-SLS) regressions that seek to explain player-season APM variation.

In an ESPN article introducing the real (adjusted) plus minus measure, Steve Ilardi [3] highlights the serious flaw in the familiar unadjusted +/- statistic: each player’s rating is heavily influenced by the play of his on-court teammates. APM measures were created to disentangle setting-of-play spillovers and render player value measures that are adjusted for teammate and opponent quality and thus (purportedly) unrelated to teammate quality.

In a related paper, Brian Macdonald points out that APM values seek to disentangle player marginal effects from one another by using lineup variation across a season of play and are thus commonly interpreted as teammate independent [10]. Regularized adjusted plus minus measures do feature (asymptotically) consistent estimators of player value, which describe marginal player contributions in the season that occurred. From this, however, can we interpret the measure as providing teammate-independent player contribution values? Let us think of the value that a player creates, along with his lineup teammates, as a pie whose size represents (score margin) performance. APM measures can divide each such pie into constituent slices that represent consistently estimated player contributions. That is, owing to the asymptotic property of consistency, APM is capable of dividing credit at the player level based on the lineups and performances in which the player actually played. Is this division reliable when considering the player’s (counterfactual) productivity “out of sample”? Consider a player all of whose lineup teammates improve in ability (i.e., in every lineup in which the player plays). In this case, we know that the overall pie with which the player is associated has become larger, while the individual player retains the same attributes. Do we expect this player to maintain the same expected APM value? More generally, do we expect player APM values not to be influenced by out-of-sample changes in playing conditions? If we are to treat the APM measure as teammate independent, then such an outcome must obtain. We test this proposition herein.

The remainder of the paper is organized as follows: Section II provides a summary of the data and their visualizations; Section III describes the estimation approach; Section IV presents the empirical results; Section V discusses the implications of the results; and Section VI concludes.

II. Data summary and visualizations

For out-of-sample shifts in teammate quality that change the size of a figurative lineup performance pie, do we expect the size of a player’s slice to remain the same? Rather than simply a theoretical consideration, this question has substantial empirical bearing in the modern NBA, which features a high level of player entry and player movement from year to year. Often, players change teams and are presented with near wholesale upgrades or downgrades in terms of lineup-teammate quality. To test whether APM is teammate-independent, we take advantage of frequent player movements in the NBA to determine whether a representative player’s season-level APM values are related to the (minutes-weighted) average APM value of his lineup teammates in the same season. We do so using a six-year panel of NBA player-season level ESPN real plus minus data and corresponding lineup data from NBA.com. The two databases are available at https://www.espn.com/ and https://stats.nba.com/ respectively. We also control for time-variant features of the panel data (e.g., age and age squared). To the authors’ knowledge, ESPN’s real plus minus represents the only publicly available dataset of NBA player regularized APM values, where regularized adjusted plus minus estimation has become a standard methodology for NBA player analysis [2, 5–10]. In the remainder of the paper, we use the terms real plus minus and APM interchangeably, as real plus minus is, in fact, an APM measure.
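The minutes-weighted average of lineup-teammate RPM described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `other_players_rpm`, the input layout, and the toy three-player "lineups" are all assumptions made for the example, and the paper does not spell out the exact weighting scheme.

```python
from collections import defaultdict

def other_players_rpm(lineups, rpm):
    """Minutes-weighted average RPM of each player's lineup teammates.

    lineups: list of (players, minutes), where players is a tuple of ids
    rpm:     dict mapping player id -> season RPM
    (Hypothetical data layout, chosen only for this sketch.)
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for players, minutes in lineups:
        for p in players:
            for q in players:
                if q != p:
                    # Each teammate counts in proportion to shared minutes
                    weighted_sum[p] += minutes * rpm[q]
                    total_weight[p] += minutes
    return {p: weighted_sum[p] / total_weight[p] for p in total_weight}

# Toy example: three-player "lineups" keep the arithmetic easy to follow
rpm = {"A": 4.0, "B": 1.0, "C": -2.0, "D": 0.0}
lineups = [(("A", "B", "C"), 30.0), (("A", "B", "D"), 10.0)]
result = other_players_rpm(lineups, rpm)
```

In this toy case the strongest player, A, ends up with the weakest set of teammates by construction, since his own value is excluded from his teammate average.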

The present paper considers whether APM measures (e.g., ESPN real plus minus) are teammate dependent. It does so by considering player movement within and entry into the NBA. The following plots summarize the level of mobility within the NBA over the data period. Specifically, Fig 1 considers the proportion of NBA players who played for at least two teams during the 2-season period that ended with the season listed on the x-axis.

Fig 1. Mobility in the NBA.

https://doi.org/10.1371/journal.pone.0237920.g001
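A mobility share of the kind plotted in Fig 1 could be computed along the following lines. This is a sketch under stated assumptions: the `(player, season, team)` record layout and the function name `mobility_share` are hypothetical, and the denominator is assumed to be every player who appears at least once in the two-season window, which the text does not state explicitly.

```python
from collections import defaultdict

def mobility_share(stints, end_season):
    """Share of players with 2+ teams in the two seasons ending at end_season.

    stints: iterable of (player, season, team) records (hypothetical layout).
    """
    window = {end_season - 1, end_season}
    teams = defaultdict(set)
    for player, season, team in stints:
        if season in window:
            teams[player].add(team)
    if not teams:
        return 0.0
    movers = sum(1 for t in teams.values() if len(t) >= 2)
    return movers / len(teams)

# Toy records: B changes teams between seasons, C within a season,
# A stays put, and D falls outside the window.
stints = [
    ("A", 2018, "BOS"), ("A", 2019, "BOS"),
    ("B", 2018, "LAL"), ("B", 2019, "MIA"),
    ("C", 2019, "NYK"), ("C", 2019, "DAL"),
    ("D", 2017, "CHI"),
]
share = mobility_share(stints, 2019)
```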

Fig 1 demonstrates that there is a high level of mobility in the NBA. Typically, more than a third of league players change teams at least once over a given two-year window. Given this level of mobility, changes in teammate ability from season to season are pervasive in the NBA. Many players explicitly change teams, and the players who do not change teams typically receive several new teammates from one season to the next. The following directed graph visualizes player flows in the NBA over the data period.

Within the directed graph (Fig 2), the direction of player flow is depicted by color-coding. The color of an edge is coordinated with the predecessor node (original team), and the edge flows to the successor node (latter team). The thickness of an edge indicates the number of players flowing along an edge over a data period. The graph demonstrates that NBA teams are very connected through the player market and that player turnover for teams is therefore high. An additional source of lineup change in the NBA is player entry into the NBA. Fig 3 demonstrates the proportion of rookie players in a given NBA season.

Fig 2. Player flows within the NBA.

https://doi.org/10.1371/journal.pone.0237920.g002
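The edge weights of a directed player-flow graph such as Fig 2 can be tallied with a simple counter. The sketch below assumes a hypothetical `(player, order, team)` record layout, where `order` is any sortable value that places a player's stints in chronological order; it is an illustration of the idea, not the procedure used to draw the figure.

```python
from collections import Counter

def player_flows(stints):
    """Count moves from an original team to a later team.

    stints: iterable of (player, order, team); `order` sorts a player's
    stints chronologically (hypothetical layout for this sketch).
    Returns a Counter keyed by (source_team, destination_team).
    """
    by_player = {}
    for player, order, team in stints:
        by_player.setdefault(player, []).append((order, team))
    flows = Counter()
    for history in by_player.values():
        history.sort()
        # Compare each stint with the next one for the same player
        for (_, src), (_, dst) in zip(history, history[1:]):
            if src != dst:
                flows[(src, dst)] += 1
    return flows

stints = [
    ("A", 1, "BOS"), ("A", 2, "LAL"),
    ("B", 1, "BOS"), ("B", 2, "LAL"),
    ("C", 1, "LAL"), ("C", 2, "BOS"),
    ("D", 1, "NYK"), ("D", 2, "NYK"),
]
flows = player_flows(stints)
```

Each key is a directed edge and each count corresponds to the edge thickness described in the text.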

Fig 3. Proportion of players who are rookies.

https://doi.org/10.1371/journal.pone.0237920.g003

NBA teams have relied heavily on rookie players in recent years. In each of the last three NBA seasons, more than one-fifth of the league players have been in their rookie season. Hence, player entry into the league has also been a substantial source of lineup disruption from season to season over the sample period.

While there is substantial player movement in the NBA, this matters only if players exhibit significant heterogeneity in terms of productivity (i.e., real plus minus value). The following plots represent density plots of player-season real plus minus values for the sample. Fig 4 represents an overall player-season density plot, while Fig 5 represents a set of color-coded density plots by position-of-play.

Fig 4. Real plus minus density plot for all NBA players, 2013–14 through 2018–19.

https://doi.org/10.1371/journal.pone.0237920.g004

Fig 5. Real plus minus density plots by playing position, 2013–14 through 2018–19.

The RPM densities are plotted for the Point Guard (PG), Shooting Guard (SG), Power Forward (PF), Small Forward (SF), and the Center (C) positions.

https://doi.org/10.1371/journal.pone.0237920.g005

As these plots demonstrate, there is substantial player ability heterogeneity at each position in the NBA. While each plot is roughly bell-shaped, we observe substantial dispersion in player productivity levels. In terms of range, the most productive players have an estimated +10 point score margin effect per 100 possessions, while the least productive have an estimated -7 point score margin effect. Moreover, substantial heterogeneity is observed at each position. The following real plus minus summary (Table 1) further demonstrates the level of player productivity (RPM) heterogeneity and of average lineup-teammate productivity (OtherPlayersRPM) heterogeneity in the NBA.

Table 1. Summary statistics.

https://doi.org/10.1371/journal.pone.0237920.t001

Note that the average player RPM is substantially negative because player-season observations are not weighted by minutes played within the sample; such a weighting would impose a zero-average constraint upon the variable. OtherPlayersRPM, by contrast, is weighted by teammate minutes played and is therefore closer to zero.

Even if a player is able to maintain a stable set of lineup-teammates across seasons, aging effects can change teammate ability levels fairly quickly in the NBA. Fig 6 plots the estimated aging curve (i.e., productivity profile) for NBA players over the sample period. On the y-axis is real plus minus and on the x-axis is age.

Fig 6. Estimated real plus minus by age for the typical NBA player.

https://doi.org/10.1371/journal.pone.0237920.g006

Fig 6 demonstrates that aging affects estimated player productivity substantially in the NBA. As such, the productivity of lineup-teammates can change from year-to-year even without substantial roster turnover. In our fixed effects and 2-SLS regressions to follow, we control for a player’s age for this very reason as it affects baseline ability for a given player-season.
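As a worked illustration of why age and age squared are both controlled for, suppose the aging curve is quadratic in age. This functional form is an assumption suggested by the age and age-squared controls, and the symbols a, b, and c are placeholders rather than estimates reported in the paper. The age of peak productivity then follows from the first-order condition:

```latex
\begin{align}
f(\mathit{Age}) &= c + a\,\mathit{Age} + b\,\mathit{Age}^2, \qquad b < 0, \\
f'(\mathit{Age}^{*}) &= a + 2b\,\mathit{Age}^{*} = 0
\;\Longrightarrow\; \mathit{Age}^{*} = -\frac{a}{2b}.
\end{align}
```

With b < 0 the curve rises before Age* and falls after it, so two teammates of equal current productivity can be moving in opposite directions from one season to the next.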

III. Estimation approach

To answer our key question as to whether teammates’ RPM affects an individual player’s RPM, we specify a baseline linear regression model with player RPM as a function of the minutes-weighted average RPM of lineup teammates and of the player’s individual characteristics (age and age squared).

Difficulties exist in estimating the relationship between player APM and lineup-teammate APM due to potential endogeneity. Just as a player’s lineup-teammates may improve his APM value, so too might a player improve the APM values of his lineup-teammates. Moreover, players are often selected into lineups that potentially form a cluster of players with similar abilities. For example, starters disproportionately play alongside other starters, and backups disproportionately play alongside other backups. This latter issue also exacerbates endogeneity between player APM and lineup-teammate APM values. If we attempt to estimate the direct, uncontrolled relationship between these variables, we may actually pick up ability-clustering in lineup selections. We use the following techniques to address the endogeneity issue.

a. Fixed effects estimation

One manner by which to treat endogeneity is through the use of fixed effects [11]. In the present setting, we take advantage of our six-year panel data structure to specify player fixed effects and treat endogeneity of these performance variables:

$$RPM_{i,t} = \beta_0 + \beta_1\,OtherPlayersRPM_{i,t} + \beta_2\,Age_{i,t} + \beta_3\,Age_{i,t}^2 + player_i + season_t + \varepsilon_{i,t} \tag{1}$$

where the response variable $RPM_{i,t}$ is the real plus-minus measure for player i in season t; $player_i$ and $season_t$ capture the player-specific and time fixed effects; $OtherPlayersRPM_{i,t}$ represents the minutes-weighted average RPM of the player’s lineup teammates; and $\varepsilon_{i,t}$ is the error term. We also control for the player’s individual characteristics through age and its squared term. This information is available for 810 unique NBA players (i = 1 to 810) over 6 seasons (t = 2014 to 2019), giving us 2852 total observations. This provides us the opportunity to control for individual-specific as well as time-specific fixed effects. There are 262 instances in which a player played for more than one team in the same season. In those cases, we average the player’s RPM for that season. We also performed a separate set of analyses using three-dimensional panel data (player i, team j, season t); this dramatically reduces the total number of observations, but the final results do not change. To include the maximum possible number of observations in our analysis, we do not report the latter approach.

Player fixed effects control for unobserved baseline player ability such that the endogenous effect of the player’s ability upon teammate performance is then isolated within the fixed effects model. With the specification of player fixed effects, one is implicitly regressing the performance of a player relative to his baseline (i.e., a performance residual for the player) against the APM level of lineup-teammates.

Given that we have specified player fixed effects herein, the fixed effects model specification in (1) represents an exogenous test as to whether RPM, a leading APM measure, is influenced by teammate productivity (i.e., through some complementarity effect that influences the overall size of the lineup “performance pie” and of the size of constituent slices).
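The idea of regressing a player's performance relative to his own baseline on teammate quality can be illustrated with a two-way within transformation. The sketch below is a simplified, pure-Python illustration under several assumptions: a balanced toy panel, a single regressor (the age controls are omitted), noise-free synthetic data, and a made-up slope of 0.2. It is not the authors' estimation code.

```python
def two_way_within_slope(panel):
    """Slope of y on x after removing player and season means.

    panel: dict mapping (player, season) -> (x, y); must be balanced.
    """
    players = sorted({i for i, _ in panel})
    seasons = sorted({t for _, t in panel})
    n_obs = len(panel)

    def demean(k):
        grand = sum(v[k] for v in panel.values()) / n_obs
        p_mean = {i: sum(panel[(i, t)][k] for t in seasons) / len(seasons)
                  for i in players}
        s_mean = {t: sum(panel[(i, t)][k] for i in players) / len(players)
                  for t in seasons}
        # Two-way within transformation for a balanced panel
        return {(i, t): panel[(i, t)][k] - p_mean[i] - s_mean[t] + grand
                for (i, t) in panel}

    x_w, y_w = demean(0), demean(1)
    sxy = sum(x_w[key] * y_w[key] for key in panel)
    sxx = sum(x_w[key] ** 2 for key in panel)
    return sxy / sxx

# Synthetic data: all numbers below are invented for the illustration
player_effect = {"A": 3.0, "B": 0.0, "C": -2.0}
season_effect = {2017: 0.1, 2018: -0.1, 2019: 0.0}
teammate_rpm = {
    ("A", 2017): 1.0, ("A", 2018): -0.5, ("A", 2019): 0.5,
    ("B", 2017): 0.0, ("B", 2018): 1.5, ("B", 2019): -1.0,
    ("C", 2017): -1.0, ("C", 2018): 0.5, ("C", 2019): 2.0,
}
TRUE_SLOPE = 0.2
panel = {
    key: (x, player_effect[key[0]] + season_effect[key[1]] + TRUE_SLOPE * x)
    for key, x in teammate_rpm.items()
}
slope = two_way_within_slope(panel)
```

Because the synthetic outcome contains no noise, the within estimator recovers the assumed slope even though the player effects differ widely, which is the sense in which baseline ability is held fixed.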

b. Two-stage least squares (2-SLS) estimation

Another approach we follow to subdue the problem of endogeneity is the use of two-stage least squares (2-SLS) instrumental variable estimation, with the lagged values of the minutes-weighted average APM of lineup teammates as an instrument. While it may be that lineup-teammate ability affects the player performance residual, there is a possibility that a player may influence the team’s current period performance. To address such reverse causality, we use the previous-season performance of present-season lineup teammates as an instrument for the present quality of lineup teammates and use the estimated values in the regression as shown in model (2):

$$RPM_{i,t} = \beta_0 + \beta_1\,\widehat{OtherPlayersRPM}_{i,t} + \beta_2\,Age_{i,t} + \beta_3\,Age_{i,t}^2 + player_i + season_t + \varepsilon_{i,t} \tag{2}$$

where $\widehat{OtherPlayersRPM}_{i,t}$ is the estimated value of other-player RPM from the first-stage regression. This technique follows the argument that present-period teammate quality, as estimated by past performance of teammates, is exogenous to how a player performs in the current period and hence is a valid instrument for this estimation. This is a commonly used technique to subdue the problem of endogeneity [12]. The strength of the instrument is supported by the Cragg-Donald Wald F-statistic for the weak-identification test, as evidenced by sufficiently large F-statistics from the first-stage regression. The NBA is a singular league in terms of competitive level that does not feature a great deal of in-season roster movement. Therefore, the best and perhaps only counterfactual measure of current player productivity in this case is lagged productivity of the same player. In sum, the result is that we can determine whether APM measures are truly teammate-independent within a model that estimates the relationship exogenously.
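The two-stage logic can be illustrated with one endogenous regressor and one instrument. In the sketch below, `z` plays the role of lagged teammate quality, `x` current teammate quality, and `y` player RPM; the confounder `u`, the true slope of 0.5, and all data values are invented for the example. Fixed effects and age controls are left out for brevity, so this is an illustration of the technique rather than a reproduction of model (2).

```python
def ols(x, y):
    """Intercept and slope of a simple least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(z, x, y):
    """2-SLS with a single instrument z for a single regressor x."""
    a0, a1 = ols(z, x)                    # first stage: x on z
    x_hat = [a0 + a1 * zi for zi in z]    # fitted teammate quality
    return ols(x_hat, y)                  # second stage: y on fitted x

# Toy data: u is an unobserved confounder, orthogonal to z by construction
z = [-1.0, -1.0, 1.0, 1.0]
u = [-1.0, 1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [0.5 * xi + 2.0 * ui for xi, ui in zip(x, u)]

_, ols_slope = ols(x, y)                       # contaminated by u
_, iv_slope = two_stage_least_squares(z, x, y)  # uses only variation from z
```

Here the naive regression absorbs the confounder while the instrumented estimate does not. Note that standard errors taken from a manually run second stage are not valid, so applied work would normally rely on a dedicated instrumental-variables routine.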

We estimate models (1) and (2) for all NBA players and also in separate sub-samples by position-of-play for the Point Guard, Shooting Guard, Power Forward, Small Forward, and Center positions. As such, we obtain six sets of regression results for each of the two estimation techniques explained above. We report the results in the subsequent section.

IV. Empirical results

Estimation results are presented in Tables 2 and 3.

Table 2. The fixed effects estimation results.

Response Variable: Individual Player RPM.

https://doi.org/10.1371/journal.pone.0237920.t002

Table 3. Instrumental variable regression.

Response Variable: Individual Player RPM.

https://doi.org/10.1371/journal.pone.0237920.t003

In the fixed effects regression results shown in Table 2, there is a significant, positive relationship between RPM and OtherPlayersRPM for the overall sample. For each unit of improvement in average lineup-teammate RPM, a player gains approximately 0.17 additional RPM points. As this result is conditional upon a given player’s baseline productivity (via player fixed effects and age), we interpret this as a significant and fairly strong complementarity effect that is uncontrolled in the RPM measure. For each one standard deviation change in OtherPlayersRPM, a player gains an estimated 0.25 (i.e., 1.44 × 0.174) RPM units. Moreover, the estimated coefficient for OtherPlayersRPM is positive for each position-of-play in the position-specific regressions, and the relationship is significant at standard significance levels for the Small Forward and Center positions.

In the 2-SLS regression results shown in Table 3, the coefficient for OtherPlayersRPM is positive and significant at the standard significance level in each regression. Whereas the fixed effects estimates of uncontrolled lineup-teammate complementarity are not significant for the PG, SG, and PF sub-samples, the 2-SLS estimates are statistically significant in all regressions. The coefficients are also each larger than those in the fixed effects model. In particular, for the overall sample, each unit of improvement in average lineup-teammate RPM is associated with a gain of approximately 0.664 additional RPM points. Coefficients for the sub-samples can be interpreted accordingly.
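The effect sizes quoted in the text can be checked with a few lines of arithmetic. The constants below are the overall-sample figures quoted above (0.174, 0.664, and the 1.44 standard deviation of OtherPlayersRPM); the helper name `implied_rpm_change` is introduced only for this sketch.

```python
FE_COEF = 0.174    # fixed effects estimate, overall sample (quoted in the text)
IV_COEF = 0.664    # 2-SLS estimate, overall sample (quoted in the text)
SD_OTHER = 1.44    # standard deviation of OtherPlayersRPM (quoted in the text)

def implied_rpm_change(coef, delta_teammates):
    """Predicted change in a player's RPM for a given shift in teammate RPM."""
    return coef * delta_teammates

fe_per_sd = implied_rpm_change(FE_COEF, SD_OTHER)
iv_per_sd = implied_rpm_change(IV_COEF, SD_OTHER)
```

The fixed effects figure reproduces the 0.25 reported in the text; the 2-SLS counterpart is simply the same calculation applied to the larger coefficient.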

V. Discussion of the results

Based on the estimations above, for each unit of average gain in teammates’ RPM, a player’s RPM is inflated by between 0.17 and 0.66 points according to the point estimates. We find that RPM is not context-independent. Rather, positive spillover effects from more capable teammates improve a given player’s RPM value, ceteris paribus. This result has implications for player trade value. Namely, a player’s trade value is estimated to be a function of where he is traded. Indeed, basketball is a game of complementarity, and no measure (not even one such as RPM) has demonstrated the ability to isolate these complementarity effects toward an understanding of player value in the abstract. While this is perhaps the holy grail of basketball data science, it is not assured that such a measure is possible. Perhaps as with quantum mechanical information, it may be impossible to understand the contribution of the basketball player and that of the player’s environment in some absolute sense. Indeed, basketball leagues are not natural experiments in which players are randomly paired and resampled. Rather, players are organized into often stable team environments, and resampling occurs infrequently, such that players have often aged by the time they receive a new set of teammates with whom to play. In such an environment, counterfactuals concerning player value will go largely unobserved.

VI. Conclusions

The results provide strong evidence that regularized adjusted plus minus player productivity measures are not, in fact, “teammate-independent.” Rather, we find evidence that lineup-teammate productivity positively influences a given player’s real plus minus value. As this result is conditional upon a given player’s baseline productivity via player fixed effects and age, we interpret this as a significant and fairly strong complementarity effect that is uncontrolled in adjusted plus minus measures such as real plus minus. While real plus minus may control for in-sample teammate effects well, it appears that the measure does not control for out-of-sample lineup-teammate quality effects. We find this within a model that accounts for teammate quality changes from season to season. We note that basketball leagues are not natural experiments in which players are randomly paired and resampled. Rather, players are organized into often stable team environments and resampling occurs infrequently such that players have often aged by the time they receive a new set of teammates with whom to play. In such an environment, counterfactuals concerning player value will go largely unobserved. From this estimation, further (out-of-sample) adjustments to the APM estimation methodology can be explored in future work.

References

1. Horrace WC, Liu X, Patacchini E. Endogenous network production functions with selectivity. Journal of Econometrics. 2016 Feb 1;190(2):222–32.
2. Fearnhead P, Taylor BM. On estimating the ability of NBA players. Journal of Quantitative Analysis in Sports. 2011 Jul 1;7(3).
3. Ilardi S. Ilardi: How real plus-minus (RPM) gauges players. ESPN.com. 2014 Apr 7. http://www.espn.com/nba/story/_/id/10740818.
4. Macdonald B. Adjusted plus-minus for NHL players using ridge regression with goals, shots, Fenwick, and Corsi. Journal of Quantitative Analysis in Sports. 2012 Jan 1;8(3).
5. Rosenbaum DT. Measuring how NBA players help their teams win. 82Games.com (http://www.82games.com/comm30.htm), 2004, 4–30.
6. Winston WL. Mathletics: How gamblers, managers, and sports enthusiasts use mathematics in baseball, basketball, and football. Princeton University Press. 2012.
7. Sill J. Improved NBA adjusted +/- using regularization and out-of-sample testing. In: Proceedings of the 2010 MIT Sloan Sports Analytics Conference. 2010 Mar 6.
8. Ilardi S, Barzilai A. Adjusted Plus-Minus Ratings: New and Improved for 2007–2008. http://www.82games.com/ilardi2.htm. 2008.
9. Ehrlich J, Sanders S, Boudreaux CJ. The relative wages of offense and defense in the NBA: a setting for win-maximization arbitrage? Journal of Quantitative Analysis in Sports. 2019 Aug 27;15(3):213–24.
10. Macdonald B. A regression-based adjusted plus-minus statistic for NHL players. Journal of Quantitative Analysis in Sports. 2011 Jul 1;7(3).
11. Ebbes P, Papies D, van Heerde HJ. Dealing with endogeneity: A nontechnical guide for marketing researchers. Handbook of Market Research. 2016.
12. Hansen H, Tarp F. Aid and growth regressions. Journal of Development Economics. 2001 Apr 1;64(2):547–70.
