When the NBA's analytics community gathers Friday in Boston for the first day of the annual MIT Sloan Sports Analytics Conference, it does so with the intent of expanding its knowledge of the game it loves. In that spirit, this seems like an appropriate time to consider the most important questions statistical analysis can help shed light on in basketball -- and give a progress report on where we stand in attempting to answer them.
My inspiration is the German mathematician David Hilbert. In 1900, at the dawn of the new century, Hilbert set forth 23 problems for the field of mathematics to solve in the next 100 years.
Hilbert's problems were translated to baseball by Keith Woolner in Baseball Prospectus at the start of this century, and Insider contributor Aaron Schatz published a list of football's Hilbert problems in the Journal of Quantitative Analysis in Sports, but no one has set forth such a list in basketball.
Ten of Hilbert's 23 problems are now considered "resolved." Because of the nature of basketball analysis, I don't think we can ever solve any of the problems I present, but with more data and research, we can improve our understanding of them. While Hilbert came up with 23 problems and Jay-Z famously had 99, I settled on 10 important topics.
1. Which better describes player value -- individual stats or team impact?
When it comes to player evaluation, there are two schools: stats derived from the box score -- such as player efficiency rating, my WARP, Basketball-Reference.com's win shares and Wages of Wins' wins produced -- and plus-minus statistics, which measure how a team fares with a player on the court. The problem with plus-minus is that it tends to be highly variable from season to season, but the advanced techniques and multiple seasons of data used by regularized adjusted plus-minus have allowed it to predict future team performance nearly as well as the best box score stats.
Ultimately, the answer probably isn't one or the other but a combination of both methods, since they measure slightly different things. Eventually, adding data from SportVU's optical tracking, such as screens and shot defense, could get us closer to measuring the value of players who don't make an impact on the box score.
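To make the "regularized" part of regularized adjusted plus-minus concrete, here is a minimal sketch that treats it as a ridge regression on stint data. Everything in it is an assumption for illustration: the four players, the stint rows, the margins and the penalty value are made up, and real implementations use full 10-player stints, possession weights and far more data.

```python
# Toy sketch of regularized adjusted plus-minus as ridge regression.
# All players, stints, margins and the penalty are hypothetical.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ridge_ratings(stints, margins, penalty):
    """Solve (X'X + penalty * I) beta = X'y for one rating per player."""
    k = len(stints[0])
    xtx = [
        [
            sum(row[i] * row[j] for row in stints) + (penalty if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]
    xty = [sum(row[i] * y for row, y in zip(stints, margins)) for i in range(k)]
    return solve_linear(xtx, xty)


players = ["Player A", "Player B", "Player C", "Player D"]  # hypothetical
# One row per stint: +1 if the player is on the floor for the home side,
# -1 if on the floor for the away side.
stints = [
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [1, -1, -1, 1],
    [-1, 1, 1, -1],
]
# Home-side scoring margin per 100 possessions in each stint (made up).
margins = [6.0, 2.0, 4.0, -5.0]

ratings = ridge_ratings(stints, margins, penalty=2.0)
for name, rating in zip(players, ratings):
    print(f"{name}: {rating:+.2f}")
```

The penalty pulls every rating toward zero, which is one way to tame the season-to-season noise described above; a larger penalty shrinks the ratings more.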
2. How do basketball players age?
One of the most important discoveries of sabermetrics was that the peak age for baseball players was much younger than conventional wisdom held. For the most part, basketball research has found a prime similar to baseball's, with a peak age around 27. However, the answer depends on the metric of choice because different skills age differently; players' offensive rebounding declines from the time they enter the league, while their shooting accuracy improves. Defense might peak latest of all because of the importance of experience. The continued success of this generation's veterans, such as Kobe Bryant and Tim Duncan, also is forcing us to rethink how players decline in their 30s.
3. What is the best way to develop young players?
If there's one criticism fans always have of their favorite team's coach, it's that he doesn't give young players enough minutes. The assumption here is that time on the court is the only way for NBA players to develop, which makes sense intuitively but has never been demonstrated statistically, at least not publicly. This is one area in which future research could have a major impact on how teams play.
4. How do players' roles on offense affect their efficiency?
Usage versus efficiency is the fundamental question for rating offense, and it explains much of the difference among various rating systems (PER heavily rewards usage, while wins produced doesn't value it at all, and WARP falls somewhere in between). This has huge implications when it comes to comparing volume scorers with role players. Because players tend to shoot more when they have favorable matchups, it's impossible to use game-to-game performance to measure the trade-off between using more plays and efficiency. The best study, done by Houston Rockets analyst Eli Witus, looked at the performance of various lineups and found a change of 1.25 points of offensive rating for each percentage point of usage rate above or below league average. But that's not true for every player; stars such as Bryant are able to take more shots without much of a hit to their efficiency, while specialists such as Tyson Chandler are very sensitive to changes in their role.
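As a rough illustration of that trade-off, the sketch below applies a straight-line adjustment using the 1.25-point figure quoted above. The function name, the sample numbers and the direction of the effect (efficiency falling as usage rises) are assumptions for illustration, not a reproduction of the Witus study.

```python
# Linear usage-versus-efficiency sketch. The 1.25 default comes from the
# figure quoted in the text; everything else here is hypothetical.

def project_rating(observed_rating, observed_usage, new_usage, slope=1.25):
    """Project offensive rating at a new usage rate.

    Assumes rating falls by `slope` points for each percentage point of
    added usage, and rises by the same amount when usage drops.
    """
    return observed_rating - slope * (new_usage - observed_usage)


# A hypothetical role player: 110 offensive rating on 18 percent usage,
# asked to carry 22 percent of his team's plays.
print(project_rating(110.0, 18.0, 22.0))

# A smaller slope stands in for a star who can add shots more cheaply.
print(project_rating(110.0, 18.0, 22.0, slope=0.5))
```

Because the cost differs from player to player, the slope is left as a parameter rather than fixed at the lineup-level average.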
5. How much effect do players have on their teammates' statistics?
The notion that stars "make their teammates better" is probably as old as basketball. Now that we can track how players perform alongside different teammates, it has become statistically demonstrable. Elite passers and high scorers improve their teammates' efficiency by setting up easier shots. Teammates also matter a great deal on the defensive glass, where elite rebounders tend to take some rebounds from teammates who would have gotten them otherwise.
6. Do per-minute (or per-play) stats translate across changes in playing time?
Few contentions of the statistical community are more divisive than the new conventional wisdom that players are best measured on a per-minute basis (or, better yet, per play to account for pace). Again, it's difficult to separate players forced into bigger roles from those earning more minutes with their improved play. John Hollinger attempted to do that in Pro Basketball Prospectus 2002-03 by looking at players who moved up due to injury. My follow-up with Tom Ziller on Ballhype.com showed players actually improving with additional minutes, but there's still room for additional research on players going from, say, 20 minutes per game to 35.
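For readers unfamiliar with the conversion, here is a small sketch of per-minute and per-play rates. The 36-minute and 100-play bases are common conventions, and the two players and their totals are hypothetical.

```python
# Rate-stat sketch: scale raw totals to a common base so players with
# different roles can be compared. All sample totals are made up.

def per_36(total, minutes):
    """Scale a season total to a per-36-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * 36


def per_100_plays(total, plays):
    """Scale a total to a per-100-plays rate, which also adjusts for pace."""
    if plays <= 0:
        raise ValueError("plays must be positive")
    return total / plays * 100


# A hypothetical reserve and a hypothetical starter with the same scoring rate.
reserve = per_36(400, 800)
starter = per_36(900, 1800)
print(reserve, starter)
```

Both players come out at the same rate despite very different minutes, which is exactly the comparison the debate is about: whether the reserve would hold that rate in the starter's role.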
7. What role does coaching play in the success of teams and players?
Measuring coaching might be the last frontier for analytics. Dave Berri found in "Stumbling on Wins" that coaches show little difference in developing players. Of course, this is just one aspect of coaching. Including coaches as a sixth player in adjusted plus-minus calculations backed up the conventional wisdom that coaches have more impact at the defensive end. However, we've yet to make any real inroads in measuring late-game strategy or evaluating how well coaches put together lineups.
8. How do statistics translate from other leagues to the NBA?
Sabermetricians tackled minor league translations early in their field's development. Adjusting college stats for the pros is trickier because of the wide differences in strength of schedule, changes in player roles and the importance of age in development. Still, my NCAA translations have proved reasonably accurate when it comes to predicting first-year performance, and adding in age helps account for long-term potential. Hollinger and I have found translating European stats even more reliable, and D-League stats can be translated to the NBA as well.
9. What is the market value for player performance?
Converting player value into dollar figures is another topic that surely has gotten more attention inside front offices than in the public sphere. We touched on this topic recently in Per Diem, but there's plenty more to explore. It's unclear whether value and salary are linear, for example, or how the new collective bargaining agreement has affected teams' willingness to spend in free agency. Stay tuned.
10. How do we best predict the outcome of games or series?
Lastly, there's the matter of evaluating teams. At the most basic level, we know point differential predicts performance going forward better than win-loss record. Other factors muddy the water. Are recent results more predictive than those from early in the season? And how do we deal with teams such as Denver and Utah that consistently display significant home/road splits because they play at altitude? When it comes to picking playoff series, head-to-head results show some value above and beyond overall level of play. Also, rest is better than rust, although the magnitude of the difference is relatively small.
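One common way to turn point differential into an expected record is the Pythagorean expectation borrowed from baseball. The article does not name a formula, so this is a swapped-in illustration; the exponent of 14 is an assumed value in the range often quoted for the NBA, and the team totals are hypothetical.

```python
# Pythagorean expectation sketch. The formula is a stand-in chosen for
# illustration, and the exponent and team totals are assumptions.

def expected_win_pct(points_for, points_against, exponent=14.0):
    """Estimate winning percentage from points scored and allowed."""
    scored = points_for ** exponent
    allowed = points_against ** exponent
    return scored / (scored + allowed)


# A hypothetical team that has outscored opponents by two points per game
# over 50 games.
estimate = expected_win_pct(100.0 * 50, 98.0 * 50)
print(f"expected win pct: {estimate:.3f}")
print(f"expected wins over 82 games: {estimate * 82:.1f}")
```

A team whose actual record runs well ahead of this estimate is the kind of club point differential suggests will cool off.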
Of course, there are many more questions besides these that statistical analysis can help answer, some of which will be addressed by the papers presented this weekend during the Sloan conference. Still, even these 10 problems will give statistical analysts plenty to work on in years to come.