Hockey analytics are for anyone who wants them.
First things first: you can enjoy a hockey game with or without analytics. Knowing your team is getting horrendously outshot does not mean you can’t enjoy a gorgeous breakaway goal, or no-look pass, or a clean hip-check along the boards.
Analytics answer questions. They don’t necessarily tell us who will win, but they help us understand why a team might. Whether the top scorer in the league will get that many goals next season. Who ought to stay on a roster and who ought to go. Whether or not you should pick that guy up in your keeper league. It’s a place to begin.
A lot of hockey analytics work– though not all– is public. Corsica is public, and sometimes it even works on a cell phone. Hundreds of thousands of words are available for anyone with an internet connection to read, for free, whenever they want. Because all this data is public, anyone can get involved: you can track zone entries and exits, or passes, or argue about shot quality [1] on the internet.
It’s important to note that not everything you read will be accurate. Some will be outdated (remember when we were all hyped up about score close?) and some might be misleading. Older articles might have dead links, or reference research that isn’t publicly available anymore. But what’s there is thought-provoking, and having more people involved can only make the state of hockey analytics better.
You win Corsi. You don’t win the game. You win Corsi. You don’t win the game. Objectively, the best team does not always win [2].
That’s fine. Score-adjusted metrics, our current best measure [3], have an average R^2 for Corsi of .23 [4]. A top possession team in the NHL can miss the playoffs and recently has; my condolences to the Los Angeles Kings.
The basics: puck possession, the statistic you’ll see on Corsica or cited in tweets and articles, doesn’t actually measure time of possession. It’s a proxy [5]. The team that has the puck more often will get more shots, and Corsi counts shots on goal, missed shots, and blocked shots alike. A shot from the point is counted, and so is a soft lob from the neutral zone. Fenwick, the other one you’ve probably heard about, has one difference: it assumes shot blocking may be a skill, and therefore counts only shots on goal and missed shots.
Over a period of time, the team that outshoots– outpossesses– their opponents will outscore them [6].
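To make the counting concrete, here is a minimal sketch in Python. The `ShotCounts` class, the function names, and every number in it are made up for illustration; nothing here comes from a real game or a real stats site.

```python
from dataclasses import dataclass


@dataclass
class ShotCounts:
    """Shot attempts by one team. All numbers used below are invented."""
    on_goal: int   # shots on goal (saved or scored)
    missed: int    # attempts that missed the net
    blocked: int   # attempts blocked before reaching the net


def corsi(c: ShotCounts) -> int:
    # Corsi counts every attempt: on goal + missed + blocked.
    return c.on_goal + c.missed + c.blocked


def fenwick(c: ShotCounts) -> int:
    # Fenwick is the same count with blocked attempts left out.
    return c.on_goal + c.missed


def share(count_for: int, count_against: int) -> float:
    # Percentage of all attempts that belonged to "our" team.
    return 100.0 * count_for / (count_for + count_against)


home = ShotCounts(on_goal=30, missed=12, blocked=14)
away = ShotCounts(on_goal=25, missed=10, blocked=9)

print(corsi(home), corsi(away))                               # 56 44
print(round(share(corsi(home), corsi(away)), 1))              # 56.0
print(round(share(fenwick(home), fenwick(away)), 1))          # 54.5
```

In this invented game the home team has 56% of the attempts by Corsi and about 54.5% by Fenwick; the gap is entirely the blocked shots.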
Here’s why this is important: goals are pretty rare over a hockey game, and often pretty random. How many goals come off of weird deflections, or a #buttgoal, or a lucky bounce? Shots occur far more often than goals. Over 90% of shots are not goals. Because the sample for possession is larger, shot totals are more repeatable, which means they are more representative of a team’s or a player’s talent. This doesn’t mean goalscoring is not a talent, but that future even strength performance can be predicted more confidently using possession statistics.
Corsi and Fenwick are named for the people who came up with them, which manages to be confusing and simple at the same time, but eventually you just accept it. (Similarly, PDO– the sum of on-ice sv% and on-ice sh%– is named for the initials of the blogger who invented it. It’s an acronym for nothing).
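PDO is simple enough to sketch in a few lines. This version assumes both percentages are calculated from shots on goal and reports the sum on a scale where 100 means the two add up to exactly 100%. The function name and the numbers are invented for the example.

```python
def pdo(goals_for: int, shots_for: int,
        goals_against: int, shots_against: int) -> float:
    """On-ice shooting % plus on-ice save %, on a 0-to-100-ish scale.

    Assumes the shot counts are shots on goal while a player (or team)
    is on the ice.
    """
    on_ice_sh = goals_for / shots_for                # share of our shots that went in
    on_ice_sv = 1 - goals_against / shots_against    # share of their shots that were stopped
    return 100.0 * (on_ice_sh + on_ice_sv)


# Invented sample: 8 goals on 100 shots for, 6 goals on 100 shots against.
value = pdo(goals_for=8, shots_for=100, goals_against=6, shots_against=100)
print(round(value, 1))  # 102.0
```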
Possession isn’t the only statistic, and it’s generally used as an even strength stat. But it’s a real big one.
Your team is leading. Congrats! They are now, most likely, going to sit back in a defensive shell. (The turtle emoji is great for complaining about it on Twitter.)
When a team is leading, especially late in the third, they tend to turtle to varying degrees and protect the lead, with the idea that taking fewer risks will mean fewer goals against. If a team is trailing, they’ll attack more aggressively. If you’re already losing, what do you have to lose?
This is why adjusted possession metrics exist. Teams winning the game tend to get outshot; adjusting for this allows for more accurate predictions [7].
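One way to picture a score adjustment is as a weight on each shot attempt that depends on whether the shooting team was leading, tied, or trailing at the time. The sketch below only shows the shape of the idea. The weights are placeholders invented for this example, not the published values from any of the methods cited here, and the real adjustments are more detailed.

```python
# Hypothetical weights, keyed by the SHOOTING team's score state.
# Attempts by a leading team are rarer, so they count for a bit more;
# attempts by a trailing team are more common, so they count for a bit less.
WEIGHTS = {"leading": 1.10, "tied": 1.00, "trailing": 0.90}


def raw_share(attempts_for: dict, attempts_against: dict) -> float:
    total_for = sum(attempts_for.values())
    total_against = sum(attempts_against.values())
    return 100.0 * total_for / (total_for + total_against)


def adjusted_share(attempts_for: dict, attempts_against: dict) -> float:
    # Each dict maps the shooter's score state to a number of attempts.
    adj_for = sum(WEIGHTS[state] * n for state, n in attempts_for.items())
    adj_against = sum(WEIGHTS[state] * n for state, n in attempts_against.items())
    return 100.0 * adj_for / (adj_for + adj_against)


# Invented game: our team led for most of it and got outshot while ahead.
ours = {"leading": 20, "tied": 15, "trailing": 0}
theirs = {"leading": 0, "tied": 15, "trailing": 30}

print(round(raw_share(ours, theirs), 1))
print(round(adjusted_share(ours, theirs), 1))
```

With these made-up numbers the adjusted share comes out higher than the raw share, which is the point: the team that sat on a lead gets some credit back for the shots it gave up while protecting it.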
Sports are skill, and hard work, and luck. Baseball is lucky, and hockey is inherently more chaotic than baseball. Baseball is more static, more controlled, has position players, and even baseball encounters imperfect models.
Due to luck– unexplained variance, randomness, gripping the stick too tight– a player’s performance may not reflect their actual level of talent. Shooting 25% over 35 games likely doesn’t indicate true talent, but neither does shooting 5% over that same sample of games.
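A rough way to see regression at work is to pull an observed shooting percentage toward the league average, pulling harder when the sample is small. This sketch is one simple shrinkage estimate, not a published model. The league average (9%), the prior weight (200 shots), and the 80-shot sample over 35 games are all assumptions picked for illustration.

```python
LEAGUE_AVG_SH = 0.09   # assumed league-average shooting percentage
PRIOR_SHOTS = 200      # assumed weight of the prior, in "phantom" league-average shots


def regressed_sh(goals: int, shots: int) -> float:
    """Blend the observed results with PRIOR_SHOTS league-average shots."""
    return (goals + PRIOR_SHOTS * LEAGUE_AVG_SH) / (shots + PRIOR_SHOTS)


# Two invented players, each with 80 shots over 35 games.
hot = regressed_sh(goals=20, shots=80)    # shooting 25% so far
cold = regressed_sh(goals=4, shots=80)    # shooting 5% so far

print(round(100 * hot, 1), round(100 * cold, 1))
```

Under these assumptions the 25% shooter’s estimate lands around 13.6% and the 5% shooter’s around 7.9%: both are still on their own side of average, but much closer to it than the raw numbers suggest.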
I like this explanation from Chris Kreider on the Rangers’ up-and-down performance in the 2015-16 season: “The stats as of late are better than the underlying numbers. Before that, the underlying numbers were better than the stats. It’s just how it goes sometimes. You get to a point where the underlying numbers are just as good as the stats.” [8]
Again, knowing about regression doesn’t mean you can’t enjoy a hot streak. You should– hockey is fun, and watching pucks go in and points pile up for someone unexpected is super fun. But statistics, as a field, serve to evaluate past performance (who was the best team last year?) and predict future performance (who will be the best team next year?). Regression gives predictions more accuracy. It does not mean, especially in sports, that everything will come true.
You do not need to know R to understand hockey analytics. You don’t need to download it onto your laptop. R is a fantastic statistics and graphing tool, it’s free, and there are plenty of R cookbooks [9] out there. Lots of Twitter graphs come straight from R.
I recommend R if you want to make things yourself– but you don’t need it to start. Excel is honestly fine at the beginning, and many people are more comfortable with Excel. At a certain point when you’re in deep, yes, get R [10]. But you don’t necessarily need R to look at the numbers behind this year’s Norris candidates, and you don’t need R to understand those numbers.
What Matters (And Doesn’t)
In my opinion, researching what doesn’t matter is just as important as researching what does. As the field of hockey analytics advances, and the amount of available data increases, some of it will ultimately be discarded.
Plus/minus, to go for low-hanging fruit, used to be seen as valuable. As goaltending stats evolve, and we learn how save percentage is affected by a team’s defensive play, some of what used to matter won’t anymore. Deployment and zone starts used to be given more weight, too [11].
You might do a lot of work to prove something isn’t important. Still. That’s useful in the end. The point is to have better, more complete analysis on all sides.
The numbers are objective. You are not objective. The players themselves are not objective.
Hockey does not have a refined, all-encompassing wins above replacement statistic the way that baseball does. Furthermore, research in hockey analytics isn’t complete. We can explain some variance, sometimes.
Systems matter. A good player on Darryl Sutter’s Kings team will look a hell of a lot better than a good player on Dan Bylsma’s controlled-entry-averse Sabres. Teammates matter. Shackling Ryan McDonagh, who is good, to Dan Girardi, who is bad, makes Ryan McDonagh look worse. Goalies matter. Sorry about your on-ice save percentage, probably more defensemen than we can count.
Improvement matters. A rookie with a high shooting percentage and high number of secondary assists might win the Calder. He might not deserve it. Next season, he posts lower numbers, but he’s generating more shots. Those secondary assists weren’t repeatable [12], but long-term, those goals he scored? That production rate is more sustainable. That’s good.
The evaluation tools you choose matter. You might weight some statistics more heavily than others. Other people might weight those same numbers differently. Don’t use bad stats. Don’t use evaluation tools that have been flat-out disproved. Don’t, for the love of god, use plus/minus. Do think about whether you’re cherry-picking, whether you’re swayed by narrative, whether you’re biased toward recent performance or your favorite player.
Numbers also introduce objectivity. In a league where draft picks come from college, the CHL, Europe– where teams have a height bias [13] and grit is included in the rankings– quantitative analysis aids prospect rankings and helps add depth to scouting [14]. Still, sometimes there’s noise.
Start with a game.
Megan Richardson says: “My recommended first dive is always a single game. Track data, put it in Excel, play with some formulas, and form some conclusions about linemates/opposing players/luck or patterns of entries/shots, just because it’s so overwhelming and EVERYONE does laborious analysis-intensive stuff.”
Going for the cliche, you don’t need to start by reinventing the wheel. There’s a place for replicating experiments now that we have better data. You don’t have to do new research right away. Watching games more closely, tracking games, and testing what you see can guide you at the start.
Furthermore, you do not have to limit yourself to NHL games and NHL data. The NHL currently has the most data available, and the most research, for even-strength play. The NWHL has shot and passing data for players and teams [15]; the CWHL does not. The NCAA has individual team data [16], but not individual player shot data beyond total shot attempts. The CHL has spotty data, which varies between leagues. There are international tournaments for both the men’s and women’s national teams, the SHL, the KHL– there are millions of hours of games. It’d be incredibly valuable to see if NHL-proven assumptions hold in the AHL, the ECHL, or any of the developmental junior leagues.
If you want to get the hang of tracking games for shot attempts, it’s worth trying an archived game first.
You can volunteer to track for any of the various projects out there, such as the Passing Project, or track neutral zone data [17]; there’s also tons of work to be done on special teams play. It’s practically guaranteed there will be other fans out there interested in the results.
Don’t be afraid to ask questions or make mistakes. Imperfect models can still teach us something of value [18]. Not every single thing you do will be important or correct, but at the absolute minimum, one person will learn from it, and that’s great.
In the End
You don’t need to pay attention to advanced stats to enjoy a game. Not everyone does, or will, and that’s fine. Advanced stats enhance understanding. They explain. They help predict. Still, the best team doesn’t always win.
Analytics in hockey aren’t yet at the same place as baseball. Everyone is learning together. Challenge your preconceived notions! Who knows– it might even turn out to be fun.
References
1. Tulsky, Eric. “How to Bring Shot Quality into Player Evaluation.” SB Nation. December 12, 2013. Accessed March 28, 2016. http://www.sbnation.com/nhl/2013/12/12/5114366/nhl-stats-shot-quality-player-evaluation.
2. JLikens. “How Often Does the Best Team Win?” Objective NHL. June 25, 2011. Accessed March 28, 2016. http://objectivenhl.blogspot.ca/2011/06/how-often-does-best-team-win.html.
3. McCurdy, Micah. “Better Way to Compute Score-Adjusted Fenwick.” HockeyViz. October 2014. Accessed March 28, 2016. http://hockeyviz.com/txt/shiftsArticle/senstats.html.
4. Beckman, Robert. “Improving Our Score Adjustment.” Puck On Net. November 15, 2014. Accessed March 28, 2016. http://puckon.net/articles/improving-our-score-adjustment.php.
5. Wilson, Kent. “Advanced Stats Crash Course: Possession (Part 1).” NHLNumbers. December 27, 2013. Accessed March 28, 2016. http://nhlnumbers.com/2013/12/27/advanced-stats-crash-course-possession.
6. JLikens. “The Relationship Between Outshooting and Outscoring over Time.” Objective NHL. February 20, 2010. Accessed March 28, 2016. http://objectivenhl.blogspot.ca/2010/02/relationship-between-outshooting-and.html.
7. Tulsky, Eric. “Adjusting for Score Effects to Improve Our Predictions.” Broad Street Hockey. January 23, 2012. Accessed March 28, 2016. http://www.broadstreethockey.com/2012/1/23/2722089/score-adjusted-fenwick.
8. Caldwell, Dave. “Former BC Star Has Been Working toward Degree as NHL Career Takes off.” Boston.com. March 16, 2016. Accessed March 28, 2016. https://www.boston.com/sports/boston-bruins/2016/03/16/former-bc-star-has-been-working-toward-degree-as-nhl-career-takes-off.
9. Chang, Winston. “Cookbook for R.” Cookbook for R. Accessed March 28, 2016. http://www.cookbook-r.com/.
10. Russ, Brice. “How To Use R For Sports Stats, Part 1: The Absolute Basics.” TechGraphs. July 27, 2015. Accessed March 28, 2016. http://www.fangraphs.com/techgraphs/how-to-use-r-for-sports-stats-part-1-the-absolute-basics/.
11. McCurdy, Micah. “Shift Starts and Ends, Part 2.” HockeyViz. September 3, 2015. Accessed March 28, 2016. http://hockeyviz.com/txt/shiftsArticle/shifts2.html.
12. Tulsky, Eric. “Simplify Scoring: Drop the Pointless Secondary Assist.” Broad Street Hockey. May 15, 2011. Accessed March 28, 2016. http://www.broadstreethockey.com/2011/3/15/2046512/simplify-scoring-drop-secondary-assists.
13. Richardson, Megan. “Decoding the Draft: The Effect of Height on Draft Position.” ShinnyStats. September 10, 2014. Accessed March 28, 2016. https://shinnystats.wordpress.com/2014/09/10/decoding-the-draft-the-effect-of-height-on-draft-position/.
14. Pronman, Corey. “From Daigle To Datsyuk: Analyzing Draft Selections.” Hockey Prospectus. October 12, 2012. Accessed March 28, 2016. http://www.hockeyprospectus.com/puck/article.php?articleid=1394.
15. Wilke, Carolyn. NWHL Shot Data. Raw data. https://twitter.com/Classlicity/status/675828720242835456.
16. “2015-16 NCAA Team Statistics.” CHN News. Accessed March 28, 2016. http://www.collegehockeynews.com/stats/#adv.
17. Garik16. “A Guide to Neutral Zone Tracking Part 2 of 2: How to Track the Neutral Zone.” Hockey Graphs. January 06, 2016. Accessed March 28, 2016. https://hockey-graphs.com/2016/01/06/a-guide-to-neutral-zone-tracking-part-2-of-2-how-to-track-the-neutral-zone/.
18. Cameron, Dave. “Nate Silver and Imperfect Modeling.” FanGraphs Baseball. November 7, 2012. Accessed March 28, 2016. http://www.fangraphs.com/blogs/nate-silver-and-imperfect-modeling/.
1 thought on “1-2-3 Let’s Have Fun: Getting Started With Hockey Analytics”
Not sure where to post this, or if it will even be seen here, but…
Can someone explain how Columbus has the second best xGF% in the East, ahead of Tampa and just behind Dallas? That’s gotta be a bug in the data right? When looking at that team, from roster construction, to possession stats, to the eyeball test, they just are not good. Very not good. I’m curious what is going on there.