Tuesday, November 5, 2013

Where Chris Boyle's Shot Quality work can and probably can't be useful.

Chris Boyle has been slowly unveiling the results of his shot quality tracking for Sportsnet over the past few weeks.  Like a lot of smart people, Boyle has recognized an area where the NHL is extremely poor at tracking something in today's hockey games - in this case, where on the ice shots on goal are taken from - and has attempted to fix the situation by watching games himself and tracking where each shot comes from.  Boyle had previously shown that the NHL shot trackers are hopelessly incompetent, and this is a pretty nice way to try and get better data.

So Boyle has tracked a ton of hockey games and marked down where shots have come from - and, perhaps even more interestingly, what types of shots these shots were (Clean shots, Shots off of passes, shots off of deflections, and shots off of rebounds).  This is all very cool stuff and can be useful if we have a large enough sample and we apply it in the proper places.

Unfortunately, so far Boyle hasn't quite done that.  After finding the average SV%s of each type of shot, Boyle today posted his results for several goaltenders, focusing mainly upon James Reimer.  Boyle finds that Reimer's first season's results may have been skewed in Reimer's favor by the sheer # of clean shots Reimer faced.  In other words, Reimer faced low quality shots more often than average, resulting in an inflated SV% that we might expect to drop as Reimer faces higher quality shots in the future.  So far, this is pretty interesting stuff, especially when we're dealing with goalie sample sizes as small as a single season.

Reimer's season in 2012-2013 was different as per Boyle's findings - Reimer featured a much tougher shot selection, but still had way above average results, particularly against rebound shots.  The money graphic is this one:

According to Boyle, League average rebound SV% (at 5 on 5) is .760.  After two years of being below average at this mark, Reimer actually hit a rebound SV% of .844 in the last 1196 shots.  And here is where Boyle goes way off the deep end:

His clean save percentage remains fairly static, but his rebound and transition rates continue to improve.
All of this could be small-sample magic, but it is consistent in regards to most young goaltenders’ learning curves. The biggest adjustment a goaltender makes during the adjustment period to the NHL is the speed of the game. When overwhelmed by the pace, he is late tracking the puck and will trail the play. When that happens, he must rely on reflex and reaction. That leads him to play with less control and means more of a struggle to anticipate and read the play.

As a young goalie gains experience, he relies less on reflex and reaction saves because he tracks the puck better. This allows him to skate ahead of the play and employ a game plan. Reimer has quieted his game and with that his deflection and rebound save percentages have risen thanks to better positioning.

The Reimer of 2010-11 could not have survived the onslaught of rebound opportunities he faced in ’12-13—he saw twice the rebound opportunities, yet maintained the exact same goal rate.

.....

With a strong start to 2013-14, Reimer’s numbers have climbed to a point where he is not just a reliable NHL starter, but a good one. Even if he remains weak in transition, maintaining a dominant performance when faced with rebound shots will push him above the middle class.
Now Boyle - to his credit - gets it right in the final few paragraphs of the piece, but here he's gone incredibly wrong: He's assuming that goalies improve in a certain way (apparently due to his own past experience) and that this is the cause here, rather than noting how improbable this improvement actually is.  Quoting the correct passages later:

" Only five of the 30 goaltenders studied have been able to register a +.800 on rebounds and two of them (Price and Lundqvist) couldn’t maintain it the following season"
In other words, Reimer's save % is based upon an incredibly fluky rebound save % over a small sample that is unlikely to continue!  That's not a sign that Reimer has improved and can now handle rebounds in a way he couldn't back then!  By the way, the converse here is also true: it's certainly possible Reimer's poor rebound sv% back in 2010-2011 was ALSO fluky in the reverse direction!
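To put a rough number on "fluky," here is a minimal sketch that asks how often a goalie with exactly league-average rebound talent would post a rebound SV% at least as good as Reimer's purely by chance.  The .760 and .844 figures are the ones quoted from Boyle above; the rebound shot counts (50, 100, 200) are my own hypothetical guesses, since the number of rebound shots isn't given here, and treating every shot as an independent coin flip is a simplifying assumption.

```python
from math import comb


def prob_at_least(saves: int, shots: int, p: float) -> float:
    """Probability of making at least `saves` saves on `shots` shots
    when each shot is stopped independently with probability `p`."""
    return sum(
        comb(shots, k) * p**k * (1 - p) ** (shots - k)
        for k in range(saves, shots + 1)
    )


LEAGUE_REBOUND_SV = 0.760    # league average quoted from Boyle
OBSERVED_REBOUND_SV = 0.844  # Reimer's rebound SV% quoted from Boyle

# Hypothetical rebound-shot counts: the real number is not given in this post.
for shots in (50, 100, 200):
    saves = round(OBSERVED_REBOUND_SV * shots)
    chance = prob_at_least(saves, shots, LEAGUE_REBOUND_SV)
    print(f"{shots} rebound shots: P(at least {saves} saves) = {chance:.3f}")
```

The smaller the rebound sample, the easier it is for an average goalie to land on a number like .844.  And keep in mind that Boyle studied 30 goaltenders, so even a fairly unlikely result for one goalie is a lot less surprising when you get 30 tries at it.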

None of this is to say that Reimer may not be an above average goaltender.  But none of the data presented by Boyle makes that clear to us any more than the classic save percentage measure.  This is because the data doesn't solve the issue of showing us whether something is REPEATABLE or just an instance of play that is unlikely to continue (as is likely the case with Reimer).

BUT:  BOYLE'S WORK CAN BE INCREDIBLY USEFUL

I hope Boyle will go into this in the future, because as I hint at above, this isn't to say his work doesn't add something.  For one, knowing that Toronto faced harder shot quality last year than 2 years prior is certainly fascinating given the Leafs vs Analytics war that's been going on.

But the information is still useful as a goaltender evaluation tool (even beyond simply helping team coaches work with goalies on weak areas).  Again, Boyle's data suggests that Reimer's 2010-2011 season #s are inflated and were not likely to repeat, and that's extremely big information that could help us smoke out goalies who are the next Steve Mason as opposed to the next Henrik Lundqvist. 

Essentially, this is the equivalent to baseball stats like batted ball stats (LD%, GB%, FB%) or BABIP, which - if they were reliable (Batted Ball Stats sadly are less reliable than one would like) - would help us know when certain players are getting lucky or they're actually managing to show something that is repeatable skill. 

Here it's unlikely that Boyle's #s can really show something as repeatable skill more so than our existing measures.  But they can show when certain results are fluky and bear watching more closely than we could before - instead of simply saying that goalies after X amount of shots regress Y amount, we can see whether a goalie's #s are particularly likely to regress because the shot quality he faced is itself likely to regress.  That's what's genuinely interesting here.  And that's why I hope Boyle releases the raw data and am still interested in the project.
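As a rough illustration of that idea, here is a sketch of a shot-mix adjustment: take the league-average SV% for each shot type and weight it by the shots a goalie actually faced, which gives the SV% an average goalie would post against that same workload.  Only the .760 rebound figure comes from Boyle; the other three league averages and both workloads below are made-up placeholders, and this is my own simple weighting, not necessarily how Boyle does it.

```python
# League-average SV% by shot type. Only the rebound figure (.760) comes from
# Boyle's piece; the other three are made-up placeholders for illustration.
LEAGUE_SV = {"clean": 0.950, "pass": 0.900, "deflection": 0.850, "rebound": 0.760}


def expected_sv(shots_faced: dict[str, int], league_sv: dict[str, float]) -> float:
    """SV% a league-average goalie would post against this mix of shots."""
    total = sum(shots_faced.values())
    expected_saves = sum(n * league_sv[kind] for kind, n in shots_faced.items())
    return expected_saves / total


# Two hypothetical workloads of 1,000 shots each.
easy_mix = {"clean": 800, "pass": 100, "deflection": 60, "rebound": 40}
hard_mix = {"clean": 650, "pass": 170, "deflection": 100, "rebound": 80}

for name, mix in (("easy", easy_mix), ("hard", hard_mix)):
    print(f"{name} mix: expected SV% = {expected_sv(mix, LEAGUE_SV):.3f}")
```

A goalie whose raw SV% sits well above the expected number for his mix is either good or lucky; a goalie whose raw SV% is propped up mostly by an easy mix is the one we'd flag as likely to come back down.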

Tuesday, September 24, 2013

Is Linus Omark the next Michael Grabner?

Picture this:  A European player is drafted by an NHL team.  He spends the age 19-22 seasons in lower hockey leagues, but performs excellently in them.  Finally he gets a chance in the big show, and in fact, in the small sample size, performs very solidly - not putting up numbers like a star, but putting up a decent number of points while being a very solid possession player. 

However, his team tires of him for some reason or another, and he performs poorly in training camp/preseason in his next NHL shot.  As a result of his poor performance in camp, the European is waived - unthinkable to fans who remember him as a pretty good prospect of the team. 

Sound familiar?  That MIGHT sound like the story of Linus Omark, the Oilers prospect who was just waived today.  But that IS the story of Michael Grabner, the Islander who was waived by the Florida Panthers after a horrible camp in 2010-2011, after being basically included as a throw-in in a trade by the Canucks the offseason before. 

Grabner, of course, would put up solid possession #s, good penalty killing #s, and 34 goals (the most for an NHL rookie) in that very same season he was waived in camp.  While he's certainly not a star, he's definitely a pretty good player at this point and it seems extremely weird to think of him being given away for free by the Panthers.

Linus Omark is not the same type of player as Michael Grabner.  Grabner played multiple seasons in the AHL as a fairly good goal scorer (30 goals in 08-09, on pace for 30 goals in 09-10 had he played a full AHL season); Omark has played overseas instead and, while playing very successfully in Sweden, the KHL, the Swiss league, and hell even a tiny bit in the AHL, is much more of a playmaker than a goal scorer. 

Yet their stories are very very similar.  Omark has been successful in every league but the NHL, and even had one NHL season where he played quite well - just like Grabner had 20 games in Vancouver where his performance was quite solid.  And now Omark has been waived because he had a horrible training camp/preseason and the Oilers don't seem to think he belongs on the roster.

The Grabner story makes me wonder if this is a really stupid mistake.  Don't get me wrong, I don't disbelieve that Omark has been bad in camp/preseason.  Even the shot numbers, as quoted by Michael Parkatti in this tweet, suggest his performance in the games has been poor.  And training camp/preseason should be a place - small sample as it is - to evaluate players as they fit on your team.
But it should be FAR from the end-all-be-all.  Grabner probably did have a bad training camp with the Panthers that year when he was waived - he certainly looked poor his first few games with the Isles.   But the Isles could afford to keep him on that year and eventually the guy who played well enough in lower leagues played excellently in the big leagues.  And certainly the Panthers could've afforded to be patient that year too - that wasn't exactly a loaded team.  A bad training camp can sometimes mean that a guy just got off to a bad start - it doesn't mean he's necessarily unable to handle the NHL for your team.  If your team can afford patience with such a player - either because you're loaded at other positions or because you're so bare bones that you need lots of scrappy forwards - why not take a chance on a guy who has good #s before the preseason?

Now look at the Oilers.  The forward line is a mess due to injuries and well....roster management.  Their current likely wings include: Jesse Joensuu, Ryan Smyth, Ryan Hamilton, Steve MacIntyre, Ryan Jones, and Mike Brown - with Joensuu somehow a possibility for 2nd line wing.  Are you saying that the Oilers could be worse with Omark in there instead of any of those guys?  Omark outproduced Joensuu in the AHL in the limited time the two were both in that league, for instance.

I suspect the Oilers will luck out and teams will continue to ignore solid players/prospects who are waived - Keith Aucoin for one just cleared waivers after a very capable season as the Isles' 3rd line center (why the Oilers didn't claim Aucoin who would fit one of their needs incredibly well, I have no idea).  But the Omark situation shows that a team is placing way too much emphasis on a training camp that is only 3 weeks long instead of a history that is years of success.  When a team can afford to play the long game, there's no excuse for that.  And that will hurt the Oilers' attempts to climb back into relevance. 



Monday, August 5, 2013

The dangers of using video in hockey analysis.

Tyler Dellow (@mc79hockey if for some reason you don't know this) recently made a post about the use of video analysis in conjunction with statistical analysis.  But Tyler's post regarding this, along with some of his twitter comments, gave me pause, because I do think he's making a crucial mistake with regards to HOW you can use this analysis.

Dellow's been doing a lot of work going through individual shifts of the Oilers, particularly before won and lost faceoffs in each zone, as a way of diagnosing why the Oilers' possession #s collapsed last year.  One thing he noted (on twitter) was the following:

Nashville were the NHL's best team when they lost an NZ draw this year. Not a lot of places to make a play here: pic.twitter.com/x54Wim3XY0

Weird thing is that NSH was bad at this for years then good this year. Still better to win an NZ draw.
 Alas, these tweets don't seem to be linkable in any active form, but for now google's cache still contains them here (http://webcache.googleusercontent.com/search?q=cache:CNZ02Yls2ogJ:https://twitter.com/mc79hockey/statuses/359712573161103360+&cd=7&hl=en&ct=clnk&gl=us&client=firefox-a). 

Dellow followed this up with a post about this topic - Nashville's success on neutral zone faceoff losses - and on the value of video analysis as an accompaniment to statistical analysis.  To quote Dellow, so I'm not making any misrepresentations here:

I’m fooling around with some data, which I don’t propose to discuss in any detail in this post. I want to make a point about something though: the intersection of data and video when it comes to understanding how things go.

Nashville was excellent last year after they lost a neutral zone faceoff. Edmonton struggled when they won one. Simple thing to do? Look at them and see what’s going on. I pulled a collection of Nashville neutral zone faceoff losses against Edmonton to examine and it’s kind of amazing how easy it is for even someone like me, a non-expert in technical hockey matters to see what sort of a scheme the Preds are running.

[VIDEO]

...

Also of note: you can see what an advantage having the faceoff just outside the Predator blue is, as the puck can just be dumped in before the Preds can get their structure set. I love seeing stuff like this – for all the talk about systems and technical stuff in hockey, I’ve found it pretty easy to understand what a team is doing whenever you see a bunch of clips of a given situation lined up like this. The difficulty lies in the fact that you don’t get to see twenty identical situations played back to back in a game, which makes it harder to spot this stuff.
LINK:  http://www.mc79hockey.com/?p=6238

Now, here's the issue I have here.  Dellow is seeing a statistical result: that Nashville is good when they lose Neutral Zone faceoffs.  He's then going to video and seeing what they are doing.  He's next IMPLYING that this strategy/system is the cause of the results.  The problem here is that assuming causation here simply...doesn't work.  In order to show causation we'd first need to see whether the result (Good at NZ Faceoff Losses) is a consistent result in the first place - which may be questionable given that Nashville didn't have that result in the past mind you.  Then we'd have to have some way of comparing the results from Nashville with and without that strategy/system.  And even then we'd have issues.   

In short, there's nothing making such a video analysis different from the type of analysis talked about by Cam Charron in a post today

It’s tough to blame any particular Bruin on that play. Nobody seems in good position but that could just be sheer fatigue. On some goals there’s a player on the team that was scored on that makes some grave error, one repeated maybe a dozen times by various players throughout the game but only noticed on an instance where it pops up as being evidenced in a replay on a goal against.
 
I’ve been a little curious as to “analysis by replay” and have had thoughts in the past to record which scoring chances for and against I tracked were shown again on replays. A goal or a not, a bad outlet pass that results in a two-on-one against is a bad outlet pass that results on a two-on-one against, and the more fans get to see a particular player’s mistakes, the more likely they are to be convinced that the defenceman is mistake-prone.

The beauty of hockey lies in its randomness, that marvellous things happen that we have no way of expecting. It’s the sort of pleasure you derive from the game when you actually watch it, and something you’re attempting to match when you’re catching up on junior camps throughout the summer because you miss the distinctive smell of hockey sweat mixed with artificial ice. But those hours of hanging around rinks aren’t going to make you any better at understanding the game. Non-hockey concepts can have value in hockey, and it’s worthwhile to occasionally step away from the sport, and rather than focusing on a specific random event, learn to modify our expectations for the improbable and unlikely by determining what’s random and what’s talent. That goes not only for hockey, but also life.

Again, even before we can test causation, we need to test that the result is real and not simply randomness.  That's not done here.  And yet the implication is made. 
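To show why that first step matters, here is a small simulation sketch.  Every team is given the exact same underlying ability after a lost neutral zone draw, so any gap between the "best" and "worst" team is pure luck.  The setup (30 teams, 150 lost draws each, a 50/50 chance the next shot attempt goes your way) is entirely hypothetical and isn't drawn from Dellow's data.

```python
import random


def simulate_league(n_teams: int, n_events: int, true_rate: float,
                    rng: random.Random) -> list[float]:
    """Observed success rate for each team when every team has the SAME
    underlying true_rate, so all differences between teams are pure luck."""
    rates = []
    for _ in range(n_teams):
        successes = sum(rng.random() < true_rate for _ in range(n_events))
        rates.append(successes / n_events)
    return rates


rng = random.Random(2013)  # fixed seed so the run is repeatable
# Hypothetical setup: 30 teams, 150 lost neutral zone draws each, and a
# coin-flip chance that the next shot attempt goes the losing team's way.
rates = simulate_league(n_teams=30, n_events=150, true_rate=0.5, rng=rng)
print(f"best team: {max(rates):.3f}, worst team: {min(rates):.3f}")
```

Somebody always finishes first in a table like this, even when nobody is actually better.  If you then pulled video of that team, you'd have no trouble finding a "system" to credit for it.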

Now Dellow knows this is a problem - so he's denied this is the implication he's making on twitter.  Fine, if you say so.  But here's the issue with this - simply knowing the strategy/system a team is playing isn't particularly useful for analysis if we don't know the consistent results.  

Think about it - we have results - Nashville is good after losing neutral zone faceoffs.  We have what they did in those situations (Assuming for now this was the consistent strategy all year for Nashville).  What does the video therefore tell us from an analytical standpoint?  The answer is basically nothing - because as presented, you can't show from the video that certain systems/strategies/plays are what are causing the results.

I mean it's nice to know how a game will play out, and what the opposing styles of each team are is an interesting thing for a viewer to know.  But from the perspective of analytics, it doesn't help us to simply know "Team X runs System A" or "Team Y runs System B" if we don't know what the results of those systems are going to be.  And knowing the results of a sample doesn't necessarily tell us what the "system" used in those samples has accomplished (or failed to accomplish).

---------------------------------------------------------------------------------------------------------

To explain in a better way, there is a clear place for video analysis as an addition to statistical analysis.  If you can show that one team is consistently good at some form of play, then go to replay and see how they're executing it, and show that there is a pattern showing how certain execution of a system results in that great performance, well then you've made a case for one way for other teams to improve or what may be a better system than what other teams are running. 

But arguably, this is the backwards method of analysis.  You have a result and you're searching for an explanation, which means you're predisposed to find something in the video that may or may not actually be there.  That's what Cam is talking about in the post above.

A better form of analysis would be to see what type of system teams are running and then try and find the results of those systems to find the pros and cons of those systems.  By not having a preconceived finding you don't have a bias that colors what you're seeing.

Mind you, this is difficult to do because it's unlikely that we're really going to be going into any video analysis blind. 

------------------------------------------------------------------------------------------------------

Again, video analysis can be cool.  I enjoy reading Justin Bourne's system analysis column - even outside the humorous text (which is hilarious btw) and you can easily learn about different types of systems teams use by doing such analyses. 

But for actual analysis of performance - for finding out how teams can improve and how players fit in, this type of analysis is heavily limited unless you take certain steps, steps which Dellow in his post is failing to mention.  And well, we're concerned with analysis, are we not? 

At least I am. 

Tuesday, July 30, 2013

The Oilers and reasonable expectations

I don't follow many people on twitter - as of today I only follow 37 people.  But I do keep track of a number of other people on twitter, particularly those in the sabermetrics and hockey analytics community.  A good # of those people, including one person who I follow (Jonathan Willis), are Oilers fans.  This shouldn't be a surprise to anyone - the Oilogosphere (as I believe it's called) is heavily active online and features some of the brighter analytical thinkers out there for Hockey. 

That said, I do think there's a pattern out there in the Oiler blogosphere, at least in certain parts, where people who should know better have higher than reasonable expectations for the team to begin the season. 

What touched this off by the way, though a similar idea has been in my head for a bit, was Willis taking issue with Rob Vollman labeling the Oilers as likely 30th next year.  Now I don't agree a lot with Vollman's methodology, and I don't know what he used here.  But let's begin by saying such a prediction isn't TERRIBLY unreasonable.  Again, last year's Oilers were:

28th in Fenwick Close (a miserable 44.48%) and 29th in Corsi tied (at 44.5% per HockeyAnalysis).  This was not a good team, despite the solid goaltending of Devan Dubnyk and the 3 #1 overall picks.
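For anyone new to these numbers, here is a quick sketch of what a Fenwick percentage is: the team's share of all unblocked shot attempts (shots on goal plus missed shots), with "Close" meaning only score-close situations are counted.  The totals plugged in below are hypothetical and are not the Oilers' real counts.

```python
def fenwick_pct(shots_for: int, misses_for: int,
                shots_against: int, misses_against: int) -> float:
    """Share of unblocked shot attempts (shots on goal + missed shots)
    that belong to the team. Blocked attempts are left out, which is
    what separates Fenwick from Corsi."""
    attempts_for = shots_for + misses_for
    attempts_against = shots_against + misses_against
    return 100 * attempts_for / (attempts_for + attempts_against)


# Hypothetical score-close totals, not the Oilers' real counts.
print(f"{fenwick_pct(600, 250, 750, 300):.2f}%")
```

Anything under 50% means the other team is getting more of the unblocked attempts than you are.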

Who's returning from that team?  (Relying upon this: http://oilersnation.com/2013/7/8/the-edmonton-oilers-today-tomorrow/)
First Line: Hall-RNH-Eberle - Completely the same, although RNH may miss some time.
2nd  Line: Yakupov-Gagner
3rd   Line: Hemsky
4th   Line: Smyth, Jones
Extras: Brown
D: Smid-Petry
D: Schultz-Schultz
D: Potter

So you have a decent amount of turnover, but a large number of players from the awful Oilers team of last year ARE coming back.  Of course, saying that is missing part of the picture.  RNH should be better if he recovers fine from surgery, as should Hall (a scary thought).  Same for Yakupov of course.  On the other hand, that 4th line is still bad.  And well, the D was pretty awful, and 5 of those guys are coming back.  And only Justin Schultz should really be expected to be a little better.

So again, despite the presence of 3 top picks, this isn't exactly a strong core.  How about the additions?  Well on D, the Oilers clearly have made a bunch of additions:

Andrew Ference is probably the most high profile, and as noted on Copper & Blue, he's just not...very good.  Some of the negative parts of his relative corsi undoubtedly come from the fact that the #1 D on Boston was a guy named Chara, who he didn't play with.*  On the other hand, the last two years, his relative corsi is more than a little bit bad - it's pretty darn bad.  He's not exactly being buried in the D Zone either (Neutral Zone Starts).   In the small sample of the playoffs last year, he was easily Boston's worst D Man in scoring chances.

*Tyler Dellow has argued that in addition, Ference's time with Chara was handicapped by playing off-hand, further limiting his relative corsi.  Of course, the negatives of playing off-hand are at this point purely anecdotal (SOMEBODY RESEARCH THIS) and I can point to plenty of examples of off-hand players performing quite excellently.  

Denis Grebeshkov is another addition - a D man who went to Russia after a few years in the NHL.  He had two good years for the Oilers under MacTavish, and then had a bad year where he was eventually shipped to Nashville.  From there he went to the KHL, where his career is interesting.   After a few years of high usage suggesting he was a top KHL D man, Grebeshkov's TOI dropped with his team SKA until this year, when he was traded to Yugra.  In Yugra, his minutes returned.  I'd guess he was probably considered a borderline top pair KHL D man in Russia, but it's hard to tell from that kind of time.  As for the NHL - it's hard to say - is he the 2009-2010 guy or the 2008-2009 guy?  The latter guy won't help the Oilers too much.  Grebeshkov is also going to be 30, so he's not going to get much better.

The other addition is Anton Belov, another KHL D man.  Belov played a ton of KHL minutes, and has a pretty praising scouting report up on Oilers Nation.  On the other hand, Derek Zona's recently tweeted out D pairs suggested he was the current 6th DMan or fighting for that spot with Nick Schultz.  There's a lot of uncertainty there, and he could be good.



Finally, Oskar Klefbom is the wild card.  Pronman projects him as a #2 D Man.  If he was to play, we'd probably figure he's more likely to be a #4 type guy right now.  But that's still an improvement.  Of course, it's hard to see where he cracks the lineup right away. 

So the Oilers will add potentially 3 guys whose abilities are highly unclear but could range from good 2nd pair to meh 3rd pair at best and a meh player in Andrew Ference.  Other than Ference - a bad signing - these are actually smart moves for the Oilers.  But the range of expectations is wide here.  It's entirely possible that Grebeshkov and Belov bust, and then you're left again with the same bad Oilers D.  So it's not like you can't see the D of the Oilers leading to another poor finish.

How about the forwards?

The Oilers have made 2 "major" additions at Forward.  The first, a swap of Paajarvi for David Perron, seems like an improvement (although Paajarvi could've improved as well).  Perron is a solid possession forward (although some argument can be made others were driving the bus) and adds some points.  The 2nd line last year was dreadful for the Oilers, despite the supposed high quality of Gagner, but at the very least Yakupov should get better and Perron should be an improvement at wing.  Is that enough? Hard to tell.

The other is Boyd Gordon, a defensive center to replace the failure of Eric Belanger.  Of course, Belanger was once thought to be a pretty good defensive center whose signing was celebrated at Copper and Blue.  Meanwhile, it's unclear how good such guys are at consistently maintaining good performance in defensive minutes.  Look at Gordon - in 12-13 his D possession #s are terrific.  In 11-12 they're poor.  In 10-11 they're good in easier minutes.  In 09-10 they're horrible.  The odds of Gordon being another Belanger seem awfully high.

A few others were added of course.  Jesse Joensuu was an enigma on the Isles, and now he's one for the Oilers, for example. 

CONCLUSION:

The Oilers this offseason, with the exception of Ference (and the signing of LaBarbera to be the backup, an excellent signing), went for the unknown, taking a # of gambles on D improvements from the KHL, but didn't really improve at forward.  When your additions are like this however, your range of outcomes is wide.  A bottom 5 finish is very plausible.  So is a middle of the pack finish.

But I've forgotten one element.  Coaching. 

Willis argues, as have several other Oilers fans, that Krueger was the major problem last year, and that his failure to line match and his flawed strategies were responsible for the gigantic step back.  I'm far from convinced.

For one, line matching competition wise doesn't seem to be very influential on results - coaches simply have little ability to get players out against specific opponents (particularly on the road), resulting in each player playing similar loads of competition.  This is why the gaps in QOC metrics are at most 4 shots, and usually within 2 if not lower.  That's not much.  ZONE matching does seem more effective, but I've seen nothing to suggest Eakins is going to make this extreme.  So perhaps we get a marginal benefit here by getting line 1 out in the O Zone more often (on the other hand, this isn't going to help the extremely poor other lines). 

For another, the suggestion that the players' step back was due to Krueger seems to imply that a coach's impact is HUGE: able to turn a just sub-average team into a bottom 3 team.  This seems incredibly unlikely, and would make certain coaches dramatically underpaid.

Yes, both C&B and Dellow have used video to show that certain plays seem particularly poorly designed, or that the strategies used were unconventional (C&B at one point argued there was a defensive strategy by Krueger which no one in the NHL used).  On the other hand, we don't have data showing that such poor looking plays were the results of the coach's strategies, as opposed to the players being poor.  Or that certain plays were destined to get poor results.  We simply do NOT have the data to conclude that Krueger's coaching was to blame, despite the results.    (How to separate coaching from roster talent is a difficult question).

Finally, as hinted previously, like the players, it's not a guarantee that Eakins is a good coach at all, or even better than Krueger.  Yes the Marlies had good results under him.  That's not a guarantee of anyone being a good coach.  Krueger may not have been as high profile for instance, but he was once considered a pretty good coach in waiting.

Jonathan Willis stated on twitter that, and I'm quoting to avoid misquoting here:
Shot differential collapsed in 2012-13 despite an improved roster, and it seems a safe bet to me that Dallas Eakins will reverse that trend.
The problem is that the only proof of 12-13 being an improved roster is well, the presupposition that those players were better.  That seems unclear at best - they certainly performed a lot worse!  Why is that on the coach and not the players?  Moreover, Eakins is not a safe bet - he's an unknown.  Like the rest of the Oilers moves this offseason.

The Oilers certainly could be in the middle of the pack next year.  Or they could be in the bottom 5 yet again.  Nothing is certain about this team, and thus a prediction of utter atrocity is not exactly unreasonable.