There seems to be an unwritten rule in baseball – not on the field, but in the stands, at home, in the press box, etc.

“You can’t criticize a manager’s decision if it doesn’t directly affect the outcome of the game, if it appears to ‘work’, or if the team goes on to win the game despite the decision.”

That’s ridiculous of course. The outcome of a decision or the game has nothing to do with whether the decision was correct or not. Some decisions may raise or lower a team’s chances of winning from 90% and other decisions may affect a baseline of 10 or 15%.

If decision A results in a team’s theoretical chances of winning of 95% and decision A, 90%, obviously A is the correct move. Choosing B would be malpractice. Equally obvious is if manager chooses B, an awful decision, he is still going to win the game 90% of the time, and based on the “unwritten rule” we rarely get to criticize him. Similarly, if decision A results in a 15% win expectancy (WE) and B results in 10%, A is the clear choice, yet the team still loses most of the time and we get to second guess the manager whether he chooses A or B. All of that is silly and counter-productive.

If your teenager drives home drunk yet manages to not kill himself or anyone else, do you say nothing because “it turned out OK?” I hope not. In sports, most people understand the concept of “results versus process” if they are cornered into thinking about it, but in practice, they just can’t bring themselves to accept it in *real time*. No one is going to ask Terry Collins in the post-game presser why he didn’t pinch hit for DeGrom in the 6^{th} inning – no one. The analyst – a competent one at least – doesn’t give a hoot what happened after that. None whatsoever. He looks at a decision and if it appears questionable at the time, he tries to determine what the average consequences are – with all known data at the time the decision is made – *with *the decision or with one or more alternatives. That’s it. What happens after that is irrelevant to the analyst. For some reason this is a hard concept for the average fan – the average person – to apply. As I said, I truly think they understand it, especially if you give obvious examples, like the drunk driving one. They just don’t seem to be able to break the “unwritten rule” in practice. It goes against their grain.

Well, I’m an analyst and I don’t give a flying ***k whether the Mets won, lost, tied, or Wrigley Field collapsed in the 8^{th} inning. The “correctness” of the decision to allow DeGrom to hit or not in the top of the 6^{th}, with runners on second and third, boiled down to this question and this question only:

“What is the average win expectancy (WE) of the Mets with DeGrom hitting and then pitching some number of innings and what is the average WE with a pinch hitter and someone else pitching in place of DeGrom?”

Admittedly the gain, if there is any, from making the decision to bring in a PH and reliever or relievers must be balanced against any known or potential negative consequences for the Mets not related to the game at hand. Examples of these might be: 1) limiting your relief possibilities in the rest of the series or the World Series. 2) Pissing off DeGrom or his teammates for taking him out and thus affecting the morale of the team.

I’m fine with the fans or the manager and coaches including these and other considerations in their decision. I am *not* fine with them making their decision *not* knowing how it affects the win expectancy of the game at hand, since that is clearly the most important of the considerations.

My guess is that if we asked Collins about his decision-making process, and he was honest with us, he would *not* say, “Yeah, I knew that letting him hit would substantially lower our chances of winning the game, but I also wanted to save the pen a little and give DeGrom a chance to….” I’m pretty sure he thought that with DeGrom pitching well (which he usually does, by the way – it’s not like he was pitching well-above his norm), his chances of winning were *better* with him hitting and then pitching another inning or two.

At this point, and before I get into estimating the WE of the two alternatives facing Collins, letting DeGrom hit and pitch or pinch hitting and bringing in a reliever, I want to discuss an important concept in decision analysis in sports. In American civil law, there is a thing called a *summary judgment*. When a party in a civil action moves for one, the judge makes his decision based on the known facts and *assuming controversial facts and legal theories in a light most favorable to the non-moving party.* In other words, if everything that the other party says is true *is *true (and is not already known to be false) and the moving party would still win the case according to the law, then the judge must accept the motion and the moving party wins the case without a trial.

When deciding whether a particular decision was “correct” or not in a baseball game or other contest, we can often do the same thing in order to make up for an imperfect model (which all models are by the way). You know the old saw in science – *all models are wrong, but some are useful*. In this particular instance, we don’t know for sure how DeGrom will pitch in the 6^{th} and 7^{th} innings to the Cubs order for the 3^{rd} time, we don’t know for how much longer he *will* pitch, we don’t know how well DeGrom will bat, and we don’t know who Collins can and will bring in.

I’m not talking about the fact that we don’t know whether DeGrom or a reliever is going to give up a run or two, or whether he or they are going to shut the Cubs down. That is in the realm of “results-based analysis” and I‘ve already explained how and why that is irrelevant. I’m talking about what is DeGrom’s true talent, say in runs allowed per 9 facing the Cubs for the third time, what is a reliever’s or relievers’ true talent in the 6^{th} and 7^{th}, how many innings do we estimate DeGrom will pitch on the average if he stays in the game, and what is his true batting talent.

Our estimates of all of those things will affect our model’s results – our estimate of the Mets’ WE with and without DeGrom hitting. But what if we assumed everything in favor of keeping DeGrom in the game – we looked at all controversial items in a light most favorable to the non-moving party – and it was still a clear decision to pinch hit for him? Well, we get a summary judgment! Pinch hitting for him would clearly be the correct move.

There is one more caveat. If it is true that there are indirect negative consequences to taking him out – and I’m not sure that there are – then we also have to look at the magnitude of the gain from taking him out and then decide whether it is *worth it*. In order to do that, we have to have some idea as to what is a small and what is a large advantage. That is actually not that hard to do. Managers routinely bring in closers in the 9^{th} inning with a 2-run lead, right? No one questions that. In fact, if they didn’t – if they regularly brought in their second or third best reliever instead, they would be crucified by the media and fans. How much does bringing in a closer with a 2-run lead typically add to a team’s WE, compared to a lesser reliever? According to *The Book*, an elite reliever compared to an average reliever in the 9^{th} inning with a 2-run lead adds around 4% to the team’s WE. So we know that 4% is a big advantage, which it is.

That brings up another way to account for the imperfection of our models. The first way was to use the “summary judgment” method, or assume things most favorable to making the decision that we are questioning. The second way is to simply estimate everything to the best of our ability and then look at the magnitude of the results. If the difference between decision A and B is 4%, it is *extremely unlikely* that any reasonable tweak to the model will change that 4% to 0% or -1%.

In this situation, whether we assume DeGrom is going to pitch 1.5 more innings or 1.6 or 1.4, it won’t change the results much. If we assume that DeGrom is an average hitting pitcher or a poor one, it won’t change the result all that much. If we assume that the “times through the order penalty” is .25 runs or .3 runs per 9 innings, it won’t change the results much. If we assume that the relievers used in place of DeGrom have a true talent of 3.5, 3.3, 3.7, or even 3.9, it won’t change the results all that much. Nothing can change the results from 4% in favor of decision A to something in favor of decision B. 4% is just too much to overcome even if our model is not completely accurate. Now, if our results assuming “best of our ability estimates” for all of these things yield a 1% advantage for choosing A, then it is entirely possible that B is the real correct choice and we might defer to the manager in case he knows some things that we don’t or we simply are mistaken in our estimates or we failed to account for some important variable.

Let’s see what the numbers say, assuming “average” values for all of these relevant variables and then again making reasonable assumptions in favor of allowing DeGrom to hit (assuming that pinch hitting for him appears to be correct).

What is the win expectancy with DeGrom batting. We’ll assume he is an average-hitting pitcher or so (I have heard that he is a poor-hitting pitcher). An average pitcher’s batting line is around 10% single, 2% double or triple, .3% HR, 4% BB, and 83.7% out. The average WE for an average team leading by 1 run in the top of the 6^{th}, with runners on second and third, 2 outs, and a *batter with this line*, is…..

63.2%.

If DeGrom were an automatic out, the WE would be 59.5%. That is the average WE leading off the bottom of the 6^{th} with the visiting team winning by a run. So an average pitcher batting in that spot adds a little more than 3.5% in WE. That’s not wood. What if DeGrom were a poor hitting pitcher?

Whirrrrr……

62.1%.

So whether DeGrom is an average or poor-hitting pitcher doesn’t change the Mets’ WE in that spot all that much. Let’s call it 63%. That is reasonable. He adds 3.5% to the Mets’ WE compared to an out.

What about a pinch hitter? Obviously the quality of the hitter matters. The Mets have some decent hitters on the bench – notably Cuddyer from the right side and Johnson from the left. Let’s assume a league-average hitter. Given that, the Mets’ WE with runners on second and third, 2 outs, and a 1-run lead, is 68.8%. A league-average hitter adds over 9% to the Mets’ WE compared to an out. **The difference between DeGrom as a slightly below-average hitting pitcher and a league-average hitter is 5.8%.** **That means, unequivocally, assuming that our numbers are reasonably accurate, that letting DeGrom hit cost the Mets almost 6% in their chances of winning the game.**

That is enormous of course. Remember we said that bringing in an elite reliever in the 9^{th} of a 2-run game, as compared to a league-average reliever, is worth 4% in WE. You can’t really make a worse decision as a manager than reducing your chances of winning by 5.8%, unless you purposely throw the game. But, that’s not nearly the end of the story. Collins presumably made this decision thinking that DeGrom pitching the 6^{th} and perhaps the 7^{th }would more than make up for that. Actually he’s not quite thinking, “Make up for that.” He is not thinking in those terms. He does not know that letting him hit “cost 5.8% in win expectancy” compared to a pinch hitter. I doubt that the average manager knows what “win expectancy” means let alone how to use it in making in-game decisions. He merely thinks, “I really want him to pitch another inning or two, and letting him hit is a small price to pay,” or something like that.

So how much does he gain by letting him pitch the 6^{th} and 7^{th} rather than a reliever. To be honest it is debatable whether he gains anything at all. Not only that, but if we look back in history to see how many innings starters end up pitching, *on the average*, in situations like that, we will find that it is *not* 2 innings. It is probably not even 1.5 innings. He was at 82 pitches through 5. He may throw 20 or 25 pitches in the 6^{th} (like he did in the first), in which case he may be done. He may give up a base runner or two, or even a run or two, and come out in the 6^{th}, perhaps before recording an out. At best, he pitches 2 more innings, and once in a blue moon he pitches all or part of the 8^{th} I guess (as it turned out, he pitched 2 more effective innings and was taken out after seven). Let’s assume 1.5 innings, which I think is generous.

What is DeGrom’s expected RA9 for those 2 innings? He has pitched well thus far but not spectacularly well. In any case, there is no evidence that pitching well through 5 innings tells us anything about how a pitcher is going to pitch in the 6^{th} and beyond. What is DeGrom’s normal expected RA9? Steamer, ZIPS and my projection systems say about 83% of league-average run prevention. That is equivalent to a #1 or #2 starter. It is equivalent to an elite starter, but not quite the level of the Kershaw’s, Arrieta’s, or even the Price’s or Sale’s. Obviously he could turn out to be better than that – or worse – but all we can do in these calculations and all managers can do in making these decisions is use the best information and the best models available to estimate player talent.

Then there is the “times through the order penalty.” There is no reason to think that this wouldn’t apply to DeGrom in this situation. He is going to face the Cubs for the third time in the 6^{th} and 7^{th} innings. Research has found that the third time through the order a starter’s RA9 is .3 runs worse than his overall RA9. So a pitcher who allows 83% of league average runs allows 90% when facing the order for the 3^{rd} time. That is around 3.7 runs per 9 innings against an average NL team.

Now we have to compare that to a reliever. The Mets have Niese, Robles, Reed, Colon, and Gilmartin available for short or long relief. Colon might be the obvious choice for the 6^{th} and 7^{th} inning, although they surely could use a combination of righties and lefties, especially in very high leverage situations. What do we expect these relievers’ RA9 to be? The average reliever is around 4.0 to start with, compared to DeGrom’s 3.7. If Collins uses Colon, Reed, Niese or some combination of relievers, we might expect them to be better than the average NL reliever. Let’s be conservative and assume an average, generic reliever for those 1.5 innings.

How much does that cost the Mets in WE? To figure that, we take the difference in run prevention between DeGrom and the reliever(s), multiply by the game leverage and convert it into WE. The difference between a 3.7 RA9 and a 4.0 RA9 in 1.5 innings is .05 runs. The average expected leverage index in the 6^{th} and 7^{th} innings where the road team is up by a run is around 1.7. So we multiply .05 by 1.7 and convert that into WE. The final number is .0085, or less than 1% in win expectancy gained by allowing DeGrom to pitch rather than an average reliever.

That might shock some people. It certainly should shock Collins, since that is presumably his reason for allowing DeGrom to hit – he really, really wanted him to pitch another inning or two. He presumably thought that that would give his team a *much* better chance to win the game as opposed to one or more of his relievers. I have done this kind of calculation dozens of times and I *know* that keeping good or even great starters in the game for an inning or two is not worth much. For some reason, the human mind, in all its imperfect and biased glory, overestimates the value of 1 or 2 innings of a pitcher who is “pitching well” as compared to an “unknown entity” (of course we know the expected performance of our relievers almost as well as we know the expected performance of the starter). It is like a manager who brings in his closer in a 3-run game in the 9^{th}. He thinks that his team has a *much* better chance of winning than if he brings in an inferior pitcher. The facts say that he is wrong, but tell that to a manager and see if he agrees with you – he won’t. Of course, it’s not a matter of opinion – it’s a matter of fact.

Do I need to go any further? Do I need to tweak the inputs? Assuming average values for the relevant variables yields a loss of over 5% in win expectancy by allowing DeGrom to hit. What if we *knew *that DeGrom were going to pitch *two *more innings rather than an average of 1.5? He saves .07 runs rather than .05 which translates to 1.2% WE rather than .85%, which means that pinch hitting for him increases the Mets’ chances of winning by 4.7% rather than 5.05%. 4.7% is still an enormous advantage. *Reducing* your team‘s chances of winning by 4.7% by letting DeGrom hit is criminal. **It’s like pinch hitting Jeff Mathis for Mike Trout in a high leverage situation – twice!**

What about if our estimate of DeGrom’s true talent is too conservative? What if he is as good as Kershaw and Arrieta? That’s 63% of league average run prevention or 2.6 RA9. Third time through the order and it’s 2.9. The difference between that and an average reliever is 1.1 runs per 9, which translates to a 3.1% WE difference in 1.5 innings. **So allowing Kershaw to hit in that spot reduces the Mets chances of winning by 2.7%.** That’s not wood either.

What if the reliever you replaced DeGrom with was a replacement level pitcher – the worst pitcher in the major leagues? He allows around 113% league average runs, or 4.6 RA9. Difference between DeGrom and him for 1.5 innings? 2.7% for a net loss of 3.1% by letting him hit rather than pinch hitting for him and letting the worst pitcher in baseball pitch the next 1.5 innings? **If you told Collins, “Hey genius, if you pinch hit for Degrom and let the worst pitcher in baseball pitch for another inning and a half instead of DeGrom, you will increase your chances of winning by 3.1%,” what do you think he would say?**

What if DeGrom were a good hitting pitcher? What if….?

You should be getting the picture. Allowing him to hit is *so* costly, assuming reasonable and average values for all the pertinent variables, that even if we are missing something in our model, or some of our numbers are a little off – even if assume everything in the best possible light of allowing him to hit – the decision is a no-brainer in favor of a pinch hitter.

If Collins truly wanted to give his team the best chance of winning the game, or in the vernacular of ballplayers, *putting his team in the best position to succeed*, the clear and unequivocal choice was to lift DeGrom for a pinch hitter. It’s too bad that no one cares because the Mets ultimately won the game, which they were going to do at least 60% of the time anyway, regardless of whether Collins made the right or wrong decision.

The biggest loser, other than the Cubs, is Collins (I don’t mean he is a *loser*, as in the childish insult), because every time you use *results* to evaluate a decision and the results are positive, you deprive yourself of the opportunity to learn a valuable lesson. In this case, the analysis could have and should have been done before the game even started. All managers should know the importance of bringing in pinch hitters for pitchers in high leverage situations in important games, no matter how good the pitchers are or how well they are pitching in the game so far. Maybe someday they will.

Excellent breakdown. Probably best I’ve ever seen. Small nitpick but I think when you say “The average expected leverage index in the 6th and 7th innings where the road team is up by a run is around 1.7. So we multiply .05 by 1.5 and convert that into WE”, it should say multiply by 1.7 and not 1.5.

Fixed. Thanks!

Really interesting read. Very in depth and though provoking. One other minor error, in the fourth paragraph you list decision A twice rather than decision A and decision B.

Great work. Fun to read.

*thought