In *The Book: Playing the Percentages in Baseball*, we found that when a batter pinch hits against *right-handed* *relief* pitchers (so there are no familiarity or platoon issues), his wOBA is 34 points (10%) worse than when he starts and bats against relievers, after adjusting for the quality of the pitchers in each pool (PH or starter). We called this the *pinch hitting penalty*.

We postulated that the reason for this was that a player coming off the bench in the middle or towards the end of a game is not as physically or mentally prepared to hit as a starter who has been hitting and playing the field for two or three hours. In addition, some of these pinch hitters are not starting because they are tired or slightly injured.

We also found no evidence that there is a “pinch hitting skill.” In other words, there is no such thing as a “good pinch hitter.” If a hitter has had exceptionally good (or bad) pinch hitting stats, it is likely that that was due to chance alone, and thus it has no predictive value. The best predictor of a batter’s pinch-hitting performance is his regular projection with the appropriate penalty added.

We found a similar situation with designated hitters. However, their penalty was around half that of a pinch hitter, or 17 points (5%) of wOBA. Similar to the pinch hitter, the most likely explanation for this is that the DH is not as physically (and perhaps mentally) prepared for each PA as a player who is constantly engaged in the game. As well, the DH may be slightly injured or tired, especially if he is normally a position player. It makes sense that the DH penalty would be less than the PH penalty, as the DH is more involved in a game than a PH. Pinch hitting is often considered “the hardest job in baseball.” The numbers suggest that that is true. Interestingly, we found a small “DH skill” such that different players seem to have more or less of a true DH penalty.

Andy Dolphin (one of the authors of *The Book*) revisited the PH penalty issue in this Baseball Prospectus article from 2006. In it, he found a PH penalty of 21 points in wOBA, or 6%, significantly less than what was presented in *The Book* (34 points).

Tom Thress, on his web site, reports a PH penalty of .009 in “player won-loss records” (offensive performance translated into a “w/l record”), which he says is similar to that found in *The Book* (34 points). However, he finds an even larger DH penalty of .011 wins, which is more than twice that which we presented in *The Book*. I assume that .011 is slightly larger than 34 points in wOBA.

So, everyone seems to agree that there is a significant PH and DH penalty. However, there is some disagreement as to the magnitude of each (with empirical data, we can never be sure anyway). I am going to revisit this issue by looking at data from 1998 to 2012. The method I am going to use is the “delta method,” which is common in this kind of “either/or” research with many player seasons, where the number of opportunities (in this case, PA) in each “bucket” can vary greatly both within a player (for example, a player may have 300 PA in the “either” bucket and only 3 PA in the “or” bucket) and from player to player.

The “delta method” looks something like this: Let’s say that we have 4 players (or player seasons) in our sample, and each player has a certain wOBA and number of PA in bucket A and in bucket B, say, DH and non-DH – the number of PA are in parentheses.

| | wOBA as DH | wOBA as Non-DH |
|---|---|---|
| Player 1 | .320 (150) | .330 (350) |
| Player 2 | .350 (300) | .355 (20) |
| Player 3 | .310 (350) | .325 (50) |
| Player 4 | .335 (100) | .350 (150) |

…

In order to compute the DH penalty (difference between when DH’ing and playing the field) using the “delta method,” we compute the difference for each player separately and take a weighted average of the differences, using the lesser of the two PA (or the harmonic mean) as the weight for each player. In the above example, we have:

((.330 – .320) * 150 + (.355 – .350) * 20 + (.325 – .310) * 50 + (.350 – .335) * 100) / (150 + 20 + 50 + 100)

If you didn’t follow that, that’s fine. You’ll just have to trust me that this is a good way to figure the “average difference” when you have a bunch of different player seasons, each with a different number of opportunities (e.g. PA) in each bucket.
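For readers who prefer code, the weighted average above can be sketched in Python. This is only a minimal illustration using the four example players from the table. The `delta_method` function and its `weight` parameter are names invented for this sketch, not part of any library. It weights by the lesser of the two PA by default, with the harmonic mean as an option.

```python
def delta_method(players, weight="min"):
    """Weighted average of (non-DH wOBA - DH wOBA) across players.

    players: list of (woba_dh, pa_dh, woba_non_dh, pa_non_dh) tuples.
    weight:  "min" uses the lesser of the two PA for each player;
             "harmonic" uses the harmonic mean of the two PA.
    """
    total_diff = 0.0
    total_weight = 0.0
    for woba_dh, pa_dh, woba_non_dh, pa_non_dh in players:
        if weight == "min":
            w = min(pa_dh, pa_non_dh)
        elif weight == "harmonic":
            w = 2 * pa_dh * pa_non_dh / (pa_dh + pa_non_dh)
        else:
            raise ValueError("weight must be 'min' or 'harmonic'")
        total_diff += (woba_non_dh - woba_dh) * w
        total_weight += w
    return total_diff / total_weight


# The four example players from the table above.
example = [
    (.320, 150, .330, 350),
    (.350, 300, .355, 20),
    (.310, 350, .325, 50),
    (.335, 100, .350, 150),
]

dh_penalty = delta_method(example)
# Roughly 12 points of wOBA for these four example players.
print(round(dh_penalty * 1000, 1))
```

Note how Player 2, with only 20 PA in one bucket, contributes very little to the average, which is the whole point of weighting by the smaller sample.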

In addition to figuring the PH and DH penalties (in various scenarios, as you will see), I am also going to look at some other interesting “penalty situations” like playing in a day game after a night game, or both games of a double header.

In my calculations, I adjust for the quality of the pitchers faced, the percentage of home and road PA, and the platoon advantage between the batter and pitcher. If I don’t do that, it is possible for one bucket to be inherently more hitter-friendly than the other bucket, either by chance alone or due to some selection bias, or both.

First let’s look at the DH penalty. Remember that in *The Book*, we found a roughly 17 point penalty, and Tom Thress found a penalty that was greater than that of a PH, presumably more than 34 points in wOBA.

Again, my data was from 1998 to 2012, and I excluded all inter-league games. I split the DH samples into two groups: One group had more DH PA than non-DH PA in each season (they were primarily DH’s), and vice versa in the other group (primarily position players).

**The DH penalty was the same in both groups – 14 points in wOBA.**

The total sample sizes were 10,222 PA for the primarily DH group and 32,797 for the mostly non-DH group. If we combine the two groups, we get a total of 43,019 PA. That number represents the total of the “lesser of the PA” for each player season. One standard deviation in wOBA for that many PA is around 2.5 wOBA points. For the difference between two groups of 43,000 each, it is 3.5 points (the square root of the sum of the variances). So we can say with 95% confidence that the true DH penalty is between 7 and 21 points with the most likely value being 14. This is very close to the 17 point value we presented in *The Book*.
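The interval arithmetic in the paragraph above can be sketched as follows. The per-PA standard deviation is not stated anywhere in this article; the code backs it out from the 2.5-point figure at 43,019 PA, which is an assumption of the sketch. The function names are hypothetical.

```python
import math

TOTAL_PA = 43_019      # combined "lesser of the PA" sample
SD_ONE_GROUP = 2.5     # stated SD, in wOBA points, for that many PA

# Assumption: back out the implied per-PA SD (in wOBA points) from the stated figure.
sd_per_pa = SD_ONE_GROUP * math.sqrt(TOTAL_PA)


def sd_of_difference(pa, per_pa_sd=sd_per_pa):
    """SD of the difference between two buckets of `pa` PA each, in wOBA points."""
    sd_one = per_pa_sd / math.sqrt(pa)
    # Square root of the sum of the two (equal) variances.
    return math.sqrt(sd_one ** 2 + sd_one ** 2)


def interval(penalty, sd, z=2.0):
    """Approximate 95% interval: the observed penalty plus or minus z SDs."""
    return penalty - z * sd, penalty + z * sd


sd_diff = sd_of_difference(TOTAL_PA)
low, high = interval(14.0, sd_diff)
# About 3.5 points for the SD, and an interval of about 7 to 21 points.
print(round(sd_diff, 1), round(low), round(high))
```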

I expected that the penalty would be greater for position players who occasionally DH’d rather than DH’s who occasionally played in the field. That turned out not to be the case, but given the relatively small sample sizes, the true values could very well be different.

Now let’s move on to pinch hitter penalties. I split those into two groups as well: One, against starting pitchers and the other versus relievers. We would expect the former to show a greater penalty since a “double whammy” would be in effect – first, the “first time through the order” penalty, and second, the “sitting on the bench” penalty. In the reliever group, we would only have the “coming in cold” penalty. I excluded all ninth innings or later.

**Versus starting pitchers only, the PH penalty was 19.5 points** in 8,523 PA. One SD is 7.9 points, so the 95% confidence interval is a 4 to 35 point penalty.

**Versus relievers only, the PH penalty was 12.8 points** in 17,634 PA. One SD is 5.5 points – the 95% confidence interval is a 2 to 24 point penalty.

As expected, the penalty versus relievers, where batters typically face the pitcher for the first and only time in the game whether they are in the starting lineup or pinch hitting, is less than that versus the starting pitcher, by around 7 points. Again, keep in mind that the sample sizes are small enough that the *true* starter PH penalty and reliever PH penalty could be the same, or the difference could even be reversed. Of course, our *prior* when applying a Bayesian scheme is that there is a strong likelihood that the true penalty is larger against starting pitchers for the reason explained above. So it is likely that the true difference is similar to the one observed (a 7-point greater penalty versus starters).
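To put a rough number on how uncertain that 7-point gap is, here is a minimal sketch that combines the two standard deviations quoted above. It treats the two PH samples as independent, which is an assumption of the sketch rather than something established in the data.

```python
import math

# Observed penalties and SDs (in wOBA points) quoted above.
ph_vs_starters, sd_starters = 19.5, 7.9
ph_vs_relievers, sd_relievers = 12.8, 5.5

gap = ph_vs_starters - ph_vs_relievers

# Assumption: independent samples, so the variances add.
sd_gap = math.sqrt(sd_starters ** 2 + sd_relievers ** 2)

z = gap / sd_gap
low, high = gap - 2 * sd_gap, gap + 2 * sd_gap

# The gap is well under one SD from zero, and the 2 SD interval spans zero.
print(round(gap, 1), round(sd_gap, 1), round(z, 2), round(low, 1), round(high, 1))
```

On the data alone, then, the gap cannot be distinguished from zero; it is the prior expectation of a larger penalty against starters that makes the observed difference believable.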

Notice that my numbers indicate penalties of a similar magnitude for pinch hitters and designated hitters. The PH penalty is a little higher than the DH penalty when pinch hitters face a starter, and a little lower than the DH penalty when they face a reliever. I expected the PH penalty to be greater than the DH penalty, as we found in *The Book*. Again, these numbers are based on relatively small sample sizes, so the *true* PH and DH penalties could be quite different.

| Role | Penalty (wOBA) |
|---|---|
| DH | 14 points |
| PH vs. Starters | 20 points |
| PH vs. Relievers | 13 points |

…

Now let’s look at some other potential “penalty” situations, such as the second game of a double-header and a day game following a night game.

**In a day game following a night game, batters hit 6.2 wOBA points worse than in day games after day games or day games after not playing at all the previous day.** The sample size was 95,789 PA. The 95% confidence interval is 1.5 to 11 points.

What about when a player plays both ends of a double-header (no PH or designated hitters)? Obviously many regulars sit out one or the other game – certainly the catchers.

**Batters in the second game of a twin bill lose 8.1 points of wOBA compared to all other games.** Unfortunately, the sample is only 9,055 PA, so the 2 SD interval is -7.5 to 23.5. If 8.1 wOBA points (or more) is indeed reflective of the true double-header penalty, it would be wise for teams to sit some of their regulars in one of the two games – which they do of course. It would also behoove teams to make sure that their two starters in a twin bill pitch with the same hand in order to discourage fortuitous platooning by the opposing team.

Finally, I looked at games in which a player *and his team* (in order to exclude times when the player sat because he wasn’t 100% healthy) did not play the previous day, versus games in which the player had played at least 8 days in a row. I am looking for a “consecutive-game fatigue” penalty and those are the two extremes. I excluded all games in April and all pinch-hitting appearances.

**The “penalty” for playing at least 8 days in a row is 4.0 wOBA points in 92,287 PA**. One SD is 2.4 so that is not a statistically significant difference. However, with a Bayesian prior such that we *expect* there to be a “consecutive-game fatigue” penalty, I think we can be fairly confident with the empirical results (although obviously there is not much certainty as to the magnitude).
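One way to see why a directional expectation matters here is to compare a two-sided and a one-sided test on the same result. This is only a sketch under a normal approximation, not a full Bayesian analysis, and the `normal_tail` helper is a name made up for the illustration.

```python
import math


def normal_tail(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


penalty, sd = 4.0, 2.4   # observed penalty and one SD, in wOBA points

z = penalty / sd
p_one_sided = normal_tail(z)    # only asks "is there a penalty?"
p_two_sided = 2 * p_one_sided   # asks "is there any difference at all?"

# z is under 2, so the two-sided test misses the usual 5% threshold,
# while the one-sided probability comes out at a bit under 5%.
print(round(z, 2), round(p_one_sided, 3), round(p_two_sided, 3))
```

A one-sided reading is only defensible because the direction of the effect was expected before looking at the data.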

To see whether the consecutive day result is a “penalty” or the day off result is a bonus, I compared them to all other games.

**When a player and his team have had a day off the previous day, the player hits .1 points better than otherwise in 115,471 PA (-4.5 to +4.5). Without running the “consecutive days off” scenario, we can infer that there is an observed penalty when playing at least 8 days in a row, of around 4 points, compared to all other games (the same as compared to after an off-day).**

**So having a day off is not really a “bonus,” but playing too many days in a row creates a penalty.** It probably behooves all players to take an occasional day off. Players like Cal Ripken, Steve Garvey, and Miguel Tejada (and others) may have had substantially better careers had they been rested more, at least rate-wise.

I also looked at players who played in fewer days in a row (5, 6, and 7) and found penalties of *less than 4 points*, suggesting that **the more days in a row a player plays, the more his offense is penalized.** It would be interesting to see if a day off after several days in a row restores a player to his normal offensive levels.

There are many other situations where batters and pitchers may suffer penalties (or bonuses), such as the game(s) after coming back from the DL, getaway games (where the home team leaves for another venue afterward), Sunday night games, etc.

Unfortunately, I don’t have the time to run all of these potentially interesting scenarios – and I have to leave *something* for aspiring saberists to do!

**Addendum**: Tango Tiger suggested I split the DH into “versus relievers and starters.” I did not expect there to be a difference in penalties since, unlike a PH, a DH faces the starter the same number of times as when he isn’t DH’ing. However, I found a penalty difference of 8 points – **the DH penalty versus starters was 16.3 and versus relievers, it was 8.3.** Maybe the DH becomes “warmer” towards the end of the game, or maybe the difference is a random, statistical blip. I don’t know. We are often faced with these conundrums (what to conclude) when dealing with limited empirical data (relatively small sample sizes). Even if we are statistically confident that an effect exists (or doesn’t), we are usually quite uncertain as to the magnitude of that effect.

I also looked at getaway night games (where the home team goes on the road after the game). It has long been postulated that the home team does not perform as well in these games. **Indeed, the home team batter penalty in these games was 1.6 wOBA points**, again, not a statistically significant difference, but consistent with the Bayesian prior. Interestingly, the road team batters performed .6 points *better*, suggesting that home team pitchers in getaway games might have a small penalty as well.