Archive for the ‘Pitching’ Category

Yesterday I looked at how and whether a hitter’s season-to-date stats can help us estimate his rest-of-season performance, over and above a credible, up-to-date mid-season projection. Obviously the answer to that depends on the quality of the projection – specifically how well it incorporates the season-to-date data in the projection model.

For players who were having dismal performances after the first, second, third, all the way through the fifth month of the season, the projection accurately predicted the last month’s performance and the first 5 months of data added nothing to the equation. In fact, those players who were having dismal seasons so far, even into the last month of the season, performed fairly admirably the rest of the way – nowhere near the level of their season-to-date stats. I concluded that the answer to the question, “When should we worry about a player’s especially poor performance?” was, “Never. It is irrelevant other than how it influences our projection for that player, which is not much, apparently.” For example, full-time players who had a .277 wOBA after the first month of the season, were still projected to be .342 hitters, and in fact, they hit .343 for the remainder of the season. Even halfway through the season, players who hit .283 for 3 solid months were still projected at .334 and hit .335 from then on. So, ignore bad performances and simply look at a player’s projection if you want to estimate his likely performance tomorrow, tonight, next week, or for the rest of the season.

On the other hand, players who have been hitting well above their mid-season projections (crafted after and including the hot hitting) actually outhit their projections by anywhere from 4 to 16 points, still nowhere near the level of their “hotness,” however. This suggests that the projection algorithm is not handling recent “hot” hitting properly – at least my projection algorithm. Then again, when I looked at hitters who were projected at well above average 2 months into the season, around .353, the hot ones and the cold ones each hit almost exactly the same over the rest of the season, equivalent to their respective projections. In that case, how they performed over those 2 months gave us no useful information beyond the mid-season projection. In one group, the “cold” group, players hit .303 for the first 2 months of the season, and they were still projected at .352. Indeed, they hit .349 for the rest of the season. The “hot” batters hit .403 for the first 2 months, they were projected to hit .352 after that and they did indeed hit exactly .352. So there would be no reason to treat these hot and cold above-average hitters any differently from one another in terms of playing time or slot in the batting order.

Today, I am going to look at pitchers. I think the perception is that because pitchers get injured more easily than position players, learn and experiment with new and different pitches, often lose velocity, their mechanics can break down, and their performance can be affected by psychological and emotional factors more easily than hitters, that early or mid-season “trends” are important in terms of future performance. Let’s see to what extent that might be true.

After one month, there were 256 pitchers or around 1/3 of all qualified pitchers (at least 50 TBF) who pitched terribly, to the tune of a normalized ERA (NERA) of 5.80 (league average is defined as 4.00). I included all pitchers whose NERA was at least 1/2 run worse than their projection. What was their projection after that poor first month? 4.08. How did they pitch over the next 5 months? 4.10. They faced 531 more batters over the last 5 months of the season.

What about the “hot” pitchers? They were projected after one month at 3.86 and they pitched at 2.56 for that first month. Their performance over the next 5 months was 3.85. So for the “hot” and “cold” pitchers after one month, their updated projection accurately told us what to expect for the remainder of the season and their performance to-date was irrelevant.
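The grouping rule can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and while the 1/2 run cutoff for “cold” comes from the text, using the same cutoff in the other direction for “hot” is an assumption.

```python
def classify_start(nera_to_date, projection, threshold=0.5):
    """Label a pitcher 'cold', 'hot', or 'neutral' relative to his projection.

    Lower NERA is better, so a to-date NERA above the projection is 'cold'.
    The 0.5-run cutoff for 'cold' is from the text; applying the same
    cutoff to 'hot' is an assumption.
    """
    diff = nera_to_date - projection
    if diff >= threshold:
        return "cold"
    if diff <= -threshold:
        return "hot"
    return "neutral"


# Group averages quoted above: first-month NERA vs. projection after that month.
print(classify_start(5.80, 4.08))  # the "cold" group
print(classify_start(2.56, 3.86))  # the "hot" group
```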

In fact, if we look at pitchers who had good projections after one month and divide those into two groups: One that pitches terribly for the first month, and one that pitches brilliantly for the first month, here is what we get:

Good pitchers who were cold for 1 month

First month: 5.38
Projection after that month: 3.79
Performance over the last 5 months: 3.75

Good pitchers who were hot for 1 month

First month: 2.49
Projection after that month: 3.78
Performance over the last 5 months: 3.78

So, and this is critical, one month into the season if you are projected to pitch above average, at, say 3.78, it makes no difference whether you have pitched great or terribly thus far. You are going to pitch at exactly your projection for the remainder of the season!

Yet the cold group faced 587 more batters and the hot group 630. Managers again are putting too much emphasis on those first-month stats.

What if you are projected after one month as a mediocre pitcher but you have pitched brilliantly or poorly over the first month?

Bad pitchers who were cold for 1 month

First month: 6.24
Projection after that month: 4.39
Performance over the last 5 months: 4.40

Bad pitchers who were hot for 1 month

First month: 3.06
Projection after that month: 4.39
Performance over the last 5 months: 4.47

Same thing. It makes no difference whether a poor or mediocre pitcher had pitched well or poorly over the first month of the season. If you want to know how he is likely to pitch for the remainder of the season, simply look at his projection and ignore the first month. Those stats give you no more useful information. Again, the “hot” but mediocre pitchers got 44 more TBF over the final 5 months of the season, despite pitching exactly the same as the “cold” group over that 5 month period.
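To make the comparison concrete, here is a small Python sketch that measures how far off each predictor would have been for the four one-month groups. The numbers are copied from the groups above; the function and variable names are hypothetical.

```python
# (first-month NERA, projection after month 1, NERA over the last 5 months),
# copied from the four groups above.
groups = {
    "good, cold": (5.38, 3.79, 3.75),
    "good, hot": (2.49, 3.78, 3.78),
    "bad, cold": (6.24, 4.39, 4.40),
    "bad, hot": (3.06, 4.39, 4.47),
}


def prediction_errors(to_date, projection, actual):
    """Return (abs error using to-date stats, abs error using the projection)."""
    return abs(to_date - actual), abs(projection - actual)


for name, (to_date, projection, actual) in groups.items():
    naive_err, proj_err = prediction_errors(to_date, projection, actual)
    print(f"{name}: to-date stats miss by {naive_err:.2f}, projection by {proj_err:.2f}")
```

In every group the projection misses by less than a tenth of a run, while the first-month stats miss by well over a run.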

What about halfway into the season? If two groups of pitchers have the same mid-season projection, but one group was “hot” over the first 3 months and the other was “cold,” do they pitch the same for the remaining 3 months? Not quite: the projection algorithm does not handle the 3-month anomalous performances very well. Here are the numbers:

Good pitchers who were cold for 3 months

First 3 months: 4.60
Projection after 3 months: 3.67
Performance over the last 3 months: 3.84

Good pitchers who were hot for 3 months

First 3 months: 2.74
Projection after 3 months: 3.64
Performance over the last 3 months: 3.46

So for the hot pitchers the projection is too pessimistic by around .18 runs per 9 IP, and for the cold ones it is too optimistic by .17 runs per 9. Then again, the actual performance is much closer to the projection than to the season-to-date performance. As you can see, pitcher stats halfway through the season are a terrible proxy for true talent/future performance. These “hot” and “cold” pitchers, whose first-half performance and second-half projections diverged by at least .5 runs per 9, performed in the second half around .75 runs per 9 better or worse than in the first half. You are much better off using the mid-season projection than the actual first-half performance.

For poorer pitchers who were “hot” and “cold” for 3 months, we get these numbers:

Poor pitchers who were cold for 3 months

First 3 months: 5.51
Projection after 3 months: 4.41
Performance over the last 3 months: 4.64

Poor pitchers who were hot for 3 months

First 3 months: 3.53
Projection after 3 months: 4.43
Performance over the last 3 months: 4.33

The projection model is still not giving enough weight to the recent performance, apparently. That is especially true of the “cold” pitchers. It overvalues them by .23 runs per 9. It is likely that these pitchers are suffering some kind of injury or velocity decline and the projection algorithm is not properly accounting for that. For the “hot” pitchers, the model only undervalues these mediocre pitchers by .1 runs per 9. Again, if you try to use the actual 3-month performance as a proxy for true talent or to project their future performance, you would be making a much bigger mistake – to the tune of around .8 runs per 9.

What about 5 months into the season? If the projection and the 5-month performance are divergent, which is better? Is using those 5-month stats a bad idea?

Yes, it still is. In fact, it is a terrible idea. For some reason, the projection does a lot better after 5 months than after 3 months. Perhaps some of those injured pitchers are selected out. Even though the projection slightly under and over values the hot and cold pitchers, using their 5 month performance as a harbinger of the last month is a terrible idea. Look at these numbers:

Poor pitchers who were cold for 5 months

First 5 months: 5.45
Projection after 5 months: 4.41
Performance over the last month: 4.40

Poor pitchers who were hot for 5 months

First 5 months: 3.59
Projection after 5 months: 4.39
Performance over the last month: 4.31

For the mediocre pitchers, the projection almost nails both groups, despite it being nowhere near the level of the first 5 months of the season. I cannot emphasize this enough: Even 5 months into the season, using a pitcher’s season-to-date stats as a predictor of future performance or a proxy for true talent (which is pretty much the same thing) is a terrible idea!

Look at the mistakes you would be making. You would be thinking that the hot group was composed of 3.59 pitchers when in fact they were 4.40 pitchers who performed as such. That is a difference of .71 runs per 9. For your cold pitchers, you would undervalue them by more than a run per 9! What did managers do after 5 months of “hot” and “cold” pitching, despite the fact that both groups pitched almost the same for the last month of the season? They gave the hot group an average of 13 more TBF per pitcher. That is around a 3 inning difference in one month.

Here are the good pitchers who were hot and cold over the first 5 months of the season:

Good pitchers who were cold for 5 months

First 5 months: 4.62
Projection after 5 months: 3.72
Performance over the last month: 3.54

Good pitchers who were hot for 5 months

First 5 months: 2.88
Projection after 5 months: 3.71
Performance over the last month: 3.72

Here the “hot,” good pitchers pitched exactly at their projection despite pitching .83 runs per 9 better over the first 5 months of the season. The “cold” group actually outperformed their projection by .18 runs and pitched better than the “hot” group! This is probably a sample size blip, but the message is clear: Even after 5 months, forget about how your favorite pitcher has been pitching, even for most of the season. The only thing that counts is his projection, which utilizes many years of performance plus a regression component, and not just 5 months’ worth of data. It would be a huge mistake to use those 5-month stats to predict these pitchers’ performances.

Managers can learn a huge lesson from this. The average number of batters faced in the last month of the season among the hot pitchers was 137, or around 32 IP. For the cold group, it was 108 TBF, or 25 IP. Again, the “hot” group pitched 7 more IP in only a month, yet they pitched worse than the “cold” group and both groups had the same projection!
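The TBF-to-innings conversions above imply a rate of roughly 4.3 batters faced per inning. A quick Python sketch, where that rate is an assumption inferred from the two pairs of figures in the text rather than a stated constant:

```python
# Assumed average batters faced per inning, inferred from 137 TBF ~ 32 IP
# and 108 TBF ~ 25 IP in the text. Not a stated constant.
BATTERS_PER_INNING = 4.3


def tbf_to_ip(tbf, batters_per_inning=BATTERS_PER_INNING):
    """Approximate innings pitched from total batters faced."""
    return tbf / batters_per_inning


hot_ip = tbf_to_ip(137)
cold_ip = tbf_to_ip(108)
print(f"hot: {hot_ip:.0f} IP, cold: {cold_ip:.0f} IP, gap: {hot_ip - cold_ip:.0f} IP")
```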

The moral of the story here is that for the most part, and especially at the beginning and end of the season, ignore actual pitching performance to-date and use credible mid-season projections if you want to predict how your favorite or not-so favorite pitcher is likely to pitch tonight or over the remainder of the season. If you don’t, and that actual performance is significantly different from the updated projection, you are making a sizable mistake.

 

 

Yesterday, I posted an article describing how I modeled, to some extent, whether and by how much pitchers may be able to pitch in such a way as to allow fewer or more runs than their components suggest, including the more subtle components like balks, SB/CS, WP, catcher PB, GIDP, and ROE.

For various reasons, I suggest taking these numbers with a grain of salt. For one thing, I need to tweak my RA9 simulator to take into consideration a few more of these subtle components. For another, there may be some things that stick with a pitcher from year to year that have nothing to do with his “RA9 skill” but which serve to increase or decrease run scoring, given the same set of components. Two of these are a pitcher’s outfielder arms and the vagaries of his home park, which both have an effect on base runner advances on hits and outs. Using a pitcher’s actual sac flies against will mitigate this, but the sim is also using league averages for base runner advances on hits, which, as I said, can vary from pitcher to pitcher, and tend to persist from year to year (if a pitcher stays on the same team) based on his outfielders and his home park. Like DIPS, it would be better to do these correlations only on pitchers who switch teams, but I fear that the sample would be too small to get any meaningful results.

Anyway, I have a database now of the last 10 years’ differences between a pitcher’s RA9 and his sim RA9 (the runs per 27 outs generated by my sim), for all pitchers who threw to at least 100 batters in a season.

First here are some interesting categorical observations:

Jared Cross, of Steamer projections, suggested to me that perhaps some pitchers, like lefties, might hold base runners on first base better than others, and therefore depress scoring a little as compared to the sim, which uses league-average base running advancement numbers. Well, lefties actually did a hair worse in my database. Their RA9 was .02 greater than their sim RA. Righties were .01 better. That does not necessarily mean that RHP have some kind of RA skill that LHP do not have. It is more likely a bias in the sim that I am not correcting for.

How about the number of pitches in a pitcher’s repertoire? I hypothesized that pitchers with more pitches would be better able to tailor their approach to the situation. For example, with a base open, you want your pitcher to be able to throw lots of good off-speed pitches in order to induce a strikeout or weak contact, whereas you don’t mind if he walks the batter.

I was wrong. Pitchers with 3 or more pitches that they throw at least 10% of the time are .01 runs worse in RA9. Pitchers with 2 or fewer pitches are .02 runs better. I have no idea why that is.

How about pitchers who are just flat out good in their components such that their sim RA is low, like under 4.00 runs? Their RA9 is .04 worse. Again, there might be some bias in the sim which is causing that. Or perhaps if you just go out there and “air it out” and try to get as many outs and strikeouts as possible, regardless of the situation, you are not pitching optimally.

Conversely, pitchers with a sim RA of 4.5 or greater shave .03 points off their RA9. If you are over 5 in your sim RA, your actual RA9 is .07 points better and if you are below 3.5, your RA9 is .07 runs higher. So, there probably is something about having extreme components that even the sim is not picking up. I’m not sure what that could be. Or, perhaps if you are simply not that good of a pitcher, you have to find ways to minimize run scoring above and beyond the hits and walks you allow overall.

For the NL pitchers, their RA9 is .05 runs better than their sim RA, and for the AL, they are .05 runs worse. So the sim is not doing a good job with respect to the leagues, likely because of pitchers batting. I’m not sure why, but I need to fix that. For now, I’ll adjust a pitcher’s sim RA according to his league.
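A minimal sketch of that league adjustment in Python, assuming the correction is simply the average league gap added to the sim RA. The names are hypothetical, and the separate “quality of the sim RA” bias is not handled here.

```python
# Average (RA9 minus sim RA) by league, per the text: NL pitchers come in
# .05 runs under their sim RA, AL pitchers .05 runs over.
LEAGUE_BIAS = {"NL": -0.05, "AL": 0.05}


def ra_skill(ra9, sim_ra, league):
    """RA9 minus league-adjusted sim RA.

    Negative values mean the pitcher allowed fewer runs than his
    components suggest. Applying the league gap as a flat additive
    correction is an assumption.
    """
    adjusted_sim = sim_ra + LEAGUE_BIAS[league]
    return ra9 - adjusted_sim


# Hypothetical pitcher: 3.90 RA9 against a 4.00 sim RA.
print(round(ra_skill(3.90, 4.00, "NL"), 2))
print(round(ra_skill(3.90, 4.00, "AL"), 2))
```

The same raw gap of -.10 looks half as impressive in the NL (where the sim runs high for everyone) and half again as impressive in the AL.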

You might think that younger pitchers would be “throwers” and older ones would be “pitchers” and thus their RA skill would reflect that. This time you would be right – to some extent.

Pitchers less than 26 years old were .01 runs worse in RA9. Pitchers older than 30 were .03 better. But that might just reflect the fact that pitchers older than 30 are just not very good – remember, we have a bias in terms of quality of the sim RA and the difference between that and regular RA9.

Actually, even when I control for the quality of the pitcher, the older pitchers had more RA skill than the younger ones by around .02 to .04 runs. As you can see, none of these effects, even if they are other than noise, is very large.

Finally, here are the lists of the 10 best and worst pitchers with respect to “RA skill,” with no commentary. I adjusted for the “quality of the sim RA” bias, as well as the league bias. Again, take these with a large grain of salt, considering the discussion above.

Best, 2004-2013:

Shawn Chacon -.18

Steve Trachsel -.18

Francisco Rodriguez -.18

Jose Mijares -.17

Scott Linebrink -.16

Roy Oswalt -.16

Dennys Reyes -.15

Dave Riske -.15

Ian Snell -.15

5 others tied for 10th.

Worst:

Derek Lowe .27

Luke Hochevar .20

Randy Johnson .19

Jeremy Bonderman .18

Blaine Boyer .18

Rich Hill .18

Jason Johnson .18

5 others tied for 8th place.

(None of these pitchers stand out to me one way or another. The “good” ones are not any you would expect, I don’t think.)

We showed in The Book that there is a small but palpable “pitching from the stretch” talent. That of course would affect a pitcher’s RA as compared to some kind of base runner and “timing” neutral measure like FIP or component ERA, or really any of the ERA estimators.

As well, a pitcher’s ability to tailor his approach to the situation, runners, outs, score, batter, etc., would also implicate some kind of “RA talent,” again, as compared to a “timing” neutral RA estimator.

A few months ago I looked to see if RE24 results for pitchers showed any kind of talent for pitching to the situation, by comparing that to the results of a straight linear weights analysis or even a BaseRuns measure. I found no year-to-year correlations for the difference between RE24 and regular linear weights. In other words, I was trying to see if some pitchers were able to change their approach to benefit them in certain bases/outs situations more than other pitchers. I was surprised that there was no discernible correlation, i.e., that it didn’t seem to be much of a skill if at all. You would think that some pitchers would either be smarter than others or have a certain skill set that would enable them, for example, to get more K with a runner on 3rd and less than 2 outs, more walks and fewer hits with a base open, or fewer home runs with runners on base or with 2 outs and no one on base. Obviously all pitchers, on the average, vary their approach a lot with respect to these things, but I found nothing much when doing these correlations. Essentially an “r” of zero.

To some extent the pitching from the stretch talent should show up in comparing RE24 to regular lwts, but it didn’t, so again, I was a little surprised at the results.

Anyway, I decided to try one more thing.

I used my “pitching sim” to compute a component ERA for each pitcher. I tried to include everything that would create or not create runs while he was pitching, like WP/PB, SB/CS, GIDP, and ROE, in addition to singles, doubles, triples, HR, BB, and SO. I considered an IBB as a 1/2 BB in the sim, since I didn’t program IBB into it.

So now, for each year, I recorded the difference between a pitcher’s RA9 and his simulated component RA9, and then ran year-to-year correlations. This was again to see if I could find a “RA talent” wherever it may lie – clutch pitching, stretch talent, approach talent, etc.

I got a small year-to-year correlation which, as always, varied with the underlying sample size – TBF in each of the paired years. When I limited it to pitchers with at least 500 TBF in each year, I got an “r” of .142 with an average PA of 791 in each year. That comes out to a 50% regression at around 5000 PA, or 5 years for a full-time starter, similar to BABIP for pitchers. In other words, the “stabilization” point was around 5,000 TBF.
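The step from the correlation to the 50% regression point can be reproduced under the usual assumption that a year-to-year correlation behaves like r = n / (n + k), where n is the sample size and k is the sample size at which you regress halfway to the mean. The text does not spell out that relation, so treat this Python sketch as one plausible reading of the arithmetic.

```python
def half_regression_point(r, n):
    """Solve r = n / (n + k) for k, the sample size at which regression is 50%.

    Assumes the standard reliability relation between correlation and
    sample size; the text does not state the formula explicitly.
    """
    return n * (1 - r) / r


k = half_regression_point(r=0.142, n=791)
print(round(k))  # in the neighborhood of the "around 5000" quoted above
```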

If that .142 is accurate (at 2 sigma the confidence interval is .072 to .211), I think that is pretty interesting. For example, notable “ERA whiz” Tom Glavine from 2001 to 2006, was an average of .246 in RA9 better than his sim RA9 (simulated component RA). If we regress that difference 50%, we get .133 runs per game, which is pretty sizable I think. That is over 1/3 of a win per season. Notable “ERA hack” Ricky Nolasco from 2008 to 2010 (I only looked at 2001-2010) was an average of .357 worse in his ERA. Regress that 62.5%, and we get .134 runs worse per season, also 1/3 of a win.

So, for example, if you want to know how to reconcile fWAR (FG) and bWAR (B-R) for pitchers, take the difference and regress according to the number of TBF, using the formula 5000/(5000+TBF) for the amount of regression.

Here are a couple more interesting ones, off the top of my head. I thought that Livan Hernandez seemed like a crafty pitcher, despite having inferior stuff late in his career. Sure enough, he out-pitched his components by around .164 runs per game over 9 seasons. After regressing, that’s .105 rpg.

The other name that popped into my head was Wakefield. I always wondered if a knuckler was able to pitch to the situation as well as other pitchers could. It does not seem like they can, with only one pitch with comparatively little control. His RA9 was .143 worse than his components suggest, despite his FIP being .3 runs per 9 worse than his ERA! After regressing, he is around .095 worse than his simulated component RA.

Of course, after looking at Wake, we have to check Dickey as well. He didn’t start throwing a knuckle ball until 2005, and then only half the time. His average difference between RA9 and simulated RA9 is .03 on the good side, but our sample size for him is small with a total of only 1600 TBF, implying a regression of 76%.
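The regression formula, 5000/(5000+TBF), is easy to apply in code. A short Python sketch with hypothetical function names, using Dickey’s 1600 TBF from above as the example:

```python
def regression_fraction(tbf, k=5000):
    """Share of the observed difference to regress toward zero: k / (k + TBF)."""
    return k / (k + tbf)


def regressed_difference(observed_diff, tbf, k=5000):
    """Observed RA9-minus-sim difference after regressing toward zero."""
    return observed_diff * (1 - regression_fraction(tbf, k))


# Dickey: about 1600 TBF, .03 runs on the good side (negative = better).
print(f"{regression_fraction(1600):.0%}")
print(f"{regressed_difference(-0.03, 1600):.3f}")
```

With only 1600 TBF, roughly three quarters of the observed difference is regressed away, leaving an estimate of less than .01 runs.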

If you want the numbers on any of your favorite or not-so-favorite pitchers, let me know in the comments section.

This is a follow up to my article on baseballprospectus.com about starting pitcher times through the order penalties (TTOP).

Several readers wondered whether pitchers who throw lots of fastballs (or one type of pitch) have a particularly large penalty as opposed to pitchers who throw more of a variety of pitches. The speculation was that it would be harder or take longer for a batter to acclimate himself to a pitcher who has lots of different pitches in his arsenal. As well, since most starters tend to throw more fastballs the first time through the order, those pitchers who follow that up with more off-speed pitches for the remainder of the game would have an advantage over those pitchers who continue to throw mostly fastballs.

First I split all the starters up into 3 groups: One, over 75% fastballs, two, under 50% fastballs, and three, all the rest. The data is from 2002-2012. I downloaded pitcher pitch type data from fangraphs.com. The results will amaze you.

| FB % | N (Pitcher Seasons) | Overall | First Time | Second Time | Third Time | Fourth Time | Second Minus First | Third Minus Second | Fourth Minus Third |
|---|---|---|---|---|---|---|---|---|---|
| > 75% | 159 | .357 | .341 | .363 | .376 | .348 | .027 | .020 | -.013 |
| < 50% | 359 | .352 | .346 | .349 | .360 | .361 | .003 | .015 | .010 |
| All others | 2632 | .359 | .346 | .361 | .370 | .371 | .015 | .015 | .013 |

Pitchers who throw mostly fastballs lose 35 points in wOBA against by the third time facing the lineup. Those with a much lower fastball frequency only lose 24 points. Interestingly, the former group reverts back to better than normal levels the fourth time (I don’t know why that is, but I’ll return to that issue later), but the latter group continues to suffer a penalty, as do all the others. Keep in mind that the fourth time numbers are small samples for the first two groups, and that fourth time TBF are only around 15% of first time TBF (i.e., starters don’t often make it past the third time through the order).

The takeaway here is that a starter’s pitch repertoire is extremely important in terms of how long he should be left in the game and whether he should start or relieve (we already knew the latter, right?). If we look at columns three and four, we can get an idea as to the difference between a pitcher as a starter and as a reliever, at least as far as times through the order is concerned (there are other considerations, such as velocity – e.g., when a pitcher is a short reliever, he can usually throw harder). The mostly fastball group is 16 points (around .5 runs per 9 innings) more effective the first time through the order than overall, while the low frequency fastball group only has a 6 point (.20 RA9) advantage. Keep in mind that some of that first time through the order advantage for all groups is due to the “first inning” effect (see my original article on BP).
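The wOBA-to-RA9 conversions quoted here work out to a little over .03 runs per 9 for each point of wOBA. A rough Python sketch, where the factor is an assumption backed out of the pairs in the text (16 points to about .5, 6 points to about .20), not a constant the article states:

```python
# Assumed conversion: runs per 9 innings per point (.001) of wOBA against.
# Inferred from the article's own pairs; treat as approximate.
RA9_PER_WOBA_POINT = 0.032


def woba_points_to_ra9(points, factor=RA9_PER_WOBA_POINT):
    """Approximate change in RA9 for a given change in wOBA points."""
    return points * factor


for pts in (16, 6):
    print(f"{pts} points of wOBA is about {woba_points_to_ra9(pts):.2f} RA9")
```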

Next I split the pitchers into four groups based on the number of pitches they threw at least 10% of the time. The categories of pitches (from the FG database) were fast balls, sliders, cutters, curve balls, change ups, splitters, and knuckle balls.

| # Pitches in Repertoire (> 10%) | N (Pitcher Seasons) | Overall | First Time | Second Time | Third Time | Fourth Time | Second Minus First | Third Minus Second | Fourth Minus Third |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 41 | .359 | .344 | .370 | .375 | .303 | .027 | .009 | -.061 |
| 2 | 1000 | .358 | .343 | .359 | .371 | .366 | .016 | .018 | .007 |
| 3 | 1712 | .361 | .349 | .362 | .371 | .372 | .013 | .015 | .014 |
| 4 | 378 | .351 | .340 | .351 | .360 | .368 | .011 | .013 | .019 |

This is even more interesting. It appears that the fewer pitches you have in your repertoire, the more quickly batters become familiar with you, as we might expect. One-pitch pitchers lose 36 points by the third time through the order, while four-pitch pitchers lose only 24 points. The fourth time through the order is exactly the opposite. Against one-pitch pitchers, batters lose 61 points (small sample size warning – 639 PA). Again, I have no idea why. Maybe fastball pitchers are able to ramp it up in the later innings, or maybe they start throwing more off-speed pitches. A pitch f/x analysis would shed some more light on this issue. Against the four-pitch pitchers, batters gain 19 points the fourth time around compared to the third. If we weight and combine the third and fourth times in order to increase our sample sizes, we get this:

| # Pitches in Repertoire (> 10%) | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 | 41 | .359 | .344 | .370 | .364 | .027 | -.001 |
| 2 | 1000 | .358 | .343 | .359 | .370 | .016 | .017 |
| 3 | 1712 | .361 | .349 | .362 | .371 | .013 | .015 |
| 4 | 378 | .351 | .340 | .351 | .361 | .011 | .015 |

Again, we see the largest, by far, second time penalty for the one-pitch pitchers (27 points), and a gradually decreasing penalty for two, three, and four-pitch pitchers (16, 13, and 11). Interestingly, they all have around the same penalty the third time and later, other than the one-pitch pitchers, who essentially retain their quality or even get a bit better, although this is driven by their large fourth time advantage, as you saw in the previous table.
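Combining the third and fourth times through the order is a PA-weighted average. Here is a minimal Python sketch for the one-pitch group. The 639 fourth-time PA comes from the text, but the third-time PA figure is hypothetical, chosen only to illustrate the weighting.

```python
def combine_woba(splits):
    """PA-weighted average of (wOBA, PA) pairs."""
    total_pa = sum(pa for _, pa in splits)
    return sum(woba * pa for woba, pa in splits) / total_pa


third_time = (0.375, 3600)  # wOBA from the table; 3600 PA is a hypothetical figure
fourth_time = (0.303, 639)  # wOBA from the table; 639 PA is quoted in the text

print(f"{combine_woba([third_time, fourth_time]):.3f}")
```

Because so few plate appearances come the fourth time through, the combined figure stays much closer to the third-time number than to the fourth-time one.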

It is not clear that you should take your one-pitch starters out early and leave in those who have multiple pitches in their weaponry. In fact, it may be the opposite. While the one-pitch pitchers would do well if they only face the order one time (and so would the two-pitch starters actually), once you allow them to stay in the game for the second go around, you might as well keep them in there as long as they are not fatigued, at least as compared to the multiple-pitch starters. Starters with more than one pitch appear to get 10-15 points worse each time through the order even though they don’t have the large penalty between the first and second time, as the one-pitch pitchers do. Remember, for the last two tables, a pitch is considered part of a starter’s repertoire if he throws it at least 10% of the time.
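The repertoire count used in these tables can be sketched as a simple threshold on pitch usage. In this Python example the usage mix is hypothetical and the function name is illustrative; only the 10% cutoff comes from the text.

```python
def repertoire_size(usage, threshold=0.10):
    """Count pitch types thrown at least `threshold` of the time."""
    return sum(1 for share in usage.values() if share >= threshold)


# Hypothetical starter's pitch mix (shares of all pitches thrown).
usage = {"fastball": 0.62, "slider": 0.18, "changeup": 0.12, "curveball": 0.08}

print(repertoire_size(usage))        # counts pitches at or above 10%
print(repertoire_size(usage, 0.15))  # a stricter cutoff shrinks the repertoire
```

The same pitcher can land in different buckets depending on the cutoff, which is why the group sizes shift so much when the threshold changes.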

I’ll now split the pitchers into groups again based on how many pitches they throw, but this time, the cutoff for a “pitch” will be 15% rather than 10%. Too few pitchers throw four pitches at least 15% of the time each for their numbers to be meaningful, so I’ll throw them in with the three-pitch pitchers. I’ll also combine the third and fourth times through the order again.

| # Pitches in Repertoire (> 15%) | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 | 447 | .358 | .342 | .362 | .364 | .027 | -.001 |
| 2 | 1954 | .359 | .346 | .361 | .370 | .016 | .017 |
| 3+ | 742 | .355 | .347 | .352 | .371 | .013 | .015 |

The three and four-pitch starters are better overall by three or four points of wOBA (.11 RA9). The first time through the order, however, the one-pitch starters are better by 5 points or so (.15 RA9). The second time around, the one-pitch pitchers fare the worst, but by the third and fourth times through the order, they are once again the best (by 6 or 7 points, or .22 RA9). It is difficult to say what the optimal use of these starters would look like. At the very least, these numbers should give a manager/team more information in terms of estimating a starter’s penalty at various points in the game, based on his pitch repertoire.

I’ll try one more thing: Two groups. The first group are pitchers who throw at least 80% of one type of pitch, excluding knuckleballers. These are truly one-pitch pitchers. The second group throw three (or more) pitches at least 20% of the time each. These are truly three-pitch pitchers. Let’s see the contrast.

| # Pitches in Repertoire | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 (> 80%) | 47 | .360 | .343 | .367 | .370 | .025 | .009 |
| 3+ (> 20%) | 104 | .353 | .350 | .357 | .357 | .008 | .009 |

It certainly looks like the 42 one-pitch pitchers (47 is the number of pitcher seasons) would be much better off as relievers, facing each batter only one time. They are not very good overall, and after only one go around, they are 25 points (.85 RA9) worse than the first time facing the lineup! The three-pitch pitchers suffer only a small (8 point) penalty after the first time through the order. Both groups actually suffer the same penalty from the second to the third (and later) times through the order (9 points).

So who are these 42 pitchers who are ill-suited to being a starter? Perhaps they are swingmen or emergency starters. I looked at all pitchers who started at least one game – not just regular starters. Here is the complete list from 2002 to 2012. The numbers after each name are his TBF as a starter and as a reliever.

Mike Timlin 20, 352

Kevin Brown 206, 68

Ben Diggins 114, 0

Jarrod Washburn 847, 0

Mike Crudale 9, 199

Grant Balfour 17, 94

Shane Loux 69, 69

Jimmy Anderson 180, 3

Kirk Rueter 620, 0

Jaret Wright 768, 0

Logan Kensing 55, 11

Tanyon Sturtze 57, 277

Chris Young 156, 0

Nate Bump 33, 286

Bartolo Colon 2683, 49

Carlos Silva 876, 10

Aaron Cook 3337, 0

Cal Eldred 12, 141

Rick Bauer 21, 281

Mike Smith 18, 0

Shawn Estes 27, 0

Troy Percival 4, 146

Andrew Miller 306, 0

Luke Hochevar 12, 41

Luke Hudson 13, 0

Dana Eveland 15, 13

Denny Bautista 9, 38

Dennis Sarfate 81, 274

Roberto Hernandez 548, 0

Mike Pelfrey 812, 0

Daniel Cabrera 881, 0

Frankie de la Cruz 15, 37

Mark Mulder 3, 9

Ty Taubenheim 27, 0

Brad Kilby 7, 58

Darren Oliver 17, 264

Justin Masterson 1794, 4

Luis Mendoza 60, 0

Ross Detwiler 627, 51

Cesar Ramos 11, 109

Josh Stinson 17, 21

Many of these pitchers barely had a cup of coffee in the majors. Others were emergency starters, swingmen, or they changed roles at some point in their careers. Others were simply mediocre or poor starting pitchers, like Kirk Rueter, Jarrod Washburn, Mike Pelfrey, Carlos Silva, and Daniel Cabrera, while others were good or even excellent starters, like Kevin Brown, Mark Mulder, and Bartolo Colon.

I think the lesson is clear. Unless a team has a compelling reason to make a one-pitch pitcher a starter (perhaps they are an extreme sinker-baller, like Brown, Cook, and Masterson), they should probably only relieve. If a team is going to use a swingman for an occasional start or a reliever for an emergency start, they would do well to use a two or three-pitch pitcher or limit him to one time through the order.

If we remove the swingmen and emergency starters as well as those pitchers who faced fewer than 50 batters in a season, we get this:

| # Pitches in Repertoire | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 (> 80%) | 28 | .353 | .336 | .364 | .365 | .028 | .004 |
| 3+ (> 20%) | 104 | .353 | .350 | .357 | .357 | .008 | .009 |

Even if we only look at regular starters with one primary pitch other than a knuckleball, we still see a huge penalty after the first time facing the order. In fact, the second time penalty (compared to the first) is worse than when we include the swingmen and emergency starters. Although these pitchers overall are as good as multiple-pitch starters, they still would have been much better off as short relievers.

Here is that updated list of starters once we remove the ones who rarely start. These guys as a whole should probably have been short relievers.

Cook

Miller

Colon

Diggins

Silva

Young

Cabrera

Wright

Washburn

Anderson

Masterson

Brown

Rueter

Kensing

Mendoza

Pelfrey

Hernandez

Detwiler

You might think that the one-pitch starters in the above list who are good or at least had one or two good seasons might not necessarily be good candidates for short relief. You would be wrong. These pitchers had huge second to first penalties and pitched much better the first time through the order than overall. Here is the same chart as before, but only including above-average starters for that season.

| # Pitches in Repertoire | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 (> 80%) | 11 | .328 | .307 | .332 | .332 | .039 | -.013 |
| 3+ (> 20%) | 35 | .321 | .318 | .323 | .323 | .004 | .003 |

Here are those pitchers who pitched very well overall, but were lights out the first time facing the lineup. Remember that these pitchers were above average in the season or seasons that they went into this bucket – they were not necessarily good pitchers throughout their careers or even in any other season.

Kevin Brown

Jarrod Washburn

Jaret Wright

Chris Young

Bartolo Colon

Carlos Silva

Justin Masterson

Ross Detwiler

Interestingly, the very good multiple-pitch pitchers had very small penalties each time through the order. These are probably the only kind of starters we want to go deep into games! Here is that list of starters.

Sonnanstine

B. Myers

Pavano

Sabathia

Billingsley

Carpenter

Hamels

Haren

F. Garcia

Iwakuma

Shields

J. Contreras

Beckett

Duchscherer

Gabbard

K. Rogers

Buehrle

M. Clement

Halladay

R. Hernandez

T. Hunter

Finally, in case you are interested, here are the numbers for all of the one-pitch knuckleballers that I have been omitting in some of the tables thus far:

Knuckle Ballers Only

| N | First Time | Second Time | Third+ Time | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|
| 20 | .321 | .354 | .345 | .034 | -.006 |

Where are all the knuckle ball relievers? Although we don’t have tremendous sample sizes here (3024 second time TBF), so we have to take the numbers with a grain of salt, it looks like they are brilliant the first time through the order but once a batter has seen a knuckleballer one time, he does pretty well against him thereafter (although we do see a 6 point rebound the third time and later through the order).

I think that more research, especially using the pitch f/x data, is needed. However, I think that teams can use the information above to make more informed decisions about what roles pitchers should occupy and when to take out a starter during a game.

Last night I lambasted the Cardinals’ sophomore manager, Mike Matheny, for some errors in bullpen management that I estimated cost his team over 2% in win expectancy (WE). Well, after tonight’s game, all I have to say is, as BTO so eloquently said, “You ain’t seen nothin’ yet!”

Tonight (or last night, or whatever), John Farrell, the equally clueless manager of the Red Sox (God, I hope I don’t ever have to meet these people I call idiots and morons!), basically told Matheny, “I’ll see your stupid bullpen management and raise you one moronic non-pinch hit appearance!”

I’m talking of course about the top of the 7th inning in Game 5. The Red Sox had runners on second and third, one out, and Jon Lester, the Sox’ starter, was due to hit (some day, I’ll be telling my grandkids, “Yes, Johnny, pitchers once were also hitters.”). Lester was pitching well (assuming you define “well” as how many hits/runs he allowed so far – not that I am suggesting that he wasn’t pitching “well”) and had only thrown 69 pitches, I think. I don’t think it ever crossed Farrell’s mind to pinch hit for him in that spot. The Sox were also winning 2-1 at the time, so, you know, they didn’t need any more runs in order to guarantee a win <sarcasm>.

Anyway, I’m not going to engage in a lot of hyperbole and rhetoric (yeah, I probably will). It doesn’t take a genius to figure out that not pinch hitting for Lester in that particular spot (runners on 2nd and 3rd, and one out) is going to cost a decent fraction of a run. It doesn’t even take a genius, I don’t think, to figure out that that means that it also costs the Red Sox some chance of ultimately winning the game. I’ll explain it like I would to a 6-year-old child. With a pinch hitter, especially Napoli, you are much more likely to score, and if you do, you are likely to score more runs. And if on the average you score more runs that inning with a pinch hitter, you are more likely to win the game, since you only have a 1 run lead and the other team still gets to come to bat 3 more times. Surely, Farrell can figure that part out.

How many runs and how much win expectancy does that cost, on the average? That is pretty easy to figure out. I’ll get to that in a second (spoiler alert: it’s a lot). So that’s the downside. What is the upside? It is two-fold, sort of. One, you get to continue to pitch Lester for another inning or two. I assume that Farrell does not know exactly how much longer he plans on using Lester, but he probably has some idea. Two, you get to rest your bullpen in the 7th and possibly the 8th.

Both of those upsides are questionable in my opinion, but, as you’ll see, I will actually give Farrell and any other naysayer (to my way of thinking) the benefit of the doubt. The reason I think it is questionable is this: Lester, despite pitching well so far, and only throwing 69 pitches, is facing the order for the 3rd time in the 7th inning, which means that he is likely .4 runs per 9 innings worse than he is overall, and the Red Sox, like most World Series teams, have several very good options in the pen who are actually at least as good as Lester when facing the order for the third time, not to mention the fact that Farrell can mix and match his relievers in those two innings in order to get the platoon advantage. So, in my opinion, the first upside for leaving in Lester is not an upside at all. But, when I do my final analysis, I will sort of assume that it is, as you will see.

The second upside is the idea of saving the bullpen, or more specifically, saving the back end of the bullpen, the short relievers. In my opinion, again, that is a sketchy argument. We are talking about the World Series, where you carry 11 or 12 pitchers in order to play 7 games in 9 days and then take 5 months off. In fact, tomorrow (today?) is an off day followed by 2 more games and then they all go home. Plus, it’s not like either bullpen has been overworked in the post-season so far. But, I will be happy to concede that “saving your pen” is indeed an upside for leaving Lester in the game. How much is it worth? No one knows, but I don’t think anyone would disagree with this: A manager would not choose to “save” his bullpen for 1-2 innings when there is an off day followed by 2 more games, followed by 100 off days, when the cost of that savings is a significant chunk of win expectancy in the game he is playing at the present time. I mean, if you don’t agree with that, just stop reading and don’t ever come back to this site.

The final question, then, is how much in run or win expectancy did that non-pinch hit cost? Remember in my last post how I talked about “categories” of mistakes that a manager can make? I said that a Category I mistake, a big one, cost a team 1-2% in win expectancy. That may not seem like a lot for one game, but it is. We all criticize managers for “costing” their team the game when we think  they made a mistake and their team loses. If you’ve never done that, then you can stop reading too. The fact of the matter is that there is almost nothing a manager can do, short of losing his mind and pinch hitting the bat boy in a high leverage situation, that is worth more than 1 or 2% in win expectancy. Other than this.

The run expectancy with runners on second and third and one out in a low run environment is around 1.40. That means that with a roughly average hitter at the plate, the batting team will score, on the average, 1.40 runs during that inning, from that point on. We’ll assume that it is about the same if Napoli pinch hit. He is a very good pinch hitter, but there is a pinch hitting penalty and he is facing a right handed pitcher. To be honest, it doesn’t really matter. It could be 1.2 runs or 1.5 runs. It won’t make much of a difference.

What is the run expectancy with Lester at the plate? I don’t know much about his hitting, but I assume that since he has never been in the NL, and therefore hardly ever hits, it is not good. We can easily say that it is below that of an average pitcher, but that doesn’t really matter either. With an average pitcher batting in that same situation, and the top of the order coming up, the average RE is around 1.10 runs. So the difference is .3 runs. Again, it doesn’t matter much if it is .25 or .4 runs. And there really isn’t much wiggle room. We know that it is a run scoring situation and we know that a pinch hitter like Napoli (or almost anyone for that matter) is going to be a much better hitter than Lester. So .3 runs sounds more than reasonable. Basically we are saying that, on the average, with a pinch hitter like Napoli at the plate in that situation, runners on 2nd and 3rd with 1 out, the Red Sox will score .3 more runs than with Lester at the plate. I don’t know that anyone would quarrel with that – even someone like a Tim McCarver or Joe Morgan.

In order to figure out how much in win expectancy that is going to cost, again, on the average, first we need to multiply that number by the leverage index in that situation. The LI is 1.64.  1.64 times .3 runs divided by 10 is .049 or 4.9%. That is the difference in WE between batting Lester or a pinch hitter. It means that with the pinch hitter, the Red Sox can expect, on the average, to win the game around 5% more often than if Lester hits, everything else being equal. I don’t know whether you can appreciate the enormity of that number. I have been working with these kinds of numbers for over 20 years. If you can’t appreciate it, you will just have to take my word for it that that is a ginormous number when it comes to WE in one game. As I said, I usually consider an egregious error to be worth 1-2%. This is worth almost 5%. That is ridiculous. It’s like someone offering you a brand new Chevy or Mercedes for the same price. And you take the Chevy, if you are John Farrell.
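The back-of-the-envelope conversion above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the text (run difference times leverage index, divided by roughly 10 runs per win); the function and parameter names are my own and are not part of any simulator.

```python
def win_expectancy_swing(run_diff, leverage_index, runs_per_win=10.0):
    """Approximate change in win expectancy from a change in expected runs.

    Uses the rule of thumb from the text: runs * LI / 10.
    """
    return run_diff * leverage_index / runs_per_win


# Numbers quoted in the text: RE of 1.40 with a pinch hitter,
# 1.10 with an average pitcher batting, and an LI of 1.64.
swing = win_expectancy_swing(run_diff=1.40 - 1.10, leverage_index=1.64)
print(f"{swing:.3f}")  # 0.049, i.e. about 4.9% in win expectancy
```

Changing `run_diff` to .25 or .4, the range mentioned earlier, moves the result to roughly 4% or 6.5%, which is still well beyond the 1-2% threshold.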

Just to see if we are in the right ballpark with our calculations, I am going to run this scenario through my baseball simulator, which is pretty darn accurate (even though it does not have an algorithm for heart or grit) in these kinds of relatively easy situations to analyze.

Sound of computers whirring….

With Lester hitting, the Red Sox win the game 76.6% of the time. And therein lies the problem! Farrell knows that no matter what he does, he is probably going to win the game, and if he takes out Lester, not only is he going to bruise his feelings (boo hoo), but if the relief corps blows the game, he is going to be lambasted and probably feel like crap. If he takes Lester out, he knows he’s also going to probably win the game, and what’s a few percent here and there. But if he lets Lester continue, as all of Red Sox nation assumes and hopes he will, and then they blow the game, no one is going to blame Farrell. You know why? Because at the first sign of trouble, he is going to pull Lester, and no one is going to criticize a manager for leaving in a pitcher who is pitching a 3-hitter through 6 innings and only 69 pitches and yanks him as soon as he gives up a baserunner or two. So letting Lester hit for himself is the safe decision. Not a good one, but a safe one.

After that rant, you probably want to know how often the Sox win if they pinch hit for Lester. 79.5% of the time. So that’s only a 2.9% difference. Still higher than my formerly highest Category of manager mistakes, 1-2%.

Let’s be conservative and call it a 3% mistake. I wonder if you told John Farrell that by not pinch hitting for Jon Lester his team’s chances of winning go from 79.5% to 76.6%. Even if he believed that, do you think it would sway his decision? I don’t think so, because he feels with all his heart and soul that having Lester, who is “dealing,” pitch another inning or two, and saving his bullpen, is well worth the difference between 77% and 80%. After all, either way, they probably win.

So how much does Lester pitching another inning or two (we’ll call it 1.5 innings, since at the time it could have been anywhere from 0 to 2, I think – I am pretty sure that Koji was pitching the 9th no matter what) gain over another pitcher? Well, I already said that the answer is nothing. Any of their good relievers are at least as good as Lester the 3rd time through the order. But I also said that I will concede that Lester is going to be just amazing, on the average, if Farrell leaves him in the game. How good does he have to be in order to make up the .3 runs or 3% in WE that are lost by allowing Lester to hit?

A league average reliever allows around 4 runs a game. It doesn’t matter what that exact number is – we are only using it for comparison purposes. A good short reliever actually allows more like 3 or 3.5 runs a game. Starting pitchers, in general, are a little worse than the average pitcher (because of that nasty times through the order penalty). A very good pitcher like Lester allows around 3.5 runs a game (a pitcher like Wainwright around 3 runs a game). So let’s assume that a very average reliever came into the game to pitch the 7th and half the 8th rather than Lester. They would allow 4 runs a game. That is very pedestrian for a reliever. Almost any short reliever can do that with his eyes closed. In order to make up the .3 runs we lost by letting Lester hit, Lester needs to allow fewer runs than 4 runs a game. How much less? Well, .3 runs in 1.5 innings is .2 runs per inning. .2 runs per inning times 9 innings is 1.8 runs. So Lester would have to pitch like a pitcher who allows 2.2 runs per 9 innings. No starting pitcher like that exists. Even the best starter in baseball, Clayton Kershaw, is a 2.5 run per 9 pitcher at best.
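For anyone who wants to redo that break-even arithmetic with different inputs, here is a minimal sketch. The function name and arguments are mine, chosen for illustration; the numbers plugged in are the ones from the paragraph above.

```python
def break_even_ra9(baseline_ra9, runs_to_recover, innings):
    """Runs per 9 a pitcher must allow to beat `baseline_ra9` by
    `runs_to_recover` total runs over `innings` innings pitched."""
    runs_per_9_needed = runs_to_recover / innings * 9
    return baseline_ra9 - runs_per_9_needed


# Lester versus a 4.00 reliever, needing to win back .3 runs in 1.5 innings.
needed = break_even_ra9(baseline_ra9=4.0, runs_to_recover=0.3, innings=1.5)
print(f"{needed:.1f}")  # 2.2
```

The result is very sensitive to `innings`: the fewer innings Lester is expected to pitch after hitting, the more impossibly good he has to be to pay back the lost runs.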

Let’s go another route. Remember that I said Lester was probably around a 3.5 run pitcher (Steamer, a very good projection system, has him projected with a 3.60 FIP, which is around a 3.5 pitcher in my projection system), but that the third time through the order he is probably a 3.80 or 3.90 pitcher. Forget about that. Let’s decree that Lester is indeed going to pitch the 7th and 8th innings, or however long he continues, like an ace reliever. Let’s call him a 3.00 pitcher, not the 3.80 or 3.90 pitcher that I think he really is, going into the 7th inning.

In case you are wondering, there is no evidence that good or even great pitching through 6 or 7 innings predicts good pitching for future innings. Quite the contrary. Even starters who are pitching well have the times through the order penalty, and if they are allowed to continue, they end up pitching worse than they do overall in a random game. That is what real life says. That is what happens. It is not my opinion, observation, or recollection. A wise person once said that, “Truth comes from evidence and not opinion or faith.”

But, again, we are living on Planet Farrell, so we are conceding that Lester is a great pitcher going into the 7th inning and the third time through the order. (Please don’t tell me how he did that inning. If you do or even think that, you need to leave and never come back. Seriously.)  We are calling him a 3.0 pitcher, around the same as a very good closer.

How bad does a replacement for Lester for 1.5 innings have to be to make up for that .3 runs? Again, we need .2 runs per inning, times 9 innings, or a total of 1.8 runs per 9. So the reliever to replace him would have to be a 4.8 pitcher. That is a replacement pitcher, folks. There is no one on either roster who is even close to that.

So there you have it. Like Keith Olbermann’s “Worst Person in the World,” we have the worst manager in baseball – John Farrell.

Addendum: Please keep in mind that some of the hyperbole and rhetoric is just that. Take comments like, “Farrell is an idiot,” or, “the worst manager in baseball,” with a grain of salt and chalk it up to flowery emotion. It is not relevant to the argument of course. The argument speaks for itself, and you, the reader, are free to conclude what you want about whether his moves, or any other managerial moves that I might discuss, were warranted or not.

I am not insensitive to factors that drive all managers’ decisions, like the reaction, desires, and opinions of the fans, media, upper management, and especially, the players. As several people have pointed out, if a manager were to do things that were “technically” correct, yet in doing so, alienate his players (and/or the fans) thereby affecting morale, loyalty, and perhaps a conscious or subconscious desire to win, then those “correct” decisions may become “incorrect” in the grand scheme of things.

That being said, my intention is to inform the reader and to take the hypothetical perspective of informing the manager of the relevant and correct variables and inputs such that they and you can make an informed decision. Imagine this scenario: I am sitting down with Farrell and perhaps the Red Sox front office and we are rationally and intelligently discussing ways to improve managerial strategy. Surely no manager can be so arrogant as to think that everything he does is correct. You would not want an employee like that working for your company no matter how much you respect him and trust his skills. Anyway, let’s say that we are discussing this very same situation, and Farrell says something like, “You know, I really didn’t care whether I removed Lester for a pinch hitter or not, and I don’t think he or my players would either. Plus, the preservation of my bullpen was really a secondary issue. I could have easily used Morales, Dempster, or even Breslow again. Managers have to make tough decisions like that all the time. I genuinely thought that with Lester pitching and us already being up a run, we had the best chance to win. But now that you have educated me on the numbers, I realize that that assumption on my part was wrong. In the future I will have to rethink my position if that or a similar situation should come up.”

That may not be a realistic scenario, but that is the kind of discussion and thinking I am trying to foster.

MGL

If you followed my tweets last night, you know the answer. They both did something very wrong, got away with it, and then got punished for something that was not their fault!

Disclaimer: I actually believe that there is a good chance that OJ is not guilty and that his oldest son Jason, was the real culprit. Check out this book if you are interested in another point of view. Your guess is as good as mine as to whether the information in the book is made up or not. If it isn’t, there is a whole “nother” side to the story.

(If you want to “skip to the chase” go to the 6th paragraph from the bottom starting with, “So let’s see if…“)

Well, that was rich. We went from a mildly funny joke to a serious “ting.”

Back to baseball. Rob Neyer beat me to the punch on this one, but he has not told you the whole story either. So I am here to tell you the rest of the story as Paul Harvey used to do so well back in the day.

Rob talks about Matheny’s mistake of not using Choate, his LOOGY, against Ortiz in the top of the 6th inning with a runner on first and 2 outs. That wasn’t the big mistake. The mistake was letting Lynn, the starter, start the inning and pitch to Ellsbury, Nava, Pedroia, and Ortiz. It was the start of Lynn’s third time through the order. We all know about the “times through the order” penalty for starters. I, and many others, have been talking about this a lot lately. It is the new Moneyball (not really, but that sounds cool).

On top of that, 3 out of the first 4 batters due up that inning are lefty batters (Nava is a switch hitter, much better from the left side). And, Lynn has a pretty big platoon split, mainly because he throws from a three quarter arm slot, which is fairly unusual for a RH pitcher. Nonetheless, he is excellent versus RH batters and very mediocre versus lefties. The third time through the order, he is close to replacement level versus lefty batters. So, essentially Matheny’s choice starting the 6th inning, was to have a replacement level pitcher pitch to 3 of Boston’s first 4 batters and a slightly better than average pitcher (even against a righty, the third time through the order, Lynn becomes almost average) for the other one. That is not a good choice in the 4th game of the World Series.

So his decision was really fait accompli long before Big Papi stepped to the plate. Now, you don’t want to bring in Choate to face Ellsbury because then he has to face Nava (or a pinch hitter like Napoli) from the right side, and then Pedroia from the right side, before he faces another lefty in Ortiz. And you do not want Randy Choate anywhere near a right handed batter. I mean if he just walks by a righty in the clubhouse, I think a ball goes careening off the walls. He is terrible against RHB. Just awful. Worse than replacement. Your right-handed grandmother would be better.

So let’s see, does Matheny have someone in the pen who can get out lefties and righties? Hmmm. Let’s see. No, I don’t…Wait a minute. There is this guy named Siegrest, I think, who throws with his left hand, can fire the ball into the catcher’s mitt oh, maybe 95 mph or so. Let’s see, his career (albeit in a small sample) wOBA against lefties is .195 and .216 versus righties. You think maybe this guy is the man for the job? To face 3 lefties, a righty and a switch hitter who can’t hit lefty pitchers? Or would you rather use a near-replacement level pitcher in Lynn?

Oh yeah, Lynn is throwing a 1 hitter so far and Siegrest once gave up a home run to Ortiz (I think it was a couple days ago, but I’m not sure – like most managers, I have little long-term memory anymore).

Yeah right! Having given up a home run to Ortiz is worthless as far as pitching to him now, and the fact that Lynn is pitching a one hitter has almost zero predictive value and doesn’t negate the fact that he is likely a crap pitcher facing the lineup for the third time, with 3 out of the first 4 lefties to boot!

Anyway, you know what went down. Lynn retires two batters, gives up a hit and a walk to Ortiz (pitching to Ortiz was the piece de resistance of Matheny’s utter cluelessness), Maness comes in to pitch to Gomes (fine move, but too little, too late) and bang!

But, let’s not worry at all about the results. The correctness or not of his moves has nothing whatsoever to do with what ensued in that inning or whether the Cards lost the game or not. A decision is to be judged solely on what we know at the time it was made. It was only ironic that when he finally brought in the right pitcher, everything blew up in his face.

For the record, if you were not following my tweets last night, just as he brought in Maness to pitch to Gomes, and after I had been screaming bloody murder, I tweeted this.

Let’s see if we can figure out about how much win expectancy Matheny cost his team by his “non moves” in the 6th, since, really, that is the only thing that counts in terms of evaluating his decisions – not how it turned out (please, memorize that and recite every night 10 times before you go to bed).

Overall, I project Lynn as a pitcher who allows 75% league average runs versus RHB and 108% versus LHB. That’s a large split for a starter. Compare that to Buchholz, who is 88% and 100%. The third time through the order, a good rule of thumb is to add 10% to those numbers. So Lynn becomes an 85%/118% pitcher, not too good, especially the latter number.

Siegrest, on the other hand, is terrific against both RHB and LHB. I have his projection as 83% and 54%, respectively. Compare that, BTW, to Choate, at (wait, get a barf bag ready) 165% and 68%. You don’t have to take these numbers as the gospel. There are certainly error bars around them, but it doesn’t really matter. We know about the times through the order penalty, we know that Lynn, at his best, is no Adam Wainwright, we are pretty sure that Lynn has a large true platoon split, and we are pretty sure that Siegrest is a really, really good reliever with very small platoon splits.

The average leverage during these 4 batters was around 1.25. So any run impact we get is multiplied by that number. Against Ellsbury, the difference between Siegrest and Lynn is around .07 runs. You’ll just have to take my word for it since it is 2 in the AM and I am tired of writing. Nava, around the same even though he is a switch hitter, since he hits almost like a lefty only. Pedroia is around a .002 difference only. And Ortiz is around .08. These are all  ballpark numbers, no pun intended. Add them all up and multiply by 1.25 (the average LI), and we get a grand total of .22 runs or .022 wins, which is 2.2% in WE.

That is huge folks! Ginormous! A couple of days ago in a post I wrote on SBN, I think, I constructed a set of criteria for what I called Category I, II, III, and IV mistakes by a manager. Category I contained the most egregious ones, and I think I said that those cost 1-2% in WE. I can’t imagine making any mistakes that cost a team more than that.

2.2%?

I may have to invent a new category.

I am afraid OJ’s got nothing on Matheny!

Edit: The following article was edited and revised from its original version. There were some mistakes and coding errors. I take full responsibility for the errors.

It goes without saying, per conventional wisdom at least, that when your starter is pitching a shutout and he has not thrown too many pitches, say, less than 90 or even 100, you leave him in there until his pitch count is elevated, he gets into some trouble, or you have to pinch hit for him in a close game or one in which you are losing. And even this last consideration is sometimes ignored – you often see a manager let an NL pitcher hit in the 6th-8th inning in a close or losing contest, if he is pitching well.

Not only is this conventional wisdom, but it would be heresy to think otherwise. For example, in the NLCS game 4, Lynn, the Cardinals starter, came out to pitch the 5th inning in a close game. No one thought anything of it even though he was facing the batting order for the 3rd time, and he wasn’t even pitching a shutout – he had allowed 2 runs in 4 innings. In game 2 of the ALCS, Tigers manager Jim Leyland was mildly criticized for taking Scherzer out after 7 innings – he had thrown 108 pitches and allowed 2 hits and 1 run.

So what happens when we let a starter pitch another inning when he is throwing a shutout? Surely we know the answer to that – at least managers do, right? Yeah, right! Managers do 100’s of things right and wrong, when they have no idea what the numbers are (somehow they think they do, I guess). I truly find it hard to believe that after 100 some odd years of baseball no one can tell us what happens when a starter is pitching a shutout versus, say, after allowing a couple of runs, or even 5 or 6. Managers will gladly remove a starter after 5 or 6 innings when they have allowed 4 or 5 runs, but almost never do that when they are pitching a shutout. I realize that some of that has to do with not alienating your players, building confidence and stamina in young starters, etc. You can’t, I guess, be yanking pitchers left and right in the early innings when they have been pitching well.

However, there is not a manager alive, I don’t think, or most anyone else for that matter, who does not think that a pitcher who is pitching well through 4, 5, 6, or 7 innings will continue to pitch well, as long as his pitch count is reasonable (and he is a regular starter who is used to throwing 6 or 7 innings, etc.).

We addressed this issue to some extent in The Book. If you want to know what that research had to say, you’ll have to consult your copy or buy one. I also posted about this earlier in the year on The Book blog.

Here is some new research:

I looked at all games in which the starter was pitching a shutout so far and how they did in the subsequent inning. For example, if they were pitching a shutout after 1 inning, I looked at how they performed in the second inning. If they pitched a shutout through 2 innings, I looked at how they pitched in the 3rd inning. Etc.

I could not just look at innings that they completed – that would be a biased sample. Completed innings tend to be good ones and partial innings, when a pitcher is yanked mid-inning, tend to be bad ones. So if a pitcher was yanked mid-inning, after facing at least one batter, I used the run expectancy of the base/out state when they left as a proxy for the number of runs that they allowed that inning. In other words, I assumed that the starter completed the inning and that he allowed a number of runs equal to the RE when he left (plus any runs he did allow). That is a little bit of a fudge, but I don’t think that it is a huge deal.
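To make the partial-inning fudge concrete, here is a rough sketch of how a single inning might be charged. The run expectancy values in the lookup table are made-up placeholders for illustration only, not the actual RE matrix used in the study, and the function name is my own.

```python
# Hypothetical run expectancies keyed by (runners on base, outs).
# These values are placeholders, NOT the RE matrix used in the research.
ILLUSTRATIVE_RE = {
    ("---", 0): 0.50,
    ("1--", 1): 0.52,
    ("12-", 1): 0.90,
    ("123", 2): 0.75,
}


def charged_runs(runs_allowed, exit_state=None, re_table=ILLUSTRATIVE_RE):
    """Runs charged to the starter for one inning.

    If he finished the inning, exit_state is None and he is charged only
    the runs that scored. If he was pulled mid-inning, he is also charged
    the run expectancy of the base/out state he left behind.
    """
    if exit_state is None:
        return float(runs_allowed)
    return runs_allowed + re_table[exit_state]


print(charged_runs(0))               # completed a clean inning
print(charged_runs(1, ("12-", 1)))   # pulled with 1 run in, two on, one out
```

Multiplying the average of these charged runs by 9 then puts them on the same runs-per-game scale as the numbers reported below.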

In presenting the numbers, the number of runs allowed in each inning after pitching a shutout so far, I adjusted for the quality of the batters in that inning. For example, if a pitcher pitches a shutout through one inning, he likely is facing the middle or bottom of the order in the second. If he allowed 3 or 4 runs, he is likely back to the top of the order in the second inning (and he is facing them for the second time). I also multiplied the runs per inning, typically around .5 of course, by 9, in order for it to look like a runs allowed per game.

I also adjusted for park factors. Shutouts tend to occur in pitchers’ parks and when pitchers allow lots of runs, that tends to occur in hitters’ parks. These are minor tendencies.

Column 3, next to the adjusted runs allowed, is the pitchers’ collective actual runs allowed for that season. This is the expected number of runs allowed in any inning (once you adjust for the batters in that inning). So comparing column 2 with column 3 is the key to this analysis. If there is a carryover effect to pitching a shutout, we would expect column 2, runs allowed in that inning, to be less than column 3, runs allowed per 9 for the whole season for these pitchers. If there is a carryover effect for getting shelled (allowing 4 or more runs), we expect column 2 to be greater than column 3.

Keep in mind that it is expected that after 1 or more shutout innings, subsequent innings will be slightly worse than seasonal numbers. What I mean by that is this: If a pitcher allows exactly .5 runs per inning for an entire season, in any inning other than 1 or more shutout innings, he is naturally going to allow slightly more than .5 runs. In other words, in any game where that pitcher has some shutout innings, the other innings will be slightly worse than his overall seasonal average – and vice versa for games in which a pitcher allows lots of runs through 1 or more innings. I adjusted for this by subtracting out the requisite number of shutout or shelled innings. So for example, if the seasonal RA9 were 5.00, then we might expect a starter to allow 5.15 runs per 9 after 5 shutout innings. After 5 innings of 4 runs or more, we might expect an RA9 of 4.90. These adjusted numbers are what is presented in column 3.
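One plausible way to picture that adjustment is sketched below. It removes the known innings (and the runs allowed in them) from the season totals and recomputes runs per 9 on what is left. This is an assumption about the mechanics, not necessarily the exact method used here, and the 170-inning season is a hypothetical workload picked only for illustration.

```python
def expected_ra9_excluding(season_ra9, season_innings,
                           removed_innings, removed_runs):
    """Runs per 9 over the rest of the season once a block of innings
    (and the runs allowed in them) is taken out of the season totals."""
    season_runs = season_ra9 * season_innings / 9
    remaining_runs = season_runs - removed_runs
    remaining_innings = season_innings - removed_innings
    return remaining_runs / remaining_innings * 9


# Hypothetical 170-inning season with a 5.00 RA9.
after_shutout = expected_ra9_excluding(5.00, 170, removed_innings=5, removed_runs=0)
after_shelled = expected_ra9_excluding(5.00, 170, removed_innings=5, removed_runs=4)
print(round(after_shutout, 2), round(after_shelled, 2))
```

Under these assumed inputs, the expectation rises after the shutout innings are removed and falls after the shelled innings are removed, which is the direction described above.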

Anyway, here is the data:

These are pitchers who are pitching a shutout through X-1 innings, where the first column is inning X. The second column is the number of runs allowed in inning X for pitchers who allowed no runs in innings 1 through X-1, adjusted for the park and the batters in that inning. Remember these are not quite actual runs allowed (almost). They are runs allowed adjusted for batters and park. And remember that they include the run expectancy for that inning when a starter leaves mid-inning.

The time period I studied was 1998-2012, and I limited the sample to games in AL parks only for various reasons. As I said, the runs in each inning were prorated to 9 innings (multiplied by 9) so that they look like runs scored per game per team.

The 4th column is the league-wide average number of runs scored in that inning (for all games and all pitchers), also adjusted for the batter pool in that inning, but not adjusted for the quality of all pitchers who pitch in that inning.

The last column is the percentage of batters who batted from the same side as the pitcher throws in that inning. As starters stay later in the game, they tend to face more same side batters, which makes sense. This column is more FYI only, although to some extent it can affect the numbers.

Note that the 5th column is now opponent runs allowed (the pitching team’s runs scored) through that inning, prorated to 9 innings. This gives us an idea as to the run scoring environment, although it appears to be dependent on how long the starter pitches so don’t put too much stock in it.

Shutout prior to inning X

Inning  Adj RA  Pitchers’ season RA9  Lg Avg runs  Pitchers’ team RS  Batter wOBA  Park Factor  Pitcher platoon adv
2       4.68    5.07                  4.83         4.88               .335         1.00         41.1%
3       4.90    5.00                  5.16         4.87               .331         1.00         40.4%
4       4.73    4.94                  5.13         4.96               .346         1.00         40.0%
5       5.05    4.86                  5.25         4.94               .326         1.00         41.4%
6       4.85    4.80                  5.15         4.96               .340         1.00         39.5%
7       4.92    4.70                  4.86         4.95               .334         .99          40.3%
8       4.80    4.61                  4.61         4.69               .322         .99          39.8%

I don’t know about you, but I think that those numbers in the runs allowed column are a little troubling. They should be to managers too – virtually no one thinks that you should take out a pitcher who is throwing a shutout and has a reasonable pitch count, because surely he is going to continue to pitch well, right?

In the first few innings, we see a small carry over effect from throwing a shutout thus far – maybe (it could just be a lower run environment, other than the park, for various reasons – weather, umpires, year, etc.). After 4 shutout innings, we don’t see any carryover effect at all – in fact, we see starters pitching worse than they normally do.

As it turns out, the deeper in the game they go, despite pitching very well so far, the more times they face the lineup. By the 6th, 7th, and 8th innings, they are facing the lineup for the 3rd and 4th time. Look at the 7th and 8th innings. Starters pitching a shutout are allowing between .06 and .19 runs per 9 more than the average pitcher (mostly relievers, of course) in those innings!

In the 5th and 6th innings, these pitchers are allowing roughly .05 to .20 more runs per 9 than they typically allow for the season as a whole. In the 7th and 8th, they are still allowing about .20 more runs per 9 than usual! Again, that is the times through the order penalty. There is no carry over from pitching a shutout that trumps that penalty.
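The gaps discussed here are simple differences between table columns. A minimal sketch that recomputes them from the shutout table's rows for innings 5 through 8 follows; the dictionary layout and the function name are illustrative and not part of the original analysis.

```python
# Rows from the "Shutout prior to inning X" table:
# inning -> (adjusted RA, pitchers' season RA9, league average runs)
shutout = {
    5: (5.05, 4.86, 5.25),
    6: (4.85, 4.80, 5.15),
    7: (4.92, 4.70, 4.86),
    8: (4.80, 4.61, 4.61),
}

def gaps(row):
    """Return (adjusted RA minus season RA9, adjusted RA minus league average)."""
    adj_ra, season_ra9, lg_avg = row
    return round(adj_ra - season_ra9, 2), round(adj_ra - lg_avg, 2)

vs_season = {inning: gaps(row)[0] for inning, row in shutout.items()}
vs_league = {inning: gaps(row)[1] for inning, row in shutout.items()}

for inning in shutout:
    print(inning, vs_season[inning], vs_league[inning])
```

A positive number in the first gap means the starter is allowing more runs per 9 in that inning than his own seasonal rate; a positive number in the second means he is allowing more than the league-average pitcher in that inning.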

And look how good the pitchers are (column 3) who make it deep into the game. By the time we get into the 7th and 8th innings, only aces are allowed to continue, on the average. But still, they pitch more like middle relievers.

The number of times a starter faces the batting order is way, way, way more important than how he has been pitching. I cannot emphasize that enough and it may be the single most important thing that managers (and everyone else) should get through their thick skulls!

Let’s look at the same chart for all pitchers who have allowed at least 4 runs prior to the listed inning.

4 or more runs allowed prior to inning X

Inning  Adj RA  Pitchers’ season RA9  Lg Avg runs  Pitchers’ team RS  Batter wOBA  Park Factor  Pitcher platoon adv
2       5.90    5.53                  4.90         5.01               .347         1.01         41.6%
3       5.02    5.40                  5.21         4.94               .342         1.01         40.8%
4       5.72    5.28                  5.08         4.94               .335         1.01         41.4%
5       5.45    5.20                  5.27         5.05               .350         1.00         41.3%
6       5.25    5.02                  5.23         4.91               .332         1.00         42.6%
7       5.21    4.83                  4.88         4.54               .336         1.00         43.5%
8       4.78    4.74                  4.59         3.99               .346         1.00         45.8%

Other than the 3rd inning, here we also see a small carry over effect in the other direction from pitching badly in the first few innings. By the 5th inning, however, as with the shutout pitchers, the times through the order penalty is evident with very little additional carry over effect. Of course allowing 4 or more runs after 4 or 5 innings is not terrible pitching.

What about pitch count? How does that play into it?

Let’s look at pitchers who are throwing a shutout, but we’ll only look at those innings which they start with fewer than 100 pitches:

Shutout so far, under 100 pitches going into inning

Inning  Adj RA  Pitchers’ season RA9  Lg Avg runs  Pitchers’ team RS  Batter wOBA  Park Factor  Pitcher platoon adv
2       4.68    5.07                  4.83         4.88               .335         1.00         41.1%
3       4.90    5.00                  5.16         4.87               .331         1.00         40.4%
4       4.73    4.94                  5.13         4.96               .346         1.00         40.0%
5       5.05    4.86                  5.25         4.94               .326         1.00         41.4%
6       4.85    4.80                  5.15         4.97               .340         1.00         39.5%
7       4.95    4.69                  4.86         4.98               .334         .99          40.2%
8       4.71    4.65                  4.61         4.78               .322         .99          39.7%

So even at fewer than 100 pitches going into the mid to late innings of a shutout game, you are going to give up lots of runs. Pitch count, and I presume fatigue, appears to have little to do with why pitchers, even when throwing well, give up lots of runs late in the game. Times through the order, times through the order, times through the order!

Pitching a shutout, but over 100 pitches going into the 7th or 8th inning

Inning  Adj RA  Pitchers’ season RA9  Lg Avg runs  Pitchers’ team RS  Batter wOBA  Park Factor  Pitcher platoon adv
7       4.39    4.78                  4.86         4.61               .326         .99          43.6%
8       5.19    4.38                  4.61         4.27               .332         .99          40.4%

If you do have a high pitch count while throwing a shutout, you are worse in the 8th, but better in the 7th, overall about the same. The sample sizes are small (205 and 333 innings, respectively) so these numbers are not particularly reliable. Notice that only aces are allowed to pitch into the 8th inning with a high pitch count. Nevertheless, in that inning they allow a lot more runs than they normally do – .81 more runs per 9.

What if you have given up 4 or more runs, but still have a low pitch count (< 100)?

Allowing 4 runs or more – fewer than 100 pitches going into the 6th–8th inning.

Inning  Adj RA  Pitchers’ season RA9  Lg Avg runs  Pitchers’ team RS  Batter wOBA  Park Factor  Pitcher platoon adv
6       5.21    5.03                  5.23         4.91               .331         1.00         42.7%
7       5.15    4.86                  4.88         4.56               .333         1.00         43.5%
8       4.42    4.69                  4.59         3.96               .344         1.00         48.6%

And finally, here is what happens when pitching badly (4 or more runs allowed) and you have a high pitch count:

Allowing 4 runs or more – more than 100 pitches going into the 6th–8th inning.

Inning  Adj RA  Pitchers’ season RA9  Lg Avg runs  Pitchers’ team RS  Batter wOBA  Park Factor  Pitcher platoon adv
6       5.69    4.96                  5.23         4.96               .335         1.00         41.7%
7       5.33    4.75                  4.88         4.49               .344         1.00         43.6%
8       5.18    4.80                  4.59         4.03               .348         1.00         42.6%

As you can see, if you have allowed lots of (4 or more) runs going into the middle and late innings, it matters how many pitches you have thrown. If you have thrown more than 100 pitches, you will allow anywhere from .03 to .07 more runs (in the next inning) than if you have thrown fewer than 100. Again, small sample size warning for these numbers!
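One way to arrive at a per-inning range like that is to compare each group with its own seasonal RA9 and then convert the gap from runs per 9 to runs per inning. This is a sketch under that assumption about the derivation; the variable and function names are illustrative, and the inputs are the two tables for pitchers who have allowed 4 or more runs.

```python
# inning -> (adjusted RA, pitchers' season RA9), both in runs per 9
under_100 = {6: (5.21, 5.03), 7: (5.15, 4.86), 8: (4.42, 4.69)}
over_100 = {6: (5.69, 4.96), 7: (5.33, 4.75), 8: (5.18, 4.80)}

def extra_runs_per_inning(inning):
    """Extra runs in the next inning for the over-100 group versus the under-100
    group, each measured against that group's own seasonal RA9."""
    over_excess = over_100[inning][0] - over_100[inning][1]
    under_excess = under_100[inning][0] - under_100[inning][1]
    return (over_excess - under_excess) / 9.0

extra = {inning: round(extra_runs_per_inning(inning), 2) for inning in (6, 7, 8)}
print(extra)
```

Under that reading, the three innings come out between .03 and .07 runs per inning, in line with the range quoted above.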

Summary

To summarize these results, a starter who is throwing a shutout does not appear to allow runs in any subsequent inning much less than what he normally allows. His runs allowed in innings 2-4 are slightly lower than league average as well as what he normally allows based on that season’s stats, but that could be due to an overall depressed run environment in these games. Once he gets into the middle and late innings, where he is facing the order for the 3rd or 4th time, he experiences the normal “times through the order” penalty despite the fact that he has not yet allowed any runs to score. In other words, even a very good pitcher throwing a shutout is not so good starting in the 5th or 6th inning. Don’t expect a pitcher who is pitching a shutout to continue to pitch well into the middle and late innings even with a low or moderate pitch count. There is little carry over effect and the times through the order penalty is too powerful.

Pitch count does not seem to be a factor for pitchers throwing a shutout. Whether they are under or over 100 pitches going into the middle to late innings, they pitch about the same – in a very mediocre fashion.

Pitchers who allow 4 or more runs through X innings don’t really continue to pitch badly other than innings 2 and 4 where they pitch a little worse than they usually do. For some reason in inning 3, they pitch very well – I have no idea why that might be. Maybe batters the second time through the order are acting sub-optimally since they scored 4 or more runs in the first 2 innings or maybe the starters are trying to make up for a poor first 2 innings. Or perhaps it is just a random anomaly (although it is a pretty large sample size – 2031 innings).

Pitch count does seem to be a factor for pitchers who have allowed 4 or more runs. Under 100 and you actually pitch better in the later innings than pitchers who are throwing a shutout, relative to how you pitch normally! Over 100 and it is time for you to hit the showers, although your performance in the late innings is not terrible.

I have Miller as a way better pitcher than Kelly – more than a half run per 9. Miller is projected very well by Steamer, had a very good FIP this season, and has had fantastic K and BB rates in his minor and major league careers thus far.

Is there something wrong with him?

While we are on the subject of mediocre pitchers (Kelly), surely you want to pinch hit for him leading off the 5th inning in today’s game, down 3-2. Of course I want to get mediocre starting pitchers out of the game as soon as possible, preferably after facing the order at most 2 times. Which brings up another interesting point: When a starter gets in trouble early and then settles down a bit, why is it important to still get him out of the game as soon as possible? Because he has likely burned through the order twice by virtue of getting in trouble! Folks – and I know I sound like a broken record – times through the order is everything for a starting pitcher. Not “how they are pitching” or pitch count, or anything else. Times. Through. The. Order.

You want a funny/ironic illustration of the nonsense that is spewed by commentators/ex-pitchers about how pitchers “are doing during the game?” After Greinke was in all kinds of trouble in the first inning, and then quickly retired the Cardinals pitcher (yes, the pitcher) to lead off the 2nd, Ron Darling, who I’ve lost all respect for (never really had much in the first place), remarked that “Greinke was now locked in after a shaky first.” What happened next? He gave up 4 straight hits and 2 runs!

And I’m not sure why Choate didn’t pitch to A-Gon and Ethier in the 8th. Isn’t that what he is there for (he is less than worthless versus RHB)? Gotta punish Mattingly for back to back lefties in the order.

 

No, Benoit and Leyland were not caught using coke. In fact, if you listen to and watch the Tigers’ skipper, you might think he was on anything but cocaine.

In any case, I am wondering if the correct pitcher to face Big Papi wasn’t Coke rather than Benoit. Benoit actually has a career reverse split, likely because he throws so many change ups, but my platoon projection for him is still positive. Coke has a pretty large platoon split, so, accordingly, I have Coke as much better against lefties than Benoit. Even strictly using career numbers, Coke has actually done a little better (wOBA-wise) against LHB than Benoit. And since managers are often reluctant to use closers for more than one inning, it would not be totally unreasonable to nick Benoit just a tad for coming into the 8th inning rather than the 9th.

Now, once the Sox tie up the game and Detroit doesn’t score in the top of the 9th, I can’t think of any reason not to leave in Benoit to pitch the bottom of the 9th. Once he’s out, he obviously can’t come back if the Tigers should take the lead in extra innings. It is hard to say what kind of a pitcher Porcello is as a reliever, but he is probably not one of their better short relievers. In other words, the difference between Porcello and Benoit in the 9th is likely very large.

Leyland has also been criticized in the media and blogosphere for not letting Scherzer go a little longer. I will almost never fault a manager for taking out a starter “too soon.” As many of you know, I am a big believer in taking out starters as soon as possible (especially, especially, especially non-ace starters), since almost all relievers are better than starters once the starter faces the lineup for the 3rd time, especially since you can start looking for favorable platoon match ups. Even a great pitcher like Scherzer gets a lot worse the 3rd and 4th times he faces the batting order. In addition, there is no evidence that a pitcher who is thus far pitching a great game through 6 or 7 innings, is any more likely to continue pitching well than is a pitcher who has been pitching less well. So I think taking out Scherzer after 7 is just fine as long as you replace him with an excellent short reliever, including the effects of chaining.