Archive for November, 2013

I just downloaded my Kindle version of the brand spanking new Hardball Times Annual, 2014 from Amazon.com. It is also available from Createspace.com (best place to order).

Although I was disappointed with last year’s Annual, I have been very much looking forward to reading this year’s, as I have enjoyed it tremendously in the past and have even contributed an article or two, I think. To be fair, I am only interested in the hard-core analytical articles, which comprise a small part of the anthology. The book is split into 5 parts, according to the TOC: one, the “2013 Season,” which consists of reviews/views of each of the six divisions plus one chapter about the post-season; two, general Commentary; three, History; four, Analysis; and finally, a glossary of statistical terms and short bios of the various illustrious authors (including Bill James and Rob Neyer).

As I said, the only chapters which interest me are the ones in the Analysis section, and those are the ones that I am going to review, starting with Jeff Zimmerman’s, “Shifty Business, or the War Against Hitters.” It is mostly about the shifts employed by infielders against presumably extreme pull (and mostly slow) hitters. The chapter is pretty good with lots of interesting data mostly provided by Inside Edge, a company much like BIS and STATS, which provides various data to teams, web sites, and researchers (for a fee). It also raised several questions in my mind, some of which I wish Jeff had answered or at least brought up himself. There were also some things that he wrote which were confusing – at least in my 50+ year-old mind.

He starts out, after a brief intro, with a chart (BTW, if you have the Kindle version, unless you make the font size tiny, some of the charts get cut off) that shows the number, BABIP, and XBH% of plays where a ball was put into play with a shift (and various kinds of shifts), no shift, no doubles defense (OF deep and corners guarding lines), infield in, and corners in (expecting a bunt). This is the first time I have seen any data with a no-doubles defense, infield in, and with the corners up anticipating a bunt. The numbers are interesting. With a no-doubles defense, the BABIP is quite high and the XBH% seems low, but unfortunately Jeff does not give us a baseline for XBH% other than the values for the other situations, shift, no shift, etc., although I guess that pretty much includes all situations. I have not done any calculations, but the BABIP for a no-doubles defense is so high and the reduction in doubles and triples is so small, that it does not look like a great strategy off the top of my head. Obviously it depends on when it is being employed.

The infield-in data is also interesting. As expected, the BABIP is really elevated. Unfortunately, I don’t know if Jeff includes ROE and fielder’s choices (with no outs) in that metric. What is the standard? With the infield in, there are lots of ROE and lots of throws home where no out is recorded (a fielder’s choice). I would like to know if these are included in the BABIP.
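To make the question concrete, here is a minimal sketch of the conventional BABIP formula, (H - HR) / (AB - K - HR + SF), next to a variant that counts reaching on an error as a success. I do not know which one Inside Edge or Jeff used; the function names and the sample line are invented for illustration only.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Conventional BABIP: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play


def babip_counting_roe(hits, home_runs, at_bats, strikeouts, sac_flies,
                       reached_on_error):
    """Variant (an assumption, not a standard) that credits ROE as a success.

    An ROE is already charged as an at-bat, so only the numerator changes.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs + reached_on_error) / balls_in_play


# Hypothetical infield-in line, invented for illustration only.
line = dict(hits=40, home_runs=0, at_bats=100, strikeouts=0, sac_flies=0)
standard = babip(**line)
with_roe = babip_counting_roe(**line, reached_on_error=5)
print(f"standard: {standard:.3f}  counting ROE: {with_roe:.3f}")
```

With the infield in, the gap between the two versions could be large, which is why it matters which one is being reported.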

For the corners playing up expecting a bunt, the numbers include all BIP, mostly bunts I assume. It would have been nice had he given us the BABIP when the ball is not bunted (and bunted). An important consideration for whether to bunt or not is how much not bunting increases the batter’s results when he swings away.

I would also have liked to see wOBA or some metric like that for all situations – not just BABIP and XBH%. It is possible, in fact likely, that walk and K rates vary in different situations. For example, perhaps walk rates increase when batters are facing a shift because they are not as eager to put the ball in play or the pitchers are trying to “pitch into the shift” and are consequently more wild. Or perhaps batters hit more HR because they are trying to elevate the ball as opposed to hitting a ground ball or line drive. It would also be nice to look at GDP rates with the shift. Some people, including Bill James, have suggested that the DP is harder to turn with the fielders out of position. Without looking at all these things, it is hard to say that the shift “works” or doesn’t work just by looking at BABIP (and even harder to say to what extent it works).

Jeff goes on to list the players against whom the shift is most often employed. He gives us the shift and no shift BABIP and XBH%. Collectively, their BABIP fell 37 points with the shift and it looks like their XBH% fell a lot too (although for some reason, Jeff does not give us that collective number, I don’t think). He writes:

…their BABIP [for these 20 players] collectively fell 37 points…when hitting with the shift on. In other words, the shift worked.

I am not crazy about that conclusion – “the shift worked.” First of all, as I said, we need to know a lot more than BABIP to conclude that “the shift worked.” And even if it did “work” we really want to know by how much in terms of wOBA or run expectancy. Nowhere is there an attempt by Jeff to do that. 37 points seems like a lot, but overall it could be only a small advantage. I’m not saying that it is small – only that without more data and analysis we don’t know.

Also, when and why are these “no-shifts” occurring? Jeff is comparing shift BIP data to no-shift BIP data and he is assuming that everything else is the same. That is probably a poor assumption. Why are these no-shifts occurring? Probably first and foremost because there are runners on base. With runners on base, everything is different. It might also be with a completely different pool of pitchers and fielders. Maybe teams are mostly shifting when they have good fielders? I have no idea. I am just throwing out reasons why it may not be an apples-to-apples comparison when comparing “shift” results to “no-shift” results.

It is also likely that the pool of batters is different with a shift and no shift even though he only looked at the batters who had the most shifts against them. In fact, a better method would have been a “delta” method, whereby he would use a weighted average of the differences between shift and no-shift for each individual player.
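Here is a rough sketch of what that kind of “delta” calculation might look like. The two players and their numbers are invented, and weighting each player by the smaller of his two BIP counts is my assumption for the sketch (other weights, such as the harmonic mean, are also used).

```python
def delta_method(players):
    """Weighted average of per-player (shift minus no-shift) BABIP differences.

    Each player is (shift_babip, shift_bip, noshift_babip, noshift_bip).
    The weight is the smaller of the two BIP counts (an assumed choice).
    """
    weighted_sum = 0.0
    total_weight = 0
    for shift_babip, shift_bip, noshift_babip, noshift_bip in players:
        weight = min(shift_bip, noshift_bip)
        weighted_sum += weight * (shift_babip - noshift_babip)
        total_weight += weight
    return weighted_sum / total_weight


def pooled_difference(players):
    """Naive comparison: pool all shift BIP and all no-shift BIP, then subtract."""
    shift_hits = sum(p[0] * p[1] for p in players)
    shift_bip = sum(p[1] for p in players)
    noshift_hits = sum(p[2] * p[3] for p in players)
    noshift_bip = sum(p[3] for p in players)
    return shift_hits / shift_bip - noshift_hits / noshift_bip


# Invented data: a high-BABIP hitter who is shifted a lot and a low-BABIP
# hitter who is rarely shifted. Each loses exactly 20 points with the shift on.
players = [
    (0.330, 400, 0.350, 50),
    (0.250, 50, 0.270, 400),
]
delta = delta_method(players)
pooled = pooled_difference(players)
print(f"delta method: {delta:+.3f}  pooled: {pooled:+.3f}")
```

In this made-up example the pooled comparison has the wrong sign because the shift sample is dominated by the better hitter, while the delta method recovers the 20-point drop that each player actually had.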

He then lists the speed score and GB and line drive pull percentages for the top ten most shifted players. The average Bill James speed score was 3.2 (I assume that is slow, but again, I don’t see where he tells us the average MLB score), GB pull % was 80% and LD pull % was 62%. The average MLB GB and LD pull %, Jeff tells us, is 72% and 50%, respectively. Interestingly several players on that list were at or below the MLB averages in GB pull %. I have no idea why they are so heavily shifted on.

Jeff talks a little bit about some individual players. For example, he mentions Chris Davis:

“Over the first four months of the season, he hit into an average of 29 shifts per month, and was able to maintain a .304 BA and a .359 BABIP. Over the last two months of the season, teams shifted more often against him…41 times per month. Consequently, his BA was .250 and his BABIP was .293.

“The shift was killing him. Without a shift employed, Davis hit for a .425 BABIP…over the course of the 2013 season. When the shift was set, his BABIP dropped to .302…”

This reminds me a little of the story that Daniel Kahneman, 2002 Nobel Prize Laureate in Economics, tells about teaching military flight instructors that praise works better than punishment. One of the instructors said:

“On many occasions I have praised flight cadets for clean execution of some aerobatic maneuver, and in general when they try it again, they do worse. On the other hand, I have often screamed at cadets for bad execution, and in general they do better the next time.”

Of course the reason for that was “regression towards the mean.” No matter what you say to someone who has done worse than expected, they will tend to do better next time, and vice versa for someone who has just done better than expected.

If Chris Davis hits .304 the first four months of the season with a BABIP of .359, and his career numbers are around .260 and .330, then no matter what you do against him (wear your underwear backwards, for example), his next two months are likely going to show a reduction in both of these numbers! That does not necessarily imply a cause and effect relationship.
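A small simulation makes the point. This is only a sketch under made-up assumptions: every simulated hitter has the same true .300 BABIP in both periods, nothing changes between periods, and the 150 BIP per period and the .350 cutoff for a “hot” start are arbitrary.

```python
import random


def simulate(num_players=2000, true_babip=0.300, bip_per_period=150, seed=1):
    """Simulate two periods for identical hitters and follow the hot starters."""
    rng = random.Random(seed)

    def period():
        # Observed BABIP over one period of balls in play.
        hits = sum(rng.random() < true_babip for _ in range(bip_per_period))
        return hits / bip_per_period

    results = [(period(), period()) for _ in range(num_players)]
    # Keep only hitters whose first period looked great (an arbitrary cutoff).
    hot = [(first, second) for first, second in results if first >= 0.350]
    first_avg = sum(first for first, _ in hot) / len(hot)
    second_avg = sum(second for _, second in hot) / len(hot)
    return len(hot), first_avg, second_avg


num_hot, first_avg, second_avg = simulate()
print(f"{num_hot} hot starters: first period {first_avg:.3f}, "
      f"second period {second_avg:.3f}")
```

The hot starters fall back toward .300 in the second period even though no one shifted against them (or changed their underwear).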

He makes the same mistake with several other players that he discusses. In fact, I have always had the feeling that at least part of the “observed” success for the shift was simply regression towards the mean. Imagine this scenario – I’m not saying that this is exactly what happens or happened, but to some extent I think it may be true. You are a month into the season and for X number of players, say they are all pull hitters, they are just killing you with hits to the pull side. Their collective BA and BABIP are .380 and .415. You decide enough is enough and you decide to shift against them. What do you think is going to happen and what do you think everyone is going to conclude about the effectiveness of the shift, especially when they compare the “shift” to “no-shift” numbers?

Again, I think that the shift gives the defense a substantial advantage. I am just not 100% sure about that and I am definitely not sure about how much of an advantage it is and whether it is correctly employed against every player.

Jeff also shows us the number of times that each team employs the shift. Obviously not every team faces the same pool of batters, but the differences are startling. For example, the Orioles shifted 470 times and the Nationals 41! The question that pops into my mind is, “If the shift is so obviously advantageous (37 points of BABIP) why aren’t all teams using it extensively?” It is not like it is a secret anymore.

Finally, Jeff discusses bunting to beat the shift. That is obviously an interesting topic. Jeff shows that not many batters opt to do that but when they do, they reach base 58% of the time. Unfortunately, out of around 6,000 shifts where the ball was put into play, players only bunted 48 times! That is an amazingly low number. Jeff (likely correctly) hypothesizes that players should be bunting more often (a lot more often?). That is probably true, but I don’t think we can say how often and by whom. Maybe most of the players who did not bunt are terrible bunters and all they would be doing is bunting back to the pitcher or fouling the ball off or missing. And BTW, telling us that a bunt results in reaching base 58% of the time is not quite the whole story. We also need to know how many bunt attempts resulted in a strike. Imagine a player who attempted to bunt 10 times, fouled it off or missed it 9 times, and reached base once. That is probably not a good result even though it looks like he bunted with a 1.000 average!
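The arithmetic in that 10-attempt example can be sketched like this. The function and its argument names are mine, and even the per-attempt rate is only a rough guide, since a fouled or missed bunt usually just adds a strike rather than ending the plate appearance.

```python
def bunt_rates(reached, outs_in_play, strikes_on_attempts):
    """Return (rate per bunt put in play, rate per bunt attempt)."""
    in_play = reached + outs_in_play
    attempts = in_play + strikes_on_attempts
    return reached / in_play, reached / attempts


# The example above: 10 attempts, 9 fouled off or missed, 1 bunt in play
# on which the batter reached base.
per_bip, per_attempt = bunt_rates(reached=1, outs_in_play=0,
                                  strikes_on_attempts=9)
print(f"{per_bip:.3f} per bunt in play, {per_attempt:.3f} per attempt")
# prints "1.000 per bunt in play, 0.100 per attempt"
```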

It is also curious to me that 7 players bunted into a shift almost 4 times each, and reached base 16 times (a .615 BA). They are obviously decent or good bunters. Why are they not bunting every time until the shift is gone against them? They are smart enough to occasionally bunt into a shift, but not smart enough to always do it? Something doesn’t seem right.

Anyway, despite my many criticisms, it was an interesting chapter and well-done by Jeff. I am looking forward to reading the rest of the articles in the Analysis section and if I have time, I will review one or more of them.

This is a follow up to my article on baseballprospectus.com about starting pitcher times through the order penalties (TTOP).

Several readers wondered whether pitchers who throw lots of fastballs (or one type of pitch) have a particularly large penalty as opposed to pitchers who throw more of a variety of pitches. The speculation was that it would be harder or take longer for a batter to acclimate himself to a pitcher who has lots of different pitches in his arsenal. As well, since most starters tend to throw more fastballs the first time through the order, those pitchers who follow that up with more off-speed pitches for the remainder of the game would have an advantage over those pitchers who continue to throw mostly fastballs.

First I split all the starters up into 3 groups: One, over 75% fastballs, two, under 50% fastballs, and three, all the rest. The data is from 2002-2012. I downloaded pitcher pitch type data from fangraphs.com. The results will amaze you.

| FB % | N (Pitcher Seasons) | Overall | First Time | Second Time | Third Time | Fourth Time | Second Minus First | Third Minus Second | Fourth Minus Third |
|---|---|---|---|---|---|---|---|---|---|
| > 75% | 159 | .357 | .341 | .363 | .376 | .348 | .027 | .020 | -.013 |
| < 50% | 359 | .352 | .346 | .349 | .360 | .361 | .003 | .015 | .010 |
| All others | 2632 | .359 | .346 | .361 | .370 | .371 | .015 | .015 | .013 |

Pitchers who throw mostly fastballs lose 35 points in wOBA against by the third time facing the lineup. Those with a much lower fastball frequency only lose 24 points. Interestingly, the former group reverts to better than normal levels the fourth time (I don’t know why that is, but I’ll return to that issue later), but the latter group continues to suffer a penalty as do all the others. Keep in mind that the fourth time numbers are small samples for the first two groups, and that fourth time TBF are only around 15% of first time TBF (i.e., starters don’t often make it past the third time through the order).

The takeaway here is that a starter’s pitch repertoire is extremely important in terms of how long he should be left in the game and whether he should start or relieve (we already knew the latter, right?). If we look at columns three and four, we can get an idea as to the difference between a pitcher as a starter and as a reliever, at least as far as times through the order is concerned (there are other considerations, such as velocity – e.g., when a pitcher is a short reliever, he can usually throw harder). The mostly fastball group is 16 points (around .5 runs per 9 innings) more effective the first time through the order than overall, while the low frequency fastball group only has a 6 point (.20 RA9) advantage. Keep in mind that some of that first time through the order advantage for all groups is due to the “first inning” effect (see my original article on BP).
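For readers who want to translate other wOBA differences in this post into runs, here is a back-of-the-envelope sketch. It simply back-solves a runs-per-point factor from the two pairs quoted above (16 points is about .5 RA9, 6 points is about .20 RA9). It is not an official conversion, and the averaging of the two implied factors is my own shortcut.

```python
# (wOBA points, approximate RA9 difference) pairs quoted in the paragraph above.
pairs = [(16, 0.50), (6, 0.20)]

# RA9 per single point of wOBA implied by each pair.
implied = [ra9 / points for points, ra9 in pairs]
ra9_per_point = sum(implied) / len(implied)


def woba_points_to_ra9(points, factor=ra9_per_point):
    """Rough runs-per-9 equivalent of a wOBA difference given in points."""
    return points * factor


print(f"about {ra9_per_point:.3f} RA9 per point of wOBA")
print(f"10 points is roughly {woba_points_to_ra9(10):.2f} RA9")
```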

Next I split the pitchers into four groups based on the number of pitches they threw at least 10% of the time. The categories of pitches (from the FG database) were fast balls, sliders, cutters, curve balls, change ups, splitters, and knuckle balls.
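The grouping rule is simple enough to sketch. The pitch mix below is a hypothetical pitcher season, not a row from the FG data, and the function name is mine.

```python
def repertoire_size(pitch_mix, cutoff=10.0):
    """Count the pitch types thrown at least `cutoff` percent of the time."""
    return sum(1 for pct in pitch_mix.values() if pct >= cutoff)


# Hypothetical pitcher season (percentages of all pitches thrown).
mix = {"fastball": 62.0, "slider": 18.0, "changeup": 12.0, "curveball": 8.0}

size_at_10 = repertoire_size(mix)               # fastball, slider, changeup
size_at_15 = repertoire_size(mix, cutoff=15.0)  # fastball, slider
size_at_20 = repertoire_size(mix, cutoff=20.0)  # fastball only
print(size_at_10, size_at_15, size_at_20)
```

The same pitcher can land in a different group depending on where the cutoff is set.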

| # Pitches in Repertoire (> 10%) | N (Pitcher Seasons) | Overall | First Time | Second Time | Third Time | Fourth Time | Second Minus First | Third Minus Second | Fourth Minus Third |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 41 | .359 | .344 | .370 | .375 | .303 | .027 | .009 | -.061 |
| 2 | 1000 | .358 | .343 | .359 | .371 | .366 | .016 | .018 | .007 |
| 3 | 1712 | .361 | .349 | .362 | .371 | .372 | .013 | .015 | .014 |
| 4 | 378 | .351 | .340 | .351 | .360 | .368 | .011 | .013 | .019 |

This is even more interesting. It appears that the fewer pitches you have in your repertoire, the more quickly batters become familiar with you, as we might expect. One-pitch pitchers lose 36 points by the third time through the order, while four-pitch pitchers lose only 24 points. The fourth time through the order is exactly the opposite. One-pitch pitchers improve by 61 points of wOBA against compared to the third time (small sample size warning – 639 PA). Again, I have no idea why. Maybe fastball pitchers are able to ramp it up in the later innings, or maybe they start throwing more off-speed pitches. A pitch f/x analysis would shed some more light on this issue. Against the four-pitch pitchers, batters gain 19 points the fourth time around compared to the third. If we weight and combine the third and fourth times in order to increase our sample sizes, we get this:

| # Pitches in Repertoire (> 10%) | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 | 41 | .359 | .344 | .370 | .364 | .027 | -.001 |
| 2 | 1000 | .358 | .343 | .359 | .370 | .016 | .017 |
| 3 | 1712 | .361 | .349 | .362 | .371 | .013 | .015 |
| 4 | 378 | .351 | .340 | .351 | .361 | .011 | .015 |
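The weighting used to combine the third and fourth times can be sketched as a PA-weighted average. The .375 and .303 wOBA values and the 639 fourth-time PA for the one-pitch group come from the text above. The 3,500 third-time PA is an assumption I picked for illustration only, since the actual third-time sample size is not listed here.

```python
def combine(woba_third, pa_third, woba_fourth, pa_fourth):
    """PA-weighted average of third-time and fourth-time wOBA against."""
    total_pa = pa_third + pa_fourth
    return (woba_third * pa_third + woba_fourth * pa_fourth) / total_pa


# One-pitch group: wOBA values and fourth-time PA are from the post;
# pa_third=3500 is a hypothetical figure chosen for illustration.
combined = combine(woba_third=0.375, pa_third=3500,
                   woba_fourth=0.303, pa_fourth=639)
print(f"{combined:.3f}")
```

Because the fourth-time sample is so much smaller than the third-time sample, the combined number stays much closer to the third-time wOBA.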

Again, we see the largest, by far, second time penalty for the one-pitch pitchers (27 points), and a gradually decreasing penalty for two, three, and four-pitch pitchers (16, 13, and 11). Interestingly, they all have around the same penalty the third time and later, other than the one-pitch pitchers, who essentially retain their quality or even get a bit better, although this is driven by their large fourth time advantage, as you saw in the previous table.

It is not clear that you should take your one-pitch starters out early and leave in those who have multiple pitches in their weaponry. In fact, it may be the opposite. While the one-pitch pitchers would do well if they only face the order one time (and so would the two-pitch starters actually), once you allow them to stay in the game for the second go around, you might as well keep them in there as long as they are not fatigued, at least as compared to the multiple-pitch starters. Starters with more than one pitch appear to get 10-15 points worse each time through the order even though they don’t have the large penalty between the first and second time, as the one-pitch pitchers do. Remember, for the last two tables, a pitch is considered part of a starter’s repertoire if he throws it at least 10% of the time.

I’ll now split the pitchers into four groups again based on how many pitches they throw, but this time, the cutoff for a “pitch” will be 15% rather than 10%. The number of pitchers who throw four pitches at least 15% of the time each is too small for their numbers to be meaningful, so I’ll throw them in with the three-pitch pitchers. I’ll also combine the third and fourth times through the order again.

| # Pitches in Repertoire (> 15%) | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 | 447 | .358 | .342 | .362 | .364 | .027 | -.001 |
| 2 | 1954 | .359 | .346 | .361 | .370 | .016 | .017 |
| 3+ | 742 | .355 | .347 | .352 | .371 | .013 | .015 |

The three and four-pitch starters are better overall by three or four points of wOBA (.11 RA9). The first time through the order, however, the one-pitch starters are better by 5 points or so (.15 RA9). The second time around, the one-pitch pitchers fare the worst, but by the third and fourth times through the order, they are once again the best (by 6 or 7 points, or .22 RA9). It is difficult to say what the optimal use of these starters would look like. At the very least, these numbers should give a manager/team more information in terms of estimating a starter’s penalty at various points in the game, based on his pitch repertoire.

I’ll try one more thing: Two groups. The first group are pitchers who throw at least 80% of one type of pitch, excluding knuckleballers. These are truly one-pitch pitchers. The second group throw three (or more) pitches at least 20% of the time each. These are truly three-pitch pitchers. Let’s see the contrast.

| # Pitches in Repertoire | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 (> 80%) | 47 | .360 | .343 | .367 | .370 | .025 | .009 |
| 3+ (> 20%) | 104 | .353 | .350 | .357 | .357 | .008 | .009 |

It certainly looks like the 42 one-pitch pitchers (47 is the number of pitcher seasons) would be much better off as relievers, facing each batter only one time. They are not very good overall, and after only one go around, they are 25 points (.85 RA9) worse than the first time facing the lineup! The three-pitch pitchers suffer only a small (8 point) penalty after the first time through the order. Both groups actually suffer the same penalty from the second to the third (and more) time through the order (9 points).

So who are these 42 pitchers who are ill-suited to being a starter? Perhaps they are swingmen or emergency starters. I looked at all pitchers who started at least one game – not just regular starters. Here is the complete list from 2002 to 2012. The numbers after each name are his TBF as a starter and as a reliever.

Mike Timlin 20, 352

Kevin Brown 206, 68

Ben Diggins 114, 0

Jarrod Washburn 847, 0

Mike Crudale 9, 199

Grant Balfour 17, 94

Shane Loux 69, 69

Jimmy Anderson 180, 3

Kirk Rueter 620, 0

Jaret Wright 768, 0

Logan Kensing 55, 11

Tanyon Sturtze 57, 277

Chris Young 156, 0

Nate Bump 33, 286

Bartolo Colon 2683, 49

Carlos Silva 876, 10

Aaron Cook 3337, 0

Cal Eldred 12, 141

Rick Bauer 21, 281

Mike Smith 18, 0

Shawn Estes 27, 0

Troy Percival 4, 146

Andrew Miller 306, 0

Luke Hochevar 12, 41

Luke Hudson 13, 0

Dana Eveland 15, 13

Denny Bautista 9, 38

Dennis Sarfate 81, 274

Roberto Hernandez 548, 0

Mike Pelfrey 812, 0

Daniel Cabrera 881, 0

Frankie de la Cruz 15, 37

Mark Mulder 3, 9

Ty Taubenheim 27, 0

Brad Kilby 7, 58

Darren Oliver 17, 264

Justin Masterson 1794, 4

Luis Mendoza 60, 0

Ross Detwiler 627, 51

Cesar Ramos 11, 109

Josh Stinson 17, 21

Ross Detwiler 627, 51

Many of these pitchers barely had a cup of coffee in the majors. Others were emergency starters, swingmen, or they changed roles at some point in their careers. Others were simply mediocre or poor starting pitchers, like Kirk Rueter, Jarrod Washburn, Mike Pelfrey, Carlos Silva, and Daniel Cabrera, while others were good or even excellent starters, like Kevin Brown, Mark Mulder, and Bartolo Colon.

I think the lesson is clear. Unless a team has a compelling reason to make a one-pitch pitcher a starter (perhaps they are an extreme sinker-baller, like Brown, Cook, and Masterson), they should probably only relieve. If a team is going to use a swingman for an occasional start or a reliever for an emergency start, they would do well to use a two or three-pitch pitcher or limit him to one time through the order.

If we remove the swingmen and emergency starters as well as those pitchers who faced fewer than 50 batters in a season, we get this:

| # Pitches in Repertoire | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 (> 80%) | 28 | .353 | .336 | .364 | .365 | .028 | .004 |
| 3+ (> 20%) | 104 | .353 | .350 | .357 | .357 | .008 | .009 |

Even if we only look at regular starters with one primary pitch other than a knuckleball, we still see a huge penalty after the first time facing the order. In fact, the second time penalty (compared to the first) is worse than when we include the swingmen and emergency starters. Although these pitchers overall are as good as multiple-pitch starters, they still would have been much better off as short relievers.

Here is that updated list of starters once we remove the ones who rarely start. These guys as a whole should probably have been short relievers.

Cook

Miller

Colon

Diggins

Silva

Young

Cabrera

Wright

Washburn

Anderson

Masterson

Brown

Rueter

Kensing

Mendoza

Pelfrey

Hernandez

Detwiler

You might think that the one-pitch starters in the above list who are good or at least had one or two good seasons might not necessarily be good candidates for short relief. You would be wrong. These pitchers had huge second to first penalties and pitched much better the first time through the order than overall. Here is the same chart as before, but only including above-average starters for that season.

| # Pitches in Repertoire | N (Pitcher Seasons) | Overall | First Time | Second Time | Third and Fourth Times | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|---|---|
| 1 (> 80%) | 11 | .328 | .307 | .332 | .332 | .039 | -.013 |
| 3+ (> 20%) | 35 | .321 | .318 | .323 | .323 | .004 | .003 |

Here are those pitchers who pitched very well overall, but were lights out the first time facing the lineup. Remember that these pitchers were above average in the season or seasons that they went into this bucket – they were not necessarily good pitchers throughout their careers or even in any other season.

Kevin Brown

Jarrod Washburn

Jaret Wright

Chris Young

Bartolo Colon

Carlos Silva

Justin Masterson

Ross Detwiler

Interestingly, the very good multiple-pitch pitchers had very small penalties each time through the order. These are probably the only kind of starters we want to go deep into games! Here is that list of starters.

Sonnanstine

B. Myers

Pavano

Sabathia

Billingsley

Carpenter

Hamels

Haren

F. Garcia

Iwakuma

Shields

J. Contreras

Beckett

Duchscherer

Gabbard

K. Rogers

Buehrle

M. Clement

Halladay

R. Hernandez

T. Hunter

Finally, in case you are interested, here are the numbers for all of the one-pitch knuckleballers that I have been omitting in some of the tables thus far:

Knuckle Ballers Only

| N | First Time | Second Time | Third+ Time | Second Minus First | Third+ Minus Second |
|---|---|---|---|---|---|
| 20 | .321 | .354 | .345 | .034 | -.006 |

Where are all the knuckle ball relievers? Although we don’t have tremendous sample sizes here (3024 second time TBF), so we have to take the numbers with a grain of salt, it looks like they are brilliant the first time through the order but once a batter has seen a knuckleballer one time, he does pretty well against him thereafter (although we do see a 6 point rebound the third time and later through the order).

I think that more research, especially using the pitch f/x data, is needed. However, I think that teams can use the information above to make more informed decisions about what roles pitchers should occupy and when to take out a starter during a game.