There is a prolific base stealer on first base in a tight game. The pitcher steps off the rubber, varies his timing, or throws over to first several times during the AB. You’ve no doubt heard some version of the following refrain from your favorite media commentator: “The runner is disrupting the defense and the pitcher, and the latter has to throw more fastballs and perhaps speed up his delivery or use a slide step, thus giving the batter an advantage.”

There may be another side of the same coin: The batter is distracted by all these ministrations, he may even be distracted if and when the runner takes off for second, and he may take a pitch that he would ordinarily swing at in order to let the runner steal a base. All of this leads to *decreased* production from the batter, as compared to a proverbial statue on first, to which the defense and the pitcher pay little attention.

So what is the *actual* net effect? Is it in favor of the batter, as the commentators would have you believe (after all, they’ve played the game and you haven’t), or does it benefit the pitcher – an unintended negative consequence of being a frequent base stealer?

Now, even if the net effect of a stolen base threat is negative for the batter, that doesn’t mean that being a prolific base stealer is necessarily a bad thing. Attempting stolen bases, given a high enough success rate, presumably provides extra value to the offense independent of the effect on the batter. If that extra value exceeds that given up by virtue of the batter being distracted, then being a good and prolific base stealer may be a good thing. If the pundits are correct and the “net value of distraction” is in favor of the batter, then perhaps the stolen base or stolen base attempt is implicitly worth more than we think.

Let’s also not forget that the stolen base *attempt*, independent of the success rate, is surely a net positive for the offense, notwithstanding any potential distraction effects. That is due to the fact that when the batter puts the ball in play, whether it is a hit and run or a straight steal, there are fewer forces at second, fewer GDP, and the runner advances the extra base more often on a single, double, or out. Granted, there are a few extra line drive and fly ball DP, but there are many fewer GDP to offset those.

If you’ve already gotten the feeling that this whole steal thing is a lot more complicated than it appears on its face, you would be right. It is also not easy, to say the least, to try and ascertain whether there is a distraction effect and who gets the benefit, the offense or the defense. You might think, “Let’s just look at batter performance with a disruptive runner on first as compared to a non-disruptive runner.” We can even use a “delta,” “matched pairs,” or “WOWY” approach in order to control for the batter, and perhaps even the pitcher and other pertinent variables. For example, with Cabrera at the plate, we can look at his wOBA with a base stealing threat on first and with a non-base stealing threat. We can take the difference, say 10 points in wOBA in favor of the threat (IOW, the defense is distracted and not the batter), and weight that by the number of times we find a matched pair (the lesser of the two PA). In other words, a “matched pair” is one PA with a stolen base threat on first and one PA with a non-threat.

If Cabrera had 10 PA with a stolen base threat and 8 PA with someone else on first, we would weight the wOBA difference by 8 – we have 8 matched pairs. We do that for all the batters, weighting each batter’s difference by their number of matched pairs, and voila, we have a measure of the amount that a stolen base threat on first affects the batter’s production, as compared to a non-stolen base threat. Seems pretty simple and effective, right? Eh, not so fast.

Unfortunately there are myriad problems associated with that methodology. First of all, do we use all PA where the runner started on first but may have ended up on another base, or was thrown out, by the time the batter completed his PA? If we do that, we will be comparing apples to oranges. With the base stealing threats, there will be many more PA with a runner on second or third, or with no runners at all (on a CS or PO). And we know that wOBA goes down once we remove a runner from first base, because we are eliminating the first base “hole” with the runner being held on. We also know that the values of the offensive components differ depending on the runners and outs. For example, with a runner on second, the walk is not as valuable to the batter and the K is worse than a batted ball out which has a chance to advance the runner.

What if we only look at PA where the runner was still at first when the batter completed his PA? Several researchers have done that, including myself and my co-authors in *The Book*. The problem with that method is that those PA are not an unbiased sample. For the non-base stealers, most PA will end with a runner on first, so that is not a problem. But with a stolen base threat on first, if we only include those PA that end with the runner still on first, we are only including PA that are likely biased in terms of count, score, game situation, and even the pitcher. In other words, we are only including PA where the runner has not attempted a steal yet (other than on a foul ball). That could mean that the pitcher is difficult to steal on (many of these PA will be with a LHP on the mound), the score is lopsided, the count is biased one way or another, etc. Again, if we only look at times where the PA ended with the runner on first, we are comparing apples to oranges when looking at the difference in wOBA between a stolen base threat on first and a statue.

It almost seems like we are at an impasse and there is nothing we can do, unless perhaps we try to control for everything, including the count, which would be quite an endeavor. Fortunately there is a way to solve this – or at least come close. We can first figure out the overall difference in value to the offense between having a base stealer and a non-base stealer on first, *including the actual stolen base attempts.* How can we do that? That is actually quite simple. We need only look at the change in run expectancy starting from the beginning to the end of the PA, starting with a runner on first base only. We can then use the delta or matched pairs method to come up with an average difference in change in RE. This difference represents the *sum total* of the value of a base stealer at first versus a non-base stealer, including any effect, positive or negative, on the batter.
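For readers who want to see the subtraction spelled out, here is a minimal Python sketch of a change-in-RE calculation. It is an illustration only, not the code used for the study. The function name `re_change` and the state labels are made up for the example, and the three RE values are the ones implied by the 2-out deltas quoted later in the article (-.216, +.109, +.157), assuming the RE is zero once the inning is over. A real version would look up all 24 base/out states.

```python
# Partial run expectancy (RE) table for 2 outs, one runner on base.
# Assumption: values are backed out of the 2-out deltas quoted in the article.
RE_2_OUTS = {
    "1st": 0.216,
    "2nd": 0.325,          # .216 + .109
    "3rd": 0.373,          # .216 + .157
    "inning_over": 0.0,    # third out made, no further runs expected
}


def re_change(start_state, end_state, runs_scored=0, table=RE_2_OUTS):
    """Ending RE minus starting RE, plus any runs that scored in between."""
    return table[end_state] - table[start_state] + runs_scored


# Runner on first is caught stealing for the third out.
print(round(re_change("1st", "inning_over"), 3))   # -0.216
# Runner on first steals second.
print(round(re_change("1st", "2nd"), 3))           # 0.109
```

The `runs_scored` term is there because a run that crosses the plate is no longer "expected"; it has to be added back so that a home run, say, is not scored as a loss in RE.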

From there we can try and back out the value of the stolen bases and caught stealings (including pick-offs, balks, pick-off errors, catcher errors on the throw, etc.) as well as the extra base runner advances and the avoidance of the GDP when the ball is put into play. What is left is any “distraction effect” whether it be in favor of the batter or the pitcher.

First, in order to classify the base runners, I looked at their number of steal attempts per times on first (BB+HP+S+ROE) for that year and the year before. If it was greater than 20%, they were classified as a “stolen-base threat.” If it was less than 2%, they were classified as a statue. Those were the two groups I looked at vis-à-vis the runner on first base. All other runners (the ones in the middle) were ignored. Around 10% of all runners were in the SB threat group and around 50% were in the rarely steal group.
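The classification rule above could be sketched in Python roughly as follows. The function name and return labels are hypothetical; the 20% and 2% cutoffs and the two-year window come from the text, and `times_on_first` is assumed to be BB+HP+S+ROE summed over the current and prior year.

```python
def classify_runner(steal_attempts, times_on_first):
    """Label a runner from his two-year steal attempt rate.

    Returns "threat" above 20%, "statue" below 2%, and None for the
    runners in the middle, who are left out of the comparison.
    """
    if times_on_first <= 0:
        return None  # no opportunities, so no rate to classify
    rate = steal_attempts / times_on_first
    if rate > 0.20:
        return "threat"
    if rate < 0.02:
        return "statue"
    return None


print(classify_runner(45, 180))  # threat (25% attempt rate)
print(classify_runner(2, 150))   # statue (about 1.3%)
print(classify_runner(12, 160))  # None (7.5%, in the ignored middle group)
```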

Then I looked at all situations starting with a runner on first (in one or the other stolen base group) and ending when the batter completes his PA or the runner makes the third out of the inning. The batter may have completed his PA with the runner still on first, on second or third, or with no one on base because the runner was thrown out or scored, via stolen bases, errors, balks, wild pitches, passed balls, etc.

I only included innings 1-6 (to try and eliminate pinch runners, elite relievers, late and close-game strategies, etc.) and batters who occupied the 1-7 slots. I created matched pairs for each batter such that I could use the “delta method” described above to compute the average difference in RE change. I did it year by year, i.e., the matched pairs had to be in the same year, but I included 20 years of data, from 1994-2013. The batters in each matched pair had to be on the same team as well as in the same year. For example, Cabrera’s matched pairs of 10 PA with base stealers and 8 PA with non-base stealers would be in one season only. In another season, he would have another set of matched pairs.

Here is how it works: Batter A may have had 3 PA with a base stealer on first and 5 with a statue. His average change in RE (everyone starts with a runner on first only) at the end of the PA may have been +.130 runs for those 3 PA with the stolen base threat on first at the beginning of the PA.

For the 5 PA with a non-threat on first, his average change in RE may have been .110 runs. The difference is .02 runs in favor of the stolen base threat on first and that gets weighted by 3 PA (the lesser of the 5 and the 3 PA). We do the same thing for the next batter. He may have had a difference of -.01 runs (in favor of the non-threat) weighted by, say, 2 PA. So now we have (.02 * 3 – .01 * 2) / 5 as our total average difference in RE change using the matched pair or delta method. Presumably (hopefully) the pitcher, score, parks, etc. are the same or very similar for both groups. If they are, then that final difference represents the advantage of having a stolen base threat on first base, including the stolen base attempts themselves.
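The two-batter example works out to .008 runs per PA. A rough Python sketch of the weighting, under the assumption that each batter-season has already been reduced to an average RE change and a PA count for each runner type, might look like this. The function name and tuple layout are made up for illustration, and the second batter's RE levels and his 4 PA with a statue are invented; only his -.01 difference and weight of 2 come from the text.

```python
def matched_pair_delta(batters):
    """Weighted average difference in RE change, threat minus statue.

    batters: iterable of (re_threat, pa_threat, re_statue, pa_statue),
    one tuple per batter-season. Each difference is weighted by the
    lesser of the two PA counts, i.e. the number of matched pairs.
    """
    weighted_sum = 0.0
    total_pairs = 0
    for re_threat, pa_threat, re_statue, pa_statue in batters:
        pairs = min(pa_threat, pa_statue)
        if pairs == 0:
            continue  # no matched pair for this batter-season
        weighted_sum += (re_threat - re_statue) * pairs
        total_pairs += pairs
    return weighted_sum / total_pairs if total_pairs else None


example = [
    (0.130, 3, 0.110, 5),  # Batter A: +.02 difference, 3 matched pairs
    (0.100, 2, 0.110, 4),  # next batter: -.01 difference, 2 matched pairs
]
print(round(matched_pair_delta(example), 3))  # 0.008
```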

A plus number means a total net advantage to the offense with a prolific base stealer on first, including his SB, CS, and speed on the bases when the ball is put into play, and a negative number means that the offense is better off with a slow, non-base stealer on first, which is unlikely of course. Let’s see what the initial numbers tell us. By the way, for the changes in RE, I am using Tango’s 1969-1992 RE matrix from this web site: http://www.tangotiger.net/re24.html.

We’ll start the analysis with no out situations. One of the advantages of a base stealer on first is staying out of the GDP (again, offset by a few extra line drive and fly ball DP). There were a total of 5,065 matched pair PA (adding the lesser of the two PA for each matched pair). Remember a matched pair is a certain batter with a base stealing threat on first and that same batter in the same year with a non-threat on first. The runners are on first base when the batter steps up to the plate but may not be when the PA is completed. That way we are capturing the run expectancy change of the entire PA, regardless of what happens to the runner during the PA.

The average advantage in RE change (again, that is the ending RE after the PA is over minus the starting RE, which is always with a runner on first only, in this case with 0 out) was .032 runs per PA. So, as we expect, a base stealing threat on first confers an overall advantage to the offensive team, at least with no outs. This includes the net run expectancy of SB (including balks, errors, etc.) and CS (including pick-offs), advancing on WP and PB, advancing on balls in play, staying out of the GDP, etc., as well as any advantage or disadvantage to the batter by virtue of the “distraction effect.”

The average wOBA of the batter, for all PA, whether the runner advanced a base or was thrown out during the PA, was .365 with a non-base stealer on first and .368 for a base stealer.

What are the differences in individual offensive components between a base stealing threat and a non-threat originally on first base? The batter with a statue who starts on first base has a few more singles, which is expected given that he hits with a runner on first more often. As well, the batter with a base stealing threat walks and strikes out a lot more, due to the fact he is hitting with a base open more often.

If we then compute the RE value of SB, CS (and balks, pickoffs, errors, etc.) for the base stealer and non-base stealer, as well as the RE value of advancing the extra base and staying out of the DP, we get an advantage to the offense with a base stealer on first of .034 runs per PA.

So, if the overall value of having a base stealer on first is .032 runs per PA, and we compute that .034 runs comes from greater and more efficient stolen bases and runner advances, we must conclude that there is a .002 runs *disadvantage* to the batter when there is a stolen base threat on first base. That corresponds to around 2 points in wOBA. So we can say that with no outs, there is a 2 point penalty that the batter pays when there is a prolific base stealer on first base, as compared to a runner who rarely attempts a SB. In 5,065 matched PA, one SD of the difference between a threat and non-threat is around 10 points in wOBA, so we have to conclude that there is likely no influence on the batter.

Let’s do the same exercise with 1 and then 2 outs.

With 1 out, in 3,485 matched pair PA, batters with non-threats hit .388 and batters with threats hit .367. The former had many more singles and of course fewer BB (a lot fewer) and K. Overall, with a non-base stealer starting on first base at the beginning of the PA, batters produced an RE that was .002 runs per PA *better* than with a base stealing threat. In other words, having a prolific, and presumably very fast, base stealer on first base offered no overall advantage to the offensive team, including the value of the SB, base runner advances, and avoiding the GDP.

If we compute the value that the stolen base threats provide on the base paths, we get .019 runs per PA, so the disadvantage to the batter by virtue of having a prolific base stealer on first base is .021 runs per PA, which is the equivalent of the batter *losing 24 points in wOBA*.

What about with 2 outs? With 2 outs, we can ignore the GDP advantage for the base stealer as well as the extra value from moving up a base on an out. So, once we get the average RE advantage for a base stealing threat, we can more easily factor out the stolen base and base running advantage to arrive at the net advantage or disadvantage to the batter himself.

With 2 outs, the average RE advantage with a base stealer on first (again, as compared to a non-base stealer) is .050 runs per PA, in a total of 2,390 matched pair PA. Here, the batter has a wOBA of .350 with a non-base stealer on first, and .345 with a base stealer. There is still a difference in the number of singles because of the extra hole with the first baseman holding on the runner, as well as the usual greater rate of BB with a prolific stealer on base. (Interestingly, with 2 outs, the batter has a *higher* K rate with a non-threat on base – it is usually the opposite.) Let’s again tease out the advantage due to the actual SB/CS and base running and see what we’re left with. Here, you can see how I did the calculations.

With the non-base stealer, the runner on first is out before the PA is completed 1.3% of the time, he advances to second, 4.4% of the time, and to third, .2%. The total RE change for all that is .013 * -.216 + .044 * .109 + .002 * .157, or .0023 runs, not considering the count when these events occurred. The minus .216, plus .109, and plus .157 are the change in RE when a base runner is eliminated from first, advances from first to second, and advances from first to third prior to the end of the PA (technically prior to the beginning of the PA). The .013, .044, and .002 are the frequencies of those base running events.
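That frequency-times-value sum is easy to reproduce. The sketch below is just an illustration with made-up names (`RE_DELTA_2_OUTS`, `running_value`); the deltas and frequencies are the ones quoted in the paragraph above for the non-base stealer with 2 outs.

```python
# Change in RE (2 outs) for each thing that can happen to the runner on
# first before the PA is over, as quoted in the text.
RE_DELTA_2_OUTS = {"out": -0.216, "to_second": 0.109, "to_third": 0.157}


def running_value(frequencies, deltas=RE_DELTA_2_OUTS):
    """Expected RE change per PA: sum of event frequency times RE delta."""
    return sum(freq * deltas[event] for event, freq in frequencies.items())


non_base_stealer = {"out": 0.013, "to_second": 0.044, "to_third": 0.002}
print(round(running_value(non_base_stealer), 4))  # 0.0023
```

The same function can be fed a different frequency dictionary for any other group of runners.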

For the base stealer, we have .085 (thrown out) times -.216 + .199 (advance to 2nd) * .109 + .025 (advance to 3rd) * .157, or .0117. So the net advantage to the base stealer from advancing or being thrown out is .0117 minus .0023, or .014 runs per PA.

What about the advantage to the prolific and presumably fast base stealers from advancing on hits? The above .014 runs was from advances prior to the completion of the PA, from SB, CS, pick-offs, balks, errors, WP, and PB.

The base stealer advances the extra base from first on a single 13.5% more often and 21.7% more often on a double. Part of that is from being on the move and part of that is from being faster.

12.5% of the time, there is a single with a base stealing threat on first. He advances the extra base 13.5% more often, but the extra base with 2 outs is only worth .04 runs, so the gain is negligible (.0007 runs).

A runner on second and a single occurs 2.8% of the time with a stolen base threat on base. The base stealer advances the extra base and scores 14.6% more often than the non-threat for a gain of .73 runs (being able to score from second on a 2-out single is extremely valuable), for a total gain of .73 * .028 * .146, or .003 runs.

With a runner on first and a double, the base stealer gains an extra .0056 runs.

So, the total base running advantage when the runner on first is a stolen base threat is .00925 runs per PA. Add that to the SB/CS advantage of .014 runs, and we get a grand total of .023 runs.
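As a quick check on the addition, here is the 2-out tally in Python. All inputs are the figures quoted above; the variable names are invented for the example, and the .0056 for doubles and the .014 for SB/CS are taken as given rather than recomputed.

```python
# Extra value from advancing on hits with 2 outs (runs per PA).
single_runner_on_first = 0.125 * 0.135 * 0.04   # single frequency * extra advance rate * value
single_runner_on_second = 0.028 * 0.146 * 0.73  # scoring from second on a 2-out single
double_runner_on_first = 0.0056                 # quoted directly in the text

on_hits = (single_runner_on_first
           + single_runner_on_second
           + double_runner_on_first)
sb_cs_advantage = 0.014                         # advances before the PA ends

total_running_value = on_hits + sb_cs_advantage
print(round(on_hits, 5))              # 0.00926
print(round(total_running_value, 3))  # 0.023
```

The .00926 here agrees with the .00925 in the text up to rounding.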

Remember that the overall RE advantage was .050 runs, so if we subtract out the base runner advantage, we get a presumed advantage to the batter of .050 – .023, or .027 runs per PA. That is around 31 points in wOBA.
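The "back out" step is the same subtraction for each out state, so it can be sketched in a few lines of Python. One assumption to flag: the article never states its runs-to-wOBA conversion. The factor of about 1.15 wOBA points (times 1000) per run per PA used below is inferred from the article's own numbers (.027 runs is called 31 points, .021 runs is called 24 points), and the names are hypothetical.

```python
# Assumed conversion: runs per PA * 1.15 gives wOBA, * 1000 gives "points".
WOBA_POINTS_PER_RUN = 1150


def batter_effect(overall_value, running_value):
    """Residual value attributed to the batter, in runs per PA."""
    return overall_value - running_value


# outs: (overall RE advantage, SB and base running value), from the text
by_outs = {0: (0.032, 0.034), 1: (-0.002, 0.019), 2: (0.050, 0.023)}

for outs, (overall, running) in by_outs.items():
    runs = batter_effect(overall, running)
    points = round(runs * WOBA_POINTS_PER_RUN)
    print(outs, round(runs, 3), points)
# 0 -0.002 -2
# 1 -0.021 -24
# 2 0.027 31
```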

So let’s recap what we found. For each of no outs, 1 out, and 2 outs, we computed the average change in RE for every batter with a base stealer on first (at the beginning of the PA) and a non-base stealer on first. That tells us the value of the PA from the batter and the base runner combined. (That is RE24, by the way.) We expect that this number will be higher with base stealers, otherwise what is the point of being a base stealer in the first place if you are not giving your team an advantage?

**Table I – Overall net value of having a prolific and disruptive base stealing threat on first base at the beginning of the PA, the value of his base stealing and base running, and the presumed value to the batter in terms of any “distraction effect.” Plus is good for the offense and minus good for the defense.**

| Outs | Overall net value | SB and base running value | “Batter distraction” value |
|---|---|---|---|
| 0 | .032 runs (per PA) | .034 runs | -.002 runs (-2 points of wOBA) |
| 1 | -.002 runs | .019 runs | -.021 runs (-24 pts) |
| 2 | .050 runs | .023 runs | +.027 runs (+31 pts) |

We found that very much to be the case with no outs and with 2 outs, but not with 1 out. With no outs, the effect of a prolific base runner on first was .032 runs per PA, the equivalent of raising the batter’s wOBA by 37 points, and with 2 outs, the overall effect was .050 runs, the equivalent of an extra 57 points for the batter. With 1 out, however, the prolific base stealer is in effect lowering the wOBA of the batter by 2 points. Remember that these numbers include the base running and base stealing value of the runner as well as any “distraction effect” that a base stealer might have on the batter, positive or negative. In other words, RE24 captures the influence of the batter as well as the base runners.

In order to estimate the effect on the batter component, we can “back out” the base running value by looking at how often the various base running events occur and their value in terms of the “before and after” RE change. When we do that, we find that with 0 outs there is no effect on the batter from a prolific base stealer starting on first base. With 1 out, there is a 24 point wOBA *disadvantage* to the batter, and with 2 outs, there is a 31 point *advantage* to the batter. Overall, that leaves around a 3 or 4 point negative effect on the batter. Given the relatively small sample sizes of this study, one would not want to reject the hypothesis that having a prolific base stealer on first base has *no net effect on the batter’s performance*. Why the effect depends so much on the number of outs, and what if anything managers and players can do to mitigate or eliminate these effects, I will leave for the reader to ponder.

MGL: you said “With 1 out, in 3,485 matched pair, batters with non-threats hit .388 and batters with threats hit .367.” So, you have much worse hitters with basestealing threats, right?

So, when you show: “which is the equivalent of the batter losing 24 points in wOBA.” Well, isn’t this explained by the fact that you have batters that are 21 points worse in wOBA? Or did you control for that in your calculations?

I’m not sure I saw that.

Tango, I am using matched pairs. So those wOBA’s are after adjusting for the different batter pools. So if batter A hits .350 with a base stealer on base in 20 PA and .360 with a non-base stealer in 10 PA, .350 goes into the base stealer bucket and .360 goes into the non-base stealer bucket, both weighted by 10 PA. I do that for every batter that had at least one PA with a base stealer and one PA with a non-base stealer. I should have explained those wOBA more clearly. Does that make sense?

In any case, the wOBA’s were there in case anyone was interested. The only numbers that matter in terms of whether the batter is positively or negatively affected by the base stealing threats on base are the overall RE (which are based on the “matched pairs,” so it doesn’t matter what the batter pools are) and the RE from steals and base running.

I didn’t put this in the article, but I found it interesting:

I used the “matched pair” method to calculate the wOBA difference for batters when there was a runner on first only, second only and no one on base, in the early innings, and with 0, 1 and 2 outs.

Just looking at wOBA overall with certain base/out combinations is not that helpful because there are different batter pools, just like there are different batter pools with base stealers and non-base stealers on base.

Here are the results:

With no outs, the difference between a runner on first only and on second only is 30 points of wOBA! That is huge! One reason for that which many people might overlook is that with a runner on second and no outs, many batters are trying to hit to RF to advance the runner on an out, and they severely depress their wOBA in doing so. I don’t know if their strategy is correct or not. The “batter adjusted” wOBA with 0 out and a runner on first is .358 and with a runner on second, it is .328. Again, no other runners on base.

With 1 out, where the batter is NOT trying to advance the runner on an out (at least he shouldn’t), the wOBA difference is only 15 points. So we can infer that batters are giving up around 15 points in wOBA trying to go to right field to advance the runner on an out. Of course there are sacrifice issues too when it comes to comparing no outs to 1 out. And BTW, in this and my study in the article, I treated sacrifice attempts like any other PA.

With 2 outs the difference between a runner on first and one on second is also 15 points, the same as with 1 out, which is interesting. You would think that with a runner on first with 1 out, because the middle infielders are playing at “double play depth” that there would be a few more singles that sneak through the IF, thus raising the wOBA with a runner on first.

Keep in mind that comparing wOBA with different base/out states is not exactly fair, since wOBA uses fixed coefficients for the various offensive events. The values of those coefficients vary with the base/out state. So, for example, with a runner on first a wOBA of .350 may not create the same value as a .350 wOBA with a runner on second. For example, the value of the BB is completely different in those situations.

The difference between the wOBA with a runner on first and with no runners on is the following:

With 0 outs, it is 7 points. That is likely due to the first base “hole” being open and the middle infielders playing at double play depth.

With 1 out, it is 19 points. We would expect it to be the same as with 0 outs. The probable reason it is not is because of all those sac bunts with 0 outs. They severely depress the wOBA.

With 2 outs, where we don’t have the infielders at DP depth AND we don’t have any sac bunts, we have a difference of 8 points.

We also have issues of the runner going with a runner on first which will increase the wOBA of the batter by creating more holes in the IF. I probably should have run these numbers with only non-base stealers on base.

Ah, ok, got it.

Do you think with 1 out, it could be that the batter (and manager) wants to avoid a potential double play to end the inning? Try for the hit-and-run play instead of just trying to hit the ball?

Sure, could be. Do you think that he is using the hit and run with the base stealers on base or with the non-base stealers?

I’m imagining that hitting ground balls and hitting balls to RF with 1 out is more favorable with a fast runner on base so batters are predisposed to do that more often with a fast runner on. Then, when the advantages of a speedy runner are backed out, those outcomes (which were more frequent and more desirable with a fast runner) are treated as being as desirable as they would be with a slow runner on.

First, I assume you mean with 0 outs. Batters don’t try and hit to RF with a runner on base and 1 out. Second, non-base stealers are rarely on second base in my study. They only get there before the PA ends 2.5% of the time. Third, I don’t think a batter cares who the runner is on second. If there is a runner on second with no outs, some batters (not all – the power hitters probably don’t) try and hit the ball to RF, regardless of who is on second.

I meant that what you said about runner on 2nd with no outs made me think about this but I’m not referring to situations with runners on second here. I was referring to this:

“With no outs, The difference between a runner on first only and on second only is 30 points of wOBA! That is huge! One reason for that which many people might overlook is that with a runner on second and no outs, many batters are trying to hit to RF to advance the runner on an out, and they severely depress their wOBA in doing so.”

because otherwise it wouldn’t have occurred to me that batter adjustments of that sort could help explain such large wOBA differences.

With respect to the 1 out batter distraction results you found, I was thinking about batters worrying less about grounding into double plays with fast runners on and, although I was thinking about hitting to RF, maybe thinking about hit and runs makes more sense — although the fast guys in this study might be prolific enough base stealers that they aren’t really the ideal base runners for a hit and run.

The amount that wOBA is suppressed by batters trying to hit to right field with a runner on 2nd and no outs makes me wonder if correctly backing out the value of staying out of the DP and advancing extra bases is hard to do. If the batter is hitting somewhat more ground balls and trying to hit to right field somewhat more often when he has a fast runner on than when he has a slow runner on, and the adjustment is, if I’m understanding it correctly, docking his RE as if he had those same outcomes with a slower runner on then perhaps the RE is being docked too much. Could that batter distraction effect with 1 out in part reflect a change in strategy with a fast runner on first?

Not sure I follow you. Can you explain in different words or with an example?

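[…] Not long ago, MGL published the results of this analysis on his blog. The method can be summarized briefly as follows. 1) With a runner on first base who is either very likely or very unlikely to steal, compare the change in run expectancy (RE) from before to after the batter’s plate appearance (ΔRE). 2) Compare the RE change (ΔRE) produced purely by the runner: stolen bases, caught stealing, wild pitches, and so on. 3) Using the two values calculated above, work out the batter’s production on its own. […]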


Excellent work, MGL. Glad to see the weighted matched pair methodology getting some exposure–I’ve been using it for years, having forgotten who I picked it up from.

I’m of the opinion that the single most unexamined and fundamental sabermetric truth is that there are not only significant differences among hitting performance across all 24 base-out situations, but that such differences have varied significantly across time (something I discovered while proving that hitting differentials with runners on base are real over the course of careers). This is clearly the result of both pitcher and batter altering their approach based on base-out state, and there is no reason to assume that the approaches have been optimized. For instance, it certainly appears as if pitchers pay a large penalty for attempting to reduce SF.

Someone needs to devote a good chunk of time to this, looking further at how the score differential and style and quality of both the pitcher and hitter affect the results. We know that there have been individual pitchers who defied conventional wisdom about situational strategies–Jim Palmer having never yielded a GS, for instance. If there are base-out situations where either the pitcher or hitter approach has been suboptimal, can we identify groups of players who have defied the incorrect CW and owe some of their success to that? (Geesh, you’re going to have to include the manager and maybe the ballpark in this analysis.)

I don’t think there’s any understanding of your results until this nut has been cracked. And this reminds me that I’ve wanted to ask Dan Brooks to add base-out filters to all of his pitch/fx tools!

“In order to estimate the effect on the batter component, we can “back out” the base running value by looking at how often the various base running events occur and their value in terms of the “before and after” RE change. When we do that, we find that with 0 outs there is no effect on the batter from a prolific base stealer starting on first base. With 1 out, there is a 24 point wOBA disadvantage to the batter, and with 2 outs, there is a 31 point advantage to the batter. Overall, that leaves around a 3 or 4 point negative effect on the batter.”

0 & & +31 = +7 advantage to the batter. Yet (again) I quote: “Overall, that leaves around a 3 or 4 point negative effect on the batter.” In other words: Overall, that leaves around a 3 or 4 point disadvantage to the batter.

Is it a 7 point advantage or a 3 to 4 point disadvantage to the batter? Or am I stupidly missing something? Sincere thanks for the work that you, Tango & all the analytics people do.

Apparently, WordPress does not recognize the negative signs above the comma and period on a Qwerty keyboard. So I’ll rephrase my second paragraph in English. No effect PLUS a 24 point disadvantage to the batter PLUS a 31 point advantage to the batter equals a 7 point advantage to the batter.

Oh, for crying out loud. My real name is Patrick McCabe. Bob Laughlin is a pen name I used for a book I wrote and other things. Tango would understand – I was going to cheekily ask if MLB makes out a paycheck every two weeks to: TANGO, THOMAS T.