Your Lyin’ Eyes – Recency Bias and Pitcher ERA at Mid-Season

Posted: July 4, 2018 in Pitching, Projections, Statistics

Let’s face it. Most of you just can’t process the notion that a pitcher who’s had 10 or 15 starts at mid-season can have an ERA of 5+ and still be expected to pitch well for the remainder of the season. Maybe, if they’re a Kershaw or Verlander or a known ace, but not some run of the mill hurler. Similarly, if a previously unheralded and perhaps terrible starter were to be sporting a 2.50 ERA in July after 12 solid starts, the notion that he’s still a bad pitcher, although not quite as bad as we previously estimated, is antithetical to one of the strongest biases that human beings have when it comes to sports, gambling, and in fact, many other aspects of life in general – recency bias. According to the online skeptics dictionary, recency bias is, “the tendency to think that trends and patterns we observe in the recent past will continue in the future.”

I looked at all starting pitcher in the last 3 years who either:

  1. In the first week of July, had a RA9 (runs allowed per 9 innings) adjusted for park, weather, and opponent, that was at least 1 run higher than their mid-season (as of June 30) projection. In addition, these pitchers had to have a projected context-neutral RA9 of less than 4.00 (good pitchers).
  2. In the first week of July, had an adjusted RA9 at least 1 run lower than their mid-season projection. They also had to have a projection greater than 4.50 (bad pitchers).

Basically, group I pitchers above were projected to be good pitchers but had very poor results for around 3 months. Group II pitchers were projected to be bad pitchers despite having very good results in the first half of the season.

A projection is equivalent to estimating a player’s most likely performance for the next game or for the remainder of the season (not accounting for aging). So in order to test a projection, we usually look at that player’s or a group of players’ performance in the future. In order to mimic the real-time question, “How do we expect this pitcher to pitch today, I looked at the next 3 games performance, in RA9.

Here are the aggregate results:

The average RA9 from 2015-2017 was around 4.39.

Group I pitchers (cold first half) N=36 starts after first week in July

Season-to-date RA9 Projected RA9 Next 3 starts RA9
5.45 3.76 3.71

Group II Pitchers (hot first half) N=84 starts after first week in July

Season-to-date RA9 Projected RA9 Next 3 starts RA9
3.33 4.95 4.81


As you can see, the season-to-date context neutral (adjusted for park, weather and opponent) RA9 tells us almost nothing about how these pitchers are expected to pitch, independent of our projection. Keep in mind that the projection has the current season performance baked into the model, so it’s not that the projection is ignoring the “anomalous” performance, and somehow magically the pitcher reverts to somewhere around his prior performance.

Actually, two things are happening here to create these dissonant (within the context of recency bias) results: One, these projections are using 3 or 4 years of prior performance (including the minor leagues), if available, such that another 3 months, even the most recent 3 months (which gets more weight in our projection model), often doesn’t have much effect on the projection (depending on how much prior data there is). As well, even if there isn’t that much prior data the very bad or good 3-month performance is going to get regressed towards league average anyway.

Two, how much integrity is there in a very bad RA9 for a pitcher who was and is considered a very good pitcher, and vice versa? By that, I mean does it really reflect how well the pitcher has pitched in terms of the components allowed or was he just lucky or unlucky in terms of the timing of those events? We can attempt to answer that question by looking at our same pitchers above and see how their season-to-date RA9 looks compared to a a component RA9, which is an RA9 looking number constructed from a pitcher’s component stats (using a BaseRuns formula). Let’s add that to the charts above.

Group I

Season-to-date RA9 To-date component RA9 Projected RA9 Next 3 starts RA9
5.45 4.40 3.76 3.71

Group II

Season-to-date RA9 To-date component RA9 Projected RA9 Next 3 starts RA9
3.33 4.25 4.84 4.81


These pitchers’ component results were not nearly as bad or good as their RA9 suggests.

So, if a pitcher is still projected to be a good pitcher, even after a terrible first half (or vice versa), RA9-wise (and presumably ERA-wise), two things are going on to justify that projection: One, the first half may be a relatively small sample compared to 3 or 4 years prior performance – remember, everything counts (albeit recent performance is given more weight)! Two, and more importantly, that RA9 is mostly timing-driven luck. The to-date components suggest that both the hot and cold pitchers have not pitched nearly as badly or as well as their RA9 suggests. The to-date component RA9’s are around league-average for both groups.

The takeaway here is that your recency bias will cause to you reject these projections in favor of to-date performance as reflected in RA9 or ERA, when in fact the projections are still the best predictor of future performance.

  1. evo34 says:

    Note sure anyone is still using RA to predict future RA.

    What projection system did you use?

    If a pitcher has a bad first half, and falls out of the rotation before July, he will not qualify for Group I, correct? So Group I is pitchers who underperformed in the first half, but still maintained a spot in the rotation, which is a pretty specific subset of cold first half pitchers.

  2. […] Beyond the Box Score On Jon Gray and the use of “luck” factors. Mitchel Lichtman looks at Recency Bias and Pitcher ERA at Mid-Season and whether pitch framing can be worth that […]

  3. saladitos says:

    It seems like we have better metrics with witch to judge player performance these days. Stuff like exit velocity and launch angle can be translated into expected ERA and that information has higher correlation at smaller sample sizes. So in season, give me the exit velo translated babip/fip/era/etc. over projected RA.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s