I think the way they do that is by looking over what actually happened in thousands of past games, and asking "in this situation, which team scores next - the one currently on offense or the one currently on defense - what percentage of the time, and for how many points?"

For a simple example with way too small a sample size that I'll pull out of the air - let's say they have data on ten games where a team had the ball at their own 47-yard line facing 2nd and 6. Six of those times, the next score in the game came from the team with the ball (not necessarily on that same possession) - 3 TD and 3 FG, which at 7 points per TD and 3 per FG is +30 points. Four times, the next score in the game came from the team currently on defense (not meaning a defensive score; more likely they got the ball back and then scored) - 2 TD and 2 FG = -20 points. 30 - 20 = 10, and 10 points over 10 games = 1. So a team with 2nd and 6 at their own 47 would have an expected points value of +1.
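
Here's that arithmetic as a quick Python sketch. The point values (7 for a TD, assuming the extra point, and 3 for a FG) and the function name are just my assumptions for illustration, not how any real stats site necessarily does it:

```python
# Assumed point values: 7 for a touchdown (with extra point), 3 for a field goal.
TD, FG = 7, 3

def expected_points(next_scores):
    """Average value of the next score across all sampled games.

    Positive entries mean the team with the ball scored next;
    negative entries mean the team on defense scored next.
    """
    return sum(next_scores) / len(next_scores)

# The made-up ten-game sample for 2nd and 6 at your own 47:
# offense scores next 6 times (3 TD, 3 FG), defense 4 times (2 TD, 2 FG).
sample = [TD] * 3 + [FG] * 3 + [-TD] * 2 + [-FG] * 2

ep = expected_points(sample)
print(ep)  # 1.0
```

A real model would use thousands of games per situation (and probably some smoothing), but the idea is the same averaging.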

Now let's say the Chiefs have 2nd and 6 at their own 47, and Jamaal Charles runs for 10 yards. Now they have 1st and 10 at the opponent's 43. If the expected points value for that yard line, down, and distance is +1.4, then Charles gets credit for adding 0.4 expected points with his run. Next play he runs for 3 yards and it's 2nd and 7 at the opposing 40; if the expected points value there is +1.3, then Charles gets debited 0.1 expected points, and now has 2 carries for 0.3 EPA, or 0.15 EPA/carry.
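
The crediting step above can be sketched the same way. The lookup table below is hypothetical and only holds the three made-up values from this example, keyed by (down, distance, yards from your own goal line), so the opponent's 43 is 57 and the opponent's 40 is 60:

```python
# Hypothetical expected-points table using the made-up numbers from this post.
# Key: (down, yards to go, yards from own goal line)
EP = {
    (2, 6, 47): 1.0,   # 2nd and 6 at own 47
    (1, 10, 57): 1.4,  # 1st and 10 at opponent's 43
    (2, 7, 60): 1.3,   # 2nd and 7 at opponent's 40
}

def epa(before, after):
    """Expected points added by one play: EP after minus EP before."""
    return EP[after] - EP[before]

# The situations before the first carry, between carries, and after the second.
situations = [(2, 6, 47), (1, 10, 57), (2, 7, 60)]

# One EPA value per carry, pairing each situation with the one that follows it.
credits = [epa(a, b) for a, b in zip(situations, situations[1:])]

total = sum(credits)
per_carry = total / len(credits)
print(round(total, 2), round(per_carry, 2))  # 0.3 0.15
```

The rounding is only there to hide floating-point noise; the point is that each play is graded by the change in expected points, not by the yards gained.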

So, while it surprises me that so few guys are positive, it makes sense that your average runner, who gets a lot of 1-3 yard carries and then breaks a 20-yarder a couple of times, would have an EPA slightly in the negative. It fits with the idea that running is more low-risk, low-reward and passing is high-risk, high-reward: pass plays have a higher EPA overall, but they also have a lot more large negatives than runs do, since there are more turnovers and since sacks tend to lose more yards than a negative run play.

Last edited: May 19, 2021