Massey Schedule Ratings

Let's restrict this discussion to 5 types of teams (great, good, average, bad, and pathetic) with the following ratings and probabilities of beating each other:

[Table: rating of each team type and its Probability of Beating each other type]

So for example, a "good" team has a 75% chance of beating an "average" team.

How it usually works

Schedule ratings are typically calculated by averaging the difficulty of the games a team has played. The calculation usually accounts for the venue of each game as well as the strength of the opponent. For simplicity, however, let's assume every game is played at a neutral site.

Example: Consider the following schedule


The schedule strength calculation might go as follows:


The average schedule strength is simply the total rating of the opponents (30) divided by the number of games (10). Hence this team has a schedule rating of 3.

What's wrong with that?

Defining schedule strength to be the average rating (adjusted for home-field advantage) of your opponents seems reasonable enough. However, there are circumstances in which this can be quite misleading.

Consider the following schedules:

Schedule A: (Good, Good, Good, Average)
Schedule B: (Great, Great, Good, Pathetic)

It is easy to see that the average opponent rating is the same (30/4 = 7.5) for both schedules A and B. But the following table shows that a schedule's difficulty should in fact be measured relative to the team that plays it.
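As a quick check of the arithmetic, the averaging rule can be sketched in Python. The rating values below (10, 8, 6, 4, 2) are assumed for illustration only: they are one assignment consistent with the totals quoted above, not necessarily the values in the original rating table.

```python
# Hypothetical ratings, chosen only so that both schedules total 30 as in the text.
ASSUMED_RATINGS = {"great": 10, "good": 8, "average": 6, "bad": 4, "pathetic": 2}


def average_opponent_rating(schedule):
    """Traditional schedule rating: mean rating of the opponents (neutral sites)."""
    return sum(ASSUMED_RATINGS[opp] for opp in schedule) / len(schedule)


SCHEDULE_A = ["good", "good", "good", "average"]
SCHEDULE_B = ["great", "great", "good", "pathetic"]

print(average_opponent_rating(SCHEDULE_A))  # 7.5
print(average_opponent_rating(SCHEDULE_B))  # 7.5
```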

[Table: Expected Wins Against Team A's Schedule and Team B's Schedule, by team type]

The expected-wins values were calculated by adding the probabilities from the table at the top of this page. For example, if a "great" team played Schedule A, it would be expected to win .75 + .75 + .75 + .9 = 3.15 games.

Now the interesting observation is that a "great" or "good" team would be expected to win more games if it were to play Schedule A, while an "average" or weaker team would be expected to win more against Schedule B! So which schedule is "harder"? It appears that this question can only be answered relative to the team that actually has to play that schedule.
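The expected-wins comparison can be sketched in Python. The text gives win probabilities of 50%, 75%, and 90% for a team that is zero, one, or two types better than its opponent. The function win_prob below is an assumed logistic curve, 1 / (1 + 3^(-gap)), chosen because it reproduces those three figures. Its values for gaps of three and four types (about 96% and 99%) are assumptions, as are the type indices 0 through 4.

```python
# Hypothetical type indices: only the gaps between types matter here.
LEVELS = {"pathetic": 0, "bad": 1, "average": 2, "good": 3, "great": 4}


def win_prob(team, opponent):
    """Assumed logistic curve: gives 0.5, 0.75, 0.9 for gaps of 0, 1, 2 types."""
    gap = LEVELS[team] - LEVELS[opponent]
    return 1.0 / (1.0 + 3.0 ** (-gap))


def expected_wins(team, schedule):
    """Sum of the team's win probabilities over every game on the schedule."""
    return sum(win_prob(team, opp) for opp in schedule)


SCHEDULE_A = ["good", "good", "good", "average"]
SCHEDULE_B = ["great", "great", "good", "pathetic"]

for team in ("great", "good", "average", "bad", "pathetic"):
    wins_a = expected_wins(team, SCHEDULE_A)
    wins_b = expected_wins(team, SCHEDULE_B)
    easier = "A" if wins_a > wins_b else "B"
    print(f"{team:9s} A: {wins_a:.2f}  B: {wins_b:.2f}  easier: {easier}")
```

Under these assumptions the great and good teams come out with more expected wins against Schedule A, and the average, bad, and pathetic teams with more against Schedule B, which matches the observation above.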

What's really going on here?

A "great" team will likely fare better against Schedule A because it contains only inferior teams. Even though Schedule A contains 3 "good" teams, a "great" team will still be favored. In contrast, Schedule B contains two other "great" teams and the outcome of those games would be a toss-up.

From a "great" team's perspective, it does not gain much advantage from playing a "pathetic" team instead of an "average" team. However, there is a significant difference between playing a "good" team and playing a "great" team.

Of course the situation is reversed from the perspective of a "bad" team. It would prefer Schedule B since at least it should beat the "pathetic" team.

General Statement

All of this can be summed up by the following heuristic:

An above-average team should prefer to play a less distributed schedule, while a below-average team should prefer to play a more distributed schedule.

A more "distributed" schedule would be something like (Great, Pathetic) while a less "distributed" schedule would be (Average, Average).
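A small Python sketch of the heuristic, comparing the two-game schedules just mentioned. It uses an assumed logistic win-probability curve (consistent with the 50%, 75%, and 90% figures quoted earlier, but otherwise hypothetical) and hypothetical type indices 0 through 4.

```python
# Hypothetical type indices: only the gaps between types matter here.
LEVELS = {"pathetic": 0, "bad": 1, "average": 2, "good": 3, "great": 4}


def win_prob(team, opponent):
    """Assumed logistic curve: gives 0.5, 0.75, 0.9 for gaps of 0, 1, 2 types."""
    gap = LEVELS[team] - LEVELS[opponent]
    return 1.0 / (1.0 + 3.0 ** (-gap))


def expected_wins(team, schedule):
    """Sum of the team's win probabilities over every game on the schedule."""
    return sum(win_prob(team, opp) for opp in schedule)


MORE_DISTRIBUTED = ["great", "pathetic"]
LESS_DISTRIBUTED = ["average", "average"]

for team in ("great", "good", "average", "bad", "pathetic"):
    spread = expected_wins(team, MORE_DISTRIBUTED)
    flat = expected_wins(team, LESS_DISTRIBUTED)
    print(f"{team:9s} more distributed: {spread:.2f}  less distributed: {flat:.2f}")
```

Under this assumed curve, the great and good teams expect more wins against (Average, Average), the bad and pathetic teams expect more against (Great, Pathetic), and the average team is indifferent.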

How Massey calculates schedule strength

Let the function EW(schedule) give the expected wins a team would get versus the specified schedule. Then I define schedule strength to be the unique rating S such that, if Team X has rating S, then

n*EW(X) = EW(actual schedule played)

where n is the number of games played and EW(X) is the expected wins from a single game against Team X.

In words this means that if the team in question had played X in every game, then the expected wins would be exactly the same as for the actual schedule played.
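One way to compute S numerically is bisection, sketched below in Python. This is a sketch under stated assumptions, not necessarily how the actual ratings are computed: it assumes a logistic win-probability curve (consistent with the 50%, 75%, and 90% figures quoted earlier), measures ratings in hypothetical type units (pathetic = 0 up to great = 4), and searches a hypothetical bracket of -20 to 20.

```python
# Hypothetical rating scale in "type units": only rating differences matter.
LEVELS = {"pathetic": 0.0, "bad": 1.0, "average": 2.0, "good": 3.0, "great": 4.0}


def win_prob(gap):
    """Assumed logistic curve: gives 0.5, 0.75, 0.9 for gaps of 0, 1, 2 units."""
    return 1.0 / (1.0 + 3.0 ** (-gap))


def expected_wins(team_rating, opponent_ratings):
    """EW(schedule): sum of win probabilities over the schedule."""
    return sum(win_prob(team_rating - opp) for opp in opponent_ratings)


def schedule_strength(team_rating, opponent_ratings, lo=-20.0, hi=20.0, tol=1e-9):
    """Find S such that n * P(team beats a rating-S opponent) = EW(actual schedule).

    The left side decreases as S increases, so the solution is unique and
    bisection converges to it (given that it lies inside the assumed bracket).
    """
    n = len(opponent_ratings)
    target = expected_wins(team_rating, opponent_ratings)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if n * win_prob(team_rating - mid) > target:
            lo = mid  # a rating-mid opponent is too easy, so S is higher
        else:
            hi = mid  # a rating-mid opponent is too hard, so S is lower
    return (lo + hi) / 2.0


SCHEDULE_A = [LEVELS[t] for t in ("good", "good", "good", "average")]
SCHEDULE_B = [LEVELS[t] for t in ("great", "great", "good", "pathetic")]

for team in ("great", "average"):
    s_a = schedule_strength(LEVELS[team], SCHEDULE_A)
    s_b = schedule_strength(LEVELS[team], SCHEDULE_B)
    print(f"{team:8s} strength of A: {s_a:.2f}  strength of B: {s_b:.2f}")
```

With these assumed inputs, Schedule B rates as the stronger schedule for a great team (roughly 3.3 versus 2.8 in type units), while Schedule A rates as the stronger one for an average team (roughly 2.7 versus 2.6). A schedule made up of identical opponents simply gets that opponent's rating.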

As a consequence of this definition of schedule strength, a team's schedule is judged primarily by the "peers" that appear on its schedule. A good team has a hard schedule if it must play other good teams, while a bad team has a hard schedule if it does not play any other bad teams.

Kenneth Massey
October 7, 1999