NCAA women's basketball ratings scales before and after the UConn-Oregon game

We all like to look at rankings of women's basketball teams (who's the best team? how many teams are better than us?), but looking at rankings instead of the underlying ratings scores loses information about how much better one team is than another.

It's been agreed that there is no dominant team this year, but is there a group of teams that have separated themselves from the pack? To address this, I looked at the Massey rating score, the Massey power score, the RPI score, the Elo score and the Her Hoops Stats score.

Prior to the UConn-Oregon game, the Massey rating score suggested that three or four teams (Baylor, South Carolina, Oregon and maybe Connecticut) were separating from the pack (first page of the attached pdf file [137 Kb]). The Massey power score looked about the same (second page), with South Carolina and Oregon flipped. The similarity is as expected, since the Massey rating score is derived from the Massey power score, with the former intended to measure past performance and the latter intended to measure potential for the future.

The RPI score includes information about strength of schedule and opponents' strength of schedule. I have argued in the Massey Analytics thread on the General Women's basketball forum that this does not address how good a team is. Others have pointed out in the same thread that these inclusions may well be beneficial to NCAA women's basketball as a whole. Since my goal is finding a group of teams that are better than the rest, I'm going to take the RPI with a grain of salt. Prior to the UConn-Oregon game, the RPI (third page) was in partial agreement with the Massey score. Oregon and South Carolina were separating from the pack, with Baylor and Connecticut in a large group of second-tier teams (which also included Louisville, Stanford, Gonzaga, UCLA, Maryland, Missouri St, Oregon St and Iowa).

The results for Elo were more similar to the Massey scales, with Baylor, South Carolina and Oregon separating from the pack and Louisville, Stanford and Connecticut forming a group of second-tier teams (fourth page). Her Hoops Stats had only Baylor, Oregon and South Carolina in the elite group, and Connecticut and Maryland as second-tier teams (fifth page).

I sought a consensus among these five scales using Principal Component Analysis. Setting aside the details about PCA, the resulting composite score was essentially the sum of the individual scale scores, after each scale was transformed to have a mean of zero and a variance of 1, multiplied by 1/sqrt(5). The composite had Baylor, Oregon and South Carolina separating, with Connecticut not far behind (sixth page).
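
To make that construction concrete, here is a minimal R sketch of the composite (not the exact code I used; the data frame 'ratings' and its column names are just placeholders for wherever the five scale scores are stored):

    # 'ratings' is assumed to be a data frame with a 'team' column plus one
    # column per scale: massey_rating, massey_power, rpi, elo, hhs.
    composite_score <- function(ratings) {
      scales <- c("massey_rating", "massey_power", "rpi", "elo", "hhs")
      z <- scale(as.matrix(ratings[, scales]))        # mean 0, variance 1 per scale
      pc1 <- prcomp(z, center = FALSE)$rotation[, 1]  # first principal component loadings
      if (mean(pc1) < 0) pc1 <- -pc1                  # PCA sign is arbitrary; orient higher = better
      data.frame(team = ratings$team, composite = as.numeric(z %*% pc1))
    }

With five strongly correlated scales, the loadings in pc1 come out close to 1/sqrt(5) each, which is why the composite is essentially the standardized scores summed and multiplied by 1/sqrt(5).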

And then UConn played Oregon ...

After that game, the Massey rating score had only Baylor, Oregon and South Carolina in a group far ahead of everyone else, with Connecticut among fourteen second-tier teams (seventh page). The Massey power score was similar, except that Baylor and Oregon were flipped (eighth page).

The RPI score had only Oregon and South Carolina separated, without a clearly separated group of second-tier teams (ninth page). Similar to the Massey scores, Elo had Oregon, Baylor and South Carolina separating, but only Louisville and Stanford were in the second-tier teams (tenth page). HHS had Baylor and Oregon flipped, with Connecticut and Maryland in a distant second cluster (eleventh page).

The composite score had Oregon, Baylor and South Carolina as the elite teams, with no clearly separated group of second-tier teams (twelfth page).

In short, according to these five rating scales, there appear to be three elite teams now: Oregon, Baylor and South Carolina. Each of the five scales had these teams as elite, with the exception of RPI excluding Baylor (due to their extremely weak out-of-conference schedule). The outcome of the UConn-Oregon game has moved UConn out of elite or near-elite status, and further back towards the pack. The South Carolina games will provide UConn with an opportunity to move back to the elite group.

Notes:
  • In order to manage the overstrikes, I used a smallish font in the plots and rounded all of the scores to three significant digits. Since the attached file is a pdf, you can zoom in to your heart's desire.
  • The Massey ratings were obtained from the Massey site: Massey Ratings - CBW. Here's an oddity: the cutoff for inclusion in the ratings is games played by 11:59 pm EST. Hawaii played a home game on 1/30 that was scheduled at 7 pm local time, which is midnight on 1/31 EST. That game was not included in the 1/31 report.
  • I'm getting HHS ratings from the Her Hoops Stats site: Her Hoop Stats Rtg National Team Leaderboard | NCAA Division I Women's Basketball | Her Hoop Stats.
  • I'm getting the Elo rating from Warren Nolan site: ELO Chess Ranking 2020 Womens College Basketball | WarrenNolan.com.
  • Lastly, I'm getting the RPI data from the Warren Nolan site: RPI (Live) 2020 Women's College Basketball | WarrenNolan.com. The RPI rankings are given on that page (twice), but the RPI score for each team is only on that team's page. It takes about 2-3 minutes for my program to retrieve the scores for 351 teams (a generic sketch of this kind of retrieval follows these notes). It would be easier to get them from the Real Time RPI site, but I have lost faith in the quality of their data. For about 3 1/2 weeks in January 2020, they included UConn's loss to Louisville on 1/31/18 (no, we didn't play Louisville this season; that's a game from a past season). RTRPI finally fixed that, but they're still including Savannah State in the rankings, a school that is no longer in the MEAC, having made the transition from Division I to Division II this year, and that has no games scheduled against Division I opponents.
  • All programming and analysis was done in R.
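
For anyone curious about the retrieval step mentioned above, here is a generic rvest sketch of pulling a table from a ratings page. It is illustrative only: 'page_url' and 'team_urls' are placeholders, and the actual Warren Nolan URLs, page structure and column names would need to be checked against the live site.

    library(rvest)

    # Read one page and return its first HTML table as a data frame.
    # 'page_url' is a placeholder; the real team-page URLs are not shown here.
    get_rating_table <- function(page_url) {
      page <- read_html(page_url)
      tables <- html_table(html_elements(page, "table"))
      tables[[1]]              # assume the ratings live in the first table on the page
    }

    # Looping over ~351 team pages (with a polite pause between requests) is
    # why a full pull takes a couple of minutes; 'team_urls' is a placeholder:
    # results <- lapply(team_urls, function(u) { Sys.sleep(0.3); get_rating_table(u) })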
 

Attachments

  • NCAAW Rating Scales Before and After UConn-Oregon game.pdf (134 KB)

Centerstream

Funny how these analytics seem to be consistent in that the 3 top teams are Baylor, Oregon and South Carolina.
Questions:
What team(s) scheduled all 3 teams this season?
What team(s) scheduled 2 of the teams this season?

(Yet the possibility of losing to them means it is a disastrous year for that team.)
Hmm.
 
Centerstream:
I'm sure you're aware of one team that scheduled all 3 ;). It turns out, as they said in Star Wars, there is another: Washington State. For UConn, all three were non-conference. For Washington St, Oregon was a conference game and the other two were self-inflicted.

The teams that scheduled two of them are Georgia (Baylor [non-conference] and South Carolina), Indiana (Baylor and South Carolina [both non-conference]), Kansas St (Baylor and Oregon [non-conference]) and Oklahoma St (Baylor and Oregon [non-conference]).

To me, it's unsurprising that there was so much agreement. Maybe these really are the three best teams?
 
The problem is that services like Massey require teams to schedule the right teams. If they have not scheduled those teams, the sample size is not sufficient. For example, if UConn had not scheduled Baylor or Oregon, they would probably still be rated in the top tier. It required those games to move them into a more realistic grouping.

The flaw in the system was shown by the projected score of the game between UConn and Oregon: they had it as a close game. Statistics have value when put into perspective, but bottom line, a realistic eyeball test will beat stats any time. I believe most basketball-knowledgeable people had Oregon as the top team even after they lost to Louisville. Although the rankings have to reflect won-lost records, that metric does not always hold true. It appears that because of the development of parity in WCBB, a loss or a win no longer has the same value it once had.
 

msf22b

Interesting...not a stats man myself

But the results seem right.
(unless we beat up on South Carolina).
 
The problem is that services like Massey require teams to schedule the right teams. If they have not scheduled those teams, the sample size is not sufficient.
It's a good point that while all of these services are eager to provide point estimates, none are willing to part with measures of uncertainty; that is, none of them say how accurate their estimates are. Elo and the Massey estimates are model-based, so I expect their properties are quite good; the further into the season we go, the better the estimates are. HHS is proprietary, so I couldn't say about that one. RPI is ad hoc, and I expect it does not have good properties.
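
To illustrate what "model-based" means for Elo: each game nudges the two teams' ratings by comparing the actual result with the win probability implied by the rating gap. This is a generic sketch only; the K-factor of 32 and the 400-point scale are conventional defaults, not Warren Nolan's published parameters.

    # Generic Elo update for a single game (assumed K = 32, 400-point scale).
    elo_update <- function(r_winner, r_loser, k = 32) {
      p_win <- 1 / (1 + 10 ^ ((r_loser - r_winner) / 400))  # expected win probability
      delta <- k * (1 - p_win)                              # bigger surprise, bigger move
      c(winner = r_winner + delta, loser = r_loser - delta)
    }

    elo_update(1550, 1600)   # an upset moves the ratings more than a chalk result does

The update gives new point estimates after every game, but, as noted above, not a measure of how accurate those estimates are.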

For example, if UConn had not scheduled Baylor or Oregon, they would probably still be rated in the top tier.
Stony Brook (23-1) and Princeton (17-1) would be counterexamples to that. Each team has lost only one game, but they haven't beaten any really good teams either. To be elite, you have to beat really good teams. Baylor has beaten 7 Massey top 50 teams; Oregon has beaten 5 Massey top 50 teams (including 4 top 11 wins); and South Carolina has 11 Massey top 50 wins (including two top 5 wins). On the other hand, Princeton has only 1 Massey top 50 win, and Stony Brook doesn't even have a Massey top 100 win.

Another counterexample would be Maryland (17-4), which has 9 Massey top 50 wins and is ranked 5th in Massey despite having 4 losses. Our problem is not that we scheduled Baylor and Oregon; it's that we didn't schedule enough teams that ended up being really good. Our best win (so far) is DePaul. Our other Massey top 50 wins are Ohio State and Tennessee. That's it, just 3 Massey top 50 wins.

Statistics have value when put into perspective, but bottom line, a realistic eyeball test will beat stats any time.
I completely agree that there are things that are not captured in the inputs to these rating scales (points for, points against, home/away/neutral site, etc.), and that these estimates are not the be-all and end-all. However, whether an eyeball estimate is better will depend on whose eyeball it is. I'd argue that any eyeball estimate can be improved by looking at (and understanding the strengths and weaknesses of) rating scales.
 
I never could understand stats...

Like KenPom, which seems to value efficiency and tempo over wins...you can earn a higher score with an efficient loss than with an inefficient win.
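
"Efficiency" here is points per possession, which is why an efficient loss can grade out better than an inefficient win. A rough R sketch, using one common possession estimate (the 0.475 free-throw coefficient is a conventional choice, not necessarily KenPom's exact formula):

    # Rough possession estimate and offensive efficiency (points per 100 possessions).
    possessions <- function(fga, oreb, tov, fta) fga - oreb + tov + 0.475 * fta
    off_efficiency <- function(points, poss) 100 * points / poss

    # A team that loses while scoring 1.05 points per possession grades out
    # better on this metric than a team that wins while scoring 0.85.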

Looking at men's basketball..as an example of my puzzlement.

NET...WVU lost on the road to then NET #58 Oklahoma and moved up from NET #10 to #9.

...Or NET with Michigan State: with 8 losses, including a 29-point blowout loss to Purdue, they are ahead of 3-loss FSU, who beat NET #6 Louisville in the Yum Center and beat VT at VT (a team Michigan State lost to).

Is NET like the football "SEC Effect"? The Big Ten has 11 teams in the Top 40...sweeping 12-11 Minnesota gives you credit for two top 50 wins?
 
... or, alternatively: figures don't lie, but liars figure. If you understand what assumptions are being made and can assess if those assumptions are being satisfied, you won't get fooled.
 
I guess my problem is the latter part....my ability to assess the assumptions made in various indexes versus real-world power.
 
Here are some questionable assumptions:
  • As mentioned above with reference to the Oregon-UConn game, these scales don't use information about how teams match up with one another.
  • As far as I know, all of these scales assume that how good teams are (relative to one another) doesn't change throughout the season. For example, Baylor's loss to South Carolina is evaluated without reference to Cox's injury.
Note that it is necessary to make some assumptions to produce these estimates.
 
Nice work! You are correct that Massey and the others do not take into account missing players or the evolution of team strength over the course of the season. Massey does take into account whether the game is home, away or neutral, and includes margin of victory. RPI is just some trash; it's based on who you played, but not whether you beat them.
 
To the best of my knowledge, Elo does not take home/away/neutral into account. RPI does take wins and losses into account; it just weights them far less than other rating scales do (25%), and it does not use margin of victory. Again, since HHS is proprietary, I have no idea what they do.
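
For reference, the weighting in question is the standard RPI formula: 25% the team's own winning percentage, 50% opponents' winning percentage and 25% opponents' opponents' winning percentage (ignoring any home/road adjustments a particular implementation may apply). As a one-line R sketch:

    # Standard RPI weighting: wp = own winning pct, owp = opponents' winning pct,
    # oowp = opponents' opponents' winning pct.
    rpi <- function(wp, owp, oowp) 0.25 * wp + 0.50 * owp + 0.25 * oowp

Only a quarter of the formula reflects the team's own wins and losses, which is the 25% referred to above.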

I share your frustration with RPI. Does anybody believe there is any chance at all that Missouri State (#4) is better than Baylor (#8)? Would anybody rather see Missouri State in the NCAA tournament instead of Baylor, or have a higher seed than Baylor?
 
I thought of another questionable assumption:
  • Rematches are treated like any other game; the experience and knowledge gained from the first meeting is treated the same as the experience and knowledge gained from playing any other team.
 
