
Massey analytics

Azfan

This is the first time I've spent any time looking at the Massey rankings. I may be misinterpreting or misusing the data, but I wonder about the following questions.

1. Would one conclude from Massey that Northwestern is a stronger defensive team than Baylor?

2. Understanding that not all of the top 64 in Massey would make the big dance because of automatic bids, am I right to interpret this as saying the University of Southern Cal in the Pac-12 would be a tournament team?

3. I had to smile at DePaul's defensive ranking. Does anyone really believe that they are 225th defensively in D1?

4. I haven't seen Gonzaga play. Are they really a top defensive team?

I guess what I'm wondering is how Massey accounts for quality of opponent and conference strength.

It seems to me won-loss record alone would not be very helpful.

In any event it was fun to look at the evaluations and analytics at this point in the season.
 
The thing to remember about the Massey offense/defense numbers is that they only look at the raw point totals, instead of the underlying details. It is more accurate to measure offense and defense on a per-possession basis, since teams with different styles of play can have large disparities in the number of possessions per game. DePaul is a good example, because they play notoriously quickly. This leads to very high point totals, but can also make their defense look bad because they give the other side more chances as well. Their defense isn't nearly that bad, but their offense is not as good as their Massey rating suggests either.
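A minimal sketch of the per-possession adjustment in R (the possession formula below is a standard box-score approximation, not necessarily the one any particular site uses, and the numbers are made up):

# Estimate possessions from box-score totals; the 0.475 weight on
# free-throw attempts is a commonly used approximation, not an official constant.
possessions <- function(fga, orb, tov, fta) {
  fga - orb + tov + 0.475 * fta
}

# Points per 100 possessions: the pace-adjusted way to rate offense or defense.
per_100 <- function(pts, poss) 100 * pts / poss

# A fast-paced team allowing 70 points over 80 possessions (87.5 per 100)
# is defending better than a slow team allowing 62 over 60 (103.3 per 100),
# even though its raw points-allowed total looks worse.
per_100(70, 80)  # 87.5
per_100(62, 60)  # 103.3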

Her Hoop Stats provides better data for this because they factor pace of play into their numbers. In their defensive metric, Baylor has by far the best defense, with Northwestern at a respectable 13th. This happens because Baylor plays at a relatively fast pace, while Northwestern is extremely slow. On the DePaul example, Massey has them as the number 1 offense and 225th defense; Her Hoop Stats puts those numbers at 13 and 99. The overall Massey rating is still valid, because the numbers cancel each other out, but the split offense and defense numbers can be exaggerated.

Here is the link to the Her Hoop Stats numbers if you are interested: Her Hoop Stats Rtg National Team Leaderboard | NCAA Division I Women's Basketball | Her Hoop Stats
 
Azfan
The thing to remember about the Massey offense/defense numbers is that they only look at the raw point totals, instead of the underlying details...
Thanks!

Kyla Irwin is currently No. 1 in eFG% on HHS.
 

Centerstream

It's a computer program...
 

Centerstream

And? No one is claiming the computers are perfect, but they can provide real insight and unbiased numbers. I have found that Massey can give some scarily accurate predictions for game scores.
A computer program can be manipulated to produce specific, designed, biased and/or favorable data.
For instance, it can be programmed to give a school or conference a certain advantage, imo.
 

TheFarmFan

A computer program can be manipulated to produce specific, designed, biased and/or favorable data.
For instance, it can be programmed to give a school or conference a certain advantage, imo.
Massey is completely transparent in his formula and has applied it for over a decade across over a dozen sports. So while that may be abstractly true, it's inapplicable here.
 
FanFromMaine
The thing to remember about the Massey offense/defense numbers is that they only look at the raw point totals, instead of the underlying details...

I took a quick look at HHS and appreciate that you identified it as an option; I have been trying to find a second analysis against which to compare Massey.

At first glance the two teams with the biggest variance between the systems among the top ten were Maryland and Louisville. HHS basically inverts the order from Massey.

Both teams have been hard for me to read this season and, since Charlie Creme continues to tout Louisville as a 1 seed, I have been watching their rankings more closely.

Can someone identify for me the metrics that Charlie uses to justify Louisville's seed status? Given the flat slope of top teams, it does not bother me whether UConn is a 1 or 2 seed. I'm just curious.
 

Plebe

Can someone identify for me the metrics that Charlie uses to justify Louisville's seed status? Given the flat slope of top teams, it does not bother me whether UConn is a 1 or 2 seed. I'm just curious.
Louisville has 7 wins over teams in "quadrant 1" (1-50); only South Carolina has more. They also have an excellent win on a neutral court over Oregon. There are not four teams with a better resume than Louisville for now.
 
OFID
FFM:
In addition to the Massey rating and the HHS rating, I looked at Louisville vs Maryland using the Massey power, RPI, and Elo ratings. I also looked at a few other teams I was interested in: Mississippi Valley State and Coppin State, two dreadful teams, and Baylor.

I visualized the data using spaghetti plots (yes, that's a thing; feel free to look it up on the internet). Spaghetti plots overlay multiple observations for each individual (in this case, for a team) connected by a line. This technique is typically used for multiple measurements of the same thing across time. When doing that, all of the measurements are on the same scale and there is a natural ordering for the observations. In this application, neither is true. To get everything on the same scale, I transformed each scale to have mean zero and variance one. This transformation not only preserves the ordering within each scale, it also preserves the proportions of the differences between the teams. I ordered the rating scales by attempting to put the most similar scales next to each other.
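For concreteness, a rough sketch of that standardization and plot in R (the data frame and column names here are illustrative):

# ratings: one row per team, one column per rating system.
z <- scale(as.matrix(ratings[, c("massey", "power", "hhs", "elo", "rpi")]))

# matplot draws one line per team across the ordered rating systems;
# crossing lines indicate disagreement between the scales.
matplot(t(z), type = "l", lty = 1, xaxt = "n",
        xlab = "Rating system", ylab = "Standardized score")
axis(1, at = 1:5, labels = c("Massey", "Power", "HHS", "Elo", "RPI"))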

If you were to plot the exact same ratings multiple times, the lines would be exactly parallel. Plotting ratings that had the same rankings (i.e., the same orderings) but different scores would result in lines that were not parallel, but did not cross. The more lines cross, the more dissimilar the rankings are.

The spaghetti plots are on the first page of the attached pdf file.

Some observations not directly pertinent to the question at hand:
  • The Massey rating score and the Massey power score are very similar. This is not surprising, as the rating score is derived from the power score.
  • One would expect some differences between the Massey scores and the others, since Massey includes all games and HHS, RPI and Elo only include Division I games. This is reflected in the plot.
  • The distribution of HHS scores is somewhat more long-tailed, with more observations close to the middle, but also observations further away from the middle.
  • The RPI score is very, very different than the Elo score.
Back to the question at hand, Louisville vs Maryland:
  • The Elo and RPI scores have Louisville above Maryland, as the Massey rating score does. The Massey power score has Maryland above Louisville, as the HHS score does, but not by anywhere near as much.
  • As per Plebe's post, looking at the RPI rankings of the opponents for Louisville and Maryland makes it clear that Louisville has accomplished more (page 2 of the pdf), as per RPI. Louisville has a much better record against top 50 teams.
  • Looking at the HHS rankings, it is not as clear which team has accomplished more (page 3 of the pdf). Two of Louisville's opponents (Virginia and Central Michigan) are ranked a little lower on HHS and drop out of the top 50, leaving them with a 5-1 record against top 50 schools. Among Maryland opponents, no teams drop out and Michigan drops in, giving them a 7-4 record against top 50 schools; more wins, but also more losses, and a much lower winning percentage (0.833 vs 0.636). The cutoff at 50 is somewhat arbitrary; if you look instead at the top 60, Louisville is at 8-1 and Maryland is still at 7-4. Louisville has both more wins and a higher winning percentage (0.889 vs 0.636) against top 60 teams.
  • The Massey power ratings are even less convincing than the HHS rankings (page 4 of the pdf). Similarly to HHS, the shift into and out of the top 50 favors Maryland, but if you look just about anywhere past the top 50, Louisville looks better.
  • My overall impression is that the case made by the RPI scores that Louisville has accomplished more is stronger than the case made by the HHS and Massey power scores for Maryland.
The two dreadful teams (Mississippi Valley State and Coppin State):
  • Mississippi Valley State is substantially better than Coppin State on the Massey rating, Massey power and HHS scores, about the same on the Elo score, but much, much worse on the RPI score (page 1 of the pdf).
  • As can be seen from looking at the HHS, RPI or Massey power rankings (pages 2, 3, and 4 of the pdf, respectively), MVS has won only one of their games while Coppin State has lost all of their games; however, Coppin State has lost to much better teams.
  • To be clear, the Coppin State games against those opponents that were ranked higher than MVS's highest ranking opponent were not competitive. They lost to South Dakota by 38, to Rutgers by 74(!), to West Virginia by 35, to South Dakota State by 53, to Dayton by 30, and to Cincinnati by 46.
  • The two teams had one common opponent: Florida A&M, which is another dreadful team (their highest ranking on these five rating scales is 343). Mississippi Valley State defeated Florida A&M by 1 point at home; Coppin State lost to Florida A&M by 10 on the road. I don't know what the home court advantages were, but it's hard to believe that they would account for an 11-point swing.
  • It makes very little sense to me that RPI has Coppin State so much above MVS. Coppin State has apparently gotten a boost by being routed by some good teams, however, my intuition is that Coppin State did nothing in those games to indicate that they were remotely comparable to those good teams. Providing a boost for getting routed by good teams does nothing toward evaluating how good a team is.
Baylor:
  • Baylor is kind of the anti-Coppin State. They are ranked first on all of the rating scales, except for RPI where they drop down to fifth (page 1 of the pdf).
  • Baylor has only lost one game, to South Carolina on a neutral court (when their best player was injured and unable to play), which rates no lower than third on any of these five scales (not shown). That's a better worst loss than South Carolina (Indiana - neutral court), Oregon (Arizona State - away), Louisville (Ohio State - away), and Stanford (Texas - away) and comparable to Connecticut (Baylor - home) ;).
  • RPI appears to be docking Baylor for scheduling and beating six teams ranked below 200, all at home. Six sub-200 opponents is more than any of the teams in the previous bullet, and the other teams played at least one sub-200 team on the road.
  • I agree that beating awful teams at home should do little, if anything at all, to build one's resume, but I don't see why one would (functionally) assess a penalty for it. Squandering an opportunity on a bad opponent should be penalty enough. Assessing a penalty also does nothing towards evaluating how good a team is.
The Massey rating scales, the HHS rating scale and the Elo scale all appear to be different assessments of how good a team is, but the above comparison of MVS/Coppin State and the examination of Baylor suggests that RPI is not trying to do that. Are they trying to evaluate success in the NCAA Tournament (i.e., more experience against better teams will help against the better teams in the Tournament)? What are they aiming for? Also, since the only way for bad teams to get better in RPI is to play good teams, and good teams playing bad teams is penalized by RPI, RPI is structurally making it more difficult for bad teams to get better.
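For reference, the standard RPI weighting makes the mechanism behind the Coppin State oddity clear. A minimal R sketch (using the published 25/50/25 weights and ignoring the NCAA's home/road adjustments to winning percentage):

# RPI = 0.25 * own winning pct + 0.50 * opponents' winning pct
#     + 0.25 * opponents' opponents' winning pct.
# A team's own record is only a quarter of the formula, which is why
# losing repeatedly to strong opponents can still lift a weak team.
rpi <- function(wp, owp, oowp) 0.25 * wp + 0.50 * owp + 0.25 * oowp

# Hypothetical: a winless team whose opponents win 70% of their games
# can out-rate a .500 team that played weak opposition.
rpi(0.00, 0.70, 0.55)  # 0.4875
rpi(0.50, 0.30, 0.45)  # 0.3875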

Notes:
  • The Massey ratings were obtained from the Massey site: Massey Ratings - CBW. The team schedules were obtained from each team page on that site. Here's an oddity: the cutoff for inclusion in the ratings is games played by 11:59 pm EST. Hawaii played a home game on 1/30 that was scheduled at 7 pm local time, which is midnight on 1/31 EST. That game was not included in the 1/31 report.
  • I'm getting HHS ratings from the Her Hoops Stats site: Her Hoop Stats Rtg National Team Leaderboard | NCAA Division I Women's Basketball | Her Hoop Stats. The individual team schedules are from the Massey site.
  • I'm getting the Elo rating from Warren Nolan site: ELO Chess Ranking 2020 Womens College Basketball | WarrenNolan.com. The individual team schedules are from the Massey site.
  • Lastly, I'm getting the RPI data from the Warren Nolan site: RPI (Live) 2020 Women's College Basketball | WarrenNolan.com. The RPI rankings are given on that page (twice), but the RPI score for each team is only on the team page. It takes about 2-3 minutes for my program to retrieve the rankings for 351 teams. It would be easier to get them from the Real Time RPI site, but I have lost faith in the quality of their data. For about 3 1/2 weeks in January 2020, they included UConn's loss to Louisville on 1/31/18 (no, we didn't play Louisville this season; that's a game from last season). RTRPI finally fixed that, but they're still including Savannah State in the rankings, a school that is no longer in the MEAC, having made the transition from Division I to Division II this year, and has no games scheduled against Division I opponents.
  • All programming and analysis was done in R.
 

Attachments

  • Massey Analytics 10.pdf (29.9 KB)

TheFarmFan

Baylor:
  • Baylor is kind of the anti-Coppin State. They are ranked first on all of the rating scales, except for RPI where they drop down to fifth (page 1 of the pdf)...
This is excellent analysis - many thanks! I've always viewed RPI as the tool that compromises between a statistical measure of the best team and a reward for teams that "scheduled tough" or at least don't "schedule easy." Now, there's a loophole whereby teams can schedule against winning teams in bad conferences and it stuffs the RPI more than it should, so it's not perfect at incentivizing scheduling up, either.

All things considered, however, I'm not opposed to RPI docking Baylor a bit for scheduling so many little sisters of the poor. IMHO, WCBB is hurt by two main factors: ugly, lopsided scoresheets that suggest the sport is not credible beyond a handful of teams, and grossly sub-30% shooting in televised games. More than anything else, those two factors, I believe, give haters cause to hate.

Baylor beating up on little sisters of the poor -- who shoot at around 20-25% and have no business being on TV, but who will almost inevitably be televised at least sometimes because they're playing Baylor -- provides that ammunition. If RPI dings Baylor for it, fine by me.

Incidentally, Tara has never copped to this, but I have watched Stanford's substitution patterns for well over a decade now, and it seems pretty clear to me that Tara likes to avoid winning by much more than 30 if she can help it. Once Stanford is up 30, the first string is on the bench for the night. Some like Mulkey (and often Geno) do not seem so inclined, and I personally don't think it's a great look for the sport.
 
Azfan
I think you make an interesting point that is worth testing: your assertion that there are two ongoing issues in WBB that affect demand, the first being playing extremely weak non-conference schedules and the second being sub-30% shooting in televised games.

On the first I have to intuitively agree with you. It would be interesting to see a comparison between men's and women's college basketball scheduling. Specifically looking at the scheduling of the top 10 teams in each sport. I wonder if some of the top 10 teams in men's basketball also pad their resumes through weaker scheduling. Alternatively, it might be that in men's college basketball there are simply more strong programs relative to the total number of teams.

As to your second concern regarding poor shooting in televised games: if this occurs between two comparable, particularly strong, teams, I'm not sure that it should detract from demand.

For example, in the UCLA-Arizona game last night, the Wildcats, particularly in the first half, shot the lights out of the ball. Part of the reason UCLA had so much trouble hitting shots was, I think, the Wildcat defense.

Of course my view may be biased due to my support of ASU basketball. CTT in her years in Tempe has always emphasized defense over offense. So I've come to admire and appreciate strong defensive efforts.

In any event, I appreciate your observations, and the mammoth amount of work done by the poster who analyzed the various rating and analytics sites and applied them to various teams.

Very impressive.
 
FanFromMaine
FFM:
In addition to the Massey rating and the HHS rating, I looked at Louisville vs Maryland using the Massey power, RPI, and Elo ratings...

I am in awe of your effort; very nicely done.
At the risk of being piggish, may I ask a couple of follow-up questions?

You mention Elo scores which Sagarin used as well. Where did you obtain those data on women’s basketball since Sagarin appears to be no more? Also, I have always been curious why Massey has both a rating and a power score. As you say, they correlate well but what makes them different at all?

You make a good point in general about the ratings: are they only meant to be a mathematical picture in time or are they meant to be predictive? It took me a long time to truly reconcile myself to the fact that the SATs were meant to be predictive of college performance and might not be a great picture in time of intelligence (and I did well on the tests).

But, as a fan of UConn’s women’s basketball team, I guess I want to see into the future (although I am trying to be more cognizant of the pleasure of the moment). In that vein, surprisingly to me, I would have to agree with Plebe that the experience of games against the top teams does seem more predictive of success in the tourney.
 

TheFarmFan

It would be interesting to see a comparison between men's and women's college basketball scheduling. Specifically looking at the scheduling of the top 10 teams in each sport. I wonder if some of the top 10 teams in men's basketball also pad their resumes through weaker scheduling. Alternatively, it might be that in men's college basketball there are simply more strong programs relative to the total number of teams.
Well, I guess my unstated premise is that men's basketball has more depth of "good" players in D1, and thus more parity, and so a top ten men's team scheduled against a sub-200 team can still generate the occasional upset. (See, e.g., Stephen F. Austin vs. Duke. There's no universe where that happens to WCBB teams like UConn, Baylor, South Carolina, etc.)

As to your second concern regarding poor shooting in televised games: if this occurs between two comparable, particularly strong, teams, I'm not sure that it should detract from demand....

Of course my view may be biased due to my support of ASU basketball. CTT in her years in Tempe has always emphasized defense over offense. So I've come to admire and appreciate strong defensive efforts. . . .
Now don't get me wrong, I am a huge fan of well played defense, and I too am a CTT partisan. But I do think, at the end of the day, the reason why the media gravitates to the Ionescus and Ogunbowales of the world is their offensive production, and especially their "trick shot" capabilities. Because the men can dunk, have the physical capabilities to do more trick shots, etc., even when playing against a Virginia-esque defense, an opposing team still generates an exciting highlight reel.

Now, well executed defenses against capable offenses can still be really fun to watch. I'm thinking of the ASU vs. OSU game, which featured plenty of steals, blocked shots, etc., which are the defensive end's form of highlight-reel plays.

What truly turns fans off (including, I'll be honest, this one), is trip after trip up the court that ends with a clunker or airball after 20-30 seconds of passing the ball around. A lot of mismatch games against little sisters of the poor feature that. When one team starts off shooting 1-1X while the other scores 30 points in a quarter, who wants to watch that? You know 8 minutes into the game it's over. And for the casual fan, what does it say about the quality of the product that that game is being featured on TV?

But, as a fan of UConn’s women’s basketball team . . . I would have to agree with Plebe that the experience of games against the top teams does seem more predictive of success in the tourney.
Totally agree. This is why I always pick the team with more "good wins" and "bad losses" over a team with fewer total losses but fewer good wins. Last year's UCLA team was exemplary of that, and it took a rise-to-the-occasion Crystal Dangerfield for UConn to escape them deep in the 4th quarter.
 
OFID
Hmm... can't seem to figure out how to get quoting to work right.

Azfan said
It would be interesting to see a comparison between men's and women's college basketball scheduling. Specifically looking at the scheduling of the top 10 teams in each sport. I wonder if some of the top 10 teams in men's basketball also pad their resumes through weaker scheduling.

I have attached a pdf file containing the Massey scores for the AP top 25 women's (page 1) and men's programs.

I didn't count carefully, but there appear to be somewhat more sub-200 games for the men than the women, driven in part by Gonzaga and San Diego State both playing in conferences with three sub-200 teams.

Azfan also said
In any event, I appreciate your observations, and the mammoth amount of work done by the poster who analyzed the various rating and analytics sites and applied them to various teams.

Very impressive.

Thanks! I'm happy to contribute.

TheFarmFan said
Well, I guess my unstated premise is that men's basketball has more depth of "good" players in D1, and thus more parity, and so a top ten men's team scheduled against a sub-200 team can still generate the occasional upset.

I scanned the fifty or so men's games in which a top 25 team played a sub-200 team, and I'd guess 90% of them were decided by at least 20 points. There were no upsets (Stephen F Austin is currently ranked 75th by Massey; I have no clue what they were ranked before they beat Duke).
 

Attachments

  • Massey ratings for NCAAW and NCAAM AP Top 25 Schedules.pdf (16.7 KB)

OFID
FanFromMaine said
I am in awe of your effort; very nicely done.
At the risk of being piggish, may I ask a couple of follow-up questions?
Thanks! Questions are welcome; I really enjoy the discussion.

FanFromMaine also said
You mention Elo scores which Sagarin used as well. Where did you obtain those data on women’s basketball since Sagarin appears to be no more?

I also miss the Sagarin ratings. I don't know why he stopped doing NCAAW, or what could be done to encourage him to restart.

I was concerned when my post got that long that folks wouldn't make it to the bottom. The Elo data are from ELO Chess Ranking 2020 Womens College Basketball | WarrenNolan.com.
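Warren Nolan doesn't publish his exact parameters, but for anyone unfamiliar with Elo, the canonical chess-style update looks like this in R (the K-factor of 32 is just the common chess default):

# Canonical Elo update: the winner takes points from the loser in
# proportion to how surprising the result was.
elo_update <- function(ra, rb, score_a, k = 32) {
  expected_a <- 1 / (1 + 10^((rb - ra) / 400))  # win probability implied by the rating gap
  ra + k * (score_a - expected_a)               # score_a: 1 = win, 0 = loss
}

elo_update(1500, 1500, 1)  # winner of an even matchup gains 16 points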

FanFromMaine also said
Also, I have always been curious why Massey has both a rating and a power score. As you say, they correlate well but what makes them different at all?

Massey explains this here: Massey Ratings Description.
The key sentences are
  • The overall team rating is a merit based quantity, and is the result of applying a Bayesian win-loss correction to the power rating.
  • The Massey Ratings are designed to measure past performance, not necessarily to predict future outcomes.
  • In contrast to the overall rating, the Power is a better measure of potential and is less concerned with actual wins-losses.
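To make that distinction concrete, here is a toy illustration in R. This is NOT Massey's actual formula (he doesn't publish it in closed form); it just sketches the general idea of nudging a score-based power value toward what the win-loss record implies:

# Toy sketch only, not Massey's method: blend the power score with a
# value implied by the win-loss record; w controls how much weight the
# record gets relative to scoring margin.
rating_toy <- function(power, winpct, w = 0.3) {
  record_score <- qnorm(pmin(pmax(winpct, 0.01), 0.99))  # win pct mapped to a z-like scale
  (1 - w) * power + w * record_score
}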
 
OFID
I finally went back and looked more closely at the games in which AP Top 10 teams played a sub-200 opponent.

The women's teams collectively played 41 such games. The median margin of victory was 41 points; half of the MOVs were between 34 and 57 points. Of the 41 games, 37 (90.2%) had a margin of victory of at least 20 points. The smallest MOV was 5 points (Stanford over Cal Baptist). A stem-and-leaf plot looks like this:

The decimal point is 1 digit(s) to the right of the |

0 | 5
1 | 149
2 | 26899
3 | 34455666
4 | 001123479
5 | 045678
6 | 02588
7 | 04
8 | 01

The first line means that there was one game with an MOV between 5 and 10 points, and the MOV was 5. The second line means that there were three games between 10 and 20 points, and the MOVs were 11, 14, 19, and so on.
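These displays appear to be base R's stem() output (all of my analysis is in R). A minimal reproduction, with only the MOVs readable off the first three rows filled in:

# mov: the 41 margins of victory; the first nine values below are read
# off the plot above, the remaining 32 are omitted here.
mov <- c(5, 11, 14, 19, 22, 26, 28, 29, 29)  # plus 32 more values
stem(mov)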

Here's where I'm going to have to eat some crow.

I was correct that there were more games for the men (48 vs 41), but that was about it. The median MOV for the men was 26; half of the MOVs were between 20 and 32. Of those 48 games, 36 (75%, which is definitely not about 90%) had an MOV of at least 20. The smallest MOV was 2 points (San Diego St over San Jose St). The stem-and-leaf plot looks like this:

The decimal point is 1 digit(s) to the right of the |

0 | 2
1 | 23334567777
2 | 00133445555667888999
3 | 111224569
4 | 1344
5 | 017

The shapes of the distributions differ some, but, in general, the men's distribution is shifted ~15 points to the left of the women's distribution; i.e., the men's games were more competitive. I'm not sure that I'd characterize them as competitive in an absolute sense (only 1 game was decided by < 12 points), but they were more competitive than the women's.
 

DefenseBB

FFM:
In addition to the Massey rating and the HHS rating, I looked at Louisville vs Maryland using the Massey power RPI and Elo ratings. I also looked at a few other teams I was interested in: Mississippi Valley State and Coppin State, two dreadful teams, and at Baylor.

I visualized the data using spaghetti plots (yes, that's a thing; feel free to look it up on the internet). Spaghetti plots overlay multiple observations for each individual (in this case, for a team) connected by a line. This technique is typically used for multiple measurements of the same thing across time. When doing that, all of the measurements are on the same scale and there is a natural ordering for the observations. In this application, neither is true. To get everything on the same scale, I transformed each scale to have mean zero and variance one. This transformation not only preserves the ordering within each scale, it also preserves the proportions of the differences between the teams. I ordered the rating scales by attempting to put the most similar scales next to each other.

If you were to plot the exact same ratings multiple times, the lines would be exactly parallel. Plotting ratings that had the same rankings (i.e., the same orderings) but different scores would result in lines that were not parallel, but did not cross. The more lines cross, the more dissimilar the rankings are.

The spaghetti plots are on the first page of the attached pdf file.

Some observations not directly pertinent to question at hand:
  • The Massey rating score and the Massey power score are very similar. This is not surprising as the ratings score is derived from the power score.
  • One would expect some differences between the Massey scores and the others, since Massey includes all games and HHS, RPI and Elo only include Division I games. This is reflected in the plot.
  • The distribution of HHS scores is somewhat more long-tailed, with more observations close to the middle, but also observations further away from the middle.
  • The RPI score is very, very different than the Elo score.
Back to the question at hand, Louisville vs Maryland:
  • The Elo and RPI scores have Louisville above Maryland, as the Massey rating score does. The Massey power score has Maryland above Louisville, as the HHS score does, but not by anywhere near as much.
  • As per Plebe's post, looking at the RPI rankings of the opponents for Louisville and Maryland make it clear that Louisville has accomplished more (page 2 of the pdf), as per RPI. Louisville has a much better record against top 50 teams.
  • Looking at the HHS rankings, it is not as clear which team as accomplished more (page 3 of the pdf). Two of Louisville's opponents (Virginia and Central Michigan) are ranked a little lower on HHS and drop out of the top 50, leaving them with a 5-1 record against top 50 schools. Among Maryland opponents, no teams drop out and Michigan drops in, giving them an 7-4 record against top 50 schools; more wins, but also more losses, and a much lower winning percentage (0.833 vs 0.637). The cutoff at 50 is somewhat arbitrary; if you look instead at the top 60, Louisville is at 8-1 and Maryland is still at 7-4. Louisville has both more wins and a higher winning percentage (0.889 vs 0.637) against top 60 teams.
  • The Massey power ratings are even less convincing than the HHS rankings (page 4 of the pdf). Similarly to HHS, the shift into and out of the top 50 favors Maryland, but if you look just about anywhere past the to 50, Louisville looks better.
  • My overall impression is that the stronger case is made the RPI scores that Louisville has accomplished more, compared to the case made by the HHS and Massey power scores for Maryland.
The two dreadful teams (Mississippi Valley State and Coppin State):
  • Mississippi Valley State is substantially better than Coppin State on the Massey ratings, Massey power and HHR scores, about the same on the Elo score, but much, much worse on the RPI score (page 1 of the pdf).
  • As can be seen from looking at the HHS, RPI or Massey power rankings (pages 2, 3, and 4 of the pdf, respectively), MVS has won only one of their games while Coppin State has lost all of their games, however, Coppin State has lost to much better teams.
  • To be clear, the Coppin State games against those opponents that were ranked higher than MVS's highest ranking opponent were not competitive. They lost to South Dakota by 38, to Rutgers by 74(!), West Virginia by 35, South Dakota State by 53, Dayton by 30, and Cincinnati by 46.
  • The two teams had one common opponent: Florida A&M, which is another dreadful team (their highest ranking on these five rating scales is 343). Mississippi Valley State defeated Florida A&M by 1 point at home; Coppin State lost to Florida A&M by 10 on the road. I don't know what the home court advantages were, but it's hard to believe that they would account for an 11 point swing.
  • It makes very little sense to me that RPI has Coppin State so much above MVS. Coppin State has apparently gotten a boost by being routed by some good teams, however, my intuition is that Coppin State did nothing in those games to indicate that they were remotely comparable to those good teams. Providing a boost for getting routed by good teams does nothing toward evaluating how good a team is.
Baylor:
  • Baylor is kind of the anti-Coppin State. They are ranked first on all of the rating scales, except for RPI where they drop down to fifth (page 1 of the pdf).
  • Baylor has only lost one game, to South Carolina on a neutral court (when their best player was injured and unable to play), which rates no lower than third on any of these five scales (not shown). That's a better worst loss than South Carolina (Indiana - neutral court), Oregon (Arizona State - away), Louisville (Ohio State - away), and Stanford (Texas - away) and comparable to Connecticut (Baylor - home) ;).
  • RPI appears to be docking Baylor for scheduling and beating six ranked below 200, all at home. Six sub-200 opponents is more than any of the teams in the previous bullet, and the other teams played at least one sub-200 team on the road.
  • I agree that beating awful teams at home should do little, if anything at all, to build one's resume, but I don't see (functionally) assessing a penalty for that. Squandering an opportunity on a bad opponent should be penalty enough. Assessing a penalty also does nothing towards evaluating how good a team is.
The Massey rating scales, the HHS rating scale and the Elo scale all appear to be different assessments of how good a team is, but the above comparison of MVS/Coppin State and the examination of Baylor suggests that RPI is not trying to do that. Are they trying to evaluate success in the NCAA Tournament (i.e., more experience against better teams will help against the better teams in the Tournament)? What are they aiming for? Also, since the only way for bad teams to get better in RPI is to play good teams and good teams playing bad teams is penalized by RPI, RPI is structurally making it more difficult for bad teams from getting better.

Notes:
  • The Massey ratings were obtained from the Massey site: Massey Ratings - CBW. The team schedules were obtained from each team page on that site. Here's an oddity: the cutoff for inclusion in the ratings is games played by 11:59 pm EST. Hawaii played a home game on 1/30 that was scheduled at 7 pm local time, which is midnight on 1/31 EST. That game was not included in the 1/31 report.
  • I'm getting HHS ratings from the Her Hoops Stats site: Her Hoop Stats Rtg National Team Leaderboard | NCAA Division I Women's Basketball | Her Hoop Stats. The individual team schedules are from the Massey site.
  • I'm getting the Elo rating from Warren Nolan site: ELO Chess Ranking 2020 Womens College Basketball | WarrenNolan.com. The individual team schedules are from the Massey site.
  • Lastly, I'm getting the RPI data from the Warren Nolan Site: RPI (Live) 2020 Women's College Basketball | WarrenNolan.com. The RPI rankings are given on that page (twice), but the RPI score for each team is only on the team site. It takes about 2-3 minutes for my program to retrieve the rankings for 351 teams. It would be easier to get them from the Real Time RPI site, but I have lost faith in quality of their data. For about 3 1/2 weeks in January 2020, they included UConn's loss to Louisville on 1/31/18 (no, we didn't play Louisville this season, that's a game from last season). RTRPI finally fixed that, but they're still including Savannah State in the rankings, a school that in no longer in the MEAC, having made the transition from Division I to Division II this year and has no games scheduled against Division I opponents.
  • All programming and analysis was done in R.
And I thought I had analytical issues...man, you make me look remedial by comparison. Great job OFID, great job... My one quibble is that I do think teams should be penalized for playing sub-200 teams at home (which Baylor does ALL THE TIME); while they sprinkle in an occasional UConn, Tennessee or Stanford, the Northwesternsoutheasterncentraltexasvalleystates are played so regularly that those games give no true indication of how good a school is. Until Kim is penalized more in seeding, she will continue to do this, just as Brenda did and has had to modify her ways (however slightly).

While I, too, am not a huge fan of Louisville's lofty ranking, with the ACC down so much this year, I fear we will never know their true rank until the NCAA tournament (if UConn must be a #2 seed, put us in with them!). :rolleyes:
 
FanFromMaine
I finally went back and looked more closely at the games in which AP Top 10 teams played a sub-200 opponent...

Again, thanks for the effort.

I was surprised that the men's top ten teams have as many non-competitive games as they do. But I was even more surprised that the percentage of MOVs over 20 points is as close between the men and the women as it is. Intuitively I would have thought it would be more of a 90% to 40% picture.

Perhaps because Geno wants UConn to always play as well as possible and we play in a weak conference, one infers that routs are the norm in the women's game. I support this strategy as Geno's goal is a competent rotation (usually 6-7 people) come tournament time. But it does leave him open to claims of running up the score.
 

DefenseBB

Perhaps because Geno wants UConn to always play as well as possible and we play in a weak conference, one infers that routs are the norm in the women's game. I support this strategy as Geno's goal is a competent rotation (usually 6-7 people) come tournament time. But it does leave him open to claims of running up the score.
You know, you hit upon an interesting topic. I wonder why Geno doesn't do more frequent rotations of the 6/7/8 players, like subbing at the 3/4/5 minute mark of each quarter, to keep the energy flowing and the team mixing it up. His pattern might be part of his demise, as the team gets too set in their scoring malaise, lowering the MOV.
 
