Geeky Recruiting Data Shit

Pretty good start @AIRWOLF. Couple things I would consider:

1. Sort/order by star ranking (best to worst), not alphabetical. The idea is you want to show the important information in order.

2. The message you send with a line chart is greatly affected by aspect ratio. There is no perfect ratio, but generally 2:1 (w:h) is about right. In this case, a 0.2 star ranking change shows up as a steep increase or decrease. Columns or bars might be a better choice.

3. Are you trying to emphasize the trend over time, or compare average ranking of each school? Comparing the rank of each school can be done much more simply (11 bars ordered by avg rank over this time period, Cal excluded of course).
Comparing the trend of multiple items can be difficult in a limited space. A good method is to put all lines on a single chart, with a selected school using their color, all other schools being gray/semi-transparent in the background. Your visualization tool might not do this, though.

4. Be careful web-scraping TBS stuff. My cousin had the FBI knock on his door for similar activity. Not good.
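
If your tool can't do the highlight-one-school chart from point 3, here is a rough sketch of the idea in Python. It assumes matplotlib is installed for the drawing part, and the schools, numbers, and hex color are made up for illustration, not real recruiting data.

```python
# Hypothetical data: average star rating per class year (made-up numbers).
YEARS = [2014, 2015, 2016, 2017]
AVG_STARS = {
    "Washington": [3.0, 3.2, 3.3, 3.5],
    "USC": [3.8, 3.9, 3.7, 3.8],
    "Stanford": [3.4, 3.3, 3.5, 3.4],
}
TEAM_COLORS = {"Washington": "#4B2E83"}  # assumed hex; swap in your own


def line_style(school, selected, team_colors):
    """Plot kwargs: team color for the selected school, faint gray for the rest."""
    if school == selected:
        return {"color": team_colors.get(school, "black"), "alpha": 1.0,
                "linewidth": 2.5, "zorder": 3}
    return {"color": "gray", "alpha": 0.35, "linewidth": 1.0, "zorder": 1}


def plot_highlighted(selected):
    """Draw every school on one chart with only `selected` in color."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    fig, ax = plt.subplots(figsize=(8, 4))  # roughly 2:1 width:height
    for school, values in AVG_STARS.items():
        ax.plot(YEARS, values, label=school,
                **line_style(school, selected, TEAM_COLORS))
    ax.set_xticks(YEARS)
    ax.set_ylabel("Average stars")
    ax.set_title(f"{selected} vs. everyone else")
    return fig
```

Calling plot_highlighted("Washington") would return a figure you can save or show. Keeping the styling in its own function means the same gray-out rule can be reused for any selected school.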

Do you even Tableau, bro?

Seriously though, thanks. I have tons more I need to do with this, but I am a retard with this software.

A lot of it is stuff I have done in Excel before, just not over as long a time period.

I really am doing this first and foremost to learn how to use the stupid software. Simple things like changing the colors of the lines and formatting things are easy once you know how to do them, but they are not easy when you don't.

Yep, Tableau is great. It can definitely create the dynamic line chart. This poast may help (the "data-ink" concept is what I was trying to get at).
 
Created a shitty heatmap with dendrograms, interpolating the missing games so we get to play EVERYONE.

Rows are offense, columns are defense.

Spoilers: we lose to USC and are a tossup with dickrod thanks to our shitty offense. Our defense is clustered similarly to USC and Stanford, our offense is similar to that of ASU.

jb1fp8K.png

I have no idea what I am looking at, but I like it.
 
Been geeking out some more...

The 247 Sports Composite has seen some significant grade inflation over the years.

Here is the average of the class averages for the Top 100 classes since 2002:

cxxt0clnordu.jpg
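
For anyone who wants to reproduce that kind of number, here is a rough sketch of the calculation in Python. The input layout (a dict of year to a list of class averages) and the sample values are assumptions for illustration, not the actual 247 data.

```python
from statistics import mean


def top_n_average(class_averages_by_year, n=100):
    """For each year, average the n highest class averages."""
    return {
        year: mean(sorted(averages, reverse=True)[:n])
        for year, averages in class_averages_by_year.items()
    }


# Made-up class averages, three classes per year, just to show the mechanics.
SAMPLE = {
    2002: [80.0, 90.0, 70.0],
    2017: [85.0, 95.0, 75.0],
}

# n=2 here only because the sample is tiny; the chart uses the Top 100.
result = top_n_average(SAMPLE, n=2)
for year in sorted(result):
    print(year, result[year])
```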
 
As a result of this grade inflation, all recruiting "championships" aren't created equal.

On the graph below, I show the class average for the #1 class, adjusted for the average class ranking for FBS teams for that year. The classes USC was putting together in '05 and '06 clearly stand above the rest.

ept6uqdx035f.jpg

And why yes, those are actually the correct official colors on the graph, because I think and I care. I also have no life.
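
A rough sketch of that kind of adjustment in Python, in case anyone wants to play along. Big assumption: this divides the #1 class average by the FBS-wide average for the same year, and a difference instead of a ratio would work just as well. The numbers are made up for illustration, not the actual 247 Composite values.

```python
def adjust_for_inflation(top_class_avg, fbs_avg):
    """Divide each year's #1 class average by that year's FBS-wide class average."""
    return {year: top_class_avg[year] / fbs_avg[year] for year in top_class_avg}


# Made-up numbers: the later class has the higher raw average,
# but the FBS-wide average has drifted up underneath it.
TOP_CLASS_AVG = {2005: 94.5, 2015: 95.04}
FBS_AVG = {2005: 84.0, 2015: 86.4}

adjusted = adjust_for_inflation(TOP_CLASS_AVG, FBS_AVG)
for year, ratio in sorted(adjusted.items()):
    print(year, round(ratio, 3))
```

With these hypothetical inputs the 2015 class has the higher raw average but the lower adjusted one, which is the whole point of adjusting.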
 
Here is the progression of our classes in terms of 247 Composite class average over the years, compared to the average for all "Power 5" teams. Includes 2018 commits through Irvin.

View attachment 357
 
Damn. Last 3 classes are the best since 02, and getting better every year. Pete has this shit rolling.
 
Yep. The other thing to realize is that even if you throw out Sarkisian's 2009 transition class, his recruiting was basically just in-line with the Power 5 average over the 2010-2013 time frame.

Petersen's '16 class was equivalent to Sarkisian's '13 class (Sark's best) on paper. Of course, we all realize that "on paper" is ridiculously flattering to Sark's classes, while it is actually a bit punitive to Petersen's classes.
 
Pete's classes are a stairway to heaven. I love 2018's class. Best I've ever seen at UW since I've cared to pay attention (1996). The raw data clearly shows that my boner will also be raw come this time next year. 2019 offseason natty will be truly speshul.
 
With grade inflation you are probably also seeing some distortion of the data, since there is an upper limit to the grade. I would think that, especially with media bias, our classes end up essentially underrated, while top recruiting classes full of kids with "5 star" ratings end up overstated.
 
I am learning how to use some data visualization software, so of course I am playing around with TBS data.

The data is from the Rivals recruiting database. I'd prefer to use the 247 Composite rankings, but scraping that shit from the web is a major PITA.

Pac-12 North
tqGL9FK.jpg


Pac-12 South
aIdGyhp.jpg

The funny part is the dip before UW's rise was Pete's first class, which ended up re-ranked as the best in the Pac that year.

S Budda Baker, DL Vita Vea, WR Dante Pettis, CB Sidney Jones, S JoJo McIntosh, DL Greg Gaines. They were part of the same class and back then only Baker was considered a top-300 player. http://www.espn.com/blog/pac12/post/_/id/108339/re-ranking-the-pac-12-recruiting-classes-from-2014
This might be unprecedented player development. And he put that class together from scratch in a couple months.
How does he do it? I had always assumed roids at Boise, but I think Washington/Pac12 football has much better oversight in that department.

Anybody want to speculate on his advantage?
Could he really be that much better at player development than everyone else while his assistants change dramatically?
Are there any notable film watching or stat nerds that have stuck with him the whole time who are great at identifying talent?
 
With grade inflation you are probably also seeing some distortion of the data, since there is an upper limit to the grade. I would think that, especially with media bias, our classes end up essentially underrated, while top recruiting classes full of kids with "5 star" ratings end up overstated.

Naw, because the grades aren't being inflated at the upper limits. There are the same number of five stars and not that many more four stars.

The biggest difference is that everyone is a 3 star now.
 
amew1s6770b9.jpg

I am still on the struggle bus, but I am working on figuring out how to do this shit in Tableau, rather than brute forcing it with ExcelFS.

"Apparent Talent" is the 4-year average of Rivals average stars for all signees, relative to the Power 5 average. So 10% means 10% better than average.

"Composite Quality" is based on three SOS-adjusted ratings systems: Sagarin (game efficiency), FEI (drive efficiency), and S&P+ (per-play efficiency). It is a simple average of Z-scores, relative to the Power 5 average. The Composite Quality measure is highly correlated with FBS winning %, as one would expect, but it has the benefit of being SOS-adjusted and factoring out a great deal of randomness in terms of game outcomes.

If you follow the link to Tableau you can see what team and head coach each data point represents.

Tableau Link
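
For the curious, here is a rough Python sketch of how the two measures described above could be computed. Assumptions on my part: the Z-scores use the population standard deviation of the same group of teams, every rating is "higher is better", and the team names and numbers are made up rather than real Rivals, Sagarin, FEI, or S&P+ values.

```python
from statistics import mean, pstdev


def apparent_talent(team_avg_stars, p5_avg_stars):
    """Multi-year average stars relative to the Power 5 average (0.10 = 10% better)."""
    return mean(team_avg_stars) / mean(p5_avg_stars) - 1.0


def z_scores(ratings):
    """Z-score of each team's rating against the group's mean and spread."""
    values = list(ratings.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumed: population std dev of the same group
    return {team: (value - mu) / sigma for team, value in ratings.items()}


def composite_quality(systems):
    """Simple average of each team's Z-scores across the rating systems."""
    per_system = [z_scores(ratings) for ratings in systems.values()]
    return {team: mean(z[team] for z in per_system) for team in per_system[0]}


# Made-up ratings for three hypothetical teams.
SYSTEMS = {
    "sagarin": {"A": 90.0, "B": 80.0, "C": 70.0},
    "fei": {"A": 0.20, "B": 0.00, "C": -0.20},
    "sp_plus": {"A": 10.0, "B": 0.0, "C": -10.0},
}

talent = apparent_talent([3.3, 3.3, 3.3, 3.3], [3.0, 3.0, 3.0, 3.0])
quality = composite_quality(SYSTEMS)
print(f"Apparent Talent: {talent:+.0%}")
for team, score in sorted(quality.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team}: {score:+.2f}")
```

Converting to Z-scores first is what lets three systems on different scales be averaged without one of them dominating.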
 