2.27.2012

Football Time Plots

Here is a sneak peek at some of the work that Brandon and I have been doing. After a year of mining data, cleaning data, and mostly figuring out how to web scrape, we finally have some data to work with. The following three time series plots cover the University of Minnesota, the University of Texas at Austin, and the University of Iowa (for Lauren, Charles, and Tom).


On the y-axis we have plotted the point differential (PF – PA) for each game in the school's football program history (at least as recorded by College Football Data Warehouse). We also show the y = 0 line (games plotted above that line are wins, games plotted below it are losses, and ties fall on the line). Each box-and-whiskers plot covers a ten-year span to give an indication of decade-by-decade performance. (For those who have forgotten how to read a box-and-whiskers plot, the line inside the box is the median. The length of the box shows the spread of the middle 50% of the data. The whiskers extend to the furthest observation that is not a potential outlier.) Lastly, we plotted a smoother estimated from a generalized additive model (think 'trend'!). The 95% confidence envelope around the smoother displays the uncertainty in the trend (think 'the real value could be anywhere in there').
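For anyone who wants to see the arithmetic behind the boxes, here is a rough sketch in Python. It is not the code we used for the plots. The scores are made up, and the names `games` and `decade_summary` are hypothetical. It computes the point differential for each game, groups games into ten-year spans, and reports the quartiles that define each box along with the share of games won. The smoother is left out.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical records: (season year, points for, points against).
# These scores are invented for illustration; they are not real results.
games = [
    (1901, 21, 0),
    (1903, 6, 6),
    (1905, 0, 12),
    (1908, 35, 7),
    (1912, 14, 17),
    (1914, 10, 3),
    (1917, 28, 0),
    (1919, 7, 21),
]


def decade_summary(games):
    """Group point differentials (PF - PA) by decade and summarize each group."""
    by_decade = defaultdict(list)
    for year, pf, pa in games:
        decade = year // 10 * 10          # e.g. 1917 -> 1910
        by_decade[decade].append(pf - pa)

    summary = {}
    for decade, margins in sorted(by_decade.items()):
        # Quartiles mark the bottom of the box, the median line, and the top.
        q1, med, q3 = quantiles(margins, n=4, method="inclusive")
        summary[decade] = {
            "games": len(margins),
            "q1": q1,
            "median": med,
            "q3": q3,
            # A game counts as a win only when the differential is above zero.
            "win_share": sum(m > 0 for m in margins) / len(margins),
        }
    return summary


for decade, stats in decade_summary(games).items():
    print(decade, stats)
```

One thing this makes easy to see: if the first quartile for a decade is above zero, then at least three quarters of that decade's games were wins, because the whole box sits above the y = 0 line.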

The trends in these plots are quite striking. Minnesota has definitely seen better days (the last 40 years look as bad as I feel most Saturdays). Texas, not surprisingly, seems to have won at least 75% of its games in all but a couple of decades. Iowa shows quite a lot of fluctuation, but the good(?) news, at least for Tom, is that they are trending upward in recent years.

Once we merge the data we have, I should be able to color the points by coach (think 'bad news for Brewster')!

