Some Miscellany to Catalog

In a blog post I read recently, a Carnegie Mellon statistics professor was waxing on about the differences between a "statistician" and a "data scientist" [read blog]. He writes,
If people want to call those who do such jobs "data scientists" rather than "statisticians" because it sounds more dignified, or gets them more money, or makes them easier to hire, then more power to them. If they want to avoid the suggestion that you need a statistics degree to do this work, they have a point but it seems a clumsy way to make it. If, however, the name "statistician" is avoided because that connotes not a powerful discipline which transforms profound ideas about learning from experience into practical tools, but rather, a meaningless conglomeration of rituals better conducted with twenty-sided dice, then we as a profession have failed ourselves and, more importantly, the public, and the blame lies with us. Since what we have to offer is really quite wonderful, we should not let that happen.

The italicized bit is mine. I think it sums up what I and others have felt for a while now. It may seem that what we call ourselves is irrelevant, but maybe it isn't. Many companies or products go through a re-branding process to improve their image or make themselves sexier. The cynical side of me says that this is just an effort to increase revenue, but the almost ever-present flip side is that it works. As a discipline, we are competing with other disciplines for students and the revenue they bring. Currently, students have more choices than ever before in the history of education. Re-branding the profession may be exactly what is needed.

However, changing the nomenclature or title (even the name of the degree) will not be enough. Too many statistics courses are still taught in a manner in which students come out feeling that the discipline is a meaningless conglomeration of rituals better conducted with twenty-sided dice. The content in the introductory course hasn't changed (in many courses) in twenty years. I am positive that some people who read that last sentence believe that to be a good thing. They likely view education through nostalgic lenses.

Several years ago, psychology researchers published an article in the Journal of Personality and Social Psychology in which they identified the content, triggers, and functions of nostalgia. One key finding was that people seem to engage in nostalgia specifically to make themselves feel better, which suggested that we may be unconsciously biased towards remembering things that make us happy and against remembering the things that do not. Human beings have a remarkable propensity towards this bias, requiring far less information to confirm beliefs when they are consistent with our current state of mind. In the psychology literature, this is known as “confirmation bias”, and there is a substantial body of research that has shown that people are predisposed to remember more of the good things in life.

My experience (personal data) was that I did not understand the true nature, power, and sexiness of statistics (or data science) until my Ph.D. work. Some of this is, of course, related to being able to see the forest for the trees, but much of it was the problems that were presented in my earlier course work. Seeing a discipline as one that transforms profound ideas about learning from experience into practical tools requires profound questions and problems. The fact is that many students make decisions about their future course work and major after a single course. One great experience in an early course is all that is needed to forever tie that course into our memories. A terrible experience forever associates the discipline with negative feelings. The first course is the most important one for the discipline. We have an obligation to do better by our students in this first course. If that includes re-branding ourselves as data scientists, so be it.

On a lighter note, my favorite quote about R appeared in a NYT article about bond traders.
The traders here are mostly educated in math or physics, often outside the United States, and their desks are piled high with textbooks like the “R Graphs Cookbook,” for working with obscure computer programming languages.
You can read the article here [read article].


Where has Summer Gone?

I cannot believe that the Joint Statistical Meetings are upon us again. Once again, this summer seems to have flown by and I haven't even come close to getting done the things that I wanted to. Just a quick update (more to communicate this in the public arena for my own accountability) on what I have been doing and what I have yet to finish.

First, I taught a regression course this summer (technically we have a week to go). I updated all of my notes to use ggplot2 in the course. For myself, I also updated all of them using knitr, which I finally have compiling in TeXShop. The PDF format of all the notes, along with the R scripts, data, and a bevy of other info, is on the course website (open to the public) at http://www.tc.umn.edu/~zief0002/8262.php.

I have also been putting our course activities, homework, etc. from the CATALST project into a book that we will use in the EPsy 3264 course. I am almost through two of the three units, but I have a bad feeling about Unit 3. It has been my nemesis for two years running, and although I am more satisfied with its current instantiation, it still doesn't blow my socks off.

Speaking of socks, I got some sweet dachshund socks last week. Unlike the photo, mine are bright yellow. I have also been working on a project with the University extension program to evaluate the education program of the Supplemental Nutrition Assistance Program. It has been fun to help them think about study design. In addition, I have been working with a student to clean and analyze data collected at the Walk-In Counseling Center. We have tens of thousands of observations from as far back as 1992! It may be rich (or it may not); unfortunately, we are still on the cleaning part. I think the plan is to eventually make the data public. Stay tuned.

For fun I have been working on the landscaping, although with the heat this summer it has been slow. The stone pathway remains unfinished, but I have gotten a little further on it. We also just planted some wild prairie (I know it is the wrong time of year to seed, but so it is). I have also been casting stepping stones for the auxiliary walking path. Maybe once I finish the walkway, I will post some pics.

Tim and I also painted Herbie so I could drive him for a while longer before I buy a different car. That was quite an adventure. He is a darker, more maroony red than he was. We also bondo-ed the dents in the front and back. I also got a huge Goldy sticker for the roof. He looks pretty sweet, but soon he will look sweeter. I have some yellow hood and trunk stripes to put on. I also have some Goldy metallic emblems to replace the old MINI ones.

Lastly, for fun, I have been watching the videos of Harvard's CS50 lectures on iTunes U. This has been a real treat and I have already put some of the programming into action. I wish my computer science professors had been as animated, hip, and witty as David. I also love how many opportunities those students get to immediately use their skills to improve the campus and community. Awesome!

Now, on to the list of things that I still need to do (that haven't already been mentioned):

  • I need to learn D3 at a higher level and implement it using some data. I bought the D3 book from O'Reilly, but thus far I am underwhelmed. I was hoping for a bigger book, but it is one of those slim numbers that O'Reilly seems to be pushing out.
  • I would like to finish a book or two that I have been meaning to read.
  • I would like to add insulation, vapor barrier, etc. to the outside walls in the garage. I think this might really help since our bedroom is directly above.
  • I want to finish a course proposal for a categorical data modeling course.
  • I want to update all of my webpages.
  • I would like to update my 8261 course. 


Call Me Maybe?

A pretty cool video of my favorite mascot....