3.26.2012

Scientific Writing

Adam Ruben recently wrote a very funny article for Science entitled "How to Write Like a Scientist". He comments that his PhD advisor claimed he didn't write like a scientist (a sort of back-handed compliment) and then proceeds to give some very lively advice for "remedying" said problem, such as:
Scientific papers must begin with an obligatory nod to their own relevance, usually by citing exaggerated figures about disease prevalence or other impending disasters. If your research does not actually address one of these issues, pretend it does, because hey, that didn’t stop you on the grant application. For example, you might write, “Twenty million children die of scabies every day. OMG we built a robot kangaroo!”
and,
Remember your audience. It consists primarily of graduate students who, 10 years from now, will include your paper in their own voluminous collection of superscripted references. So remember them, and make your name easy to spell. 
Check out the entire article for much more advice.

3.25.2012

Some Miscellany

I saw a blog post that had gifts for data geeks. Most ideas were not that cool, but a couple were. One featured a wall clock from Fluid Forms. The clock is created by etching an OpenStreetMap map into acrylic or wood. Below is a screenshot I took of one I created from a map of Minneapolis. The hands emanate from Nicollet Island.
They also make necklaces, earrings, etc. But, the minimum order is 60 pieces. I need to talk some shop into ordering them so I can buy a necklace.

The other cool gift was from a project called Newsknitter. This project produces sweaters based on 24 hours of news feeds from the Internet. Good looking, warming and surely a topic the wearer has to explain more than once.


3.17.2012

Some Word Clouds

I was dorking around with R today and decided to create some word clouds. The first is of the 2012 State of the University Address given by President Kaler on March 1, 2012. After removing punctuation (except for hyphens) and numbers, converting everything to lowercase, and stripping whitespace, I used the tm package to create a document-term matrix (recording each word along with its frequency). After converting this to a data frame, I used the brewer.pal() and wordcloud() functions to create the word cloud itself.


The result shows that President Kaler used all of the appropriate terms one would expect in a State of the University Address. The terms "students", "faculty", and "research" are all prominent, as are "budget", "tuition", "balancing", "support", "learning" and other administrative catch-all words.

The code I used was
library(tm)
library(wordcloud)

## Read in the data from a folder which contains the text document(s)
(ovid <- Corpus(DirSource("/Users/andrewz/Documents/Data/State-of-the-University/"), 
    readerControl = list(reader = readPlain)))

## Document preparation
sotu <- tm_map(ovid, removePunctuation, preserve_intra_word_dashes = TRUE)
sotu <- tm_map(sotu, removeNumbers)
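## content_transformer() keeps the result a text document (required in tm >= 0.6)
sotu <- tm_map(sotu, content_transformer(tolower))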
sotu <- tm_map(sotu, stripWhitespace)
sotu <- tm_map(sotu, removeWords, stopwords("english"))
sotu <- tm_map(sotu, stripWhitespace)

## Create document-term matrix
tdm <- DocumentTermMatrix(sotu)
m <- as.matrix(tdm)
v <- sort(colSums(m),decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)

## Plot
pal <- brewer.pal(7, "Set3")
pdf("/Users/andrewz/Desktop/SOTU.pdf", width = 8.33, height = 6.67, bg = "black")
wordcloud(d$word,d$freq, 
 #scale=c(8, 0.3),
 min.freq = 3, 
 #max.words = 100, 
 #random.order = TRUE, 
 rot.per = 0.15, 
 colors = pal, 
 vfont=c("sans serif","plain")
 )
dev.off()
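
If you just want to peek at the word counts before firing up R, the same idea (lowercase, strip punctuation and numbers, count) can be roughed out in the shell. This is only a sketch and not what produced the cloud above: word_freq is a made-up function name, it does no stop-word removal, and it splits on apostrophes, so its counts will not match tm exactly.

```shell
# word_freq FILE [MIN_FREQ]   (hypothetical helper, not part of tm or wordcloud)
# Prints "count word" lines, most frequent first, for words that occur
# at least MIN_FREQ times (default 1). No stop-word removal is done here.
word_freq() {
  local file="$1" min="${2:-1}"
  tr '[:upper:]' '[:lower:]' < "$file" |   # lowercase everything
    tr -c 'a-z-' '\n' |                    # anything but letters/hyphens splits words
    grep -v '^-*$' |                       # drop empty tokens and stray hyphens
    sort | uniq -c |                       # count each distinct word
    awk -v min="$min" '$1 >= min { print $1, $2 }' |
    sort -k1,1nr -k2,2                     # most frequent first, ties alphabetical
}
```

Calling word_freq on the speech's text file with a second argument of 3 mirrors the min.freq = 3 setting used in the wordcloud() call.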


The second word cloud is based on my Google Scholar page. The cloud on the left-hand side shows my co-authors (sized by how frequently they appear) and the cloud on the right-hand side shows terms that show up in the work linked to my Scholar page.

The summary citation info can also be output in R. Mine is


Total papers = 20
Median citations per paper = 1.5
Median (citations / # of authors) per paper = 0.4166667
H-index = 6
G-index = 9
M-index = 1
First author H-index = 4
Last author H-index = 2
First or last author H-index = 5
First or second author H-index = 5


The code is below
source("http://biostat.jhsph.edu/~jleek/code/googleCite.r")
out <- googleCite("http://scholar.google.com/citations?user=cWpN_s8AAAAJ&hl=en", 
    pdfname = "/Users/andrewz/Desktop/Zieffler_wordcloud.pdf")
gcSummary(out)


3.12.2012

USC Football

This one goes out to Jeff.


Really, Utah and Colorado don't have enough data to support the smoother, but they are there for completeness. I took off the confidence envelopes so they wouldn't distract from the story.

3.09.2012

Digital Scrapbook: Teaching

This editorial was written by Rob Gardner, our Knowledge Bowl coach my senior year. It appeared in the St. Cloud Times and I believe almost any teacher (rookie or seasoned) can relate to its sentiment.


Teachers Make Lasting Mark on Students

I recently attended a pair of high school graduation ceremonies. I knew several graduating seniors and wanted to show my support for them and wish them luck in their futures – standard commencement fare.

But in the process of affirming these students' abilities and accomplishments, I also affirmed one of the most important decisions of my life: to become a teacher.

I recently finished my third year of college and my third year of studying to be a teacher. College and its requisite decisions have not always been easy. After my first year I changed my field of emphasis...; I didn't want to risk facing students and not being enthusiastic: If I couldn't enjoy the subject, I could never expect my students to. Luckily, others do not feel the same about this area.

So I changed my emphasis – and my outlook. Since then, things have gone well. Until spring quarter.

I had an education class with field experience. For the first time since I decided on my new major, I was put into the classroom for an extended period of time (eight weeks!) to observe, to practice and to learn.

And learn I did. I learned from one class how wonderful discussions can be and how willing students are to expand their worlds of knowledge and experience. I learned to treat all students as equals: to do anything less is to invalidate their worth. And I learned never to lower my expectations because students will match their teacher's expectations, whether high or low.

But then I moved to a different school and different students and had these lessons put to the test. I met the challenge and, two weeks later, felt nearly everything I had done had gone awry. I began to doubt my decision to be a teacher.

How could two short weeks have such an impact? So much happens in so little time. Reactions quickly add up.

After this class ended, I'd have time to recover and regroup before I student teach, I reasoned. But deep down I knew I would not approach teaching the next time with the enthusiasm and openness that I had at the beginning of my field experience. I had my first battle scars, and I was reluctant to risk getting more.

Then came graduation. Though I graduated from high school only three years ago, I had already forgotten how students and teachers react commencement day. Even if I did remember, it would have been from a student perspective which, though joyous in its own right, is naturally limited.

I was lucky enough to view the Apollo commencement from "behind the scenes" as everyone prepared to go through the ceremonies. I saw, as teachers arrived, graduates rush up to them for one last hug or handshake. I saw seniors and teachers crack one last joke together. And I saw the tears begin to well up in almost everyone's eyes – including mine. During the ceremony I heard teachers whisper to each other about the students' qualities as the students received their diplomas.

After the ceremony the tears again began to form as the graduates emerged from Hallenbeck Hall. But this time the tears were mixed with surprise and happiness as former graduates – former students – picked their way through the crowd to say hello to their former teachers. Quickly, what once looked like the end of so many relationships became a reunion and renewal.

And that's exactly when I too felt renewed. In this instant I realized the real reason I want to be a teacher: While subject matter (and my interest in it) is important, the impact I can have on students' lives, the dependable relationships I can build with students and the motivation I can give them far excel any other reasons for me to become a teacher.

Why do I want to be a teacher? Simply put, to work with students, all of whom are extraordinary.

Robert Gardner is majoring in English and secondary education at St. Cloud State University.

3.03.2012

The @#$%&!ing Coconut

After having my laptop "fixed" by the Coconut (which has taken from Tuesday, January 17 to today, March 3) and subsequently having to re-install many of my applications and all of their preferences at least twice in that span, I am writing this post to document some of the steps I have taken to make my system workable again.

Library Folder
First off, I was updated to OS X 10.7 (Lion) from a workable Snow Leopard. Many of the preferences and extras that make life workable for computing wizards are stored in the user's ~/Library folder on a Mac. Apple has decided to remove that folder from sight because most users are not wizards... or maybe because many teenagers don't know what a library is anymore. Regardless, this is unacceptable. The folder is still there; it is just not visible. This can be changed by opening Terminal (/Applications/Utilities/Terminal) and then typing
chflags nohidden ~/Library
Voilà! The Library folder is now visible.
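
Should you ever want to tuck the folder away again, chflags also accepts hidden. Here is a small sketch that wraps both directions. The function name set_folder_visibility is made up for illustration, and it assumes chflags is available, which is true on a Mac but not on most Linux systems.

```shell
# set_folder_visibility hidden|nohidden [FOLDER]   (hypothetical helper)
# Shows or hides FOLDER in the Finder; FOLDER defaults to ~/Library.
set_folder_visibility() {
  local flag="$1" folder="${2:-$HOME/Library}"
  case "$flag" in
    hidden|nohidden) ;;                       # the only two values handled here
    *) echo "usage: set_folder_visibility hidden|nohidden [FOLDER]" >&2
       return 2 ;;
  esac
  if ! command -v chflags >/dev/null 2>&1; then
    echo "chflags not found (this only works on macOS/BSD)" >&2
    return 1
  fi
  chflags "$flag" "$folder"
}
```

For example, set_folder_visibility hidden puts things back the way Apple shipped them, and set_folder_visibility nohidden repeats the command above.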

TeXShop
I also needed the one-click run for TeXShop. To do this I copied the following into a file called Sweave.engine which I then put into the ~/Library/TeXShop/Engines folder.

#!/bin/bash

export PATH=$PATH:/usr/texbin:/usr/local/bin
R CMD Sweave "$1"
pdflatex "${1%.*}"
The export PATH line adds the location of your TeX distribution (here /usr/texbin) to the search path. The engine file puts a Sweave option under the typesetting menu, and runs both Sweave and pdflatex when the Typeset button is pressed.
If it will not run, the file's permissions might need to be changed so that it can be executed. In Terminal, type
chmod +x ~/Library/TeXShop/Engines/Sweave.engine
This allows the file to be executed.
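
If you end up redoing this on more than one machine (ahem), the two steps above can be rolled into one function. This is just a sketch: install_sweave_engine is a name I made up, and the optional argument exists only so you can try it in a scratch folder first.

```shell
# install_sweave_engine [ENGINE_DIR]   (hypothetical helper)
# Writes Sweave.engine into ENGINE_DIR (default: ~/Library/TeXShop/Engines)
# and marks it executable.
install_sweave_engine() {
  local dir="${1:-$HOME/Library/TeXShop/Engines}"
  mkdir -p "$dir" || return 1
  # The quoted 'EOF' keeps $PATH and $1 literal, so they are expanded when
  # TeXShop runs the engine, not when this function runs.
  cat > "$dir/Sweave.engine" <<'EOF'
#!/bin/bash

export PATH=$PATH:/usr/texbin:/usr/local/bin
R CMD Sweave "$1"
pdflatex "${1%.*}"
EOF
  chmod +x "$dir/Sweave.engine"
}
```

Running install_sweave_engine with no argument writes to the TeXShop Engines folder named above.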

Sweave and TeXShop
As of R v. 2.8, the default manner in which it linked to the Sweave files changed. This is easily fixed. Open Terminal and type
mkdir -p ~/Library/texmf/tex/latex 
cd ~/Library/texmf/tex/latex 
ln -s /Library/Frameworks/R.framework/Resources/share/texmf Sweave
Line 1 creates the texmf, tex and latex folders (each nested inside the previous one) in the user's Library folder. Line 2 changes directory so that you are in the latex folder. The last line creates a link from your user's tex directory to the Sweave files that R uses. Now it is possible to just use the following line in TeXShop:
\usepackage{Sweave}
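
The three Terminal lines can also be wrapped so they are safe to run more than once. A rough sketch follows; link_sweave is a made-up name, and the default paths are simply the ones used above, so adjust them if R lives somewhere else on your system.

```shell
# link_sweave [R_TEXMF_DIR] [USER_LATEX_DIR]   (hypothetical helper)
# Creates USER_LATEX_DIR if needed and links R's texmf folder into it
# under the name Sweave.
link_sweave() {
  local target="${1:-/Library/Frameworks/R.framework/Resources/share/texmf}"
  local latex_dir="${2:-$HOME/Library/texmf/tex/latex}"
  local link="$latex_dir/Sweave"

  if [ ! -d "$target" ]; then
    echo "R texmf folder not found: $target" >&2
    return 1
  fi
  mkdir -p "$latex_dir" || return 1
  # Leave an existing link or folder alone so re-running does no harm.
  if [ -e "$link" ] || [ -L "$link" ]; then
    echo "already present: $link"
    return 0
  fi
  ln -s "$target" "$link"
}
```

Afterwards, ls -l ~/Library/texmf/tex/latex should show Sweave pointing at the R folder.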

Turn Off File Locking
The Lion OS introduced automatic file locking for any file that hasn’t been edited recently. You will notice this when you try to open an older file and make changes to it: a dialog box asks whether to duplicate the file or unlock it manually. This is problematic (and annoying) for things like updating R packages. The solution:
  • Open System Preferences > Time Machine
  • Click the Options button
  • Uncheck the Lock Documents box

Turn Off FileVault
FileVault is Lion's hard disk encryption system. Worthwhile if you are a main character on the TV show Homeland, otherwise, not so much. To adios this,

  • Open System Preferences > Security and Privacy
  • Click the Turn Off FileVault button
This may take a while depending on how much data you have. It took my system roughly 6 hours.

TextWrangler
A syntax editor is a must for all data scientists. I use TextWrangler, the free version of BBEdit. To make TextWrangler more compatible with R, there are two essential things to do: (1) add functionality to run syntax directly from TextWrangler, and (2) change the color theme. For the former,
  • Download the SendSelToR.txt file from macsi
  • Change the file suffix from .txt to .scpt (AppleScript)
  • Move the file to TextWrangler's Scripts folder (~/Library/Application Support/TextWrangler/Scripts)
  • Open TextWrangler and from the Window menu select Palettes > Scripts
  • You should see the SendSelToR script. If so, hit the Set Key... button and type in a key command that you will then use to send the syntax to R. For example, ⌘-return is a common keystroke to run syntax.
To change the color theme,

  • Download the syntax highlighting file (R.plist) from macsi
  • Move this file into TextWrangler's Language Modules folder (~/Library/Application Support/TextWrangler/Language Modules/)
  • Restart TextWrangler
  • From the TextWrangler menu select Preferences > Languages
  • Add a new suffix mapping (e.g., .r to R language)

This gets things ready. Next,

  • Download BBColors from Daring Fireball
  • Open the zip file and move bbcolors to some location in your shell’s PATH (a typical location would be /usr/local/bin/ or /usr/bin/)
  • Download a color scheme file (.bbcolor files) that you like. Daring Fireball provides a couple. I personally like Ethan Schoonover's Solarized (Dark) theme which is available through GitHub. That being said, I use Ghetto Cooler's Gentle Honey because it is easier for students to see when I project scripts in class.
  • Go to ~/Library/Application Support/
  • Create a new folder called BBColors
  • Move any .bbcolor files into this newly created folder
  • Open Terminal and type
bbcolors -load "Gentle Honey" -tw
where Gentle Honey is the name associated with the .bbcolor file. The -tw will associate this with TextWrangler rather than BBEdit.
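
If you collect a few schemes and forget what they are called, a quick listing helps. The sketch below assumes that the name you pass to -load is the file name minus the .bbcolor suffix, which is how it has behaved for me but is still an assumption. list_bbcolor_schemes is a made-up name.

```shell
# list_bbcolor_schemes [DIR]   (hypothetical helper)
# Prints one scheme name per line for every .bbcolor file in DIR
# (default: ~/Library/Application Support/BBColors).
list_bbcolor_schemes() {
  local dir="${1:-$HOME/Library/Application Support/BBColors}"
  local f name
  for f in "$dir"/*.bbcolor; do
    [ -e "$f" ] || continue          # the glob matched nothing
    name="${f##*/}"                  # strip the folder part
    printf '%s\n' "${name%.bbcolor}" # strip the suffix
  done
}
```

Quoting matters when you use the result, since many scheme names contain spaces.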

Transfer Settings
Importing your old settings and preferences when you change computers saves a lot of time, although you may just want to start from scratch. The following is what I ported over:

  • Stickies: The Stickies database is in ~/Library/StickiesDatabase. Copy and paste this into your new Stickies database.
  • iCal: The iCal data is stored in ~/Library/Calendars
  • Taco: The preferences data (syntax coloring, etc.) is stored in ~/Library/Preferences
  • Fonts: These are stored in ~/Library/Fonts
I used Dropbox to transfer these files over to my computer.
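
Gathering these by hand gets tedious, so here is a sketch of a helper that copies a list of items out of a Library folder into a staging folder (a Dropbox folder, say). The name stage_settings and the staging path in the example are made up; nothing here is specific to Dropbox.

```shell
# stage_settings SRC_LIBRARY DEST ITEM...   (hypothetical helper)
# Copies each ITEM (file or folder, relative to SRC_LIBRARY) into DEST.
# Items that do not exist are reported on stderr and skipped.
stage_settings() {
  if [ "$#" -lt 3 ]; then
    echo "usage: stage_settings SRC_LIBRARY DEST ITEM..." >&2
    return 2
  fi
  local src="$1" dest="$2" item
  shift 2
  mkdir -p "$dest" || return 1
  for item in "$@"; do
    if [ -e "$src/$item" ]; then
      cp -R "$src/$item" "$dest/" && echo "copied $item"
    else
      echo "skipped $item (not found)" >&2
    fi
  done
}

# Example (the destination folder is an assumption):
#   stage_settings ~/Library ~/Dropbox/mac-settings StickiesDatabase Fonts
```

On the new machine the same function works in reverse if you swap the source and destination.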


Terminal
I also wanted to change the color theme for Terminal. I personally like Ethan Schoonover's Solarized (Dark) theme.

  • Go to Tomislav Filipčić's GitHub and click on the zip link to download the Solarized Light and Solarized Dark themes for the Terminal App.
  • After downloading the zip file, double click each theme in turn. This will add them to the preferences pane in Terminal.
  • Open Terminal and select Terminal > Preferences
  • Click the Settings pane
  • Choose a theme and click the Default button below the list of themes.
  • Re-open Terminal





Hidden Files
The easiest way to see hidden files is through Terminal. Open Terminal and type the following
defaults write com.apple.Finder AppleShowAllFiles TRUE
killall Finder
The first line rewrites the Finder's preference file to show all files (including the hidden ones). The second line kills the Finder, causing it to restart so that the preference takes effect. When you are done messing with the hidden files you can hide them again by running the same two lines with TRUE changed to FALSE.
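
Since this is something you flip on and off, a tiny toggle saves retyping. A sketch only: set_hidden_files is a made-up name, and it assumes the defaults and killall commands that ship with OS X.

```shell
# set_hidden_files TRUE|FALSE   (hypothetical helper)
# Shows (TRUE) or hides (FALSE) hidden files in the Finder, then restarts it.
set_hidden_files() {
  case "$1" in
    TRUE|FALSE) ;;                            # the two values used above
    *) echo "usage: set_hidden_files TRUE|FALSE" >&2
       return 2 ;;
  esac
  if ! command -v defaults >/dev/null 2>&1; then
    echo "defaults not found (this only works on a Mac)" >&2
    return 1
  fi
  defaults write com.apple.Finder AppleShowAllFiles "$1" || return 1
  killall Finder                              # Finder relaunches with the new setting
}
```

So set_hidden_files TRUE before poking around, and set_hidden_files FALSE when you are done.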


General Lion Slowness
Lion saves any file changes to your local drive so you can use Time Machine even if your external drive isn't connected. While Apple claims this doesn't slow things down, it might very well, especially when working on a large file. To disable local backups, open Terminal and type the following:
sudo tmutil disablelocal