Sunday Data/Statistics Link Roundup (11/11/12)

  1. Statisticians have been deconstructed! I feel vaguely insulted, although I have to admit I’m not even sure I know what the article says. This line is a doozy though: "Statistics always pulls back from the claims it makes…" As a statistician blogger, I make tons of claims. I probably regret some of them, but I’d never take them back :-). 
  2. Following our recent detour into political analysis, here is a story about the statisticians who helped Obama win the election by identifying blocks of voters/donors that could help lead the campaign to victory. I think there are some lessons here for individualized health. 
  3. XKCD is hating on frequentists! Wasserman and Gelman respond. I think the comic makes the same mistake a lot of critics of P-values make: when used incorrectly, any statistical method makes silly claims. The key is knowing when to use each method, regardless of which kind you prefer. 
  4. Another article in the popular press about the shortage of data scientists, in particular “big data” scientists. I also saw a ton of discussion of whether Nate Silver used “big data” in making his predictions. This is another one of those many, many cases where the size of the data is mostly irrelevant; it is knowing the right data to use that is important. 
  5. Apparently math can be physically painful. I don’t buy it. 

A statistical project bleg (urgent-ish)

We all know that politicians can play it a little fast and loose with the truth. This is particularly true in debates, where politicians have to think on their feet and respond to questions from the audience or from each other. 

Usually, we find out how truthful politicians are in the “post-game show”. The discussion of the veracity of the claims is usually based on independent fact checkers such as PolitiFact. Some of these fact checkers (PolitiFact in particular) live-tweet their reports on many of the issues discussed during the debate. This near real-time fact-checking is possible because both candidates stick to a pretty fixed set of talking points. 

What would be awesome is if someone could write an R script that would scrape the live data off of PolitiFact’s Twitter account and create a truthfulness meter that looks something like CNN’s instant reaction graph (see #7) for independent voters. The line would show the moving average of how honest each politician was being. How cool would it be to show the two candidates and how truthful they are being? If you did this, tell me it wouldn’t be a feature one of the major news networks would pick up…
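
To make the moving-average part of the idea concrete, here is a minimal sketch in Python rather than R. It skips the scraping entirely and assumes the fact checks have already been parsed into (minute, candidate, rating) tuples. The function name `truthfulness_meter`, the numeric scores attached to each rating, the window size, and the example data are all assumptions made up for illustration; none of them come from PolitiFact.

```python
from collections import defaultdict, deque

# Assumed numeric scale for fact-check ratings. The numbers are
# illustrative only; choosing a sensible scale is part of the project.
RATING_SCORES = {
    "True": 1.0,
    "Mostly True": 0.8,
    "Half True": 0.5,
    "Mostly False": 0.2,
    "False": 0.0,
    "Pants on Fire": 0.0,
}


def truthfulness_meter(checks, window=3):
    """Return, per candidate, a list of (minute, moving average) points.

    checks: iterable of (minute, candidate, rating) tuples in time order.
    window: number of most recent fact checks averaged for each point.
    """
    # Each candidate keeps only their last `window` scores.
    recent = defaultdict(lambda: deque(maxlen=window))
    lines = defaultdict(list)
    for minute, candidate, rating in checks:
        scores = recent[candidate]
        scores.append(RATING_SCORES[rating])
        lines[candidate].append((minute, sum(scores) / len(scores)))
    return dict(lines)


# Made-up fact checks, standing in for parsed tweets.
example_checks = [
    (2, "Candidate A", "True"),
    (5, "Candidate B", "Half True"),
    (9, "Candidate A", "Mostly False"),
    (14, "Candidate B", "False"),
    (20, "Candidate A", "Half True"),
    (26, "Candidate A", "False"),
]

if __name__ == "__main__":
    for candidate, points in truthfulness_meter(example_checks).items():
        for minute, average in points:
            print(f"{candidate} at minute {minute}: {average:.2f}")
```

Each candidate's list of points is what you would plot as one line on the meter. The hard parts of the project, pulling the tweets live and deciding which candidate and rating each one refers to, are deliberately left out of this sketch.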