Sunday data/statistics link roundup (5/27)

  1. Amanda Cox on the process they went through to come up with this graphic about the Facebook IPO. So cool to see how R is used in the development process. A favorite quote of mine, “But rather than bringing clarity, it just sort of looked chaotic, even to the seasoned chart freaks of 620 8th Avenue.” One of the more interesting things about posts like this is you get to see how statistics versus a deadline works. This is typically the role of the analyst, since they come in late and there is usually a deadline looming…
  2. An interview with Steve Blank about Silicon valley and how venture capitalists (VC’s) are focused on social technologies since they can make a profit quickly. A depressing/fascinating quote from this one is, “If I have a choice of investing in a blockbuster cancer drug that will pay me nothing for ten years,  at best, whereas social media will go big in two years, what do you think I’m going to pick? If you’re a VC firm, you’re tossing out your life science division.” He also goes on to say thank goodness for the NIH, NSF, and Google who are funding interesting “real science” problems. This probably deserves its own post later in the week, the difference between analyzing data because it will make money and analyzing data to solve a hard science problem. The latter usually takes way more patience and the data take much longer to collect. 
  3. An interesting post on how Obama’s analytics department ran an A/B test which improved the number of people who signed up for his mailing list. I don’t necessarily agree with their claim that they helped raise $60 million, there may be some confounding factors that mean that the individuals who sign up with the best combination of image/button don’t necessarily donate as much. But still, an interesting look into why Obama needs statisticians
  4. A cute statistics cartoon from @kristin_linn  via Chris V. Yes, we are now shamelessly reposting cute cartoons for retweets :-). 
  5. Rafa’s post inspired some interesting conversation both on our blog and on some statistics mailing lists. It seems to me that everyone is making an effort to understand the increasingly diverse field of statistics, but we still have a ways to go. I’m particularly interested in discussion on how we evaluate the contribution/effort behind making good and usable academic software. I think the strength of the Bioconductor community and the rise of Github among academics are a good start.  For example, it is really useful that Bioconductor now tracks the number of package downloads