The pebbles of academia

I have just been awarded a certificate for successful completion of the Conflict of Interest Commitment training (I barely passed). Lately, I have been totally swamped by administrative duties and have had little time for actual research. The experience reminded me of something I read in this NYTimes article by Tyler Cowen

Michael Mandel, an economist with the Progressive Policy Institute, compares government regulation of innovation to the accumulation of pebbles in a stream. At some point too many pebbles block off the water flow, yet no single pebble is to blame for the slowdown. Right now the pebbles are limiting investment in future innovation.

Here are some of the pebbles of my academic career (past and present): financial conflict of interest training , human subjects training, HIPAA training, safety training, ethics training, submitting papers online, filling out copyright forms, faculty meetings, center grant quarterly meetings, 2 hour oral exams, 2 hour thesis committee meetings, big project conference calls, retreats, JSM, anything with “strategic” in the title, admissions committee, affirmative action committee, faculty senate meetings, brown bag lunches, orientations, effort reporting, conflict of interest reporting, progress reports (can’t I just point to pubmed?), dbgap progress reports, people who ramble at study section, rambling at study section, buying airplane tickets for invited talks, filling out travel expense sheets, and organizing and turning in travel receipts. I know that some of these are somewhat important or take minimal time, but read the quote again.

I also acknowledge that I actually have it real easy compared to others so I am interested in hearing about other people’s pebbles? 

Update: add changing my eRA commons password to list!

Tags: Rant humor

When dealing with poop, it’s best to just get your hands dirty

I’m a relatively new dad. Before the kid we affectionately call the “tiny tornado” (TT) came into my life, I had relatively little experience dealing with babies and all the fluids they emit. So admittedly, I was a little squeamish dealing with the poopy explosions the TT would create. Inevitably, things would get much more messy than they had to be while I was being too delicate with the issue. It took me an embarrassingly long time for an educated man, but I finally realized you just have to get in there and change the thing even if it is messy, then wash your hands after. It comes off. 

It is a similar situation in my professional life, but I’m having a harder time learning the lesson. There are frequently things that I’m not really excited to do: review a lot of papers, go to long meetings, revise a draft of that paper that has just been sitting around forever. Inevitably, once I get going they usually aren’t as difficult or as arduous as I thought. Even better, once they are done I feel a huge sense of accomplishment and relief. I used to have a metaphor for this, I’d tell myself, “Jeff, just rip off the band-aid”. Now, I think “Jeff, just get your hands dirty”. 

Sunday data/statistics link roundup (5/27)

  1. Amanda Cox on the process they went through to come up with this graphic about the Facebook IPO. So cool to see how R is used in the development process. A favorite quote of mine, “But rather than bringing clarity, it just sort of looked chaotic, even to the seasoned chart freaks of 620 8th Avenue.” One of the more interesting things about posts like this is you get to see how statistics versus a deadline works. This is typically the role of the analyst, since they come in late and there is usually a deadline looming…
  2. An interview with Steve Blank about Silicon valley and how venture capitalists (VC’s) are focused on social technologies since they can make a profit quickly. A depressing/fascinating quote from this one is, “If I have a choice of investing in a blockbuster cancer drug that will pay me nothing for ten years,  at best, whereas social media will go big in two years, what do you think I’m going to pick? If you’re a VC firm, you’re tossing out your life science division.” He also goes on to say thank goodness for the NIH, NSF, and Google who are funding interesting “real science” problems. This probably deserves its own post later in the week, the difference between analyzing data because it will make money and analyzing data to solve a hard science problem. The latter usually takes way more patience and the data take much longer to collect. 
  3. An interesting post on how Obama’s analytics department ran an A/B test which improved the number of people who signed up for his mailing list. I don’t necessarily agree with their claim that they helped raise $60 million, there may be some confounding factors that mean that the individuals who sign up with the best combination of image/button don’t necessarily donate as much. But still, an interesting look into why Obama needs statisticians
  4. A cute statistics cartoon from @kristin_linn  via Chris V. Yes, we are now shamelessly reposting cute cartoons for retweets :-). 
  5. Rafa’s post inspired some interesting conversation both on our blog and on some statistics mailing lists. It seems to me that everyone is making an effort to understand the increasingly diverse field of statistics, but we still have a ways to go. I’m particularly interested in discussion on how we evaluate the contribution/effort behind making good and usable academic software. I think the strength of the Bioconductor community and the rise of Github among academics are a good start.  For example, it is really useful that Bioconductor now tracks the number of package downloads

Computational biologist blogger saves computer science department

People who read the news should be aware by now that we are in the midst of a big data era. The New York Times, for example, has been writing about this frequently. One of their most recent articles describes how UC Berkeley is getting $60 million dollars for a new computer science center. Meanwhile, at University of Florida the administration seems to be oblivious to all this and about a month ago announced it was dropping its computer science department to save $. Blogger Steven Salzberg, a computational biologists known for his work in genomics, wrote a post titled “University of Florida eliminates Computer Science Department. At least they still have football" ridiculing UF for their decisions. Here are my favorite quotes:

 in the midst of a technology revolution, with a shortage of engineers and computer scientists, UF decides to cut computer science completely? 

Computer scientist Carl de Boor, a member of the National Academy of Sciences and winner of the 2003 National Medal of Science, asked the UF president “What were you thinking?”

Well, his post went viral and days later UF reversed it’s decision! So my point is this: statistics departments, be nice to bloggers that work in genomics… one of them might save your butt some day.

Disclaimer: Steven Salzberg has a joint appointment in my department and we have joint lab meetings.

Just like regular communism, dongle communism has failed

Bad news comrades. Dongle communism in under attack. Check out how this poor dongle has been subjugated. This is in our lab meeting room. To add insult to injury, this happened on May 1st

Tags: humor

I don’t think it means what ESPN thinks it means

Given ESPN’s recent headline difficulties it seems like they might want a headline editor or something…

Sunday data/statistics link roundup (1/29)

  1. A really nice D3 tutorial. I’m 100% on board with D3, if they could figure out a way to export the graphics as pdfs, I think this would be the best visualization tool out there. 
  2. A personalized calculator that tells you what number (of the 7 billion or so) that you are based on your birth day. I’m person 4,590,743,884. Makes me feel so special….
  3. An old post of ours, on dongle communism. One of my favorite posts, it came out before we had much traffic but deserves more attention. 
  4. This isn’t statistics/data related but too good to pass up. From the Bones television show, malware fractals shaved into a bone. I love TV science. Thanks to Dr. J for the link.
  5. Stats are popular

Fundamentals of Engineering Review Question Oops

The Fundamentals of Engineering Exam is the first licensing exam for engineers. You have to pass it on your way to becoming a professional engineer (PE). I was recently shown a problem from a review manual: 

When it is operating properly, a chemical plant has a daily production rate that is normally distributed with a mean of 880 tons/day and a standard deviation of 21 tons/day. During an analysis period, the output is measured with random sampling on 50 consecutive days, and the mean output is found to be 871 tons/day. With a 95 percent confidence level, determine if the plant is operating properly. 

  1. There is at least a 5 percent probability that the plant is operating properly. 
  2. There is at least a 95 percent probability that the plant is operating properly. 
  3. There is at least a 5 percent probability that the plant is not operating properly. 
  4. There is at least a 95 percent probability that the plant is not operating properly. 

Whoops…seems to be a problem there. I’m glad that engineers are expected to know some statistics; hopefully the engineering students taking the exam can spot the problem…but then how do they answer? 

Dear editors/associate editors/referees, Please reject my papers quickly

The review times for most journals in our field are ridiculous. Check out Figure 1 here. A careful review takes time, but not six months. Let’s be honest, those papers are sitting on desks for the great majority of those six months. But here is what really kills me: waiting six months for a review basically saying the paper is not of sufficient interest to the readership of the journal. That decision you can come to in half a day. If you don’t have time, don’t accept the responsibility to review a paper.

I like sharing my work with my statistician colleagues, but the Biology journals never  do this to me. When my paper is not of sufficient interest, these journals reject me in days not months. I sometimes work on topics that are fast pace and many of my competitors are not statisticians. If I have to wait six months for each rejection, I can’t compete. By the time the top three applied statistics journals reject the paper, more than a year goes by and the paper is no longer novel. Meanwhile I can go through Nature Methods, Genome Research, and Bioinformatics in less than 3 months.

Nick Jewell once shared an idea that I really liked. It goes something like this. Journals in our field will accept every paper that is correct. The editorial board, with the help of referees, assigns each paper into one of five categories A, B, C, D, E based on novelty, importance, etc… If you don’t like the category you are assigned, you can try your luck elsewhere. But before you go, note that the paper’s category can improve after publication based on readership feedback. While we wait for this idea to get implemented, I please ask that if you get one of my papers and you don’t like it, reject it quickly. You can write this review: “This paper rubbed me the wrong way and I heard you like being rejected fast so that’s all I am going to say.” Your comments and critiques are valuable, but not worth the six month wait. 

ps -  I have to admit that the newer journals have not been bad to me in this regard. Unfortunately, for the sake of my students/postdocs going into the job market and my untenured jr colleagues, I feel I have to try the established top journals first as they still impress more on a CV.

Tags: Rant humor

Getting email responses from busy people

I’ve had the good fortune of working with some really smart and successful people during my career. As a young person, one problem with working with really successful people is that they get a ton of email. Some only see the subject lines on their phone before deleting them. 

I’ve picked up a few tricks for getting email responses from important/successful people:  

The SI Rules

  1. Try to send no more than one email a day. 
  2. Emails should be 3 sentences or less. Better if you can get the whole email in the subject line. 
  3. If you need information, ask yes or no questions whenever possible. Never ask a question that requires a full sentence response.
  4. When something is time sensitive, state the action you will take if you don’t get a response by a time you specify. 
  5. Be as specific as you can while conforming to the length requirements. 
  6. Bonus: include obvious keywords people can use to search for your email. 

Anecdotally, SI emails have a 10-fold higher response probability. The rules are designed around the fact that busy people who get lots of email love checking things off their list. SI emails are easy to check off! That will make them happy and get you a response. 

It takes more work on your end when writing an SI email. You often need to think more carefully about what to ask, how to phrase it succinctly, and how to minimize the number of emails you write. A surprising side effect of applying SI principles is that I often figure out answers to my questions on my own. I have to decide which questions to include in my SI emails and they have to be yes/no answers, so I end up taking care of simple questions on my own. 

Here are examples of SI emails just to get you started: 

Example 1

Subject: Is my response to reviewer 2 ok with you?

Body: I’ve attached the paper/responses to referees.

Example 2

Subject: Can you send my letter of recommendation to


Keywords = recommendation, Jeff, John Doe.

Example 3

Subject: I revised the draft to include your suggestions about simulations and language

Revisions attached. Let me know if you have any problems, otherwise I’ll submit Monday at 2pm.