Interview with Amanda Cox - Graphics Editor at the New York Times

Amanda Cox 

Amanda Cox received her M.S. in statistics from the University of Washington in 2005. She then moved to the New York Times, where she is a graphics editor. She, and the graphics team at the New York Times, are responsible for many of the cool, informative, and interactive graphics produced by the Times. For example, this, this and this (the last one, Olympic Symphony, is one of my all time favorites). 

You have a background in statistics, do you consider yourself a statistician? Do you consider what you do statistics?

I don’t deal with uncertainty in a formal enough way to call what I do statistics, or myself a statistician. (My technical title is “graphics editor,” but no one knows what this means. On the good days, what we do is “journalism.”) Mark Hansen, a statistician at UCLA, has possibly changed my thinking on this a little bit though, by asking who I want to be the best at visualizing data, if not statisticians.

How did you end up at the NY Times?

In the middle of my first year of grad school (in statistics at the University of Washington), I started applying for random things. One of them was to be a summer intern in the graphics department at the Times.

How are the graphics and charts you develop different than producing graphs for a quantitative/scientific audience?

"Feels like homework" is a really negative reaction to a graphic or a story here. In practice, that means a few things: we don’t necessarily assume our audience already cares about a topic. We try to get rid of jargon, which can be useful shorthand for technical audiences, but doesn’t belong in a newspaper. Most of our graphics can stand on their own, meaning you shouldn’t need to read any accompanying text to understand the basic point. Finally, we probably pay more attention to things like typography and design, which, done properly, are really about hierarchy and clarity, and not just about making things cute. 

How do you use R to prototype graphics? 

I sketch in R, which mostly just means reading data, and trying on different forms or subsets or levels of aggregation. It’s nothing fancy: usually just points and lines and text from base graphics. For print, I will sometimes clean up a pdf of R output in Illustrator. You can see some of that in practice at, which where one of my colleagues, Kevin Quealy, posts some of the department’s sketches. (Kevin and I are the only regular R users here, so the amount of R used on chartsnthings is not at all representative of NYT graphics as a whole.)

Do you have any examples where the R version and the eventual final web version are nearly identical?

Real interactivity changes things, so my use of R for web graphics is mostly just a proof-of-concept thing. (Sometimes I will also generate “poor-man’s interactivity,” which means hitting the pagedown key on a pdf of charts made in a for loop.) But here are a couple of proof-of-concept sketches, where the initial R output doesn’t look so different from the final web version.

The Jobless Rate for People Like You

How Different Groups Spend Their Day

You consistently produce arresting and informative graphics about a range of topics. How do you decide on which topics to tackle?

News value and interestingness are probably the two most important criteria for deciding what to work on. In an ideal world, you get both, but sometimes, one is enough (or the best you can do).

Are your project choices motivated by availability of data?

Sure. The availability of data also affects the scope of many projects. For example, the guys who work on our live election results will probably map them by county, even though precinct-level results are so much better. But precinct-level data isn’t generally available in real time.

What is the typical turn-around time from idea to completed project?

The department is most proud of some of its one-day, breaking news work, but very little of that is what I would think of as data-heavy.  The real answer to “how long does it take?” is “how long do we have?” Projects always find ways to expand to fill the available space, which often ranges from a couple of days to a couple of weeks.

Do you have any general principles for how you make complicated data understandable to the general public?

I’m a big believer in learning by example. If you annotate three points in a scatterplot, I’m probably good, even if I’m not super comfortable reading scatterplots. I also think the words in a graphic should highlight the relevant pattern, or an expert’s interpretation, and not merely say “Here is some data.” The annotation layer is critical, even in a newspaper (where the data is not usually super complicated).

What do you consider to be the most informative graphical elements or interactive features that you consistently use?

I like sliders, because there’s something about them that suggests story (beginning-middle-end), even if the thing you’re changing isn’t time. Using movement in a way that means something, like this or this, is still also fun, because it takes advantage of one of the ways the web is different from print.

Sunday data/statistics link roundup (5/27)

  1. Amanda Cox on the process they went through to come up with this graphic about the Facebook IPO. So cool to see how R is used in the development process. A favorite quote of mine, “But rather than bringing clarity, it just sort of looked chaotic, even to the seasoned chart freaks of 620 8th Avenue.” One of the more interesting things about posts like this is you get to see how statistics versus a deadline works. This is typically the role of the analyst, since they come in late and there is usually a deadline looming…
  2. An interview with Steve Blank about Silicon valley and how venture capitalists (VC’s) are focused on social technologies since they can make a profit quickly. A depressing/fascinating quote from this one is, “If I have a choice of investing in a blockbuster cancer drug that will pay me nothing for ten years,  at best, whereas social media will go big in two years, what do you think I’m going to pick? If you’re a VC firm, you’re tossing out your life science division.” He also goes on to say thank goodness for the NIH, NSF, and Google who are funding interesting “real science” problems. This probably deserves its own post later in the week, the difference between analyzing data because it will make money and analyzing data to solve a hard science problem. The latter usually takes way more patience and the data take much longer to collect. 
  3. An interesting post on how Obama’s analytics department ran an A/B test which improved the number of people who signed up for his mailing list. I don’t necessarily agree with their claim that they helped raise $60 million, there may be some confounding factors that mean that the individuals who sign up with the best combination of image/button don’t necessarily donate as much. But still, an interesting look into why Obama needs statisticians
  4. A cute statistics cartoon from @kristin_linn  via Chris V. Yes, we are now shamelessly reposting cute cartoons for retweets :-). 
  5. Rafa’s post inspired some interesting conversation both on our blog and on some statistics mailing lists. It seems to me that everyone is making an effort to understand the increasingly diverse field of statistics, but we still have a ways to go. I’m particularly interested in discussion on how we evaluate the contribution/effort behind making good and usable academic software. I think the strength of the Bioconductor community and the rise of Github among academics are a good start.  For example, it is really useful that Bioconductor now tracks the number of package downloads