Mark Twain, Data Analyst

parsimony

A topic of interest to me as a liberal arts grad is the similarity between good writing and good data work.

Mark Twain is often credited with saying, “I didn’t have time to write you a short letter, so I wrote a long one instead.” Tight writing takes longer. A long letter may seem more comprehensive, but strip away the fluff and what remains is a less coherent message than a short letter would have delivered.

Same idea with data. Some people load as many variables as possible into a model, hoping it gives the most realistic view possible.

This presents the same problem as a long letter. Most of those variables are noise, and the more of them you include, the greater the risk of overfitting: capturing spurious rather than meaningful relationships.
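To make the overfitting point concrete, here is a minimal sketch in Python with NumPy. Everything in it is made up for illustration: the data is simulated, one variable actually drives the outcome, and thirty others are pure noise. The names (`make_data`, `fit`, `mse`, `results`) are my own, not from any particular library. It fits a lean model and a "kitchen sink" model by ordinary least squares, then compares their errors on the data they were fit to and on fresh data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable


def make_data(n_rows, n_noise):
    """Simulate an outcome driven by one real variable plus unrelated noise columns."""
    signal = rng.normal(size=n_rows)                # the one variable that matters
    noise_vars = rng.normal(size=(n_rows, n_noise))  # variables unrelated to y
    y = 2.0 * signal + rng.normal(size=n_rows)       # assumed true relationship
    X = np.column_stack([np.ones(n_rows), signal, noise_vars])
    return X, y


def fit(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def mse(X, y, coef):
    """Mean squared prediction error."""
    return float(np.mean((y - X @ coef) ** 2))


X_train, y_train = make_data(40, 30)    # small sample, many variables
X_test, y_test = make_data(1000, 30)    # fresh data the models never saw

results = {}
models = [
    ("parsimonious", slice(0, 2)),    # intercept + the real variable only
    ("kitchen sink", slice(None)),    # intercept + real variable + 30 noise columns
]
for name, cols in models:
    coef = fit(X_train[:, cols], y_train)
    results[name] = (
        mse(X_train[:, cols], y_train, coef),
        mse(X_test[:, cols], y_test, coef),
    )
    print(f"{name}: train MSE = {results[name][0]:.2f}, test MSE = {results[name][1]:.2f}")
```

The kitchen sink model will always look at least as good on the data it was fit to, since it has more knobs to turn. The test error is the honest number: it shows how much of that apparent fit was the model memorizing noise.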


Comments

  1. I would also be interested in looking into a correlation between the presentation and understanding of that data and good writing. I am a believer in narrative data. Data tells a story and it takes a good analyst to recognize and tell that story to a broader audience.

    • George Mount says:

      Definitely an overlap! I actually got the idea for this post from TJ Walker’s Secret to Foolproof Presentations. Check it out on Amazon; it’s only a dollar, if you can get over the ridiculous cover image.

      He points out that many presenters try to dump so much information into a speech that it leaves nothing remarkable. People do this with data, too. They try to account for every little nuance of the data set, rather than making it a useful lens for exploration.

      Thanks for the comment, Matt!

