Recently, former LinkedIn data scientist DJ Patil claimed that data science is really about telling stories.
I agree.
The purpose of data analysis within a company is to help the company make better decisions. However, decisions are made by managers, not analysts. To be effective, analysts must communicate their analysis in a convincing way: by telling stories. In fact, models are just stories formalized in a mathematical language.
Here’s the rub. Bad stories are a lot easier to tell than good stories.
A bad story typically requires less analysis and effort. An average can be calculated with a simple SQL query, whereas regressions and proper standard errors require more time and thought. Explaining sophisticated analysis is also more difficult. For example, good analysis often requires studying a narrow subset of the data with very specialized techniques. Further, extrapolating from a subsample to the rest of the sample requires many assumptions. Decision makers often do not have the time or ability to understand why particular steps were taken. Lastly, managers who do not understand statistics often make data requests that reveal little about the questions of interest. Essentially, they want to hear bad stories.
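To make the contrast between an average and a regression concrete, here is a minimal sketch in Python. Everything in it is a made-up assumption for illustration: the simulated data, the variable names (`heavy_user`, `saw_feature`, `spend`) and the true effect of 1.0. The standard errors are the classical homoskedastic ones, which may not be the "proper" choice for real data.

```python
import numpy as np

# Hypothetical simulated data: heavy users are more likely to see a new
# feature AND spend more regardless of it. The true feature effect is 1.0.
rng = np.random.default_rng(0)
n = 2000
heavy_user = rng.random(n) < 0.5
saw_feature = rng.random(n) < np.where(heavy_user, 0.8, 0.2)
true_effect = 1.0
spend = 10.0 + 5.0 * heavy_user + true_effect * saw_feature + rng.normal(0.0, 2.0, n)

# The easy story: compare raw averages between the two groups.
naive_diff = spend[saw_feature].mean() - spend[~saw_feature].mean()

# The harder story: ordinary least squares that controls for heavy usage.
X = np.column_stack([np.ones(n), saw_feature, heavy_user]).astype(float)
beta, *_ = np.linalg.lstsq(X, spend, rcond=None)

# Classical (homoskedastic) standard errors: sigma^2 * (X'X)^-1.
resid = spend - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

print(f"difference in averages: {naive_diff:.2f}")
print(f"regression estimate:    {beta[1]:.2f} (s.e. {se[1]:.2f})")
```

In this simulation the difference in averages overstates the effect, because it also picks up the extra spending of heavy users. The regression takes more code and needs more explanation, but it gives an estimate close to the true effect, along with a measure of its uncertainty.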
It is the responsibility of the analyst to fight actively to raise the level of conversation in the company by:
Convincing the company to use appropriate metrics.
Highlighting the difficulty of making inferences without experimentation.
Fighting to work on questions that are actually answerable with the data at hand in a reasonable amount of time.
In summary, data scientists should aim to make everyone in the company want to hear better stories. That way, when someone does tell a bad story, they are challenged appropriately.