500 Words, Day 10 / by dan turner

If you're of the mainstream narrative mindset that journalism is dead, you'll be surprised by how vibrant and rampant data-driven journalism (DDJ) is today. (Disclosure: this lede is perhaps a bit linkbait, as I intend to share this essay with my #datajmooc classmates.) But even the most data-y of data visualizations can fall prey to the same pitfalls of postmodernism that plagued Walter Lippmann, who wrote about journalism and truth before modernism was a thing.

DDJ involves digging into structured data, often more than a human can handle, and is usually mated to infographics. You've seen this in projects such as interactive vote maps, visualizations of Mariano Rivera's pitches, and tracings of the on-the-ground implications of the WikiLeaks Afghanistan war logs. Joel Gunter has a good summary of how the New York Times thinks about the subject.

I should stress that DDJ isn't just geeks entertaining themselves. Pew Internet has documented how infographics, participatory visualizations, and other aspects of DDJ increase engagement with news stories and drive readers (and revenues).

In his 1922 book Public Opinion, Walter Lippmann wrote that what news serves to do is "signalize an event" (which I take to mean separating out the signal from the noise, a fairly cutting-edge concept at the time, decades before the work of Claude Shannon). He also wrote that the function of truth is to "bring to light the hidden facts and set them in relation to each other". This sounds an awful lot like DDJ, where from giant data sets journalists extract a signal of scandal, or of progress, that might otherwise have gone unnoticed. Another core technique of DDJ is to compare and contrast disparate data sets and sources, such as hunger and average household income, to discover where causal connections might lie.

However, though Lippmann's words sound wise, and the mission is a noble one, they can be co-opted by reality to the point where you're not delivering news, but someone's agenda. The same caution holds for DDJ.

As Michael Schudson points out in his Discovering the News, this view of news relies on events being a "relatively unbiased sampling of the 'hidden facts'", and not part of a narrative constructed either explicitly by PR or implicitly by suffusing power structures (think of a newspaper just reposting a politician's or "expert's" speech as fact). DDJ is less susceptible to this dominant-narrative influence, in that data is harder to spin, but you can bet someone's working on it, and garbage in, garbage out (GIGO) still applies.

Some data geeks think that if you have the data, you have the answer (I've seen this and am not making it up). But the questions you ask and your awareness of context are the secret sauce that transforms data into information. Thankfully, "interrogating the data" is becoming the watchword for data journalists today, who are being trained to look at what's in the data and treat it as they would any press officer's statement.

And that's 500 words.