A fascinating presentation by Kate Crawford, principal researcher at Microsoft Research, at the 2013 Strata Conference takes a closer look at big data and what it means. Crawford explores what she calls "algorithmic illusions" and the limitations of the large-scale data solutions that are being embraced in many parts of the business world.
Using an analogy to an optical illusion involving a spinning cat, Crawford makes the case that while big data is essential to many business applications, results that seem objective to human decision makers can often be interpreted in more than one way. "Things can be seen differently," Crawford said, citing a paper in which she and co-author danah boyd reflect on some major principles of big data use, including what Crawford calls "mythology," or the belief that big data brings absolute truth and objectivity to a project. Leaders, she said, often directly associate big data with an objective bird’s-eye view, while ignoring what she called the three fundamental limitations or considerations that may affect this objectivity in key ways: bias, signal and scale.
Starting with bias, Crawford uses examples of flooding in Australia and the United States to show that big data doesn’t always match the reality on the street. She ties in the second principle, signal, further illustrating how data sets can reflect hidden factors that heavily skew the results. As one example, Crawford cited the many kinds of world maps that have been developed in an attempt to show an objective view of the relative size of continents and nations.
"Maps are not neutral," Crawford said. "We’re making choices every time we decide to represent our data."
To further illustrate the principle, Crawford uses the example of an application that reports potholes in Boston to city officials. Because apps like this run on smartphones and other mobile devices, she suggests, the overall reports can end up looking a lot like census maps of relative age and income across a city or municipality.
"We run the risk of further entrenching particular kinds of social inequity," Crawford said, pointing to those who may be left out of a given big data set due to differences in technology use.
"What happens if you live in the shadow of big data sets?" she said.
Crawford also talks about research from years ago that questioned whether high-level information always represents more granular data and whether an "objective panorama" is always a more accurate representation than data on a smaller scale. She asks listeners to think not just about big data, but about "data with depth." By this, she means data that truly guides readers toward objective reality, rather than glossing over details with a more global approach that, while easier to understand, may leave out key elements of what actually exists.