Data science needs its Turing test. In executive meetings, decision-makers confer in natural languages and sentiments when they wrestle with the trade-offs of their choices.
They seek to bridge perception differences between them with visuals, charts, infographics and tables.
Amassing huge volumes of an expanding variety of variables can transform executive decision-making, but only when the data populates human mind maps.
The 2020 State of Data Governance and Automation survey conducted by Dataversity and Erwin reported that better decision-making is the primary driver of respondents' data intelligence initiatives (62% in the 2020 survey, up from 45% in 2018), followed by analytics (51%).
Natural Language Data Searches
Data is agreeable to human decision-makers when it becomes part of a conversation. It engages humans when they gain situational awareness with stories. It stimulates thinking when humans see patterns in visuals. It provokes questions when humans find relationships in the data.
The answers in the data suggest courses of action. Yet data is typically formatted as tables and charts that at best show patterns and are unlikely to capture the interdependence in complex environments.
Metadata, annotated in human languages, is the bedrock of analytics that communicates narratives. It will start to provide information about events and when, what, where, how, and why they happened.
Humans are spontaneously inclined to choose categories that all their peers can understand without confounding them by varying uses of the terms.
"Historically, the focus of metadata management has been on technical metadata (platform, structure, physical characteristics). However, equal weight is now being given to the capture and correlation of business metadata (business rules, associated applications, and business capabilities) and semantic metadata (business terminology and ontology) to support storytelling and contextual association," said Danny Sandwell, Director of Product Marketing at erwin.
Finding Data Patterns
When data sets are as large as they are today, storytellers can get their heads around them by looking at a map of the relationships in the data.
Knowledge graphs, popularized by Google, show entities, their properties, and the relationships between them. Currently, Google uses knowledge graphs to organize snapshot biographical information about prominent people, such as Donald Trump.
Storytellers are unlikely to be satisfied with the bare-bones data that knowledge graphs visualize. For self-evident reasons, they would want to know how Donald Trump came to be the POTUS without any previous background in politics.
They will start to look at his social network and who influenced him, experiences interacting with blue-collar workers and their aspirations, and what he learned from his business failures and successes.
Some answers come from databases of factual data, such as a list of his friends; others come from databases of qualitative information, such as news.
Community efforts like DBpedia and Yago create a knowledge platform with information from Wikipedia to find related information.
Emerging semantic standards like the Resource Description Framework (RDF) use triples of subject, predicate, and object, which construct snippets like John (subject) lives in (predicate) the suburbs (object).
Information classified in classes and concepts, common across datasets, paves the way for interlinking and finding related information and the drivers of variation in either one or both of them. A query language like SPARQL can search the contents of enormous datasets stored in RDF.
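To make the triple idea concrete, here is a minimal sketch in plain Python rather than a real RDF store or SPARQL engine. The facts, the predicate names (`livesIn`, `worksFor`, `locatedIn`), and the helper functions `match` and `employer_locations` are all hypothetical, invented only to show how a shared vocabulary lets one fact link to another.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical facts, invented for illustration only.
TRIPLES = [
    Triple("John", "livesIn", "Suburbs"),
    Triple("Mary", "livesIn", "City"),
    Triple("John", "worksFor", "Acme"),
    Triple("Acme", "locatedIn", "City"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that agree with every non-None field (None is a wildcard)."""
    for t in triples:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.obj != obj:
            continue
        yield t


def employer_locations(triples: List[Triple], person: str) -> List[str]:
    """Interlink two facts: person -> employer -> employer's location."""
    return [
        loc.obj
        for job in match(triples, subject=person, predicate="worksFor")
        for loc in match(triples, subject=job.obj, predicate="locatedIn")
    ]


# Loosely analogous to a SPARQL pattern such as: ?who livesIn Suburbs
who_lives_in_suburbs = [
    t.subject for t in match(TRIPLES, predicate="livesIn", obj="Suburbs")
]

if __name__ == "__main__":
    print(who_lives_in_suburbs)
    print(employer_locations(TRIPLES, "John"))
```

A production system would more likely load the triples into an RDF store and query them with SPARQL; the wildcard matching here only mimics the basic pattern-matching step.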
Identifying patterns is only the start of the journey of data exploration. A deeper dive reveals stories that executives can use to design their strategies and operations, realize efficiencies, and gain competitive advantage.
"Knowledge graphs and an ontology forms the foundation of a data storytelling system, but it needs reasoning and learning layered on top to reach its fullest potential. A storytelling system needs to understand the types of questions users typically ask, the information that is useful to include in the answer, and the types of questions that are likely to be raised by an answer to an earlier question," said Nate Nichols, Distinguished Principal, Product Strategy and Architecture of Narrative Science.
"Based on latent semantics in data such as the underperformance of a team member, the technology developed by Narrative Science asks follow-up questions and machine learning algorithms filter out queries relevant to complete the narrative," Nate Nichols explained.
"You need metadata that represents sentiment (if this metric goes up over time, is that good, bad, or neutral?); metric phenomena (e.g., does this metric accumulate over time, such as revenue, or is it point-in-time, such as stock price?); and qualitative judgments (e.g., what does it mean for sales to 'bounce back'?)," Nichols elaborated.
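The kinds of metadata Nichols lists can be pictured as a small record attached to each metric. The sketch below is an illustration under stated assumptions, not Narrative Science's implementation: the names `MetricMeta`, `Direction`, `Behavior`, `period_value`, and `judge_change` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class Direction(Enum):
    UP_IS_GOOD = "up_is_good"
    UP_IS_BAD = "up_is_bad"
    NEUTRAL = "neutral"


class Behavior(Enum):
    CUMULATIVE = "cumulative"        # accumulates over time, e.g. revenue
    POINT_IN_TIME = "point_in_time"  # a snapshot, e.g. stock price


@dataclass(frozen=True)
class MetricMeta:
    name: str
    direction: Direction
    behavior: Behavior


def period_value(meta: MetricMeta, readings: Sequence[float]) -> float:
    """Sum cumulative metrics over a period; use the last reading otherwise."""
    if not readings:
        raise ValueError("readings must not be empty")
    if meta.behavior is Behavior.CUMULATIVE:
        return sum(readings)
    return readings[-1]


def judge_change(meta: MetricMeta, previous: float, current: float) -> str:
    """Return 'good', 'bad', or 'neutral' for a change, using the metadata."""
    if current == previous or meta.direction is Direction.NEUTRAL:
        return "neutral"
    went_up = current > previous
    up_is_good = meta.direction is Direction.UP_IS_GOOD
    return "good" if went_up == up_is_good else "bad"


# Hypothetical metrics for illustration.
revenue = MetricMeta("revenue", Direction.UP_IS_GOOD, Behavior.CUMULATIVE)
stock_price = MetricMeta("stock price", Direction.UP_IS_GOOD, Behavior.POINT_IN_TIME)
churn = MetricMeta("customer churn", Direction.UP_IS_BAD, Behavior.CUMULATIVE)

if __name__ == "__main__":
    print(period_value(revenue, [10, 20, 30]))
    print(period_value(stock_price, [10, 20, 30]))
    print(judge_change(churn, 5, 8))
```

With this metadata in place, the same rise in a number can be narrated as good news for revenue and bad news for churn, and a quarter of readings is summed for one metric but not for the other.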
More advanced narrative solutions are incorporating additional fuzzier elements like culture, historical background, and strategy.
"We've done internal proofs-of-concept around mimicking a company's writing style to more closely reflect the company's culture. We consider historical trends already. We will include explicit goals soon," Nichols said of Lexio, a product Narrative Science launched recently.
Storytellers weave together multiple strands, extracted from several types of data series, to tell a cohesive story. Technical standards stymie this when they prevent access to data held in different stores. Human semantic categorization, by contrast, is agnostic to the technical terms vendors use and can support searches across databases.
Columns that state the location of a house have the same semantic meaning even if metadata annotates them in different ways. The virtualization of databases opens the door to searching for any document in any database, as long as the chosen method of annotation is amenable to translation into human languages.
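One way to picture this is a lookup from differently named columns to a shared human-language concept. The following is a minimal sketch; the store, table, and column names, along with the helpers `normalize` and `find_columns`, are hypothetical and stand in for whatever annotation scheme a real catalog would use.

```python
from typing import Dict, List, Tuple

# (store, table, column) -> human-language annotation.
# All names here are hypothetical, invented for illustration.
ColumnRef = Tuple[str, str, str]

COLUMN_ANNOTATIONS: Dict[ColumnRef, str] = {
    ("crm", "customers", "addr"): "house location",
    ("listings", "homes", "property_address"): "House Location",
    ("tax", "parcels", "site_loc"): "house_location",
    ("crm", "customers", "full_name"): "owner name",
}


def normalize(label: str) -> str:
    """Lowercase a label and treat underscores and hyphens as spaces."""
    cleaned = label.lower().replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())


def find_columns(annotations: Dict[ColumnRef, str], concept: str) -> List[ColumnRef]:
    """Return every column whose annotation names the same concept."""
    wanted = normalize(concept)
    return sorted(
        col for col, label in annotations.items() if normalize(label) == wanted
    )


if __name__ == "__main__":
    for store, table, column in find_columns(COLUMN_ANNOTATIONS, "House location"):
        print(f"{store}.{table}.{column}")
```

The search is expressed in the business term, so the storyteller does not need to know that three vendors chose three different column names for the same thing.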
Furthermore, searches of databases return documents and not the content contained within them. Hashtags have been wildly popular in social media as they can pinpoint content of interest.
Searches of databases with hashtags return slices of content from larger documents. Storytellers often want to search content at a granular level to help explain anomalies in data.
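As a rough illustration of a search that returns slices rather than whole documents, the sketch below splits a document into paragraphs and keeps only those carrying a given hashtag. The sample report and the helper `slices_with_tag` are hypothetical; real systems would index tags rather than scan text.

```python
import re
from typing import List

HASHTAG = re.compile(r"#(\w+)")


def slices_with_tag(document: str, tag: str) -> List[str]:
    """Return the paragraphs of a document that carry the given hashtag."""
    wanted = tag.lstrip("#").lower()
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    return [
        p for p in paragraphs
        if wanted in {found.lower() for found in HASHTAG.findall(p)}
    ]


# Hypothetical report text, invented for illustration.
REPORT = """Q3 revenue rose in most regions. #revenue

Returns spiked in the Northeast after a packaging change. #anomaly #returns

Headcount was flat. #staffing"""

if __name__ == "__main__":
    for piece in slices_with_tag(REPORT, "#anomaly"):
        print(piece)
```

Only the paragraph tagged as an anomaly comes back, which is the granular slice a storyteller would want when explaining an outlier.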
Decision-makers navigate what is often a fog of complexity, missing and noisy data, the flux of competition and industry disruption in their business environment, black swan events, uncertain impacts of business decisions, and scarce or misaligned resources to respond.
Yet the agile will find potential opportunities that others miss. Executives need decision-making tools that capture the interdependence among countless variables, pinpoint the missing facts, adapt to the fluidity of the business environment by capturing new data and reconfiguring the visuals to represent evolving realities and latent opportunities, and present 3D situational awareness.
"Supporting narratives through metadata is at the heart of a real data intelligence solution. After from multiple perspectives and in context, the next natural step is to support ideation and innovation. It is sort of like a metadata-driven 'JAD' session. We deliver this in the form of navigable, visual mind maps that allow users to see all their technical and business assets, and the associations between them," Danny Sandwell asserted.
"They can storyboard a narrative, visualize the supporting data landscape, and then test the hypothesis by exploring these assets and associations. We can then see if the data at hand supports the narrative, identify gaps, and synthesize actionable requirements to drive innovation."
A roadmap for conversational communication of data delivered as knowledge and insights and their visual display is in place.
At this stage, vendors are integrating multiple elements that will facilitate the extraction of pertinent pieces of data and their relationships, the examination of data from numerous angles, and its display in text, images, and data.
Business decision-makers have expressed their hunger for analytics that dovetails into the natural processes of decision-making.
Accelerated adoption of semantic business intelligence is on the horizon.