The business world is buzzing about data discovery. On the surface it seems like a simple term, but this catchphrase means more than "finding stuff out." So what is data discovery, really? And how does it fit into the modern landscape of mobile, analytics and big data?
Data discovery, which is sometimes referred to as data mining, involves collecting and analyzing data, and then presenting the findings in readable, usable formats. In the most basic terms, data discovery is the process of finding patterns within data and using those patterns to meet a particular business objective.
Of course, there is more to data discovery than matching up data points. Organizations use data discovery for a wide range of objectives and applications in various areas – and in a modern, digital world, there is more data to discover than ever.
Where Did Data Discovery Come From?
While data discovery is relatively new to the "hot" lexicon of digital business terms, the methods and strategies are not so new. The term’s predecessor, data mining, was introduced in the 1990s, but businesses and organizations have been using some form of data discovery since the dawn of commerce.
Modern data discovery as a business strategy came about through the rise of big data – a catch-all term that describes the relatively recent, exponential growth of large, complex data sets where the sheer volume of information rules out using traditional database and organizational tools to extract anything useful.
However, big data is a big deal for today’s businesses, because among all that structured and unstructured data are highly useful patterns that can be used to improve marketing strategies, ROI and profits. Data discovery platforms, therefore, are designed to give organizations easier ways to pinpoint, analyze and extract relevant data.
How Does Data Discovery Work?
Platforms for data discovery typically consist of several tools that are bundled together and work in conjunction to extract data and present it in a meaningful way. There are several different ways these tools find and identify relevant information, but most of them revolve around three basic analytical methods:
- Metadata: All digital content contains metadata, or "data about data." This information is generally hidden from end users, but is visible on the back end. In a database, metadata is typically stored as table and column attributes, so a data discovery tool that relies on metadata looks for matches in column name, data size and data type.
- Labels: In many cases, data is generated and grouped under labels, or tags, that describe the data within that group. These tags may be generated when the data is created, or can be added for reference and additional information. Labels or tags are similar to metadata, although less formal.
- Content: This strategy analyzes the data itself, rather than attached labels or metadata.
Typically, the content itself is far larger in volume than its tags or metadata, which means identifying data by content takes longer and requires more complex discovery methods. However, content analysis also tends to provide richer and more useful relational results.
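The three methods above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular data discovery platform works: the column catalog, the tag names, the sample values and the email pattern are all made up for the example.

```python
import re

# Hypothetical catalog: metadata, tags and a few sample values per column.
COLUMNS = [
    {"table": "customers", "name": "email_address", "type": "varchar", "size": 255,
     "tags": {"contact"}, "samples": ["ann@example.com", "bob@example.org"]},
    {"table": "orders", "name": "notes", "type": "text", "size": 2000,
     "tags": set(), "samples": ["ship to carol@example.net", "gift wrap"]},
    {"table": "orders", "name": "total", "type": "decimal", "size": 10,
     "tags": {"finance"}, "samples": ["19.99", "5.00"]},
]

# Deliberately simple pattern, good enough for the illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def qualified_name(column):
    return f"{column['table']}.{column['name']}"


def find_by_metadata(columns, name_hint, data_type):
    """Match on column name and data type only; the data is never read."""
    return [qualified_name(c) for c in columns
            if name_hint in c["name"] and c["type"] == data_type]


def find_by_label(columns, tag):
    """Match on tags attached to the column."""
    return [qualified_name(c) for c in columns if tag in c["tags"]]


def find_by_content(columns, pattern):
    """Scan the values themselves; slower, but finds data the other methods miss."""
    return [qualified_name(c) for c in columns
            if any(pattern.search(value) for value in c["samples"])]


by_metadata = find_by_metadata(COLUMNS, "email", "varchar")
by_label = find_by_label(COLUMNS, "contact")
by_content = find_by_content(COLUMNS, EMAIL_PATTERN)
```

In this made-up catalog, the metadata and label searches find only the column that is named or tagged as contact data, while the content search also picks up the email address buried in a free-text notes column. That is the trade-off described above: content analysis reads more data, but returns richer results.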
Once the data has been analyzed, other data discovery tools can be used to present the discovered relationships, trends or patterns in a useful format. Graphs, tables and charts are basic presentation tools used in data discovery, but more complex yet readable presentations, such as infographics, are gaining favor with data analysts.
What Can Data Discovery Do?
In practice, data discovery platforms and tools have a very wide range of uses. These methods and strategies are most commonly used by consumer-facing organizations in almost every industry, including retail, finance, communications and marketing, although not-for-profits, business-to-business organizations and government agencies also make use of this technology.
Data discovery enables an organization to find relationships between internal factors (such as price, product positioning and employee performance) and external factors (such as competition data, economic indicators and customer demographics). These relationships help businesses illustrate and define the impacts of changes to one or more factors on sales, customer engagement and profits.
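One simple way to measure such a relationship is a correlation coefficient. The sketch below computes Pearson correlation by hand for one internal factor (the organization's own price) and one external factor (a competitor's price) against units sold. All figures are invented for illustration, and real platforms typically use far more sophisticated analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up weekly figures, for illustration only.
own_price = [10.0, 11.0, 12.0, 13.0, 14.0]         # internal factor
competitor_price = [12.0, 12.0, 11.0, 13.0, 12.0]  # external factor
units_sold = [200, 190, 170, 160, 140]

price_effect = pearson(own_price, units_sold)
competitor_effect = pearson(competitor_price, units_sold)
```

With these invented numbers, `price_effect` is strongly negative (sales fall as the price rises), while `competitor_effect` is close to zero. Correlation alone does not prove cause and effect, but it points analysts toward the factors worth examining more closely.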
The tools used in data discovery offer a more detailed picture of influential factors, and allow companies to fine-tune their marketing strategies and advertising campaigns with highly targeted information. The recommendation engine on the popular streaming video service Netflix is a good example of data discovery technology at work. The service uses external data about customers’ viewing histories and internal data about the media content in its database to make individualized suggestions for new videos that are likely to interest each customer.
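The general idea of combining a viewing history with data about the content can be shown with a toy example. This is not Netflix's actual method, which is far more elaborate; the titles, genres and scoring rule below are hypothetical and chosen only to keep the sketch short.

```python
from collections import Counter

# Hypothetical catalog: internal data describing each title.
CATALOG = {
    "Space Saga": {"sci-fi", "adventure"},
    "Robot Dawn": {"sci-fi", "drama"},
    "Baking Battles": {"cooking", "reality"},
    "Deep Orbit": {"sci-fi", "thriller"},
}


def recommend(history, catalog, limit=2):
    """Suggest unwatched titles that share genres with the viewing history."""
    # Count how often each genre appears in what the customer already watched.
    genre_counts = Counter(genre for title in history for genre in catalog[title])
    # Score each unwatched title by the counts of the genres it carries.
    scores = {
        title: sum(genre_counts[genre] for genre in genres)
        for title, genres in catalog.items()
        if title not in history
    }
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda title: (-scores[title], title))
    return [title for title in ranked if scores[title] > 0][:limit]


suggestions = recommend(["Space Saga", "Robot Dawn"], CATALOG)
```

A customer who has watched two science fiction titles is offered the remaining science fiction title, while the cooking show, which shares no genre with the history, is left out.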
But the potential applications of data discovery go beyond retail consumers. One example is Advanced Scout, a data mining program developed by IBM and used by teams in the National Basketball Association (NBA). It looks for patterns in game data, which can be tied back to recordings of the games, to help coaches develop strategies and orchestrate plays.
As data discovery platforms advance and the technology becomes more affordable, more organizations will be able to use these tools to better understand their customers and deliver unique, customized offerings that improve commerce for everyone.