The Crucial Link Between AI and Good Data Management
Artificial intelligence can only be as smart as the data used to train it. That's why proper data management is essential: without it, there is no assurance that AI is being trained on high-quality data.
Artificial intelligence is unlike traditional software in one very important aspect: It has to learn how to do its job.
This provides a key benefit for product life cycles: instead of waiting for coding wizards to manually upgrade their creations once per year (or even less frequently), users get a system that can add new tools, create new features and otherwise alter itself to better satisfy their requirements. The downside, of course, is that few AI programs will provide top-flight performance right out of the box; only through continuous use will they come to understand what is expected of them and how best to achieve their objectives.
A key factor in this evolution is the data that AI-driven systems are exposed to. Good data, properly conditioned and placed in the right context, will allow services to make informed decisions and take appropriate actions, while bad data will lead to poor results and steadily diminishing performance.
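As a rough illustration of what "properly conditioned" can mean in practice, the sketch below counts two common problems, missing values and duplicate rows, before any records reach a training pipeline. The function name, field names and sample records are all hypothetical, and a real pipeline would check far more than this.

```python
from collections import Counter


def quality_report(records, required_fields):
    """Count basic quality problems in a list of dict records.

    Hypothetical helper for illustration only; assumes field values
    are simple scalars such as strings and numbers.
    """
    missing = Counter()
    seen = set()
    duplicates = 0
    for record in records:
        # A field counts as missing if it is absent, None, or an empty string.
        for field in required_fields:
            if record.get(field) in (None, ""):
                missing[field] += 1
        # Duplicates are detected by comparing the required fields only.
        key = tuple(record.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(records), "missing": dict(missing), "duplicates": duplicates}


# Made-up sample data: one repeated row, one blank region, one missing product.
records = [
    {"region": "north", "product": "A", "views": 120},
    {"region": "north", "product": "A", "views": 120},
    {"region": "", "product": "B", "views": 40},
    {"region": "south", "product": None, "views": 15},
]
report = quality_report(records, ["region", "product", "views"])
```

A report like this does not fix anything by itself, but it gives data teams a concrete signal about whether a data set is fit to train on.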
As an example, consider an AI-driven marketing strategy. A key data point might suggest increased interest in a particular product offering in a certain region or among a certain demographic. But if the data is based merely on webpage views or other superficial signals rather than deep-dive consumer surveys, significant time, money and other resources could be diverted from more productive projects to chase an opportunity that doesn’t exist.
Seeing the Problem
To date, enterprises have had only marginal success in managing data, particularly unstructured data. According to Corinium, 70 percent of IT and data management teams struggle to meet analytics needs, while nearly 40 percent have trouble maintaining good data quality, even though more than half are using cutting-edge hybrid and multi-cloud architectures for their data storage.
On the positive side, many organizations are starting to recognize the significance of the problem and are taking steps to address it. More than 90 percent of respondents say they will invest more than $1 million in new analytics initiatives in the coming year, with more than 60 percent employing hybrid, multi-cloud strategies to federate data across internal and external infrastructure.
One key problem still to overcome is the need to evolve beyond basic data collection and aggregation to more advanced models of context and relevance, says Informatica President Amit Walia. Only by parsing key metadata regarding technology, business, operations and usage can the enterprise foster the kind of “intelligent data” needed to train intelligent algorithms.
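To make the idea of layered metadata more concrete, here is a minimal sketch of a catalog entry that carries the four kinds of metadata mentioned above and reports which ones are still empty. The class, its fields and the sample values are assumptions made for illustration; they do not describe Informatica's products or any particular catalog tool.

```python
from dataclasses import dataclass, field

# The four metadata categories named in the text, as hypothetical field names.
FACETS = ("technical", "business", "operational", "usage")


@dataclass
class CatalogEntry:
    """Hypothetical record describing one data set and its metadata."""
    name: str
    technical: dict = field(default_factory=dict)    # e.g. format, schema
    business: dict = field(default_factory=dict)     # e.g. owner, definition
    operational: dict = field(default_factory=dict)  # e.g. refresh schedule
    usage: dict = field(default_factory=dict)        # e.g. who consumes it

    def missing_facets(self):
        # A facet counts as missing when nothing has been recorded for it.
        return [facet for facet in FACETS if not getattr(self, facet)]


# Made-up example: operational metadata has not been captured yet.
entry = CatalogEntry(
    name="web_page_views",
    technical={"format": "parquet", "schema_version": 3},
    business={"owner": "marketing", "definition": "page views per product per day"},
    usage={"consumers": ["demand_forecast_model"]},
)
gaps = entry.missing_facets()
```

In this sketch, `gaps` flags the data set as lacking operational context, which is the kind of hole that would need filling before the data could be considered "intelligent."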
But this is becoming harder to do as data volumes continue to explode. Somewhat ironically, many data analysis and management solutions are turning to the same AI and machine learning algorithms that empower the smart applications that end up consuming data and metadata. By making the entire process more intelligent, the enterprise can automate many of the rote functions that currently occupy the bulk of highly paid data scientists’ time, leaving them free to focus on more complex strategic objectives.
Data from Afar
One thing that every intelligent data management system will need is streamlined connectivity to and from the cloud. While wide-area networking is becoming increasingly fast, flexible and software-defined, it still lacks the fine-grained management tools to collate, process and transfer data at AI-friendly speeds. This is why NetApp and Nvidia have teamed up to unite the AFF A800 flash platform with the DGX supercomputer. The solution leverages NetApp’s Data Fabric to provide “edge to core to cloud” data control, giving analytics engines an accurate, up-to-date view of the entire distributed ecosystem and direct access to data no matter where it resides or what format it is in.
Retrieving data is only the first step, however. Improving the way the database ingests and interprets data can be equally effective. Pavel Bains, CEO of database decentralization firm Bluzelle, believes blockchain can make a major contribution in this regard by creating a universal data store that accommodates both structured and unstructured data. This would allow data management teams to provide the deep context that AI needs to quickly make sense of it all, while at the same time ensuring that critical data is not under the control of any one cloud provider. In this view, blockchain’s use of distributed, peer-to-peer storage nodes helps make data available virtually anywhere at high speed, while its immutable but open ledger approach maintains high integrity.
AI is a misnomer because it isn’t really intelligent. It cannot intrinsically differentiate between fact and fiction, good and bad, right and wrong. All it can do is consume massive amounts of data and look for patterns that fulfill its programming mandates. If the data is incorrect, or is interpreted incorrectly, the pattern will be skewed and the results will be faulty.
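A small arithmetic sketch shows how little it takes to skew a pattern. The figures below are invented: five weekly sales counts, with one value mis-keyed as 10000 instead of 100 in the second list. A simple average, standing in here for whatever pattern a model might learn, accepts the error without complaint.

```python
from statistics import mean

# Invented weekly sales figures for illustration.
clean_sales = [100, 102, 98, 101, 99]

# Same data, but the first value was mis-keyed as 10000 instead of 100.
dirty_sales = [10000, 102, 98, 101, 99]

clean_average = mean(clean_sales)
dirty_average = mean(dirty_sales)

# How many times larger the "learned" demand looks because of one bad entry.
inflation = dirty_average / clean_average
```

Here the clean average is 100, while the single bad entry pushes the average to 2080. Nothing in the calculation can tell which figure reflects reality; that judgment has to come from whoever collects and prepares the data.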
In this light, the real intelligence behind artificial intelligence lies where it always has: the human brain. Only through proper oversight in the collection and preparation of data will AI be able to deliver the greatest benefit to digital services and operations.
The smarter we are about data, the smarter our machines will become in the quest to achieve greater productivity.