In a new era of rapid technical advancement, people are getting serious about using deep learning systems.
A few years ago, we knew a good bit about deep learning as a science, but few people saw actual use cases for this type of technology. Fast-forward to today, and we have applications like ChatGPT (short for Chat Generative Pre-trained Transformer) and DALL-E from OpenAI (and other types of neural network engines, from chatbots to advanced recommendation engines) working with human operators across the web to make art, tell stories and do all kinds of cognitive work that would have been unthinkable a decade ago. (Also read: How Recommender Systems Are Changing E-Commerce.)
So what do these deep learning technologies really look like under the hood?
Here’s a tour of a few modern components of deep learning and machine learning programs.
The Multilayer Perceptron
This is one of the most fundamental components of deep learning systems. It exemplifies the essential structure of neural networks: digital constructs that loosely mimic the way the human brain works.
The multilayer perceptron has an input layer, one or more hidden layers and an output layer, and it uses feedforward passes and backpropagation to produce deep learning results.
We’ll talk about some specific flavors of neural networks a little later on, but the multilayer perceptron itself is often useful as a general-purpose algorithmic building block.
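To make the feedforward-and-backpropagation idea concrete, here is a minimal sketch of a multilayer perceptron in PyTorch. The layer sizes and batch shape are arbitrary assumptions for illustration, not a prescribed architecture:

    import torch
    import torch.nn as nn

    # A minimal multilayer perceptron: input layer -> hidden layer -> output layer.
    class MLP(nn.Module):
        def __init__(self, in_features=4, hidden=16, out_features=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_features, hidden),   # input -> hidden
                nn.ReLU(),                        # nonlinear activation
                nn.Linear(hidden, out_features),  # hidden -> output
            )

        def forward(self, x):
            return self.net(x)  # the feedforward pass

    model = MLP()
    x = torch.randn(8, 4)  # a toy batch of 8 examples with 4 features each
    loss = model(x).sum()  # placeholder loss, standing in for a real objective
    loss.backward()        # backpropagation fills in the gradients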
“MLPs (multilayer perceptrons) are global approximators and can be trained to implement any given nonlinear input-output mapping,” write Anke Meyer-Baese and Volker Schmid in a paper titled “Foundations of Neural Networks.” “In a subsequent testing phase, they prove their interpolation ability by generalizing even in sparse data space regions.”
So add that to the catalog of top deep learning components.
Deep Belief Networks
Here’s an example of a deep learning model designed to address certain kinds of neural network training challenges.
When you look at the basic definition of the deep belief network (DBN) on Wikipedia, you see that it’s a “generative graphical model … with layers of latent variables.”
Looking at use cases, you see that DBNs essentially solve certain kinds of training problems, such as when networks fail to converge the right way or need too much input training data.
Digging deeper, we see that engineers think of DBNs as based on things like spin-glass systems and Boltzmann machines, which were themselves built around mathematical models describing energy distributions.
Pavan Vadapalli at Upgrad describes the deep belief network as built from a “restricted version of Boltzmann machines,” better known as restricted Boltzmann machines (RBMs).
“Each subnetwork’s hidden layers will serve as the visible input layer for the network’s adjacent layer,” Vadapalli writes. “Thus, it makes the lowest visible layer a training set for the adjacent layer of the network. Hence, every layer of the network can be trained greedily and independently. Each layer of the deep structure utilizes hidden variables as observed variables for training each layer of the deep structure.” (Also read: Why Are GPUs Important for Deep Learning?)
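To see that greedy, layer-by-layer idea in action, here is a simplified sketch of stacked restricted Boltzmann machines in plain NumPy, assuming binary units and one-step contrastive divergence (CD-1); the layer sizes, learning rate and toy data are arbitrary choices for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class RBM:
        """A restricted Boltzmann machine trained with one-step contrastive divergence."""
        def __init__(self, n_visible, n_hidden, lr=0.1):
            self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
            self.b_v = np.zeros(n_visible)
            self.b_h = np.zeros(n_hidden)
            self.lr = lr

        def hidden_probs(self, v):
            return sigmoid(v @ self.W + self.b_h)

        def visible_probs(self, h):
            return sigmoid(h @ self.W.T + self.b_v)

        def cd1_step(self, v0):
            # Positive phase: hidden activations driven by the data.
            h0 = self.hidden_probs(v0)
            h_sample = (rng.random(h0.shape) < h0).astype(float)
            # Negative phase: one reconstruction step.
            v1 = self.visible_probs(h_sample)
            h1 = self.hidden_probs(v1)
            # Approximate gradient of the energy-based objective.
            self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
            self.b_v += self.lr * (v0 - v1).mean(axis=0)
            self.b_h += self.lr * (h0 - h1).mean(axis=0)

    # Greedy layer-wise training: each trained layer's hidden
    # activations become the "visible" training data for the next layer.
    data = (rng.random((256, 32)) < 0.5).astype(float)  # toy binary data
    layer_sizes = [32, 16, 8]
    layers = []
    x = data
    for n_v, n_h in zip(layer_sizes[:-1], layer_sizes[1:]):
        rbm = RBM(n_v, n_h)
        for _ in range(50):  # a few CD-1 passes per layer
            rbm.cd1_step(x)
        layers.append(rbm)
        x = rbm.hidden_probs(x)  # feed activations upward to the next layer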
So these types of evolved systems, built on past mathematical models, are genuinely useful in deep learning design.
Convolutional Neural Networks
Scientific descriptions of what a convolutional neural network does are prohibitively technical for most readers. Here’s what’s happening in a nutshell:
The CNN starts from the raw pixel layout of an image and abstracts it into components it can use to capture the style of similar images and match up image features. That might not sound incredibly appealing, but the result is the ability to generate all sorts of artistic visuals based on keyword prompts, which is definitely turning heads as end users start to see the actual output.
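As a rough illustration of that pixels-to-components pipeline, here is a minimal convolutional network sketch in PyTorch; the channel counts, 32x32 input size and 10 output classes are assumptions for the example, not a recommended design:

    import torch
    import torch.nn as nn

    # Convolution layers learn local pixel patterns (edges, textures),
    # pooling shrinks the feature maps, and a final linear layer maps
    # the abstracted features to class scores.
    cnn = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 3 color channels -> 16 feature maps
        nn.ReLU(),
        nn.MaxPool2d(2),                              # downsample by 2
        nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 16 -> 32 feature maps
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 10),                    # assumes 32x32 inputs and 10 classes
    )

    logits = cnn(torch.randn(1, 3, 32, 32))  # one fake 32x32 RGB image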
“Computer vision … enables computers and systems to derive meaningful information from digital images, videos and other visual inputs, and based on those inputs, it can take action,” writes an IBM researcher, noting use cases in marketing, healthcare and retail, to name a few. “This ability to provide recommendations distinguishes it from image recognition tasks.”
You could also say that CNNs are advancing computer vision to the extent that tomorrow’s robots will likely be able to make their way with digital “eyes” in much the same ways as humans. (Also read: 5 Defining Qualities of Robots.)
Recurrent Neural Networks
The simple way to describe the recurrent neural network (RNN) is that it gives neural networks “memory” by adding recurrent connections that feed a layer’s output back in as input at the next step.
That, then, leads to more capable predictions — for instance, a program that can generate text by predicting what should come after a word or letter in a sentence.
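Here is a bare-bones sketch of that next-character idea in PyTorch. The vocabulary and hidden sizes are assumptions for illustration, and a real system would train the weights first rather than use them freshly initialized:

    import torch
    import torch.nn as nn

    vocab_size, hidden_size = 128, 64  # assumed sizes for illustration

    embed = nn.Embedding(vocab_size, hidden_size)             # characters -> vectors
    rnn = nn.RNN(hidden_size, hidden_size, batch_first=True)  # hidden state = "memory"
    head = nn.Linear(hidden_size, vocab_size)                 # score the next character

    chars = torch.tensor([[72, 101, 108, 108]])  # ASCII codes for "Hell"
    out, _ = rnn(embed(chars))                   # hidden state after each character
    next_char_logits = head(out[:, -1])          # predict what follows, e.g. "o"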
Again, that dry description doesn’t quite capture what happens when you actually feed something into a program with this type of neural network engine and get the results.
A computer can now invite you to dinner or write you a poem, or present other text output that passes a high-level Turing test with flying colors. To keep users from getting confused, ChatGPT actually points out fairly routinely that it is an AI, not a human, and thus incapable of making value judgements, experiencing feelings or emotions, or really understanding human things like apologies. It’s unclear whether successive designs will have these kinds of caveats built in.
Conclusion
The above systems are the real application of a lot of deep learning technology: pretty soon we’re not going to be able to tell whether something we see on the internet (or hear on the phone, etc.) has been made by a human or by a computer.
And that’s just the first step in these kinds of deep learning models coming into maturity. With companies like NVIDIA actively pursuing innovations in AI hardware, we’re going to see a big shift in the market around training deep learning models. The “deep learning revolution” is transforming the world. (Also read: Is Deep Learning Just Neural Networks on Steroids?)