A foundation model is a deep learning model that has been pre-trained on extremely large data sets scraped from the public internet.
Unlike narrow artificial intelligence (narrow AI) models that are trained to perform a single task, foundation models are trained with a wide variety of data and can transfer knowledge from one task to another. This type of large-scale neural network can be trained once and then fine-tuned to complete different types of tasks.
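To make the "train once, fine-tune many times" idea concrete, here is a minimal sketch of fine-tuning a pre-trained model for a downstream text-classification task. It assumes the Hugging Face transformers and datasets libraries; the checkpoint and dataset names are illustrative choices, not anything specified in this article.

```python
# Minimal sketch of reusing a pre-trained model via fine-tuning.
# Assumes the Hugging Face transformers and datasets libraries are installed;
# the checkpoint ("distilbert-base-uncased") and dataset ("imdb") are illustrative.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import load_dataset

# Start from a model someone else already pre-trained at great expense...
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# ...and fine-tune it on a small, task-specific dataset (here: sentiment labels).
dataset = load_dataset("imdb", split="train[:1000]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()  # only this fine-tuning step runs on your own hardware
```

The expensive pre-training step never has to be repeated; only the comparatively small fine-tuning run is paid for by the downstream user.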
Foundation models can cost millions of dollars to create because they contain hundreds of billions of parameters and are trained on hundreds of gigabytes of data. Once trained, however, a foundation model can be fine-tuned an unlimited number of times to automate a wide variety of discrete tasks.
Today, foundation models are used to power artificial intelligence applications that rely on natural language processing (NLP) and natural language generation (NLG). Popular use cases include chatbots and conversational AI, machine translation, text summarization, and automated content generation.
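For a sense of how those use cases look in practice, the sketch below runs a few of them through the Hugging Face pipeline API; the library choice and the default pre-trained models it downloads are assumptions for illustration, not something the article prescribes.

```python
# A rough sketch of common NLP/NLG use cases, assuming the Hugging Face
# transformers library; each pipeline downloads a default pre-trained model.
from transformers import pipeline

summarizer = pipeline("summarization")            # text summarization
generator = pipeline("text-generation")           # content generation (NLG)
translator = pipeline("translation_en_to_fr")     # machine translation

print(summarizer("Foundation models are large neural networks pre-trained on "
                 "broad data and later fine-tuned for specific tasks.",
                 max_length=30, min_length=10)[0]["summary_text"])
print(generator("A foundation model is", max_new_tokens=20)[0]["generated_text"])
print(translator("Foundation models can be reused for many tasks.")[0]["translation_text"])
```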
Foundation models are expected to make AI projects easier and cheaper for large enterprise companies to execute. Instead of spending millions of dollars on high-performance cloud GPUs to train a machine learning model from scratch, companies can start with a model that has already been pre-trained and focus their attention (and budget) on fine-tuning it for specific tasks.
Critics of foundation models are concerned, however, that this type of customizable “large-scale-neural-network-in-a-can” uses so much data and contains so many deep learning layers that it is impossible for a human to understand how an amended model computed a specific output. This type of black box vulnerability leaves foundation models at risk of data poisoning attacks designed to spread misinformation or purposely introduce machine bias.
BLOOM (BigScience Large Open-science Open-access Multilingual Language Model) is an important foundation model created by volunteers from a community-driven machine learning (ML) platform called Hugging Face. The team of volunteers who created the model has shared details about what data it was trained on and what criteria were used to determine optimal performance.
The researchers hope that because BLOOM, an open-access large language model (LLM), performs as well as foundation models from OpenAI and Google, it will encourage AI adoption in many different types of applications beyond robotic process automation (RPA) and other types of narrow AI.
The BLOOM model, which contains 176 billion parameters and was trained for 11 weeks, is now available to the public and can be accessed through the Hugging Face website. BLOOM is fluent in 46 human languages and 13 programming languages.
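Because BLOOM's weights are published openly, anyone can load them with standard tooling. Below is a minimal sketch, assuming the Hugging Face transformers library; it uses the smaller bigscience/bloom-560m checkpoint for illustration, since the full 176-billion-parameter model requires far more memory than a typical workstation provides.

```python
# Minimal sketch of loading a BLOOM checkpoint from the Hugging Face Hub
# and generating text. "bigscience/bloom-560m" is a smaller variant used
# here for illustration; the full model is published as "bigscience/bloom".
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

# BLOOM is multilingual, so a non-English prompt works as well as an English one.
inputs = tokenizer("Un modèle de fondation est", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```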
Researchers at Stanford University’s Center for Research on Foundation Models (CRFM) are also studying how foundation models have the potential to speed AI adoption while also supporting the principles of responsible AI.
According to the CRFM website, a major focus of the research center is to develop rigorous principles for training and evaluating foundation models.
CRFM, an initiative of the Stanford Institute for Human-Centered Artificial Intelligence (HAI), hosted the Workshop on Foundation Models on August 23-24, 2021. The workshop convened experts and scholars with a diverse array of perspectives and backgrounds to discuss the opportunities, challenges, limitations, and societal impact of these emerging technologies.