What is a Small Language Model (SLM)?
A small language model (SLM) is a lightweight generative AI model. “Small” in this context refers to the scale of the model’s neural network: the number of parameters it contains and the volume of data it is trained on.
SLMs require less computational power and memory than large language models (LLMs). This makes them suitable for on-premises and on-device deployments.
Large language models, such as those behind ChatGPT and Google Bard, are resource-intensive. They have complex deep learning architectures, require vast amounts of training data and storage, and consume large amounts of electricity.
Until recently, these resource requirements served as barriers to entry and gave Big Tech a big advantage in the fast-moving artificial intelligence (AI) marketplace. The development of SLMs has begun to lower these barriers and allow startups and other small businesses to develop and deploy their own language models.
Benefits and Limitations
SLMs can be trained with relatively small datasets. Their simpler architectures are more explainable, and their small footprints allow them to be deployed on mobile devices.
One of the main advantages of small language models is that they can be designed to process data locally. This option is particularly important for Internet of Things (IoT) edge devices and businesses that need to comply with strict privacy and security policies.
Small language model deployment comes with a trade-off, however. Because SLMs are trained on smaller datasets, their knowledge bases are more limited than those of their LLM counterparts. They also tend to have a narrower understanding of language and context, which can lead to less accurate or less nuanced responses than larger models provide.
| Small Language Models | Large Language Models |
|---|---|
| Can have fewer than 15 million parameters. | Can have hundreds of billions of parameters. |
| Can run on mobile device processors. | Can require hundreds of GPUs. |
| Can handle simple tasks. | Can handle complex, diverse tasks. |
| Easier to deploy in resource-constrained environments. | Deployment often requires substantial infrastructure. |
| Can be trained in a week. | Training can take months. |
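To make the scale gap in the table concrete, a parameter count translates directly into memory: each weight stored in 16-bit floating point occupies two bytes. The sketch below uses the table's illustrative figures (15 million parameters for an SLM, and 175 billion as a GPT-3-scale LLM); the numbers are not tied to any specific model.

```python
# Back-of-the-envelope memory footprint for model weights in fp16 (2 bytes/parameter).
# Excludes activations, optimizer state, and KV cache, which add further overhead.

def weight_memory_bytes(num_params: int, bytes_per_param: int = 2) -> int:
    """Memory required just to store the model weights."""
    return num_params * bytes_per_param

slm_params = 15_000_000          # illustrative small language model
llm_params = 175_000_000_000     # illustrative GPT-3-scale large language model

print(f"SLM weights: {weight_memory_bytes(slm_params) / 1024**2:.0f} MiB")
print(f"LLM weights: {weight_memory_bytes(llm_params) / 1024**3:.0f} GiB")
```

At fp16 precision the small model's weights fit in tens of megabytes, comfortably within a phone's memory, while the large model's weights alone require hundreds of gigabytes, which is why multi-GPU servers are needed just to load them.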
Small Language Models vs. Specialized Language Models
The acronym SLM can be confusing because it can stand for “small language model” or “specialized language model.”
To add to the confusion, many smaller language models can also be characterized as specialized language models.
Specialized language models are specifically trained or fine-tuned for particular domains or tasks. This type of model is designed to perform well in a targeted area, which could be anything from legal jargon to medical diagnoses.
To avoid confusion, it’s important to remember that small models are characterized by:
- The number of parameters they use
- The size of their footprint
- The amount of data required to train them
Specialized models are characterized by their topic or domain.
Not all small language models are specialized, and many specialized models are quite large.
Examples of Small Language Models
- Orca 2: Microsoft developed Orca 2 by fine-tuning Meta’s Llama 2 with high-quality synthetic data. This approach allowed Microsoft to achieve performance levels that rival or surpass those of larger models, particularly in zero-shot reasoning tasks.
- Phi 2: Microsoft’s Phi 2 is a transformer-based SLM designed to be efficient and versatile in both cloud and edge deployments. According to Microsoft, Phi 2 demonstrates state-of-the-art performance for mathematical reasoning, common sense, language understanding, and logical reasoning.
- BERT Tiny, Mini, Small, and Medium: These are smaller versions of Google’s BERT model, scaled down to fit different resource constraints. They offer a range of sizes, from the Tiny with only 4.4 million parameters to the Medium with 41 million parameters.
- GPT-Neo and GPT-J: These open-source models from EleutherAI are smaller-scale alternatives to OpenAI’s GPT models.
- MobileBERT: As the name suggests, MobileBERT is a compact version of BERT designed for mobile devices.
- T5-Small: The Text-to-Text Transfer Transformer (T5) model from Google comes in various sizes. T5-Small is designed to strike a balance between performance and resource usage.
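Parameter counts like those quoted for the scaled-down BERT variants can be reproduced from their published configurations (BERT-Tiny: 2 encoder layers, hidden size 128; BERT-Medium: 8 layers, hidden size 512; both with a 30,522-token WordPiece vocabulary and a feed-forward size of 4× the hidden size). The following is a sketch of that arithmetic, with the configuration values treated as assumptions:

```python
# Estimate BERT parameter counts from the architecture configuration.
# Assumed config values: vocab 30,522; 512 max positions; 2 segment types;
# feed-forward width = 4 x hidden size (standard for BERT-family models).

def bert_param_count(layers: int, hidden: int,
                     vocab: int = 30_522, max_pos: int = 512) -> int:
    ffn = 4 * hidden
    # Embeddings: token + position + segment tables, plus one LayerNorm (gamma, beta).
    embeddings = (vocab + max_pos + 2) * hidden + 2 * hidden
    # Per encoder layer: Q/K/V/output projections, the feed-forward block,
    # and two LayerNorms.
    attention = 4 * (hidden * hidden + hidden)
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    layer = attention + feed_forward + 2 * (2 * hidden)
    # Pooler: one dense layer applied to the [CLS] token.
    pooler = hidden * hidden + hidden
    return embeddings + layers * layer + pooler

print(f"BERT-Tiny:   {bert_param_count(2, 128) / 1e6:.1f}M parameters")
print(f"BERT-Medium: {bert_param_count(8, 512) / 1e6:.1f}M parameters")
```

Running this yields roughly 4.4 million parameters for the Tiny configuration and about 41 million for Medium. Note that most of a tiny model's budget goes to the embedding table rather than the transformer layers, which is one reason small models understand context less richly than large ones.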