Exclusive: IBM Explains New ‘TinyTimeMixer’ AI


In the AI revolution, everyone wants to go big; we all like the biggest number. But some researchers think that sometimes less can be more, much more.

To prove it, the IBM Research team has gone not small but tiny with its new TinyTimeMixer (TTM) AI.

Techopedia sits down with Jayant Kalagnanam, Director of AI Applications at IBM, to learn how IBM’s newest artificial intelligence model does not need to play the parameter game to outperform large language models (LLMs) at time-series forecasting.

Key Takeaways

  • IBM’s TinyTimeMixer (TTM) AI outperforms larger models in time-series forecasting by focusing on features and algorithms instead of parameters.
  • TTM is a lightweight, pre-trained AI model that offers 40% improved accuracy with 300x reduced compute and latency.
  • Unlike large language models, TTM can run on traditional computers (rather than needing GPUs), making it accessible for various applications.
  • IBM uses TTM internally and with partners for tasks such as workload management, stock price prediction, and manufacturing optimization.
  • TTM shows the potential of smaller, targeted AI models to deliver superior results in specific domains.

From Influenza to Energy: How Time Series Forecasting Models Help The World

Time series forecasting models can be used to predict stock prices, manage demand and supply more efficiently, run energy or weather forecasts, and optimize supply chains, among other things.

A recent study published in BMC Public Health showed the potential that time series AI has in the field of health. The paper describes how new technologies combined with time series models can be used to forecast the influenza-like illness (ILI) rate in Hebei Province, China.

The information generated by these artificial intelligence models can provide more precise guidance for influenza prevention and control measures.


Last month, IBM Research presented a research paper on its Tiny AI. The paper shows that its TinyTimeMixer (TTM), the first-ever “tiny” pre-trained AI model for time-series forecasting, outperforms significantly larger models.

Kalagnanam from IBM told Techopedia that other areas where time series forecasting can have a massive positive impact include energy, sustainability, road accidents and traffic management, and weather and manufacturing.

“With all the changes that the energy industry is undergoing, including a changing climate and the rise of renewable energy sources, utilities need highly accurate forecasting models to predict and plan for day-by-day energy usage.”

Traditionally, energy companies forecast the energy load across their distribution network at a transformer level. These transformers are the conduits to transfer energy to each individual house or business that uses energy.

As more houses turn to solar or other renewable energy sources, time series models could help utilities more accurately forecast energy demand at the transformer level across their network.

“In general, for any application, a pre-trained model has the advantage of having learned temporal patterns that are more quickly able to be customized for a given application domain (often with only 5-15% of the data that would be required otherwise).”

Why Time Series Forecasting is Not Mainstream

If time-series forecasting models have such potential to automate and transform industries and sectors, why is the tech not more widely available?

Kalagnanam from IBM answered the question for us.

“It’s difficult to build a time series model due to the diverse nature and scarcity of publicly available pre-training data.

“Today’s time series models also tend to be very slow and large, coming in at about one billion parameters.”

This is where IBM’s TTM can help: it is super-lightweight, pre-trained, easy for organizations to fine-tune, and efficient right out of the box. It is also free to download, use, and modify.

The TTM model currently open-sourced by IBM on Hugging Face claims 40% better accuracy and a 300x reduction in compute and latency compared to larger models. The AI is so small that it can run fast inferences on ordinary computers or laptops instead of requiring GPUs.
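To make the task concrete, here is a minimal, self-contained sketch of the problem these models solve: given a fixed window of past observations (the context), predict the next several values (the horizon). The naive baseline below simply repeats the last observed daily cycle; it is a toy illustration of the task, not IBM’s API, and pre-trained models like TTM learn far richer temporal patterns than this.

```python
def naive_seasonal_forecast(context, horizon, season=24):
    """Forecast `horizon` future steps by repeating the last full season."""
    last_cycle = context[-season:]
    return [last_cycle[i % season] for i in range(horizon)]

# Synthetic hourly "energy load" series: higher during daytime, one week long.
series = [100 + (20 if 8 <= h % 24 <= 20 else 0) for h in range(168)]

# Forecast the next two days (48 hours) from the observed week.
forecast = naive_seasonal_forecast(series, horizon=48)
```

Even this trivial baseline captures the daily rhythm of the synthetic load; the gap between such baselines and learned models is exactly what forecasting benchmarks measure.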

It’s the polar opposite of the approach behind models like OpenAI‘s GPT-4, which is rumored to have 1.75 trillion parameters, while upcoming models like Llama 4 from Meta are expected to exceed this.

Zero-Shot and Pre-Trained

Like large language models, IBM’s TTM is zero-shot: it can perform tasks or answer questions on topics it hasn’t been specifically trained on. In simple terms, it will generate forecasts even when not given specific examples.

Zero-shot AI has seen an uptick of interest in the research community as it builds general pre-trained, or foundation, models for time-series forecasting that can successfully ‘generate forecasts from unseen datasets’.

Kalagnanam listed other notable achievements that IBM’s TTM now has under its belt.

“IBM Research has provided two fundamental advances:

“New architectures (patchTST, and patchTSMixer), which have demonstrated state-of-the-art benchmarks beating other models, thereby establishing the importance of these new classes of methods.

“Pre-trained models for time series domains are relatively new (mostly a 2024 phenomenon).”

How Tiny Models Can Deliver Superior Results in Time Series Forecasting

TTMs are closer to classic machine learning models than to large language AIs that combine many models. These smaller models get by with significantly fewer parameters because their strength, performance, and accuracy depend not on parameter count but on features and algorithms designed for very specific functions.

Specificity is what allows TTMs to be small yet incredibly effective in identifying patterns and predicting outcomes.

Techopedia asked Kalagnanam whether IBM had tested TTM, which is part of the company’s flagship family of Granite models.

“Yes, IBM and its partners are already using the model within various domains.”

Kalagnanam said the company is using the model internally for workload management on IBM Storage systems.

“The model helps teams more accurately predict storage workloads on a day-to-day basis and provision storage more appropriately,” Kalagnanam explained.

IBM has also been working with partners like QuantumStreetAI to predict stock price movement across various industries for investors.

“Using the tiny time mixer model and IBM’s Watsonx data and AI platform, QuantumStreetAI pulls in ESG (environmental, social, and governance) and sentiment signals from news, released reports, and other varied data sources to forecast stock price movements.”

Another application for TTMs is digital twin tech to optimize manufacturing processes — a use case that IBM has piloted across the cement, steel, food, and manufacturing industries.

“This TTM-based model is embedded in an optimization framework to provide set point recommendations to improve throughput and efficiency,” Kalagnanam said.

IBM’s TTM is not an accidental find. The company set out from the start to build the model this way. Kalagnanam shared with Techopedia the two questions that researchers at the company asked themselves before creating the Tiny AI.

“Can ‘tiny’ pre-trained models succeed in the time-series domain too? If so, can they outperform the zero/few-shot forecasting results of ‘large’ time-series pre-trained models demanding significant computational resources and runtime?”

The Bottom Line: The Hype vs. Responsible Use of AI

Not all news in the AI era is a flashy marketing play. But given that organizations and businesses everywhere are jumping on the AI bandwagon, often unnecessarily and only to capitalize on the hype, it’s important to hear the music through the AI-washing noise.

A tiny model that can outperform large models with specific use cases may not be as attractive to end customers as human-inspired large language models that can talk, generate images, and even create breathtaking videos.

However, these new AIs mark a return to the more targeted machine learning style that can unleash tremendous impact.

Ray Fernandez
Senior Technology Journalist

Ray is an independent journalist with 15 years of experience, focusing on the intersection of technology with various aspects of life and society. He joined Techopedia in 2023 after publishing in numerous media, including Microsoft, TechRepublic, Moonlock, Hackermoon, VentureBeat, Entrepreneur, and ServerWatch. He holds a degree in Journalism from Oxford Distance Learning, and two specializations from FUNIBER in Environmental Science and Oceanography. When Ray is not working, you can find him making music, playing sports, and traveling with his wife and three kids.