For the past few weeks, Llama 3 has been capturing all the attention of artificial intelligence (AI) enthusiasts.
But as we mentioned last week, whenever a contender to the crown threatens OpenAI’s dominance, the ChatGPT creator often follows its competitors’ big releases with releases of its own to regain momentum.
Perhaps that is why a mysterious, advanced model, referred to as “GPT-2” and tied to OpenAI, suddenly appeared in the wild, blew those who tried it away, and then vanished just as quickly.
This “mysterious” new GPT-2 large language model (LLM) emerged on chatbot benchmark site lmsys.org, and users were quick to identify it as equal to the current crop of AI models.
There is a mysterious new model called gpt2-chatbot accessible from a major LLM benchmarking site. No one knows who made it or what it is, but I have been playing with it a little and it appears to be in the same rough ability level as GPT-4. A mysterious GPT-4 class model? Neat! pic.twitter.com/1s2iEreaiT
— Ethan Mollick (@emollick) April 29, 2024
I was skeptical about the GPT2 chatbot, but it is undoubtedly more capable than opensource models and, in some cases, better than GPT4-turbo
But it is not better than Opus in my experience – curious to know what is behind it.
Also, about the gpt2-chatbot:
It does not have a… pic.twitter.com/CWPVrM48Ig— Denis Shiryaev 💙💛 (@literallydenis) April 29, 2024
What really got people talking was the fact that no one knew who produced it. This led to many users suggesting that OpenAI had released the chatbot as a preview of GPT-5.
Even OpenAI co-founder and CEO Sam Altman added fuel to the fire by cryptically tweeting, “I do have a soft spot for gpt2.”
i do have a soft spot for gpt2
— Sam Altman (@sama) April 30, 2024
As did OpenAI staff member Steven Heidel:
when gpt-2
— Steven Heidel (@stevenheidel) April 30, 2024
The chatbot generated so much interest that lmsys.org was forced to temporarily shut it down due to “unexpectedly high traffic & capacity limit.”
Thanks for the incredible enthusiasm from our community! We really didn't see this coming.
Just a couple of things to clear up:
– In line with our policy, we've worked with several model developers in the past to offer community access to unreleased models/checkpoints (e.g.,…
— lmsys.org (@lmsysorg) April 30, 2024
While the mystery of who released GPT-2 is yet to unravel, what is clear is that the release has somewhat stolen Llama 3’s thunder and has people talking about OpenAI again.
GPT-2: Who’s Behind the Curtain?
So who is the mysterious figure behind the curtain? Based on the information we have available, there’s no way to know for sure.
Michal Oglodek, CTO and co-founder of AI-powered communications platform Ivy.ai, told Techopedia:
“It’s hard to say definitively who’s behind ‘GPT-2 chatbot,’ but the fact that it appeared on LMSYS Chatbot Arena suggests it could be a collaboration with researchers or developers outside of the major AI companies.
“Whether it’s a solo endeavor or a sneak peek from a big player like OpenAI, it’s clear that the boundaries of AI development are constantly being pushed.”
Either way, GPT-2 has been a win for OpenAI due to the simple fact that it has taken momentum away from Meta and the open-source Llama 3.
We’ve already seen OpenAI trying to follow up the release of Llama 3 with product updates such as expanding its enterprise API offering and the release of a new feature called Memory, which enables ChatGPT to remember the preferences of Plus subscribers.
It’s worth noting that Sam Altman also made some interesting comments during a talk at Stanford University this week, where he downplayed the capabilities of GPT-4, saying “GPT-4 is the dumbest model any of you will ever have to use again.”
He also spoke about the organization’s willingness and ability to invest in AI, stating “I don’t care if we burn $50 billion a year, we’re building Artificial General Intelligence (AGI) and it’s going to be worth it.”
Reactions to GPT-2 So Far
AI Breakfast speculated to 167,000+ followers on X, saying: “Most likely explanation for GPT-2 chatbot: OpenAI has been working on a more efficient method for fine-tuning language models, and they managed to get GPT-2, a 1.5B parameter model, to perform pretty damn close to GPT-4, which is an order of magnitude larger and more costly to train/run.”
Most likely explanation for gpt2-chatbot:
OpenAI has been working on a more efficient method for fine-tuning language models, and they managed to get GPT-2, a 1.5B parameter model, to perform pretty damn close to GPT-4, which is an order of magnitude larger and more costly to…
— AI Breakfast (@AiBreakfast) April 30, 2024
Brian Roemmele, a prompt engineer and editor of ReadMultiplex.com, also shared some positive feedback on the model. “I have been testing GPT-2 chatbot for a few days. Today it seems to have gotten much more attention.
“It surpassed all of our ChatGPT-4 benchmarks.”
I have been testing gpt2-chatbot for a few days. Today it seems to have gotten much more attention.
It surpassed all of our ChatGPT-4 benchmarks.
Hypothesis: A few of us have concluded it is a form of pre-lobotomized ChatGPT-4 or trained heavily on it. https://t.co/KKLHmPVYnf
— Brian Roemmele (@BrianRoemmele) April 29, 2024
It's an advance over GPT-4, for sure. Wildly better at coding, shockingly better at stats proofs/identities. Very slow and limited (I got 5k tokens out and it crashed).
— Chuck Smith (@cesmithjr) April 29, 2024
Harrison Kinsley, founder of nnfs.io, was similarly impressed, posting “I have found gpt-2 chatbot to so far be exceptional.”
I have found gpt2-chatbot to so far be exceptional.
If you've not tried it, you can for free at https://t.co/GxIBS7itiB then go to the direct chat tab or arena and choose gpt2-chatbot.
Looking forward to learning more about this one.
— Harrison Kinsley (@Sentdex) April 30, 2024
The Bottom Line
GPT-2 has turned a lot of heads and got a lot of people talking. For now, its creators remain a mystery, but either way it has people hyped up about OpenAI and GPT-5 again.
Is OpenAI behind GPT-2? Or is the largest player in AI just admiring from afar? Is GPT-2, as some commentators speculate, not so much an evolution of OpenAI’s current models, but a different kind of model the company is trying out?
We expect it to appear in the wild again, this time with more information and more time for testing.
If there’s a lesson to be learned, it’s that sometimes a little bit of drama is as important for increasing hype as simply enhancing performance.