Why Grok 2 Can’t Stop Winning

When xAI launched Grok in November 2023, there was a lot of hype, but it quickly became clear that the chatbot had a long way to go before it could catch up with key players like ChatGPT and Google’s Bard (now Gemini). Grok-2 looks to be the answer.

xAI’s new and improved large language model (LLM) can draw on real-time events via X (formerly Twitter) and outperforms Claude and GPT-4 on the LMSYS chatbot leaderboard. It also allows users to create high-quality generative AI images.

These factors, combined with the one-man PR machine that is Elon Musk, make Grok one of the most important models on the market today. Let’s dive in.

Key Takeaways

  • xAI’s Grok 2 language model has become one of the most important LLMs on the market.
  • Grok 2 outperforms Claude and GPT-4 on the LMSYS leaderboard overall.
  • The model’s relaxed content moderation aligns with Elon Musk’s free speech principles but raises concerns about potential misuse.
  • It remains unclear how xAI curates posts on X in Grok 2’s training data.
  • Our verdict: The model is still playing catch-up with GPT-4o and Gemini 1.5 Pro but is, in many respects, top of the class.

The State of Grok 2 in 2024

While Grok 2’s performance improvements have largely closed the gap with competitors like OpenAI and Google, arguably the biggest win for the LLM has come in its transition to a multimodal model.

Originally, Grok was limited to text-to-text interactions, but the ability to create high-quality images has made it a much more versatile tool, capable of taking on competitors like ChatGPT (using DALL-E 3).

Users on X have been extremely enthusiastic about the range of images that Grok 2 can generate, and about its ability to combine with tools like Runway Gen-3 Alpha or Kling AI to create lifelike animated videos.

Its integration with X has also paid dividends, giving the model the ability to summarize news based on recent posts and to reference tweets used to generate outputs.

However, what has really made Grok 2 a standout is its different approach to moderation. Grok 2’s text and image generation are moderated far less heavily than the tools offered by providers like OpenAI and Google.

This is a controversial area, one we often criticize, but for the moment, we will consider the advantages in a vacuum.

Providers like Google, in particular, have been heavily criticized for what critics describe as a left-leaning bias, which is likely to have alienated some users. Grok 2’s humorous and laissez-faire approach to content moderation potentially makes it a less biased chatbot than language models like Gemini that are seen as more politically focused.

Grok 2’s position in the market as a free speech-friendly LLM is further bolstered by Elon Musk’s widely publicized personal beliefs on freedom of speech and his commitment to reducing censorship on X (an approach that does garner criticism for enabling toxicity on the platform).

At the same time, more relaxed content moderation gives users more freedom: Grok will answer questions and generate images that other language models avoid.

That said, this opens the door for users to create more controversial and potentially unethical content, such as using the Grok 2 image generator to create deepfakes of public figures committing crimes.

Currently, Grok-2 and Grok-2 mini are available in beta on X to Premium subscribers, and xAI has also unveiled plans to make both models available via its enterprise API.

How Grok 2 is Closing the Gap

According to the announcement blog post, xAI evaluates model performance through the use of AI tutors.

AI tutors analyze responses generated by Grok and select the best response based on criteria outlined in the company’s guidelines. Models are evaluated in two main areas: their ability to follow instructions and their ability to provide accurate factual information.
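To make the idea concrete, here is a minimal sketch of how "pick the best response" judgments could be tallied into per-criterion win rates. xAI has not published its evaluation tooling, so everything here is an assumption for illustration: the `win_rates` function, the criterion labels, the placeholder model names, and the sample judgments are all hypothetical.

```python
from collections import Counter, defaultdict


def win_rates(judgments):
    """Turn (criterion, winning_model) picks into per-criterion win rates.

    Hypothetical helper: each judgment records which model's response
    a tutor selected as best for one prompt under one criterion.
    """
    per_criterion = defaultdict(Counter)
    for criterion, winner in judgments:
        per_criterion[criterion][winner] += 1

    return {
        criterion: {
            model: picks / sum(counts.values())
            for model, picks in counts.items()
        }
        for criterion, counts in per_criterion.items()
    }


# Made-up sample data, not real evaluation results.
judgments = [
    ("instruction_following", "model_a"),
    ("instruction_following", "model_a"),
    ("instruction_following", "model_b"),
    ("factuality", "model_b"),
    ("factuality", "model_a"),
    ("factuality", "model_a"),
    ("factuality", "model_a"),
]

rates = win_rates(judgments)
for criterion, by_model in rates.items():
    for model, rate in sorted(by_model.items()):
        print(f"{criterion}: {model} chosen {rate:.0%} of the time")
```

Keeping the two criteria separate mirrors the two evaluation areas described above; a real pipeline would also need to handle ties, multiple tutors per prompt, and disagreement between tutors.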

The company notes that thanks to this approach, “Grok-2 has shown significant improvements in reasoning with retrieved content and its tool use capabilities, such as correctly identifying missing information, reasoning through sequences of events, and discarding irrelevant posts.”

From an outside perspective, this type of evaluation seems to offer some quality control over Grok-2’s outputs, but xAI still needs to address misinformation.

Earlier this month, election officials in Michigan, Minnesota, New Mexico, Pennsylvania, and Washington sent a letter to Musk warning that Grok produced false information about state ballot deadlines after Biden dropped out of the presidential race.

Where Grok Still Falls Short

Despite its successes, Grok isn’t perfect. It’s only available to X Premium subscribers at the moment, and it still has all the fundamental shortcomings most LLMs have, such as a tendency to hallucinate and generate misinformation. It also still falls slightly behind GPT-4o and Gemini 1.5 Pro in terms of performance.

Another key shortcoming is that it doesn’t offer voice inputs in the way that models like GPT-4o and Gemini do. This means that users have to enter their inputs via text, which isn’t the end of the world, but does hold back the user experience to some degree.

It is also unclear how inputs from X are being curated under the hood. X has a reputation for toxicity, and there is the potential that some of the toxicity and bias in user posts could seep through into Grok itself (though this is also true of OpenAI’s decision to integrate ChatGPT with Reddit).

Although these concerns could be partially mitigated by training the model on content produced by verified accounts with lots of impressions and engagement, they are unlikely to be removed completely.

How to Use Grok 2

If you want to use Grok 2, you need an X Premium account. X offers localized pricing starting at $3 per month or $32 per year, and publishes a full list of pricing by country.
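For a quick sense of how the two starting prices compare, the arithmetic can be sketched as follows. The figures are the $3 and $32 starting prices quoted above; localized prices vary by country, and the `annual_saving` helper is a hypothetical name used only for this illustration.

```python
def annual_saving(monthly_usd, annual_usd):
    """Compare 12 monthly payments against one annual payment.

    Returns the absolute saving in dollars and the saving as a
    percentage of the 12-month total.
    """
    twelve_months = monthly_usd * 12
    saving = twelve_months - annual_usd
    return saving, saving / twelve_months * 100


# Starting prices quoted in the article; your local price may differ.
saving, saving_pct = annual_saving(monthly_usd=3.00, annual_usd=32.00)
print(f"Paying annually saves ${saving:.2f} ({saving_pct:.1f}%) over 12 monthly payments")
```

At the quoted starting prices, twelve monthly payments come to $36, so the annual plan works out $4 cheaper, or roughly 11%.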

Once you have an X Premium subscription, you can access the chatbot by clicking the Grok tab on the X platform.

To begin using Grok, simply enter your written prompt into the text box and press enter. Here are some ideas to get you started:

  • Ask a question about a topic – When was [X]?, Who invented [X]?, What is [X]?
  • Produce content – Write an article or blog post on [INSERT TOPIC]
  • Create images – Create an image of [INSERT SUBJECT]
  • Ask for a joke – Tell me a joke about [INSERT TOPIC]

If you’re unsure what to ask Grok, you will find some conversations below the prompt bar, which you can click on to automatically enter a prompt into the chatbot.

The Bottom Line

Grok 2 has quickly become a go-to LLM for many users. With improved multimodal capabilities and image creation, xAI has all the critical ingredients necessary to snatch up users who are tired of OpenAI’s or Google’s approach to content moderation.

Tim Keary
Technology Specialist

Tim Keary is a freelance technology writer and reporter covering AI, cybersecurity, and enterprise technology. Before joining Techopedia full-time in 2023, his work appeared on VentureBeat, Forbes Advisor, and other notable technology platforms, where he covered the latest trends and innovations in technology.