xAI Releases Grok-2 Model With Image Generation

Key Takeaways

  • xAI has announced the next generation of its Grok AI model.
  • The most significant update is the ability to generate images from text.
  • Grok-2 and Grok-2 mini are available in beta for X Premium+ and Premium users.

The Grok-2 model from xAI includes new image generation features for all X Premium subscribers.

Elon Musk’s X is currently facing controversy for using posts on the platform to train its AI models without users’ consent. However, that hasn’t stopped his AI startup, xAI, from releasing the next version of the Grok AI model, which brings image generation as a new core feature, along with a redesigned chat interface, to all X Premium subscribers.

Earlier today, xAI rolled out its next-generation models, Grok-2 and Grok-2 mini, in beta. These models are now available to users in X’s paid tiers, though it appears that only Premium+ users can try the full-fledged model while the mid-tier is restricted to the “mini” version. xAI said the newer models show massive improvements over the older Grok-1.5 model and perform well against competitor models “in areas such as graduate-level science knowledge (GPQA), general knowledge (MMLU, MMLU-Pro), and math competition problems (MATH).”

Although xAI has not disclosed Grok-2’s parameter count, it can be expected to surpass Grok-1’s 314 billion parameters. In the announcement, the company claims the test version of Grok-2 outperforms OpenAI’s GPT-4-Turbo and Anthropic’s Claude 3.5 Sonnet on the LMSYS leaderboard, a crowdsourced platform for ranking chatbots, though we couldn’t verify this on the live leaderboard ourselves.

Realistic Images and No Problem With Text

The image generation functionality is powered by Flux, a newly introduced AI image generation model from Black Forest Labs that is known for its realistic imagery.

Image generated with Grok-2 mini, using the prompt: “Cinematic medium angle eye level shot of dancers at a party, exhilarated, energetic, 1920s aesthetics.” Credits: Tushar Mehta/Techopedia

Like other text-to-image models, Flux is able to replicate the texture of human skin. Where it appears to stand out is its ability to render text as instructed, something other AI models struggle with.

Image generated with Grok-2 mini, using the prompt: “young man with wavy hair, 1940s fashion, holding a banner that says ‘End The WAR’.” Credits: Tushar Mehta/Techopedia

In addition to generating images, Grok is also said to be able to interpret attached image files. This capability was previously teased with the release of the Grok-1.5 Vision model, though it has yet to be included in the chatbot for users. xAI is also releasing the new models to developers, who can incorporate them into their apps and platforms using APIs.