What is Google Gemini (Gemini AI)?
Google Gemini (Gemini AI) is an integrated suite of large language models (LLMs) that Google DeepMind designed from the beginning to be multimodal. The integrated suite can process text, images, code, and audio through a single user interface (UI).
In December 2023, Gemini replaced PaLM 2, the LLM that powered Google Bard. In February 2024, Google announced that Bard would be renamed Gemini.
Techopedia Explains the Google Gemini Meaning
Google Gemini AI definitions often position Gemini LLMs as a family of powerful AI assistants. The term “assistant” implies that Google views Gemini as an augmented intelligence tool that’s designed to help users with various tasks, not replace human workers.
How Google Gemini Got Its Name
Some media outlets have reported that Gemini stands for “Generalized Multimodal Intelligence Network Interface,” but that information could not be confirmed.
According to Google Bard, it’s more likely that Google developers named the integrated LLM suite after the constellation Gemini and the ancient Greek myth of Castor and Pollux that inspired the zodiac sign. When prompted, Google Gemini agreed and pointed out that this aligns with Google’s history of using astronomical themes in product naming.
How Gemini Works
Gemini AI models are rumored to use the Google Pathways architecture. In this type of AI architecture, a series of modular machine learning (ML) models are initially taught how to perform a specific task. Once trained, the modules are connected to form a network.
The networked modules can work independently, or they can work together to generate different types of outputs. On the back end, encoders convert different types of data into a common language, and decoders generate outputs in different modalities based on the encoded inputs and the task at hand.
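The encoder and decoder roles described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names, the four-number shared representation, and the hand-written encoding rules are hypothetical and do not reflect how Gemini is actually built. The sketch only shows the general idea that each type of input is first mapped into one common representation, and that a task-specific decoder then works from that representation regardless of which modality produced it.

```python
from typing import Callable, Dict, List

Vector = List[float]
DIM = 4  # size of the toy shared representation (arbitrary assumption)


def encode_text(text: str) -> Vector:
    """Toy text encoder: fold character codes into a fixed-size vector."""
    vec = [0.0] * DIM
    for i, ch in enumerate(text):
        vec[i % DIM] += ord(ch) / 1000.0
    return vec


def encode_image(pixels: List[int]) -> Vector:
    """Toy image encoder: fold 0-255 pixel values into the same vector size."""
    vec = [0.0] * DIM
    for i, value in enumerate(pixels):
        vec[i % DIM] += value / 255.0
    return vec


def decode_to_text(vec: Vector) -> str:
    """Toy text decoder: render the shared vector as a string."""
    return "vector(" + ", ".join(f"{x:.2f}" for x in vec) + ")"


def decode_to_label(vec: Vector) -> int:
    """Toy classifier decoder: return the index of the largest component."""
    return max(range(len(vec)), key=lambda i: vec[i])


# Hypothetical registries: one encoder per input modality, one decoder per task.
ENCODERS: Dict[str, Callable[..., Vector]] = {
    "text": encode_text,
    "image": encode_image,
}
DECODERS: Dict[str, Callable[[Vector], object]] = {
    "text": decode_to_text,
    "label": decode_to_label,
}


def run(modality: str, payload, task: str):
    """Encode the payload into the shared space, then decode it for the task."""
    shared = ENCODERS[modality](payload)
    return DECODERS[task](shared)


if __name__ == "__main__":
    print(run("text", "hello", "text"))           # text in, text out
    print(run("image", [255, 0, 0, 0], "label"))  # image in, label out
```

Because both encoders emit vectors of the same size, either decoder can be paired with either input type. That is the property the "common language" description refers to.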
Google has acknowledged that the models are faster when they are run on Google Tensor Processing Units (TPUs).
A user-friendly interface hides the complexities of the Gemini architecture and makes it possible for people with different skill levels to use Gemini models for generative AI purposes.
What Can Gemini Do?
It’s important to note that Google Gemini is continually evolving, and model capabilities are always expanding. For example, early versions of the free web-based models could interpret uploaded images, but they could not generate images from prompts.
Today, the free version of Gemini can be used to generate text in a variety of formats, translate languages, answer factual questions, summarize information on web pages, explain programming concepts, generate new code, and suggest improvements for code snippets.
The product names for different Gemini model clusters also seem to be continually evolving. Currently, the smallest version of the Gemini model family is called Gemini Nano. It is a lightweight version of Gemini that can run on Android devices, starting with the Google Pixel 8 Pro and the Samsung Galaxy S24 series.
Google Gemini Ecosystem
According to Sundar Pichai, CEO of Google and Alphabet, “Gemini will support an entire ecosystem – from the products that billions of people use every day, to the APIs and platforms helping developers and businesses innovate.”
Until Google standardizes descriptions for the Gemini chatbot and product integration options, users can get the latest information by visiting Google’s landing page for Gemini Updates.
How Gemini AI is Trained
Gemini LLMs are alleged to have been trained with a combination of techniques.
Some industry experts have speculated that Google relied heavily on reinforcement learning from human feedback (RLHF) to train Gemini modules on Cloud TPU v5e chips. According to Google, TPUs have five times more computational power than the chips used to train ChatGPT.
As of yet, Google has not released any detailed information about the datasets that Gemini AI models were trained on. It is likely, however, that Google engineers used the LangChain framework and repurposed data they used to train PaLM 2.
If this is the case, then Gemini foundation models would have initially been trained on data from web documents, books, code, images, audio, and video. It remains to be seen whether Google DeepMind’s holistic approach to training AI assistants will be as effective as OpenAI’s approach, which has been to add new modes iteratively.
Free and Paid Subscription Models
Desktop users can access the free version of Gemini through a web browser. Mobile users have the option of using the free version, which is currently being called Gemini Pro, by installing the Gemini app on Android devices or the Google app on iOS devices.
Gemini Advanced is a paid version of Gemini that extends the capabilities of the free version for $19.99/month. The landing page for Gemini Advanced refers to the model as 1.0 Ultra. It’s not clear whether DeepMind is using Gemini Advanced subscribers to beta test enterprise versions of Gemini – or whether Gemini Advanced will eventually be called Gemini Ultra.
Google Workspace customers can currently subscribe to Gemini Business or Gemini Enterprise to access 1.0 Ultra. Gemini Business costs $20 per user/month and requires a one-year commitment. It provides users with enterprise-level security and privacy and is designed to meet the needs of most business users.
Gemini Enterprise costs $30 per user/month and also requires a one-year commitment. The enterprise subscription provides everything Gemini Business offers, as well as advanced translation capabilities for meetings and full access to and use of Gemini.
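Because both Workspace plans are priced per user per month with a one-year commitment, the total commitment is easiest to compare as an annual figure. The sketch below works through that arithmetic using only the prices quoted in this article. The function and constant names are hypothetical, and the calculation assumes no discounts, taxes, or mid-year changes in the number of users.

```python
MONTHS_PER_YEAR = 12

# Per-user monthly prices in US dollars, as quoted in this article.
PRICE_PER_USER_MONTH = {"business": 20, "enterprise": 30}


def annual_cost(plan: str, users: int) -> int:
    """Total cost in dollars of a one-year commitment for a team."""
    if users < 1:
        raise ValueError("users must be at least 1")
    return PRICE_PER_USER_MONTH[plan] * users * MONTHS_PER_YEAR


if __name__ == "__main__":
    for plan in PRICE_PER_USER_MONTH:
        print(plan, annual_cost(plan, 10))
```

For a team of 10, this gives $2,400 per year for Gemini Business and $3,600 per year for Gemini Enterprise.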
Gemini vs. GPT-4
Gemini and GPT-4 are often used together because each family of models has different strengths. For example, ChatGPT Plus excels at summarizing topics and writing code, while Gemini Advanced is better at creative writing and adjusting the tone of text outputs. If you need help with a creative writing project, Gemini might be a better choice. But if you’re writing non-fiction or analyzing code, GPT-4 might be more suitable.
Another consideration is that Gemini can access the Internet. This means that Gemini can incorporate more recent knowledge in its responses than GPT-4.
Google Gemini Pros and Cons
One of the biggest advantages of Gemini is that Google is integrating this family of multimodal AI models into other Google products and services. This means that users will be able to access Gemini’s capabilities within familiar Google tools like Search, Gmail, and Docs, without needing to switch between different apps.
One of the biggest disadvantages of Gemini is that it can sometimes provide responses that are overly confident, even when information outputs are incorrect.
References
- Google DeepMind Gemini – Dr Alan D. Thompson – Life Architect (Lifearchitect)
- Introducing Pathways: A next-generation AI architecture (Blog)
- Tensor Processing Units (TPUs) | Google Cloud (Cloud.google)
- Get started with Gemini Nano on Android (on-device) | Google AI for Developers (Ai.google)
- Store.google (Store.google)
- Gemini Apps’ release updates & improvements (Gemini.google)
- Announcing Cloud TPU v5e and A3 GPUs in GA | Google Cloud Blog (Cloud.google)
- Generative AI applications with Vertex AI PaLM 2 Models and LangChain | Google Cloud Blog (Cloud.google)
- Gemini Advanced – get access to Google’s most capable AI model, 1.0 Ultra (Gemini.google)
- Gemini for Google Workspace | Gen AI Tools for Business (Workspace.google)