What is Google Gemini?
Google Gemini, or Gemini AI, is an integrated suite of large language models (LLMs) currently being developed by Google DeepMind. According to Google CEO Sundar Pichai, Gemini’s foundation models were designed from the beginning to be multimodal.
This means users will be able to process and generate text, images, code, and audio content through a single user interface (UI).
Gemini is currently being beta-tested by a select group of developers at a small number of companies. It’s expected that Gemini will replace PaLM 2, the LLM that currently powers Google Bard, by the end of 2023.
Google Gemini Features
Zoubin Ghahramani, the Vice President of Google DeepMind, said that Gemini will be available in the same four sizes as PaLM 2: Gecko, Otter, Bison, and Unicorn.
- Gecko is expected to be lightweight and ideal for use on mobile devices.
- Otter is designed to be more powerful than Gecko. It is expected to be suitable for a wide range of unimodal tasks.
- Bison is designed to be larger and more versatile than Otter. It is likely to be suitable for a limited number of multimodal tasks and is expected to compete with GPT-4 for market share.
- Unicorn is designed to be the largest, most powerful, and most versatile Gemini size. It is expected to be suitable for a wide range of multimodal tasks and to go far beyond the capabilities of ChatGPT or any of its competitors.
How Gemini AI Works
Gemini is likely to use the Google Pathways architecture. In this type of AI architecture, a series of modular machine learning (ML) models is first trained, with each module learning to perform a specific task. Once trained, the modules are connected to form a network.
The networked modules can work independently, or they can work together to generate different types of outputs. On the back end, encoders convert different types of data into a common language, and decoders generate outputs in different modalities based on the encoded inputs and the task at hand.
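The encoder and decoder flow described above can be illustrated with a toy sketch. This is not Gemini or Pathways code: the function names, the two-number "shared representation", and the routing tables are all hypothetical stand-ins, chosen only to show how modality-specific encoders can feed a common format that any decoder can read.

```python
from typing import Callable, Dict, List

Vector = List[float]  # stand-in for the shared "common language"


def encode_text(text: str) -> Vector:
    # Hypothetical text encoder: simple character statistics
    # stand in for a learned embedding.
    codes = [ord(c) for c in text] or [0]
    return [float(len(codes)), sum(codes) / len(codes)]


def encode_image(pixels: List[List[int]]) -> Vector:
    # Hypothetical image encoder: pixel count and mean brightness.
    flat = [p for row in pixels for p in row] or [0]
    return [float(len(flat)), sum(flat) / len(flat)]


def decode_summary(vec: Vector) -> str:
    # Decoder for a "describe the input" task.
    return f"{int(vec[0])} units, mean value {vec[1]:.1f}"


def decode_label(vec: Vector) -> str:
    # Decoder for a "classify the input" task.
    return "bright" if vec[1] >= 128 else "dark"


ENCODERS: Dict[str, Callable[..., Vector]] = {
    "text": encode_text,
    "image": encode_image,
}
DECODERS: Dict[str, Callable[[Vector], str]] = {
    "summary": decode_summary,
    "label": decode_label,
}


def run(modality: str, data, task: str) -> str:
    # Any encoder can be paired with any decoder because both
    # sides agree on the shared vector format.
    shared = ENCODERS[modality](data)
    return DECODERS[task](shared)


print(run("text", "hi", "summary"))
print(run("image", [[0, 255], [255, 255]], "label"))
```

In a real system the encoders and decoders would be trained neural networks and the shared representation would be a high-dimensional embedding, but the routing idea is the same: modules stay independent and are combined according to the input type and the task.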
It’s expected that Google will use Duet AI as the front end for Gemini. This user-friendly interface will hide the complexities of the Gemini architecture and make it possible for people with different skill levels to use Gemini models for generative AI purposes.
How Gemini AI is Trained
Gemini LLMs are reported to have been trained with a combination of the following techniques:
- Supervised learning: Gemini AI modules were trained to predict outputs for new data by using patterns learned from labeled training data.
- Unsupervised learning: Gemini AI modules were trained to autonomously discover patterns, structures, or relationships within data without the need for labeled examples.
- Reinforcement learning: Gemini AI modules improved their decision-making strategies iteratively through a trial-and-error process that taught modules to maximize rewards and minimize penalties.
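To make the difference between the three techniques concrete, here is a minimal sketch of each one on toy data. It is purely illustrative and says nothing about how Gemini was actually trained: the function names (fit_line, two_means, bandit) and the data are assumptions made for this example, and each function is a textbook method (least-squares regression, k-means with two clusters, and an epsilon-greedy bandit).

```python
import random
from typing import List, Tuple


def fit_line(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Supervised: least-squares fit of y = a*x + b from labeled (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    a = cov / var
    return a, mean_y - a * mean_x


def two_means(values: List[float], steps: int = 10) -> Tuple[float, float]:
    """Unsupervised: 1-D k-means with two clusters; no labels are used."""
    lo, hi = min(values), max(values)
    for _ in range(steps):
        left = [v for v in values if abs(v - lo) <= abs(v - hi)]
        right = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo = sum(left) / len(left)
        if right:
            hi = sum(right) / len(right)
    return lo, hi


def bandit(rewards: List[float], episodes: int = 200,
           epsilon: float = 0.1, seed: int = 0) -> int:
    """Reinforcement: epsilon-greedy trial and error over fixed-reward actions.

    Negative rewards play the role of penalties. Returns the index of the
    action with the highest estimated reward.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)
    counts = [0] * len(rewards)
    for step in range(episodes):
        if step < len(rewards):
            action = step  # try every action once first
        elif rng.random() < epsilon:
            action = rng.randrange(len(rewards))  # explore
        else:
            action = max(range(len(rewards)), key=lambda i: estimates[i])  # exploit
        counts[action] += 1
        # Running average of the rewards observed for this action.
        estimates[action] += (rewards[action] - estimates[action]) / counts[action]
    return max(range(len(rewards)), key=lambda i: estimates[i])


print(fit_line([(0, 1), (1, 3), (2, 5)]))   # learns from labeled pairs
print(two_means([1, 2, 3, 10, 11, 12]))     # finds two groups without labels
print(bandit([-1.0, 0.5, 1.0]))             # learns which action pays best
```

The supervised function needs the correct answer (y) for every input, the unsupervised function sees only raw values, and the reinforcement function sees only the reward or penalty that follows each action it tries.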
Some industry experts have speculated that Google relied heavily on reinforcement learning from human feedback (RLHF) to train Gemini modules on Cloud TPU v5e chips. According to Google, TPUs have five times more computational power than the chips used to train ChatGPT.
Google has not yet released any specific information about the datasets that Gemini AI was trained on. It is likely, however, that Google engineers used the LangChain framework and repurposed data they recently used to train PaLM 2.
This data came from a variety of sources, including books and articles, code repositories, websites, video and podcast transcripts, social media posts, and internal Google data.
Google Gemini Release Date
The release date and final capabilities of Gemini AI are still unknown. What has been confirmed, however, is that Google has given a limited number of developers at a small number of companies early access to Gemini.
This suggests Gemini could be ready for release and integration into Google Cloud Vertex AI services by the end of 2023. If all goes well, Gemini AI will also be integrated into all Google enterprise and consumer cloud services that use artificial intelligence (AI), including Google Search, Google Translate, and Google Assistant.
Once Gemini AI is released, its scalability, along with its flexible tool and application programming interface (API) integration capabilities, will make it suitable for use in a wide range of real-time desktop and mobile applications.
How Google Gemini AI Got Its Name
Some media outlets have reported that Gemini stands for “Generalized Multimodal Intelligence Network Interface,” but that information could not be confirmed.
According to Google Bard, it’s more likely that Google developers named the integrated LLM suite after the constellation Gemini and the ancient Greek myth of Castor and Pollux that inspired the zodiac sign.