Google I/O: Everything We Know About the Future of Gemini


Google announced its vision for the future, with its multimodal Gemini everywhere.

Long gone are the days when algorithms scanned massive databases to bring you 10 blue links, and when services like Gmail and Google Docs powered daily work and life in their own silos.

Starting today and rolling out over the coming months, Gemini will become deeply linked to Google services and will appear directly in search engine results.

Coming a day after OpenAI unveiled its own fascinating future with GPT-4o, Google paints a similar, arguably complementary, picture: all roads lead to artificial intelligence (AI) coming to the masses.

For Android users, AI is going everywhere: Gemini is being built deeply into Android 15, including an onboard, scaled-down model living within the operating system itself.

Key Takeaways

  • Google unveiled its vision for the future powered by Gemini.
  • Gemini promises to revolutionize how users interact with Google across various devices, offering contextualized answers and understanding nuanced queries in platforms like Google Search, Photos, and Gmail.
  • All products start rolling out over the next few weeks and months.
  • The new conversational AI, Google Astra, displayed impressive responsiveness and accuracy, answering a variety of questions in real time.
  • Gemini is about to become omnipresent across Google’s ecosystem, bringing artificial intelligence to the masses.

Watch the Google I/O Livestream

In a fast-paced presentation, we saw how Gemini is going to change search, and how we interact with the world, on pretty much any device.

Gemini in Google Services

In Google Search, Gemini will now begin to provide contextualized answers directly on the search engine results page.

In Google Photos, Gemini will understand nuanced questions.

Ask, “What’s my licence plate?” and Gemini will scan your photos and return a result.

Ask Gemini when your daughter — “Lucy” — began to swim, and it will take you back to the day and the moment.

Ask, “Show me how Lucy’s swimming has progressed,” and Gemini will recognize the context and create a fully-fledged ‘memory’.

Ask Gmail to summarize all emails from a sender — including PDFs and attachments — and Gemini will near-instantly provide the result.

If you are sent a one-hour video from Google Meet, ask for a summary, and Gemini will again provide a highlights reel.

Gmail also goes further — instead of searching for emails, simply ask Gemini to, for instance, pick through long threads and answer contextual questions.

As Google CEO Sundar Pichai said: “Multi-modal rapidly expands the questions we can ask, and the answers we get back.”

Google CEO Sundar Pichai.

Gemini AI and Video Searching

Another demo within the Gemini mobile app shows a user filming his bookcase. Gemini instantly provides all the book titles and authors—even when book titles are partially obscured by ornaments.
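The bookshelf demo ran inside the Gemini mobile app, but the same kind of multimodal question can be sketched against the public Gemini API. Below is a minimal, hypothetical example using the google-generativeai Python SDK; the photo path, API key, and model identifier are placeholder assumptions rather than anything shown on stage.

```python
# A minimal sketch of a multimodal "what's in this photo?" query via the
# public Gemini API (google-generativeai SDK). The photo, API key, and model
# name are placeholders; the on-stage demo used the Gemini mobile app itself.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # hypothetical key
model = genai.GenerativeModel("gemini-1.5-flash-latest")  # assumed model identifier

# Placeholder image of a bookshelf, standing in for the live camera feed.
bookcase_photo = Image.open("bookcase.jpg")

# Pass the image and the question together as a single multimodal prompt.
response = model.generate_content(
    [bookcase_photo, "List every book title and author you can identify in this photo."]
)
print(response.text)
```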

Meanwhile, speaking to Gemini via a smartphone camera unveiled the power under the hood. As the presenter walked around the office, the new Google Astra conversational AI was bombarded with a dozen creative questions:

“Tell me when you see something that makes a sound. Where did you see my glasses? Which part of London am I in? What does this code do? Write me a poem about what you see in front of me.”

Each question was answered instantly and accurately, and assuming reality lives up to the demo, each of us is getting a very smart assistant on whichever device we happen to be using.

Gemini in Business

We expect Chip, Google’s AI virtual teammate, to see large take-up among companies using Google’s enterprise services. Added as a virtual teammate across business chat and collaboration tools, it becomes an all-seeing eye on projects.

Demos included asking Chip whether key decisions had been signed off, with Chip returning when and where the decision had been made. Another demo showed a team member asking for a summary of all blockers before launch, and a document was generated nearly instantly, saving, as Google put it, hours or even dozens of hours of a person’s time.

Making Gemini Inputs Bigger and Better

One of the main announcements at the event was that Google will be enhancing Gemini with a 2-million-token context window.

This means that Gemini 1.5 Pro will be able to process double the number of tokens of the previous version, enabling users to analyze larger documents and input media.

The expanded context length makes Gemini 1.5 Pro the model with the largest context window on the market. For comparison, GPT-4 supports context lengths of up to 32,000 tokens, while GPT-4 Turbo stretches up to 128,000 tokens.
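To put those numbers in perspective, here is a minimal sketch of how a developer might take advantage of a longer context window through the public Gemini API (the google-generativeai Python SDK). The model name, API key, and input file are assumptions for illustration, not details confirmed in the keynote.

```python
# A minimal sketch of feeding a very large document into Gemini 1.5 Pro via
# the public Gemini API. Model name, API key, and file path are placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # hypothetical key
model = genai.GenerativeModel("gemini-1.5-pro-latest")  # assumed model identifier

# Load a book-length document in one go.
with open("annual_report.txt", "r", encoding="utf-8") as f:
    document = f.read()

# Check how many tokens the document consumes before sending it.
token_count = model.count_tokens(document)
print(f"Document tokens: {token_count.total_tokens}")

# With a 2-million-token window, the whole document can fit in a single prompt
# rather than being chunked and summarized piecemeal.
response = model.generate_content(
    ["Summarize the key findings of this document:", document]
)
print(response.text)
```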

Enhancing Image Creation with Imagen 3

Google also announced the launch of Imagen 3, its latest text-to-image model, which will power the image creation tool ImageFX. 

The model, competing against OpenAI’s DALL-E 3, provides several improvements over Imagen 2, including better detail, richer lighting, fewer distracting artefacts, and better text rendering. 

Google confirmed that Imagen 3 will be available to select users as a private preview via ImageFX. It also plans to make Imagen 3 available through other Google products, including the Gemini app, Google Workspace, and Google Ads.

Veo Goes Toe to Toe with Sora

Another big announcement at the event was the release of Veo, a generative AI-powered video generation tool that can create 1080p videos of up to a minute in length. 

Veo is capable of creating high-quality videos in a range of compositional styles and will be available to select users via VideoFX. 

Google also confirmed plans to bring Veo to YouTube Shorts. This move has the potential to bring generative video to the masses and to compete directly with OpenAI’s Sora, giving AI-generated video content a direct route to end users.

The New Trillium Chip

Google also announced a new data center chip, Trillium, which it says is five times faster than previous versions.

With demand for artificial intelligence and machine learning (ML) processing power growing “by a factor of 1 million in the last six years, [or] roughly increasing 10-fold every year”, Trillium gives Google a custom chip and data center offering that can compete against Nvidia, which holds 80% of the market.

Google did speak highly of Nvidia though — not all competition needs to be hostile.

The Bottom Line

If you have 90 minutes to spare, watching the Google I/O presentation is a worthy exercise, with plenty more use cases than we can summarize here.

Want to return a shopping item? Just point your phone at it, and Google will find the product, find the receipt, contact the supplier for you, and arrange a pickup date — all autonomously.

But if you don’t have time to watch, you won’t have to wait long: all of these services will launch seamlessly over the coming weeks and months with no effort from end users.

While OpenAI has transformed the world — certainly the tech world — you simply cannot buy the installed user base that Google already has. So this is going to be an interesting six months.