The race to meet the global demand for AI is well underway, with leading companies such as OpenAI, Microsoft, Amazon, IBM, NVIDIA, Google, and others building up their hardware and software resources. The end game is straightforward: to remove the technological barriers that constrain the growth of artificial intelligence (AI) by offering anyone worldwide access to the latest AI tech.
On November 7, Techopedia attended the Google event “Better together: NVIDIA & Google Cloud on G2 VMs powered by NVIDIA L4 Tensor core GPUs.”
During the event, Dave Salvator, Director of Accelerated Computing Products at NVIDIA, and Kevin Chung, COO of Writer, spoke about the partnership and the new NVIDIA technology integrated into Google Cloud.
The Rise of AI-Infrastructure-as-a-Service on Google Cloud
The first generative AI technologies, like ChatGPT, Bing Chat, and Google Bard, were made possible by leading tech companies’ vast computing and software resources. Today, these resources remain virtually unavailable to most businesses.
As AI becomes mainstream and businesses and organizations worldwide seek to leverage the technology, a new global movement is emerging: AI-infrastructure-as-a-service. With similarities to the pandemic-era global cloud migration, this trend is expected to transform the world and disrupt every industry.
Under AI-infrastructure-as-a-service models, the idea is to give users access to everything they need to experiment, develop, test, customize, deploy, and maintain their AI technology.
In early 2023, Google became the first cloud provider to launch NVIDIA L4 Tensor Core GPUs. Since then, the two companies have been working to expand AI cloud experiences across Google Cloud services. The Google-NVIDIA partnership can potentially speed up AI and enhance its performance while helping companies and developers reduce costs.
AI at the Service of Google’s DNA
Leading companies from Volkswagen to Snap, Midjourney, and Workspot are already using Google-NVIDIA AI solutions to drive different business outcomes.
Salvator from NVIDIA explained what it takes to embrace the transformative power of AI across industries.
“In order to build these things (AI systems), you need a combination of very performant hardware and software, as well as the ability and tools that make the work of developers building these AI models easier.”
NVIDIA works closely with Google to deliver these demands. “Not only do you get great performance from the platform, but you get it in a way that makes it a little easier to use,” Salvator added.
Salvator explained that NVIDIA provides Google with its latest hardware and its data center expertise. Many NVIDIA platform services, software, application frameworks, and tools are now integrated into Google Cloud, making AI more accessible and user-friendly.
The NVIDIA L4 Tensor Core GPUs
The NVIDIA L4 Tensor Core GPU launched in early 2023 and features NVIDIA’s fourth-generation Tensor Cores. With four video decoders, two video encoders, and AV1 support, it is designed for advanced, accelerated video and vision AI workloads. It offers up to 120x more AI video performance than CPU-only implementations.
“We were very pleased to work with Google and have them be the first cloud provider to bring this product to market as a cloud service,” Salvator said.
NVIDIA’s L4 is available on Vertex AI and in Google Kubernetes Engine (GKE).
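For developers, using the L4 on Vertex AI comes down to picking a G2 machine type and the L4 accelerator when deploying a model. Below is a minimal sketch using the google-cloud-aiplatform Python SDK; the project, region, and model ID are placeholders, and the NVIDIA_L4 accelerator type is assumed to be available in your chosen region.

```python
# Minimal sketch: deploying an existing Vertex AI model to an endpoint backed
# by a G2 instance with one NVIDIA L4 GPU. Project, region, and model ID are
# placeholders; adjust to your own environment.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/MODEL_ID"
)

endpoint = model.deploy(
    machine_type="g2-standard-8",   # G2 VM family pairs with L4 GPUs
    accelerator_type="NVIDIA_L4",   # assumes L4 availability in the region
    accelerator_count=1,
    min_replica_count=1,
)

print(endpoint.resource_name)
```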
Writer Is Now Powered by NVIDIA L4 GPUs on Google Cloud
One of the platforms now powered by NVIDIA L4 GPUs on Google Cloud is Writer.
At first glance, the Writer app may not look like much. However, it is a powerful tool for companies. “Writer is a full stack, complete generative AI platform built for enterprises,” Chung said at the Google event.
Writer gives companies a seamless, out-of-the-box solution for working with AI. The technology leverages large language models (LLMs), natural language processing (NLP), and machine learning (ML) on G2 virtual machines and is currently used by more than 200 companies.
At the event, Chung revealed that after talking to more than 1,000 executives in the past month, he found several common themes.
Most executives are looking at generative AI and believe they have several use cases for it. Still, they ask themselves where they should start.
According to Chung, executives agree that building AI from the ground up is extremely complex. Executives are also concerned about the time it takes to build and deploy AI solutions; they want faster ROI. Chung presented Writer as the solution to these common problems.
“We provide our own proprietary LLMs, they are accurate, secure, and rapid to deploy,” Chung said.
While the tech behind Writer’s curtain is sophisticated, the app’s concept is simple in principle. Any company, organization, or developer can use the app to create its own knowledge database. Writer supports all types of data, including audio, charts, spreadsheets, SQL, video, and text.
Once the knowledge graph is layered on top of Writer’s AI, users can turn to ready-to-use frameworks, available through a simple user interface, to ask the AI questions, do research, analyze data, create content, and much more.
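Writer’s actual API is not documented in this article, so the snippet below is only a generic illustration of the pattern Chung described: post a question against a prepared knowledge graph and get an answer back. The endpoint, payload fields, and identifiers are all hypothetical, not Writer’s real interface.

```python
# Illustrative sketch only: NOT Writer's actual API. It shows the general
# request/response shape of querying a hosted enterprise knowledge graph.
# The URL, payload fields, and API key header are hypothetical.
import requests

API_URL = "https://api.example.com/v1/knowledge/query"  # hypothetical endpoint
headers = {"Authorization": "Bearer YOUR_API_KEY"}      # hypothetical auth

payload = {
    "graph_id": "company-handbook",  # hypothetical knowledge graph ID
    "question": "What is our refund policy for enterprise customers?",
}

response = requests.post(API_URL, json=payload, headers=headers, timeout=30)
response.raise_for_status()
print(response.json().get("answer"))
```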
The minimalist, non-technical user interface of Writer allows companies to deploy an AI application across their business that anyone can use, even those without technical knowledge. Additionally, the app has a strong focus on brand style guidelines and guardrails, accuracy, security, and compliance.
Chung explained how companies are already using Writer to generate documents, create AI assistants, and draft newsletters, emails, and thought leadership pieces.
Writer can also be used to build virtual chatbots that support sales or marketing teams and can accelerate and simplify the complex process of onboarding workers who need to be familiarized rapidly with large databases, documents, and information.
By offering a full-stack solution, Writer cuts AI production times from years to just days, bringing down costs dramatically. In its early days, Writer worked with models of 120 million parameters. Chung gave details on how the tech has evolved.
“Now we’re at somewhere over 40 billion parameters. Writer also generates 9,000 words a second, with over a trillion API calls per month. It works seamlessly with NVIDIA GPUs, enabling us to do very heavy computation without any hiccups.”
Google-NVIDIA Integration Environment: Technical Specifics
But the NVIDIA integrations in the Google Cloud environment do not end with Writer. At the Google event, Salvator walked attendees through the Google-NVIDIA portfolio.
With GPU deployment options, users can access the following:
- Anthos on bare metal or VMware: Create and manage Kubernetes clusters with NVIDIA GPUs on existing infrastructure running VMware or bare-metal servers.
- Google Distributed Cloud Hosted: Accelerate sensitive workloads that require digital sovereignty in fully managed on-premises infrastructure with NVIDIA GPUs.
- Google Distributed Cloud Edge: Accelerate mission-critical AI use cases with NVIDIA GPUs at Google Edge, Telco Edge, or Customer Edge locations.
- GDC Edge Appliance: Accelerate data processing and analytics with NVIDIA GPUs at remote edge locations.
As platform-as-a-service, the Google-NVIDIA offering includes:
- Google Kubernetes Engine (GKE): Allows users to automatically create, manage, and scale Kubernetes (K8s) clusters with NVIDIA GPUs, with support for GPU-sharing capabilities and NVIDIA NGC containers (see the sketch after this list).
- Vertex AI: Gives access to the latest NVIDIA AI software, such as Triton, Merlin, and MONAI, within a unified MLOps platform to build, deploy, and scale ML models in production.
- Dataflow: Leverages NVIDIA TensorRT and GPUs to accelerate inference with end-to-end pipelines on streaming data.
- Dataproc: Leverages the RAPIDS Accelerator for Apache Spark to accelerate Spark SQL/DataFrame-based data pipelines with no code changes.
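As a concrete illustration of the GKE item above, the sketch below uses the official Kubernetes Python client to request one GPU for a container and pin the pod to L4-equipped nodes. It assumes a GKE cluster that already has an L4 node pool with NVIDIA drivers installed; the image and names are placeholders.

```python
# Minimal sketch: scheduling a pod onto L4-equipped GKE nodes with the
# official Kubernetes Python client. Assumes local kubeconfig credentials
# and an existing L4 node pool; image and names are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="l4-inference-demo"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="inference",
                image="us-docker.pkg.dev/my-project/my-repo/inference:latest",
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one GPU from the node pool
                ),
            )
        ],
        # GKE node label that pins the pod to nodes with L4 accelerators
        node_selector={"cloud.google.com/gke-accelerator": "nvidia-l4"},
        restart_policy="Never",
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```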
The Benefits of AI Innovation on the Cloud
Salvator also spoke about the cost-performance benefits of leveraging the Google-NVIDIA innovation.
“If you think about it, you are building the model in the cloud, a lot of times you’re paying for your instance on an hourly basis. If you can get your work done faster, or get more work done in the same amount of time, that translates into cost.”
NVIDIA’s latest technology in the Google Cloud environment gives users access to more computing power, allowing them to deploy applications faster, speed up time-to-market, and derive value from their products sooner.
More powerful and efficient AI hardware and software also means that companies can use fewer resources to get the same performance.
“This is a fundamental benefit you get on Google Cloud Platform. It can either take the form of using fewer GPUs and, say, doing the same work, or using the same number of GPUs and doing more work.”
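Salvator’s cost argument reduces to simple arithmetic: with hourly billing, total cost is the hourly rate times the hours a job runs, so a speedup shrinks the bill even if the accelerated instance costs more per hour. A quick illustration (all prices below are made up):

```python
# Back-of-the-envelope illustration of the hourly-billing argument: a faster
# GPU that costs more per hour can still lower total job cost.
# All numbers are invented for illustration only.

def job_cost(hourly_rate: float, baseline_hours: float, speedup: float) -> float:
    """Total cost of a job that takes baseline_hours at 1x speed."""
    return hourly_rate * (baseline_hours / speedup)

baseline = job_cost(hourly_rate=1.00, baseline_hours=10, speedup=1.0)     # $10.00
accelerated = job_cost(hourly_rate=1.50, baseline_hours=10, speedup=3.0)  # $5.00

print(f"baseline: ${baseline:.2f}, accelerated: ${accelerated:.2f}")
```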
NVIDIA AI Enterprise Through Google Cloud
Another service that Google Cloud users can now access is NVIDIA AI Enterprise. The service includes 50 frameworks, pre-trained models, and development tools, which can help accelerate and simplify AI production and deployment.
NVIDIA AI Enterprise also offers 24/7 support. “It gives companies the confidence to build applications and know that they can get help building, deploying, and keeping it running 24/7,” Salvator said.
NVIDIA AI Enterprise also gives users access to prepackaged reference applications to rapidly automate business operations. These include:
- Generative AI knowledge base chatbots.
- Intelligent virtual assistants.
- AI audio transcription.
- Digital fingerprinting and threat detection.
- AI spear-phishing detection.
- AI vehicle and robot route optimization.
- Personalized next-item product recommendations.
The Bottom Line: The Inevitable Transformation of Global Digital Infrastructures and Services
Google and NVIDIA’s integration of hardware, software, and services is impressive and can greatly benefit users. It reduces technical barriers and upfront costs, increases access to the tech, and can boost performance while offering state-of-the-art security and compliance.
However, this partnership is not an isolated case; it is part of a bigger movement. We should expect AI-infrastructure-as-a-service to become the norm very soon, as all the big cloud providers are already offering similar services or working to deploy them for their users. Global cloud, edge, and on-premises infrastructures need to be rebuilt to meet the computing demands of AI, and AI-infrastructure-as-a-service answers that call.
The AI-infrastructure-as-a-service movement, breathing new life into the technology sector, will continue to scale globally at a frantic pace as enterprises look for cost-effective, agile, fast, and easy-to-use AI solutions. The goal? To make AI accessible to anyone.