On February 1, 2024, Google announced the launch of its AI image creation tool, ImageFX. This text-to-image generator is now available through the Google Labs website for users in the US, Australia, New Zealand, and Kenya.
The company also announced the release of MusicFX, a text-to-music tool that lets users create tracks of up to 70 seconds in length, as well as music loops.
These releases come just months after OpenAI integrated the popular text-to-image model DALL-E 3 with ChatGPT, giving users the ability to generate images from written prompts.
Learn what ImageFX is and where you can give it a try, explore its major features and capabilities, and take a peek at what lies ahead in the future of AI image creation tools.
What Is ImageFX?
“ImageFX is a new tool in Labs that lets people create images with simple text prompts. Our early experiments in Labs highlight how important creative exploration is to new users of generative AI tools,” Google said in the announcement blog post.
“People often discover new ideas through testing a range of prompts and concepts as they iterate. To spur further creativity, ImageFX includes a prompt interface featuring ‘expressive chips’ that let you quickly experiment with adjacent dimensions of your creation and ideas,” the blog post said.
Expressive chips are essentially a set of recommended keywords generated by ImageFX, which the user can choose to generate other stylistically similar designs.
This feature is also the tool’s key differentiator from competitors like DALL-E 3 and Midjourney.
How to Use ImageFX
When using the platform, users can opt to enter their own text prompt and press the generate button to create an image or click on the “I’m feeling lucky” option to create a random prompt and image.
After creating an image, users have the option to download or share it. There is also the option to change a numerical seed to give the solution’s output more variety.
Users can also click on expressive chips at the bottom of the screen. Types of keywords recommended during our testing included photorealistic, dramatic, 35mm film, minimal, sketchy, handmade, wide shot, illustration, close-up, and highly detailed.
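To make the idea of expressive chips more concrete, here is a minimal Python sketch of how selecting a keyword could extend a prompt. Google has not published how ImageFX handles this internally, so the `apply_chip` function and its comma-separated format are purely hypothetical illustrations.

```python
def apply_chip(prompt: str, chip: str) -> str:
    """Append a style keyword to a prompt unless it is already present.

    Hypothetical helper: ImageFX's real prompt handling is not public.
    """
    base = prompt.strip().rstrip(",")
    if chip.lower() in base.lower():
        return base  # keyword already part of the prompt, nothing to add
    return f"{base}, {chip}"


prompt = "an alien playing football with an ostrich"
# Simulate clicking three chips; the repeated one is ignored.
for chip in ("illustration", "wide shot", "illustration"):
    prompt = apply_chip(prompt, chip)

print(prompt)
```

Each click leaves the original wording intact and only adds a style hint, which is why chips are a low-risk way to explore variations of the same idea.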
Testing ImageFX: A Step-By-Step Guide
In this section, we’re going to put ImageFX’s capabilities to the test. All you need is a Google account and either a location in one of the supported countries or a VPN.
Opening ImageFX will prompt you to sign in, so select the Sign in with Google option and then press Sign in again.
A pop-up will come up, giving you the option to receive marketing emails or research invitations. Check which option you want (if any) and press the Next button.
Now you’re all set and ready to experiment with your prompts!
Using ImageFX: The Basics
On the left-hand side of the screen, you will see a text box where you can enter your written prompt and press the Generate button to produce images that will be displayed on the right-hand side of the screen.
Underneath the text prompt box, you will find a button that says More alongside a series of keywords – the expressive chips. Clicking More allows you to generate another set of keywords, and clicking on a keyword adds it to your written prompt.
Finally, in the bottom right-hand corner of the screen, there are three buttons. The first allows you to select a numerical seed to increase the variety of outputs, the next lets you download an image, and the last one lets you share the image.
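If the seed control seems abstract, the toy Python sketch below shows the general principle: the same seed reproduces the same result, while a different seed can produce a different one. It is only a stand-in. In diffusion models the seed typically fixes the random noise the image is generated from, and how ImageFX uses its seed internally is not documented here. The `sample_variation` function and the keyword lists are our own illustrative assumptions.

```python
import random

# Keywords borrowed from the chips we saw in testing; the lists are illustrative.
STYLES = ["photorealistic", "dramatic", "35mm film", "minimal", "sketchy"]
SHOTS = ["wide shot", "close-up"]


def sample_variation(prompt: str, seed: int) -> str:
    """Pick a pseudo-random style and shot type, fully determined by the seed."""
    rng = random.Random(seed)  # private generator, global random state untouched
    return f"{prompt}, {rng.choice(STYLES)}, {rng.choice(SHOTS)}"


base = "an astronaut on the moon"
first = sample_variation(base, seed=1)
second = sample_variation(base, seed=1)
print(first == second)  # identical seeds reproduce the same variation

# Changing the seed is what introduces variety.
for seed in range(3):
    print(seed, sample_variation(base, seed))
```

In practice, this means you can note down a seed that gave you a result you liked and change it only when you want something new.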
ImageFX in Action
As with any text-to-image tool, the quality of the image output will depend largely on your initial prompt.
To get the best results, you’ll want to include as much context as possible. For the purposes of this guide, we decided to go with a surreal image: an alien playing football with an ostrich.
The results were as follows:
The first image was acceptable and looked fairly “realistic,” but the other outputs weren’t so good.
To see if we could get some alternative designs, we pressed the More button to get ImageFX to provide us with more expressive chips to choose from.
From these options, we clicked on the Painting option to see how the image would look as a painting. The results were as follows:
To further test expressive chips, we went looking for an option that would create an animated-style version.
The closest keyword match we could get was Illustration. Here are the results of the prompt:
These results were probably the best of the bunch in terms of matching the intent of the prompt and the overall output quality.
ImageFX vs. DALL-E 3: Which Is Better?
To help evaluate ImageFX, we decided to compare its output against DALL-E 3’s to see which created the best images. While this isn’t an exhaustive test of each model’s image quality, it does give an idea of how each tool will respond to a barebones prompt.
To start off our test, we instructed DALL-E 3 to create an image of an alien playing football with an ostrich (the same initial prompt we entered into ImageFX). The results were as follows:
During our test, we noticed that DALL-E 3 took longer than ImageFX to generate the image, but we felt the output image it created was much better than any of the designs produced by Google’s solution so far.
That said, DALL-E 3 generated only one image.
To build on our comparison, we decided to see how each tool handled a cartoon T-rex. Here are the results:
The images created by ImageFX were all highly detailed, but we felt that DALL-E 3 produced an image that not only better matched the intent of the prompt but also made for a pretty good Disney-style animated character.
First Impressions of ImageFX
Overall, ImageFX was very easy to use.
We found expressive chips to be a welcome addition – they offered a valuable reference point we could use to see how prompts could be adapted or improved when creating images. This would be useful for users who were struggling with coming up with compositional ideas.
While the image quality didn’t blow us away, particularly with the alien-ostrich example, in other tests, it did generate extremely high-quality results.
Here is a decent image it created of an astronaut on the moon:
In this sense, ImageFX is definitely a tool that you can get good results with if you’re willing to take the time to enter the right prompts.
Imagen 2 Explained
At the core of Google’s AI image generator is Imagen 2, the text-to-image diffusion model that enables ImageFX to produce high-quality images. The same model powers image creation in Google Bard and is integrated with Search Generative Experience (SGE).
To enable Imagen 2 to create detailed images, Google added more detailed descriptions to the image captions in the model’s training data so that it could learn to distinguish between different artistic styles.
Using this approach means that the model can better understand the context of user prompts and respond with more relevant output.
Another important differentiator for Imagen 2 is that it’s accessible through Google Cloud – more specifically, via the Imagen API in Google Cloud Vertex AI.
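For developers curious about what that access might look like, the Python sketch below assembles (but does not send) a request to a Vertex AI image generation endpoint. The URL pattern, the `imagegeneration@005` model ID, and the `instances`/`parameters` field names are assumptions based on our reading of the Vertex AI documentation and should be checked against the current reference. The project name is a placeholder, and a real call would also need an OAuth access token.

```python
import json


def build_imagen_request(project: str, location: str, prompt: str,
                         sample_count: int = 1,
                         model: str = "imagegeneration@005"):
    """Return the (url, body) pair for a Vertex AI predict call.

    Assumed shape only: verify the endpoint, model ID, and field names
    against Google's documentation before relying on them.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/publishers/google/models/{model}:predict"
    )
    body = {
        "instances": [{"prompt": prompt}],
        "parameters": {"sampleCount": sample_count},
    }
    return url, body


# "my-project" is a placeholder, not a real Google Cloud project.
url, body = build_imagen_request("my-project", "us-central1", "a cartoon T-rex", 2)
print(url)
print(json.dumps(body, indent=2))
```

Keeping the request construction separate from the network call makes it easy to inspect exactly what would be sent before any credentials or billing come into play.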
In the future, Bard and Imagen 2 have the potential to become a power couple much the same way that ChatGPT and DALL-E 3 have, simply by making image creation technology accessible alongside a free, publicly available research assistant.
This is particularly true when considering the introduction of the more powerful Gemini Pro language model to Bard.
Where Does ImageFX Fit into the Text-to-Image Market?
ImageFX is up against a number of established competitors in the text-to-image market, including OpenAI’s DALL-E 3 and Midjourney.
Below, we’ve created a high-level overview of what each tool has to offer.
|Ease of use
|Create images for free
|Generate random images
|Images fixed to 1536×1536
|Image can be 1024×1024, 1024×1792 or 1792×1024
|Images can be 1024×1024, 2048×2048, 4096×4096
|Outputs owned by user
|Outputs owned by user
|No. Requires a paid plan such as ChatGPT Plus or Enterprise
|Paid plans start at $20 per user per month for ChatGPT, $25 per user per month for ChatGPT Team, and price on request for the Enterprise package
|Paid plans start at $10 per month for the Basic Plan, $30 per month for the Standard Plan, $60 per month for the Pro Plan, and $120 per month for the Mega Plan with more fast GPU time and other benefits included
|Via Google’s Search Labs (restricted regions)
Google’s AI Safety and Legal Protections
Google has some basic safety protections to help mitigate the risks presented by AI-generated images. One of these protections is content moderation guidelines, which prevent the generation of violent, offensive, or sexually explicit content.
The organization has also made a concerted effort to make it easier for users to identify AI-generated images.
For instance, all images created with ImageFX are given a digital watermark by SynthID to make them easier to identify. Likewise, images also include IPTC metadata so that users will be able to tell when they encounter AI-generated images.
Using digital watermarks is an attempt to address concerns over deepfakes, digitally-created images of people that are difficult to distinguish from real ones.
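As a rough illustration of how the IPTC labelling could be checked, the Python sketch below searches a file’s raw bytes for the IPTC digital source type term `trainedAlgorithmicMedia`, which the IPTC vocabulary defines for media created by a trained model. Whether ImageFX writes exactly this value is an assumption on our part, and the `looks_ai_labelled` helper is hypothetical. The check cannot detect SynthID, which is embedded in the pixels themselves rather than in metadata.

```python
# IPTC NewsCodes term for media created by a trained algorithm.
AI_SOURCE_TYPE = b"trainedAlgorithmicMedia"


def looks_ai_labelled(data: bytes) -> bool:
    """Heuristic: True if the raw bytes contain the IPTC AI source-type term.

    Only works when the XMP metadata is stored uncompressed, and metadata
    can be stripped, so a False result proves nothing.
    """
    return AI_SOURCE_TYPE in data


# Hand-written XMP fragment standing in for a real image file.
sample = (
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><rdf:Description '
    b'Iptc4xmpExt:DigitalSourceType="http://cv.iptc.org/newscodes/'
    b'digitalsourcetype/trainedAlgorithmicMedia"/></x:xmpmeta>'
)

print(looks_ai_labelled(sample))
print(looks_ai_labelled(b"ordinary image bytes"))
```

To try it on a downloaded image, you would read the file in binary mode and pass its contents to the function. A dedicated metadata library would be more reliable than a raw byte search.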
The Future of AI Image Generation
AI image generation is evolving rapidly at the moment, with vendors like Google and OpenAI looking to build multimodal AI solutions that can respond to inputs, including text, images, audio, and video.
The development of ImageFX and its underlying model, Imagen 2, highlights that Google is attempting to integrate the ability to generate high-quality, photorealistic images into its product ecosystem. This is shown by the use of Imagen 2 to add image-creation functionality to Bard.
As it stands, there is a long way to go to advance image generation technology. While tools like Stable Diffusion and Midjourney have offered users powerful text-to-image generators, they’ve also been difficult to use.
Current technologies have also struggled to develop realistic images, having difficulties with elements like hands and faces and creating an unsettling uncanny valley effect when attempting to depict any lifelike designs.
Image generation is an extremely fast-moving segment of the AI market at the moment. Google’s launch of ImageFX provides a big opportunity for the organization to start competing against DALL-E 3, which has remained one of the most accessible text-to-image generators on offer.
- Sign in to start making music just like this (AI Test Kitchen)
- DALL·E 3 is now available in ChatGPT Plus and Enterprise (OpenAI)
- Imagen 2 (DeepMind Google)
- Try ImageFX and MusicFX, our newest generative AI tools in Labs (Google)
- Experiment at the intersection of AI and creativity (AI Test Kitchen)
- Sign in to start creating images just like this (AI Test Kitchen)
- Report Content On Google (Google)