
Google's Text-to-Image AI Creates Any Wacky Image You Can Imagine

A plumber riding a dinosaur? Google says its AI system Imagen creates photorealistic images from input text.

Imad Khan Senior Reporter
Imad is a senior reporter covering Google and internet culture. Hailing from Texas, Imad started his journalism career in 2013 and has amassed bylines with The New York Times, The Washington Post, ESPN, Tom's Guide and Wired, among others.
Expertise Google | Internet Culture
AI generated images from the Imagen text-to-image program, including a brain on a rocket

An example of some of the images created by Imagen, Google's text-to-image AI generator. 

Imagen composited by Sarah Tew/CNET

Google has a new text-to-image AI that the company says beats the competition.

Called Imagen, the program takes in text -- for example, "a photo of a Persian cat wearing a cowboy hat and red shirt playing a guitar on a beach" -- and outputs a matching image. Imagen can produce either photorealistic images or artistic renderings.

Mad libs-style image generator example from Imagen website

Google's website for Imagen lets people select text to change the resulting image.

Imagen composited by Sarah Tew/CNET

Imagen follows other text-to-image generators such as DALL-E, VQ-GAN+CLIP and Latent Diffusion Models. Google said that when people were asked to compare images created by Imagen with those from other text-to-image generators, they found its model outperformed competitors in accuracy and image fidelity.

Google shared several examples of text prompts and the resulting images created by the AI on its Imagen website -- including gems such as "A cute corgi lives in a house made out of sushi" -- but these may represent only the best results generated. Google declined to comment for this story.

Text-to-image models show the power of machine learning systems. In this case, Imagen removes the need to know how to use specialized software like Photoshop to create abstract images. AI systems are helping the company come closer to its vision of an ambient computing future, as noted at the Google I/O conference earlier this month. Ambient computing is the idea that people will one day be able to use computers intuitively, without needing knowledge of specific systems or code.

The power of text-to-image AI isn't lost on Google, however, and the company has chosen not to release Imagen to the public. Imagen learned to create images from data scraped from the internet. Because the internet can be filled with stereotypes and biases, these end up becoming present in Imagen. Google said the biases include a preference for lighter skin tones and certain Western gender stereotypes. The company also fears that Imagen could be misused by bad actors.

"Generative methods can be leveraged for malicious purposes, including harassment and misinformation spread, and raise many concerns regarding social and cultural exclusion and bias," according to a white paper published by Google.

Google cautions other AI makers to be wary of releasing text-to-image models to the public without paying close attention to the data the AI is trained on.