AI image generators are nothing new: DALL-E 3, integrated into ChatGPT; Imagen 3, recently added to Gemini; Midjourney; and Stable Diffusion. All of them turn text into images, with varying degrees of success: you type your prompt, and the image appears a few seconds later.
For its new tool, Whisk, Google has taken a radically different approach by freeing itself from these textual constraints. More intuitive to use, it relies on a universal language: that of images. Here is how it works.
An innovative three-part creative architecture
What sets Whisk apart is its three-part approach. The tool breaks generation down into three distinct dimensions: subject, scene and style, each of which can be supplied with several reference images. If you don’t have an image in mind, the Whisk interface can generate one for you: in a few clicks, it offers illustrations (made by AI, of course) suited to your request.
Powered by the latest version of the Imagen 3 model, Whisk simultaneously generates visuals and their associated textual descriptions. Google emphasizes that the tool is designed for “rapid visual exploration, not for pixel-perfect edits”. Generation times, although found annoying by The Verge’s tester Jay Peters, do not seem prohibitive.
When a result does not quite match expectations, Whisk lets you refine the generated image step by step. You can select a generated image, modify its underlying text prompt, or adjust the reference images to guide the system towards the desired result. This rapid feedback loop, at a few seconds per generation, makes it easy to explore ideas through successive trials. As Google points out in its blog, “Whisk can sometimes miss its target”, which is precisely why prompt editing remains available.
Alongside Whisk, Google announced that its model for generating photorealistic videos is getting a new version, Veo 2. It is said to better understand the “unique language of cinematography” and to significantly reduce common, distracting visual artifacts such as extra fingers and other oddities, a recurring problem with competing models. Veo 2 will initially be deployed in VideoFX, accessible via a Google Labs waiting list, before being extended to YouTube Shorts “and other products” during 2025.
For now, neither Whisk nor Veo 2 is available in France or elsewhere in Europe. The official Whisk website greets you with this message: “Whisk is not yet available in your country”. After several attempts, even using a VPN made no difference, and Google has not provided any official launch date for France.
- Whisk uses images as references to create new ones, without requiring a text prompt.
- The tool works along three dimensions: subject, scene and style, which can be modified with each iteration.
- Whisk and Veo 2 are not yet available in Europe.