Manipulating a photo so your father-in-law looks wide-eyed, your dog looks at the camera, or that hill you hiked looks like a cousin of Everest? Well, that’s already possible thanks to Adobe Photoshop and the rest of the photo editors on the market.
But, of course, the tools that such software provides demand great skill (and patience) to precisely control the position, shape, expression, and arrangement of the elements in a photo.
It is not very different from retouching an illustration by hand with a brush: artisanal, but not a realistic option for most users.
However, we are in the age of artificial intelligence, and a group of AI researchers has unveiled a tool called DragGAN which, through generative image manipulation, offers an alternative to that majority of users.
“With DragGAN, anyone can warp an image and have precise control over where each pixel ends up.”
DragGAN opens up a whole new category in the field of image editing, where the user is able to customize photorealistic images (either real photos, or images created by other generative AIs)…
…through an interactive mechanism as simple as drag and drop; forget about text prompts. In practice, it is much like editing photos as if they were 3D models.
The key to DragGAN is that, as long as it has been trained on the class of element we are trying to edit, the AI is able to supply the information missing from the original image: for example, we can tell it to open a lion’s mouth, and the tool itself takes care of generating the fangs and the tongue.
In the test images published by its creators, DragGAN can be seen performing these tasks from nothing more than source and destination points, from which it recognizes when to move something (like a lion’s head) and when to alter its shape (like the length of a T-shirt’s sleeves).
According to its creators, a card like the NVIDIA RTX 3090 (around €2,500) would be enough to carry out the tasks described above in a few seconds. Unfortunately, though, the software is still not available to the general public.
One more step into the future of AI-generated images
What is revolutionary about DragGAN is the possibility of creating constant, user-controlled iterations of preview images, which will also make it easier to create animations and comics using AI.
We cannot help but think that the future of AI lies in integrating into the same tools functionality that is strictly generative (like Midjourney), outpainting (like the one DALL-E 2 already integrates), interactive shape and position editing (like DragGAN), and prompt-based editing of an image’s appearance (like ControlNet).