Merging the digital and physical worlds is already a reality for the team at the Mountain View company.
It is not difficult to imagine someone interested in technology these days having fun with artificial intelligence models, whether asking ChatGPT trivia questions or using platforms like Stable Diffusion to generate spectacular images. However, all of these models have something in common: they live in the digital world. What would you say if we told you that Google has managed to transfer this knowledge to a robotic arm able to reason about and analyze the world around it in response to complex commands?
An AI model capable of being the brain of an entire generation of robots
Recently, thanks to information Google has shared on its GitHub site, we have learned of the existence of PaLM-E, which the American company describes as an Embodied Multimodal Language Model. There are dozens, if not hundreds, of artificial intelligence models today, and many more that the sector's big players are working to perfect for the near future. What makes Google's model special, however, is that it can not only perform language and vision tasks, but also translate complex natural-language instructions into commands for state-of-the-art robots.
Combining this technology with PaLM-E (research released yesterday by Google) = IRL Ctrl+f pic.twitter.com/z70jRdi315
— AI Breakfast (@AiBreakfast) March 7, 2023
As you can see in the video above, without the robotic arm having received any special training, it is able to execute the issued order, in this case 'bring me the bag of chips from the drawer', performing the interpretation needed to carry out its mission. In addition, with the PaLM-E model the robotic arm is able to correct errors during task execution, for example going back to pick up the bag again if someone takes it from its gripper.
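To make the idea of this self-correcting behavior concrete, here is a minimal conceptual sketch in Python. It is not Google's actual PaLM-E code or API: the `plan_next_action` function, the action names, and the `ToyWorld` simulator are all hypothetical stand-ins for a vision-language policy. The point it illustrates is the closed loop: the robot re-observes the scene and re-plans at every step, so if the world changes mid-task (the bag is taken from the gripper), the next plan naturally recovers.

```python
# Hypothetical sketch of closed-loop task execution, in the spirit of the
# behavior described above. None of these names come from PaLM-E itself.

def plan_next_action(observation, instruction):
    """Stand-in for a vision-language model's policy: maps the current
    observation plus the instruction to the next low-level action."""
    loc = observation["bag_location"]
    if loc == "drawer":
        return "open_drawer_and_grasp_bag"
    if loc == "table":          # bag was dropped or taken away mid-task
        return "re_grasp_bag_from_table"
    if loc == "gripper":
        return "hand_bag_to_user"
    return "done"               # bag already with the user

class ToyWorld:
    """Toy simulator: tracks only where the bag currently is."""
    def __init__(self, bag_location="drawer"):
        self.bag_location = bag_location

    def observe(self):
        return {"bag_location": self.bag_location}

    def execute(self, action):
        if action in ("open_drawer_and_grasp_bag", "re_grasp_bag_from_table"):
            self.bag_location = "gripper"
        elif action == "hand_bag_to_user":
            self.bag_location = "user"

def run_task(world, instruction, max_steps=10):
    """Closed loop: observe, plan one step, act, repeat until done."""
    actions = []
    for _ in range(max_steps):
        action = plan_next_action(world.observe(), instruction)
        if action == "done":
            break
        world.execute(action)
        actions.append(action)
    return actions
```

Because planning happens against a fresh observation each step rather than once up front, starting from `bag_location == "table"` (the interrupted case) simply yields a re-grasp as the first action, with no special error-handling code.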
The model was created by a group of researchers from both Google's AI division and the Technical University of Berlin, and it has a staggering 562 billion parameters spanning vision and language, used to control robotic devices, in this case an arm built by Google Robotics. The team claims it is the largest visual-language model ever developed and that it does not need to be retrained to take on different tasks. Without a doubt, this is an evolutionary leap in robot control, in both professional and private settings, something that could prove essential for robots to become more than mere entertainment and start lending us a hand in everyday life.