Most of the coverage of humanoid robots has understandably focused on hardware design. Given how frequently developers toss around the phrase “general-purpose humanoid,” more attention should be paid to the first part. After decades of single-purpose systems, the leap to more general-purpose systems would be a big one. We're not there yet.
Developing robotic intelligence that can take full advantage of the wide range of motion enabled by bipedal humanoid designs has been a key focus for researchers. The use of generative AI in robotics has also been a hot topic recently, and new research from MIT suggests that the latter could have a profound impact on the former.
One of the biggest challenges on the road to general-purpose systems is training. We have a solid grasp of best practices for training humans to perform different tasks. Approaches to training robots, by contrast, are promising but fragmented. There are many methods, such as reinforcement learning and imitation learning, and future solutions may combine them and augment them with generative AI models.
One of the key use cases the MIT team proposes is the ability to glean relevant information from small, task-specific datasets, a method they call Policy Composition (PoCo). Tasks include useful robot behaviors, like hammering in a nail or flipping an object with a spatula.
“[Researchers] train a separate diffusion model to learn a strategy, or policy, for completing one task using a specific dataset,” the university noted. “The policy learned by the diffusion model is then integrated into a general policy that allows the robot to perform multiple tasks in different settings.”
According to MIT, incorporating diffusion models improved task performance by 20%, including the ability to perform tasks that require multiple tools and to learn and adapt to unfamiliar tasks. The system is able to combine relevant information from different datasets into the sequence of actions required to perform a task.
“One advantage of this approach is that we can combine policies to get the best of both worlds,” said Lirui Wang, lead author of the paper. “For example, a policy trained on real-world data may be able to achieve better dexterity, while a policy trained in simulation may be able to achieve better generalization.”
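To make the idea of combining policies concrete, here is a minimal sketch, not the MIT team's implementation. It assumes each policy is a diffusion-style model that predicts the noise to remove from a candidate action, and that policies are combined by taking a weighted average of those predictions at every denoising step. The names `make_toy_policy`, `compose_noise` and `sample_action`, the two-dimensional action, and the simplified update rule are all hypothetical stand-ins for illustration.

```python
import numpy as np


def make_toy_policy(target):
    """Hypothetical stand-in for a trained diffusion policy.

    Returns a function that estimates the "noise" in a candidate action
    as its offset from the action this policy prefers.
    """
    target = np.asarray(target, dtype=float)

    def predict_noise(action, t):
        # t (the denoising step) is unused in this toy model.
        return action - target

    return predict_noise


def compose_noise(policies, weights, action, t):
    """Weighted average of the noise estimates from several policies."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return sum(w * p(action, t) for w, p in zip(weights, policies))


def sample_action(policies, weights, dim=2, steps=50, step_size=0.2, seed=0):
    """Start from random noise and repeatedly remove the composed noise estimate."""
    rng = np.random.default_rng(seed)
    action = rng.normal(size=dim)
    for t in reversed(range(steps)):
        action = action - step_size * compose_noise(policies, weights, action, t)
    return action


# Assumed example: one policy "trained on real-world data", one "trained in simulation".
real_policy = make_toy_policy([1.0, 0.0])
sim_policy = make_toy_policy([0.0, 1.0])

action = sample_action([real_policy, sim_policy], weights=[0.5, 0.5])
print(action)  # lands between the two policies' preferred actions
```

With equal weights, the sampled action settles midway between what each toy policy would choose on its own; shifting the weights pulls it toward one policy or the other.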
The goal of this particular work is to create intelligent systems that allow robots to swap out different tools to perform different tasks. The widespread adoption of multi-purpose systems would bring the industry one step closer to the general-purpose dream.