MIT this week unveiled a new model for training robots. Rather than the standard set of focused data typically used to teach robots new tasks, the technique mimics the massive troves of information used to train large language models (LLMs).
The researchers note that imitation learning, in which an agent learns by watching an individual perform a task, can fail when small challenges are introduced. These could be things like different lighting, a new setting, or unexpected obstacles. In such scenarios, the robot simply doesn't have enough data to adapt.
The team looked to models like GPT-4, which take a kind of brute-force, data-driven approach to problem-solving.
“In the language domain, all data is just text,” says Lirui Wang, lead author of the new paper. “In robotics, given the heterogeneity of the data, you need a different architecture if you want to pre-train in a similar way.”
The team introduced a new architecture called Heterogeneous Pretrained Transformers (HPT), which pools information from different sensors and different environments. A transformer then combines that data into a single training model. The larger the transformer, the better the output.
Users then input the robot's design and configuration, along with the task they want performed.
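The idea of pooling heterogeneous sensor data through a shared transformer can be sketched at a very high level. This is a minimal toy illustration, not the paper's actual implementation: the modality names, dimensions, and single-layer attention trunk are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared token width (illustrative choice)

def linear(x, w, b):
    return x @ w + b

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Modality-specific "stems": each maps raw sensor readings of a
# different size into the same shared token space. The modalities
# and sizes here are hypothetical examples.
stems = {
    "camera":       (rng.normal(size=(48, D)), np.zeros(D)),  # flattened image patch
    "joint_angles": (rng.normal(size=(7, D)),  np.zeros(D)),  # 7-DoF arm
    "force_torque": (rng.normal(size=(6, D)),  np.zeros(D)),
}

def tokenize(observations):
    """Project each modality into shared tokens, then stack them."""
    tokens = [linear(x, *stems[name]) for name, x in observations.items()]
    return np.stack(tokens)          # (num_modalities, D)

# Shared trunk: one self-attention layer stands in for the large
# pretrained transformer that is common across robots and tasks.
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))

def trunk(tokens):
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(D))
    return attn @ v                  # (num_modalities, D)

# Robot-specific "head": maps trunk features to this robot's actions.
head_w, head_b = rng.normal(size=(D, 7)), np.zeros(7)

obs = {
    "camera":       rng.normal(size=48),
    "joint_angles": rng.normal(size=7),
    "force_torque": rng.normal(size=6),
}
features = trunk(tokenize(obs))
action = linear(features.mean(axis=0), head_w, head_b)
print(action.shape)  # (7,)
```

The key design point the sketch captures is that only the stems and heads are specific to a robot's sensors and body; the trunk in the middle is shared, which is what lets data from many different robots and environments contribute to one pretrained model.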
“Our dream is to have a universal robot brain that you could download and use for your robot without any training at all,” said CMU Associate Professor David Held. “While we are just in the early stages, we are going to keep pushing hard and hope scaling leads to a breakthrough in robotic policies, like it did with large language models.”
The research was funded, in part, by the Toyota Research Institute. At last year's TechCrunch Disrupt, TRI debuted a method for training robots overnight. More recently, it struck a partnership that will unite its robot learning research with Boston Dynamics hardware.