SEIL: Simulation-augmented Equivariant Imitation Learning

Northeastern University
*Equal contribution

Given limited data, SEIL learns a robust closed-loop controller.

Abstract

In robotic manipulation, acquiring samples is extremely expensive because it often requires interacting with the real world. Traditional image-level data augmentation has shown the potential to improve sample efficiency in various machine learning tasks. However, image-level data augmentation is insufficient for an imitation learning agent to learn good manipulation policies in a reasonable amount of demonstrations. We propose Simulation-augmented Equivariant Imitation Learning (SEIL), a method that combines a novel data augmentation strategy of supplementing expert trajectories with simulated transitions and an equivariant model that exploits the symmetry in robotic manipulation. Experimental evaluations demonstrate that our method can learn non-trivial manipulation tasks within ten demonstrations and outperforms the baselines with a significant margin.

Introductory Video

Full Presentation Video


Transition Simulation

Transition simulation (TS) is a novel 3D data augmentation method. TS enriches the training data with simulated transitions, helping the model generalize to more cases.
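The idea above can be sketched in a few lines. This is a hypothetical toy, not the paper's implementation: `simulate_transition` stands in for restoring the simulator near a demonstrated step, perturbing the object pose, and rolling the primitive forward to obtain a new valid transition. The pick primitive here (action equals the object pose) is an assumed simplification.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class Transition:
    obj_pose: tuple  # (x, y, theta): object pose in the workspace
    action: tuple    # (x, y, theta): target gripper pose

def simulate_transition(obj_pose):
    # Stand-in for a simulator rollout: for a simple pick primitive,
    # the correct action is to grasp at the object pose.
    return Transition(obj_pose, obj_pose)

def augment(demo: Transition, n=8, pos_noise=0.05, rng=None):
    """Transition simulation (sketch): resample object poses near the
    demonstrated one, then re-simulate to get new valid transitions."""
    rng = rng or random.Random(0)
    out = []
    for _ in range(n):
        x, y, _ = demo.obj_pose
        new_pose = (x + rng.uniform(-pos_noise, pos_noise),
                    y + rng.uniform(-pos_noise, pos_noise),
                    rng.uniform(-math.pi, math.pi))
        out.append(simulate_transition(new_pose))
    return out
```

Each augmented transition is consistent with the environment dynamics by construction, unlike image-level augmentation, which can produce observation-action pairs no expert would generate.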

Equivariant Behavioral Cloning

By enforcing O(2) symmetry in the neural network, the model directly generalizes to unseen scenarios related by planar rotation and reflection. Equivariant models also converge faster and train more stably because the parameter search space for the optimal solution is reduced.
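A minimal numerical illustration of rotational equivariance (this is a toy, not the paper's equivariant architecture, which uses steerable convolution layers): symmetrizing an arbitrary base policy over a discrete rotation group C_n yields a policy that is exactly equivariant, i.e., rotating the state rotates the predicted action.

```python
import math

def rot(theta, v):
    # Rotate a 2D vector v by angle theta.
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def base_policy(state):
    # Arbitrary nonlinear, non-equivariant map from 2D state to 2D action.
    x, y = state
    return (x * x - y, x * y + 0.5 * x)

def equivariant_policy(state, n=4):
    """C_n symmetrization (group averaging):
    pi(s) = (1/n) * sum_g g^{-1} . base_policy(g . s),
    which is equivariant to every rotation g in C_n by construction."""
    ax, ay = 0.0, 0.0
    for k in range(n):
        g = 2 * math.pi * k / n
        a = rot(-g, base_policy(rot(g, state)))
        ax += a[0]
        ay += a[1]
    return (ax / n, ay / n)
```

Symmetrization bakes the constraint into the function rather than asking the network to learn it from (augmented) data, which is one intuition for the faster convergence noted above.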

Policy accuracy

Taking a closer look at the robot's behavior in a harder task, e.g., Shoe Packing, where the robot must learn accurate pick-and-place positions and orientations, SEIL achieves better accuracy than CNN-based BC.

Discussion

Transition simulation is intuitively useful because it augments data efficiently in 3D, compared with conventional image-level augmentation techniques. But how exactly does it help function approximation with neural networks? Here is one explanation. Imagine the model is imitating a sequence of actions in our dataset, as shown in the following figure.

BibTeX

@inproceedings{jia2023seil,
  title={{SEIL}: Simulation-augmented equivariant imitation learning},
  author={Jia, Mingxi and Wang, Dian and Su, Guanang and Klee, David and Zhu, Xupeng and Walters, Robin and Platt, Robert},
  booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={1845--1851},
  year={2023},
  organization={IEEE}
}