LHManip: A Dataset for Long-Horizon Language-Grounded Manipulation Tasks in Cluttered Tabletop Environments
Abstract
Instructing a robot to complete an everyday task within our homes has been a long-standing challenge for robotics. While recent progress in language-conditioned imitation learning and offline reinforcement learning has demonstrated impressive performance across a wide range of tasks, these methods are typically limited to short-horizon tasks that are not reflective of those a home robot would be expected to complete. While existing architectures have the potential to learn the desired behaviours, the lack of long-horizon, multi-step datasets collected on real robotic systems poses a significant challenge. To this end, we present the Long-Horizon Manipulation (LHManip) dataset, comprising 200 episodes that demonstrate 20 different manipulation tasks via real robot teleoperation. The tasks entail multiple sub-tasks, including grasping, pushing, stacking and throwing objects in highly cluttered environments. Each task is paired with a natural language instruction and multi-camera viewpoints for point-cloud or NeRF reconstruction. In total, the dataset comprises 176,278 observation-action pairs, which form part of the Open X-Embodiment dataset. The full LHManip dataset is publicly available at https://github.com/fedeceola/LHManip.
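Since the dataset forms part of the Open X-Embodiment collection, whose datasets are distributed in RLDS format, one plausible way to stream its episodes is via `tensorflow_datasets`. The sketch below is a minimal, unofficial example: the builder directory `gs://gresearch/robotics/lhmanip/0.1.0` and the feature keys (`steps`, `language_instruction`, `observation`, `action`) are assumptions based on the common Open X-Embodiment layout, not confirmed by the abstract; consult the linked repository for the actual access instructions.

```python
# Minimal sketch of loading LHManip as an RLDS/TFDS dataset, assuming it
# follows the standard Open X-Embodiment layout. The builder directory and
# feature keys are hypothetical; check the LHManip repo for exact names.
import tensorflow_datasets as tfds

builder = tfds.builder_from_directory(
    builder_dir="gs://gresearch/robotics/lhmanip/0.1.0"  # assumed path
)
ds = builder.as_dataset(split="train")

for episode in ds.take(1):
    # RLDS episodes expose a nested "steps" dataset of observation-action pairs.
    for step in episode["steps"]:
        instruction = step["language_instruction"]  # assumed key name
        observation = step["observation"]           # e.g. multi-camera images
        action = step["action"]                     # teleoperated robot action
```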