Thank you to all of our speakers, panelists, presenters, and attendees for making this such a wonderful and productive workshop. We hope the discussions it fostered provided valuable insights into robot representation learning!
General-purpose robotic systems require powerful representations and abstractions. In deployment, such robots are expected to encounter diverse and complex scenarios. While recent large-scale learned models exhibit remarkable generalization, learning representations that flexibly generalize to the unanticipated situations a robot might face remains challenging, especially given the cost of collecting robot data. Thus, it is important to investigate how to best learn generalizable representations, evaluate their effectiveness, and leverage them for downstream robotics tasks.
Ideally, these representations should capture: (1) spatial-dynamic information needed for fine-grained control, (2) semantic information required for common-sense reasoning and scene understanding, and (3) knowledge of conventions needed for smooth human-robot interactions. Additionally, these representations must be robust to the diversity of tasks, scenes, and operators the robot will encounter. In this workshop, we aim to explore the following questions: What makes a good robot representation, how can we learn such representations, and how can we most effectively make use of them?
Our speakers and panelists are pioneering robotics and machine learning researchers defining the state of the art on a range of topics, including end-to-end control, task and motion planning (TAMP), human-robot interaction (HRI), scene understanding / SLAM, and more. We invite submissions from the community in these areas, as well as from a wider set of perspectives. For example, submissions might address how the following fields can guide robotics research: (1) deep representation learning in vision and language; (2) learning representations for field robotics and AI, where data is extremely scarce or noisy; or (3) bias and robustness in neural representations.
We aim to investigate the following topics and research questions:
We also give a non-exhaustive list of keywords:
We are accepting workshop submissions of the following types:
We request that submissions be in the RSS format. They should not be anonymized. You may also submit papers that are under review at other venues or have been submitted to other workshops.
| Event | Date |
| --- | --- |
| Paper Submission Deadline | |
| Paper Acceptance | |
| Camera-ready Version Due | |
| Workshop | |
| Time | Event |
| --- | --- |
| Session 1 | |
| 8:40 AM - 8:50 AM | Opening Remarks |
| 8:50 AM - 9:30 AM | Invited Talk 1: Wolfram Burgard (Virtual) |
| 9:30 AM - 10:20 AM | Poster Session A, Coffee Break |
| Session 2 | |
| 10:20 AM - 11:55 AM | Invited Talks 2, 3, 4: Liam Paull, Chelsea Finn, Mahi Shafiullah |
| 11:55 AM - 12:30 PM | Panel |
| 12:30 PM - 2:00 PM | Lunch Break |
| Session 3 | |
| 2:00 PM - 2:20 PM | Spotlight Talks |
| 2:20 PM - 3:00 PM | Invited Talk 5: Krishna Murthy |
| 3:00 PM - 4:00 PM | Poster Session B, Coffee Break |
| Session 4 | |
| 4:00 PM - 4:40 PM | Invited Talk 6: Andreea Bobu |
| 4:40 PM - 5:00 PM | Closing Remarks |
An asterisk (*) after a paper title indicates a spotlight talk.
Poster Session A: 9:30 AM - 10:20 AM

| Title | Authors |
| --- | --- |
| TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning | Ge Li, Dong Tian, Hongyi Zhou, Xinkai Jiang, Rudolf Lioutikov, Gerhard Neumann |
| Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering | Muhammad Fadhil Ginting, Dong-Ki Kim, Xiangyun Meng, Andrzej Marek Reinke, Jai Krishna Bandi, Navid Kayhani, Oriana Peltzer, David Fan, Amirreza Shaban, Sung-Kyun Kim, Mykel Kochenderfer, Ali-akbar Agha-mohammadi, Shayegan Omidshafiei |
| Learning Attentive Neural Processes for Planning with Pushing Actions | Atharv Jain, Seiji A Shaw, Nicholas Roy |
| Interpretable Human-in-the-Loop In-Context Preference Learning Via Preference Boundaries | Valerie K. Chen, Julie Shah, Andreea Bobu |
| Online Latent Factor Representation Learning | Alejandro Murillo-González, Lantao Liu |
| DexWild: Dexterous Human Interactions for In-the-Wild Robot Policies | Tony Tao, Mohan Kumar Srirama, Jason Jingzhou Liu, Kenneth Shaw, Deepak Pathak |
| GRIM: Task-Oriented Grasping with Conditioning on Generative Examples | Shailesh, Alok Raj, Nayan Kumar, Priya Shukla, Andrew Melnik, Michael Beetz, Gora Chand Nandi |
| Bi-Manual Joint Camera Calibration and Scene Representation | Haozhan Tang, Tianyi Zhang, Matthew Johnson-Roberson, William Zhi |
| DisDP: Robust Imitation Learning via Disentangled Diffusion Policies | Pankhuri Vanjani, Paul Mattes, Kevin Daniel Kuryshev, Xiaogang Jia, Vedant Dave, Rudolf Lioutikov |
| RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration | Omar Alama, Avigyan Bhattacharya, Haoyang He, Seungchan Kim, Yuheng Qiu, Wenshan Wang, Cherie Ho, Nikhil Varma Keetha, Sebastian Scherer |
| Learning Factorized Diffusion Policies for Conditional Action Diffusion | Omkar Patil, Prabin Kumar Rath, Kartikay Milind Pangaonkar, Eric Rosen, Nakul Gopalan |
| Learning Symbolic World Model Representations for Long-Horizon Robot Planning | Naman Shah, Jayesh Nagpal, Siddharth Srivastava |
| WoMAP: World Models For Embodied Open-Vocabulary Object Localization* | Tenny Yin, Zhiting Mei, Tao Sun, Lihan Zha, Emily Zhou, Jeremy Bao, Miyu Yamane, Ola Sho, Anirudha Majumdar |
| ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations* | Jiahui Zhang, Yusen Luo, Abrar Anwar, Sumedh Anand Sontakke, Joseph J Lim, Jesse Thomason, Erdem Biyik, Jesse Zhang |
Poster Session B: 3:00 PM - 4:00 PM

| Title | Authors |
| --- | --- |
| DREAM: Differentiable Real-to-Sim-to-Real Engine for Learning Robotic Manipulation | Haozhe Lou, Mingtong Zhang, Haoran Geng, Hanyang Zhou, Sicheng He, Zhiyuan Gao, Siheng Zhao, Jiageng Mao, Pieter Abbeel, Jitendra Malik, Daniel Seita, Yue Wang |
| Learning Long-Context Diffusion Policies via Past-Token Prediction* | Marcel Torne, Andy Tang, Yuejiang Liu, Chelsea Finn |
| H3DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning | Yiyang Lu, Yufeng Tian, Zhecheng Yuan, Xianbang Wang, Pu Hua, Zhengrong Xue, Huazhe Xu |
| Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets | Kaiyuan Chen, Shuangyu Xie, Zehan Ma, Pannag R Sanketi, Ken Goldberg |
| Implicit Contact Representations with Neural Descriptor Fields for Learning Dynamic Recovery Policies | Fan Yang, Sergio Francisco Aguilera Marinovic, Soshi Iba, Rana Soltani Zarrin, Dmitry Berenson |
| CL-HCoTNav: Closed-Loop Hierarchical Chain-of-Thought for Zero-Shot Object-Goal Navigation with Vision-Language Models | Yuxin Cai, Haoruo Zhang, Wei-Yun Yau, Chen Lv |
| XPG-RL: Reinforcement Learning with Explainable Priority Guidance for Efficiency-Boosted Mechanical Search | Yiting Zhang, Shichen Li, Elena Shrestha |
| Importance Weighted Retrieval for Few-Shot Imitation Learning | Amber Xie, Rahul Chand, Dorsa Sadigh, Joey Hejna |
| Point Policy: Unifying Observations and Actions with Key Points for Robot Manipulation | Siddhant Haldar, Lerrel Pinto |
| A Steerable Vision-Language-Action Framework for Autonomous Driving | Tian Gao, Catherine Glossop, Kyle Stachowicz, Timothy Gao, Celine Tan, Oier Mees, Yuejiang Liu, Sergey Levine, Dorsa Sadigh, Chelsea Finn |
| GraphSeg: Segmented 3D Representations via Graph Edge Addition and Contraction | Haozhan Tang, Tianyi Zhang, Oliver Kroemer, Matthew Johnson-Roberson, William Zhi |
| Seeing the Bigger Picture: 3D Latent Mapping for Mobile Manipulation Policy Learning* | Sunghwan Kim, Woojeh Chung, Yulun Tian, Zhirui Dai, Arth Shukla, Hao Su, Nikolay Atanasov |
| SkillWrapper: Autonomously Learning Interpretable Skill Abstractions with Foundation Models | Ziyi Yang, Benned Hedegaard, Ahmed Jaafar, Skye Thompson, Yichen Wei, Everest Yang, Haotian Fu, Shreyas Sundara Raman, Stefanie Tellex, George Konidaris, David Paulius, Naman Shah |
| Structured 3D Scene Queries with Graph Databases | Aaron Ray, Luca Carlone |
| EgoZero: Robot Learning from Smart Glasses* | Vincent Liu, Ademi Adeniji, Haotian Zhan, Siddhant Haldar, Raunaq Bhirangi, Pieter Abbeel, Lerrel Pinto |
| Feel the Force: Contact-Driven Learning from Humans | Ademi Adeniji, Zhuoran Chen, Vincent Liu, Venkatesh Pattabiraman, Siddhant Haldar, Raunaq Bhirangi, Pieter Abbeel, Lerrel Pinto |
| BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning | Hongyi Zhou, Weiran Liao, Xi Huang, Yucheng Tang, Fabian Otto, Xiaogang Jia, Xinkai Jiang, Simon Hilber, Ge Li, Qian Wang, Ömer Erdinç Yağmurlu, Nils Blank, Moritz Reuss, Rudolf Lioutikov |