TouchGuide: Inference-Time Steering of Visuomotor Policies via Touch Guidance

Let Policies Have Touch Sensing!

Zhemeng Zhang1,2*, Jiahua Ma3*, Xincheng Yang2*
Xin Wen3†, Yuzhi Zhang3†, Boyan Li1†, Yiran Qin4‡, Jin Liu1,2, Can Zhao1
Li Kang1,5, Haoqin Hong6, Zhenfei Yin4, Philip Torr4, Hao Su7, Ruimao Zhang3, Daolin Ma1,2✉
1Shanghai Jiao Tong University 2Xense Robotics 3Sun Yat-sen University 4Oxford
5Shanghai AI Laboratory 6University of Science and Technology of China 7UCSD
*Equal contribution †Equal contribution ‡Project leader ✉Corresponding author
TacUMI and TouchGuide overview

TacUMI enables low-cost, high-precision tactile data collection, and TouchGuide steers visuomotor policies with touch guidance at inference time.

Abstract

Fine-grained and contact-rich manipulation remains challenging for robots, largely due to the underutilization of tactile feedback. To address this, we introduce TouchGuide, a novel cross-policy visuo-tactile fusion paradigm that fuses modalities within a low-dimensional action space. Specifically, TouchGuide operates in two stages to guide a pre-trained diffusion or flow-matching visuomotor policy at inference time. First, the policy produces a coarse, visually plausible action using only visual inputs during early sampling. Second, a task-specific Contact Physical Model (CPM) provides touch guidance to steer and refine the action, ensuring it aligns with realistic physical contact conditions. Trained through contrastive learning on limited expert demonstrations, the CPM provides a tactile-informed feasibility score that steers the sampling process toward refined actions satisfying physical contact constraints. Furthermore, to facilitate TouchGuide training with high-quality, cost-effective data, we introduce TacUMI, a data collection system. TacUMI achieves a favorable trade-off between precision and affordability; by leveraging rigid fingertips, it obtains direct tactile feedback, thereby enabling the collection of reliable tactile data. Extensive experiments on five challenging contact-rich tasks, such as shoe lacing and chip handover, show that TouchGuide consistently and significantly outperforms state-of-the-art visuo-tactile policies.
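The abstract states that the CPM is trained contrastively on limited expert demonstrations to output a tactile-informed feasibility score. The snippet below is a minimal, illustrative sketch of one such training step under our own assumptions: an InfoNCE-style objective in which recorded expert (action, tactile) pairs are positives and randomly perturbed actions are negatives. The `cpm.score(action, tactile)` interface, the perturbation-based negative sampling, and all hyperparameters are hypothetical and do not reproduce the paper's actual architecture or loss.

```python
# Hypothetical contrastive training step for a CPM-like feasibility scorer.
# `cpm.score(actions, tactile)` is assumed to return one scalar logit per sample.

import torch
import torch.nn.functional as F

def cpm_contrastive_step(cpm, actions, tactile, num_negatives=8, noise_std=0.05):
    """Expert (action, tactile) pairs are positives; perturbed actions are negatives."""
    B = actions.shape[0]

    # Positive logits: expert actions paired with their recorded tactile signals.
    pos = cpm.score(actions, tactile)                              # (B,)

    # Negative logits: the same tactile signals paired with perturbed actions.
    neg_actions = actions.unsqueeze(1) + noise_std * torch.randn(
        B, num_negatives, actions.shape[-1])
    neg = torch.stack(
        [cpm.score(neg_actions[:, k], tactile) for k in range(num_negatives)],
        dim=1)                                                     # (B, K)

    # InfoNCE-style objective: the expert action should score highest.
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)             # (B, 1 + K)
    labels = torch.zeros(B, dtype=torch.long)                      # positive at index 0
    return F.cross_entropy(logits, labels)
```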

TacUMI

TacUMI data collection system overview

TacUMI is a low-cost, lightweight tactile data collection system. It provides direct tactile feedback via rigid fingertips and uses Vive Trackers for high-precision pose capture, balancing accuracy with deployability.

TouchGuide

TouchGuide framework overview

During inference, TouchGuide uses the CPM to score tactile feasibility and guide a pre-trained visuomotor policy toward fine-grained action corrections, improving contact-physics compliance without retraining.
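As a concrete illustration of this inference-time steering, the sketch below shows one possible two-stage sampling loop under our own assumptions: early denoising steps follow the pre-trained visual policy unchanged, and late steps nudge the action along the gradient of the CPM feasibility score. The `visual_policy.denoise_step` and `cpm.score` interfaces, the step schedule, and the guidance scale are all hypothetical placeholders, not the released API.

```python
# Illustrative sketch of inference-time touch guidance (assumed interfaces).

import torch

def touchguide_sample(visual_policy, cpm, obs_visual, obs_tactile,
                      num_steps=50, guide_last_steps=10, guidance_scale=1.0):
    """Stage 1: vision-only denoising. Stage 2: CPM-gradient refinement on late steps."""
    action = torch.randn(1, visual_policy.action_dim)  # start from noise

    for t in reversed(range(num_steps)):
        # Stage 1: standard reverse-diffusion update from the visual policy.
        action = visual_policy.denoise_step(action, t, obs_visual)

        # Stage 2: once the action is coarsely formed, steer it toward
        # higher tactile feasibility using the gradient of the CPM score.
        if t < guide_last_steps:
            action = action.detach().requires_grad_(True)
            feasibility = cpm.score(action, obs_tactile)           # scalar score
            grad = torch.autograd.grad(feasibility.sum(), action)[0]
            action = (action + guidance_scale * grad).detach()

    return action
```

Because the guidance only modifies the sampler, in the spirit of classifier guidance, the base visuomotor policy itself is left frozen, which is consistent with the "without retraining" claim above.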

Results Comparison

BibTeX


@misc{zhang2026touchguide,
  title={TouchGuide: Inference-Time Steering of Visuomotor Policies via Touch Guidance},
  author={Zhemeng Zhang and Jiahua Ma and Xincheng Yang and Xin Wen and Yuzhi Zhang and Boyan Li and Yiran Qin and Jin Liu and Can Zhao and Li Kang and Haoqin Hong and Zhenfei Yin and Philip Torr and Hao Su and Ruimao Zhang and Daolin Ma},
  year={2026},
  eprint={2601.20239},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2601.20239},
}