Capturing Shared Actions and Coordination in Collaborative Workspaces: A Markerless Pipeline for Synchronized Human-Robot 3D Pose Estimation
Federica Nenna; Michele Mingardi; Giulia Buodo; Luciano Gamberini
2026
Abstract
Recent industrial paradigms have increasingly adopted human-centric approaches that prioritize worker well-being in collaborative robotics, where humans and robots share workspaces and coordinate their actions in real time. Understanding these interactions requires tools capable of capturing the embodied, dynamic nature of human-robot collaboration while remaining non-intrusive and ecologically valid. Markerless pose estimation offers a promising solution, yet existing approaches track humans and robots separately, missing the collaborative dynamics that emerge from their synchronized movements. We present an integrated pipeline for concurrent 3D pose estimation of humans and the UR10e cobot using a single RGB-D camera. The system combines 2D keypoint detection with depth-based reconstruction to generate synchronized 3D trajectories for both agents within a unified reference frame. Quantitative validation was conducted on a custom-trained UR10e model using encoder-based ground truth, demonstrating the system's reliability across both collaborative assembly tasks and controlled trajectories. By capturing human and robot as a unified dyadic system rather than as independent entities, the pipeline enables quantitative analysis of interaction fluency, embodied coordination, and shared action dynamics, fundamental aspects for advancing safe, efficient, and human-centered collaborative robotics. GitHub repository: https://github.com/egleorl/Markerless-3D-Human-Cobot-Pose-Estimation
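The depth-based reconstruction step mentioned in the abstract, lifting a 2D keypoint to a 3D point in the camera frame using the aligned depth image, can be sketched as a standard pinhole back-projection. This is a minimal illustration of the general technique, not the authors' implementation; the function name and the example intrinsics (fx, fy, cx, cy) are assumptions for demonstration.

```python
import numpy as np

def backproject_keypoint(u, v, depth_m, fx, fy, cx, cy):
    """Back-project a detected 2D keypoint (u, v) in pixels, with its
    depth in meters from an aligned RGB-D frame, into 3D camera
    coordinates using the pinhole camera model."""
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Example with illustrative intrinsics: a keypoint at the principal
# point, 1 m away, maps to (0, 0, 1) in the camera frame.
point_3d = backproject_keypoint(320.0, 240.0, 1.0,
                                fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Applying this to every human joint and robot keypoint per synchronized frame yields the paired 3D trajectories described above; expressing both sets of points in one camera (or calibrated world) frame is what places human and cobot in a unified reference frame.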
File: Capturing_Shared_Actions_and_Coordination_in_Collaborative_Workspaces_A_Markerless_Pipeline_for_Synchronized_HumanRobot_3D_Pose_Estimation.pdf
Access: open access
Type: Published (Publisher's Version of Record)
License: Creative Commons
Size: 2.92 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.