To appear at IROS 2025
Collaborative perception enables multiple agents, such as autonomous vehicles and roadside infrastructure, to share sensor data over vehicular networks so that each agent gains an extended sensing range and better perception quality. Despite these benefits, realizing the full potential of such systems is challenging because of imperfections in the underlying system layers: network-layer imperfections and hardware-level noise. These include packet loss in vehicular networks, localization errors from GPS measurements, and synchronization errors caused by clock deviation and network latency.
To address these challenges, we propose SCORPION, a novel end-to-end collaborative perception framework that harnesses the AI co-design of the application layer and system layer to tackle the aforementioned imperfections. SCORPION consists of three main components: lost bird's-eye-view feature reconstruction (L-BEV-R), which recovers spatial features lost during lossy V2X communication, and deformable spatial cross-attention (DSCA) and temporal alignment (TA), which compensate for localization and synchronization errors during feature fusion. Experimental results on both synthetic and real-world collaborative 3D object detection datasets demonstrate that SCORPION outperforms state-of-the-art collaborative perception methods by 5.9 - 13.2 absolute AP in both standard and noisy scenarios.
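To make the imperfections concrete, the following is a minimal sketch of how packet loss and localization error might be simulated when stress-testing a collaborative perception pipeline. It is not SCORPION's implementation: the function names, the row-per-packet layout of the BEV grid, and the Gaussian pose noise model are all assumptions chosen for illustration.

```python
import random


def simulate_packet_loss(bev, rows_per_packet, loss_rate, seed=0):
    """Zero out groups of rows in a BEV feature grid to mimic dropped packets.

    Assumption: each packet carries `rows_per_packet` consecutive rows.
    Returns (received, mask), where mask[r] is True if row r arrived.
    """
    rng = random.Random(seed)
    height = len(bev)
    received = [row[:] for row in bev]  # copy so the input stays intact
    mask = [True] * height
    for start in range(0, height, rows_per_packet):
        if rng.random() < loss_rate:  # this packet is lost
            for r in range(start, min(start + rows_per_packet, height)):
                received[r] = [0.0] * len(bev[r])
                mask[r] = False
    return received, mask


def perturb_pose(x, y, yaw, pos_std, yaw_std, seed=0):
    """Add zero-mean Gaussian noise to a 2D pose (hypothetical GPS error model)."""
    rng = random.Random(seed)
    return (
        x + rng.gauss(0.0, pos_std),
        y + rng.gauss(0.0, pos_std),
        yaw + rng.gauss(0.0, yaw_std),
    )


if __name__ == "__main__":
    grid = [[float(r * 4 + c) for c in range(4)] for r in range(8)]
    received, mask = simulate_packet_loss(grid, rows_per_packet=2, loss_rate=0.5)
    print("rows received:", mask)
    print("noisy pose:", perturb_pose(10.0, 5.0, 0.0, pos_std=0.2, yaw_std=0.01))
```

In a setup like this, the mask marks which regions a reconstruction module would have to fill in, and the perturbed pose is what a fusion module would have to tolerate when aligning features from another agent.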
Figure 1: SCORPION Overview. SCORPION consists of three main components: a lost BEV feature reconstruction (L-BEV-R) module, a deformable spatial cross-attention (DSCA) module, and a temporal alignment (TA) module.
Figure 2: Qualitative Comparison on Challenging Objects. We compare SCORPION with the state-of-the-art collaborative perception methods on challenging objects, including several occluded, small, and far-away objects.
Figure 3: Qualitative Comparison. SCORPION outperforms the state-of-the-art collaborative perception methods by 5.9 - 13.2 absolute AP on both standard and noisy scenarios.
@inproceedings{zhu2025scorpion,
title={SCORPION: Robust Spatial-Temporal Collaborative Perception Model on Lossy Wireless Network},
author={Ruiyang Zhu and Minkyoung Cho and Shuqing Zeng and Fan Bai and Z. Morley Mao},
booktitle={2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
note={To appear},
year={2025}
}