OpenMulti: Open-Vocabulary Instance-Level Multi-Agent Distributed Implicit Mapping

OpenMulti facilitates information exchange between agents, effectively compensating for regions not observed by individual agents. Additionally, OpenMulti supports downstream tasks including open-vocabulary instance retrieval.

Abstract

Multi-agent distributed collaborative mapping provides comprehensive and efficient representations for robots. However, existing approaches lack instance-level awareness and semantic understanding of environments, limiting their effectiveness for downstream applications. To address this issue, we propose OpenMulti, an open-vocabulary instance-level multi-agent distributed implicit mapping framework. Specifically, we introduce a Cross-Agent Instance Alignment module, which constructs an Instance Collaborative Graph to ensure consistent instance understanding across agents. To alleviate the degradation of mapping accuracy caused by the information compression problem, we leverage Cross Rendering Supervision to enhance distributed learning of the scene. Experimental results show that OpenMulti outperforms related algorithms in both fine-grained geometric accuracy and zero-shot semantic accuracy. In addition, OpenMulti supports instance-level retrieval tasks, delivering semantic annotations for downstream applications.

Framework

First, each agent collects and processes data: it captures RGB and depth images, records poses, and performs instance segmentation along with feature extraction. The Cross-Agent Instance Alignment module then constructs an Instance Collaborative Graph to ensure semantic consistency across agents. Finally, the instance-level NeRF network maintained by each agent is optimized using a traditional loss function together with the Cross Rendering Supervision module, which addresses the information compression problem and improves the accuracy of collaborative multi-agent implicit mapping.
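This page does not spell out how the instance graph is built, so the following is only a minimal sketch of one way cross-agent alignment could work, not the authors' implementation. It assumes each local instance is summarised by a single feature vector, links instances from different agents when their cosine similarity exceeds a hypothetical `threshold`, and assigns one global ID per connected component. The function name `align_instances`, the key layout `(agent_id, local_instance_id)`, and the threshold value are all illustrative assumptions.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def align_instances(features, threshold=0.9):
    """Assign shared global IDs to instances seen by different agents.

    features: dict mapping (agent_id, local_instance_id) -> feature vector.
    threshold: hypothetical similarity cut-off for linking two instances.
    Returns a dict mapping each key to a global instance ID.
    """
    # Union-find over graph nodes; each node is one agent-local instance.
    parent = {key: key for key in features}

    def find(key):
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path compression
            key = parent[key]
        return key

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Add an edge only between instances owned by different agents.
    for a, b in combinations(sorted(features), 2):
        if a[0] != b[0] and cosine(features[a], features[b]) >= threshold:
            union(a, b)

    # Each connected component becomes one global instance.
    roots = sorted({find(key) for key in features})
    root_to_global = {root: i for i, root in enumerate(roots)}
    return {key: root_to_global[find(key)] for key in features}


if __name__ == "__main__":
    toy = {
        ("agent1", 0): [1.0, 0.0, 0.0],
        ("agent1", 1): [0.0, 1.0, 0.0],
        ("agent2", 0): [0.99, 0.05, 0.0],  # same object as ("agent1", 0)
        ("agent2", 1): [0.0, 0.0, 1.0],
    }
    print(align_instances(toy))
```

In this toy input, the first instance of each agent ends up with the same global ID, while the other two stay separate. A real system would likely also use geometric overlap between agents' observations, which this sketch leaves out.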

Results

Cross-Agent Instance Alignment

The results of scene

Panels: Agent1, Agent2, Agent3, Agent4

Multi-Agent 3D Reconstruction

Visualisation of the scene representation produced by our method.

Panels: Agent1, Agent2, Agent3, Agent4

Instance-Level Open-Vocabulary Retrieval

OpenMulti supports direct retrieval and associative retrieval tasks.
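As a rough illustration of what instance-level open-vocabulary retrieval can look like in general, the sketch below ranks instances by cosine similarity between a query embedding and per-instance feature vectors. It assumes both live in a shared embedding space (as produced by a vision-language model), uses toy vectors in place of real embeddings, and does not distinguish direct from associative retrieval. The name `retrieve` and the parameter `top_k` are hypothetical and not taken from OpenMulti.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_embedding, instance_features, top_k=1):
    """Return the IDs of the top_k instances most similar to the query.

    query_embedding: vector for the text query (assumed precomputed).
    instance_features: dict mapping global instance ID -> feature vector.
    """
    scored = [
        (cosine(query_embedding, feat), instance_id)
        for instance_id, feat in instance_features.items()
    ]
    # Highest similarity first; ties broken by instance ID for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [instance_id for _, instance_id in scored[:top_k]]


if __name__ == "__main__":
    # Toy 2-D features standing in for real embeddings.
    instances = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.7, 0.7]}
    query = [0.9, 0.1]
    print(retrieve(query, instances, top_k=2))
```

Because the instance IDs are shared across agents after alignment, a ranking of this kind could be run by any agent against its own map.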