KV-Tracker: Real-Time Pose Tracking with Transformers

Published: Dec 27, 2025 13:02
ArXiv

Analysis

This paper addresses the computational bottleneck that keeps multi-view 3D geometry networks from running in real time. It introduces KV-Tracker, a method that uses key-value (KV) caching within a Transformer architecture to speed up 6-DoF pose tracking and online reconstruction from monocular RGB videos. Because the caching strategy is model-agnostic, it can be applied to existing multi-view networks without retraining. The method also handles object tracking and reconstruction without depth measurements or object priors, which widens the range of settings where it can be used.
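To make the caching idea concrete, here is a minimal sketch of KV caching in a single attention layer. It is not the paper's implementation. The names `KVCache`, `add_keyframe`, and `track_frame`, the use of keyframes, the token shapes, and the random weights are all assumptions made for illustration. The keys and values of earlier frames are projected once and stored, so a new frame only projects its own tokens before attending over the cache.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain scaled dot-product attention (single head).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

class KVCache:
    """Hypothetical cache holding projected keys/values of past frames."""

    def __init__(self, w_k, w_v):
        self.w_k, self.w_v = w_k, w_v
        self.keys = np.empty((0, w_k.shape[1]))
        self.values = np.empty((0, w_v.shape[1]))

    def add_keyframe(self, tokens):
        # Project once and store; these rows are never recomputed.
        self.keys = np.vstack([self.keys, tokens @ self.w_k])
        self.values = np.vstack([self.values, tokens @ self.w_v])

def track_frame(tokens, w_q, cache):
    # Only the new frame's tokens are projected here; past frames
    # contribute through the cached keys and values.
    q = tokens @ w_q
    k = np.vstack([cache.keys, tokens @ cache.w_k])
    v = np.vstack([cache.values, tokens @ cache.w_v])
    return attention(q, k, v)

# Toy data: 3 cached frames and 1 new frame, 4 tokens each, dimension 8.
rng = np.random.default_rng(0)
d = 8
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
keyframes = [rng.standard_normal((4, d)) for _ in range(3)]
new_frame = rng.standard_normal((4, d))

cache = KVCache(w_k, w_v)
for kf in keyframes:
    cache.add_keyframe(kf)

cached_out = track_frame(new_frame, w_q, cache)

# Reference: re-project every frame from scratch for the same query.
all_tokens = np.vstack(keyframes + [new_frame])
full_out = attention(new_frame @ w_q, all_tokens @ w_k, all_tokens @ w_v)
```

In this single-layer toy, `cached_out` and `full_out` agree, because the cached keys and values are the same ones a full recomputation would produce. In a deep multi-view network the cached entries at later layers are frozen at the state in which they were computed, so whether the cached path exactly matches full recomputation depends on the network's attention pattern, which this summary does not specify.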

Reference

The caching strategy is model-agnostic and can be applied to other off-the-shelf multi-view networks without retraining.