LVLDrive: Enhancing Autonomous Driving with 3D Spatial Understanding
Analysis
Key Takeaways
- LVLDrive integrates LiDAR data with Vision-Language Models to improve 3D spatial understanding for autonomous driving.
- A Gradual Fusion Q-Former is used to integrate LiDAR features without disrupting pre-trained VLMs.
- A spatial-aware question-answering dataset is developed to enhance 3D perception and reasoning.
- The framework demonstrates superior performance compared to vision-only methods in driving benchmarks.
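
The summary does not say how the Gradual Fusion Q-Former avoids disrupting the pre-trained VLM. One common way to add a new modality gradually is a gate initialized at zero, so that the fused output starts out identical to the original visual queries and LiDAR information is blended in only as the gate moves away from zero. The sketch below illustrates that general idea in plain Python. It is an assumption made for illustration, not LVLDrive's actual architecture, and the names `cross_attend`, `gated_fusion`, and `gate` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, lidar_tokens):
    """Single-head dot-product attention of one query vector over LiDAR tokens.

    Hypothetical helper: the tokens act as both keys and values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, tok)) / scale for tok in lidar_tokens]
    weights = softmax(scores)
    return [
        sum(w * tok[i] for w, tok in zip(weights, lidar_tokens))
        for i in range(len(query))
    ]


def gated_fusion(visual_queries, lidar_tokens, gate):
    """Add attended LiDAR features to each visual query, scaled by tanh(gate).

    Assumed mechanism: with gate == 0 the scale is exactly 0, so the visual
    queries pass through unchanged.
    """
    alpha = math.tanh(gate)
    fused = []
    for q in visual_queries:
        attended = cross_attend(q, lidar_tokens)
        fused.append([qi + alpha * ai for qi, ai in zip(q, attended)])
    return fused


# Toy 2-dimensional features, made up for the example.
visual_queries = [[1.0, 0.0], [0.0, 1.0]]
lidar_tokens = [[0.5, 0.5], [2.0, -1.0], [0.0, 3.0]]

unchanged = gated_fusion(visual_queries, lidar_tokens, gate=0.0)
blended = gated_fusion(visual_queries, lidar_tokens, gate=1.0)
```

With `gate=0.0` the function returns the visual queries untouched, which is the property that would let a pre-trained model behave as before at the start of training. With a nonzero gate, each query is shifted toward the LiDAR tokens it attends to most strongly.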
“LVLDrive achieves superior performance compared to vision-only counterparts across scene understanding, metric spatial perception, and reliable driving decision-making.”