Learning to Sense for Driving: Joint Optics-Sensor-Model Co-Design for Semantic Segmentation
Published: Dec 25, 2025 05:00
•1 min read
•ArXiv Vision
Analysis
This paper presents a novel approach to autonomous driving perception that co-designs the optics, the sensor model, and the semantic segmentation network. It challenges the traditional practice of decoupling camera design from perception and instead proposes a unified end-to-end pipeline. The key innovation is optimizing the entire system, from RAW image acquisition to semantic segmentation, for task-specific objectives rather than for human-oriented image quality. The results on KITTI-360 show significant mIoU improvements, particularly for challenging classes. The compact model size and high frame rate suggest the system is practically deployable. This research highlights the potential of full-stack co-optimization for building more efficient and robust perception systems for autonomous vehicles, moving beyond traditional, human-centric image processing pipelines.
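As a rough illustration of the co-design idea, the sketch below wires a differentiable optics stage, a simplified sensor/noise model, and a small segmentation head into one PyTorch module trained with a single task loss. It is a minimal sketch under assumed conventions: the module names (LearnableOptics, SensorModel, TinySegNet, CoDesignedPipeline), the noise parameters, and the architecture are illustrative, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableOptics(nn.Module):
    """Differentiable optics stage: a learnable point-spread function (PSF)
    applied as a depthwise convolution over a RAW-like input."""
    def __init__(self, channels=3, kernel_size=7):
        super().__init__()
        # Initialize near a delta function so training starts close to "no blur".
        psf = torch.zeros(channels, 1, kernel_size, kernel_size)
        psf[:, :, kernel_size // 2, kernel_size // 2] = 1.0
        self.psf_logits = nn.Parameter(psf + 0.01 * torch.randn_like(psf))
        self.kernel_size = kernel_size

    def forward(self, x):
        # Softmax over spatial taps keeps each PSF non-negative and normalized.
        k = F.softmax(self.psf_logits.flatten(2), dim=-1).view_as(self.psf_logits)
        return F.conv2d(x, k, padding=self.kernel_size // 2, groups=x.shape[1])

class SensorModel(nn.Module):
    """Toy sensor stage: learnable per-channel gain plus shot/read noise
    injected during training (reparameterized so gradients still flow)."""
    def __init__(self, channels=3, read_noise=0.01):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.read_noise = read_noise

    def forward(self, x):
        x = self.gain * x
        if self.training:
            std = torch.sqrt(x.clamp_min(0.0) * 0.01 + self.read_noise ** 2)
            x = x + std * torch.randn_like(x)
        return x.clamp(0.0, 1.0)

class TinySegNet(nn.Module):
    """Compact encoder-decoder standing in for the segmentation network."""
    def __init__(self, in_ch=3, num_classes=19):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        logits = self.head(self.enc(x))
        return F.interpolate(logits, scale_factor=4, mode="bilinear", align_corners=False)

class CoDesignedPipeline(nn.Module):
    """Optics, sensor, and segmentation model optimized jointly by one task loss."""
    def __init__(self, num_classes=19):
        super().__init__()
        self.optics = LearnableOptics()
        self.sensor = SensorModel()
        self.segnet = TinySegNet(num_classes=num_classes)

    def forward(self, raw):
        return self.segnet(self.sensor(self.optics(raw)))

if __name__ == "__main__":
    model = CoDesignedPipeline(num_classes=19)
    raw = torch.rand(2, 3, 128, 256)                # stand-in RAW-like captures
    labels = torch.randint(0, 19, (2, 128, 256))    # stand-in segmentation labels
    loss = F.cross_entropy(model(raw), labels)      # single task-driven objective
    loss.backward()                                 # gradients reach the optics too
    print(loss.item())
```

Because the segmentation loss backpropagates through the sensor and optics parameters, the imaging front end is shaped by task accuracy rather than by image-quality heuristics, which is the core of the paper's argument.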
Key Takeaways
- End-to-end co-design of optics, sensor, and model improves semantic segmentation performance.
- Task-driven optimization leads to better performance than human-centric image processing.
- The proposed system is efficient and deployable on edge devices.
Reference
“Evaluations on KITTI-360 show consistent mIoU improvements over fixed pipelines, with optics modeling and CFA learning providing the largest gains, especially for thin or low-light-sensitive classes.”
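The quoted gains from CFA learning indicate that the color filter array itself is treated as a trainable component. One minimal way to express that idea, again as an assumed PyTorch sketch with a hypothetical LearnableCFA module rather than the paper's design, is a per-tile softmax over color channels whose temperature can be annealed toward a hard, Bayer-like pattern:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableCFA(nn.Module):
    """Learnable color filter array: each cell of a small tile holds logits over
    the color channels; a temperature-controlled softmax gives a soft filter
    selection that can be annealed toward a hard (Bayer-like) mosaic."""
    def __init__(self, tile=4, channels=3):
        super().__init__()
        self.logits = nn.Parameter(torch.randn(channels, tile, tile))
        self.tile = tile

    def forward(self, rgb, temperature=1.0):
        b, c, h, w = rgb.shape  # assumes h and w are divisible by the tile size
        # Soft per-cell channel selection; lower temperature -> closer to one-hot.
        weights = F.softmax(self.logits / temperature, dim=0)        # (C, tile, tile)
        # Tile the learned pattern over the full image resolution.
        mask = weights.repeat(1, h // self.tile, w // self.tile)     # (C, H, W)
        # Weighted sum over channels yields a single-channel mosaic measurement.
        mosaic = (rgb * mask.unsqueeze(0)).sum(dim=1, keepdim=True)  # (B, 1, H, W)
        return mosaic

if __name__ == "__main__":
    cfa = LearnableCFA(tile=4)
    scene = torch.rand(1, 3, 64, 64)
    mosaic = cfa(scene, temperature=0.5)
    print(mosaic.shape)  # torch.Size([1, 1, 64, 64])
```

Training such a module jointly with the segmentation head would let the filter layout adapt to the classes that matter most, which is one plausible reading of why thin or low-light-sensitive classes benefit in the quoted evaluation.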