We propose a novel cross-modality KD framework to enable LC-to-CR distillation in the BEV feature space. With the transferred knowledge from an LC teacher detector, the CR student detector can outperform existing baselines without additional cost during inference.
We design four KD modules that address the notable discrepancies between sensors and enable effective cross-modality KD. Because we operate KD in the BEV space, the proposed loss designs can be applied to other KD configurations. We also add a gated network to the baseline model for adaptive fusion.
We conduct extensive evaluation on nuScenes to demonstrate the effectiveness of CRKD. CRKD can improve the mAP and NDS of student detectors by 3.5% and 3.2%, respectively. Since our method focuses on a novel KD path with a distinctively large modality gap, we provide a thorough study and analysis to support our design choices.
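To make the idea of distilling in the BEV feature space concrete, the following is a minimal sketch of a generic feature-imitation loss between a teacher and a student BEV feature map. It is not one of the four CRKD modules: the function name `bev_feature_kd_loss`, the `(C, H, W)` layout, and the optional per-cell `mask` are assumptions made for illustration only.

```python
import numpy as np


def bev_feature_kd_loss(student_bev, teacher_bev, mask=None):
    """Mean squared error between student and teacher BEV feature maps.

    student_bev, teacher_bev: arrays of shape (C, H, W) on the same BEV grid
        (assumed layout, not taken from the paper).
    mask: optional (H, W) array of non-negative weights, e.g. 1 for cells
        treated as foreground (hypothetical weighting scheme).
    """
    student_bev = np.asarray(student_bev, dtype=np.float64)
    teacher_bev = np.asarray(teacher_bev, dtype=np.float64)
    if student_bev.shape != teacher_bev.shape:
        raise ValueError("student and teacher BEV features must share a shape")

    # Element-wise squared error, shape (C, H, W).
    sq_err = (student_bev - teacher_bev) ** 2

    if mask is None:
        # Unweighted: average over all channels and cells.
        return float(sq_err.mean())

    mask = np.asarray(mask, dtype=np.float64)
    # Average over channels first, then take a mask-weighted mean over cells.
    per_cell = sq_err.mean(axis=0)
    total = mask.sum()
    if total == 0:
        return 0.0
    return float((per_cell * mask).sum() / total)
```

In a training loop, a term like this would typically be added to the student's detection loss with a scalar weight, and the teacher's features would be treated as constants. Because the loss only needs two feature maps on a shared BEV grid, the same form carries over to other teacher-student sensor pairings.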
@inproceedings{zhao2024crkd,
author = {Zhao, Lingjun and Song, Jingyu and Skinner, Katherine A},
title = {CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation},
booktitle = {2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2024},
}