NEO-KD strengthens adversarial training of multi-exit networks through knowledge distillation, and how much it helps depends on careful hyperparameter tuning.
The hyperparameters α and β in NEO-KD must be tuned to balance knowledge transfer against the effectiveness of adversarial training.
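The balancing role of α and β can be illustrated with a minimal sketch. It assumes, as a simplification not stated above, that α and β are scalar weights on two distillation terms added to an adversarial training loss. The names `combined_loss`, `distill_a`, `distill_b`, and `candidate_grid` are hypothetical and are not taken from NEO-KD itself.

```python
from itertools import product


def combined_loss(adv_loss, distill_a, distill_b, alpha, beta):
    """Adversarial training loss plus two weighted distillation terms.

    Assumption: alpha and beta act as plain scalar weights. The names
    distill_a and distill_b are hypothetical placeholders.
    """
    return adv_loss + alpha * distill_a + beta * distill_b


def candidate_grid(alphas, betas):
    """List every (alpha, beta) pair to try in a simple grid search."""
    return list(product(alphas, betas))


# Illustrative numbers only, not results from any experiment.
print(combined_loss(1.0, 0.5, 0.25, alpha=2.0, beta=4.0))
print(candidate_grid([0.5, 1.0], [1.0, 2.0]))
```

Under this assumption, larger α or β gives the distillation terms more influence relative to the adversarial loss, and setting both to zero recovers plain adversarial training. That is the trade-off the tuning has to resolve.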