The NEO-KD algorithm balances knowledge distillation across the exits of a multi-exit network to make adversarial training more effective; in practice, its performance depends heavily on how its hyperparameters are tuned.
In the NEO-KD objective function, careful selection of the hyperparameters α and β is essential; extreme values can either hinder knowledge distillation or compromise adversarial robustness.
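As a rough illustration only (not the paper's exact formulation), the sketch below assumes the overall objective is a weighted sum of the adversarial loss and two knowledge-distillation terms, with α and β as the weights; all names and values are placeholders.

```python
# Hypothetical sketch of how alpha and beta enter a NEO-KD-style objective.
def combined_loss(adv_loss: float, kd_loss_1: float, kd_loss_2: float,
                  alpha: float, beta: float) -> float:
    # alpha and beta scale the two knowledge-distillation terms relative to the
    # adversarial loss. Near-zero values effectively switch distillation off;
    # overly large values let distillation dominate the adversarial term and
    # can compromise adversarial robustness.
    return adv_loss + alpha * kd_loss_1 + beta * kd_loss_2

# Illustrative tuning loop: sweep a small grid and keep the setting with the
# best adversarial accuracy on a validation set (loss values are placeholders).
for alpha in (0.5, 1.0, 2.0):
    for beta in (0.5, 1.0, 2.0):
        loss = combined_loss(adv_loss=1.2, kd_loss_1=0.4, kd_loss_2=0.3,
                             alpha=alpha, beta=beta)
        print(f"alpha={alpha}, beta={beta}: total loss {loss:.2f}")
```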
#adversarial-training #knowledge-distillation #multi-exit-networks #hyperparameter-tuning #machine-learning