Apparate: Early-Exit Models for ML Latency and Throughput Optimization - Implementation | HackerNoon
Briefly

Apparate builds on TensorFlow-Serving and Clockwork to optimize serving performance, and it operates on models in the ONNX format.
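As a point of reference, a model headed into such a pipeline is typically exported to ONNX along these lines. This is a generic PyTorch sketch, not Apparate's own code; the model, file name, and shapes are placeholders.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Placeholder model standing in for a served backbone."""

    def __init__(self, in_features: int = 128, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = TinyClassifier().eval()
dummy_input = torch.randn(1, 128)  # example input used to trace the graph

# Export to ONNX so an ONNX-consuming serving stack can load the model.
torch.onnx.export(
    model,
    dummy_input,
    "tiny_classifier.onnx",
    input_names=["input"],
    output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)
```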
Ramp training uses only the first 10% of each dataset, splitting that slice 1:9 between training and validation for efficiency, as sketched below.
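A minimal sketch of that data handling, assuming the 1:9 ratio applies to the training-versus-validation portions of the 10% slice (the direction of the ratio and all names here are assumptions, not Apparate's actual code):

```python
from typing import Sequence, Tuple


def ramp_training_slices(
    dataset: Sequence,
    ramp_fraction: float = 0.10,  # first 10% of the dataset feeds ramp training
    train_ratio: float = 0.10,    # assumed 1:9 train/validation split of that slice
) -> Tuple[Sequence, Sequence]:
    """Return (train, validation) subsets drawn from the head of the dataset."""
    head = dataset[: int(len(dataset) * ramp_fraction)]
    cut = int(len(head) * train_ratio)
    return head[:cut], head[cut:]


if __name__ == "__main__":
    samples = list(range(1000))       # stand-in for serving-trace inputs
    train, val = ramp_training_slices(samples)
    print(len(train), len(val))       # 10 and 90 samples from the first 100
```

The `train_ratio` parameter makes it easy to flip the split if the intended direction differs.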