Apparate: Early-Exit Models for ML Latency and Throughput Optimization - Abstract and Introduction | HackerNoon

Apparate effectively reduces latency in ML inference by using early exits without compromising throughput or accuracy.