Import AI
5 months ago
Artificial intelligence

The PowerInfer method makes language model inference more efficient by keeping the most frequently activated neurons on the GPU and computing the rest on the CPU.
PowerInfer offers significant efficiency improvements over previous methods by exploiting the power-law distribution of neuron activation in language models: a small subset of neurons accounts for most activations.
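To make the GPU/CPU split concrete, here is a minimal Python sketch of the general idea, not PowerInfer's actual placement policy. It assumes we already have a per-neuron activation count from some profiling run, and it simply ranks neurons by that count and assigns the top ones to the GPU. The function names, the `gpu_budget` parameter, and the synthetic Zipf-like counts are all hypothetical and chosen for illustration only.

```python
def split_hot_cold(activation_counts, gpu_budget):
    """Split neuron indices into (hot, cold) lists.

    hot  -> the `gpu_budget` most frequently activated neurons (GPU-resident)
    cold -> every other neuron (computed on the CPU)
    Ranking by raw count is an illustrative assumption, not PowerInfer's policy.
    """
    ranked = sorted(
        range(len(activation_counts)),
        key=lambda i: activation_counts[i],
        reverse=True,
    )
    hot = sorted(ranked[:gpu_budget])
    cold = sorted(ranked[gpu_budget:])
    return hot, cold


def coverage(activation_counts, hot):
    """Fraction of all recorded activations that land on the hot neurons."""
    total = sum(activation_counts)
    if total == 0:
        return 0.0
    return sum(activation_counts[i] for i in hot) / total


# Synthetic, power-law-like profile: the neuron at rank r fires about 1000/r times.
counts = [1000 // r for r in range(1, 101)]

# Hypothetical budget: only 20 of the 100 neurons fit on the GPU.
hot, cold = split_hot_cold(counts, gpu_budget=20)
share = coverage(counts, hot)
print(f"{len(hot)} hot neurons, {len(cold)} cold neurons, hot share = {share:.2f}")
```

With a skewed profile like this one, a small GPU-resident set handles well over half of the activations, which is the property the power-law observation relies on.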
