Data science
from InfoQ
1 day ago
Gemma 4 Multi-Token Prediction Delivers Up to ~3x Faster Token Generation
Gemma 4 can pair multi-token prediction drafters with speculative decoding: the drafter proposes several tokens ahead, and the model verifies them in parallel, speeding up token generation by up to ~3x without quality loss.
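To make the draft-then-verify idea concrete, here is a minimal sketch of the greedy variant of speculative decoding. It is not Gemma's API or implementation: `toy_target`, `toy_draft`, `speculative_step`, and `generate` are hypothetical names, the "models" are plain functions over integer tokens, and the verification loop stands in for what a real system would do in a single batched forward pass. Production systems that sample rather than decode greedily typically use a probabilistic acceptance rule instead of the exact-match check shown here.

```python
from typing import Callable, List

Token = int
NextTokenFn = Callable[[List[Token]], Token]


def speculative_step(target_next: NextTokenFn, draft_next: NextTokenFn,
                     context: List[Token], k: int = 4) -> List[Token]:
    """Run one draft-then-verify round and return the tokens to append."""
    # 1. The cheap drafter proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft_next(context + proposed))

    # 2. The target model checks each proposed position. The calls are
    #    written as a loop here; a real system would score all positions
    #    in one parallel forward pass.
    accepted: List[Token] = []
    for i in range(k):
        expected = target_next(context + proposed[:i])
        if expected != proposed[i]:
            # First mismatch: keep the target's token, drop the rest.
            accepted.append(expected)
            return accepted
        accepted.append(proposed[i])

    # 3. Every draft token matched, so the target's next token comes for free.
    accepted.append(target_next(context + proposed))
    return accepted


def generate(target_next: NextTokenFn, draft_next: NextTokenFn,
             prompt: List[Token], n_tokens: int, k: int = 4) -> List[Token]:
    """Generate n_tokens after the prompt using repeated speculative steps."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(target_next, draft_next, out, k))
    return out[:len(prompt) + n_tokens]


# Toy stand-ins (assumptions for illustration only).
def toy_target(ctx: List[Token]) -> Token:
    # "Large model": counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    # "Drafter": agrees with the target except right after token 2.
    return 0 if ctx[-1] == 2 else (ctx[-1] + 1) % 10
```

Because every accepted token is either confirmed or supplied by the target model, the greedy output is identical to decoding with the target alone; the speedup comes from rounds in which several draft tokens are accepted at once.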