phi-3-mini: The 3.8B Powerhouse Reshaping LLM Performance on Your Phone | HackerNoon
Briefly

Phi-3-mini is a 3.8 billion parameter language model trained on 3.3 trillion tokens. It performs comparably to much larger rivals such as Mixtral 8x7B and GPT-3.5, achieving 69% on MMLU and 8.38 on MT-bench. Its capabilities stem from a training dataset of heavily filtered public web data and synthetic data. Scaled-up siblings, phi-3-small and phi-3-medium with 7B and 14B parameters respectively, achieve even stronger benchmark results, and the phi-3-vision model demonstrates strong reasoning over both textual and visual inputs.
Phi-3-mini is a 3.8 billion parameter language model trained on 3.3 trillion tokens, demonstrating performance competitive with models like Mixtral 8x7B and GPT-3.5.
The innovations in phi-3-mini stem from a newly scaled-up training dataset composed of heavily filtered publicly available web data and synthetic data.
Initial parameter-scaling results show that the phi-3-small and phi-3-medium models improve significantly over phi-3-mini, achieving 75% and 78% on MMLU, respectively.
Phi-3-vision, based on phi-3-mini, features strong reasoning capabilities and operates well with both image and text prompts.
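For readers who want to try the model on-device, below is a minimal sketch of loading phi-3-mini through the Hugging Face transformers library. The checkpoint name microsoft/Phi-3-mini-4k-instruct, the chat-template prompt, and the generation settings are assumptions for illustration, not details taken from the article.

```python
# Minimal sketch: load phi-3-mini and generate a short reply.
# Assumes the "microsoft/Phi-3-mini-4k-instruct" checkpoint is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the 3.8B model small enough for edge hardware
    device_map="auto",
)

# Format a single-turn chat prompt and generate a short answer.
messages = [{"role": "user", "content": "Summarize why a 3.8B model can run on a phone."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

In practice, phone or edge deployment would typically use a quantized build (for example 4-bit) rather than float16, but the loading and generation flow is the same.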