LLaVA-Phi: How We Rigorously Evaluated It Using an Extensive Array of Academic Benchmarks | HackerNoon
Briefly

We rigorously evaluated LLaVA-Phi across an extensive array of academic benchmarks designed for multi-modal models, where it achieved strong performance on visual question-answering.
LLaVA-Phi outperformed numerous existing large multimodal models, with particularly notable results on ScienceQA, which we attribute to its base language model's training on code-generation and mathematical corpora.