Microsoft launches Phi models optimized for multimodal processing
Microsoft expands its Phi language model line with Phi-4-mini and Phi-4-multimodal for improved multimodal processing and hardware efficiency.
Google Gemini: Everything you need to know about the new generative AI platform | TechCrunch
Gemini is Google's next-gen generative AI that supports multimodal processing, going beyond text.
Nexa AI Unveils Omnivision: A Compact Vision-Language Model for Edge AI
Omnivision is a compact vision-language model for edge devices, reducing image tokens and improving efficiency without compromising accuracy.
Rhymes AI Unveils Aria: Open-Source Multimodal Model with Development Resources
Aria is a leading multimodal MoE model outperforming both open and proprietary counterparts in testing.
Using Large Language Models for Zero-Shot Video Generation: A VideoPoet Case Study | HackerNoon
VideoPoet synthesizes high-quality videos using a transformer model that integrates multiple conditioning signals across various modalities.
Amazon announces its own set of Nova AI models
Amazon launched the 'Nova' series of AI foundation models, enhancing capabilities in text and multimodal processing for various applications.
Can DreamLLM Surpass the 30% Turing Test Requirement? | HackerNoon
DreamLLM enhances multimodal document creation by autonomously generating text and images based on user instructions.