#vision-language

Gadgets
from TechCrunch
1 day ago

I watched LG's new home robot CLOid do laundry but I have questions | TechCrunch

LG's CLOid is an AI-powered, autonomous home robot with arms and sensors designed to perform diverse domestic tasks and integrate with smart-home systems.
from PyImageSearch
1 month ago

Grounding DINO: Open Vocabulary Object Detection on Videos - PyImageSearch

Imagine asking a friend to find any object in a picture simply by describing it. This is the promise of open-set object detection: the ability to spot and localize arbitrary objects (even ones never seen in training) by name or description. Unlike a closed-set detector trained on a fixed list of classes (say, "cat", "dog", "car"), an open-set detector can handle new categories on the fly, simply from language cues.
Python
Apple
from InfoQ
1 month ago

AnyLanguageModel: Unified API for Local and Cloud LLMs on Apple Platforms

AnyLanguageModel provides a unified Swift API enabling interchangeable use of local Core ML/MLX and remote cloud language models, supporting vision-language prompts and minimizing dependencies.
Artificial intelligence
from HackerNoon
7 months ago

Chameleon Sets New Benchmarks in AI Image-Text Tasks | HackerNoon

Chameleon sets a new standard for multimodal machine learning with a unified token-based architecture, improving reasoning across image and text.