#image-processing

#agentic-ai
#ai
Python
fromPycon
1 week ago

Python and the Future of AI: Agents, Inference, and Edge AI

AI tools are increasingly integrated into development, with a dedicated track at PyCon US focusing on their future and practical applications.
fromPsychology Today
1 month ago
Artificial intelligence

AI Spots Brain Disorders in Seconds From Scans

Prima diagnoses over 50 brain disorders from MRI scans in seconds with up to 97.5% accuracy and serves as a foundation model for neuroimaging.
Mobile UX
fromGSMArena.com
10 hours ago

The Honor 600 series will bring the much-improved AI Image to Video 2.0 feature

Honor's upcoming 600 series features AI Image to Video 2.0, enabling users to create videos from images and text commands.
Artificial intelligence
fromFuturism
1 week ago

Frontier AI Models Are Doing Something Absolutely Bizarre When Asked to Diagnose Medical X-Rays

Hallucinations and 'mirage reasoning' in AI models pose significant risks, especially in healthcare applications, leading to potentially dangerous misinformation.
Marketing tech
fromInfoQ
1 day ago

Reimagining Platform Engagement with Graph Neural Networks

Graph neural networks can enhance recommender systems by personalizing content and optimizing for long-term user engagement.
#ai-agents
Data science
fromMedium
1 week ago

15 Datasets for Training and Evaluating AI Agents

Datasets for training and evaluating AI agents are essential for building reliable agentic systems and preventing execution failures.
fromEngadget
1 month ago
Artificial intelligence

NVIDIA is reportedly working on its own open-source AI agent platform

NVIDIA is developing NemoClaw, an enterprise-focused open-source AI agent platform designed to work across non-NVIDIA hardware with enhanced security features.
fromTechCrunch
1 month ago
Artificial intelligence

Perplexity's new Computer is another bet that users need many AI models

Perplexity launches Computer, an agentic tool for Max subscribers that unifies AI capabilities to execute complex workflows independently using 19 models and create subagents.
Social media marketing
fromTechCrunch
6 days ago

X is rolling out automatic translation and photo editing powered by Grok

X introduces automatic translation and a new photo editor powered by Grok models to enhance user experience.
Photography
fromTechRepublic
6 days ago

Google Photos Adds One-Tap 'AI Enhance' Tool, Video Speed Controls

Google introduces an 'AI Enhance' button in Photos for easy image improvement and adds video playback speed controls for Android users.
Python
fromEfficientcoder
4 days ago

Build Your Own AI Meme Matcher: A Beginner's Guide to Computer Vision with Python

Computer Vision enables real-time facial recognition and meme matching using Object-Oriented Programming for clean and organized code.
Software development
fromArs Technica
2 weeks ago

Running local models on Macs gets faster with Ollama's MLX support

Ollama enhances local language model performance on Apple Silicon with MLX support and improved caching, catering to growing interest in local models.
Data science
fromInfoWorld
1 week ago

Why 'curate first, annotate smarter' is reshaping computer vision development

Strategic data selection and curation reduce annotation costs and enhance development productivity in computer vision teams.
Python
fromMathspp
5 days ago

uv skills for coding agents

Utilizing uv workflows enhances Python code execution and script management for coding agents, ensuring proper handling of dependencies and sandboxing.
DevOps
fromInfoWorld
3 weeks ago

An architecture for engineering AI context

AI systems must intelligently manage context to ensure accuracy and reliability in real applications.
Artificial intelligence
fromTheregister
1 week ago

Microsoft shivs OpenAI with new AI models for speech, images

Microsoft launched public preview versions of machine learning models for speech recognition, speech synthesis, and image generation, competing directly with OpenAI.
Apple
fromInfoQ
3 weeks ago

Apple Improves Context Window Management for its Foundation Models

iOS 26.4 enhances context window management for Apple's Foundation Models, enabling developers to optimize usage within the 4096-token limit.
Artificial intelligence
fromFortune
2 weeks ago

Is AI's visual understanding mostly a 'mirage'? New research suggests so.

Anthropic faces significant cybersecurity risks following multiple sensitive data leaks related to its new AI model, Mythos.
Business intelligence
fromComputerWeekly.com
3 weeks ago

AI tools offer 'near-real-time' analysis of data from seized mobile phones and computers

Cellebrite's AI-powered Guardian Investigate platform enables police to rapidly analyze mobile device data, discover connections between datasets, track phone locations over time, and construct event timelines for major crime investigations.
Science
fromThe Cipher Brief
3 weeks ago

Why the U.S. Must Build the Ultimate Multi-Modal Foundation Model

Advanced AI models like AlphaEarth demonstrate pixel-level geospatial intelligence capabilities that must be integrated into U.S. national security frameworks to maintain technological leadership.
Marketing tech
fromTNW | Microsoft
3 weeks ago

Microsoft's MAI-Image-2 enters the top three AI image generators

Microsoft's MAI-Image-2 ranks third on Arena.ai's image generation leaderboard, behind only Google and OpenAI, and is now rolling out across Copilot and Bing Image Creator.
Photography
fromInfoQ
4 weeks ago

Image Processing for Automated Tests

Image-based test automation using AI algorithms enables testing applications without access to internal states like DOM or component trees, providing visual representations to identify intended versus faulty states.
Python
fromBusiness Matters
2 weeks ago

Building AI-powered visual solutions: How Python forms the foundation for advanced Computer Vision use cases

Python is the preferred programming language for developing computer vision technologies due to its simplicity, flexibility, and extensive libraries.
Mobile UX
fromEngadget
1 month ago

Nothing updates its AI app with semantic search and a new way to track events

Nothing's updated Essential Space app now recognizes events from images and supports semantic search, making it easier to organize and find screenshots, voice recordings, and other digital content on 2025 and 2026 Nothing phones.
Artificial intelligence
fromMedium
3 weeks ago

Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI

Model quantization and architectural optimization can outperform larger models, challenging the belief that more GPUs equal greater intelligence.
Apple
fromFast Company
1 month ago

Photoshop's new AI assistant makes it easier than ever to edit images

Adobe launches an AI assistant for Photoshop Web and Mobile that enables intuitive photo editing through prompts, voice commands, and touch navigation, with results integrable into full Adobe creative workflows.
fromNature
1 month ago

Merlin: a computed tomography vision-language foundation model and dataset

The large volume of abdominal computed tomography (CT) scans, coupled with the shortage of radiologists, has intensified the need for automated medical image analysis tools. Previous state-of-the-art approaches for automated analysis leverage vision-language models (VLMs) that jointly model images and radiology reports.
Medicine
Artificial intelligence
fromEngadget
1 month ago

You can (sort of) block Grok from editing your uploaded photos

X and xAI introduced a feature allowing users to block Grok from modifying their uploaded images, but this limited measure fails to address widespread misuse of the image generation tool for creating nonconsensual intimate imagery.
Python
fromPyImageSearch
1 month ago

DeepSeek-V3 Model: Theory, Config, and Rotary Positional Embeddings

DeepSeek-V3 introduces architectural innovations, including Multi-head Latent Attention, which reduces KV cache memory by 75% while maintaining model quality, to address critical challenges in inference efficiency, training cost, and long-range dependency capture.
World news
fromBored Panda
2 months ago

Using AI For Good: 20 Portraits Of Missing Individuals To Help Find Them

Updating missing persons' images on milk cartons led to nationwide attention, found eight people, and turned an advertising effort into a lifesaving social movement.
Tech industry
fromTheregister
2 months ago

How Nvidia is using emulation to turn AI FLOPS into FP64

Nvidia achieves higher FP64 throughput through software emulation on Rubin GPUs, trading hardware FP64 for emulated matrix performance up to 200 TFLOPS.
Artificial intelligence
fromComputerWeekly.com
1 month ago

Edge AI: What's working and what isn't

Edge AI deployment success depends on identifying efficient, narrow use cases with manageable risks rather than pursuing sophisticated, large-scale models across all applications.
Information security
fromThe Hacker News
1 month ago

From Exposure to Exploitation: How AI Collapses Your Response Window

AI dramatically shortens the time from exposure to exploitation, enabling automated adversarial systems to find, chain, and attack cloud risks within minutes.
Science
fromFuturism
2 months ago

AI Discovers Hundreds of Anomalies in Archive of Hubble Images

A custom AI tool scanned Hubble archives and rapidly detected over 1,300 astrophysical anomalies, many previously undocumented, including galactic mergers and jellyfish galaxies.
fromMedium
2 months ago

From Graphs to Generative AI: Building Context That Pays - Part 1

Every year, poor communication and siloed data bleed companies of productivity and profit. Research shows U.S. businesses lose up to $1.2 trillion annually to ineffective communication, or about $12,506 per employee per year. This stems from breakdowns that waste an average of 7.47 hours per employee each week on miscommunications. The damage isn't only interpersonal; it's structural. Disconnected and fragmented data systems mean that employees spend around 12 hours per week just searching for information trapped in those silos.
Data science
Python
fromPyImageSearch
1 month ago

SAM 3 for Video: Concept-Aware Segmentation and Object Tracking

SAM 3 extends beyond static image segmentation to video by maintaining streaming memory and tracking state, enabling unified detection, segmentation, and tracking across frames while preserving object identity over time.
fromYanko Design - Modern Industrial Design News
1 month ago

Nvidia wants robots to learn before executing tasks by watching 44,000 hours of human video

For now, the robotics industry's biggest challenge is teaching robots to operate in the messy real world. Because that environment is unstructured, robots need massive amounts of data to learn. Gathering and structuring that data is the costliest part of robotics and perhaps its biggest impediment, slowing the entire development process.
Artificial intelligence
Artificial intelligence
fromBusiness
1 month ago

Image to Image AI: A Smarter Way to Transform and Enhance Visual Content

Image to Image AI transforms existing photos into enhanced or stylized versions using artificial intelligence, eliminating the need for manual editing skills or complex tools.
Python
fromPyImageSearch
2 months ago

Grounded SAM 2: From Open-Set Detection to Segmentation and Tracking

Grounded SAM 2 extends Grounding DINO by adding pixel-level segmentation and video-aware tracking to convert language-driven detections into precise, persistent object masks.
fromTechCrunch
1 month ago

Google launches Nano Banana 2 model with faster image generation

The new Nano Banana 2 retains some of the high-fidelity characteristics of the Pro model but produces images faster. The company says you can create images with a resolution ranging from 512px to 4K, in different aspect ratios.
Artificial intelligence
fromPyImageSearch
1 month ago

Vector Search with FAISS: Approximate Nearest Neighbor (ANN) Explained

In the previous lesson, you learned how to turn text into embeddings - compact, high-dimensional vectors that capture semantic meaning. By computing cosine similarity between these vectors, you could find which sentences or paragraphs were most alike. That worked beautifully for a small handcrafted corpus of 30-40 paragraphs. But what if your dataset grows to millions of documents or billions of image embeddings? Suddenly, your brute-force search breaks down - and that's where Approximate Nearest Neighbor (ANN) methods come to the rescue.
Python
Artificial intelligence
fromMail Online
1 month ago

Can you tell the difference between real and AI-generated people?

People are overconfident in their ability to distinguish AI-generated faces from real ones and perform only slightly better than chance.
Python
fromPyImageSearch
2 months ago

TF-IDF vs. Embeddings: From Keywords to Semantic Search

Vector databases and embeddings enable semantic search and retrieval-augmented generation by mapping text meaning into geometric vectors for similarity-based retrieval.
Artificial intelligence
fromHackernoon
2 months ago

Segment Anything in Motion: A Hands-On Guide to sam3-video

sam3-video is a unified foundation model from Meta Research for prompt-based segmentation that performs segmentation in both images and videos.
Python
fromPyImageSearch
1 month ago

Vector Search Using Ollama for Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) augments LLMs with retrieved context from vector search (FAISS) to produce accurate, up-to-date, evidence-grounded responses.
fromInfoWorld
2 months ago

16 open source projects transforming AI and machine learning

It's no different with machine learning and large language models. If anything, the open source ecosystem has grown richer and more complex, because now there are open source models to complement the open source code. For this article, we've pulled together some of the most intriguing and useful projects for AI and machine learning. Many of these are foundation projects, nurturing their own niche ecology of open source plugins and extensions.
Artificial intelligence
Artificial intelligence
fromHackernoon
2 months ago

This "Flash" AI Model Is Fast and Dangerous at Math - Here's What It Can Do

GLM-4.7-Flash is a 30-billion-parameter mixture-of-experts model offering strong performance for lightweight deployment.
fromInfoQ
2 months ago

Building Embedding Models for Large-Scale Real-World Applications

What happens under the hood? How can the search engine take that simple query and search the billions, even trillions, of images available online? How does it find this photo, or similar ones, among all of them? Usually, an embedding model is doing this work under the hood.
Artificial intelligence
Artificial intelligence
fromInfoQ
2 months ago

Foundation Models for Ranking: Challenges, Successes, and Lessons Learned

Large-scale search and recommendation systems use two-stage retrieval and ranking pipelines to efficiently serve personalized results for hundreds of millions of users and items.
Artificial intelligence
fromInfoWorld
2 months ago

What is context engineering? And why it's the new AI architecture

Context engineering designs and manages the information, tools, and constraints an LLM receives, enabling scalable, high-signal inputs and improved model outcomes.
fromenglish.elpais.com
2 months ago

How does artificial intelligence think? 'The big surprise is that it intuits'

Each of these achievements would have been a remarkable breakthrough on its own. Solving them all with a single technique is like discovering a master key that unlocks every door at once. Why now? Three pieces converged: algorithms, computing power, and massive amounts of data. We can even put faces to them, because behind each element is a person who took a gamble.
Artificial intelligence