
Generative AI use is widespread, but adoption varies sharply by company size, leaving smaller businesses, developers, and everyday users with limited access to advanced capabilities. Many retail and small businesses are limited to the basic AI utilities their own hardware can run, such as text inference and multimedia generation with base models, because fuller use demands heavy infrastructure. A resource-efficient approach is needed to support billions of AI agents and intelligent machines. Tether’s edge-first LoRA fine-tuning framework for Microsoft’s BitNet LLM aims to cut computational overhead enough for consumer-grade devices to perform advanced operations. It is intended to enable fine-tuning of large, ternary-quantized models on handheld devices and personal computers using platform-agnostic techniques.
"The future of AI should be accessible, available, and open to people and builders everywhere, and it should not require an absurd amount of resources only available to a handful of cloud providers, Paolo Ardoino, CEO, Tether."
"About 700 million people use generative AIs like Gemini and ChatGPT weekly, but adoption is far from uniform. McKinsey's 2025 State of AI survey found that nearly half of respondents from companies with more than $5 billion in revenue have reached the AI scaling phase, compared with just 29 percent of those from companies with less than $100 million in revenue, a gap that only widens further down the chain, locking out smaller businesses, developers, and everyday users."
"Retail and small businesses are limited to basic AI utilities that their facilities can power, such as text-based inference and multimedia generation, using base models. That is billions of end users, and developers locked out of full utilization and development of intelligent software due to high infrastructure demands."
"Imagine a 13-billion-parameter model being fine-tuned on everyday handheld devices like Samsung S25 and iPhone 16, as well as on regular personal computers. The breakthrough combines resource-efficiency and platform-agnostic techniques to develop a fine-tuning framework for the ternary-quantized LLM."
Read at Computerworld