#quantization

from HackerNoon · 1 year ago

Running Quantized Code Models on a Laptop Without a GPU | HackerNoon

The research uses the llama-cpp-python package to run quantized LLMs efficiently in a Windows 11 Python environment, with no GPU required.
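A short sketch of what this looks like in practice. Assumptions not from the article: llama-cpp-python is installed (`pip install llama-cpp-python`) and a 4-bit GGUF file named `codellama-7b.Q4_K_M.gguf` has been downloaded locally — both the file name and the generation parameters are illustrative. The size helper shows why a quantized model fits in laptop RAM at all.

```python
from pathlib import Path

def quantized_size_gb(n_params: float, bits: int) -> float:
    """Back-of-envelope weight-file size: parameters x bits per weight, in GB."""
    return n_params * bits / 8 / 1e9

# Why a CPU-only laptop suffices: a 7B model is ~14 GB at fp16 but ~3.5 GB at 4-bit.
print(f"4-bit: {quantized_size_gb(7e9, 4):.1f} GB, fp16: {quantized_size_gb(7e9, 16):.1f} GB")

if __name__ == "__main__":
    model_path = "codellama-7b.Q4_K_M.gguf"  # hypothetical local GGUF file
    if Path(model_path).exists():
        from llama_cpp import Llama  # pip install llama-cpp-python
        llm = Llama(model_path=model_path, n_ctx=2048, n_threads=8)  # CPU-only
        out = llm("-- Lua function that reverses a string\n", max_tokens=128)
        print(out["choices"][0]["text"])
```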
from HackerNoon · 1 year ago

Bringing Big AI Models to Small Devices | HackerNoon

Quantization enhances the accessibility of LLMs on consumer devices, potentially reducing the digital divide.
from HackerNoon · 1 year ago

Inside the Evaluation Pipeline for Code LLMs With LuaUnit | HackerNoon

To streamline and standardize the automated evaluation procedure, we translated the native assertions in MCEVAL to LuaUnit-based assertions, improving consistency across benchmarks.
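A minimal sketch of what one such translation rule might look like, assuming assertions arrive as plain `assert(expr == expected)` strings. The function name and single regex rule are hypothetical; MCEVAL's actual harness must handle many more assertion forms. `lu.assertEquals` is the real LuaUnit equality check (with `local lu = require('luaunit')`).

```python
import re

def to_luaunit(assertion: str) -> str:
    """Rewrite a plain Lua `assert(a == b)` into a LuaUnit equality assertion.
    Hypothetical rule for illustration only; unrecognized forms pass through."""
    m = re.match(r"assert\((.+?)\s*==\s*(.+)\)\s*$", assertion.strip())
    if m:
        # LuaUnit signature is assertEquals(actual, expected)
        return f"lu.assertEquals({m.group(1)}, {m.group(2)})"
    return assertion

print(to_luaunit("assert(add(1, 2) == 3)"))  # lu.assertEquals(add(1, 2), 3)
```

Framework-level assertions give the evaluation pipeline uniform failure reporting instead of Lua's bare `assertion failed!` errors.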
from HackerNoon · 1 year ago

What Makes Code LLMs Accurate? | HackerNoon

Pass@1 rates on Lua programming tasks show that quantization level impacts model performance, with lower-bit models affected most.
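Pass@1 here is the standard unbiased pass@k estimator from the HumanEval work (Chen et al., 2021) evaluated at k = 1; a minimal implementation, with sample counts that are illustrative rather than the article's:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples drawn
    from n generations (c of them correct) passes the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 20 generations per Lua task, 5 of which pass:
print(pass_at_k(20, 5, 1))  # 0.25
```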
#model-performance
from HackerNoon · 1 year ago

The V-Shaped Mystery of Inference Time in Low-Bit Code Models | HackerNoon

Higher precision results in longer inference times, especially for incorrect solutions.
Longer inference times do not guarantee improved performance across different models.
from InfoQ · 10 months ago

Gemma 3n Available for On-Device Inference Alongside RAG and Function Calling Libraries

Gemma 3n is a multimodal AI model built for on-device inference on mobile hardware, released alongside RAG and function-calling libraries.
from HackerNoon · 1 year ago

Accelerating Neural Networks: The Power of Quantization | HackerNoon

Quantization reduces the memory and computational demands of neural networks by converting floating-point numbers to lower-precision integers.
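A minimal sketch of that float-to-integer conversion, using symmetric per-tensor int8 quantization with NumPy (the weight values are illustrative; real schemes are often per-channel and may be asymmetric):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map fp32 weights to int8 codes plus one fp32 scale.
    Assumes w contains at least one nonzero value."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 1.27], dtype=np.float32)
q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
print(q, err)  # int8 storage is 4x smaller than fp32; round-off error <= scale/2
```

Storing 8-bit codes instead of 32-bit floats cuts memory 4x, and integer matrix kernels are typically faster on CPUs, at the cost of the small reconstruction error shown above.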