#quantization

from HackerNoon
9 months ago

Running Quantized Code Models on a Laptop Without a GPU | HackerNoon

The research uses the llama-cpp-python package to run quantized LLMs efficiently in a Windows 11 Python environment, with no GPU required.
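For a concrete starting point, the sketch below shows one way to load a 4-bit GGUF model with llama-cpp-python on a CPU-only machine; the model file and parameters are illustrative assumptions, not values taken from the article.

```python
from llama_cpp import Llama

# Illustrative only: the GGUF file name and settings below are assumptions.
llm = Llama(
    model_path="codellama-7b-instruct.Q4_K_M.gguf",  # a 4-bit quantized model file
    n_ctx=2048,      # context window
    n_threads=8,     # CPU threads; no GPU needed
    n_gpu_layers=0,  # keep everything on the CPU
)

out = llm(
    "Write a Lua function that reverses a string.",
    max_tokens=128,
    temperature=0.2,
)
print(out["choices"][0]["text"])
```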
from HackerNoon
9 months ago

Bringing Big AI Models to Small Devices | HackerNoon

Quantization enhances the accessibility of LLMs on consumer devices, potentially reducing the digital divide.
from HackerNoon
9 months ago

Inside the Evaluation Pipeline for Code LLMs With LuaUnit | HackerNoon

To streamline and standardize the automated evaluation procedure, we translated the native assertions in MCEVAL to LuaUnit-based assertions, improving consistency across benchmarks.
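As a rough illustration of that translation step (the actual MCEVAL assertion format and the authors' script are not shown in the summary), a simple rewrite from plain Lua assert calls to LuaUnit assertions might look like this:

```python
import re

# Hypothetical sketch: rewrite plain Lua `assert(lhs == rhs)` checks into
# LuaUnit-style `luaunit.assertEquals(lhs, rhs)` calls. The real benchmark
# format and translation tooling may differ.
ASSERT_EQ = re.compile(r"assert\((.+?)\s*==\s*(.+?)\)")

def to_luaunit(test_source: str) -> str:
    return ASSERT_EQ.sub(r"luaunit.assertEquals(\1, \2)", test_source)

lua_test = "assert(add(2, 3) == 5)"
print(to_luaunit(lua_test))  # luaunit.assertEquals(add(2, 3), 5)
```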
from HackerNoon
9 months ago

What Makes Code LLMs Accurate? | HackerNoon

Pass@1 rates on Lua programming tasks show that the quantization level affects model performance, with lower-bit models degrading the most.
#model-performance
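For context, pass@1 is the probability that a single generated solution passes the tests. The snippet below is the standard unbiased pass@k estimator, included as a generic sketch rather than the article's exact evaluation code.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples generated per task, c of them correct."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 20 samples per task, 7 correct -> pass@1 reduces to c/n = 0.35
print(pass_at_k(n=20, c=7, k=1))
```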
from HackerNoon
9 months ago

The V-Shaped Mystery of Inference Time in Low-Bit Code Models | HackerNoon

Higher precision results in longer inference times, especially for incorrect solutions.
Longer inference times do not guarantee improved performance across different models.
from InfoQ
2 months ago

Gemma 3n Available for On-Device Inference Alongside RAG and Function Calling Libraries

Gemma 3n is a multimodal AI model that runs on-device, letting enterprises move inference onto mobile hardware.
from HackerNoon
4 months ago

Accelerating Neural Networks: The Power of Quantization | HackerNoon

Quantization reduces the memory and computational demands of neural networks by converting floating-point numbers to lower-precision integers.
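As a generic illustration of that idea (not the article's specific scheme), the sketch below applies simple affine int8 quantization to a float32 array and checks the round-trip error.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Affine (asymmetric) quantization of float32 values to int8."""
    scale = (x.max() - x.min()) / 255.0            # one step of the int8 grid
    zero_point = np.round(-x.min() / scale) - 128  # maps x.min() near -128
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

weights = np.random.randn(1024).astype(np.float32)
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)

# int8 storage is 4x smaller than float32; the cost is a small rounding error.
print("max round-trip error:", np.abs(weights - restored).max())
```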
from HackerNoon
4 months ago

The Future of AI Compression: Smarter Quantization Strategies | HackerNoon

Impact-based parameter selection outperforms magnitude-based criteria in improving quantization for language models.
#large-language-models
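The summary does not spell out the impact metric used, so the sketch below only contrasts a plain magnitude criterion (|w|) with a common saliency-style proxy (|w * grad|) for choosing which weights to protect from quantization; the gradients and the selection rule here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=10_000)
grads = rng.normal(size=10_000)  # assumed per-weight gradients from a calibration pass
keep_frac = 0.01                 # fraction of weights kept in full precision

# Magnitude-based criterion: protect the largest weights.
magnitude_score = np.abs(weights)
# Impact-style criterion: protect weights whose rounding would move the loss most.
impact_score = np.abs(weights * grads)

k = int(keep_frac * weights.size)
keep_by_magnitude = np.argsort(-magnitude_score)[:k]
keep_by_impact = np.argsort(-impact_score)[:k]

overlap = len(np.intersect1d(keep_by_magnitude, keep_by_impact)) / k
print(f"overlap between the two selections: {overlap:.1%}")
```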
from HackerNoon
4 months ago

Rethinking AI Quantization: The Missing Piece in Model Efficiency | HackerNoon

Quantization strategies optimize LLM precision while balancing accuracy and efficiency, through methods such as post-training quantization and quantization-aware training.
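A minimal sketch of the post-training side of that trade-off, using PyTorch's dynamic quantization API as a generic example rather than the article's method; quantization-aware training would instead insert fake-quantization ops during training so the model learns to tolerate the reduced precision.

```python
import torch
import torch.nn as nn

# A toy float32 model standing in for a trained network (illustrative only).
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Post-training dynamic quantization: Linear weights are stored as int8
# and dequantized on the fly; no retraining is required.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # same interface, smaller weight storage
```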
from HackerNoon
9 months ago

Increased LLM Vulnerabilities from Fine-tuning and Quantization: Experiment Set-up & Results | HackerNoon

Experiments on different downstream tasks show that while fine-tuning can improve task effectiveness, both fine-tuning and quantization can increase an LLM's vulnerability to jailbreaking.