#model-training

#machine-learning
OMG science
from InfoWorld
1 month ago

How DeepSeek innovated large language models

DeepSeek's innovative models redefine performance benchmarks with advanced techniques for precision training and reasoning.
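The summary above is light on detail, so here is a very loose illustration of what precision-aware training looks like in practice: a generic bfloat16 mixed-precision training step in PyTorch. It is only a sketch of the general technique, not DeepSeek's actual low-precision (FP8) recipe.

```python
# Generic mixed-precision training step: forward pass under a bfloat16
# autocast region, parameters and optimizer state kept in full precision.
# A stand-in for "precision training" in general, not DeepSeek's method.
import torch
import torch.nn as nn

model = nn.Linear(64, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(32, 64), torch.randn(32, 1)

optimizer.zero_grad()
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = ((model(x) - y) ** 2).mean()  # low-precision forward pass
loss.backward()                          # gradients flow back to fp32 weights
optimizer.step()
print(f"loss: {loss.item():.4f}")
```
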
from Hackernoon
1 month ago
Artificial intelligence

Our Analysis on Think-and-Execute and Pseudocode | HackerNoon

Task-level pseudocode significantly improves reasoning performance compared to instance-specific logic.
Pre-training on code corpora enhances the understanding of task-level logic.
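As a rough illustration of the task-level idea (the plan text and helper below are invented, not the paper's prompts): one pseudocode plan is written per task and reused across every instance, instead of spelling out fresh logic for each input.

```python
# Sketch: a single task-level pseudocode plan shared across instances,
# rather than instance-specific reasoning written for every input.
# The plan and build_prompt() helper are illustrative only.

TASK_PSEUDOCODE = """
def solve(word_list):
    # 1. Count the vowels in each word.
    # 2. Keep only the words with an even vowel count.
    # 3. Return the kept words in their original order.
"""

def build_prompt(instance: str) -> str:
    """Pair the shared task-level plan with one concrete instance."""
    return (
        "Follow this task-level pseudocode step by step:\n"
        f"{TASK_PSEUDOCODE}\n"
        f"Input: {instance}\n"
        "Simulate the execution and report the final answer."
    )

if __name__ == "__main__":
    for instance in ["apple banana kiwi", "orange fig plum"]:
        print(build_prompt(instance))
        print("---")
```
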
from Hackernoon
5 months ago
Artificial intelligence

This AI Doesn't Just Skim Scientific Papers-It Tags, Sorts, and Explains Them Too | HackerNoon

The article covers the training and evaluation of several LLMs, focusing on named entity recognition (NER) and internal relation recognition.
from Hackernoon
5 months ago
Online learning

Direct Nash Optimization Beats Bigger Models with Better Data | HackerNoon

Offline contrastive training provides more useful learning signals than traditional supervised fine-tuning.
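For a sense of what an offline contrastive objective looks like, here is a minimal DPO-style pairwise loss over preference data; it is a generic sketch, not the exact Direct Nash Optimization objective from the article.

```python
# Minimal offline contrastive (DPO-style) objective on preference pairs,
# contrasted with plain supervised fine-tuning. Generic illustration only.
import torch
import torch.nn.functional as F

def contrastive_preference_loss(
    logp_chosen: torch.Tensor,     # policy log-prob of preferred responses
    logp_rejected: torch.Tensor,   # policy log-prob of dispreferred responses
    ref_logp_chosen: torch.Tensor,   # same quantities under a frozen reference
    ref_logp_rejected: torch.Tensor,
    beta: float = 0.1,
) -> torch.Tensor:
    # The loss pushes the policy to prefer the chosen response over the
    # rejected one, measured relative to the reference model.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(beta * margin).mean()

# Toy usage with random log-probabilities standing in for model outputs.
lp_c, lp_r = torch.randn(8), torch.randn(8)
ref_c, ref_r = torch.randn(8), torch.randn(8)
print(contrastive_preference_loss(lp_c, lp_r, ref_c, ref_r).item())
```
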
from Ars Technica
11 months ago
Data science

What kind of bug would make machine learning suddenly 40% worse at NetHack?

NetHack serves as a testbed for machine-learning experiments, and the incident highlights how hard it is to keep model performance consistent.
#natural-language-processing
from Hackernoon
1 year ago
Miscellaneous

DreamLLM: Additional Experiments That Shed New Light | HackerNoon

DREAMLLM's multimodal adaptation enhances language model performance, setting new benchmarks in natural language processing tasks.
from Hackernoon
1 month ago
Roam Research

Detailing the Primary Methodology Implemented in Our Models: Octopus v2 | HackerNoon

The model selects the right function and generates its parameters through a two-stage process that combines classification with language modeling.
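A toy sketch of such a two-stage flow is below; the function registry, queries, and rule-based stand-ins for the model are invented here and do not reflect Octopus v2's actual implementation.

```python
# Two-stage function calling: first pick a function (classification),
# then fill in its parameters (generation). Rule-based stubs stand in
# for the model at both stages.
from typing import Callable

REGISTRY: dict[str, Callable[..., str]] = {
    "set_alarm": lambda time: f"alarm set for {time}",
    "send_text": lambda to, body: f"text to {to}: {body}",
}

def classify_function(query: str) -> str:
    """Stage 1: choose a function name (a real system would use a model head)."""
    return "set_alarm" if "alarm" in query else "send_text"

def generate_parameters(query: str, fn_name: str) -> dict:
    """Stage 2: produce arguments for the chosen function (here, crude parsing)."""
    if fn_name == "set_alarm":
        return {"time": query.split()[-1]}
    return {"to": "Sam", "body": query}

query = "wake me with an alarm at 7am"
fn = classify_function(query)
print(REGISTRY[fn](**generate_parameters(query, fn)))
```
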
Artificial intelligence
from Hackernoon
2 months ago

What is the Best Way to Train AI Models? | HackerNoon

Fine-tuning models yields a better grasp of visual scene structure than training from scratch.
Visual hierarchy decoding in CNNs provides insights into feature representation.
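A minimal sketch of the fine-tuning setup contrasted with full training: freeze a (pretend) pretrained backbone and train only a new task head. The tiny CNN below is just a stand-in for a real pretrained model.

```python
# Fine-tuning versus training from scratch: keep the pretrained backbone
# frozen and update only the small task-specific head.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())
head = nn.Linear(16, 10)  # new classifier for the downstream task

for p in backbone.parameters():   # fine-tuning: pretrained features stay fixed
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
loss = nn.functional.cross_entropy(head(backbone(x)), y)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```
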
#ai-research
Artificial intelligence
from Ars Technica
2 months ago

DeepSeek goes beyond "open weights" AI with plans for source code release

To meet formal open-source definitions, AI releases should include training code and data details, improving transparency, replicability, and understanding of the models.
from Hackernoon
9 months ago
Data science

Textbooks Are All You Need: Abstract and Introduction | HackerNoon

phi-1 is a compact 1.3B-parameter language model for code that achieves notable accuracy despite its small size.
Artificial intelligence
from InfoWorld
3 months ago

The bitter lesson for generative AI adoption

Relying on retrieval-augmented generation and prompt engineering is a more sustainable strategy than constantly retraining and fine-tuning models.
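A minimal sketch of the retrieval-augmented approach the article favors: new facts live in a document store and are pulled into the prompt at query time, so the model itself never needs retraining. The corpus and the overlap-based retriever here are toy stand-ins.

```python
# Toy retrieval-augmented generation flow: retrieve relevant documents,
# then ground the prompt in them instead of fine-tuning on new facts.
import re

CORPUS = [
    "The 2024 handbook sets the expense approval limit at 500 dollars.",
    "VPN access now requires hardware security keys.",
    "The cafeteria is closed on Fridays.",
]

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by word overlap (a real system would use embeddings)."""
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda doc: len(q & tokens(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the expense approval limit?"))
```
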
Science
from Ars Technica
3 months ago

It's remarkably easy to inject new medical misinformation into LLMs

Even a small amount of misinformation in the training data makes models noticeably less reliable on medical content.
from Hackernoon
4 months ago
Data science

New Study Shows How Positive-Sum Fairness Impacts Medical AI Models in Chest Radiography | HackerNoon

The study addresses the impact of ethnicity on the prediction of lung lesions using chest radiographs.
It emphasizes the importance of fairness in AI healthcare models across different racial subgroups.
from Hackernoon
1 year ago
Medicine

How AI Learns from Human Preferences | HackerNoon

The RLHF pipeline enhances model effectiveness through three main phases: supervised fine-tuning, preference sampling, and reinforcement learning optimization.
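An outline of those three phases as plain function stubs (the names and signatures are illustrative, not taken from the article):

```python
# Skeleton of the standard RLHF pipeline; each stub stands in for a full
# training loop and exists only to show how the phases chain together.

def supervised_fine_tune(base_model, demonstrations):
    """Phase 1: fit the base model on human-written demonstrations."""
    return base_model

def fit_reward_model(policy, preference_pairs):
    """Phase 2: learn a reward model from sampled responses ranked by humans."""
    ...

def optimize_with_rl(policy, reward_model, prompts):
    """Phase 3: maximize the learned reward (e.g. with PPO) under a KL penalty."""
    return policy

def rlhf(base_model, demonstrations, preference_pairs, prompts):
    policy = supervised_fine_tune(base_model, demonstrations)
    reward_model = fit_reward_model(policy, preference_pairs)
    return optimize_with_rl(policy, reward_model, prompts)
```
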
Data science
from Axios
9 months ago

This is AI's brain on AI

Output from AI models is increasingly used as synthetic training data for other models, which helps chatbots improve but also risks destabilizing them.