#preference-learning

[ follow ]
Artificial intelligence
fromMedium
1 year ago

How Robots Learn Preferences with Minimal Human Feedback

Vik's research focuses on how robots can learn from minimal human feedback, adapting without the need for large datasets.
Artificial intelligence
fromHackernoon
1 year ago

The Art of Arguing With Yourself-And Why It's Making AI Smarter | HackerNoon

The paper presents Direct Nash Optimization, enhancing large language model training by utilizing pair-wise preferences instead of traditional reward maximization.
[ Load more ]