Social Choice for AI Alignment: Dealing with Diverse Human Feedback

Foundation models such as GPT-4 are fine-tuned to avoid unsafe behavior, for example by refusing requests for criminal or racist content. This fine-tuning relies on reinforcement learning from human feedback (RLHF).