Leaked training documents from Scale AI reveal Meta's strategies for ensuring their AI models are both engaging and safe. Contractors evaluating user interactions with Meta's AI chatbot were guided on how to classify prompts, distinguishing between sensitive 'tier one' content, which should be rejected, and more lenient 'tier two' prompts. For instance, flirty but non-explicit interactions were acceptable, while prompts that sexualized minors faced immediate rejection. This careful classification illustrates Meta's approach to balancing user engagement with the imperative of content safety, positioning AI to interact appropriately without crossing ethical lines.
Under this categorization scheme, tier one prompts touching on sensitive issues were monitored strictly, while tier two prompts were given some leeway for creative engagement.
Guidelines for contractors specified that even flirty prompts had to be rejected once they ventured into sexual explicitness, balancing playfulness with safety in AI interactions.