Anthropic has positioned itself as a transparent, safety-focused AI company, in contrast to the growing opacity of competitors such as OpenAI. The firm recently released a paper, 'Values in the wild', analyzing 300,000 interactions with its chatbot, Claude, and identifying 3,307 distinct AI values. These values, drawn from values stated explicitly in the conversations, form a hierarchy under five macro-categories: Practical, Epistemic, Social, Protective, and Personal, with Practical values the most common. The analysis aims to clarify how Claude reasons and responds in relation to user values.
Anthropic's analysis of 300,000 conversations with Claude surfaced 3,307 AI values, mapped into a hierarchy of macro-categories to illuminate the chatbot's moral reasoning.
By analyzing these interactions, Anthropic shows how Claude affirms user values and introduces new considerations of its own, promoting ideals such as personal agency and professional growth.