Research shows AI datasets have human values blind spots
Briefly

Researchers from Purdue University examined the human values reflected in open-source AI training datasets. They found a significant imbalance: most datasets emphasize information and utility values over prosocial values like empathy and justice. Using a taxonomy of human values constructed from moral philosophy, they showed that topics such as wisdom and knowledge were well represented, while critical areas like justice and human rights were under-addressed. This imbalance in AI training could lead to unintended societal consequences, underscoring the need for value diversity in AI datasets.
Our findings reveal a significant imbalance in the human values embedded in AI systems, emphasizing the need for a more equitable representation of all values.
The datasets primarily prioritize information and utility values over prosocial values like empathy and justice, which could affect AI's societal impact.
We constructed a taxonomy of human values to examine three major U.S. AI companies' training datasets, revealing insufficient representation of justice and human rights.
While the evaluated AI models perform well on utilitarian queries, they lack depth when addressing empathy and civic concerns, revealing a critical gap in training.
Read at TNW | Future-Of-Work