"The mirror test isn't directly applicable to LLMs. They don't have bodies and eyes, obviously. And even if they did have bodies and eyes, their training data has familiarized them with such tests, so they would know how they were supposed to behave."
"An LLM's 'eyes' are its input layer, into which tokens are fed. The tokens themselves define the external world, and that portion of the tokens that are wrapped within 'Assistant' tags constitutes their body."
"What we want is an experimental paradigm that can effectively measure LLM self-awareness, but current models ultimately fall short in demonstrating true self-awareness."
A new measure of self-awareness for LLMs is proposed, inspired by the mirror test used with nonverbal animals. The mirror test assesses self-awareness by observing whether an animal recognizes a mark on its own body in a reflection. While LLMs lack physical bodies, their input layer can be analogized to eyes, and the tokens wrapped in 'Assistant' tags to a body. However, LLMs' training data has familiarized them with such tests, making it difficult to assess true self-awareness. The proposed experimental paradigm aims to address these challenges, but reveals limitations in current LLM capabilities.
Read at LessWrong