Even if LLMs can induce the syntax of language from mere exposure to sequences of linguistic tokens, this does not entail that they can also induce semantics.
Some skeptics, like Bender & Koller (2020), argue that language models are incapable of understanding the meaning of linguistic expressions, as they are trained on linguistic form alone.
They distinguish form from meaning, where meaning is defined as the relation between linguistic expressions and the communicative intentions they serve to express.
Since, on their view, meaning cannot be learned from linguistic form alone, it follows that language models are constitutively unable to grasp the meaning of language.
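To make the notion of "training on linguistic form alone" concrete, the following is a minimal, hypothetical sketch of next-token prediction in PyTorch. The vocabulary size, recurrent architecture, and batch are toy stand-ins, not any actual LLM's training setup; the point is only that the objective sees nothing but sequences of token IDs.

```python
# Illustrative sketch only: the supervision signal is the next token in a
# sequence of token IDs -- pure linguistic form, with no channel for
# referents, percepts, or communicative intentions.
import torch
import torch.nn as nn

vocab_size, embed_dim = 1000, 64  # hypothetical toy sizes

embed = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, embed_dim, batch_first=True)  # toy stand-in for a Transformer
head = nn.Linear(embed_dim, vocab_size)

params = list(embed.parameters()) + list(lstm.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params)
loss_fn = nn.CrossEntropyLoss()

def training_step(token_ids: torch.Tensor) -> float:
    """One gradient step on a batch of token-ID sequences of shape (batch, seq_len)."""
    inputs, targets = token_ids[:, :-1], token_ids[:, 1:]
    hidden, _ = lstm(embed(inputs))         # contextual representations of form
    logits = head(hidden)                   # predicted distribution over next tokens
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# The loss depends only on token co-occurrence statistics; nothing in the
# objective ties tokens to the world or to speakers' intentions.
batch = torch.randint(0, vocab_size, (8, 32))  # stand-in for tokenized text
print(training_step(batch))
```

Whether statistics over such sequences suffice to recover meaning is precisely what the skeptical argument denies.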