Agent design patterns
Briefly

"We are getting closer to long-running autonomous agents. @METR_Evals reports that agent task length doubles every 7 months. But one challenge is that models get worse as context grows. @trychroma reported on context rot, @dbreunig outlined various failure modes, and Anthropic explains it here: Context must be treated as a finite resource with diminishing marginal returns. Like humans with limited working memory, LLMs have an 'attention budget.' Every new token depletes it."
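As a rough illustration of the doubling claim, the sketch below extrapolates how much longer a task horizon becomes after a given number of months. It assumes the seven-month doubling period stays constant, which is an extrapolation and not something the source guarantees; the function name `growth_factor` and its parameters are hypothetical.

```python
def growth_factor(months: float, doubling_months: float = 7.0) -> float:
    """Multiplier on task length after `months`, assuming steady doubling.

    Assumption: growth is purely exponential with a fixed doubling period.
    """
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    # Print the multiplier after one, two, and three doubling periods.
    for m in (7, 14, 21):
        print(m, growth_factor(m))
```

Under that assumption, a task horizon of one hour would extend to about `growth_factor(14)` hours, that is four hours, after fourteen months.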
"Give Agents A Computer: @barry_zyj and @ErikSchluntz defined agents as systems where LLMs direct their own actions. It's become clear over the past year that agents benefit from access to a computer, giving them primitives like a filesystem and shell. The filesystem gives agents access to persistent context. The shell lets agents run built-in utilities, CLIs, provided scripts, or code they write. We've seen this across many popular agents. Claude Code broke out as an agent that 'lives on your computer'. Manus uses a virtual computer. And both fundamentally use tools to control the computer, as @rauchg captured: The fundamental coding agent abstraction is the CLI ... rooted in the fact that agents need access to the OS layer. It's more accurate to think of Claude Code as 'AI for your operating system'."
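The two primitives described above, a filesystem for persistent context and a shell for running commands, can be sketched roughly as follows. This is a minimal illustration and not how Claude Code or Manus is implemented; the `Workspace` class and its method names are hypothetical, and a real agent would need sandboxing and permission checks that are omitted here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class Workspace:
    """Hypothetical wrapper giving an agent a scratch directory and a shell."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def write_file(self, name: str, text: str) -> None:
        # Filesystem primitive: persist context outside the model's window.
        (self.root / name).write_text(text, encoding="utf-8")

    def read_file(self, name: str) -> str:
        # Filesystem primitive: reload previously saved context on demand.
        return (self.root / name).read_text(encoding="utf-8")

    def run(self, argv: list[str], timeout: float = 30.0) -> str:
        # Shell primitive: run a utility, CLI, or script inside the workspace.
        result = subprocess.run(
            argv, cwd=self.root, capture_output=True, text=True, timeout=timeout
        )
        return result.stdout + result.stderr


if __name__ == "__main__":
    ws = Workspace(Path(tempfile.mkdtemp()))
    ws.write_file("notes.txt", "plan: inspect the repo, then run tests")
    # The agent can later run code that reads the notes it saved earlier.
    print(ws.run([sys.executable, "-c", "print(open('notes.txt').read())"]))
```

Passing the command as an argument list instead of a shell string avoids quoting problems; an agent that needs pipes or globbing would have to invoke a shell explicitly.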
2025 ended with Meta acquiring Manus for over $2 billion and Claude Code reaching a $1 billion run rate. Agent task lengths are increasing rapidly: @METR_Evals reports that task length doubles every seven months. Model performance, however, degrades as context grows, producing context rot and other failure modes. LLMs have a finite attention budget, and every token in the context window draws on it. Context engineering therefore focuses on placing the most relevant information in the window for the next step. Giving agents a computer (a filesystem and a shell) provides persistent state, tooling, and richer multi-layer action spaces for autonomy.
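One way to picture context engineering under a finite budget is a greedy selection of the most relevant snippets that fit a token limit. The sketch below is only an illustration under stated assumptions: the `Snippet` type, the `relevance` scores, and the word-count token estimate are all hypothetical stand-ins, and a real system would use an actual tokenizer and a retrieval or ranking step.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    relevance: float  # hypothetical score; higher means more relevant


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: count whitespace-separated words.
    return len(text.split())


def pack_context(snippets: list[Snippet], budget: int) -> list[Snippet]:
    """Greedily keep the most relevant snippets that fit within `budget`."""
    chosen: list[Snippet] = []
    used = 0
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snippet.text)
        if used + cost <= budget:
            chosen.append(snippet)
            used += cost
    return chosen


if __name__ == "__main__":
    candidates = [
        Snippet("current task description", 0.9),
        Snippet("long log output from an earlier step", 0.3),
        Snippet("file path of saved notes", 0.6),
    ]
    for s in pack_context(candidates, budget=8):
        print(s.text)
```

Anything that does not fit can stay on the filesystem and be read back when a later step needs it, which is how the budget and the computer primitives fit together.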
Read at Rlancemartin