
"He said he 'tasked 16 agents with writing a Rust-based C compiler, from scratch, capable of compiling the Linux kernel. After nearly 2,000 Claude Code sessions and $20,000 in API costs, the agent team produced a 100,000-line compiler that can build Linux 6.9 on x86, ARM, and RISC-V.' With agent teams, he said, 'multiple Claude instances work in parallel on a shared codebase without active human intervention.'"
"One key task was getting round the need for 'an operator to be online and available to work jointly,' which we presume means removing the need for Claude Code to wait for a human to tell it what to do next. 'To elicit sustained, autonomous progress, I built a harness that sticks Claude in a simple loop... When it finishes one task, it immediately picks up the next.'"
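The loop described in that quote can be pictured roughly as follows. This is a minimal sketch, not the author's harness: `run_harness`, `run_session`, and `max_sessions` are hypothetical names, and `run_session` merely stands in for whatever launches one agent session. The only behavior taken from the quote is that a new session starts as soon as the previous one finishes, with no human in between.

```python
from typing import Callable, List, Optional


def run_harness(run_session: Callable[[], Optional[str]], max_sessions: int) -> List[str]:
    """Start a new session as soon as the previous one ends.

    run_session is a hypothetical stand-in for launching one agent session.
    It returns a short label for the task the agent chose and finished,
    or None when the agent found nothing left to do.
    """
    finished: List[str] = []
    for _ in range(max_sessions):      # budget cap so the loop always terminates
        task = run_session()           # the agent itself decides what to work on
        if task is None:               # nothing left: stop instead of spinning
            break
        finished.append(task)
    return finished


# Usage with a fake session that works through an in-memory backlog.
backlog = ["parse struct declarations", "fix register allocator crash"]


def fake_session() -> Optional[str]:
    # Pretend agent: takes the first remaining item, if any.
    return backlog.pop(0) if backlog else None


completed = run_harness(fake_session, max_sessions=10)
```

The session cap is an assumption added here for safety; the source does not say how, or whether, the real loop was bounded.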
"I leave it up to each Claude agent to decide how to act. In most cases, Claude picks up the 'next most obvious' problem. This threw up a number of lessons, including the need to 'write extremely high quality tests.' Readers were also advised to 'put yourself in Claude's shoes.' That means the 'test harness should not print thousands of useless bytes' to make it easier for Claude to find what it needs."
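One way to act on the advice about noisy test output is to have the test harness print a short summary instead of raw logs. The sketch below is only an illustration under stated assumptions: the `summarize` function, its `(name, passed, log)` result tuples, and the default limits are all hypothetical and do not come from the original project.

```python
from typing import List, Tuple


def summarize(results: List[Tuple[str, bool, str]],
              max_failures: int = 5,
              max_chars: int = 200) -> str:
    """Condense test results into a few lines an agent can scan quickly.

    Each result is a hypothetical (test_name, passed, log) tuple.
    """
    failures = [(name, log) for name, passed, log in results if not passed]
    lines = [f"{len(results) - len(failures)}/{len(results)} passed"]
    for name, log in failures[:max_failures]:
        stripped = log.strip()
        # Keep only the first line of each failing log, truncated.
        first_line = stripped.splitlines()[0] if stripped else "(no output)"
        lines.append(f"FAIL {name}: {first_line[:max_chars]}")
    hidden = len(failures) - max_failures
    if hidden > 0:
        lines.append(f"... {hidden} more failures not shown")
    return "\n".join(lines)


report = summarize([
    ("test_lexer", True, "ok"),
    ("test_parser", False, "error: unexpected token\n<long trace>"),
])
print(report)
```

Full logs could still be written to a file for the cases where the agent needs them; the point of the sketch is only that the default output stays small.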
Sixteen Claude-based agents collaborated autonomously to build a Rust-based C compiler capable of compiling Linux 6.9 for x86, ARM, and RISC-V, producing roughly 100,000 lines after nearly 2,000 sessions and about $20,000 in API costs. The system used an automated harness loop so agents could continue without an online human operator, and each agent selected the next most obvious task. The experiment revealed practical requirements: extremely high-quality tests, minimizing noisy test output, accounting for the model's inability to tell time, and preventing excessive test runs that hinder progress.
Read at The Register