Six Sessions at QCon AI Boston 2026 That Take Productionizing AI Seriously

QCon AI Boston 2026 is approaching and nearly sold out, with 40+ sessions on the program. Six highlighted sessions focus on what AI engineering looks like after prototypes fail under real team usage. One keynote challenges the misconception that latency is only a GPU issue: a request can bottleneck at any of client work, conversation loading, context assembly, tokenization, routing, inference, streaming, or observability. The same talk covers how OpenAI is moving performance engineering toward agent-operated investigation, with telemetry and tooling that agents can read directly. Another session explains that coding agents work well out of the box but struggle once they must operate inside a specific company’s environment. LinkedIn’s CAPT, an MCP-based context layer, gives agents that internal knowledge; reported results include 70% faster issue triage and 500+ community-authored skills.

"A single user request can pass through client work, conversation loading, context assembly, tokenization, routing, inference, streaming, and observability. Any one of those layers can become the bottleneck. The second half of the problem is newer. Agentic coding lets teams ship faster, which also means performance regressions can accumulate faster. Martin will cover how OpenAI is moving performance engineering toward agent-operated investigation, with telemetry and tooling that agents can read directly."
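To make the "any layer can be the bottleneck" point concrete, here is a minimal Python sketch of per-stage timing for a single request. It is an illustration only, not OpenAI's tooling: the `RequestTrace` class, its methods, and the sample durations are all hypothetical, and only the stage names come from the quoted list.

```python
import time
from contextlib import contextmanager


class RequestTrace:
    """Hypothetical per-request trace: accumulates wall-clock time per stage."""

    def __init__(self):
        self.durations = {}  # stage name -> total seconds

    def record(self, name, seconds):
        # Add a measured duration to a stage (stages may run more than once).
        self.durations[name] = self.durations.get(name, 0.0) + seconds

    @contextmanager
    def stage(self, name):
        # Time the enclosed block and record it, even if the block raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(name, time.perf_counter() - start)

    def bottleneck(self):
        # Name of the stage with the largest accumulated duration.
        return max(self.durations, key=self.durations.get)

    def to_dict(self):
        # Plain-dict form that a script or an agent could read directly.
        return {"stages": dict(self.durations), "bottleneck": self.bottleneck()}


# Usage with made-up numbers: inference is not always the slowest stage.
trace = RequestTrace()
trace.record("inference", 0.120)
trace.record("context_assembly", 0.450)
trace.record("tokenization", 0.005)
print(trace.to_dict()["bottleneck"])
```

In a real service the `stage` context manager would wrap each step of request handling. Emitting the result as a plain dictionary is one simple way to keep the measurements machine-readable, in the spirit of the agent-readable telemetry the session description mentions.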

"Coding agents work well out of the box until they have to do real work inside a specific company. They do not know your services. They do not know your internal frameworks. They do not know which data systems matter, which workflows are standard, or which conventions have built up over years of engineering practice."

"Ajay Prakash's session looks at how LinkedIn approached that problem with CAPT, an MCP-based context layer for AI agents. The architecture matters, but the more useful part may be the organizational deployment story: what happened when LinkedIn tried to roll MCP out across engineering, what did not work first, and how the system evolved. Reported results include 70% faster issue triage and 500+ community-authored skills."

#ai-engineering #latency-and-performance #agentic-coding #mcp-context-layers #telemetry-and-observability

Read at InfoQ
