From Dashboard Soup to Observability Lasagna: Building Better Layers
Briefly

"I'm Martha. I'm a product engineer at a company called incident.io. We build a product that handles your end-to-end incident management. That means your alerts firing, paging you, all the way through to writing a postmortem. I work across the stack, but I focus a lot on the reliability of our product and the observability that enables that. Today you're going to leave with a process to unsoup your dashboards, which I promise is a very technical term."
"First, we're going to start with a story. Our story starts in early 2024. We'd just finished building an on-call product, so something that handles your alerts and pages you, wakes you up in the middle of the night when your software goes wrong. I'm going to use on-call as an example throughout this talk because I'm sure most of you know what it means to be paged."
The team turned a chaotic collection of dashboards into a layered observability stack so that incident response would be reliable and actionable. The product covers end-to-end incident management (alerting, paging, and postmortems), with reliability and observability treated as priorities. Facing a tight release timeline for the on-call product, the team emphasized trust in paging and needed structured monitoring so that failures would surface reliably. The approach centers on unsouping dashboards into clear layers, guiding engineers through a reproducible debugging process, and applying practical techniques to keep critical on-call flows highly available.
Read at InfoQ