
"I'm a site reliability engineer. My day job is to be somewhat of a pessimist. I like to say that I'm paid to worry. Really what I do is I analyze the way systems fail, and then I do engineering work to mitigate the risk, either by trying to prevent the failures or, more often than not, by trying to make it recover more quickly."
"About a year out of college, I was working in a legacy C++ code base. It had 10,000-line classes with 1,000-line for loops and if statements that took up half a page. Of course, there were no unit tests and it was poorly commented. I had almost a little bit of fear every time somebody asked me to go make a change to the inscrutable business logic."
A site reliability engineer analyzes how systems fail and does engineering work to mitigate the risk, either by preventing failures or, more often, by speeding up recovery. Incident management restores service quickly when an outage occurs; learning from the incident afterward reduces future risk. Digital logic circuits combine Boolean algebra and electrical engineering, forming a low-level foundation of computing built from AND, OR, and NOT gates. That foundation proved useful when confronting a legacy C++ codebase with massive classes, long loops, no tests, and poor comments: it suggested techniques for simplifying complex, inscrutable business logic and for reducing the fear of changing the code.
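As an illustration of how Boolean algebra can tame nested conditionals, here is a minimal sketch. The summary does not spell out the specific techniques used, so everything in it is an assumption made for the example: the rule, the names `original_rule`, `simplified_rule`, and `equivalent`, and the choice of Python over C++ for brevity. The idea is to treat each condition as a logic input, propose a simpler expression, and confirm with an exhaustive truth table that both versions agree on every input.

```python
from itertools import product


def original_rule(is_member: bool, has_coupon: bool, is_holiday: bool) -> bool:
    """Hypothetical tangled rule, nested the way legacy code often grows."""
    if is_member:
        if has_coupon:
            return True
        else:
            if is_holiday:
                return True
            else:
                return False
    else:
        if has_coupon:
            if is_holiday:
                return True
            else:
                return True
        else:
            return False


def simplified_rule(is_member: bool, has_coupon: bool, is_holiday: bool) -> bool:
    """Candidate simplification: coupon OR (member AND holiday)."""
    return has_coupon or (is_member and is_holiday)


def equivalent(f, g, arity: int) -> bool:
    """Compare two Boolean functions on every row of the truth table."""
    return all(
        f(*row) == g(*row)
        for row in product((False, True), repeat=arity)
    )


if __name__ == "__main__":
    # Three inputs means 2**3 = 8 rows, so checking all of them is cheap.
    print(equivalent(original_rule, simplified_rule, 3))
```

With only a handful of inputs the truth table is small enough to check exhaustively, so the comparison can serve as a safety net before replacing the nested version in code that has no unit tests.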
Read at InfoQ