Addressing Orphaned Pods on Netflix's Titus Container Platform
Briefly

When a node goes away, a garbage collection (GC) process deletes the pods associated with it. So that users can still see what happened to their workloads, Titus runs a custom controller that keeps a history of Pod and Node objects. That history, however, offered no satisfying explanation for why the node (the "agent") was lost, which prompted further investigation into the root causes.
The new `pod-termination-reason` annotation lets Netflix's engineering team record the reason a pod was terminated, giving them the information they need to understand why nodes disappear.
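As a rough illustration of how such an annotation could be consumed, the sketch below reads `pod-termination-reason` from a pod object represented as a plain dictionary. Only the annotation key comes from the text above; the helper name `termination_reason`, the example pod, and the free-text value format are assumptions made for illustration, not Titus's actual schema.

```python
from typing import Optional

# The annotation key is named in the article; the value format is an assumption.
ANNOTATION_KEY = "pod-termination-reason"


def termination_reason(pod: dict) -> Optional[str]:
    """Return the recorded termination reason for a pod, or None if absent."""
    # "annotations" may be missing or explicitly null on a pod object.
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return annotations.get(ANNOTATION_KEY)


# Hypothetical pod objects, shaped like the metadata section of a Pod manifest.
example_pod = {
    "metadata": {
        "name": "example-pod",
        "annotations": {ANNOTATION_KEY: "node was lost"},
    }
}
pod_without_reason = {"metadata": {"name": "other-pod", "annotations": None}}

print(termination_reason(example_pod))
print(termination_reason(pod_without_reason))
```

Returning `None` when the annotation is missing keeps "no reason recorded" distinct from an empty reason string, which matters when the goal is to find pods whose termination is still unexplained.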
Read at InfoQ