Good observability on operational metrics, as much automation as possible, clear processes and playbooks for escalations.
Most engineering orgs Iโve seen require Eng oncalls. The perspective above is a simple non expensive way to minimize time investigations eg 5% movement of a metric. Teams spend lots of time asking โis this a real world effect, if so, should we be worried?โ
Good observability on operational metrics, as much automation as possible, clear processes and playbooks for escalations.
Most engineering orgs Iโve seen require Eng oncalls. The perspective above is a simple non expensive way to minimize time investigations eg 5% movement of a metric. Teams spend lots of time asking โis this a real world effect, if so, should we be worried?โ