As mentioned in {T308989#7953277}, we should use Kubernetes' liveness/readiness probes to at least restart the pods automatically if/when this happens.
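For illustration only, a probe setup might look roughly like the fragment below. This is a sketch, not the actual deployment config: the container name, image, port, the `/healthz` and `/readyz` paths, and all timing values are assumptions, and the service would need to actually expose such endpoints.

```yaml
# Hypothetical container spec fragment. Every name, path, port, and
# timing value here is a placeholder, not taken from the real chart.
containers:
  - name: api-worker              # assumed container name
    image: example/api-worker     # assumed image
    ports:
      - containerPort: 8080       # assumed port
    # If this probe fails failureThreshold times in a row,
    # the kubelet restarts the container.
    livenessProbe:
      httpGet:
        path: /healthz            # assumed health endpoint
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 2
      failureThreshold: 3
    # If this probe fails, the pod is taken out of Service
    # endpoints until it passes again; it is not restarted.
    readinessProbe:
      httpGet:
        path: /readyz             # assumed readiness endpoint
        port: 8080
      periodSeconds: 5
      timeoutSeconds: 2
      failureThreshold: 2
```

The liveness probe is the part that restarts a stuck worker; the readiness probe only stops traffic from being routed to a pod that cannot serve it yet.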
Description
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T310754 Recurrent API worker failures
Open | | None | T310753 Set up liveliness/readiness probes