Today Thumbor paged because its blackbox probes were failing: the service was slow and breached the probe timeout (10s, configured in service::catalog IIRC).
Additionally, it looks like thumbor units are getting killed with SIGABRT (and restarted). haproxy-reported latencies have also gone up over the past few days: https://grafana.wikimedia.org/d/Pukjw6cWk/thumbor?orgId=1&from=now-7d&to=now