Thumbor 404s on an auth failure to Swift
Open, Needs Triage, Public

Description

During T331820 we saw elevated rates of 404 errors. A 404 is a poor signal for failures within the service, since it normally means the requested file does not exist. Based on some quick grepping, it doesn't look like we do this intentionally; it is possibly a side effect of overly broad exception handling.

Event Timeline

This error manifests in the 404 log as

2023-03-14 11:14:37,815 8834 thumbor:ERROR [SWIFT_LOADER] get_object failed: SWIFT_URL ClientException('Auth GET failed',)

Any swiftclient.exceptions.ClientException raised while loading the file causes a 404 (see rTHMBREXT wikimedia_thumbor/loader/swift/__init__.py:152-158). The documentation vaguely indicates that an HTTP response code should be available in ClientException.http_status, so that could be used to adjust the Thumbor response.
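
A rough sketch of what that could look like, not the actual loader code: the helper name response_status_for, the stand-in exception class, and the choice of 502/500 for non-404 failures are all assumptions made for illustration. The only thing taken from swiftclient is that ClientException exposes an http_status attribute.

```python
# Stand-in for swiftclient.exceptions.ClientException so this sketch runs
# without python-swiftclient installed. Only http_status matters here.
class FakeClientException(Exception):
    def __init__(self, msg, http_status=None):
        super().__init__(msg)
        self.http_status = http_status


def response_status_for(exc):
    """Hypothetical helper: choose the status Thumbor returns for a Swift error."""
    swift_status = getattr(exc, "http_status", None)
    if swift_status == 404:
        # Swift says the object does not exist: a real "not found".
        return 404
    if swift_status in (401, 403):
        # Auth failure between Thumbor and Swift; not the client's fault.
        # 502 is an assumed choice, any 5xx would separate it from a 404.
        return 502
    if swift_status is not None and swift_status >= 500:
        # Swift itself is failing.
        return 502
    # No usable status (for example a connection error): generic server error.
    return 500
```

With something like this, the 'Auth GET failed' case from the log above (a 401 from Swift) would surface as a 5xx instead of being counted as a 404.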

@MatthewVernon FYI, I think I was able to find this incident again in Logstash via the thumbor logger:
https://logstash.wikimedia.org/goto/670be7bfe6e76cab960dd01ab763d6f9

ClientException: Auth GET failed: https://ms-fe.svc.eqiad.wmnet/auth/v1.0 401 Unauthorized [first 60 chars of response] <html><h1>Unauthorized</h1><p>This server could not verify t