I ran stress tests a few times, but the servers quickly locked up with ScoreProcessorOverloaded. Make sure the production service can't die in this way.
When the Redis "celery" queue fills with pending jobs, we can hit a limit (normally configured to 100, 400 for the stress tests) at which new jobs cannot be processed. I'm not sure how, but I currently see 481 items in the queue. This number neither grows nor shrinks, regardless of new requests.
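For reference, the overload condition described above can be checked by hand. This is only a rough sketch, not the service's actual check: `queue_is_overloaded` is a hypothetical helper, `client` is assumed to be a redis-py style object exposing `llen`, and the `>=` comparison is my guess at how the limit is applied.

```python
def queue_is_overloaded(client, queue="celery", limit=100):
    """Return True if the pending-job list has reached the limit.

    Assumptions (not taken from the service code): `client` exposes
    llen(name) like redis-py, and the limit is inclusive.
    """
    return client.llen(queue) >= limit
```

With redis-py this would be called as `queue_is_overloaded(redis.Redis(), limit=400)` for the stress-test configuration.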
The fix is probably to have a periodic job that expires old pending jobs. A Redis TTL won't work here, because TTLs apply to whole keys and not to individual elements inside a list.
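A minimal sketch of what such a cleanup job could look like, under some loud assumptions: `expire_stale_jobs` is a hypothetical name, `client` is a redis-py style object (`lrange` and `lrem` with redis-py 3.x argument order), and each queued message is JSON carrying a hypothetical `enqueued_at` Unix timestamp. The real Celery message format may not include such a field, so the age lookup would need adapting.

```python
import json
import time


def expire_stale_jobs(client, queue="celery", max_age_seconds=60, now=None):
    """Remove pending jobs older than max_age_seconds from a Redis list.

    Assumes each list element is JSON with a hypothetical "enqueued_at"
    Unix timestamp. Elements that cannot be parsed are left untouched.
    Returns the number of elements removed.
    """
    now = time.time() if now is None else now
    removed = 0
    # Snapshot the list, then remove stale entries one by one by value.
    for raw in client.lrange(queue, 0, -1):
        try:
            enqueued_at = float(json.loads(raw)["enqueued_at"])
        except (ValueError, KeyError, TypeError):
            continue  # unknown format: do not guess, keep the job
        if now - enqueued_at > max_age_seconds:
            # lrem(name, count, value) removes up to `count` matches.
            removed += client.lrem(queue, 1, raw)
    return removed
```

This could run from a cron job or a periodic task. It scans the whole list on each run, which should be acceptable for a queue capped at a few hundred items.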
This problem appeared because all jobs are timing out.