Increase the nginx proxy timeouts in superset to 185 seconds
Closed, Resolved (Public)

Description

This ticket is being created due to a discussion started on Slack about the current Superset timeout behavior.

In this ticket from 2021, T294771: Increase Superset Timeout, we decided that we should support a timeout of 3 minutes all the way through the Superset request stack.

NOTE: We do this instead of supporting async queries, which are discussed separately in: T397338: Enable async queries for Superset with Celery

We believe that this 3-minute timeout was working correctly when Superset was running on bare metal and VMs, but we then migrated Superset to Kubernetes in T347710.

It looks like we inadvertently introduced a component with a 60s timeout, whereas everything else is 185 or 180 seconds.

In this diagram, we have several layers through which requests pass:

  • envoyproxy, acting as a TLS terminator
  • nginx, acting as a static assets webserver
  • gunicorn, acting as a flask application webserver
  • the superset application, itself
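To illustrate why a single short timeout matters, here is a rough sketch: a request that passes through every layer can only run for as long as the shortest timeout in the chain. The per-layer values below are illustrative placeholders, not the real settings. The ticket only establishes that one component is at 60s and everything else is at 180 or 185 seconds. The function name effective_timeout is made up for this example.

```python
from typing import Dict, Tuple


def effective_timeout(layers: Dict[str, int]) -> Tuple[str, int]:
    """Return the (layer, seconds) pair with the shortest timeout in the chain."""
    name = min(layers, key=layers.get)
    return name, layers[name]


# ASSUMPTION: placeholder values for illustration only. Which layer uses 180
# and which uses 185 is not stated in the ticket; only the 60s outlier is.
layers = {
    "envoyproxy": 185,
    "nginx": 60,
    "gunicorn": 185,
    "superset": 180,
}

bottleneck, seconds = effective_timeout(layers)
print(f"{bottleneck} ends requests after {seconds}s")
```

With these placeholder numbers, any query that takes longer than 60 seconds fails at the nginx layer, regardless of the longer limits set elsewhere.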

We have timeouts of 180 or 185 seconds set in:

But it looks like we missed the following two parameters from the nginx config:

We can see that there is no configuration value for these in the nginx pod.

btullis@deploy2002:~$ kubectl exec -it superset-production-7779c8cd56-q2qpv -c superset-production-assets -- bash

runuser@superset-production-7779c8cd56-q2qpv:/app$ nginx -T|grep timeout
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

Therefore, both of them will be assigned their default value of 60s.

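I should think that proxy_read_timeout is the more important of the two, since this is how long nginx will wait for a response from gunicorn after having sent a request. Strictly speaking, nginx applies it between two successive reads from the upstream rather than to the whole response, but for a slow query that returns nothing until it completes, the effect is the same.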
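As a minimal sketch of the reasoning above, the following shows how an empty grep result translates into effective values: any proxy timeout directive missing from the dumped config falls back to the nginx default of 60s. The function name and the two sample config dumps are hypothetical, made up for this example. The parser is deliberately naive: it takes the last occurrence of each directive and does not model nginx's per-context inheritance.

```python
import re
from typing import Dict

# Documented defaults of the nginx proxy module (ngx_http_proxy_module).
NGINX_PROXY_DEFAULTS = {
    "proxy_connect_timeout": "60s",
    "proxy_send_timeout": "60s",
    "proxy_read_timeout": "60s",
}

# Matches lines such as "    proxy_read_timeout 185s;" but not commented-out ones.
_DIRECTIVE = re.compile(
    r"^\s*(proxy_(?:connect|send|read)_timeout)\s+([^;]+);", re.MULTILINE
)


def effective_proxy_timeouts(config_dump: str) -> Dict[str, str]:
    """Overlay directives found in an `nginx -T` style dump on the defaults."""
    effective = dict(NGINX_PROXY_DEFAULTS)
    for name, value in _DIRECTIVE.findall(config_dump):
        effective[name] = value.strip()
    return effective


# ASSUMPTION: both dumps are invented samples, not the real pod configuration.
dump_without = "server {\n    listen 8080;\n}\n"
dump_with = "location / {\n    proxy_read_timeout 185s;\n}\n"

print(effective_proxy_timeouts(dump_without))
print(effective_proxy_timeouts(dump_with))
```

The first dump contains no timeout directives, so every value stays at the default. The second overrides only proxy_read_timeout and leaves the other two at 60s.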

Event Timeline

Change #1197979 had a related patch set uploaded (by Stevemunene; author: Stevemunene):

[operations/deployment-charts@master] superset: Increase the nginx proxy timeout

https://gerrit.wikimedia.org/r/1197979

Change #1197979 merged by jenkins-bot:

[operations/deployment-charts@master] superset: Increase the nginx proxy timeout

https://gerrit.wikimedia.org/r/1197979

Added the two options to nginx, and we no longer seem to be hitting the timeouts previously seen on some charts.
Marking this as done while we continue to monitor for any potential issues.