Test superset running on gunicorn + gevent
Closed, Resolved (Public)

Description

Gunicorn is the WSGI HTTP server that runs Superset. We are currently using it in pre-fork mode, with 8 workers. This means that every HTTP request for Superset is handled synchronously by one process, so a request sticks to its worker process until it finishes.

This is fine for CPU-bound tasks, but Superset is mostly I/O bound (waiting for Druid, databases, etc. to return data). The suggestion is to use Gunicorn with gevent, an async I/O library (based on libev or libuv), to turn the HTTP requests into coroutines, so that a single worker process can serve other requests while one of them is waiting on I/O.
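To illustrate why this helps an I/O-bound service, here is a minimal sketch that uses the standard library's asyncio as a stand-in for gevent. gevent itself works with greenlets and monkey-patched blocking calls rather than async/await, so this only demonstrates the general idea. The names `fake_query`, `handle_requests` and `run_demo` are hypothetical, and the sleep is a placeholder for waiting on Druid or a database.

```python
import asyncio
import threading
import time


async def fake_query(name, delay):
    """Hypothetical stand-in for a request that waits on Druid or a database."""
    thread_id = threading.get_ident()
    # While this coroutine "waits on I/O", the event loop runs the others.
    await asyncio.sleep(delay)
    return name, thread_id


async def handle_requests(count, delay):
    """Run all simulated requests concurrently inside one event loop."""
    tasks = [fake_query(f"request-{i}", delay) for i in range(count)]
    # gather() returns results in the same order as the submitted tasks.
    return await asyncio.gather(*tasks)


def run_demo(count=8, delay=0.2):
    """Return the results and the wall-clock time taken to serve them all."""
    start = time.perf_counter()
    results = asyncio.run(handle_requests(count, delay))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo()
    threads = {thread_id for _, thread_id in results}
    print(f"{len(results)} requests, {len(threads)} thread(s), {elapsed:.2f}s")
```

With these defaults, eight requests that each wait 0.2 seconds should complete in roughly the time of a single wait, all on one thread, whereas a worker handling them one after another would need about 1.6 seconds.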

The trick should be to add gunicorn[gevent] to Superset's frozen-requirements.txt, deploy, and then force the use of gevent via the gunicorn config.
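As a rough sketch of the last step, a gunicorn config file (which is plain Python) could select the gevent worker as shown below. This is an assumption about how the setting might look, not the deployed configuration: the file name, the bind address (taken from the startup log further down in this task) and the `worker_connections` value (gunicorn's documented default) are illustrative only.

```python
# gunicorn.conf.py: hypothetical sketch, not the actual deployed config

# Address and port, as they appear in the startup log on this task.
bind = "0.0.0.0:9080"

# Number of worker processes, matching the 8 workers in the description.
workers = 8

# Use the gevent worker; this needs the gevent extra to be installed,
# for example through the gunicorn[gevent] requirement.
worker_class = "gevent"

# Maximum simultaneous clients per worker (gunicorn's default, untuned).
worker_connections = 1000
```

The same choice can also be made on the command line with `gunicorn --worker-class gevent`, so the real deployment may set it there instead of in a config file.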

Event Timeline

Change 599295 had a related patch set uploaded (by Elukey; owner: Elukey):
[analytics/superset/deploy@master] Add gunicorn[gevent] dependency.

https://gerrit.wikimedia.org/r/599295

May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [2] [INFO] Starting gunicorn 20.0.4
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [2] [INFO] Listening at: http://0.0.0.0:9080 (2)
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [2] [INFO] Using worker: gevent
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [5] [INFO] Booting worker with pid: 5
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [6] [INFO] Booting worker with pid: 6
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [7] [INFO] Booting worker with pid: 7
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [8] [INFO] Booting worker with pid: 8
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [9] [INFO] Booting worker with pid: 9
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [10] [INFO] Booting worker with pid: 10
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [11] [INFO] Booting worker with pid: 11
May 28 10:35:08 an-tool1005 superset[8281]: [2020-05-28 10:35:08 +0000] [12] [INFO] Booting worker with pid: 12

On an-tool1005 the code seems to be working!

I was wrong about the current worker type: it is gthread, which is categorized as an "async" worker:

The worker gthread is a threaded worker. It accepts connections in the main loop. Accepted connections are added to the thread pool as a connection job. On keepalive, connections are put back in the loop waiting for an event. If no event happens after the keepalive timeout, the connection is closed.

So it might not be something fully async, but let's test gevent and see how it works.

elukey triaged this task as Medium priority.
elukey added a project: Analytics-Kanban.

Change 599295 merged by Elukey:
[analytics/superset/deploy@master] Add gunicorn[gevent] dependency.

https://gerrit.wikimedia.org/r/599295

Change 602357 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::superset: move to gevent

https://gerrit.wikimedia.org/r/602357

Change 602357 merged by Elukey:
[operations/puppet@production] profile::superset: move to gevent

https://gerrit.wikimedia.org/r/602357

Deployed, let's leave this running for some days before closing.

elukey set Final Story Points to 5.