Response time from https://tools.wmflabs.org/scholia/ varies
Closed, Resolved · Public

Description

I have previously experienced slow performance from a Python Flask webapp on Toolforge. That was in connection with Wembedder, which used a lot of memory. For the past few days (since 5-10 September 2017?) I have been experiencing slow responses from the Scholia webservice at https://tools.wmflabs.org/scholia/. Scholia is not a memory-intensive webservice. My local instance running at http://127.0.0.1:5000/ uses little memory and responds quickly.

The response time from https://tools.wmflabs.org/scholia/ varies. Sometimes the initial page is returned quite quickly. (The complete page rendering may take many seconds, but that is not the issue here.) At other times the response is slow, e.g., up to 40 seconds for the main page.
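One way to put numbers on this variation is to time repeated requests for the initial page from outside the tool. The following is only a minimal sketch using the Python standard library; the function names `time_request` and `sample_latency`, the `opener` parameter, and the default timeout and pause values are illustrative assumptions, not part of Scholia or Toolforge. It measures only the time to fetch the initial HTML, not the complete page rendering.

```python
import time
import urllib.request


def time_request(url, timeout=60, opener=urllib.request.urlopen):
    """Return the seconds taken to fetch and fully read one response.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib.request.urlopen.
    """
    start = time.perf_counter()
    with opener(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def sample_latency(url, n=10, pause=1.0, **kwargs):
    """Time `n` sequential requests, sleeping `pause` seconds between them."""
    timings = []
    for i in range(n):
        timings.append(time_request(url, **kwargs))
        if i < n - 1:
            time.sleep(pause)
    return timings
```

Calling `sample_latency("https://tools.wmflabs.org/scholia/", n=5)` would return five timings in seconds; a large spread between the minimum and maximum would match the behaviour described above.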

My uwsgi.log shows:

[pid: 7358|app: 1|req: 725/3184] 10.68.21.81 () {40 vars in 835 bytes} [Tue Sep 12 14:47:31 2017] GET /scholia/venue/Q38999505 => generated 12087 bytes in 16271 msecs (HTTP/1.1 200) 2 headers in 82 bytes (1 switches on core 0)
[pid: 7360|app: 1|req: 789/3185] 10.68.21.81 () {42 vars in 840 bytes} [Tue Sep 12 14:47:35 2017] GET /scholia/ => generated 6541 bytes in 11952 msecs (HTTP/1.1 200) 2 headers in 81 bytes (1 switches on core 0)
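Request lines of this shape can be filtered for slow responses with a short script. This is a minimal sketch that assumes every request line in uwsgi.log follows the format of the two lines above; the regular expression, the function names `parse_line` and `slow_requests`, and the 5000 ms default threshold are illustrative assumptions.

```python
import re

# Matches request lines shaped like the two uwsgi.log entries quoted above.
LINE_RE = re.compile(
    r"\[pid: (?P<pid>\d+)\|.*?\] "
    r"(?P<method>[A-Z]+) (?P<path>\S+) => generated (?P<size>\d+) bytes "
    r"in (?P<msecs>\d+) msecs \(HTTP/\S+ (?P<status>\d{3})\)"
)

SAMPLE = [
    "[pid: 7358|app: 1|req: 725/3184] 10.68.21.81 () {40 vars in 835 bytes} "
    "[Tue Sep 12 14:47:31 2017] GET /scholia/venue/Q38999505 => generated "
    "12087 bytes in 16271 msecs (HTTP/1.1 200) 2 headers in 82 bytes "
    "(1 switches on core 0)",
    "[pid: 7360|app: 1|req: 789/3185] 10.68.21.81 () {42 vars in 840 bytes} "
    "[Tue Sep 12 14:47:35 2017] GET /scholia/ => generated 6541 bytes in "
    "11952 msecs (HTTP/1.1 200) 2 headers in 81 bytes (1 switches on core 0)",
]


def parse_line(line):
    """Return a dict of request fields, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m.group("pid")),
        "method": m.group("method"),
        "path": m.group("path"),
        "size": int(m.group("size")),
        "msecs": int(m.group("msecs")),
        "status": int(m.group("status")),
    }


def slow_requests(lines, threshold_msecs=5000):
    """Return parsed requests that took at least `threshold_msecs`."""
    parsed = (parse_line(line) for line in lines)
    return [r for r in parsed if r is not None and r["msecs"] >= threshold_msecs]


for request in slow_requests(SAMPLE):
    print(request["pid"], request["path"], request["msecs"])
```

Grouping the results by `pid` or by `path` could show whether the slowness is tied to a particular worker process or to particular pages.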

There does not seem to be any major load on the webservice.

I wonder how I can debug this or whether there is any suggestion for what is happening.

Event Timeline

Fnielsen created this task. Sep 12 2017, 2:56 PM
Restricted Application added a subscriber: Aklapper. Sep 12 2017, 2:56 PM
bd808 renamed this task from "Slow performance on toolforge project" to "Response time from https://tools.wmflabs.org/scholia/ varies". Sep 12 2017, 11:15 PM
bd808 added a project: Tools.
bd808 added a subscriber: bd808. Sep 12 2017, 11:29 PM

Based on the contents of /data/project/scholia/service.manifest, this tool is running on Grid Engine. The first thing I would personally try is migrating it to run on the Kubernetes cluster instead. Neither Kubernetes nor Grid Engine performs active balancing of tools (rescheduling on a new node when the total system load is high), but Kubernetes does a better job than Grid Engine of isolating tools from each other and respecting quality of service guarantees.

There is pretty good documentation at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Python_.28uWSGI.29 on the various ways that a Python uWSGI application can be run. Moving from Grid Engine to Kubernetes would at minimum require rebuilding the virtualenv environment used by the application. If the application code is Python 3 compatible, I would actually recommend switching from Python 2 to Python 3 at the same time.

Fnielsen closed this task as Resolved. Sep 13 2017, 11:15 AM
Fnielsen claimed this task.