Fix ArticleQuality.js so that it doesn't violate PoolCounter constraints
Closed, Resolved (Public)

Description

Since T160692 (Use poolcounter to limit number of connections to ores uwsgi) was deployed, some of our problematic strategies for querying ORES have come back to bite us. For example, the ArticleQuality.js gadget (T201927: Implement ORES gadget for article quality) sends too many parallel requests to the ORES service and is then throttled. This has left the gadget in a broken state.

So! We need to implement a better strategy for batching requests to ORES. There's a related task for developing an ORES client: T201691 (Implement JS ORES client in mw-ORES extension). Let this task be the groundwork for a more general client.

Event Timeline

Here are the results of my study of thread pooling in JavaScript: https://gist.github.com/halfak/5e0c4951c52f57ff3320aff9b51b757e

Essentially, a system like this lets a developer work with a thread pool without minding the number of threads. It's a simple asynchronous pattern for filing jobs and getting a Promise in response. The system also uses Deferreds to manage its workers' activities.
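To make the pattern concrete, here is a minimal sketch of a bounded worker pool. It is not the code from the gist: the WorkerPool class and its submit() method are hypothetical names, and native Promises stand in for the Deferreds mentioned above.

```javascript
// Hypothetical sketch: a fixed-size pool of asynchronous "workers".
// Native Promises stand in for jQuery Deferreds (an assumption).
class WorkerPool {
  constructor(maxWorkers) {
    this.maxWorkers = maxWorkers; // upper bound on jobs in flight
    this.active = 0;              // jobs currently running
    this.queue = [];              // jobs waiting for a free worker
  }

  // File a job (a function returning a value or a Promise) and get back
  // a Promise for its result.
  submit(job) {
    return new Promise((resolve, reject) => {
      this.queue.push({ job, resolve, reject });
      this._drain();
    });
  }

  // Start queued jobs until the concurrency limit is reached.
  _drain() {
    while (this.active < this.maxWorkers && this.queue.length > 0) {
      const { job, resolve, reject } = this.queue.shift();
      this.active += 1;
      Promise.resolve()
        .then(job)
        .then(resolve, reject)
        .finally(() => {
          // A worker is free again: pick up the next queued job, if any.
          this.active -= 1;
          this._drain();
        });
    }
  }
}
```

A caller would write something like `new WorkerPool(4).submit(() => fetchSomething())` (where fetchSomething is a placeholder) and only ever deal with the returned Promise; how many jobs run at once is the pool's concern.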

Based on this, I needed to implement something slightly more complicated for ORES: I needed to both manage a set of workers *and* batch requests as they come in. The result is captured in this edit: https://meta.wikimedia.org/w/index.php?title=User:EpochFail/ArticleQuality-system.js&diff=18481658&oldid=18414783

Essentially, it implements a worker pool similar to the one described in the threading example above, but it also draws tasks from the queue in batches. Within a batch job, the individual Deferred objects are resolved. This allows a user to make requests as though they were going to ORES one at a time. The interface for ArticleQuality.oresScore() has not changed at all. The scoring jobs are queued, batched, and distributed transparently, so the end-user developer doesn't need to keep track of anything but the Promise.
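The queue-batch-distribute idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the linked edit: BatchingScorer, score(), and the caller-supplied scoreBatch(revIds) function (assumed to return a Promise of an object mapping each revision ID to its score) are all hypothetical names, and native Promises are used in place of Deferreds.

```javascript
// Hypothetical sketch: callers request one score at a time; requests are
// queued, grouped into batches, and sent by a bounded number of workers.
class BatchingScorer {
  // scoreBatch(revIds) is an assumed caller-supplied function returning a
  // Promise of an object mapping each revision ID to its score.
  constructor(scoreBatch, { batchSize = 50, maxWorkers = 2 } = {}) {
    this.scoreBatch = scoreBatch;
    this.batchSize = batchSize;
    this.maxWorkers = maxWorkers;
    this.active = 0;        // batches currently in flight
    this.queue = [];        // pending single-revision requests
    this.scheduled = false; // whether a drain is already pending
  }

  // Same shape as a one-at-a-time call: one revision ID in, one Promise out.
  score(revId) {
    return new Promise((resolve, reject) => {
      this.queue.push({ revId, resolve, reject });
      this._schedule();
    });
  }

  // Defer draining to a microtask so that requests made in the same tick
  // can end up in the same batch.
  _schedule() {
    if (this.scheduled) return;
    this.scheduled = true;
    Promise.resolve().then(() => {
      this.scheduled = false;
      this._drain();
    });
  }

  // Hand batches to free workers until the limit is hit or the queue is empty.
  _drain() {
    while (this.active < this.maxWorkers && this.queue.length > 0) {
      const batch = this.queue.splice(0, this.batchSize);
      this.active += 1;
      Promise.resolve()
        .then(() => this.scoreBatch(batch.map((item) => item.revId)))
        // Resolve each caller's Promise from the shared batch response.
        .then((scores) => batch.forEach((item) => item.resolve(scores[item.revId])))
        // A failed batch rejects every request that was part of it.
        .catch((err) => batch.forEach((item) => item.reject(err)))
        .finally(() => {
          this.active -= 1;
          this._drain();
        });
    }
  }
}
```

The batch size and worker count here are placeholders; in practice they would be chosen to stay under whatever limits the service enforces.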

When testing this out on pages like https://eu.wikipedia.org/wiki/Wikipedia:Wikipedia_guztiek_izan_beharreko_artikuluen_zerrenda/3._maila (where lots of scores are needed to annotate the links on the page), the performance is plenty fast.

(Could you add a project tag to this task? Thanks in advance!)

Halfak claimed this task.
Halfak moved this task from Parked to Completed on the Machine-Learning-Team (Active Tasks) board.