
Add support for async session to python-mwapi
Closed, Resolved · Public

Description

python-mwapi, the Python client for the MediaWiki API, lacks support for asynchronous frameworks. We would like to add asynchronous functionality to mwapi using aiohttp, an asynchronous HTTP client/server library.

For revscoring models, now that the MWAPI HTTP cache has been implemented in the API extractor (https://github.com/wikimedia/revscoring/pull/522), we can fetch data from the MediaWiki API asynchronously and populate the MWAPICache object, which avoids calling the blocking mwapi Session.

For non-revscoring models like the outlink-topic model, we can simply replace the blocking mwapi Session with the non-blocking mwapi AsyncSession to fetch data asynchronously.
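
To illustrate why a non-blocking session helps, here is a minimal standard-library sketch. `StubAsyncSession` is a hypothetical stand-in, not the actual mwapi class, and its `get` method only simulates network latency with `asyncio.sleep`; the real AsyncSession interface may differ.

```python
import asyncio
import time


class StubAsyncSession:
    """Hypothetical stand-in for a non-blocking API session (not mwapi's class)."""

    async def get(self, **params):
        # Simulate network latency; a real session would await an HTTP request here.
        await asyncio.sleep(0.2)
        return {"params": params}


async def fetch_all(session, titles):
    # Start all requests at once; the event loop interleaves them while each waits.
    return await asyncio.gather(
        *(session.get(action="query", titles=title) for title in titles)
    )


start = time.monotonic()
results = asyncio.run(fetch_all(StubAsyncSession(), ["A", "B", "C"]))
elapsed = time.monotonic() - start
print(len(results), "responses in", round(elapsed, 2), "seconds")
```

Because the three simulated requests overlap, the total time is close to the latency of one request rather than the sum of all three, which is what a blocking session would cost.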

PR: https://github.com/mediawiki-utilities/python-mwapi/pull/46

Event Timeline

Aiko's patch has been merged!

New version of mwapi released by Aaron: https://pypi.org/project/mwapi/0.6.0/ :)

The current version has an issue in Python 3.7 :(

Referencing asyncio.exceptions.TimeoutError raises AttributeError: module 'asyncio' has no attribute 'exceptions' in Python 3.7. In Python 3.7 the timeout exception is concurrent.futures._base.TimeoutError; it is asyncio.exceptions.TimeoutError only from Python 3.8 onward.

The fix is to change it to asyncio.TimeoutError, which works in both versions. Here is the patch: https://github.com/AikoChou/python-mwapi/commit/90d2f5d40a9aaaf9822b8e59540b30c85965621a
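
A minimal sketch of the portable spelling, using only the standard library. `slow_request` and `fetch_with_timeout` are hypothetical names for illustration and are not taken from the patch; the point is only that `asyncio.TimeoutError` is a valid name in both Python 3.7 and later versions.

```python
import asyncio


async def slow_request():
    # Hypothetical stand-in for an API call that takes too long.
    await asyncio.sleep(0.2)
    return "done"


async def fetch_with_timeout(timeout):
    try:
        return await asyncio.wait_for(slow_request(), timeout=timeout)
    except asyncio.TimeoutError:
        # asyncio.TimeoutError resolves on Python 3.7 and on 3.8+,
        # unlike asyncio.exceptions.TimeoutError, which needs 3.8+.
        return None


timed_out = asyncio.run(fetch_with_timeout(0.01))
completed = asyncio.run(fetch_with_timeout(2))
print(timed_out, completed)
```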

@achou let's create a pull request when you are ready, I'll ask Aaron to review and cut 6.1 :)