research/mwaddlink has an all-repo CI failure
Closed, Resolved · Public

Description

It is currently impossible to merge patches in research/mwaddlink:

ERROR    app:app.py:1458 Exception on /v1/linkrecommendations/wikipedia/simple/Cat [GET]
Traceback (most recent call last):
  File "/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py", line 1518, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/srv/app/app.py", line 303, in query
    sections_to_exclude=data["sections_to_exclude"][:25],
  File "/srv/app/src/query.py", line 55, in run
    sections_to_exclude=sections_to_exclude,
  File "/srv/app/src/scripts/utils.py", line 320, in process_page
    page_wikicode, redirects=redirects, pageids=pageids
  File "/srv/app/src/scripts/utils.py", line 93, in getLinks
    link = resolveRedirect(link, redirects)
  File "/srv/app/src/scripts/utils.py", line 108, in resolveRedirect
    return redirects.get(link, link)
  File "/usr/lib/python3.7/_collections_abc.py", line 660, in get
    return self[key]
  File "/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/sqlitedict.py", line 245, in __getitem__
    return self.decode(item[0])
  File "/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/sqlitedict.py", line 102, in decode
    return loads(bytes(obj))
ValueError: unsupported pickle protocol: 5

This started shortly after T344832#9122501 was completed, so it may be caused by incorrect simplewiki datasets having been uploaded to the analytics.wikimedia.org server. @kevinbazira, do you have any thoughts on this?
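For context on the error itself: pickle protocol 5 was added in Python 3.8, while the paths in the traceback show the CI environment running Python 3.7, which can only read protocols up to 4. The following is a minimal sketch, not code from the repository, of how one might check which protocol a pickled value declares and write values that an older interpreter can still load. The helper names `pickle_protocol` and `dump_compatible` are hypothetical.

```python
import pickle
from typing import Any, Optional


def pickle_protocol(blob: bytes) -> Optional[int]:
    """Return the protocol number declared in a pickle header.

    Pickles written with protocol 2 or later start with the PROTO
    opcode (0x80) followed by one byte holding the protocol number.
    Protocols 0 and 1 have no such header, so None is returned.
    """
    if len(blob) >= 2 and blob[0] == 0x80:
        return blob[1]
    return None


def dump_compatible(obj: Any, max_protocol: int = 4) -> bytes:
    """Pickle obj with a protocol no newer than max_protocol.

    Hypothetical helper: capping at 4 keeps the output readable
    by Python 3.4 through 3.7, which do not know protocol 5.
    """
    return pickle.dumps(obj, protocol=min(max_protocol, pickle.HIGHEST_PROTOCOL))
```

Under these assumptions, a dataset written on a newer Python with the default or highest protocol would need to be regenerated with an explicit `protocol=4` (or lower) before a Python 3.7 consumer could read it.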

Event Timeline

I reproduced the error locally (using the docker-registry.wikimedia.org/wikimedia/research-mwaddlink:test image). It does not seem to affect the hosted API endpoint, at least: https://api.wikimedia.org/service/linkrecommendation/v1/linkrecommendations/wikipedia/simple/Cat works as expected. There are no issues locally with a MariaDB-backed approach, so the SQLite-backed datasets (the failing call goes through sqlitedict) might play a role.
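One way to confirm this kind of suspicion locally would be to look at the raw values in the downloaded SQLite file without unpickling them. The sketch below uses only the standard library and assumes the file follows the layout sqlitedict creates (a table with `key` and `value` columns, where `value` is a pickled blob). The function name, the database path, and the table name passed in are all hypothetical and would need to match the actual dataset.

```python
import sqlite3
from collections import Counter


def protocols_in_table(db_path: str, table: str, limit: int = 100) -> Counter:
    """Count the pickle protocols declared by values in a SQLite table.

    Assumes a sqlitedict-style table with a `value` BLOB column.
    Values are never unpickled; only the two-byte header is read,
    so this also works on an interpreter that cannot load them.
    """
    conn = sqlite3.connect(db_path)
    try:
        # The table name is quoted as an identifier; it cannot be bound as a parameter.
        rows = conn.execute(
            f'SELECT value FROM "{table}" LIMIT ?', (limit,)
        ).fetchall()
    finally:
        conn.close()

    counts: Counter = Counter()
    for (value,) in rows:
        blob = bytes(value)
        # Protocol >= 2 pickles start with 0x80 followed by the protocol number.
        if len(blob) >= 2 and blob[0] == 0x80:
            counts[blob[1]] += 1
        else:
            counts[None] += 1
    return counts
```

If such a check reported protocol 5 for the simplewiki files, that would be consistent with the `ValueError` seen under Python 3.7.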

Makes merging patches impossible.

I think this is caused by the new published datasets for simplewiki T344832#9122501 (which are called in the CI tests):

{"written_at": "2023-08-28T09:00:15.816Z", "written_ts": 1693213215816155000, "msg": "{'type': 'ValueError', 'description': 'unsupported pickle protocol: 5', 'trace': ['  File \"/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py\", line 2073, in wsgi_app\\n    response = self.full_dispatch_request()\\n', '  File \"/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py\", line 1518, in full_dispatch_request\\n    rv = self.handle_user_exception(e)\\n', '  File \"/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py\", line 1516, in full_dispatch_request\\n    rv = self.dispatch_request()\\n', '  File \"/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/flask/app.py\", line 1502, in dispatch_request\\n    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)\\n', '  File \"/srv/app/app.py\", line 303, in query\\n    sections_to_exclude=data[\"sections_to_exclude\"][:25],\\n', '  File \"/srv/app/src/query.py\", line 55, in run\\n    sections_to_exclude=sections_to_exclude,\\n', '  File \"/srv/app/src/scripts/utils.py\", line 320, in process_page\\n    page_wikicode, redirects=redirects, pageids=pageids\\n', '  File \"/srv/app/src/scripts/utils.py\", line 93, in getLinks\\n    link = resolveRedirect(link, redirects)\\n', '  File \"/srv/app/src/scripts/utils.py\", line 108, in resolveRedirect\\n    return redirects.get(link, link)\\n', '  File \"/usr/lib/python3.7/_collections_abc.py\", line 660, in get\\n    return self[key]\\n', '  File \"/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/sqlitedict.py\", line 245, in __getitem__\\n    return self.decode(item[0])\\n', '  File \"/srv/app/.tox/pytest-integration/lib/python3.7/site-packages/sqlitedict.py\", line 102, in decode\\n    return loads(bytes(obj))\\n']}", "type": "log", "logger": "logger", "thread": "MainThread", "level": "ERROR", "module": "app", "line_no": 128, "correlation_id": 
"4e15095a-4581-11ee-82b5-0242ac110002"}

We can hopefully fix this by doing the following:

Change 952920 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[research/mwaddlink@main] tests: Update expected data for provide_query_{get,post}

https://gerrit.wikimedia.org/r/952920

@MGerlach re-ran the pipeline for simplewiki as described in T345091#9124589. They're using a newer version of the model, resulting in slightly different suggestions. I've uploaded a patch that changes the expected_data.json files for relevant testcases, and the patch got a +2. Moving to sprint for CR.

Thanks for your help @MGerlach!

Great. Thank you @Urbanecm_WMF and @MGerlach for resolving this while I was away. Sorry for the inconvenience.

To avoid this in the future, should we add a section to the README about the tests? For example, it could mention that updating the datasets for the WIKI_IDs used in testing likely requires additional changes (e.g. changing the expected output in fixtures). I was honestly not aware of this, so I didn't anticipate that updating simplewiki would cause the CI to fail completely.

I think that's a great idea. Feel free to upload a patch to update the README, or I can do it later.

Change 952920 merged by jenkins-bot:

[research/mwaddlink@main] tests: Update expected data for provide_query_{get,post}

https://gerrit.wikimedia.org/r/952920

Urbanecm_WMF claimed this task.

Patch's merged, CI's now running seamlessly. Thanks again @MGerlach for the help yesterday!