ErfgoedBot categorisation task keeps crashing
Closed, Resolved (Public)

Description

I noticed that ErfgoedBot has not categorized anything for the last 7 days.

Auditing the logs, the following typically happens:

2016-08-14_13:24:11 Categorize images...
Retrieving 3 pages from commons:commons.
Retrieving 13 pages from commons:commons.
Retrieving 17 pages from commons:commons.
Retrieving 50 pages from commons:commons.
WARNING: /data/project/heritage/pywikibot/pywikibot/family.py:930: FamilyMaintenanceWarning: Family name wikimediachapter does not match family module name wikimedia
Retrieving 50 pages from commons:commons.
Traceback (most recent call last):
  File "/data/project/heritage/pywikibot/pwb.py", line 270, in <module>
    if not main():
  File "/data/project/heritage/pywikibot/pwb.py", line 264, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "/data/project/heritage/pywikibot/pwb.py", line 109, in run_python_file
    main_mod.__dict__)
  File "/data/project/heritage/erfgoedbot/categorize_images.py", line 660, in <module>
    main()
  File "/data/project/heritage/erfgoedbot/categorize_images.py", line 653, in main
    countrycode, lang, countryconfig, commonsCatTemplates, conn, cursor)
  File "/data/project/heritage/erfgoedbot/categorize_images.py", line 543, in processCountry
    countrycode, lang, commonsTemplate, commonsCategoryBase, commonsCatTemplates, page, conn, cursor)
  File "/data/project/heritage/erfgoedbot/categorize_images.py", line 194, in categorizeImage
    currentcats = list(page.categories())
  File "/data/project/heritage/pywikibot/pywikibot/data/api.py", line 2279, in __iter__
    self.data = self.request.submit()
  File "/data/project/heritage/pywikibot/pywikibot/data/api.py", line 1747, in submit
    raise APIError(**result['error'])
pywikibot.data.api.APIError: maxlag: Waiting for 10.64.48.150: 5.7911479473114 seconds lagged
CRITICAL: Waiting for 1 network thread(s) to finish. Press ctrl-c to abort
<class 'pywikibot.data.api.APIError'>
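
For illustration, the traceback above shows the maxlag error propagating uncaught out of page.categories() and ending the run. A caller-side workaround could retry the call after a pause. The following is only a minimal sketch under stated assumptions, not the fix that was applied: retry_on_maxlag and its parameters are hypothetical names, and the APIError class is a stand-in that only mimics the code and info attributes of pywikibot.data.api.APIError so that the example runs without pywikibot installed.

```python
import time


class APIError(Exception):
    """Stand-in for pywikibot.data.api.APIError (assumption, for this sketch only)."""

    def __init__(self, code, info):
        super().__init__("%s: %s" % (code, info))
        self.code = code
        self.info = info


def retry_on_maxlag(func, max_attempts=5, base_delay=5.0, sleep=time.sleep):
    """Call func() and retry with a growing pause while the API reports maxlag.

    The pause doubles after every failed attempt. Any other API error, or a
    maxlag error on the last allowed attempt, is re-raised unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except APIError as error:
            if error.code != 'maxlag' or attempt == max_attempts:
                raise
            # Give the lagged database replica some time before asking again.
            sleep(base_delay * 2 ** (attempt - 1))
```

In categorize_images.py, the failing line could then hypothetically be wrapped as currentcats = retry_on_maxlag(lambda: list(page.categories())), with the real pywikibot exception class in place of the stand-in.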

Event Timeline

I presume this is due to T144023. It looks like the fix was backported to the 2.0 branch, so I’ll pull that.

(I wonder whether we should be running master?)

Mentioned in SAL [2016-08-31T20:27:37Z] <JeanFred> Pulled latest pywikibot (branch 2.0) from Git: 8 commits, including fix for T144438.

I relaunched a categorize_images task, let’s see how it goes.

I have noticed a lot of harvesting crashes recently, which resulted in many monuments being dropped, but I have found little evidence that they are linked to this issue. I really need to get around to setting up Sentry :-/

The categorisation task has been running for 12 hours without crashing, so I guess it is fine? It’s a bit worrying that it has been spinning its wheels for nothing for 12 hours, but that’s a different story >_>

JeanFred claimed this task.

We're not running master? Why?

I don’t know. I just noticed that we are running the 2.0 branch (it’s possible I set it up like that, or @Lokal_Profil did, or someone else; I can’t remember). Shall we switch to master, Maarten?

Honestly, I am not entirely clear about which changes get backported to 2.0 and which don't, or how structured, transparent and discoverable that process is in general.

The main reason to be running 2.0, in my view, is that it allows us to get it through pip (and thus have it as part of CI and various other environments). Switching to master would lose at least some of these.