
Migrate wikidata-analysis tool away from Stretch Grid Engine
Closed, Resolved (Public)


Reportedly we want to keep using this tool (toolsadmin), so we need to migrate it off of the legacy infrastructure.

As far as I can tell, this tool uses the Stretch Grid Engine in two ways:

  • For the lighttpd webservice. I think this is just providing static web serving (I don’t see any .php files in public_html/), so this should be trivial to move to Kubernetes.
  • For the maplatest-cron cronjob, which runs five times a day. It has also failed to produce output since February 2021, as far as I can tell, so I think it’s at least questionable whether this is actually needed.
Running for latest
***                       Wikidata Toolkit: ToolkitAnalyzer              ***
******************************* Data Directory Layout **********************
* Target storage directory : data/                                         *
* Downloaded dump locations: data/dumpfiles/json-<DATE>/<DATE>-all.json.gz *
* Processor output location: data/<DATE>/                                  *
Targeting latest dump: 20220427
Error: Data directory specified does not exist.
Error: Data directory specified does not exist.
Cleaning up empty public_html dirs
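
One rough way to double-check the "no output since February 2021" observation is to look for the newest file under the job's data directory. The sketch below is only an illustration: the helper name `newest_output_time` is hypothetical, and the relative `data` path is an assumption taken from the "Target storage directory" line in the log, not a confirmed location in the tool's home directory.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def newest_output_time(root: Path) -> Optional[datetime]:
    """Return the UTC mtime of the newest regular file under root.

    Returns None if root is not a directory or contains no files.
    """
    if not root.is_dir():
        return None
    # Walk the whole tree; per-dump subdirectories such as data/<DATE>/ are included.
    mtimes = [p.stat().st_mtime for p in root.rglob("*") if p.is_file()]
    if not mtimes:
        return None
    return datetime.fromtimestamp(max(mtimes), tz=timezone.utc)


if __name__ == "__main__":
    # Assumption: "data" mirrors the target storage directory named in the log.
    latest = newest_output_time(Path("data"))
    if latest is None:
        print("no data directory or no output files found")
    else:
        print(f"newest output file written at {latest.isoformat()}")
```

If this reports a missing directory, that would match the "Data directory specified does not exist" errors in the log above.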

Event Timeline

Webservice looks like it may have already been on k8s?
It definitely is now, though :)

image.png (505×1 px, 108 KB)

So I think that wraps this up!