Maintenance script to convert between different representations of interwiki / sites info
Open, Medium, Public

Description

The intent is to allow us to generate site info files (like P3044) from old CDB files or from the interwiki table, as well as to convert between different representations of site info. Some examples:

To migrate from a CDB file generated by dumpInterwiki.php to the PHP format used by SiteLookup:

convertSiteInfo.php interwiki:cdb:/foo/bar.cdb sites:php:/foo/bar.php

To dump the content of the sites table as JSON:

convertSiteInfo.php sites:db sites:json:/foo/bar.json

To dump the content of the interwiki table as JSON:

convertSiteInfo.php interwiki:db sites:json:/foo/bar.json

To convert from old-style sites JSON to new style JSON, with pre-computed indexes:

convertSiteInfo.php sites:json:/foo/bar-old.json sites:json:/foo/bar.json --index

To combine multiple PHP and JSON files into a single PHP file:

convertSiteInfo.php sites:json:foo.json sites:php:bar.php sites:php:combined.php
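
The source and destination arguments in the examples above all follow a kind:format[:path] pattern. As a rough illustration only, here is a minimal Python sketch of how such an argument could be split and validated. The names SiteInfoSpec and parse_spec are hypothetical, as are the sets of accepted kinds and formats (taken from the examples); none of this is part of the proposed script or of MediaWiki.

```python
from typing import NamedTuple, Optional

# Assumption: these are the kinds and formats seen in the examples above.
KINDS = {"interwiki", "sites"}
FORMATS = {"db", "cdb", "php", "json"}


class SiteInfoSpec(NamedTuple):
    """Hypothetical parsed form of an argument such as 'sites:json:/foo/bar.json'."""
    kind: str
    fmt: str
    path: Optional[str]


def parse_spec(spec: str) -> SiteInfoSpec:
    """Split 'kind:format[:path]' into its parts and validate them."""
    # Split at most twice so that a path containing ':' stays intact.
    parts = spec.split(":", 2)
    if len(parts) < 2:
        raise ValueError(f"expected kind:format[:path], got {spec!r}")
    kind, fmt = parts[0], parts[1]
    path = parts[2] if len(parts) == 3 else None
    if kind not in KINDS:
        raise ValueError(f"unknown kind {kind!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format {fmt!r}")
    if fmt == "db":
        # Database sources name a table via the kind, so they take no path.
        if path is not None:
            raise ValueError("the db format does not take a path")
    elif not path:
        raise ValueError(f"the {fmt} format requires a file path")
    return SiteInfoSpec(kind, fmt, path)
```

With this sketch, parse_spec("sites:db") yields a spec with no path, while parse_spec("interwiki:cdb:/foo/bar.cdb") carries the file path along.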

The way sources and destinations are specified in the examples above is a bit ad hoc and possibly unnecessary. The file extension should be sufficient in most cases: .cdb will trigger the classic interwiki code, wrapped in a SiteLookup adapter, to be used for reading; .json will trigger a SiteLookup (which should support the old as well as the new JSON structure); .php will also be read by a SiteLookup. For output, only the new-style JSON and PHP structures would be supported, using T135156: Create a SiteStore that can write JSON and PHP files. We'll still need some special syntax to indicate that we want to read from the interwiki table or the sites table, though.
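
The extension-based selection described above could look roughly like the following Python sketch. It is only an illustration under assumptions: the function name infer_spec, the mapping table, and the returned kind:format:path strings are hypothetical and simply mirror the explicit syntax used in the examples. Database sources are deliberately not covered, since they have no file extension and would still need explicit syntax.

```python
import os

# Assumption: mapping from file extension to the explicit "kind:format" prefix.
HANDLER_BY_EXTENSION = {
    ".cdb": "interwiki:cdb",  # classic interwiki code behind a SiteLookup adapter
    ".json": "sites:json",    # SiteLookup, old or new JSON structure
    ".php": "sites:php",      # SiteLookup reading a PHP file
}

# Only the new-style JSON and PHP structures are meant to be writable.
WRITABLE_HANDLERS = {"sites:json", "sites:php"}


def infer_spec(path: str, for_output: bool = False) -> str:
    """Return an explicit 'kind:format:path' string inferred from the extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in HANDLER_BY_EXTENSION:
        raise ValueError(f"cannot infer a format from extension {ext!r}")
    handler = HANDLER_BY_EXTENSION[ext]
    if for_output and handler not in WRITABLE_HANDLERS:
        raise ValueError(f"{handler} is supported for reading only")
    return f"{handler}:{path}"
```

Here infer_spec("/foo/bar.cdb") would select the CDB reader, while infer_spec("/foo/bar.cdb", for_output=True) is rejected because CDB is a read-only source in this scheme.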

NOTE: We should also consider supporting the XML format used by maintenance/exportSites.php; see docs/sitelist.txt.

Event Timeline

Krenair renamed this task from Maintenance script to convert between different representatiosn of interwiki / sites info to Maintenance script to convert between different representations of interwiki / sites info. May 12 2016, 5:36 PM

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)