
Reuse Commonhelper2's config files for category and template matching
Closed, Resolved, Public


This is the overall tracking ticket for reusing the config files of CommonsHelper2.

It should support the following use cases:

  • prohibiting licences that are not compatible with Commons
  • requiring a Commons-compatible licence
  • mapping templates (such as the Information template)
  • removing templates that are no longer necessary

Related Objects

Resolved Lea_WMDE

Event Timeline

Instead of config files, would it be better to "tag" the Wikidata items representing the template/category page somehow? This would allow the community to optimize configuration, without requiring file/repo changes.

@Magnus We discussed that, but if I remember correctly, the problem was that Wikidata maps the template and category names but does not help with the attributes inside a template. Or do you see a solution to that? Do you know how thoroughly the templates and categories are already mapped on Wikidata?

You are correct that this would not solve the attribute mapping, short of some spectacularly horrible hackery.
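
To make the limitation concrete, here is a minimal sketch. It is not taken from CommonsHelper2 or FileImporter, and every name in it is an assumption: a hypothetical source-wiki `Information` template whose attributes `Beschreibung` and `Urheber` should become `description` and `author`. A name-to-name lookup, which is roughly what Wikidata could supply, is enough to rename the template, but the attribute names need a second table that has to be maintained somewhere else.

```python
# Hypothetical data: all names below are illustrative assumptions.

# What a Wikidata-style lookup could provide: local template name -> Commons name.
TEMPLATE_NAMES = {"Information": "Information"}

# What it cannot provide: the attribute names used inside each template.
PARAMETER_NAMES = {
    "Information": {"Beschreibung": "description", "Urheber": "author"},
}


def map_template(name, params):
    """Rename a template and its attributes; unknown attributes are kept as-is."""
    target = TEMPLATE_NAMES.get(name, name)
    param_map = PARAMETER_NAMES.get(name, {})
    return target, {param_map.get(key, key): value for key, value in params.items()}


def render(name, params):
    """Serialize a template call back to wikitext."""
    body = "".join(f"|{key}={value}" for key, value in params.items())
    return "{{" + name + body + "}}"


name, params = map_template("Information", {"Beschreibung": "Ein Foto", "Urheber": "Alice"})
print(render(name, params))
```

Without the second table, the template would keep its name but still carry the untranslated attributes, which is the gap described above.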

Deeper analysis for this and all subtasks:

Naming suggestions:

  • $wgFileImporterCommonsHelperConfigBasePath
  • FileImporter\Remote\MediaWiki\CommonsHelperConfigRetriever (includes the domain to config filename mapping)
  • FileImporter\Services\Wikitext\CommonsHelperConfigParser turns the config page into a value object
  • FileImporter\Data\WikitextConversions is the value object, intentionally not named "CommonsHelper", because I want it to be reusable for other sources that are not the CommonsHelper
  • FileImporter\Services\Wikitext\WikitextContentValidator does all checks that don't change the wikitext
  • FileImporter\Services\Wikitext\WikitextContentCleaner applies all replacements
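
The split between value object, validator, and cleaner could look roughly like the following. This is a language-neutral sketch in Python, not the PHP implementation. The field names, the regular expressions, and the example template names are assumptions made for illustration. The retriever and the parser are left out because the config page format is not described in this ticket.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class WikitextConversions:
    """Value object; deliberately knows nothing about where the rules came from."""
    good_templates: frozenset = frozenset()      # at least one must be present
    bad_templates: frozenset = frozenset()       # none may be present
    obsolete_templates: frozenset = frozenset()  # removed by the cleaner


# Captures the name of each template call, e.g. "Information" in "{{Information|...}}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?=[|}])")


def template_names(wikitext):
    return {match.group(1) for match in TEMPLATE_NAME.finditer(wikitext)}


def validate(wikitext, conversions):
    """Check only; never changes the wikitext. Raises ValueError on failure."""
    names = template_names(wikitext)
    prohibited = names & conversions.bad_templates
    if prohibited:
        raise ValueError(f"prohibited templates: {sorted(prohibited)}")
    if conversions.good_templates and not names & conversions.good_templates:
        raise ValueError("no Commons-compatible licence template found")


def clean(wikitext, conversions):
    """Remove simple, non-nested calls of obsolete templates."""
    for name in conversions.obsolete_templates:
        pattern = r"\{\{\s*" + re.escape(name) + r"\s*(\|[^{}]*)?\}\}\n?"
        wikitext = re.sub(pattern, "", wikitext)
    return wikitext


# Illustrative rules; a real set would come from the parsed config page.
rules = WikitextConversions(
    good_templates=frozenset({"Cc-by-sa-4.0"}),
    bad_templates=frozenset({"Fair use"}),
    obsolete_templates=frozenset({"Move to Commons"}),
)
text = "{{Move to Commons}}\n{{Information|description=A photo}}\n{{Cc-by-sa-4.0}}"
validate(text, rules)
print(clean(text, rules))
```

Because `validate` and `clean` only see the value object, a different rule source could be plugged in later without touching either of them.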


@Lea_WMDE, FYI, @Magnus just confirmed my impression: CommonsHelper2 has not been running anywhere since it was shut down. This has some consequences:

  • The config files are not up to date.
    • This means the quality of the auto-replacements will not be as good as it could be, until the community starts updating the config files again.
    • This means we need to make it obvious how the community can update the auto-replacements.
    • This includes easy-to-reach documentation for the format (and which parts we support).
  • If we are continuing our plan (and I believe we should do so), we are basically hijacking the config files, as we are the only user then.
    • This gives us even more freedom. We could even remove parts we are not supporting, and re-add them later.
    • I want us to follow the principle I already outlined above: let's bind as little code as possible to this specific config file format and the source we fetch it from. (This refers to the two class names that include …CommonsHelper… above.)
Vvjjkkii renamed this task from Reuse Commonhelper2's config files for category and template matching to 5sdaaaaaaa. Jul 1 2018, 1:12 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Bodhisattwa renamed this task from 5sdaaaaaaa to Reuse Commonhelper2's config files for category and template matching. Jul 1 2018, 1:24 PM
Bodhisattwa lowered the priority of this task from High to Medium.
Bodhisattwa updated the task description.
thiemowmde claimed this task.

What this task's description asks for is done. A few sub-tasks are still open, but I would like to make them independent of this umbrella task.