
Investigate the parsing of the file info wikitext to forbid files with a wrong licence
Closed, Resolved (Public)

Description

Task
Investigate how to parse the wikitext of the file description page. We want to find out whether any of the forbidden categories or templates appears there, so that the move can be disallowed.
We are only interested in the most current version of the file page.

Acceptance Criteria

  • There is a clear idea of how the file page should be parsed and what the costs are

Notes

  • Possible approaches: a regex over the wikitext, or an API call to the source wiki to fetch metadata from there (e.g. the templates used, which would also include indirect transclusions)
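
To illustrate the regex option from the notes, here is a minimal Python sketch. It is not code from FileImporter; the patterns and the helper names (`direct_templates`, `direct_categories`, `normalize`) are assumptions made for illustration. It only sees templates and categories written directly in the wikitext, so anything added through a nested template (an indirect transclusion) is missed, which is the main weakness of this approach.

```python
import re

# Hypothetical patterns: match "{{Name" up to the first "|" or "}}",
# and "[[Category:Name" up to the first "|" or "]]".
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|#<>\[\]]+?)\s*(?:\||\}\})")
CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\||\]\])", re.IGNORECASE
)


def normalize(name):
    """Roughly mimic MediaWiki title normalisation (underscores, first letter)."""
    name = name.replace("_", " ").strip()
    return name[:1].upper() + name[1:]


def direct_templates(wikitext):
    """Return the names of templates written directly in the wikitext."""
    return {normalize(m) for m in TEMPLATE_RE.findall(wikitext)}


def direct_categories(wikitext):
    """Return the names of categories written directly in the wikitext."""
    return {normalize(m) for m in CATEGORY_RE.findall(wikitext)}


sample = (
    "{{Information\n|description=A logo}}\n"
    "{{non-free_logo}}\n"
    "[[Category:Logos|sortkey]]"
)
templates = direct_templates(sample)
categories = direct_categories(sample)
```

A category that a license template adds on its own would not show up in `categories` here, because it never appears in the page's wikitext.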

Event Timeline

Lea_WMDE triaged this task as Medium priority. May 2 2018, 11:46 AM
Lea_WMDE created this task.
Lea_WMDE moved this task from Backlog to Tickets ready for pickup on the Move-Files-To-Commons board.

Change 430415 had a related patch set uploaded (by Andrew-WMDE; owner: Andrew-WMDE):
[mediawiki/extensions/FileImporter@master] [Proof of Concept] Investigate the parsing of categories and templates

https://gerrit.wikimedia.org/r/430415

The above patch is a PoC showing that we can get the lists of templates and categories from the API of the source wiki. That is also how the CommonsHelper2 tool gets them. Since this is pretty cheap and gives us clear and clean results, we should definitely go this way.
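
The API approach described in this comment can be sketched roughly as follows. This is not the code from the patch; it is a minimal Python illustration that assumes the standard MediaWiki `action=query` module with `prop=templates|categories` and `formatversion=2`. The function names, the example API URL, and the forbidden lists are hypothetical, and result continuation is left out.

```python
from urllib.parse import urlencode


def build_query_url(api_url, file_title):
    """Build a MediaWiki API URL asking for a page's templates and categories."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "templates|categories",
        "titles": file_title,
        "tllimit": "max",  # as many templates as one request allows
        "cllimit": "max",  # as many categories as one request allows
    }
    return api_url + "?" + urlencode(params)


def extract_titles(response):
    """Collect template and category titles from a decoded API response."""
    templates, categories = set(), set()
    for page in response.get("query", {}).get("pages", []):
        templates.update(t["title"] for t in page.get("templates", []))
        categories.update(c["title"] for c in page.get("categories", []))
    return templates, categories


def find_forbidden(found, forbidden):
    """Return the forbidden titles that are present on the page, sorted."""
    return sorted(set(found) & set(forbidden))


# Hypothetical source wiki and file title, for illustration only.
url = build_query_url("https://example.org/w/api.php", "File:Example.jpg")

# Hand-written response in the formatversion=2 shape (no network access needed).
sample_response = {
    "query": {
        "pages": [
            {
                "title": "File:Example.jpg",
                "templates": [
                    {"ns": 10, "title": "Template:Information"},
                    {"ns": 10, "title": "Template:Non-free logo"},
                ],
                "categories": [{"ns": 14, "title": "Category:Logos"}],
            }
        ]
    }
}
templates, categories = extract_titles(sample_response)
bad_templates = find_forbidden(templates, ["Template:Non-free logo"])
bad_categories = find_forbidden(categories, ["Category:Fair use images"])
```

Because the API reports every template the page transcludes, a forbidden template pulled in by another template is caught as well, which a regex over the raw wikitext would miss.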

Change 430415 abandoned by Andrew-WMDE:
[Proof of Concept] Investigate the parsing of categories and templates

Reason:
Abandoned in favor of: https://gerrit.wikimedia.org/r/#/c/436291/

https://gerrit.wikimedia.org/r/430415

Vvjjkkii renamed this task from Investigate the parsing of the file info wikitext to forbid files with a wrong licence to xsdaaaaaaa. Jul 1 2018, 1:12 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Andrew-WMDE as the assignee of this task.
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed the point value for this task.
Vvjjkkii edited subscribers, added: Andrew-WMDE; removed: gerritbot, Aklapper.
Bodhisattwa renamed this task from xsdaaaaaaa to Investigate the parsing of the file info wikitext to forbid files with a wrong licence. Jul 1 2018, 1:40 PM
Bodhisattwa closed this task as Resolved.
Bodhisattwa assigned this task to Andrew-WMDE.
Bodhisattwa lowered the priority of this task from High to Medium.
Bodhisattwa updated the task description.
Bodhisattwa added subscribers: gerritbot, Aklapper.