Maniphest T193622

Investigate the parsing of the file info wikitext to forbid files with a wrong licence
Closed, Resolved (Public)

Description

Task
Investigate how to parse the wikitext of the file description page. We want to find out whether any of the forbidden categories or templates exists there, so that we can disallow the move.
We are only interested in the most current version of the file page.

Acceptance Criteria

There is a clear idea of how the parsing of the file page should be done and what the costs are.

Notes

Possible approaches: regex, or an API call to the source wiki to get the metadata from there (e.g. the templates used, which would also include indirect transclusions).
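To illustrate the regex option from the notes, here is a minimal Python sketch (not the FileImporter code). The patterns and the function names find_direct_templates and find_direct_categories are assumptions made up for this example. It only sees what is written literally in the wikitext, so it cannot detect indirect transclusions, and it assumes the English "Category" namespace name.

```python
import re

# Hypothetical patterns for illustration only; real wikitext has many more edge cases.
# Name part of {{Template}} or {{Template|...}}; parser functions such as {{#if:...}} are skipped.
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}#]+?)\s*(?:\||\}\})")
# Name part of [[Category:Name]] or [[Category:Name|sort key]] (English namespace name assumed).
CATEGORY_RE = re.compile(r"\[\[\s*Category\s*:\s*([^|\]]+?)\s*(?:\||\]\])", re.IGNORECASE)


def find_direct_templates(wikitext):
    """Return the names of templates written directly in the wikitext."""
    return {m.group(1).strip() for m in TEMPLATE_RE.finditer(wikitext)}


def find_direct_categories(wikitext):
    """Return the names of categories written directly in the wikitext."""
    return {m.group(1).strip() for m in CATEGORY_RE.finditer(wikitext)}


if __name__ == "__main__":
    sample = "{{Information|description=A photo}}\n{{Non-free logo}}\n[[Category:Logos|L]]"
    print(sorted(find_direct_templates(sample)))
    print(sorted(find_direct_categories(sample)))
```

A category that is added by a template, rather than written on the page itself, would be missed by this approach.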

Details

	Subject	Repo	Branch	Lines +/-
	[Proof of Concept] Investigate the parsing of categories and templates	mediawiki/extensions/FileImporter	master	+124 -0

Related Objects

Status	Assigned	Task
Resolved	None	T140462 Correctly move files from Wikipedia to Commons (2013)
Resolved	thiemowmde	T146666 Build a commons extension that moves files to Commons
Resolved	thiemowmde	T193614 Reuse Commonhelper2's config files for category and template matching
Resolved	thiemowmde	T146751 Don't allow the moving of files with a wrong licence
Resolved	Andrew-WMDE	T193622 Investigate the parsing of the file info wikitext to forbid files with a wrong licence

Event Timeline

Lea_WMDE triaged this task as Medium priority. May 2 2018, 11:46 AM

Lea_WMDE created this task.

Lea_WMDE moved this task from Backlog to Tickets ready for pickup on the Move-Files-To-Commons board.

Tobi_WMDE_SW added a project: WMDE-QWERTY-Sprint-2018-05-02. May 2 2018, 2:47 PM

Tobi_WMDE_SW moved this task from Tickets ready for pickup to Tickets in sprint on the Move-Files-To-Commons board.

Andrew-WMDE claimed this task. May 2 2018, 3:00 PM

Tobi_WMDE_SW moved this task from Sprint Backlog to Doing on the WMDE-QWERTY-Sprint-2018-05-02 board. May 2 2018, 3:30 PM

Change 430415 had a related patch set uploaded (by Andrew-WMDE; owner: Andrew-WMDE):
[mediawiki/extensions/FileImporter@master] [Proof of Concept] Investigate the parsing of categories and templates

https://gerrit.wikimedia.org/r/430415

gerritbot added a project: Patch-For-Review. May 2 2018, 5:03 PM

Andrew-WMDE mentioned this in rEFLI8282639a0b44: [Proof of Concept] Investigate the parsing of categories and templates. May 2 2018, 5:04 PM

Andrew-WMDE moved this task from Doing to Review on the WMDE-QWERTY-Sprint-2018-05-02 board. May 2 2018, 5:04 PM

The above patch is a PoC showing that we can get the lists of templates and categories from the API of the source wiki. That is also how the CommonsHelper2 tool gets them. Since this is pretty cheap and gives us clear and clean results, we should definitely go this way.
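For readers who want to see what such an API lookup could look like, here is a minimal Python sketch. It is not the code from the patch above. It builds a MediaWiki Action API query using prop=templates|categories and checks a response against forbidden lists. The function names, the example API URL, the forbidden names and the hand-written sample response are assumptions for illustration. API continuation (the "continue" key in long results) is not handled here.

```python
from urllib.parse import urlencode


def build_query_url(api_url, file_title):
    """Build an Action API URL asking for a page's templates and categories."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",  # with version 2, "pages" is a list
        "prop": "templates|categories",
        "titles": file_title,
        "tllimit": "max",
        "cllimit": "max",
    }
    return api_url + "?" + urlencode(params)


def extract_names(response):
    """Collect template and category titles from a decoded JSON response."""
    templates, categories = set(), set()
    for page in response.get("query", {}).get("pages", []):
        templates.update(t["title"] for t in page.get("templates", []))
        categories.update(c["title"] for c in page.get("categories", []))
    return templates, categories


def find_forbidden(response, forbidden_templates, forbidden_categories):
    """Return every forbidden template or category present on the page."""
    templates, categories = extract_names(response)
    return (templates & set(forbidden_templates)) | (categories & set(forbidden_categories))


if __name__ == "__main__":
    # Hand-written sample in the shape of a formatversion=2 response (illustrative only).
    sample = {"query": {"pages": [{
        "title": "File:Example.png",
        "templates": [{"ns": 10, "title": "Template:Non-free logo"}],
        "categories": [{"ns": 14, "title": "Category:Logos"}],
    }]}}
    print(build_query_url("https://en.wikipedia.org/w/api.php", "File:Example.png"))
    print(find_forbidden(sample, ["Template:Non-free logo"], ["Category:Fair use"]))
```

If the returned set is non-empty, the move would be disallowed. Because the API reports all templates used on the page, indirect transclusions are covered as well.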

Lea_WMDE mentioned this in T194132: Parse the file info wikitext to forbid files with a wrong licence. May 8 2018, 8:56 AM

WMDE-Fisch moved this task from Review to Done on the WMDE-QWERTY-Sprint-2018-05-02 board. May 9 2018, 10:11 AM

thiemowmde moved this task from Incoming to Move files to Commons on the TCB-Team (now WMDE-TechWish) board. May 11 2018, 5:07 PM

Lea_WMDE closed this task as Resolved. May 16 2018, 10:30 AM

Change 430415 abandoned by Andrew-WMDE:
[Proof of Concept] Investigate the parsing of categories and templates

Reason:
Abandoned in favor of: https://gerrit.wikimedia.org/r/#/c/436291/

https://gerrit.wikimedia.org/r/430415

• Vvjjkkii renamed this task from Investigate the parsing of the file info wikitext to forbid files with a wrong licence to xsdaaaaaaa. Jul 1 2018, 1:12 AM

• Vvjjkkii reopened this task as Open.

• Vvjjkkii removed Andrew-WMDE as the assignee of this task.

• Vvjjkkii raised the priority of this task from Medium to High.

• Vvjjkkii added projects: CheckUser, Connected-Open-Heritage-Batch-uploads (RAÄ-KMB_1_2017-02), Tamil-Sites, Gamepress, Hashtags, Jade, KartoEditor, Language-2018-Apr-June, New-Editor-Experiences, Mail.

• Vvjjkkii updated the task description.

• Vvjjkkii removed the point value for this task.

• Vvjjkkii edited subscribers, added: Andrew-WMDE; removed: gerritbot, Aklapper.

Bodhisattwa renamed this task from xsdaaaaaaa to Investigate the parsing of the file info wikitext to forbid files with a wrong licence. Jul 1 2018, 1:40 PM

Bodhisattwa closed this task as Resolved.

Bodhisattwa assigned this task to Andrew-WMDE.

Bodhisattwa lowered the priority of this task from High to Medium.

Bodhisattwa removed projects: Mail, New-Editor-Experiences, Language-2018-Apr-June, KartoEditor, Jade, Hashtags, Gamepress, Tamil-Sites, Connected-Open-Heritage-Batch-uploads (RAÄ-KMB_1_2017-02), CheckUser.

Bodhisattwa updated the task description.

Bodhisattwa added subscribers: gerritbot, Aklapper.

thiemowmde removed a project: TCB-Team (now WMDE-TechWish). Jan 7 2022, 2:59 PM
