
commonscat failing to read deletion logs at commons
Open, Normal, Public

Description

The commonscat script raises InvalidTitle errors when it tries to read deletion logs on Commons.

E.g. {{Commonscat|New York City neighborhoods}}, which links to https://commons.wikimedia.org/wiki/Category:New_York_City_neighborhoods, throws the error:

ERROR: InvalidTitle: u'Neighborhoods in New York City]].Reason: Naming convention to match the English Wikipedia [[User:Mackensen' contains illegal char(s) u']'

I suspect the regex could use some reworking, but I wasn't able to find a solution myself after some trial and error.

Event Timeline

Avicennasis raised the priority of this task from to Normal.
Avicennasis updated the task description.
Avicennasis added a project: Pywikibot.
Avicennasis added a subscriber: Avicennasis.

[\w\W]*\[\[\:?Category:(?P<newcat1>[^(\]\])\|\}]+)(\|[^\}]+)?\]\]|

might do it (per regex101.com; I haven't had a chance to test it in the script yet). It matches the summary for https://commons.wikimedia.org/wiki/Category:New_York_City_neighborhoods, which threw this error, and still matches patterns that didn't cause any snags, like the one at https://commons.wikimedia.org/wiki/Category:Victory_Monuments
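
A quick way to check the pattern above outside the script is a small Python sketch like the one below. It is only an illustration under stated assumptions: the log summary is hypothetical and reconstructed from the error message, and the CURRENT pattern is a guess at the shape of the regex that produced that error, not a copy of the script's code. The helper name extract_category is made up for this example.

```python
import re

# Pattern quoted in the comment above, without the trailing "|" that
# presumably joins it to the next alternative in the script.
PROPOSED = re.compile(
    r'[\w\W]*\[\[\:?Category:(?P<newcat1>[^(\]\])\|\}]+)(\|[^\}]+)?\]\]'
)

# ASSUMPTION: approximate shape of the pattern that produced the error.
# Its character class does not exclude "]", so the capture can run past
# the closing "]]" of the category link.
CURRENT = re.compile(
    r'\[\[\:?Category:(?P<newcat1>[^\|\}]+)(\|[^\}]+)?\]\]'
)

# ASSUMPTION: hypothetical deletion log summary, reconstructed from the
# InvalidTitle error message; the real log text is not quoted in the task.
SUMMARY = (
    'moved to [[:Category:Neighborhoods in New York City]].'
    'Reason: Naming convention to match the English Wikipedia '
    '[[User:Mackensen|Mackensen]]'
)


def extract_category(pattern, summary):
    """Return the captured target category name, or None if no match."""
    match = pattern.search(summary)
    return match.group('newcat1') if match else None


if __name__ == '__main__':
    # The assumed current pattern captures text beyond the category link.
    print(repr(extract_category(CURRENT, SUMMARY)))
    # The proposed pattern stops at the first "]".
    print(repr(extract_category(PROPOSED, SUMMARY)))
    # A piped link (hypothetical summary) is still handled.
    print(repr(extract_category(PROPOSED, 'moved to [[:Category:Foo|Foo]]')))
```

One thing to watch: inside a character class, "(" and ")" are literal characters rather than grouping, so [^(\]\])\|\}] also excludes parentheses. With the proposed pattern a target such as [[:Category:Foo (bar)]] would not match at all. If that matters, a class like [^\]\|\}] might be worth testing instead.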