
Implement a standard page title normalization algorithm (same as mediawiki)
Closed, Duplicate (Public)

Description

We will keep having problems querying and joining our different data sources unless we behave more like MediaWiki. Normalizing page titles the same way MediaWiki does seems like a good first step.
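
To make the idea concrete, here is a minimal sketch of what such normalization might look like. It assumes only a small subset of MediaWiki's rules: spaces and underscores are treated as equivalent, surrounding separators are trimmed, and the first character is uppercased. `normalize_title` is a hypothetical helper, not part of MediaWiki or the Pageview API, and real MediaWiki normalization also deals with namespaces, per-wiki capitalization settings, and Unicode normalization.

```python
import re


def normalize_title(title: str) -> str:
    """Hypothetical helper: approximates a subset of MediaWiki title normalization."""
    # Collapse any run of spaces and/or underscores into a single underscore.
    collapsed = re.sub(r"[ _]+", "_", title)
    # Drop leading and trailing separators.
    trimmed = collapsed.strip("_")
    # Uppercase only the first character (assumes first-letter capitalization is enabled).
    return trimmed[:1].upper() + trimmed[1:]
```

With this sketch, `"  albert  einstein "` and `"Albert_einstein"` map to the same key, which is the property needed when joining data sources on page title.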

Previously reported as:

As part of this issue, someone pointed out that 'Speciaal:MyPage/zeusmodepreferences.js' shows up in the top articles endpoint (weird) and that retrieving it from the API fails because of the slash in the name (bad): https://github.com/mediawiki-utilities/python-mwviews/issues/3

We can fix the slash problem by requiring article titles to be URL-encoded and decoding them when we look them up, but I thought this was already happening.
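
As a rough illustration of the encode-then-decode approach described above, the sketch below uses Python's standard `urllib.parse`. The helper names `encode_title` and `decode_title` are hypothetical and are not taken from the Pageview API or python-mwviews; the key assumption is that the client percent-encodes the slash so the title occupies a single path segment.

```python
from urllib.parse import quote, unquote


def encode_title(title: str) -> str:
    # safe="" makes quote() percent-encode "/" as well, so the title
    # cannot be mistaken for several URL path segments.
    return quote(title, safe="")


def decode_title(encoded: str) -> str:
    # Reverse the percent-encoding before looking the title up.
    return unquote(encoded)


title = "Speciaal:MyPage/zeusmodepreferences.js"
encoded = encode_title(title)
print(encoded)  # Speciaal%3AMyPage%2Fzeusmodepreferences.js
```

Decoding the encoded form returns the original title, so the lookup side can work with the same string that appears in the data.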

Event Timeline

Milimetric claimed this task.
Milimetric raised the priority of this task from to High.
Milimetric updated the task description.
Milimetric added a project: Analytics-Kanban.
Milimetric subscribed.
Milimetric renamed this task from Pageview API not dealing with url quoting very well to Pageview API not dealing with url quoting very well {melc}. Feb 17 2016, 5:00 PM
Milimetric set Security to None.
Nuria renamed this task from Pageview API not dealing with url quoting very well {melc} to Pageview API not dealing with url quoting very well {melc} [8 pts]. Feb 17 2016, 6:18 PM

Change 271540 had a related patch set uploaded (by Milimetric):
Fix handling of encoded and spaced article titles

https://gerrit.wikimedia.org/r/271540

Change 271540 merged by Milimetric:
Fix handling of encoded and spaced article titles

https://gerrit.wikimedia.org/r/271540

Milimetric renamed this task from Pageview API not dealing with url quoting very well {melc} [8 pts] to Pageview API not dealing with url quoting very well {melc}. Feb 22 2016, 8:58 PM
Milimetric set the point value for this task to 8.
Milimetric renamed this task from Pageview API not dealing with url quoting very well {melc} to Implement a standard page title normalization algorithm (same as mediawiki). Jun 6 2016, 4:48 PM
Milimetric removed Milimetric as the assignee of this task.
Milimetric edited projects, added: Analytics; removed: Analytics-Kanban.
Milimetric updated the task description.
Milimetric removed the point value 8 for this task.
Milimetric lowered the priority of this task from High to Medium. Jul 7 2016, 5:44 PM
Milimetric moved this task from Incoming to Backlog (Later) on the Analytics board.
Milimetric moved this task from Backlog (Later) to Event Platform on the Analytics board.