
Invalid page titles are appearing in the top_articles data
Closed, Resolved · Public · 5 Story Points

Description

"-" can't possibly be a valid page - in fact, it's sort of the ABSENCE of a valid page - and yet it appears in our top articles counts for (at least) enwiki, on mobile and overall, on 2 October.

Presumably this is an implementation problem somewhere (possibly in the pageview definition?)

Event Timeline

Ironholds created this task. Nov 1 2015, 3:12 AM
Ironholds raised the priority of this task to Needs Triage.
Ironholds updated the task description.
Ironholds added a project: Analytics-Backlog.
Ironholds added a subscriber: Ironholds.
Restricted Application added a subscriber: Aklapper. Nov 1 2015, 3:12 AM
Restricted Application added a subscriber: StudiesWorld. Jan 12 2016, 7:44 PM
Milimetric triaged this task as Normal priority. Feb 18 2016, 6:30 PM
Milimetric moved this task from Incoming to Analytics Query Service on the Analytics board.

We'll remove "-", as a special case, from the data that's computed for the top endpoint. That means we'll also be removing the probably minuscule number of pageviews to the actual redirect page titled "-". But there really is no other way.

Huh. The pageID approach won't work? :(

We'd love to get pageId into each of our pageview requests, but that's not happening any time soon. When we have that, yes, we can add poor old en.wikipedia.org/wiki/- back to the pageview stats :)
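As a rough illustration of the special-casing discussed above, here is a minimal Python sketch. It is not the code from the actual patch; the function name `top_articles`, the `EXCLUDED_TITLES` set, and the output field names are all assumptions made for this example. It drops the "-" title before ranking, so the remaining articles get contiguous ranks.

```python
from typing import Dict, List

# Hypothetical exclusion set; "-" is the only title named in this task.
EXCLUDED_TITLES = {"-"}


def top_articles(view_counts: Dict[str, int], limit: int = 1000) -> List[dict]:
    """Rank titles by view count, skipping excluded titles.

    Ranks are assigned after filtering, so removing "-" leaves no gap.
    """
    kept = [
        (title, views)
        for title, views in view_counts.items()
        if title not in EXCLUDED_TITLES
    ]
    # Sort by views descending, then by title so ties are deterministic.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [
        {"article": title, "views": views, "rank": rank}
        for rank, (title, views) in enumerate(kept[:limit], start=1)
    ]
```

Because the filter matches on title alone, views of the real redirect page titled "-" are dropped along with the spurious ones, which is the trade-off accepted above until a page ID is available per request.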

JAllemandou edited projects, added Analytics-Kanban; removed Analytics.
JAllemandou set the point value for this task to 5.

Change 275899 had a related patch set uploaded (by Joal):
Remove "-" page from top pageview in API.

https://gerrit.wikimedia.org/r/275899

Change 275899 merged by Nuria:
Remove "-" page from top pageview in API.

https://gerrit.wikimedia.org/r/275899

JAllemandou moved this task from Next Up to Done on the Analytics-Kanban board. Mar 9 2016, 1:11 PM
Nuria closed this task as Resolved. Mar 22 2016, 7:15 PM