VirtualPageViews should send titles with spaces substituted with underscores
Closed, Resolved · Public · 2 Story Points

Description

Currently, VirtualPageView events send the title value from the REST response (https://en.wikipedia.org/api/rest_v1/page/summary/San_Francisco) and the value of mw.config.get('wgTitle') for the page being viewed. Both use spaces rather than underscores, which makes it difficult to map the title to the canonical URL.

The proposal is to switch from wgTitle to wgRelevantPageName and use titles.canonical from the REST response.
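As a rough sketch of the proposal (not the actual Popups code): the snippet below builds the two title fields from wgRelevantPageName and titles.canonical. The `config` map is a stand-in for mw.config so the example runs outside MediaWiki, the `summary` object is a trimmed, illustrative response shape, and `buildVirtualPageViewTitles` is a hypothetical helper. It also assumes source_title is the page being viewed and page_title is the previewed page.

```javascript
// Stand-in for mw.config so the sketch runs outside MediaWiki (assumption).
const config = new Map([
  ['wgTitle', 'Golden Gate Bridge'],
  ['wgRelevantPageName', 'Golden_Gate_Bridge'],
]);

// Trimmed, illustrative shape of a REST summary response for the previewed page.
const summary = {
  title: 'San Francisco',
  titles: { canonical: 'San_Francisco' },
};

// Hypothetical helper: picks the underscore (URL) form for both titles.
function buildVirtualPageViewTitles(config, summary) {
  return {
    // Page the reader is on: wgRelevantPageName instead of wgTitle.
    source_title: config.get('wgRelevantPageName'),
    // Page being previewed: titles.canonical instead of title.
    page_title: summary.titles.canonical,
  };
}

const eventTitles = buildVirtualPageViewTitles(config, summary);
console.log(eventTitles);
```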

Developer notes

  • Essentially we want the title with underscores, which is the canonical name, i.e. the name used in the URL, e.g. "San_Francisco", not "San Francisco"
  • Page views operate on URLs; page previews don't
  • The Analytics team is trying to make page preview virtual page views correspond with their page view equivalents
  • We only need to make it work with mw. I have no idea what happens if you enable VirtualPageViews and mwApi
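
To illustrate why the space form is awkward for analytics, here is a simplified sketch of mapping a title to a URL. `wikiUrl` and `toUnderscoreForm` are hypothetical helpers; real MediaWiki title normalization and URL encoding do more than this (namespaces, capitalization, special characters), so treat it only as an illustration of the underscore/space mismatch.

```javascript
// Hypothetical helper: naive title-to-URL mapping, for illustration only.
function wikiUrl(title) {
  return 'https://en.wikipedia.org/wiki/' + encodeURIComponent(title);
}

// Hypothetical helper: replaces spaces with underscores, nothing else.
function toUnderscoreForm(title) {
  return title.replace(/ /g, '_');
}

// The canonical form matches the URL used by a real page view.
console.log(wikiUrl('San_Francisco'));
// The display form does not: the space gets percent-encoded.
console.log(wikiUrl('San Francisco'));
// Normalizing first brings the two back in line.
console.log(wikiUrl(toUnderscoreForm('San Francisco')));
```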

Bonus points

We discussed that we should probably only allow VirtualPageViews if the gateway is rest. If you can do this as part of that change, that would be splendid; otherwise, on sign-off we should create a task to make that happen.
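
One possible shape for that guard, purely as a sketch: the `settings` object, its keys, and the gateway values 'rest' and 'mwApi' are placeholders chosen for illustration, not the extension's real configuration names.

```javascript
// Hypothetical guard: only send virtual page views when the feature is
// enabled AND the gateway is the REST one. Keys and values are assumptions.
function shouldSendVirtualPageViews(settings) {
  return settings.virtualPageViews === true && settings.gateway === 'rest';
}

console.log(shouldSendVirtualPageViews({ virtualPageViews: true, gateway: 'rest' }));
console.log(shouldSendVirtualPageViews({ virtualPageViews: true, gateway: 'mwApi' }));
```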

Event Timeline

Jdlrobson created this task. Apr 4 2018, 8:52 PM
Restricted Application added a subscriber: Aklapper. Apr 4 2018, 8:52 PM
Jdlrobson added a subscriber: ovasileva.

@ovasileva spun this out of T186728
I'm not sure whether this is a deployment blocker.

ovasileva triaged this task as High priority. Apr 5 2018, 10:22 AM

@Jdlrobson - it seems like it shouldn't be a direct blocker, but let's still try to get this done asap.

Jdlrobson updated the task description. Apr 5 2018, 3:56 PM
Jdlrobson updated the task description.
Jdlrobson updated the task description. Apr 5 2018, 4:01 PM
Jdlrobson set the point value for this task to 2.

We did a world-first Slack estimation (all of us minus @pmiazga) and landed on a 2. We believe it's a well-defined, clear objective, but there's risk in touching the gateway code.

pmiazga claimed this task. Apr 9 2018, 5:08 PM
pmiazga moved this task from To Do to Doing on the Readers-Web-Kanbanana-Board-Old board.

Change 425282 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/Popups@master] Page_title and source_title should be in canonical form

https://gerrit.wikimedia.org/r/425282

Change 425282 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Page_title and source_title should be in canonical form

https://gerrit.wikimedia.org/r/425282

pmiazga removed pmiazga as the assignee of this task. Apr 16 2018, 6:39 PM
pmiazga removed a project: Patch-For-Review.
pmiazga added a subscriber: Jdrewniak.

This task requires technical QA and sign-off. @Jdrewniak, could you do it?

I've taken a look at the patch and I can confirm that the events are being sent with both the page_title and source_title in their "canonical" form.

LGTM.

@Tbayer can we sign off this task or do you want to verify it on Beta Cluster?

pmiazga removed Jdrewniak as the assignee of this task. Apr 16 2018, 8:05 PM

@Tbayer can we sign off this task or do you want to verify it on Beta Cluster?

I think it's fine if you sign off on it, considering that the task requirements should be straightforward.
(For context, the overarching goal here is to record page names in a format compatible with the existing pageview data in pageview_hourly, and it was stated in the discussion at T186728#4106322 ff. that using titles.canonical should be sufficient to achieve that.)

Jdrewniak closed this task as Resolved. Apr 18 2018, 5:06 PM