VirtualPageViews should send titles with spaces substituted with underscores
Closed, Resolved | Public | 2 Estimated Story Points

Description

Currently, VirtualPageView events send the title value from the REST response (https://en.wikipedia.org/api/rest_v1/page/summary/San_Francisco) and the value of mw.config.get('wgTitle') from the page being viewed. Both use spaces rather than underscores, which makes it difficult to map a title to its canonical URL.

The proposal is to switch from wgTitle to wgRelevantPageName and use titles.canonical from the REST response.
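A minimal sketch of the proposed switch, under stated assumptions: the field names `page_title` and `source_title` are borrowed from the patch title later in this task, and which field describes which page is an assumption here. `buildTitles`, the stub `summary`, and the stub `config` are hypothetical; in the browser, `mw.config` would take the place of `config`.

```javascript
// Hypothetical helper: picks the underscore ("canonical") form of both titles.
// `summary` is the parsed JSON of a REST page summary response;
// `config` is anything with a get(key) method, such as mw.config.
function buildTitles(summary, config) {
  return {
    // Assumption: page_title describes the previewed page.
    page_title: summary.titles.canonical,
    // Assumption: source_title describes the page the reader is on.
    source_title: config.get('wgRelevantPageName')
  };
}

// Stub data for illustration only.
const summary = {
  title: 'San Francisco',
  titles: { canonical: 'San_Francisco' }
};
const values = {
  wgTitle: 'Golden Gate Bridge',
  wgRelevantPageName: 'Golden_Gate_Bridge'
};
const config = { get: (key) => values[key] };

const titles = buildTitles(summary, config);
console.log(titles);
```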

Developer notes

  • Essentially we want the title with underscores, which is the canonical name, i.e. the name used in the URL: "San_Francisco", not "San Francisco".
  • Page views operate on URLs; page previews do not.
  • The Analytics team is trying to make page preview virtual page views correspond with their page view equivalents.
  • We only need to make it work with mw. I have no idea what happens if you enable both VirtualPageViews and mwApi.
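To illustrate why the underscore form matters, here is a rough sketch that recovers a title from a page view URL and compares it with both spellings. `titleFromUrl` is a hypothetical helper, and it assumes articles are served under the `/wiki/` path.

```javascript
// Hypothetical helper: extracts the title segment from a /wiki/ URL.
// Assumes the wiki serves articles under the "/wiki/" path.
function titleFromUrl(url) {
  const path = new URL(url).pathname;
  return decodeURIComponent(path.replace(/^\/wiki\//, ''));
}

const fromPageView = titleFromUrl('https://en.wikipedia.org/wiki/San_Francisco');

// The canonical (underscore) form matches the URL-derived title.
console.log(fromPageView === 'San_Francisco');
// The space-separated form does not.
console.log(fromPageView === 'San Francisco');
```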

Bonus points

We discussed that we should probably only allow VirtualPageViews if the gateway is rest. If you can do this as part of that change that would be splendid, otherwise on sign off we should create a task to make that happen.
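The bonus item could be sketched as a simple guard. The option names `virtualPageViews` and `gateway`, and the value `'rest'`, are placeholders for illustration and are not the extension's confirmed configuration keys.

```javascript
// Hypothetical guard: only emit virtual page views for a REST gateway.
// The option names and gateway values below are placeholders, not the
// extension's confirmed configuration keys.
function canSendVirtualPageViews(options) {
  return options.virtualPageViews === true && options.gateway === 'rest';
}

console.log(canSendVirtualPageViews({ virtualPageViews: true, gateway: 'rest' }));
console.log(canSendVirtualPageViews({ virtualPageViews: true, gateway: 'mwApi' }));
```

With a guard of this shape, enabling VirtualPageViews alongside a non-REST gateway would send nothing rather than sending titles in an unknown format.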

Event Timeline

Jdlrobson added a subscriber: ovasileva.

@ovasileva spun this out of T186728
I'm not sure whether this is a deployment blocker.

@Jdlrobson - it seems like it shouldn't be a direct blocker, but let's still try to get this done asap.

Jdlrobson updated the task description.
Jdlrobson set the point value for this task to 2.

We did a world-first Slack estimation (all of us minus @pmiazga) and landed on a 2. We believe it's a well-defined, clear objective, but there's risk in touching the gateway code.

Change 425282 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/Popups@master] Page_title and source_title should be in canonical form

https://gerrit.wikimedia.org/r/425282

Change 425282 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Page_title and source_title should be in canonical form

https://gerrit.wikimedia.org/r/425282

pmiazga removed a project: Patch-For-Review.
pmiazga added a subscriber: Jdrewniak.

This task requires technical QA and sign-off. @Jdrewniak, could you do it?

I've taken a look at the patch and I can confirm that the events are being sent with both the page_title and source_title in their "canonical" form.

LGTM.

@Tbayer can we sign off this task or do you want to verify it on Beta Cluster?

I think it's fine if you sign off on it, considering that the task requirements should be straightforward.
(For context, the overarching goal here is to record page names in a format compatible with the existing pageview data in pageview_hourly, and it was stated in the discussion at T186728#4106322 ff. that using titles.canonical should be sufficient to achieve that.)