
Pageviews API doesn't work for titles with % symbol.
Closed, Resolved | Public | 2 Estimated Story Points

Description

After https://gerrit.wikimedia.org/r/#/c/271540/ the pageviews API returns a 500 error for titles that contain a % symbol.

Example: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/100%25_Records/daily/2016021000/2016033000

The problem is that the code does not need to call decodeURIComponent: URI parameters are transparently decoded by HyperSwitch, so the title ends up being decoded twice. To deal with the original problems, T126669 and T127034, a different approach should be taken. Frontend RESTBase now has a way of normalising titles that matches MediaWiki exactly, using the mediawiki-title library. It is not applied to the pageviews API yet, but applying it there is one of the current goals.
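A minimal sketch of the double-decode failure, using only the built-in decodeURIComponent. The variable and function names are hypothetical, and the first decode merely stands in for what HyperSwitch is described as doing. An uncaught URIError of this kind is presumably what surfaces as the 500 response.

```javascript
// Path segment exactly as the client sends it in the example URL.
const rawSegment = '100%25_Records';

// First decode: stands in for the transparent decoding done by the framework.
const decodedOnce = decodeURIComponent(rawSegment);

// Second decode: the redundant call in the handler. The decoded title now
// contains a bare '%' followed by '_R', which is not a valid escape sequence.
function tryDecode(value) {
  try {
    return { ok: true, value: decodeURIComponent(value) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

const secondDecode = tryDecode(decodedOnce);

console.log(decodedOnce);  // 100%_Records
console.log(secondDecode); // { ok: false, error: 'URIError' }
```

Titles without a % symbol survive the second decode unchanged, which would explain why only these titles fail.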

@Pchelolo will create a PR for frontend RESTBase, and somebody from Analytics should revert the problematic patch. After that, all requests coming to AQS will have a normalised version of the title: localised namespace name, article name in MediaWiki DBKEY format. Requests for non-normalised titles are normally 301-redirected so that we don't need to purge caches for all the combinations of non-normalised title forms. However, for pageviews we can serve the content directly and cache it, because pageview data never changes and is never purged, so caches should not be a problem.
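To illustrate what a DBKEY-style title looks like, here is a rough sketch. toDbKeySketch is a hypothetical helper, not the mediawiki-title API, and it covers only two rules (spaces become underscores, the first character is capitalised); real normalisation also handles namespaces and per-wiki settings. The client then percent-encodes the title exactly once when building the URL.

```javascript
// Hypothetical, simplified approximation of DBKEY normalisation.
// This is NOT how the mediawiki-title library is called.
function toDbKeySketch(title) {
  // Collapse runs of spaces/underscores into a single underscore.
  const underscored = title.trim().replace(/[ _]+/g, '_');
  // Capitalise only the first character, leaving the rest untouched.
  return underscored.charAt(0).toUpperCase() + underscored.slice(1);
}

const dbKey = toDbKeySketch('100% records');

// Encode once on the client; the server-side framework decodes once.
const encodedSegment = encodeURIComponent(dbKey);

console.log(dbKey);          // 100%_records
console.log(encodedSegment); // 100%25_records
```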

There's one question we need to check first: which title format is used internally in Cassandra to store the data?

Event Timeline

Change 280934 had a related patch set uploaded (by Milimetric):
Fix double-decode issue in pageview article route

https://gerrit.wikimedia.org/r/280934

Milimetric set the point value for this task to 2. Apr 1 2016, 3:46 PM

Change 280934 merged by Milimetric:
Fix double-decode issue in pageview article route

https://gerrit.wikimedia.org/r/280934