Page MenuHomePhabricator

[Pageviews_top_by_country] requests with projects with special characters returns 404 instead of 400
Open, HighPublic2 Estimated Story PointsBUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Make a request to Pageviews_top_by_country with a project with any special character
  • code below :
import requests


prod_url = 'https://wikimedia.org/api/rest_v1/metrics/pageviews/top-by-country/en.wikipedia.org^/all-access/2020/01'

header = {"accept": "application/json",
          "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"}

response = requests.get(prod_url, headers=header)

print(response.status_code)
print(response.json())

What happens?:
status code 404 is returned

What should have happened instead?:
status code 400 is returned as seen in PROD

Software version (skip for WMF-hosted wikis like Wikipedia):

pageviews_top_by_country endpoint

Event Timeline

EChukwukere-WMF set the point value for this task to 2.
EChukwukere-WMF renamed this task from [Pageviews_top_by_country] requests with projects with special characters returns 404 instead of 400 returns details value including "day" to [Pageviews_top_by_country] requests with projects with special characters returns 404 instead of 400 .Mon, Sep 18, 6:33 AM

This issues is related to a common function that validates if project or referer parameter has some invalid characters according to this algorithm:

if (!normalizedProject.match('^[a-z0-9_\\.\\-]+$')) {
    throwIfNeeded('The parameter `project` contains invalid charaters.');
}

I think all 1.0 services are managing this validation for project and referer parameters. There is a common function to do it in AQS 1.0. Maybe we could create an aqsassist function to validate both parameters and include it for all the services. Just to provide the same behaviour. We have discussed about skipping this validation but now I was wondering if this was thought to avoid, explicitly, using wildcards when specifying projects or referers.

I think we are on time to include it for this service (page-analytics) and the others that have not been deployed yet and we will have a new opportunity for the rest when we complete the aqsassist common code we are working on currently. That moment we will need to push a new version of aqsassist and refactor all services.

@Sfaci @VirginiaPoundstone I glanced over This code is missing from all the microservices - projects validation for some and refererer for some . I would suggest to add this validation inside each service using request middleware and aqassist. I see the validation on device ,page , editors and edits . Page-title validation for edits and editors . Referrer validation for media-analytics

Ok!
So, on this sprint, we will work on three new aqsassist functions to validate/normalize project, referer and page-title and we'll include them to page-analytics, edit and editors via middleware function as Surbhi has proposed. This pending work for device, geo and media services is already included in T342018: compile list of known issues for triage post AQS 2.0 launch so that we can address it in the future.

A MR about this task is ready to be reviewed:
https://gitlab.wikimedia.org/frankie/aqsassist/-/merge_requests/23

  • It contains the three new aqsassist functions to validate/normalize project, referer and page-title we need for this service and the others. When merged, another related MR for page-analytics will be pushed to include their usage and validate/normalize projects using a middleware function.

Change 960593 had a related patch set uploaded (by Santiago Faci; author: Santiago Faci):

[generated-data-platform/aqs/page-analytics@main] Fixed some issues related to invalid characters and wrong messages

https://gerrit.wikimedia.org/r/960593

Sfaci triaged this task as High priority.

Change 960593 merged by Sg912:

[generated-data-platform/aqs/page-analytics@main] Fixed some issues related to invalid characters and wrong messages

https://gerrit.wikimedia.org/r/960593