
Fix rec-api-ng relative paths handling
Closed, Resolved | Public | 1 Estimated Story Points

Description

As reported in T347263#9392641, the rec-api-ng hosted on LiftWing expects query strings to start after a slash in order to return an API response. If they start without the slash, a 404 error is thrown.

In this task we will investigate the cause of this issue and modify rec-api-ng so that it handles relative paths and processes query strings with or without a preceding slash, returning consistent API responses in both cases.

Event Timeline

I was able to reproduce this issue locally.

1. Query strings that start after a slash return an API response:

$ time curl -s "127.0.0.1/api/?s=en&t=fr&n=3&article=Apple"
[{"pageviews": 0, "title": "Splendour_(apple)", "wikidata_id": "Q19840849", "rank": 496.0}, {"pageviews": 0, "title": "Euonymus_hamiltonianus", "wikidata_id": "Q11340926", "rank": 485.0}, {"pageviews": 0, "title": "Prune_dwarf_virus", "wikidata_id": "Q7253036", "rank": 482.0}]

real	0m15.191s
user	0m0.020s
sys	0m0.012s

2. Query strings that don't start after a slash return a redirect page that the client has to follow manually:

$ time curl -s "127.0.0.1/api?s=en&t=fr&n=3&article=Apple"
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="http://127.0.0.1/api/?s=en&amp;t=fr&amp;n=3&amp;article=Apple">http://127.0.0.1/api/?s=en&amp;t=fr&amp;n=3&amp;article=Apple</a>.  If not click the link.
real	0m0.016s
user	0m0.005s
sys	0m0.009s

Following the second example above, I ran the same command with the "-L" flag so that curl follows the redirect automatically, and the API returned a response:

$ time curl -sL "127.0.0.1/api?s=en&t=fr&n=3&article=Apple"
[{"pageviews": 0, "title": "Splendour_(apple)", "wikidata_id": "Q19840849", "rank": 496.0}, {"pageviews": 0, "title": "Euonymus_hamiltonianus", "wikidata_id": "Q11340926", "rank": 485.0}, {"pageviews": 0, "title": "Prune_dwarf_virus", "wikidata_id": "Q7253036", "rank": 482.0}]

real	0m24.214s
user	0m0.019s
sys	0m0.010s

The workaround above only helps people using curl or other tools that can follow redirects automatically, as the "-L" flag does, so it does not give a very good user experience. In order to fix this, I looked into the Flask web framework and found that we can disable strict slashes so that routing works with or without a trailing slash.

I implemented this in recommendation/data/recommendation.wsgi, and the second example from T354601#9445095, which previously required following the redirect manually, worked without any issues:

$ time curl -s "127.0.0.1/api?s=en&t=fr&n=3&article=Apple"
[{"pageviews": 0, "title": "Splendour_(apple)", "wikidata_id": "Q19840849", "rank": 496.0}, {"pageviews": 0, "title": "Euonymus_hamiltonianus", "wikidata_id": "Q11340926", "rank": 485.0}, {"pageviews": 0, "title": "Prune_dwarf_virus", "wikidata_id": "Q7253036", "rank": 482.0}]

real	0m27.139s
user	0m0.012s
sys	0m0.018s

This is a better solution because it enables rec-api-ng to support tools and scenarios that use relative paths.
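For readers unfamiliar with the setting, here is a minimal sketch of how disabling strict slashes changes Flask routing. It is not the patch applied to recommendation/data/recommendation.wsgi, whose contents are not shown in this task. The `create_app` helper, the `recommend` view, and the echo response are hypothetical names chosen for illustration; only `app.url_map.strict_slashes` is the Flask/Werkzeug setting being demonstrated.

```python
from flask import Flask, jsonify, request


def create_app(strict_slashes: bool) -> Flask:
    """Build a toy app (hypothetical) whose only route is registered as '/api/'."""
    app = Flask(__name__)
    # Set the map-wide default before registering routes: each rule picks up
    # this value at the moment it is added to the URL map.
    app.url_map.strict_slashes = strict_slashes

    @app.route("/api/")
    def recommend():
        # Echo the query parameters instead of computing recommendations.
        return jsonify(request.args.to_dict())

    return app


if __name__ == "__main__":
    for strict in (True, False):
        client = create_app(strict).test_client()
        response = client.get("/api?s=en&t=fr&n=3&article=Apple")
        # With strict slashes the request is answered with a redirect to
        # '/api/'; without them the view is called directly.
        print(f"strict_slashes={strict}: HTTP {response.status_code}")
```

With strict slashes enabled, the request without the trailing slash gets a redirect that the client must follow. With them disabled, the same request reaches the view directly. An alternative, if only some endpoints should be lenient, is to pass `strict_slashes=False` to individual `route()` calls.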

Change 988245 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[research/recommendation-api@master] Disable strict slashes to support relative paths

https://gerrit.wikimedia.org/r/988245

Change 988245 merged by jenkins-bot:

[research/recommendation-api@master] Disable strict slashes to support relative paths

https://gerrit.wikimedia.org/r/988245

Change 989167 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[operations/deployment-charts@master] ml-services: update recommendation-api-ng image

https://gerrit.wikimedia.org/r/989167

Change 989167 merged by jenkins-bot:

[operations/deployment-charts@master] ml-services: update recommendation-api-ng image

https://gerrit.wikimedia.org/r/989167

We have deployed this solution in prod. Users can now send query strings to the rec-api-ng hosted on LiftWing and get an API response for both relative and absolute paths, as shown below:

1. Query strings that start after a slash return an API response:

kevinbazira@deploy2002:~$ time curl -s "https://api.wikimedia.org/service/lw/recommendation/v1/api/?s=en&t=fr&n=3&article=Apple"
[{"pageviews": 146, "title": "Hicksbeachia_pinnatifolia", "wikidata_id": "Q5751425", "rank": 499.0}, {"pageviews": 156, "title": "Euonymus_hamiltonianus", "wikidata_id": "Q11340926", "rank": 486.0}, {"pageviews": 15, "title": "Bismarck_(apple)", "wikidata_id": "Q866432", "rank": 484.0}]

real	0m3.988s
user	0m0.013s
sys	0m0.000s

2. Query strings that don't start after a slash now also return an API response:

kevinbazira@deploy2002:~$ time curl -s "https://api.wikimedia.org/service/lw/recommendation/v1/api?s=en&t=fr&n=3&article=Apple"
[{"pageviews": 146, "title": "Hicksbeachia_pinnatifolia", "wikidata_id": "Q5751425", "rank": 499.0}, {"pageviews": 156, "title": "Euonymus_hamiltonianus", "wikidata_id": "Q11340926", "rank": 486.0}, {"pageviews": 15, "title": "Bismarck_(apple)", "wikidata_id": "Q866432", "rank": 484.0}]

real	0m2.313s
user	0m0.008s
sys	0m0.004s
kevinbazira triaged this task as Medium priority.
kevinbazira set the point value for this task to 1.