REST API not returning latest page when queried title is a redirect
Open, Needs Triage, Public

Description

Note the difference between querying the current title
a) https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Thakhek
and querying its former title:
b) https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Tha_Khaek

Using the former title "Tha_Khaek" returns a page that hasn't existed since before December 2022. Is T335770 rearing its head again in a weird way? @akosiaris I added you because you were the superman who resolved that issue.

I have tried null editing the current page to no avail. For what it's worth, https://en.wikivoyage.org/wiki/Tha_Khaek redirects just fine outside of the REST API. I have tried this from New Zealand and San Francisco.

In case it helps, here's the curl output for the old title from New Zealand (with the page body truncated after the title header tag so it doesn't blow up the description here):

curl -v https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Tha_Khaek
*   Trying 198.35.26.96:443...
* Connected to en.wikivoyage.org (198.35.26.96) port 443 (#0)
* ALPN: offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=*.wikipedia.org
*  start date: Aug 22 06:51:12 2023 GMT
*  expire date: Nov 20 06:51:11 2023 GMT
*  subjectAltName: host "en.wikivoyage.org" matched cert's "*.wikivoyage.org"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* using HTTP/2
* h2 [:method: GET]
* h2 [:scheme: https]
* h2 [:authority: en.wikivoyage.org]
* h2 [:path: /api/rest_v1/page/mobile-html/Tha_Khaek]
* h2 [user-agent: curl/8.1.2]
* h2 [accept: */*]
* Using Stream ID: 1 (easy handle 0x129812400)
> GET /api/rest_v1/page/mobile-html/Tha_Khaek HTTP/2
> Host: en.wikivoyage.org
> User-Agent: curl/8.1.2
> Accept: */*
>
< HTTP/2 200
< content-language: en
< content-type: text/html; charset=utf-8; profile="https://www.mediawiki.org/wiki/Specs/Mobile-HTML/1.2.2"
< vary: Accept-Encoding
< cache-control: s-maxage=1209600, max-age=0
< access-control-allow-origin: *
< access-control-allow-methods: GET,HEAD
< access-control-allow-headers: accept, content-type, content-length, cache-control, accept-language, api-user-agent, if-match, if-modified-since, if-none-match, dnt, accept-encoding
< access-control-expose-headers: etag
< x-content-type-options: nosniff
< x-frame-options: SAMEORIGIN
< referrer-policy: origin-when-cross-origin
< x-xss-protection: 1; mode=block
< content-security-policy: default-src 'none'; connect-src app://*.wikipedia.org https://*.wikipedia.org; media-src app://upload.wikimedia.org https://upload.wikimedia.org 'self'; img-src app://*.wikimedia.org https://*.wikimedia.org app://wikimedia.org https://wikimedia.org 'self' data:; object-src 'none'; script-src app://meta.wikimedia.org https://meta.wikimedia.org 'unsafe-inline'; style-src app://meta.wikimedia.org https://meta.wikimedia.org app://*.wikipedia.org https://*.wikipedia.org 'self' 'unsafe-inline'; frame-ancestors 'self'
< x-content-security-policy: default-src 'none'; connect-src app://*.wikipedia.org https://*.wikipedia.org; media-src app://upload.wikimedia.org https://upload.wikimedia.org 'self'; img-src app://*.wikimedia.org https://*.wikimedia.org app://wikimedia.org https://wikimedia.org 'self' data:; object-src 'none'; script-src app://meta.wikimedia.org https://meta.wikimedia.org 'unsafe-inline'; style-src app://meta.wikimedia.org https://meta.wikimedia.org app://*.wikipedia.org https://*.wikipedia.org 'self' 'unsafe-inline'; frame-ancestors 'self'
< x-webkit-csp: default-src 'none'; connect-src app://*.wikipedia.org https://*.wikipedia.org; media-src app://upload.wikimedia.org https://upload.wikimedia.org 'self'; img-src app://*.wikimedia.org https://*.wikimedia.org app://wikimedia.org https://wikimedia.org 'self' data:; object-src 'none'; script-src app://meta.wikimedia.org https://meta.wikimedia.org 'unsafe-inline'; style-src app://meta.wikimedia.org https://meta.wikimedia.org app://*.wikipedia.org https://*.wikipedia.org 'self' 'unsafe-inline'; frame-ancestors 'self'
< content-location: https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Tha_Khaek
< server: restbase2016
< date: Sun, 17 Sep 2023 03:47:57 GMT
< etag: W/"4565862/d7ea7d10-86a8-11ed-887b-b84d258cb51b"
< age: 721
< x-cache: cp4038 miss, cp4038 hit/5
< x-cache-status: hit-front
< server-timing: cache;desc="hit-front", host;desc="cp4038"
< strict-transport-security: max-age=106384710; includeSubDomains; preload
< report-to: { "group": "wm_nel", "max_age": 604800, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
< nel: { "report_to": "wm_nel", "max_age": 604800, "failure_fraction": 0.05, "success_fraction": 0.0}
< set-cookie: WMF-Last-Access=17-Sep-2023;Path=/;HttpOnly;secure;Expires=Thu, 19 Oct 2023 00:00:00 GMT
< set-cookie: WMF-Last-Access-Global=17-Sep-2023;Path=/;Domain=.wikivoyage.org;HttpOnly;secure;Expires=Thu, 19 Oct 2023 00:00:00 GMT
< x-client-ip: 103.131.54.20
< set-cookie: GeoIP=NZ:AUK:Auckland:-36.85:174.77:v4; Path=/; secure; Domain=.wikivoyage.org
< set-cookie: NetworkProbeLimit=0.001;Path=/;Secure;Max-Age=3600
< accept-ranges: bytes
< content-length: 54873
<
<!DOCTYPE html><html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/" about="https://en.wikivoyage.org/wiki/Special:Redirect/revision/4565862" class="no-editing"><head prefix="mwr: https://en.wikivoyage.org/wiki/Special:Redirect/"><meta property="mw:TimeUuid" content="016f4ad0-69a3-11ed-b966-8b9e8477d5d5"><meta charset="utf-8"><meta property="mw:pageId" content="35699"><meta property="mw:pageNamespace" content="0"><meta property="mw:revisionSHA1" content="f92a1829ab870941fcecfc45cd274b71847add0e"><meta property="dc:modified" content="2022-11-21T13:47:12.000Z"><meta property="mw:htmlVersion" content="2.6.0"><meta property="mw:html:version" content="2.6.0"><link rel="dc:isVersionOf" href="//en.wikivoyage.org/wiki/Tha_Khaek"><base href="//en.wikivoyage.org/wiki/"><title>Tha Khaek</title><meta property="mw:jsConfigVars" content="{&quot;wgKartographerLiveData&quot;:{&quot;mask&quot;:[],&quot;around&quot;:[],&quot;buy&quot;:[],&quot;city&quot;:[],&quot;do&quot;:[],&quot;drink&quot;:[],&quot;eat&quot;:[],&quot;go&quot;:[],&quot;listing&quot;:[],&quot;other&quot;:[],&quot;see&quot;:[],&quot;vicinity&quot;:[],&quot;view&quot;:[]}}"><meta property="mw:generalModules" content="ext.kartographer.link|ext.kartographer.frame"><meta property="mw:moduleStyles" content="ext.kartographer.style"><meta http-equiv="content-language" content="en"><meta http-equiv="vary" content="Accept"><link rel="stylesheet" href="//meta.wikimedia.org/api/rest_v1/data/css/mobile/base"><link rel="stylesheet" href="//en.wikivoyage.org/api/rest_v1/data/css/mobile/site"><link rel="stylesheet" href="//meta.wikimedia.org/api/rest_v1/data/css/mobile/pcs"><meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1, shrink-to-fit=no"><script src="//meta.wikimedia.org/api/rest_v1/data/javascript/mobile/pcs"></script><link rel="icon" href="data:,"><meta property="pcs:locale" content="en"><meta property="mw:leadImage" 
content="https://upload.wikimedia.org/wikipedia/commons/3/34/Buddha_cave%2C_Laos.jpg" data-file-width="1500" data-file-height="1000"></head><body lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output content skin-minerva" dir="ltr"><div id="pcs" class="mw-parser-output"><script>pcs.c1.Page.onBodyStart();</script><header><div class="pcs-edit-section-header v2"><div class="pcs-header-inner-left"><h1 data-id="0" class="pcs-edit-section-title"><span class="mw-page-title-main">Tha Khaek</span></h1>

...

* Connection #0 to host en.wikivoyage.org left intact
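
One detail in the response above may help date the stale copy. The sketch below assumes the ETag has the form W/"<revision>/<time UUID>" and that the second part is a version 1 (time-based) UUID. That is an assumption based on the shape of the value, not something confirmed in this task. The helper names `split_etag` and `uuid1_datetime` are made up for this sketch.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns ticks.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def split_etag(etag):
    """Split an ETag like W/"<revision>/<tid>" into (revision, tid)."""
    value = etag.strip()
    if value.startswith("W/"):
        value = value[2:]
    revision, _, tid = value.strip('"').partition("/")
    return int(revision), tid


def uuid1_datetime(tid):
    """Return the UTC timestamp embedded in a version 1 UUID."""
    parsed = uuid.UUID(tid)
    if parsed.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    ticks = parsed.time - UUID_EPOCH_OFFSET  # 100 ns ticks since the Unix epoch
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ticks // 10)


# ETag copied from the response headers above.
revision, tid = split_etag('W/"4565862/d7ea7d10-86a8-11ed-887b-b84d258cb51b"')
print(revision, uuid1_datetime(tid).isoformat())
```

The revision part matches the revision ID in the `about` attribute of the returned HTML (4565862). Under the assumption above, the UUID part decodes to 28 December 2022.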

Event Timeline

Here are some more examples:

Elk Island National Park – this article no longer exists and should redirect to Beaver Hills, but the REST API still returns the old article:
https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Elk_Island_National_Park

Vinnytsia – the title spelling has changed, but when you use the old title you get an old version of the page:
a) https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Vinnytsya
b) https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Vinnytsia

Gümüşlük – ditto
a) https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Gumusluk
b) https://en.wikivoyage.org/api/rest_v1/page/mobile-html/Gümüşlük

There are many more examples, which I can provide if they'd be helpful.

I'll admit I am a bit stumped here. This is clearly not the CDN's fault, as RESTBase itself exhibits the same behavior when queried directly, and that behavior also contradicts its own API documentation.

deploy1002:~$ curl -I http://restbase.discovery.wmnet:7233/en.wikivoyage.org/v1/page/mobile-html/Thakhek 
HTTP/1.1 200

vs

deploy1002:~$ curl -I http://restbase.discovery.wmnet:7233/en.wikivoyage.org/v1/page/mobile-html/Tha_Khaek 
HTTP/1.1 200

Both return 200, with different content-lengths and etags, but the RESTBase docs say:

Requests for redirect pages return HTTP 302 with a redirect target in Location header and content in the body. To get a 200 response instead, supply false to the redirect parameter.
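
The documented behavior quoted above can be checked without following redirects. This is a minimal sketch, assuming the public endpoint from the task description and Python's standard `http.client`, which does not follow redirects on its own. `probe` and `matches_documented_redirect` are hypothetical helper names, and the User-Agent string is a placeholder.

```python
import http.client
from urllib.parse import quote


def matches_documented_redirect(status, location):
    """True if a response looks like the documented redirect: 302 plus a Location header."""
    return status == 302 and bool(location)


def probe(host, title):
    """Request mobile-html for a title without following redirects.

    Returns (status, Location header, ETag header).
    """
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        # Percent-encode the title so non-ASCII titles and slashes are sent safely.
        path = "/api/rest_v1/page/mobile-html/" + quote(title, safe="")
        conn.request("GET", path, headers={"User-Agent": "redirect-probe-sketch/0.1 (placeholder)"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("location"), resp.getheader("etag")
    finally:
        conn.close()


# Example use (performs network requests, so it is left commented out):
# for title in ("Thakhek", "Tha_Khaek"):
#     status, location, etag = probe("en.wikivoyage.org", title)
#     print(title, status, location, etag, matches_documented_redirect(status, location))
```

For the responses shown in this task (HTTP 200 for both titles), `matches_documented_redirect` would return False for the former title, which is the mismatch with the docs described above.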

Ah ok. Thanks for checking. I suppose this can just sit open for a bit. I have a workaround; it just involves me hitting the API 2-3 more times than I normally would. If that's fine with WMF then I guess that's fine with me for now.
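
The workaround is not spelled out in this comment. One possible shape, purely as an assumption, is to resolve the title through the MediaWiki Action API first (`action=query` with the `redirects` parameter) and then request the REST API with the resolved title. The helper below only interprets such a response. `resolve_title` is a hypothetical name, and the sample payload is hand-written to mirror the `normalized` and `redirects` lists that API returns. It is not captured output.

```python
def resolve_title(query_response, title):
    """Follow the 'normalized' and 'redirects' mappings of an Action API query response."""
    query = query_response.get("query", {})
    for key in ("normalized", "redirects"):
        mapping = {entry["from"]: entry["to"] for entry in query.get(key, [])}
        seen = set()  # guards against a redirect loop in the mapping
        while title in mapping and title not in seen:
            seen.add(title)
            title = mapping[title]
    return title


# Hand-written sample shaped like a response to
# /w/api.php?action=query&titles=Tha_Khaek&redirects=1&format=json
sample = {
    "query": {
        "normalized": [{"from": "Tha_Khaek", "to": "Tha Khaek"}],
        "redirects": [{"from": "Tha Khaek", "to": "Thakhek"}],
    }
}
print(resolve_title(sample, "Tha_Khaek"))  # prints "Thakhek"
```

This costs one extra request per title, which is in the same spirit as the extra API hits mentioned above.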

@MSantos just curious if this should be on your radar. It's not an urgent bug, but it is a bug, and would be nice to have it on someone's radar, even if they backlog it.