
API 'info' query for some Chinese page titles on zh.wikipedia.org does not work
Closed, Invalid | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):
Python:

import requests
requests.get("https://zh.wikipedia.org/w/api.php", params={"prop": "info", "titles": "花鄉東橋站", "action": "query", "format": "json"}).json()
requests.get("https://zh.wikipedia.org/w/api.php", params={"prop": "info", "titles": "汉语", "action": "query", "format": "json"}).json()

What happens?:
The former returns JSON indicating the page does not exist (it has the missing key, for instance), even though it definitely does. The latter works - you get a response indicating the page exists, with correct information.
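To make the "missing key" check concrete, here is a minimal sketch of how one might pull the missing titles out of a response. The helper name `missing_titles` and the sample dictionary are illustrative assumptions: the sample is hand-written in the shape of a default `format=json` reply, not captured API output, and the page id `12345` is made up.

```python
def missing_titles(response):
    """Return the titles that the API flagged with a 'missing' key."""
    pages = response.get("query", {}).get("pages", {})
    return [page["title"] for page in pages.values() if "missing" in page]


# Hand-written sample (assumption): a missing page is keyed by a negative id
# and carries a "missing" key; an existing page is keyed by its page id.
sample = {
    "query": {
        "pages": {
            "-1": {"ns": 0, "title": "花鄉東橋站", "missing": ""},
            "12345": {"pageid": 12345, "ns": 0, "title": "汉语"},
        }
    }
}

print(missing_titles(sample))
```

Checking for the presence of the key, rather than its value, matters here because the value is an empty string in this response format.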

requests doesn't seem to be the issue here; it correctly URL-escapes the unicode characters. You can get the URL it produces and paste it into a browser and the browser bar will show the correct characters. This is the raw URL it produces: https://zh.wikipedia.org/w/api.php?prop=info&titles=%E8%8A%B1%E9%84%89%E6%9D%B1%E6%A9%8B%E7%AB%99&action=query&format=json
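The encoding claim can be checked without any network access. This is a small sketch using only the standard library's `urllib.parse.urlencode`; it does not inspect what requests does internally, it just confirms that UTF-8 percent-encoding of the same parameters yields the query string quoted above. The variable names are illustrative.

```python
from urllib.parse import urlencode

# Same parameters, in the same order, as the failing request.
params = {"prop": "info", "titles": "花鄉東橋站", "action": "query", "format": "json"}
query_string = urlencode(params)

# Query part of the raw URL quoted in this report.
reported = (
    "prop=info&titles=%E8%8A%B1%E9%84%89%E6%9D%B1%E6%A9%8B%E7%AB%99"
    "&action=query&format=json"
)

print(query_string == reported)
```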

What should have happened instead?:
The response should provide correct information on the existing page.

This was reported as a bug in mwclient, but I'm *pretty* sure neither mwclient nor requests is the source of the problem.

Event Timeline

Shizhao added a project: Chinese-Sites.
Shizhao subscribed.

The actual page title is “花乡东桥站” (simplified characters), not "花鄉東橋站" (traditional characters); see https://zh.wikipedia.org/w/api.php?prop=info&titles=%E8%8A%B1%E4%B9%A1%E4%B8%9C%E6%A1%A5%E7%AB%99&action=query&format=json

zhwiki has conversion between traditional and simplified Chinese enabled. The API works with the original title and does not apply that conversion.

Ahh, hmm. Now that I come back and look at this with fresh eyes, the names are a little different. When you browse to https://zh.wikipedia.org/wiki/花鄉東橋站 , you get redirected (301) to https://zh.wikipedia.org/wiki/花乡东桥站 . I don't think it's a MediaWiki redirect. I'm not sure how it's set up, but I guess that's the issue here.

Edit: this seems to be a mechanism on Chinese Wikipedia for converting between simplified and traditional renderings; however that's implemented, it doesn't work at the level of API requests. So there's probably not much that can be done here. I'll close the issue.
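For anyone who lands here with the same problem: the MediaWiki query API documents a `converttitles` parameter that asks the server to convert titles to other language variants, and reports any conversion in a `converted` list. The sketch below assumes that parameter behaves as documented on zhwiki, which nobody verified in this thread. The helper names `build_info_params` and `resolve_title` are hypothetical, and the sample response is hand-written, not real API output.

```python
def build_info_params(title, convert=True):
    """Build query parameters for prop=info, optionally asking for variant conversion."""
    params = {"action": "query", "prop": "info", "titles": title, "format": "json"}
    if convert:
        # Boolean API parameters count as set whenever they are present.
        params["converttitles"] = 1
    return params


def resolve_title(response, title):
    """Return the title the server converted `title` to, or `title` unchanged."""
    for entry in response.get("query", {}).get("converted", []):
        if entry.get("from") == title:
            return entry.get("to", title)
    return title


# Hand-written sample (assumption) in the shape of a reply that includes a conversion.
sample = {
    "query": {
        "converted": [{"from": "花鄉東橋站", "to": "花乡东桥站"}],
        "pages": {"12345": {"pageid": 12345, "ns": 0, "title": "花乡东桥站"}},
    }
}

print(build_info_params("花鄉東橋站"))
print(resolve_title(sample, "花鄉東橋站"))
```

With requests, the call would look like `requests.get("https://zh.wikipedia.org/w/api.php", params=build_info_params("花鄉東橋站")).json()`, after which `resolve_title` tells you which title the returned page entry belongs to.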