Recently, while using the Wikipedia app, I noticed that the localization of regional terms in the app is not as complete as on the web version. For example, in the article Georgia (country), the name used in Taiwan and the name used in other regions alternate within the text, which makes the article uncomfortable to read. Beyond this article, there are many similar cases, such as the regional names for Montenegro and Slovenia. I hope these can be fixed as soon as possible!
Best wishes
Version: WikipediaApp/7.4.3.2822 (iPadOS 17.0.3; Tablet)
Description
Status | Subtype | Assigned | Task |
---|---|---|---|
Open | BUG REPORT | None | T277824 [EPIC] Language variant issues |
Open | | None | T356615 [Epic] Return the correct Chinese language variant in the REST endpoints |
Open | | None | T351989 Localization Inconsistencies in Wikipedia App: Improving Regional Word Display |
Event Timeline
@ABorbaWMF, can you investigate whether this is true? We are trying to see if the variant logic is different.
Also, can you check whether this happens on both Android and iOS?
@ARamadan-WMF - Hello, can we possibly ask for screenshots about this specific issue and clarification on what the user is seeing?
I noticed that the disambiguation text is in a different location in the Georgia (country) article than in the Georgia (U.S. state) article; however, I am unsure whether this is the issue the user is describing.
Georgia state | Georgia country |
Here is an example of the Georgia (country) article on Mobile Web vs iOS and Android Apps. They look similar to me, but I may not be able to pick out the differences the user has reported.
Mobile Web | iOS | Android |
One thing I do notice is that the mobile-html endpoint response returns different characters depending on whether I send zh-tw or zh-Hant-TW in the Accept-Language header. We changed it to zh-Hant-TW a few releases ago as part of https://phabricator.wikimedia.org/T338079. Comparing the first paragraph on Desktop with https://zh.wikipedia.org/api/rest_v1/page/mobile-html/%E6%A0%BC%E9%B2%81%E5%90%89%E4%BA%9A (the Georgia ZH article), the zh-tw response seems to match Desktop, whereas the zh-Hant-TW response does not.
@Jgiannelos Is this expected? Are we sending the wrong BCP 47 code here?
Desktop zh-tw:
app zh-tw:
app zh-Hant-TW:
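For anyone who wants to reproduce the comparison above without screenshots, here is a minimal Python sketch. It requests the same mobile-html URL with a given Accept-Language value and counts the two regional forms of "Georgia" quoted elsewhere in this task. The function names and the User-Agent string are placeholders I made up for illustration, and counting two terms is only a rough proxy for "matches Desktop".

```python
import urllib.request

# URL taken from the comment above (mobile-html for the Georgia ZH article).
URL = ("https://zh.wikipedia.org/api/rest_v1/page/mobile-html/"
       "%E6%A0%BC%E9%B2%81%E5%90%89%E4%BA%9A")

# Regional forms of "Georgia" quoted in this task.
TAIWAN_TERM = "喬治亞"
MAINLAND_TERM = "格魯吉亞"


def build_request(url: str, accept_language: str) -> urllib.request.Request:
    """Build a GET request that asks for one specific language variant."""
    headers = {
        "Accept-Language": accept_language,
        # Placeholder User-Agent (hypothetical); replace with your own contact info.
        "User-Agent": "variant-check-sketch/0.1 (example only)",
    }
    return urllib.request.Request(url, headers=headers)


def count_terms(html: str) -> dict:
    """Count how often each regional form occurs in a response body."""
    return {
        "taiwan": html.count(TAIWAN_TERM),
        "mainland": html.count(MAINLAND_TERM),
    }


def fetch_counts(accept_language: str, url: str = URL) -> dict:
    """Fetch the page with the given header and count the regional terms."""
    with urllib.request.urlopen(build_request(url, accept_language)) as resp:
        return count_terms(resp.read().decode("utf-8"))
```

Calling `fetch_counts("zh-tw")` and `fetch_counts("zh-Hant-TW")` and comparing the two dictionaries should show whether one header yields the Mainland form where the other yields the Taiwan form. The actual numbers depend on the live article, so none are given here.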
Here's the user reply:
From what I see in the ticket, what the user Tsevener showed is exactly what happened to me. That is, zh-Hant-TW seems to display the Chinese words for Georgia and Azerbaijan incorrectly: it shows the forms used mainly in Mainland China (格魯吉亞 and 阿塞拜疆) instead of the ones used in Taiwan (喬治亞 and 亞塞拜然). Hope this helps. (Sorry for my bad English.)
From RESTBase, after purging the specific page to make sure that we don't get any stale content:
page/mobile-html output
zh-Hant-TW
zh-tw
If I request the same output directly from PCS:
zh-Hant-TW
zh-tw
The last two screenshots look the same to me. I think something is wrong with language variant handling at the RESTBase level.
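The reasoning above can be written down as a small check: if one layer returns the same body for both Accept-Language values while another layer does not, the layer that differs is the likely place to look. The Python sketch below works on response bodies that have already been fetched. The layer labels and the function name are made up for illustration, and exact string equality is a simplification, since real responses may differ in incidental ways.

```python
from typing import Dict, List, Tuple


def layers_with_variant_mismatch(outputs: Dict[Tuple[str, str], str]) -> List[str]:
    """Return the layers whose output changes with the Accept-Language value.

    `outputs` maps (layer_label, accept_language) to a response body.
    A layer is reported when its bodies are not all identical.
    """
    bodies_by_layer: Dict[str, set] = {}
    for (layer, _accept_language), body in outputs.items():
        bodies_by_layer.setdefault(layer, set()).add(body)
    return sorted(
        layer for layer, bodies in bodies_by_layer.items() if len(bodies) > 1
    )


# Toy stand-in data (not real responses): the direct output is identical
# for both headers, the proxied output is not.
sample = {
    ("restbase", "zh-Hant-TW"): "格魯吉亞",
    ("restbase", "zh-tw"): "喬治亞",
    ("pcs", "zh-Hant-TW"): "喬治亞",
    ("pcs", "zh-tw"): "喬治亞",
}
suspect_layers = layers_with_variant_mismatch(sample)
```

With the toy data, only the `restbase` label is reported, which mirrors the conclusion drawn from the screenshots. It does not by itself prove where the bug is.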