[BUG] Wikidata description for the specific Chinese language variant should be shown
Closed, Resolved · Public · Bug Report

Assigned To

Authored By

	RHo
	Aug 22 2017, 3:31 PM

Description

Steps to reproduce

Go to the article on China using Simplified Chinese and note the Wikidata description.
Change to view the article in Traditional Chinese and note the Wikidata description.

Expected

When viewing the article in Traditional Chinese, the description shown should come from the "Traditional Chinese" row in the Wikidata entry for China; likewise, Simplified Chinese should use the "Simplified Chinese" row.

Actual

The description shown is pulled from the generic "Chinese" row in Wikidata, so characters from one variant are displayed when the language is set to the other. (In the "China" article, for example, the description "中华人民共和国", written in Simplified characters, is shown on the Traditional Chinese variant of the article.)

Related Objects

Mentioned In: T219685: Handle language variants in the suggestions table populate script
T193360: Cantonese Wikipedia mobile app interface cannot edit yue descriptions from Wikidata
T177342: [BUG] Unable to get updated Wikidata description in Traditional Chinese articles after submitting it
Mentioned Here: T184000: Magic word on English WP to override display of Wikidata short description
T43716: [EPIC] Support language variant conversion in Parsoid
T159985: Implement language variant support in the REST API
T176678: converttitle parameter for the mobileview API
T177342: [BUG] Unable to get updated Wikidata description in Traditional Chinese articles after submitting it

Event Timeline

RHo created this task. · Aug 22 2017, 3:31 PM

Restricted Application added a subscriber: Aklapper. · Aug 22 2017, 3:31 PM

RHo updated the task description. · Aug 22 2017, 3:32 PM

RHo added a project: Wikidata. · Aug 22 2017, 7:04 PM

• Tbayer subscribed. · Aug 22 2017, 7:11 PM

• NHarateh_WMF removed a project: Wikidata. · Aug 23 2017, 3:26 PM

• NHarateh_WMF moved this task from Needs Triage to Tracking on the Wikipedia-Android-App-Backlog board.

• NHarateh_WMF added a project: Mobile-Content-Service.

Restricted Application added a project: Product-Infrastructure-Team-Backlog-Deprecated. · Aug 23 2017, 3:26 PM

MCS is not used for zhwiki until Parsoid and RESTBase can handle language variants.

• Mholloway subscribed. · Sep 13 2017, 4:07 PM

So wait... Wikidata has descriptions in Traditional Chinese, Simplified Chinese, and another one called "Chinese"? What does that third one mean? Is it another variant all to itself, or does it "default" to traditional or simplified?

The Wikidata entry for China actually has a different description for each of several variants, but only the "original" Chinese one is used.

This can be better illustrated by taking an example Chinese wiki article with no description yet, which looked like this:

After the description is populated in Wikidata, the table is updated only for the Chinese-only article:

TL;DR: It looks like the behavior is similar to Simple English vs. English, in that a row can be created for each variant in Wikidata, but we only pull in whatever is in the 'Chinese' row, without applying any character transforms.

Here are some screenshots that might help:

When you read an article in Traditional Chinese, the app does not load the description from Wikidata, even though the Wikidata page shows that the description has been published.
==>
- The Android app PUSHes the description to Wikidata under the "zh-hant" language code ==> correct
- The Android app GETs the description from Wikidata under the "zh" language code ==> not correct

After you update the description under the "Chinese" label on Wikidata, the description appears once you refresh the article page in the Android app.

When I tried keeping only the Traditional Chinese label and description on Wikidata, the app did not show the description.
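The write/read mismatch described above can be sketched roughly as follows, in Python. This is only an illustration, not the app's code: the `descriptions` dictionary is a hypothetical in-memory stand-in for one Wikidata item's descriptions, and `push_description` and `get_description` are made-up helper names.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the descriptions of a single Wikidata item,
# keyed by language code. Not a real Wikidata client.
descriptions: Dict[str, str] = {}

# Language codes as reported in the comment above.
APP_WRITE_LANG = "zh-hant"  # code the app uses when publishing
APP_READ_LANG = "zh"        # code the app uses when fetching


def push_description(lang: str, text: str) -> None:
    """Store a description under the given language code."""
    descriptions[lang] = text


def get_description(lang: str) -> Optional[str]:
    """Return the description stored under exactly this code, if any."""
    return descriptions.get(lang)


# The app publishes under zh-hant but reads back under zh,
# so the freshly written description is never found.
push_description(APP_WRITE_LANG, "位於苗栗縣")
shown = get_description(APP_READ_LANG)
```

With only the "zh-hant" entry present, `shown` stays empty, which matches the report that the app displays nothing until the generic "Chinese" row is filled in.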

cooltey claimed this task. · Sep 18 2017, 5:17 PM

Shizhao added a project: Chinese-Sites. · Sep 19 2017, 2:23 AM

Shizhao moved this task from Backlog to Extensions/Skins on the Chinese-Sites board. · Sep 27 2017, 2:13 AM

Liuxinyu970226 subscribed. · Oct 1 2017, 10:37 PM

cooltey mentioned this in T177342: [BUG] Unable to get updated Wikidata description in Traditional Chinese articles after submitting it. · Oct 3 2017, 10:44 PM

• Mholloway merged a task: T177342: [BUG] Unable to get updated Wikidata description in Traditional Chinese articles after submitting it. · Oct 4 2017, 5:56 PM

• Mholloway added a subscriber: gerritbot.

To repeat (and expand) relevant discussion from T177342:

Neither API currently provides support for specifying a language variant for Wikidata descriptions. Even the mobileview API's 'variant' parameter has no effect here. It could be added, but we'll run into the same issue as with T176678 that the mobileview API is deprecated and in principle we shouldn't be spending time on it.

For MCS/RESTBase, we're waiting on T159985. (Note that this in turn depends on T43716, which is triaged at low priority.)

For the mobileview API, this is the method I think we'd need to update for variant support: https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/27599dfbdaecf1ca0a12164e648a14facefd00d2/includes/MobileFrontend.body.php#L189-L209

As a side note, I get the sense no one would object to moving the mobileview API into the MobileApp extension as discussed on T176678, which would remove the need for the Reading Web team to be involved, though that would still leave an open question about how much work we should be putting into the mobileview API on an ongoing basis. (Also, such a change should probably be announced in advance on mobile-l and wikitech-l.)

• Mholloway added a parent task: T159985: Implement language variant support in the REST API. · Oct 4 2017, 7:49 PM

• Mholloway added a parent task: T148854: Use RESTBase for zhwiki. · Oct 5 2017, 12:57 AM

Some of the problem here is that historically LanguageConverter does not specifically tag the source language variant of the text, since it is assumed it can be inferred from the character set. This is more-or-less true for Serbian (latin/cyrillic) and Chinese (simplified/traditional) but falls down badly with (say) British/American English. And it doesn't work 100% even for Serbian and Chinese, depending on the exact input text. The original article text in Wikipedia is a mix of variants, again with the assumption that you can determine on a word by word basis what the original variant is and what needs conversion.

Anyway, Parsoid is getting the ability to do language variant conversion, but going forward we need to be careful to accurately record the source language variant -- for example, Wikidata should really be taking appropriate care and not following MediaWiki's (bad) example.

There doesn't seem to be any task tracking Chinese language variants for Wikidata, but I'm not sure. I'd be grateful if anyone could link one or create a task.

This has also been a problem for importing data to Wikidata. I'm always confused by the difference between Chinese, Simplified Chinese, and Traditional Chinese there.

I think we should:

- remove Chinese in Wikidata
- have zh-cn and zh-sg fall back to zh-hans
- have zh-tw, zh-hk, and zh-mo fall back to zh-hant

In this way, we can map all language variants to Wikidata precisely without any other rules.
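The proposal above could be sketched like this in Python. It illustrates the commenter's suggested mapping only; it is not how Wikidata or MediaWiki resolves fallbacks today, and `FALLBACKS` and `resolve_description` are hypothetical names.

```python
from typing import Dict, List, Optional

# Fallback chains as proposed in the comment: regional codes fall back to
# their script code, and there is no generic "zh" entry at all.
FALLBACKS: Dict[str, List[str]] = {
    "zh-cn": ["zh-hans"],
    "zh-sg": ["zh-hans"],
    "zh-tw": ["zh-hant"],
    "zh-hk": ["zh-hant"],
    "zh-mo": ["zh-hant"],
}


def resolve_description(requested: str,
                        descriptions: Dict[str, str]) -> Optional[str]:
    """Return the first description found for the requested code or its fallbacks."""
    for code in [requested] + FALLBACKS.get(requested, []):
        text = descriptions.get(code)
        if text:
            return text
    # No cross-script fallback: zh-hans never borrows from zh-hant or "zh".
    return None
```

For example, `resolve_description("zh-tw", {"zh-hant": "位於苗栗縣"})` would find the Traditional Chinese text, while a request for "zh-cn" against the same data would find nothing.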

I agree with @fantasticfears's comments. Allowing users to mark a label as plain zh is not accurate enough. I have even considered a more aggressive approach: using zh-cn/tw/hk/mo/sg instead of plain zh-hans/hant when specifying the label and description for an entity. After all, aside from fallbacks (e.g. zh-tw --> zh-hant), Wikidata can automatically use, say, the zh-tw label when a user requests the zh-hant label but no zh-hant label is directly assigned to the entity.

RHo added a project: Android-app-feature-Multilingual. · Mar 29 2018, 4:45 PM

RHo mentioned this in T193360: Cantonese Wikipedia mobile app interface cannot edit yue descriptions from Wikidata. · May 7 2018, 4:41 PM

I want to help with this. Wikibase doesn't cover this now. It's much more reasonable to fix this on their end. Maybe you can chime in and ask Wikidata people? @RHo

Hi @fantasticfears, I have just tagged Wikidata again to get their comment first; it seems the tag was removed after the original ticket was filed, for some reason...

Addshore subscribed. · Jul 18 2018, 9:45 AM

In T173842#3658746, @Mholloway wrote:

For the mobileview API, this is the method I think we'd need to update for variant support: https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/27599dfbdaecf1ca0a12164e648a14facefd00d2/includes/MobileFrontend.body.php#L189-L209

Yup, it looks like this needs to be updated to also account for the user's language variant; currently it just uses the site content language.
This description is used in onOutputPageParserOutput (I'm not sure whether the output there is cached), so it might require a cache split based on the user's content language variant (I'm not sure whether it is already split on that).

It should be a fairly small patch to MobileFrontend.
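The idea of a cache split by variant can be illustrated with a small Python sketch. This is not MobileFrontend code (which is PHP); the function name and key layout are assumptions made purely for illustration.

```python
from typing import Optional


def description_cache_key(page_id: int,
                          content_lang: str,
                          variant: Optional[str] = None) -> str:
    """Build a cache key that differs per language variant.

    Falls back to the site content language when no variant is given,
    which corresponds to the behavior described as current in the comment.
    """
    return f"description:{page_id}:{variant or content_lang}"
```

With the variant in the key, a description cached for a zh-hant reader would not be served to a zh-hans reader of the same page.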

Addshore added a project: User-Addshore. · Jul 18 2018, 9:54 AM

Addshore moved this task from Unsorted 💣 to Watching 👀 on the User-Addshore board. · Aug 17 2018, 1:25 PM

Addshore moved this task from incoming to monitoring on the Wikidata board. · Aug 30 2018, 9:11 AM

@Addshore I was actually proposing a storage/data model change. If I had some information about how the data gets rendered from the DB, I might be able to submit patches.

Should I also add iOS-app-Bugs and Wikipedia-Android-App-Backlog? Or does this bug also happen on iOS?

@Liuxinyu970226 Mm... I don't use iOS, but I do use Android. I think the result is the same.

IMG_20190304_082557.jpg (2×1 px, 374 KB)

@Liuxinyu970226 Sorry... I mean the result is that no conversion happens.

• Mholloway mentioned this in T219685: Handle language variants in the suggestions table populate script. · Apr 1 2019, 9:04 PM

Isn't this task to merge the entries into the single "Chinese" entry??

Merging does not seem suitable, because some things have different names in different regions, such as the title of a movie or a drama series.

At least please leave those variants alone as before, such as "zh-CN", "zh-HK", "zh-MO", "zh-TW", "zh-SG", "zh-MY" etc.

Wikidata_edit_labels_Chinese_only.PNG (815×1 px, 65 KB)

Those variants are better off with fallback chains, as in MediaWiki (that’s a nice system).

JoeWalsh removed parent tasks: T148854: Use RESTBase for zhwiki, T159985: Implement language variant support in the REST API. · Jul 31 2019, 8:19 PM

Restricted Application changed the subtype of this task from "Task" to "Bug Report". · Jul 31 2019, 8:19 PM

cooltey removed cooltey as the assignee of this task. · Mar 18 2020, 8:21 PM

Shizhao moved this task from Extensions/Skins to Apps/Tools/Libs/Services on the Chinese-Sites board. · Jul 6 2020, 1:08 AM

#product-infrastructure-team-backlog and Platform Engineering and cc @JoeWalsh

The issue is still there.

A request to the following API should return the Wikidata description that matches the language code sent in Accept-Language:

https://zh.wikipedia.org/w/api.php?action=query&format=json&prop=description&titles=%E4%B8%89%E7%81%A3%E9%84%89%20(%E5%8F%B0%E7%81%A3)

According to the wikidata page: https://www.wikidata.org/wiki/Q713793

When sending Accept-Language: zh-hant, it should show 位於苗栗縣 from the Traditional Chinese column.
When sending Accept-Language: zh-hans, it should show an empty description.

And we should no longer use the Wikidata description from the Chinese column.
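A minimal sketch of the request described above, in Python with only the standard library. It builds the request object without sending it. Whether the API actually varies its response on Accept-Language is what this comment asks for, not something the sketch assumes. The helper name `build_description_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://zh.wikipedia.org/w/api.php"


def build_description_request(title: str, variant: str) -> Request:
    """Build (but do not send) a prop=description query for one title."""
    query = urlencode({
        "action": "query",
        "format": "json",
        "prop": "description",
        "titles": title,
    })
    # The desired variant is carried in the Accept-Language header.
    return Request(f"{API_URL}?{query}", headers={"Accept-Language": variant})


request = build_description_request("三灣鄉 (台灣)", "zh-hant")
```

Passing `request` to `urllib.request.urlopen` would send it; comparing the responses for "zh-hant" and "zh-hans" is one way to check the behavior described here.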

cooltey added a project: Product-Infrastructure-Team-Backlog-Deprecated. · Jul 13 2020, 8:27 PM

@cooltey @Charlotte Can you set a priority for this task?

@LGoto I would say to set it as Medium or High since it should not happen anyways.

• Charlotte triaged this task as High priority. · Jul 22 2020, 3:53 PM

@cooltey Just to be clear, am I correct that the remaining issue here is that the service should NOT fall back to using the description from another variant if the preferred variant is unavailable? Using your example title, I see that the description on https://zh.wikipedia.org/api/rest_v1/page/mobile-html/%E4%B8%89%E7%81%A3%E9%84%89_(%E5%8F%B0%E7%81%A3) uses the correct variant when Accept-Language: zh-hant is sent. But when Accept-Language: zh-hans is sent or no language variant is specified, the generic "Chinese" description is shown, which (IIUC) is incorrect.

If what you're asking is to fix ApiQueryDescription (?action=query&prop=description) to handle language variants generally, then that's a separate task, and one probably best performed by someone with in-depth knowledge of Wikibase concepts and architecture (hint: probably someone from WMDE and not WMF Product Infrastructure). But I wouldn't agree with the prioritization in that case, because descriptions in the correct variant are already available by other means.

• Charlotte lowered the priority of this task from High to Medium. · Jul 23 2020, 3:31 PM

In T173842#6329971, @Mholloway wrote:

If what you're asking is to fix ApiQueryDescription (?action=query&prop=description) to handle language variants generally, then that's a separate task, and one probably best performed by someone with in-depth knowledge of Wikibase concepts and architecture (hint: probably someone from WMDE and not WMF Product Infrastructure).

This API module is pretty far away from Wikibase (despite the code being in Wikibase currently).
The API module exists as part of a WMF Product feature (see T184000), so probably doesn't need to be coordinated with us (WMDE) much.

The request in this task was for the Wikipedia Android app to show the article description in the correct language variant if a description exists in Wikidata for that variant. That's long-since resolved. Please file requests for refining the current behavior (or for fixing ApiQueryDescription) as new tasks.

Shizhao moved this task from Apps/Tools/Libs/Services to Closed on the Chinese-Sites board. · Aug 4 2020, 1:14 AM

	F28757190: Wikidata_edit_labels_Chinese_only.PNG
	Apr 24 2019, 5:36 PM

	F9568108: Chinese.png
	Sep 16 2017, 12:05 AM

	F9567989: Traditional.png
	Sep 16 2017, 12:05 AM

	F9557135: image.png
	Sep 15 2017, 2:37 PM

	F9556961: image.png
	Sep 15 2017, 2:37 PM
