
The "Hidden full text manual conversion tag" -{H|}- fails with the zh-hans and zh-hant variants in MediaWiki 1.27.1
Closed, Invalid (Public)

Description

The "Hidden full text manual conversion tag" (隐藏式全文手工转换标签) fails in MediaWiki 1.27.1 with the zh-hans and zh-hant variants.

The tag has the following format. It is supposed to match the word given for [zh] and convert it to the text given for each Chinese variant (such as zh-hans):

-{H|zh:文字1;zh-hans:文字2;zh-hant:文字3;zh-cn:文字4;zh-tw:文字5;zh-hk:文字6;zh-sg:文字7;zh-mo:文字8;}-
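As a reading aid, the tag format above can be broken down as a flag (H) followed by semicolon-separated variant:text pairs. The following is a minimal Python sketch of that breakdown only; `parse_rule` is a hypothetical helper written for illustration and is not MediaWiki's parser.

```python
def parse_rule(tag: str) -> tuple[str, dict[str, str]]:
    """Split a tag such as -{H|zh:A;zh-hans:B;}- into its flag and variant map.

    Hypothetical helper for illustration; not part of MediaWiki.
    """
    if not (tag.startswith("-{") and tag.endswith("}-")):
        raise ValueError("not a manual conversion tag")
    body = tag[2:-2]                      # strip the -{ and }- delimiters
    flag, _, rules = body.partition("|")  # "H" and the rule list
    mapping: dict[str, str] = {}
    for item in rules.split(";"):
        if not item.strip():
            continue                      # skip the empty piece after the last ";"
        variant, _, text = item.partition(":")
        mapping[variant.strip()] = text.strip()
    return flag, mapping


tag = "-{H|zh:文字1;zh-hans:文字2;zh-hant:文字3;zh-cn:文字4;zh-tw:文字5;zh-hk:文字6;zh-sg:文字7;zh-mo:文字8;}-"
flag, mapping = parse_rule(tag)
```

Here `flag` is the H flag and `mapping` holds one entry per variant named in the tag.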

On a newly installed MediaWiki 1.27.1 with $wgLanguageCode = "zh";, the -{H|}- tag does not convert 文字1 into 文字2 under zh-hans or into 文字3 under zh-hant.

Example:
/index.php?title=Main_Page&variant=zh (no conversion; it should show 文字1, and it does.)

20161027093437.png (129×171 px, 3 KB)

/index.php?title=Main_Page&variant=zh-hans (converts to zh-hans; it should show 文字2, but nothing is converted.)

20161027093616.png (177×189 px, 6 KB)

/index.php?title=Main_Page&variant=zh-hant (converts to zh-hant; it should show 文字3, but nothing is converted.)

Main Page   qtest.png (154×165 px, 7 KB)

/index.php?title=Main_Page&variant=zh-cn (converts to zh-cn. zh-cn, zh-tw, zh-hk, zh-sg, and zh-mo all work as they should, so I am not posting more screenshots here.)

20161027093823.png (160×164 px, 6 KB)

Event Timeline

After commenting out lines 141 and 142
'zh-hans' => 'unidirectional',
'zh-hant' => 'unidirectional',

of /languages/classes/LanguageZh.php, the -{H|}- tag works on my test wiki.
https://phabricator.wikimedia.org/diffusion/MW/browse/master/languages/classes/LanguageZh.php
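The effect of those two entries can be modeled roughly as follows. This is a simplified Python sketch, not the PHP converter: it assumes that a plain variant:text rule is applied only to variants whose manual level is bidirectional, and that variants not listed default to bidirectional. `MANUAL_LEVEL` and `display_text` are hypothetical names introduced for illustration.

```python
# Only the two entries quoted above are taken from the report; treating every
# other variant as "bidirectional" is an assumption made for this sketch.
MANUAL_LEVEL = {
    "zh-hans": "unidirectional",
    "zh-hant": "unidirectional",
}


def display_text(source_text, rule, variant, manual_level=MANUAL_LEVEL):
    """Return what a reader of `variant` would see for `source_text`.

    Hypothetical model: a variant:text rule only takes effect when the
    variant's manual level is "bidirectional".
    """
    level = manual_level.get(variant, "bidirectional")
    if level == "bidirectional" and variant in rule:
        return rule[variant]
    return source_text  # rule ignored, text left unconverted


rule = {"zh": "文字1", "zh-hans": "文字2", "zh-hant": "文字3", "zh-cn": "文字4"}

as_shipped_hans = display_text("文字1", rule, "zh-hans")
as_shipped_cn = display_text("文字1", rule, "zh-cn")
# Passing an empty table stands in for commenting out the two entries.
patched_hans = display_text("文字1", rule, "zh-hans", manual_level={})
```

Under these assumptions the model matches the report: zh-hans stays unconverted with the shipped settings, zh-cn converts, and zh-hans converts once the two entries are removed.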

However, this introduces a new bug: when $wgDisabledVariants is set, the rules for the disabled variants are displayed as text.

Example:
-{H|zh:文字1;zh-hans:文字2;zh-hant:文字3;zh-cn:文字4;zh-tw:文字5;zh-hk:文字6;zh-sg:文字7;zh-mo:文字8;}-
with
$wgDisabledVariants = array( 'zh-cn', 'zh-tw', 'zh-hk', 'zh-mo', 'zh-my', 'zh-sg' );

results in 文字2 in the zh-hans environment and 文字3;zh-cn:文字4;zh-tw:文字5;zh-hk:文字6;zh-sg:文字7;zh-mo:文字8 in the zh-hant environment.
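One plausible reading of this output is that a ";" counts as a rule separator only when an enabled variant name follows it, so the rules for disabled variants stay attached to the previous value. The Python sketch below reproduces the observed result under that assumption. `ALL_VARIANTS` and `split_rule` are hypothetical names, and the variant list is simply the set of codes mentioned in this report.

```python
import re

# Variant codes mentioned in this report; the list is illustrative.
ALL_VARIANTS = ["zh", "zh-hans", "zh-hant", "zh-cn", "zh-tw",
                "zh-hk", "zh-mo", "zh-my", "zh-sg"]


def split_rule(rules, disabled):
    """Split a rule list, recognizing only enabled variant names as keys."""
    enabled = [v for v in ALL_VARIANTS if v not in disabled]
    # Longest names first so that "zh-hans" is tried before "zh".
    names = "|".join(re.escape(v) for v in sorted(enabled, key=len, reverse=True))
    # A ";" separates rules only if an enabled variant name (or the end) follows.
    separator = re.compile(rf";\s*(?=(?:{names})\s*:|\s*$)")
    result = {}
    for item in separator.split(rules):
        if not item.strip():
            continue
        variant, _, text = item.partition(":")
        result[variant.strip()] = text.strip()
    return result


rules = "zh:文字1;zh-hans:文字2;zh-hant:文字3;zh-cn:文字4;zh-tw:文字5;zh-hk:文字6;zh-sg:文字7;zh-mo:文字8;"
disabled = {"zh-cn", "zh-tw", "zh-hk", "zh-mo", "zh-my", "zh-sg"}
parsed = split_rule(rules, disabled)
```

With the six variants disabled, `parsed["zh-hant"]` carries the remaining rules as literal text, which is the behavior reported above. With nothing disabled, all eight rules are split apart.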

Aklapper added a subscriber: Shizhao.

@Shizhao: Do you plan to fix this, or why did you add the MW-1.27-release tag?

I believe this is intentional given the unidirectional setting. The original programmers intended to mainly provide variants instead of scripts for Chinese, since:

  1. The HK/TW split is really large.
  2. Hans/Hant literally only tells you about the set of characters something is written in.

The wgDisabledVariants part is likely intentional as well -- with these variants disabled, why pretend that you actually know something about it and parse its name?

If you control your own wiki (if that is the case), you can always ask your users not to use these nonexistent variants and set up abuse filters to warn them, assuming they are happy with seeing [義呆利 instead of 意呆利](https://zh.wikipedia.org/wiki/%E6%A8%A1%E5%9D%97:CGroup/Anime).

This report may nevertheless be a worthwhile prompt for explaining to site deployers why MediaWiki uses variants for Chinese by default. Suggestion: close as invalid (by design).

I would suggest documenting a systematic explanation of the conversion function on mediawiki.org, or at least in the file's code comments.

Information about how the Chinese language conversion system works is scattered, fragmented, and contradictory across mediawiki.org, zh.wikipedia.org, meta.wikimedia.org, and code comments.

https://www.mediawiki.org/wiki/Writing_systems/Syntax?

To be fair, the syntax page mainly covers user-facing usage and behavior (which is not thoroughly documented either), not the internal classes that implementers work on. Dev docs should probably just go into the source as comments so they are available in Doxygen. Patchy-patchy time for T21044.

Change 797318 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[mediawiki/core@master] [Do not merge] Test ZhConverter manual level issue

https://gerrit.wikimedia.org/r/797318

Change 797318 abandoned by Winston Sung:

[mediawiki/core@master] [Do not merge] Test ZhConverter manual level issue

Reason:

https://gerrit.wikimedia.org/r/797318