
MediaWiki:Comma-separator should be separated into two messages for CJK localization
Open, Needs Triage, Public

Description

Problem
In Chinese, Japanese and (probably) Korean, there are two types of commas: the standard comma, which works like the English comma (U+FF0C FULLWIDTH COMMA), and the enumeration comma (U+3001 IDEOGRAPHIC COMMA), which separates the items of a list, e.g. "Apple, Banana, Orange" => “苹果、香蕉、橘子” in Chinese. MediaWiki, however, has only one message for comma localization ([[MediaWiki:Comma-separator]]), which is a problem for CJK languages.

Possible solution
Add a new MediaWiki message, or make changes to [[MediaWiki:Comma-separator]].
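
A minimal sketch of the first option, written in Python rather than MediaWiki's PHP. The message key `enumeration-separator`, the `MESSAGES` table and the helper names are all hypothetical and exist only for this illustration; `comma-separator` is the only real message here. The idea is that callers choose the key by what they are joining, and languages without the distinction simply give both keys the same value.

```python
# Hypothetical message store: language -> message key -> text.
# 'enumeration-separator' is an invented key, not an existing MediaWiki message.
MESSAGES = {
    "en": {"comma-separator": ", ", "enumeration-separator": ", "},
    "zh-hans": {
        "comma-separator": "\uff0c",        # U+FF0C FULLWIDTH COMMA
        "enumeration-separator": "\u3001",  # U+3001 IDEOGRAPHIC COMMA
    },
}


def msg(key: str, lang: str) -> str:
    """Look up a message, falling back to English when the language lacks it."""
    return MESSAGES.get(lang, {}).get(key, MESSAGES["en"][key])


def join_list(items: list[str], lang: str) -> str:
    """Join the items of an enumeration (words in a list)."""
    return msg("enumeration-separator", lang).join(items)


def join_clauses(clauses: list[str], lang: str) -> str:
    """Join clause-like fragments."""
    return msg("comma-separator", lang).join(clauses)


print(join_list(["苹果", "香蕉", "橘子"], "zh-hans"))    # 苹果、香蕉、橘子
print(join_list(["Apple", "Banana", "Orange"], "en"))  # Apple, Banana, Orange
```

The second option (changing [[MediaWiki:Comma-separator]] itself) would keep a single key, so every caller would get the same character regardless of context.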

See also
https://en.wikipedia.org/wiki/Chinese_punctuation
https://en.wikipedia.org/wiki/Comma#Languages_other_than_Western_European
https://zh.wikipedia.org/wiki/MediaWiki_talk:Comma-separator
https://zh.wikipedia.org/wiki/Wikipedia:%E4%BA%92%E5%8A%A9%E5%AE%A2%E6%A0%88/%E6%8A%80%E6%9C%AF/%E5%AD%98%E6%A1%A3/2016%E5%B9%B412%E6%9C%88#.E8.BF.99.E4.B8.A4.E4.B8.AA.E6.A8.A1.E6.9D.BF.E8.B2.8C.E4.BC.BC.E5.87.BA.E9.97.AE.E9.A2.98.E4.BA.86
https://zh.wikipedia.org/wiki/Wikipedia:%E4%BA%92%E5%8A%A9%E5%AE%A2%E6%A0%88/%E6%8A%80%E6%9C%AF#.E6.89.80.E4.BB.A5.E6.88.91.E4.BB.AC.E5.BA.94.E8.AF.A5.E6.94.AF.E6.8C.81.E5.93.AA.E4.B8.AA.E6.A0.87.E7.82.B9.E7.AC.A6.E5.8F.B7.E7.94.A8.E4.BA.8EMediaWiki:Comma-separator.EF.BC.9F

Event Timeline

Liuxinyu970226 renamed this task from "[[MediaWiki:Comma-separator/zh]] need to use two separators" to "MediaWiki:Comma-separator in both zh-hans and zh-hant (probably zh-hk too) need to use two separators". Jul 14 2017, 4:10 AM
Liuxinyu970226 renamed this task from "MediaWiki:Comma-separator in both zh-hans and zh-hant (probably zh-hk too) need to use two separators" to "MediaWiki:Comma-separator in languages that use ideographic characters need to use two separators". Jul 14 2017, 4:15 AM

Hmm, I still suggest voting on this topic, as there are users who are against using ","

I think that the Cantonese, Wu and Gan Wikipedias should also make the change.

That's why I changed the title to "in languages that use ideographic characters" (ideographic characters are also used in Japanese and Korean; have users of both languages been notified?)

I believe that the most useful case here is the ideographic comma "、", used in Chinese to join a list of items. (The other comma "," is mainly for chunks of sentences.) Anyone using the other comma in a list of items needs to seriously consider what that , thing is and how they are ruining the language. It really looks like people are forgetting about the 、 comma and the ; semicolon, using , all the time instead. (Sorry for the rant.)

Also, I believe that this bug affects Japanese too. That language may find the interpunct a better list delimiter.

The bug is better blamed on the anglocentric word choice of "comma", since some languages really do have both a "listing comma" and a "sentence comma". Let's call it a documentation bug, write down somewhere that this message is only used for list elements, and call it a day.

I think this task can be closed as Invalid (since the problem can be handled locally), following the approach in https://www.wikidata.org/wiki/Module:I18n/linguistic:

	comma = {
		message = 'comma-separator'
	},
	citation_comma = {
		zh = ',',  -- in Chinese the commas used in citation aren't '、'
		["zh-cn"] = ',',
		["zh-hans"] = ',',
		["zh-hant"] = ',',
		["zh-hk"] = ',',
		["zh-mo"] = ',',
		["zh-sg"] = ',',
		["zh-tw"] = ',',
		message = 'comma-separator'
	},
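
For readers unfamiliar with the module, here is a rough Python rendering of how such a table might be consulted. It assumes the lookup rule is "use the per-language entry if there is one, otherwise fall back to the named interface message"; that rule, the `resolve` helper and the `INTERFACE_MESSAGES` stand-in are assumptions for this sketch, not the module's actual code.

```python
# Stand-in for MediaWiki interface messages; values are assumed for this sketch.
INTERFACE_MESSAGES = {
    "comma-separator": {
        "en": ", ",
        "zh": "\u3001",       # U+3001 IDEOGRAPHIC COMMA
        "zh-hans": "\u3001",
    },
}

# Python rendering of the two quoted entries (only some zh variants shown).
SEPARATORS = {
    "comma": {"message": "comma-separator"},
    "citation_comma": {
        "zh": ",",
        "zh-hans": ",",
        "zh-hant": ",",
        "message": "comma-separator",
    },
}


def resolve(kind: str, lang: str) -> str:
    """Return the per-language override if present, else the interface message."""
    entry = SEPARATORS[kind]
    if lang != "message" and lang in entry:
        return entry[lang]
    texts = INTERFACE_MESSAGES[entry["message"]]
    return texts.get(lang, texts["en"])


print(repr(resolve("comma", "zh")))           # falls through to the message
print(repr(resolve("citation_comma", "zh")))  # uses the local override
```

Under this reading, the interface message stays the default and each wiki or module lists only the contexts where it wants a different character.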

Agreed. We can use "、" as the standard comma-separator, and let modules define exceptions.

Dringsim subscribed.

comma-separator and commas in file-info, file-info-size, etc. are inconsistent in Chinese:

400 × 400像素,文件大小:978 KB,MIME类型:image/gif、​循环、​44帧、​4.0秒

( https://commons.wikimedia.org/wiki/File:Rotating_earth_(large).gif?uselang=zh-hans )

Diskdance renamed this task from "MediaWiki:Comma-separator in languages that use ideographic characters need to use two separators" to "MediaWiki:Comma-separator should be separated into two messages for CJK localization". Jul 25 2023, 6:33 AM
Diskdance updated the task description.

While most uses of "comma-separator" are for lists of items, the MIME case is unfortunately more like a list of sub-clauses.

The code that produces this behavior is the specialized GIFHandler::getLongDesc() method, which ends with a call to $wgLang->commaList() at https://github.com/wikimedia/mediawiki/blob/44bb3dd389d630a1fa573731cfa2fa054698faff/includes/media/GIFHandler.php#L189C10-L189C28. The same issue affects PNGHandler.
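
To illustrate why the output ends up mixed, here is a simplified Python stand-in for a commaList-style helper. It is not the actual PHP code: `comma_list` and `COMMA_SEPARATOR_ZH` are hypothetical names, and the sketch assumes that the first item arrives already formatted with its own commas while the remaining properties are joined with the single comma-separator value (taken here to be U+3001 followed by a zero-width space, as the rendered output above suggests).

```python
# Assumed zh-hans value of comma-separator: U+3001 plus a zero-width space.
COMMA_SEPARATOR_ZH = "\u3001\u200b"


def comma_list(items: list[str], separator: str = COMMA_SEPARATOR_ZH) -> str:
    """Simplified stand-in for a commaList-style helper: one separator for all items."""
    return separator.join(items)


# The first item already contains commas of its own (as it would if it came
# from a file-info style message); the rest are short properties.
base = "400 × 400像素,文件大小:978 KB,MIME类型:image/gif"
info = [base, "循环", "44帧", "4.0秒"]

print(comma_list(info))
```

Because the helper knows only one separator, it cannot tell that these items continue a clause-like description rather than forming a plain enumeration, so the two comma styles end up side by side.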