Pywikibot transliteration should support Chinese transliteration
Open, Lowest, Public

Description

Author: nzmoihue

Description:
https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/userinterfaces/transliteration.py should support more scripts, such as Korean, Chinese or Malayalam (ml). The jQuery.ime transliteration keyboards at https://github.com/wikimedia/jquery.ime/tree/master/rules can be used as a basis for developing it, for example https://github.com/wikimedia/jquery.ime/blob/master/rules/ml/ml-transliteration.js

I am using the output of this code in my gadget http://www.wikidata.org/wiki/MediaWiki:Gadget-SimpleTransliterate.js (http://commons.wikimedia.org/wiki/File:Wikidata_Transliteration_Gadget.png), which is why I would like it to be developed a little more.


Version: unspecified
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=73410

Details

Security
None
Reference
bz56524
bzimport added a subscriber: Unknown Object (????).
bzimport set Reference to bz56524.
bzimport created this task. Nov 2 2013, 11:28 PM

Change 97040 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040

Change 97044 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044

Change 97040 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040

Change 97044 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044

Both patches got merged.

nzmoihue wrote:

Reopened for Chinese transliteration

Can you give me a list of the Chinese characters that need to be added to this list?

(For future reference, defining the exact scripts to be supported in a bug request is welcome. If it's just about "support more", then a report can easily become unfixable as comments broaden its scope.)

nzmoihue wrote:

#c3

I checked that source but I couldn't find the dictionary file. [1] says there is a file named CTLauBig5.tit, but there isn't. Can you tell me more precisely which dictionary is meant?

[1] http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/lib/Lingua/ZH/Romanize/DictZH.pm

nzmoihue wrote:

There is not a one-to-one "dictionary" there, that is why I CCd original writer of the transliteration. Also have a look at https://github.com/axgle/pinyin

Created attachment 16327
Python translation of https://github.com/axgle/pinyin

(In reply to [no longer active user] from comment #14)

There is not a one-to-one "dictionary" there, that is why I CCd original
writer of the transliteration. Also have a look at
https://github.com/axgle/pinyin

{{done}} translation of the four scripts to python. See attachment.

Attached: zhtransliteration.py

Created attachment 16327 [details]
Python translation of https://github.com/axgle/pinyin

However, there is a bug that causes 6651 Chinese characters to be transliterated as 'zuo'.

Attached: zhtransliteration.py
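
A problem like the one described above (thousands of characters collapsing onto a single reading) can be spotted by counting how often each reading occurs in the mapping. The following is only a minimal sketch: the function name, the threshold and the sample data are hypothetical, and it assumes the transliteration table is a plain dict from character to romanized string, which may not match the layout of the attached file.

```python
from collections import Counter


def overrepresented_readings(mapping, threshold):
    """Return readings that more than `threshold` characters map to.

    `mapping` is assumed to be a dict of {character: romanization}.
    """
    counts = Counter(mapping.values())
    return {reading: n for reading, n in counts.items() if n > threshold}


# Hypothetical sample data, for illustration only.
sample = {"a": "zuo", "b": "zuo", "c": "zuo", "d": "ma"}
suspicious = overrepresented_readings(sample, threshold=2)
```

Any reading returned by such a check is worth inspecting by hand, since a genuine syllable is unlikely to cover a large share of the table.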

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

Change 157498 had a related patch set uploaded by Zhuyifei1999:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

(In reply to Merlijn van Deen from comment #17)

I suppose we can add this, but what's the intended use case? We support full
unicode console output (and input, but transliteration is output-only) on
all systems.

Yes, that is indeed a hard question. Why do we still have transliteration.py?

Change 157498 merged by jenkins-bot:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

jayvdb added a comment. Oct 7 2014, 8:14 AM

Reopen if there is more to be done.

(In reply to John Mark Vandenberg from comment #21)

Reopen if there is more to be done.

zh-hant (Traditional Chinese) needed.

I will repeat my question:

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?

(In reply to Merlijn van Deen from comment #23)

I will repeat my question:

> I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?

I suppose that when the output is somehow ASCII-limited (for some reason, the log files written by the grid engine on Tool Labs are an example of this), transliterated output could be more useful than a pile of question marks or other unreadable code.
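
The use case above can be illustrated with a small sketch: characters that the target encoding cannot represent are replaced by a transliteration when one is known, and by a question mark otherwise. This is not the code in transliteration.py; the function name and the tiny table below are hypothetical and only meant to show the idea.

```python
# Hypothetical, tiny mapping for illustration only; a real table is far larger.
FALLBACK_TABLE = {
    "中": "zhong",
    "文": "wen",
    "я": "ya",
}


def to_limited_encoding(text, encoding="ascii", table=FALLBACK_TABLE):
    """Replace characters that `encoding` cannot represent.

    Known characters are transliterated via `table`; unknown ones become "?".
    """
    pieces = []
    for char in text:
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            # Not representable: fall back to the table, then to "?".
            pieces.append(table.get(char, "?"))
        else:
            pieces.append(char)
    return "".join(pieces)
```

With this sketch, a log line containing "中文" would come out as "zhongwen" on an ASCII-only sink instead of "??", while text that the encoding already supports passes through unchanged.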

Restricted Application added subscribers: revi, Aklapper. Sep 18 2015, 11:34 AM
