pywikibot transliteration should support Chinese transliteration
Open, Public

Description

Author: nzmoihue

Description:
https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/userinterfaces/transliteration.py should support more scripts, such as Korean, Chinese, or Malayalam (ml). The jQuery.ime transliteration keyboards at https://github.com/wikimedia/jquery.ime/tree/master/rules can be used as a basis for developing it, for example https://github.com/wikimedia/jquery.ime/blob/master/rules/ml/ml-transliteration.js

I am using the output of this code in my gadget http://www.wikidata.org/wiki/MediaWiki:Gadget-SimpleTransliterate.js (http://commons.wikimedia.org/wiki/File:Wikidata_Transliteration_Gadget.png), which is why I would like it to be developed a little further.


Version: unspecified
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=73410

bzimport added a project: Pywikibot-General. Via Conduit, Nov 22 2014, 2:33 AM
bzimport added a subscriber: Unknown Object (????).
bzimport set Reference to bz56524.
bzimport created this task. Via Legacy, Nov 2 2013, 11:28 PM
bzimport added a comment. Via Conduit, Nov 3 2013, 9:03 PM

nzmoihue wrote:

I've added Malayalam, Gurmukhi, Gujarati and Oriya to my gadget: http://www.wikidata.org/w/index.php?title=MediaWiki%3AGadget-SimpleTransliterate.js&diff=83565215&oldid=83539622

bzimport added a comment. Via Conduit, Nov 14 2013, 11:09 AM

nzmoihue wrote:

This could be ported to add Chinese support: http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/

gerritbot added a comment. Via Conduit, Nov 22 2013, 5:15 PM

Change 97040 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040

gerritbot added a comment. Via Conduit, Nov 22 2013, 5:19 PM

Change 97044 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044

gerritbot added a comment. Via Conduit, Nov 25 2013, 8:23 PM

Change 97040 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040

gerritbot added a comment. Via Conduit, Nov 25 2013, 8:27 PM

Change 97044 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044

Ladsgroup added a comment. Via Conduit, Nov 26 2013, 6:48 AM

Both patches have been merged.

bzimport added a comment. Via Conduit, Nov 26 2013, 7:23 AM

nzmoihue wrote:

Reopened for Chinese transliteration

Ladsgroup added a comment. Via Conduit, Nov 26 2013, 9:40 AM

Can you give me a list of the Chinese characters that need to be added to this list?

Aklapper added a comment. Via Conduit, Nov 26 2013, 10:24 AM

(For future reference, defining the exact scripts to be supported in a bug request is welcome. If it's just about "support more", then a report can easily become unfixable as comments keep broadening its scope.)

bzimport added a comment. Via Conduit, Nov 26 2013, 10:43 AM

nzmoihue wrote:

#c3

Ladsgroup added a comment. Via Conduit, Nov 26 2013, 10:51 AM

I checked that source but I couldn't find the dictionary file. [1] says there is a file named CTLauBig5.tit, but there isn't. Can you be more precise about the dictionary?

[1] http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/lib/Lingua/ZH/Romanize/DictZH.pm

bzimport added a comment. Via Conduit, Nov 26 2013, 11:02 AM

nzmoihue wrote:

There is no one-to-one "dictionary" there, which is why I CC'd the original writer of the transliteration. Also have a look at https://github.com/axgle/pinyin
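
The point about there being no one-to-one dictionary can be illustrated with a minimal sketch. It assumes a hypothetical `READINGS` table in which one character may carry several pinyin readings; the names `READINGS`, `romanize` and `ambiguous` are illustrative only and do not come from pywikibot or from axgle/pinyin. Picking the first listed reading is a simplification that will be wrong for some words.

```python
# Hypothetical table: one character may have more than one reading.
READINGS = {
    '行': ['xing', 'hang'],
    '重': ['zhong', 'chong'],
    '中': ['zhong'],
}


def romanize(text, readings=READINGS, sep=' '):
    """Romanize text by taking the first listed reading of each character.

    Characters missing from the table are passed through unchanged.
    """
    out = []
    for ch in text:
        options = readings.get(ch)
        out.append(options[0] if options else ch)
    return sep.join(out)


def ambiguous(text, readings=READINGS):
    """Return the characters in text that have more than one reading."""
    return [ch for ch in text if len(readings.get(ch, [])) > 1]


print(romanize('中行'))
print(ambiguous('中行重'))
```

A flat character-to-string table, as used for alphabetic scripts, cannot express the second reading at all, which is why a plain lookup is only an approximation for Chinese.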

zhuyifei1999 added a comment. Via Conduit, Aug 31 2014, 6:28 AM

Created attachment 16327
Python translation of https://github.com/axgle/pinyin

(In reply to [no longer active user] from comment #14)

There is not a one-to-one "dictionary" there, that is why I CCd original
writer of the transliteration. Also have a look at
https://github.com/axgle/pinyin

{{done}} Translated the four scripts to Python. See attachment.

Attached: zhtransliteration.py

zhuyifei1999 added a comment. Via Conduit, Aug 31 2014, 10:31 AM

Created attachment 16327 [details]
Python translation of https://github.com/axgle/pinyin

However, there is a bug that causes 6651 Chinese characters to be transliterated as 'zuo'.

Attached: zhtransliteration.py
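
The thread does not say what caused the 'zuo' problem. As a rough way to spot this kind of symptom, the sketch below counts how many characters share each reading in a character-to-reading table and reports readings that are suspiciously common. The function name `overused_readings`, the threshold and the sample table are assumptions for illustration, not part of the attached script.

```python
from collections import Counter


def overused_readings(table, threshold):
    """Return {reading: count} for readings shared by at least `threshold` characters."""
    counts = Counter(table.values())
    return {reading: n for reading, n in counts.items() if n >= threshold}


# Tiny hypothetical table; a real table would have thousands of entries.
sample = {'左': 'zuo', '作': 'zuo', '坐': 'zuo', '中': 'zhong'}
print(overused_readings(sample, 3))  # {'zuo': 3}
```

On a full table, a reading held by several thousand characters would stand out immediately and point at a faulty conversion step.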

valhallasw added a comment. Via Conduit, Aug 31 2014, 11:01 AM

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

gerritbot added a comment. Via Conduit, Aug 31 2014, 11:49 AM

Change 157498 had a related patch set uploaded by Zhuyifei1999:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

zhuyifei1999 added a comment. Via Conduit, Aug 31 2014, 12:18 PM

(In reply to Merlijn van Deen from comment #17)

I suppose we can add this, but what's the intended use case? We support full
unicode console output (and input, but transliteration is output-only) on
all systems.

Yes, that is indeed a hard question. Why do we still have transliteration.py?

gerritbot added a comment. Via Conduit, Sep 1 2014, 10:32 AM

Change 157498 merged by jenkins-bot:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

jayvdb added a comment. Via Conduit, Oct 7 2014, 8:14 AM

Reopen if there is more to be done.

zhuyifei1999 added a comment. Via Conduit, Oct 7 2014, 8:37 AM

(In reply to John Mark Vandenberg from comment #21)

Reopen if there is more to be done.

zh-hant (Traditional Chinese) is still needed.

valhallasw added a comment. Via Conduit, Oct 7 2014, 9:10 AM

I will repeat my question:

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?

zhuyifei1999 added a comment. Via Conduit, Oct 18 2014, 3:43 PM

(In reply to Merlijn van Deen from comment #23)

I will repeat my question:

> I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?

I suppose that when the output is somehow ASCII-limited (for some reason, the log files written by the grid engine on Tool Labs are an example of this), transliterated output could be more useful than a pile of question marks or other unreadable code.
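
The difference described here can be shown with a small sketch. It compares Python's built-in `errors='replace'` handling with a table-based fallback; `to_ascii` and `SAMPLE_TABLE` are hypothetical names for illustration and are not the pywikibot implementation.

```python
def to_ascii(text, table):
    """Return an ASCII-only rendering of text.

    Non-ASCII characters found in `table` are replaced by their
    romanization; any other non-ASCII character becomes '?'.
    """
    parts = []
    for ch in text:
        if ord(ch) < 128:
            parts.append(ch)
        else:
            parts.append(table.get(ch, '?'))
    return ''.join(parts)


# Hypothetical two-entry table, for illustration only.
SAMPLE_TABLE = {'中': 'zhong', '文': 'wen'}

# What an ASCII-limited log would show without transliteration.
plain = '中文 page'.encode('ascii', errors='replace').decode('ascii')
# The same text with the table-based fallback.
readable = to_ascii('中文 page', SAMPLE_TABLE)

print(plain)     # ?? page
print(readable)  # zhongwen page
```

The fallback only helps for characters the table covers; everything else still degrades to a question mark.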

Liuxinyu970226 removed a subscriber: Liuxinyu970226. Via Web, Tue, Mar 3, 12:04 AM
Liuxinyu970226 added a subscriber: Liuxinyu970226. Via Web, Mon, Mar 9, 7:37 AM
