Pywikibot transliteration should support Chinese transliteration
Open, Public

Description

Author: nzmoihue

Description:
https://github.com/wikimedia/pywikibot-core/blob/master/pywikibot/userinterfaces/transliteration.py should support more scripts, such as Korean, Chinese or Malayalam (ml). The jQuery.ime transliteration keyboards at https://github.com/wikimedia/jquery.ime/tree/master/rules could be used as a basis for this, for example https://github.com/wikimedia/jquery.ime/blob/master/rules/ml/ml-transliteration.js

I am using the output of this code in my gadget http://www.wikidata.org/wiki/MediaWiki:Gadget-SimpleTransliterate.js (http://commons.wikimedia.org/wiki/File:Wikidata_Transliteration_Gadget.png), which is why I would like it to be developed a little further.


Version: unspecified
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=73410

bzimport added a project: Pywikibot-General. Via Conduit, Nov 22 2014, 2:33 AM
bzimport set Reference to bz56524.
bzimport created this task. Via Legacy, Nov 2 2013, 11:28 PM
bzimport added a comment. Via Conduit, Nov 3 2013, 9:03 PM

nzmoihue wrote:

I've added Malayalam, Gurmukhi, Gujarati and Oriya to my gadget: http://www.wikidata.org/w/index.php?title=MediaWiki%3AGadget-SimpleTransliterate.js&diff=83565215&oldid=83539622

bzimport added a comment. Via Conduit, Nov 14 2013, 11:09 AM

nzmoihue wrote:

This can be ported for Chinese support http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/

gerritbot added a comment. Via Conduit, Nov 22 2013, 5:15 PM

Change 97040 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040

gerritbot added a comment. Via Conduit, Nov 22 2013, 5:19 PM

Change 97044 had a related patch set uploaded by Ladsgroup:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044

gerritbot added a comment. Via Conduit, Nov 25 2013, 8:23 PM

Change 97040 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97040

gerritbot added a comment. Via Conduit, Nov 25 2013, 8:27 PM

Change 97044 merged by jenkins-bot:
Improving transliteration support

https://gerrit.wikimedia.org/r/97044

Ladsgroup added a comment. Via Conduit, Nov 26 2013, 6:48 AM

Both patches have been merged.

bzimport added a comment. Via Conduit, Nov 26 2013, 7:23 AM

nzmoihue wrote:

Reopened for Chinese transliteration

Ladsgroup added a comment. Via Conduit, Nov 26 2013, 9:40 AM

Can you give me a list of the Chinese characters that need to be added to this list?

Aklapper added a comment. Via Conduit, Nov 26 2013, 10:24 AM

(For future reference, defining the exact scripts to be supported in a bug request is welcome. If it's just about "support more", then a report can easily become unfixable as comments keep broadening its scope.)

bzimport added a comment. Via Conduit, Nov 26 2013, 10:43 AM

nzmoihue wrote:

#c3

Ladsgroup added a comment. Via Conduit, Nov 26 2013, 10:51 AM

I checked that source but couldn't find the dictionary file. [1] says there is a file named CTLauBig5.tit, but there isn't one. Can you be more precise about the dictionary?

[1] http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/lib/Lingua/ZH/Romanize/DictZH.pm

bzimport added a comment. Via Conduit, Nov 26 2013, 11:02 AM

nzmoihue wrote:

There is no one-to-one "dictionary" there, which is why I CCed the original author of the transliteration. Also have a look at https://github.com/axgle/pinyin
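
One general reason a one-to-one table falls short for Chinese is that some characters have more than one reading depending on the word they appear in. The following is only a minimal sketch of that problem; the `READINGS` table and both helper functions are hypothetical and are not taken from the Perl module or from axgle/pinyin.

```python
# Hypothetical table for illustration: each character lists its candidate readings.
READINGS = {
    "银": ["yin"],
    "行": ["xing", "hang"],   # read "hang" in 银行 (bank)
    "重": ["zhong", "chong"],
}


def romanize_first(text, table=READINGS):
    """Romanize character by character, always taking the first listed reading."""
    return " ".join(table[ch][0] if ch in table else ch for ch in text)


def ambiguous(text, table=READINGS):
    """Return the characters in text that have more than one candidate reading."""
    return [ch for ch in text if len(table.get(ch, [])) > 1]


print(romanize_first("银行"))  # yin xing (the usual reading of the word is "yin hang")
print(ambiguous("银行"))       # ['行']
```

A per-character table therefore gives a readable approximation rather than a correct romanization, which may be acceptable if the goal is only legible output.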

zhuyifei1999 added a comment. Via Conduit, Aug 31 2014, 6:28 AM

Created attachment 16327
Python translation of https://github.com/axgle/pinyin

(In reply to [no longer active user] from comment #14)

> There is not a one-to-one "dictionary" there, that is why I CCd original
> writer of the transliteration. Also have a look at
> https://github.com/axgle/pinyin

{{done}} Translated the four scripts to Python. See attachment.

Attached: zhtransliteration.py

zhuyifei1999 added a comment. Via Conduit, Aug 31 2014, 10:31 AM

Created attachment 16327 [details]
Python translation of https://github.com/axgle/pinyin

However, there is a bug that causes 6651 Chinese characters to be transliterated as 'zuo'.

Attached: zhtransliteration.py

valhallasw added a comment. Via Conduit, Aug 31 2014, 11:01 AM

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

gerritbot added a comment. Via Conduit, Aug 31 2014, 11:49 AM

Change 157498 had a related patch set uploaded by Zhuyifei1999:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

zhuyifei1999 added a comment. Via Conduit, Aug 31 2014, 12:18 PM

(In reply to Merlijn van Deen from comment #17)

> I suppose we can add this, but what's the intended use case? We support full
> unicode console output (and input, but transliteration is output-only) on
> all systems.

Yes, that is indeed a hard question. Why do we still have transliteration.py?

gerritbot added a comment. Via Conduit, Sep 1 2014, 10:32 AM

Change 157498 merged by jenkins-bot:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

jayvdb added a comment. Via Conduit, Oct 7 2014, 8:14 AM

Reopen if there is more to be done.

zhuyifei1999 added a comment. Via Conduit, Oct 7 2014, 8:37 AM

(In reply to John Mark Vandenberg from comment #21)

> Reopen if there is more to be done.

Support for zh-hant (Traditional Chinese) is still needed.

valhallasw added a comment. Via Conduit, Oct 7 2014, 9:10 AM

I will repeat my question:

> I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?

zhuyifei1999 added a comment. Via Conduit, Oct 18 2014, 3:43 PM

(In reply to Merlijn van Deen from comment #23)

> I will repeat my question:
>
> > I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.
>
> /why/ is it needed?

I suppose that when the output is somehow ASCII-limited (the log files written by the grid engine on Tool Labs are, for some reason, an example of this), transliterated output could be more useful than a pile of question marks or other unreadable codes.
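
To make the use case concrete, here is a minimal sketch of an ASCII fallback, assuming a hypothetical `PINYIN` table and a hypothetical `to_ascii` helper; neither is the actual pywikibot transliteration API. It compares plain ASCII encoding with replacement against a table lookup.

```python
# Hypothetical mapping for illustration only; a real table would cover far more characters.
PINYIN = {"中": "zhong", "文": "wen"}


def to_ascii(text, table=PINYIN, unknown="?"):
    """Return an ASCII-only rendering of text.

    ASCII characters pass through unchanged, characters found in the table
    are replaced by their romanization, and everything else becomes `unknown`.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(table.get(ch, unknown))
    return "".join(out)


title = "zh: 中文"
# What an ASCII-limited log would show without transliteration.
print(title.encode("ascii", "replace").decode("ascii"))  # zh: ??
# The same title with the table-based fallback.
print(to_ascii(title))  # zh: zhongwen
```

Characters missing from the table still come out as question marks, so the benefit depends on how complete the mapping is.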

