Install Extension:Transliterator on fr, pl and en.wiktionary

Description

Author: conrad.irwin

Description:
Please install the Transliterator extension (in SVN and documented at http://www.mediawiki.org/wiki/Extension:Transliterator) on the English Wiktionary. The consensus for this is at http://en.wiktionary.org/w/index.php?oldid=7110737.


Version: unspecified
Severity: enhancement

bzimport added a subscriber: Unknown Object (MLST).
bzimport set Reference to bz20246.
bzimport created this task.Via LegacyAug 14 2009, 9:01 PM
brion added a comment.Via ConduitAug 23 2009, 1:32 AM

Assigning to myself for review.

bzimport added a comment.Via ConduitNov 23 2009, 2:25 PM

conrad.irwin wrote:

Is there anything further I can do to accelerate progress here?

bzimport added a comment.Via ConduitDec 21 2009, 4:51 PM

conrad.irwin wrote:

status update: This has been glanced at by werdna (a while back), Roan and Alphos (yesterday) and a few minor issues have been resolved.

bzimport added a comment.Via ConduitMar 9 2010, 11:43 PM

conrad.irwin wrote:

Ping, again...

As murmurings the last time it was looked at indicated that people would be more comfortable with it using strtr() internally, this is what it now does.

Some feedback would be wonderful.

Yair_rand added a comment.Via ConduitJun 29 2010, 6:10 AM

Um, is anybody working on this? The communities have been waiting for almost eleven months for this to be installed. Will this be done anytime soon?

Reedy added a comment.Via ConduitJun 29 2010, 6:13 AM

Seems not. It's not been reviewed, so it has no chance of being deployed until it has been.

Yair_rand added a comment.Via ConduitJun 29 2010, 6:19 AM

Is anyone working on reviewing it, or planning to review it at some point?

JackPotte added a comment.Via ConduitJul 10 2010, 9:49 PM

Personally, I'm using http://demo.icu-project.org/icu-bin/translit while waiting for the extension.

bzimport added a comment.Via ConduitAug 14 2010, 3:43 AM

internoob2010 wrote:

This bug is celebrating its first birthday today. Someone please resolve it!

siebrand added a comment.Via ConduitAug 29 2010, 1:21 PM

Cannot go to keyword shell before it has been reviewed.

tstarling added a comment.Via ConduitMar 24 2011, 6:58 AM

What is it for? I have read this bug report, the extension page and the Wiktionary vote page, and have found no answers.

bzimport added a comment.Via ConduitMar 24 2011, 3:21 PM

msh210+wmfbugzilla wrote:

(In reply to comment #12)

> What is it for?

Wiktionaries transliterate words. In particular, English Wiktionary (which happens to be the one I'm most familiar with) transliterates into English any words written in a non-Latin script, for the benefit of its (anglophone) readers. This is now done manually, but often can be done automatically according to set rules, depending on the language being transliterated. (The "maps" described in the extension description are intended to be, generally, one per language.) For example, if we were transliterating Spanish (which we don't, but it's an easy example to give), we might have a map that says (pseudocode):
ll maps to y
j maps to h
á maps to a
and so on. This will both relieve people from having to transliterate manually and increase the number of entries that have transliterations.
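
A map like the Spanish one above could be applied roughly as follows. This is only an illustrative Python sketch, not the extension's code (the extension is PHP and, per an earlier comment, uses strtr() internally); the names `transliterate` and `SPANISH_MAP` are hypothetical. At each position it tries the longest key first and never rescans text it has already replaced, which is how PHP's strtr() behaves when given an array of pairs.

```python
def transliterate(text, mapping):
    """Apply a transliteration map, longest key first, scanning left to right.

    Replaced output is never searched again, so a rule's result cannot
    trigger another rule.
    """
    # Longest key length bounds how far ahead we need to look.
    max_len = max((len(key) for key in mapping), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk at this position first.
        for length in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in mapping:
                out.append(mapping[chunk])
                i += length
                break
        else:
            # No rule matched here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical map mirroring the pseudocode example.
SPANISH_MAP = {
    "ll": "y",
    "j": "h",
    "\u00e1": "a",  # á (precomposed)
}

print(transliterate("llama", SPANISH_MAP))
```

Trying the longest key first matters as soon as a map contains both a digraph and one of its letters: without it, a rule for "l" would fire before the "ll" rule ever got a chance.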

bzimport added a comment.Via ConduitMar 24 2011, 3:26 PM

michael wrote:

And, perhaps most importantly, should eliminate errors and the use of transliteration schemes that are non-standard, inconsistent, and illogical.

bzimport added a comment.Via ConduitMar 24 2011, 4:36 PM

conrad.irwin wrote:

Transliterations are used particularly in translations tables where the alphabet of the destination language is not Latin (see http://en.wiktionary.org/wiki/Uzbekistan#Translations), and throughout entries in non-Latin alphabets, see for example http://en.wiktionary.org/wiki/%D5%88%D6%82%D5%A6%D5%A2%D5%A5%D5%AF%D5%BD%D5%BF%D5%A1%D5%B6.

For further context:

http://wikt.jelzo.com/wiki/Test:el shows a comparison of Greek words and their automatic transliteration versus the transliterations that existed on Wiktionary at some point during 2010.

http://en.wiktionary.org/wiki/User_talk:Conrad.Irwin/Transliterator.php contains some desired transliteration maps for various languages, along with which standards they correspond to (where applicable).

tstarling added a comment.Via ConduitMar 25 2011, 12:24 AM

Are these tables in any way similar to the tables used for transliteration by ICU? I wrote a PHP extension a couple of years ago which provides an interface to ICU's transliteration functions to PHP:

http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/transliterate/

The original idea was to use it to display transliterations of foreign names that come to a wiki via CentralAuth, but I never developed it any further. Would it be useful to provide access to these ICU transliterators via the extension you have developed?

bzimport added a comment.Via ConduitMar 28 2011, 12:10 AM

conrad.irwin wrote:

Yes, I imagine that would be very useful, though not absolutely necessary.

MarkAHershberger added a comment.Via ConduitMay 6 2011, 10:45 PM

A few revisions on this were marked "deferred" when they should be reviewed or at least marked "old." I've gone and changed the status to "new" and will send out an email asking people to review it. See http://www.mediawiki.org/w/index.php?title=Special:Code/MediaWiki/status/new&path=%2Ftrunk%2Fextensions%2FTransliterator

brion added a comment.Via ConduitMay 6 2011, 11:39 PM

Since this is a new extension and nobody remembers what it was, it's not worth worrying about individual old revisions in code review -- the extension should simply be reviewed as a whole.

brion added a comment.Via ConduitMay 7 2011, 1:15 AM

Overall the code looks pretty nice and has comments and stuff which makes me happy. ;)

Couple things that stick out to me:

  • static functions as hook callbacks are intermixed with a non-static singleton class which feels a bit odd to me; it's hard to tell what's what sometimes.
  • not sure what's up with the mPages, mMaps member variables; what's the lifetime of the ExtTransliterator object? If batch jobs are running, will this in-process cache get updated by actions in another process? Or will it be discarded between jobs within the same process?

It looks like a new object is created at ParserFirstCallInit time... offhand I'm not sure whether a new parser will get created for job runs or not. Probably won't break anything in practice, but it's worth looking at -- in-process caches are always dangerous in a multi-node environment.

  • Several functions accept reference parameters, like isMapPage( &$title ) which don't appear that they should. Unless it's possible to *replace an entire object parameter with another object* or *alter a scalar value or array contents*, references should not be used. If these are to match hook definitions, in many cases the hooks probably need fixing upstream, and when they are fixed these functions will fail on PHP 5.3 unless they are also fixed to remove the references; I'd also recommend naming the functions with an 'on' prefix and the actual name of the calling hook if possible, to make it clearer what's going on.

In general I'd also recommend looking out for what happens when you're given a huge amount of input; the default NFD decomposition implementation is not very efficient, and it might be more likely to keel over and die if, say, you accidentally don't close the tag and try to transliterate a very large page full of non-English text.
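
One possible guard against the large-input case is to refuse to decompose text above a size cap. The following is a minimal Python sketch under stated assumptions: the cap `MAX_INPUT_CHARS` and the function name `decompose_bounded` are hypothetical and not taken from the extension, and the standard library's `unicodedata.normalize` stands in for MediaWiki's own NFD implementation.

```python
import unicodedata

# Hypothetical cap; a real value would need to be chosen by measurement.
MAX_INPUT_CHARS = 10_000


def decompose_bounded(text, limit=MAX_INPUT_CHARS):
    """Return the NFD form of text, or None if text exceeds the limit.

    Returning None lets the caller fall back to showing the input
    untransliterated instead of spending time on a runaway tag.
    """
    if len(text) > limit:
        return None
    return unicodedata.normalize("NFD", text)


print(decompose_bounded("\u00e1") == "a\u0301")
```

Checking the length before normalizing keeps the cost of an unclosed tag bounded, at the price of leaving very long (but legitimate) input untransliterated.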

Yair_rand added a comment.Via ConduitAug 15 2011, 3:48 AM

This bug is now two years old. Is there any chance it will be resolved any time soon?

MarkAHershberger added a comment.Via ConduitAug 16 2011, 4:59 PM

(In reply to comment #21)

> This bug is now two years old. Is there any chance it will be resolved any time
> soon?

If you want this deployed, then I suggest you find someone to work on the issues Brion has raised. Otherwise, this is likely to sit longer.

bzimport added a comment.Via ConduitAug 20 2011, 8:01 AM

beau wrote:

The Polish Wiktionary community is also interested in using this extension:
https://secure.wikimedia.org/wiktionary/pl/w/index.php?title=Wikis%C5%82ownik:Bar&oldid=2318567#Rozszerzenie_Transliterator

bzimport added a comment.Via ConduitMay 25 2012, 5:01 PM

sumanah wrote:

Conrad, could you respond to Brion's thoughts? Also, I encourage you to use developer access to move this to Git:

https://www.mediawiki.org/wiki/Developer_access

https://www.mediawiki.org/wiki/Git/Conversion/Extensions_queue

Aklapper added a comment.Via ConduitJan 23 2013, 12:36 PM

Conrad / Beau: Could you comment on comment 20, please?

Also, see https://www.mediawiki.org/wiki/Writing_an_extension_for_deployment for general information on what is needed to get an extension reviewed before potentially deploying it on a wiki site.

bzimport added a comment.Via ConduitMar 16 2013, 11:01 AM

beau wrote:

I don't think we need to consider pl.wiktionary anymore. We have implemented transliteration using lua: https://pl.wiktionary.org/wiki/Module:transliterator

greg added a comment.Via ConduitAug 22 2013, 10:12 PM

Looks to be stalled for a long time. I'm closing this request and people can reopen if needed. There are workarounds available (see comment 26).

bzimport added a comment.Via ConduitAug 23 2013, 2:55 AM

wmf.amgine3691 wrote:

Ignored to death, resulting in a community hack which has to be recreated/reimplemented on every wiktionary. Brilliant wontfix.

But thoroughly predicted.

Aklapper added a comment.Via ConduitAug 23 2013, 9:50 AM

I agree handling of this ticket wasn't the best - comment 20 and comment 22 explain why this was and is stalled. If this is still wanted feel free to reopen, but it will still need somebody to fix the code first.

bzimport added a comment.Via ConduitAug 27 2013, 3:37 AM

wmf.amgine3691 wrote:

The extension is still needed. Without it, every wiktionary which transliterates into the local script will do so manually, or using local template/module systems. It is still requested on at least English and French wiktionaries. I believe EL also wants it, but has assumed it will never happen. (There are already transliteration rules for EL in this extension. Ariel helped Conrad in the development of the extension.)

I do not know of anyone associated with wiktionary (and therefore familiar with the issues) who could fix the code *and* is willing to do hoop dancing for WMF devs (and therefore able to get through the iterative processes necessary to get a deity to merge it.)

Nemo_bis added a comment.Via ConduitAug 27 2013, 9:37 AM

Amgine's frustration is deserved/understandable, but let's see what small sacrifices we can make to the deities in question to help them help us (in Italian we say: aiutati che il ciel t'aiuta). I've checked the last version of the review queue checklist: https://www.mediawiki.org/w/index.php?title=Review_queue&oldid=771682

  1. ok
  2. ok
  3. bug 53393
  4. ok
  5. screencast missing
  6. done now with +design keyword
  7. ok (review done by Brion, comment 20)
  8. ok³

greg added a comment.Via ConduitAug 28 2013, 11:03 PM

(In reply to comment #31)

> I've checked the last version of the review queue checklist:
> https://www.mediawiki.org/w/index.php?title=Review_queue&oldid=771682
>
>   1. ok
>   2. ok
>   3. bug 53393
>   4. ok
>   5. screencast missing
>   6. done now with +design keyword

For the design review, I would recommend emailing the design mailing list: https://lists.wikimedia.org/mailman/listinfo/design

>   7. ok (review done by Brion, comment 20)

Have the changes been made that Brion suggested? There may need to be more back and forth here.

>   8. ok³

Links for En and Fr:
https://en.wiktionary.org/w/index.php?oldid=7110737

https://fr.wiktionary.org/wiki/Sp%C3%A9cial:Filtre_antiabus#Mod.C3.A8le_pour_une_section_Translitt.C3.A9rations (I can't view this, apparently)

Nemo_bis added a comment.Via ConduitAug 29 2013, 10:36 AM

(In reply to comment #32)

> For the design review, I would recommend emailing the design mailing list:
> https://lists.wikimedia.org/mailman/listinfo/design

Presumably best after a screenshot or something is produced? It's not clear to me what there is to review, this seems to be just a parser function.

> > 7) ok (review done by Brion, comment 20)
>
> Have the changes been made that Brion suggested? There may need to be more
> back and forth here.

Well, "the code looks pretty nice" looks good enough a review. Missing pieces should be filed as separate bugs.

ViveLaRosiere added a comment.Via ConduitNov 22 2013, 6:06 PM

Consensus from 3 and a half years ago, well... "Low enhancement". Seriously, should I laugh or cry?

bzimport added a comment.Via ConduitNov 22 2013, 7:20 PM

wmf.amgine3691 wrote:

(In reply to comment #35)

> Consensus from 3 and a half years ago, well... "Low enhancement". Seriously,
> should I laugh or cry?

4+ years total. We've gone through the full gamut: excitement, begging, demanding, giggling, shouting, crying...

Aklapper added a comment.Via ConduitNov 23 2013, 5:22 PM

You could go through previous comments and check what has been requested and has not been done by anybody yet who is interested to get this fixed.
That's more productive than shouting and crying.

To start with comment 32: Please provide a link to the design review request on the design mailing list (http://lists.wikimedia.org/pipermail/design/). Adding it here helps to keep the process/progress transparent.

(In reply to comment #34)

> > 7) ok (review done by Brion, comment 20)
>
> Have the changes been made that Brion suggested? There may need to be more
> back and forth here.

> Well, "the code looks pretty nice" looks good enough a review. Missing pieces
> should be filed as separate bugs.

Nobody has done this so it is obviously a very good next step.
I currently see zero Transliterator reports in Bugzilla (open or closed).

Sidenote: Only https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FTransliterator/HEAD/Transliterator.php mentions a license (GPL 2.0) for this extension; the other files do not. I mention this because https://www.mediawiki.org/wiki/Extension:Transliterator is in Category:Extensions with unknown license.

bzimport added a comment.Via ConduitNov 23 2013, 5:54 PM

wmf.amgine3691 wrote:

I believe CIrwin was the last semi-active developer with the Wiktionary project. Needless to say his experiences are the exemplar being passed down through generations of wiktionary contributors.

There's no one from the project who *can* do what you suggest, Andre.

Aklapper added a comment.Via ConduitNov 23 2013, 11:50 PM

In that case I don't see a good way forward here except for somebody picking up (at least temporary) maintainership of Transliterator to implement the "suggestions".
In general, deploying unmaintained code sounds like a bad idea (though in this case it's not "much" code so it might be less of a problem).

mxn added a subscriber: mxn.Via WebNov 24 2014, 8:58 PM
