Structured localization framework for Scribunto modules
Open, Needs Triage · Public

Description

For many more details about this project, see https://www.mediawiki.org/wiki/Translatable_modules

Scribunto modules that are useful in more than one language need a convenient and uniform localization framework for translating their messages.

This is needed for modules that are used across wikis, and even for modules that are used in multilingual wikis such as Commons or Wikidata.

At the moment there is no such framework. Like templates, modules can be translated by copying the module code to another wiki, going through its wiki syntax, and changing the strings. This is very flexible, but also extremely inefficient: it creates a forked copy, severs the connection to the original module, and prevents proper code reuse and collaboration.

There is also the TNT system, which is based on the Translate extension's page translation capability, but it is primarily made for templates, and has various disadvantages (see T238411).

Some modules try to include capability for translations using arrays indexed by language code, but there is no uniform framework for it nor a tool that allows their translation without diving into Lua code.
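As a rough illustration of the ad hoc pattern described here, a module might keep its strings in a table indexed by language code and fall back to English. This is only a sketch: the table contents and the getMessage helper are hypothetical and not taken from any particular module.

```lua
-- Hypothetical message table, indexed first by language code, then by message key.
local messages = {
	en = { greeting = 'Hello', farewell = 'Goodbye' },
	de = { greeting = 'Hallo' }, -- 'farewell' is deliberately left untranslated
}

-- Look up a message, falling back to English and finally to the bracketed key.
local function getMessage(key, lang)
	local translated = messages[lang] and messages[lang][key]
	return translated or messages.en[key] or ('<' .. key .. '>')
end

print(getMessage('greeting', 'de')) -- Hallo
print(getMessage('farewell', 'de')) -- Goodbye (English fallback)
```

Adding a language in this style means editing the Lua table directly, which is the part a dedicated translation interface would replace.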

It would be much nicer if modules could be translated the same way extensions are:

  • Having the same underlying algorithmic and page layout code for all the languages, but separate translations.
  • Using a dedicated translation interface so that translators wouldn't have to deal with any code or wiki syntax (probably with the Translate extension).

My user-level proposal for how this will be done is described in more detail on the page Global_templates/Draft_spec/TLDR and in even more detail on the page Global_templates/Draft_spec.

Some steps towards making this a reality:

  • Some adaptations in the Translate extension: Showing translatable modules in the message group selector, and possibly other things.
  • Decision on which syntax to use for inserting messages into the module code. This is already possible with https://www.mediawiki.org/wiki/Extension:Scribunto/Lua_reference_manual#Message_library , but could perhaps be enhanced.
  • Decision on where to store the messages, and how to customize them per project.
  • And probably more things.
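For reference, the existing Message library mentioned in the second step can be used roughly as follows. This is a minimal sketch, assuming a hypothetical message key example-greeting, which would have to exist as the wiki page MediaWiki:Example-greeting. The lib parameter is an addition made here only so that the function can be exercised outside Scribunto; inside a module it defaults to mw.message.

```lua
-- Build a greeting from the (hypothetical) message 'example-greeting'.
-- `lib` defaults to Scribunto's mw.message; it is injectable only for testing.
local function greeting(name, langCode, lib)
	lib = lib or mw.message
	-- The extra argument fills $1 in the message text.
	local msg = lib.new('example-greeting', name)
	if langCode then
		-- inLanguage() returns the message object, so calls can be chained.
		msg = msg:inLanguage(langCode)
	end
	-- plain() substitutes the parameters and returns the text unparsed.
	return msg:plain()
end
```

In a module this would be called as greeting('Amir', 'de'), which reads the German version of the message if one exists.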

Unlike templates, modules are probably easier and better equipped for becoming properly localizable. For example, it's desirable to allow the translation of template titles and parameter names, because they are frequently used by end users who edit wiki content pages, but since modules are rarely transcluded directly into content pages, this feature is less essential for modules.

A truly comprehensive solution will only come around when it becomes possible to efficiently share templates and modules across wikis (T41610, T52329, T121470), but some steps towards designing it and making the necessary modifications in Translate can possibly be made earlier.

Some good steps towards designing and implementing such a thing were made in T122086: RFC: Sharing templates and modules between wikis - poor man's version (investigation), and a lot of the ideas from that project could be shared with this one.

This task is specifically for modules. Templates and gadgets have some similar requirements, but they are handled in other tasks. See also:

Event Timeline

Amire80 created this task. Nov 15 2019, 4:58 PM
Restricted Application added a subscriber: Liuxinyu970226. Nov 15 2019, 4:58 PM
Anomie added a subscriber: Anomie.

At the moment there is no such framework.

There is, it's just not at all convenient (it requires individual MediaWiki-namespace pages to be created for each message) so I'd guess few if any modules actually use it. You even linked it later in the post: https://www.mediawiki.org/wiki/Extension:Scribunto/Lua_reference_manual#Message_library

What we probably want is something closer to the banana file format for storing the messages. I wonder whether translations could then be done via translatewiki?
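For illustration, a banana-style message file is a flat JSON object per language that maps message keys to strings, with an @metadata entry for authorship. The sketch below assumes such a file has already been decoded into a Lua table; the keys, the author name and the formatMessage helper are all hypothetical.

```lua
-- Hypothetical Lua table mirroring a banana file such as en.json.
-- The '@metadata' entry holds authorship data rather than a message.
local en = {
	['@metadata'] = { authors = { 'ExampleUser' } },
	['example-welcome'] = 'Welcome, $1!',
	['example-count'] = '$1 of $2 pages',
}

-- Replace $1, $2, ... with the corresponding positional parameters.
local function formatMessage(template, ...)
	local params = { ... }
	return (template:gsub('%$(%d+)', function(n)
		local value = params[tonumber(n)]
		if value == nil then
			-- Returning nil keeps the placeholder when no parameter was given.
			return nil
		end
		return tostring(value)
	end))
end

print(formatMessage(en['example-welcome'], 'Amir')) -- Welcome, Amir!
print(formatMessage(en['example-count'], 3, 10))    -- 3 of 10 pages
```

Because each language lives in its own flat table, translators would only ever touch key-value pairs, never the module logic.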

Also, at the code level the structure of MessageValue and related classes recently added to MediaWiki core might be a better model than the MediaWiki Message class that mw.message was based on.

Amire80 updated the task description. Nov 15 2019, 10:07 PM
Izno added a comment. Nov 15 2019, 11:54 PM

It would be good to list out the ad hoc actual implementations of internationalization/localization.

jeblad added a subscriber: jeblad. Dec 14 2019, 7:00 AM

The message library only exposes a subset of methods, and the most troublesome gap is the lack of “plural”.
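To make the gap concrete, here is a minimal sketch of what plural selection involves. The rule table and the plural helper are hypothetical; real rules differ widely between languages and would normally come from CLDR data rather than being hand-written in a module.

```lua
-- Hypothetical plural rules: each function maps a number to a form index.
local pluralRules = {
	en = function(n) return n == 1 and 1 or 2 end,
	-- French treats both 0 and 1 as singular.
	fr = function(n) return (n == 0 or n == 1) and 1 or 2 end,
}

-- Pick the right form for `n`; unknown languages use the English rule,
-- and a missing form falls back to the last one supplied.
local function plural(lang, n, forms)
	local rule = pluralRules[lang] or pluralRules.en
	return forms[rule(n)] or forms[#forms]
end

print(0, plural('en', 0, { 'page', 'pages' })) -- 0 pages
print(0, plural('fr', 0, { 'page', 'pages' })) -- 0 page
```

Languages with more than two plural forms would need longer form lists and more elaborate rules, which is why this is better handled by a shared framework than by each module.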

Note that using localized messages in content triggers several problems that aren't solved in the current message library.

Simplest solution to the core problem would be to pull in JSON as Lua tables, so localized files can be read, which should be pretty simple. This is nearly identical to the stalled update of TemplateData.

Simplest solution to the core problem would be to pull in JSON as Lua tables, so localized files can be read, which should be pretty simple. This is nearly identical to the stalled update of TemplateData.

Do you have any info about this stalled update?

Partially related patch (5 years old):
https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Scribunto/+/158323/
It proposed moving Module:Arguments from enwiki to MediaWiki and adding translation capabilities to getArgs (translate).

TheDJ removed a subscriber: TheDJ. Jan 29 2020, 11:25 PM
Zolo added a subscriber: Zolo. Feb 20 2020, 2:56 PM
LMN8 added a subscriber: LMN8. Mar 20 2020, 10:49 AM

Btw, Wikia users have written a pure Lua module like this:
https://dev.fandom.com/wiki/Global_Lua_Modules/I18n

Vutting renamed this task from Structured localization framework for Scribunto modules to vservices. Mar 21 2020, 6:16 AM
Vutting closed this task as Resolved.
DannyS712 renamed this task from vservices to Structured localization framework for Scribunto modules. Mar 21 2020, 6:20 AM
DannyS712 reopened this task as Open.
He7d3r added a subscriber: He7d3r. Jun 29 2020, 4:02 PM
jhsoby added a subscriber: jhsoby. Jul 29 2020, 1:04 PM

Btw, Wikia users have written a pure Lua module like this:
https://dev.fandom.com/wiki/Global_Lua_Modules/I18n

I only noticed this amazing thing now. What is truly cool is not even the i18n part, but the "global modules" part. How did Fandom do it? Did they fork Scribunto and add this? Is this something we can reuse to resolve T41610?

cscott added a subscriber: cscott. Aug 18 2020, 4:07 PM

Just to throw in another monkey-wrench: this may be an opportunity to revisit the core language underpinnings of Scribunto, to switch to a language which is *fully* localizable.

https://en.wikipedia.org/wiki/File:Wikimania_2019_-_Multilingual_JavaScript.pdf
Video: https://www.youtube.com/watch?v=SomTEzaoROQ&t=1973

Also: it's not just the messages inside the template, it's also the name of the template itself and the name of every parameter to the template. If there are keyword arguments/enumerations, they need to be localized as well; ideally in a framework which does not require hand-coding by individual Scribunto module authors. (As a member of the Parsing Team, we prefer this to be done at the wikitext level, so that it is available for all templates as well as scribunto modules and other extensions.) This is discussed some in T239294.

Just to throw in another monkey-wrench: this may be an opportunity to revisit the core language underpinnings of Scribunto, to switch to a language which is *fully* localizable.

https://en.wikipedia.org/wiki/File:Wikimania_2019_-_Multilingual_JavaScript.pdf
Video: https://www.youtube.com/watch?v=SomTEzaoROQ&t=1973

That would be nice, and can be done separately, but we already have all this Lua code on the wikis, with probably hundreds of thousands of lines, which are providing useful, essential functionality, and which aren't going to magically rewrite themselves in JavaScript. But doing just the relatively small thing of improving how they handle localization sounds quite doable.

Also: it's not just the messages inside the template, it's also the name of the template itself and the name of every parameter to the template.

This task here is about modules, not templates. Templates are discussed in T239294, as you say. For templates, it's indeed important to localize the title and the parameter names, because they are often used by editors in literally all kinds of wiki pages, most of which are written in the language of the wiki. For modules, localizing the title and the parameters is on a spectrum of:

  • maybe nice to have, but not required - for the same reason as templates, but much less important because modules are rarely directly inserted into wiki pages. Modules are usually used in templates, and they are mostly maintained and used by experienced developers who don't mind using a piece of code with identifiers in a language that is different from the language of the wiki.
  • definitely unnecessary - given the above point, and the cost to develop it, no one will probably complain if names and parameters are not translated. I'm happy to hear different opinions, though.
  • harmful - because Lua is a programming language based on English and ASCII, like most programming languages, it may be actively harmful to encourage people to use non-ASCII and non-English identifiers. But again, I'm happy to hear different opinions.

If there are keyword arguments/enumerations, they need to be localized as well;

I'm not sure I understand this part. What exactly do you refer to when you say "keyword arguments/enumerations"? Is this something in modules? Or in templates? Or TemplateData? (This may be an embarrassing question, so I'll just admit that I'm really not a Lua expert.)

ideally in a framework which does not require hand-coding by individual Scribunto module authors.

Generally, the intention is indeed to move the localization into a consistent framework, which would be separate from raw Lua code in a way that is comparable to how it is done with extensions, with the necessary corrections for the Lua realities, for example that modules are stored as wiki pages and not as files in a file system or Git. If that's what you want, then we are on the same page. How exactly it will look is not decided yet, but hopefully will be decided soon, and your suggestions are very welcome.

(As a member of the Parsing Team, we prefer this to be done at the wikitext level, so that it is available for all templates as well as scribunto modules and other extensions.) This is discussed some in T239294.

What exactly do you mean by "at the wikitext level"?

Hasley added a subscriber: Hasley. Aug 24 2020, 8:29 PM
Zache added a subscriber: Zache. Aug 24 2020, 11:17 PM

About parameter translation: modules have access to the parent frame, which they often take advantage of. This means that modules directly access the parameters defined when transcluding the template, that is, the parameter name is the same in the article and the module, with no template sitting in between and translating, so these parameter names need to be translated in order to avoid English parameter names popping up in articles. However, parameter translation should work differently than output text translation: output texts’ language should depend on UI or page language (e.g. Commons file pages’ content depends on UI language, and MediaWiki.org help pages’ content depends on page language), while parameters’ recognition should certainly not depend on UI language, but probably not even on page language, only on wiki language (so that e.g. pages translated using Translate’s page translation feature can embed parameter names in their non-translatable part).
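One way to picture the parameter side of this is a per-wiki alias table that maps localized parameter names onto canonical ones before the module logic runs. The following is only a sketch under that assumption: the alias tables and the normalizeArgs helper are hypothetical, and the language code passed in is meant to be the wiki's content language, not the UI or page language.

```lua
-- Hypothetical alias tables: localized parameter name -> canonical name,
-- selected by the wiki's content language.
local aliases = {
	hu = { ['cím'] = 'title', ['szerző'] = 'author' },
	de = { ['titel'] = 'title', ['autor'] = 'author' },
}

-- Return a copy of `args` with localized names replaced by canonical ones.
-- Names without an alias (including positional parameters) pass through.
-- If both an alias and its canonical name are given, which one wins is
-- unspecified here, because pairs() has no defined order.
local function normalizeArgs(args, wikiLang)
	local map = aliases[wikiLang] or {}
	local normalized = {}
	for name, value in pairs(args) do
		normalized[map[name] or name] = value
	end
	return normalized
end

local out = normalizeArgs({ ['cím'] = 'Example', author = 'Someone' }, 'hu')
print(out.title, out.author) -- Example	Someone
```

In a module, the args table would typically be frame:getParent().args, so that articles can use the localized names while the module code only deals with the canonical ones.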

About parameter translation: modules have access to the parent frame, which they often take advantage of. This means that modules directly access the parameters defined when transcluding the template, that is, the parameter name is the same in the article and the module, with no template sitting in between and translating, so these parameter names need to be translated in order to avoid English parameter names popping up in articles.

OK, yet again I have to admit that I'm not actually a big expert on modules :)

Can you give me an example of how modules can access the parameters of templates?

In any case, this particular task is only about translating strings within one wiki and not about reusing modules across wikis, although it would be nice to make it usable also for the future when modules will hopefully become sharable across wikis.

However, parameter translation should work differently than output text translation: output texts’ language should depend on UI or page language (e.g. Commons file pages’ content depends on UI language, and MediaWiki.org help pages’ content depends on page language), while parameters’ recognition should certainly not depend on UI language, but probably not even on page language, only on wiki language (so that e.g. pages translated using Translate’s page translation feature can embed parameter names in their non-translatable part).

Oh, definitely, no doubt about that. I already wrote about it in the long Global templates document in the section "Localizing parameters", as well as the one before it. I am now writing a detailed document about translatable modules, which I'll publish very soon, and it's mentioned there, too.

As I wrote above, when templates are global, translating their titles and parameter names (separately from human-readable strings) will definitely be necessary, but I'm really not sure that making the same possible for modules is necessary. I'm open to having my mind changed, however.

When I first experimented with “global modules”, I used separate data tables on Commons for the template parameters (https://commons.m.wikimedia.org/wiki/Data:Module:Music_charts/parameters.tab) and the text output (https://commons.m.wikimedia.org/wiki/Data:Module:Music_charts/labels.tab), among others; these were called by the central module, which eventually outputs the template code. This method would actually work fine for my needs, but the performance is terrible, so I never implemented it outside of testwiki …

Can you give me an example of how modules can access the parameters of templates?

-- Get frame object
local frame = mw.getCurrentFrame()

-- Parameters passed in the {{#invoke:}} call
local args = frame.args

-- Parameters passed to the template containing the {{#invoke:}} call
local pargs = frame:getParent().args

-- ...but it doesn’t work infinitely: the parent frame has no parent of its own
local grandparent = frame:getParent():getParent() -- nil

So if Template:Foo contains the wikicode {{#invoke:Foo|bar|x|y=z|b=a}}, and Module:Foo defines the above variables, and {{Foo|a|b=c|d=e}} is placed in an article, the variables will be (more or less):

args == {
	[1] = 'x',
	['y'] = 'z',
	['b'] = 'a'
}
pargs == {
	[1] = 'a',
	['b'] = 'c',
	['d'] = 'e'
}

Is this clear?

Amire80 added a comment. Edited Aug 25 2020, 2:23 PM

Can you give me an example of how modules can access the parameters of templates?

-- Parameters passed to the template containing the {{#invoke:}} call
local pargs = frame:getParent().args

Thanks, wasn't familiar with getParent()! But again, I'm not a true modules expert :)

Amire80 updated the task description. Sep 8 2020, 12:39 PM
mxn added a subscriber: mxn. Sep 9 2020, 5:15 AM
Aklapper removed a subscriber: Anomie. Oct 16 2020, 5:01 PM