
Multilingual development
Open, Low, Public

Description

Originally from: http://sourceforge.net/p/pywikipediabot/feature-requests/101/
Reported by: Anonymous user
Created on: 2007-08-06 17:47:39
Subject: Multilingual development
Original description:
Translated from the French original:

Hello,

I would like to propose a system that makes the bot multilingual. At present, all messages sent to the console are in English, yet a bot should adapt to the many languages its users may speak. I therefore propose the following system:

Create a new directory 'lang'. This directory would contain files named XX.py (XX being the ISO 639 code of the language), so it would hold one file per existing language code.

When the various scripts want to display a message on the console, the call used is very often 'wikipedia.output' or 'wikipedia.input'. That call would instead look up the message in the xx.py file, passing the number of the message to return as a parameter; xx would be chosen by the mylang variable in user-config.py. The xx.py file would then return the message to display, taking into account any %s-style (or other) placeholders, of course.

Example:
In user-config.py, I have "mylang = 'fr'".
Replace.py, at line 375, contains the call "wikipedia.input(u'Please enter the new text:')".

The new system would replace "wikipedia.input(u'Please enter the new text:')" with "wikipedia.input.message(284)", which would call lang/fr.py and ask it to return message no. 284, namely "s'il vous plais, entrez le nouveaux texte :".

I hope my request is clear.

Thank you for your attention.
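
For readers who want to see the shape of this proposal, here is a minimal sketch. It is not code from the bot: `CATALOGS` and `message` are hypothetical names, the per-language files `lang/xx.py` are inlined as dictionaries to keep the example self-contained, message 285 is invented for illustration, and the fallback to English is an assumption the request does not spell out.

```python
# Hypothetical stand-ins for lang/en.py and lang/fr.py: one table per
# language code, keyed by message number (284 is the number used in the
# request; 285 is made up for this sketch).
CATALOGS = {
    'en': {284: u'Please enter the new text:',
           285: u'Page %s moved to %s'},
    'fr': {284: u"s'il vous plais, entrez le nouveaux texte :"},
}


def message(number, *args, lang='fr'):
    """Return message `number` for `lang`, falling back to English."""
    table = CATALOGS.get(lang, {})
    template = table.get(number, CATALOGS['en'][number])
    # Fill in %s-style placeholders only when arguments were supplied.
    return template % args if args else template
```

The numeric keys make the call sites opaque (`message(284)` says nothing about the text), which is the readability concern raised later in the thread.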


Version: core-(2.0)
Severity: enhancement
See Also:
https://sourceforge.net/p/pywikipediabot/feature-requests/101

Details

Reference
bz55109

Event Timeline

bzimport raised the priority of this task from to Low.
bzimport set Reference to bz55109.
bzimport added a subscriber: Unknown Object (????).
Legoktm created this task. Oct 5 2013, 4:25 AM

Logged In: YES
user_id=880694
Originator: NO

Your approach works for static strings like wikipedia.input(u'Please enter the new text: '), but it doesn't work for dynamic ones like output(u'Page %s moved to %s' % (self.title(), newtitle)).

Instead, we could use gettext: http://docs.python.org/lib/module-gettext.html

What do you think?

  • labels: 745454 -->

Logged In: YES
user_id=687283
Originator: NO

I have no idea whether gettext is usable on non-Unix platforms, or whether it allows locally saved translations. If it does, I think we might give it a try. We would need to change all 'text %s %s' % (self.title(), newtitle) commands to their dictionary relatives, 'text %(title)s %(newtitle)s' % {'title': self.title(), 'newtitle': newtitle}. Not a small job, but doable, I suppose.

If gettext is not (easily) available on win32, we might write our own system. I do not like the idea of using integers though ;)

Logged In: YES
user_id=855050
Originator: NO

A good idea; however, it should be implemented using the standard Python 'gettext' module, rather than reinventing the wheel.

Logged In: YES
user_id=880694
Originator: NO

Current status:

I have changed the text colorization system so that it is now possible to internationalize colorized strings.

There is now a branch called i18n which has gettext support. It works fine, and the selflink.py script can be run in German. There is a new config variable to set the UI language; it defaults to mylang and falls back to English if the chosen language is unsupported.

One disadvantage is that source code readability suffers. Before:

choice = wikipedia.inputChoice(u'\nWhat shall be done with this selflink?', ['unlink', 'make bold', 'skip', 'edit', 'more context'], ['U', 'b', 's', 'e', 'm'], 'u')

After:

choice = wikipedia.inputChoice(_(u'\nWhat shall be done with this selflink?'), [_('unlink'), _('make bold'), _('skip'), _('edit'), _('more context')], [_('u [unlink hotkey]'), _('b [make bold hotkey]'), _('s [skip hotkey]'), _('e [edit hotkey]'), _('m [more context hotkey]')], _('u [unlink hotkey]'))

valhallasw and I have discussed how to modify inputChoice() to make it less cluttered, but we didn't find a convincing solution. For example, inputChoice could be changed so that it works like this:

choice = wikipedia.inputChoice(_(u'What shall be done with this selflink?'), _(u'[u]nlink, make [b]old, [s]kip, [e]dit, [m]ore context'), 0)

so that it automatically parses the [brackets] to find out the hotkeys, and returns an integer for the option that was chosen. But integers make the code very hard to read and maintain when there are long if-elif constructions (... elif choice == 12 ...).
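
The bracket parsing mentioned above could look roughly like this. It is a sketch under assumptions, not the discussed implementation: `parse_choices` and `choose` are hypothetical helpers, and `choose` returns the option name rather than an integer, which is one way around the `elif choice == 12` problem.

```python
import re


def parse_choices(spec):
    """Split '[u]nlink, make [b]old' into [('unlink', 'u'), ('make bold', 'b')]."""
    choices = []
    for part in spec.split(','):
        part = part.strip()
        match = re.search(r'\[(.)\]', part)
        if match is None:
            raise ValueError('no [hotkey] in %r' % part)
        # Drop the brackets to recover the plain option name.
        name = part[:match.start()] + match.group(1) + part[match.end():]
        choices.append((name, match.group(1).lower()))
    return choices


def choose(spec, answer):
    """Return the option name whose hotkey matches `answer`, else None."""
    for name, hotkey in parse_choices(spec):
        if answer.strip().lower() == hotkey:
            return name
    return None
```

Note that if `spec` has already been translated, the returned name is the translated one; code that branches on the result would need to map it back to the untranslated option, for example by position.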

So, we currently don't know how to do it better. I think having full i18n is worth cluttering up the code a little bit.

  • priority: 5 --> 7

Logged In: YES
user_id=687283
Originator: NO

Take a look at my proposal at http://pywiki.pastey.net/71924 . Readability still is not too good; another way would be to create a new input function and let gettext see it as translatable strings. No idea if gettext can handle multiple parameters, but I assume it can handle at least two ;\)

Logged In: YES
user_id=687283
Originator: NO

Check http://svn.wikimedia.org/viewvc/pywikipedia/branches/pywikipedia/i18n/input_choice_proposal/ and see if you like it. I wrote a new function; the system is as follows:

retval = i18nChoice("Do you want to save?", "[('yes', 'y'), ('no', 'n')]", 'yes')

Instead of a key, the option's *name* is given as a parameter, and returned.

With a Dutch translation, this displays:
Wilt u opslaan? ([j]a, [n]ee) n

then retval == 'no'

To use this, we need to use xgettext:
xgettext --keyword=i18nChoice:1,2 ic.py

and we get plural definitions:
msgid "Do you want to save?"
msgid_plural "[('yes', 'y'), ('no', 'n')]"
msgstr[0] "Wilt u opslaan?"
msgstr[1] "[('ja', 'j'), ('nee', 'n')]"

Maybe not the nicest solution, but the best one I could find. The alternative would be
retval = i18nChoice(_("Do you want to save?"), "[('yes', 'y'), ('no', 'n')]", 'yes')
with
xgettext --keyword=i18nChoice:2 ic.py
(and the corresponding changes in the function, of course.)
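
The code in the linked branch is not reproduced in this thread, so the following is only a guess at how such a function might behave, written to match the Dutch example above. The `translate` and `ask` parameters and the `bracket` helper are hypothetical additions for testability, and a plain lookup replaces the plural-form trick used with xgettext.

```python
import ast


def bracket(label, hotkey):
    """Mark the hotkey inside the label: ('nee', 'n') -> '[n]ee'."""
    index = label.lower().find(hotkey.lower())
    if index < 0:
        return '%s [%s]' % (label, hotkey)
    return '%s[%s]%s' % (label[:index], label[index], label[index + 1:])


def i18nChoice(question, options, default, translate=lambda s: s, ask=input):
    """Ask a translated question; return the untranslated option name."""
    originals = ast.literal_eval(options)
    # The translated string must list the options in the same order.
    localized = ast.literal_eval(translate(options))
    shown = ', '.join(bracket(label, key) for label, key in localized)
    answer = ask('%s (%s) ' % (translate(question), shown)).strip().lower()
    for (name, _key), (_label, key) in zip(originals, localized):
        if answer == key.lower():
            return name
    # Empty or unrecognised input selects the default option name.
    return default
```

Because the function returns the original option name, calling code can keep comparing against 'yes' and 'no' regardless of the UI language.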

Logged In: YES
user_id=687283
Originator: NO

Update: Because of the limitations of the normal xgettext implementation, I am writing my own version, using the Python compiler package. With some luck, it will be possible to maintain the original inputChoice format this way (as there is no reason to use a string anymore; the string-to-translate can be generated from a function parameter that is a list, or a dict, or ...).

When this implementation is done, only three functions (wikipedia.input, wikipedia.output and wikipedia.inputChoice) need to be adapted (and possibly we need to change wikipedia.Error to translate the error).
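
As an illustration of the idea, here is a sketch of such an extractor. It swaps in the standard `ast` module, since the `compiler` package mentioned above no longer exists in Python 3; `extract_strings` is a hypothetical name and this is not the author's tool.

```python
import ast


def extract_strings(source, functions=('input', 'output', 'inputChoice')):
    """Collect string literals passed to the named functions.

    Strings nested inside list arguments or '%' expressions are found
    too, so inputChoice can keep taking plain lists of options.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both wikipedia.output(...) and a bare output(...).
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, 'id', None)
        if name not in functions:
            continue
        for arg in node.args:
            for sub in ast.walk(arg):
                if isinstance(sub, ast.Constant) and isinstance(sub.value, str):
                    found.append(sub.value)
    return found
```

A real extractor would also have to write the collected strings to a .pot file and record source locations; that part is left out here.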

Something new with the dev meeting in Berlin?

Nice to have but not for a high priority. Possibly for the rewrite.

  • labels: --> rewrite
  • priority: 7 --> 2

Now we have pywikibot/i18n.
Actually, it is only used for things that appear on wikis themselves (e.g. edit summaries for scripts). Should we use it for log messages too?

Xqt added a comment. Apr 16 2014, 7:32 AM

Maybe. We have i18n.input() for example, and i18n/pywikibot.py for more generic messages. On the other hand, log files are useful for debugging, so there are good reasons for them to be readable by developers, i.e. written in English.

pyfisch wrote:

I think there are far more important things about Pywikibot than providing internationalized output and input. Most tools not aimed at end users are English-only. To translate Pywikibot into around 10 languages we would need many translators that we do not have; also, wiki pages that use the translation engine are not always translated, and many of them are more important.

jayvdb set Security to None.
jayvdb removed a subscriber: Unknown Object (????).

If I have guessed correctly, this was originally filed by @Crochet.david , many moons ago.

I think this was an infrastructure project, and we now have the requested infrastructure, so this task could be closed.

However, it is quite annoying that pywikibot still does not do the following, which @valhallasw described in 2007 as working in the (now deleted?) branch "input_choice_proposal":

"Wilt u opslaan? ([j]a, [n]ee) n"

Restricted Application added a subscriber: Cyberpower678. Jun 3 2018, 3:27 PM