Automate addition of new languages for the UI
Closed, Resolved, Public

Description

We need to set up a system where new languages are added to ISA automatically once some threshold percentage of strings has been translated on translatewiki.net.

See https://translatewiki.net/wiki/Translating:Kiwix for an example, which uses 35% as the threshold. They also set up deployment days twice a week, on which new translations are checked and imported.

Event Timeline

The format of the .po files from Translatewiki doesn't match what pybabel expects, resulting in errors like:

$ pybabel compile -i fr/LC_MESSAGES/messages.po -o fr/LC_MESSAGES/messages.mo
Traceback (most recent call last):
  File "/home/sebastian/workspace/isa/.venv/bin/pybabel", line 11, in <module>
    sys.exit(main())
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/frontend.py", line 929, in main
    return CommandLineInterface().run(sys.argv)
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/frontend.py", line 853, in run
    return cmdinst.run()
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/frontend.py", line 187, in run
    for catalog, errors in self._run_domain(domain).items():
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/frontend.py", line 232, in _run_domain
    catalog = read_po(infile, locale)
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/pofile.py", line 377, in read_po
    parser.parse(fileobj)
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/pofile.py", line 308, in parse
    self._process_comment(line)
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/pofile.py", line 267, in _process_comment
    self._finish_current_message()
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/pofile.py", line 204, in _finish_current_message
    self._add_message()
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/pofile.py", line 198, in _add_message
    self.catalog[msgid] = message
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/catalog.py", line 628, in __setitem__
    self.mime_headers = _parse_header(message.string).items()
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/catalog.py", line 445, in _set_mime_headers
    self.revision_date = _parse_datetime_header(value)
  File "/home/sebastian/workspace/isa/.venv/lib/python3.5/site-packages/babel/messages/catalog.py", line 47, in _parse_datetime_header
    tt = time.strptime(match.group('datetime'), '%Y-%m-%d %H:%M')
  File "/usr/lib/python3.5/_strptime.py", line 504, in _strptime_time
    tt = _strptime(data_string, format)[0]
  File "/usr/lib/python3.5/_strptime.py", line 346, in _strptime
    data_string[found.end():])
ValueError: unconverted data remains: :29

This is caused by the line

"PO-Revision-Date: 2021-12-30 12:12:29+0000\n"

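in the .po file. Pybabel rejects the seconds field in the timestamp: as the traceback shows, it parses the date with the format '%Y-%m-%d %H:%M', so the trailing ":29" is left unconverted. I haven't found any way to make the format match, but it is easy enough to preprocess the file before compiling.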
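A minimal sketch of such a preprocessing step, assuming the header always has the shape shown above (date, HH:MM:SS, then the timezone offset). The function name strip_revision_seconds is made up for illustration and is not taken from the actual maintenance script:

```python
import re

# Capture the header up to and including HH:MM, then match the ":SS"
# part that follows so it can be dropped.
_REVISION_DATE = re.compile(
    r'^("PO-Revision-Date: \d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}',
    flags=re.MULTILINE,
)


def strip_revision_seconds(po_text):
    """Return po_text with the seconds removed from PO-Revision-Date.

    Hypothetical helper: text without a seconds field is returned unchanged.
    """
    return _REVISION_DATE.sub(r"\1", po_text)
```

The result would be written back to the .po file (or to a temporary copy) before running pybabel compile on it.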

I've started on a maintenance script that:

  1. Looks through "isa/translations/" for .po files.
  2. Checks the progress of translations for each language and compiles any at or above a threshold (e.g. 35%).
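The two steps above could be sketched roughly as follows. This is not the code from the patch: the function names, the <lang>/LC_MESSAGES/messages.po layout under the translations directory, and the simplified parser are assumptions for illustration. The parser only checks whether each msgid has a non-empty msgstr; it ignores fuzzy flags and treats a plural entry as translated if any of its forms is filled in.

```python
from pathlib import Path

THRESHOLD = 0.35  # assumed value, mirroring the Kiwix example


def translation_progress(po_text):
    """Return the fraction of messages that have a non-empty msgstr."""
    entries = []  # each entry is [msgid, msgstr]
    field = None  # which part of the current entry continuation lines extend
    for raw in po_text.splitlines():
        line = raw.strip()
        keyword, _, rest = line.partition(" ")
        if keyword == "msgid":
            entries.append([rest.strip()[1:-1], ""])
            field = 0
        elif keyword.startswith("msgstr") and entries:
            entries[-1][1] += rest.strip()[1:-1]
            field = 1
        elif line.startswith('"') and field is not None:
            entries[-1][field] += line[1:-1]
        else:
            # Comments, blank lines, msgctxt, msgid_plural and so on.
            field = None
    # The entry with an empty msgid is the file header, not a message.
    messages = [entry for entry in entries if entry[0]]
    if not messages:
        return 0.0
    return sum(1 for entry in messages if entry[1]) / len(messages)


def languages_to_compile(translations_dir, threshold=THRESHOLD):
    """List language codes whose messages.po is at or above the threshold."""
    ready = []
    pattern = "*/LC_MESSAGES/messages.po"
    for po_path in sorted(Path(translations_dir).glob(pattern)):
        progress = translation_progress(po_path.read_text(encoding="utf-8"))
        if progress >= threshold:
            ready.append(po_path.parent.parent.name)
    return ready
```

Each language returned by languages_to_compile("isa/translations/") would then be passed to pybabel compile, as in the command shown earlier.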

One thing to consider is what to do if the translation progress drops, which may happen when new messages are added. Should previously enabled languages be kept, even if their translation progress is now below the threshold? It feels a bit strange to have languages suddenly disappearing.
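One possible answer, sketched here only to make the option concrete, is to treat the threshold as an entry requirement: a language is enabled once it reaches the threshold and is never removed automatically afterwards. The function and parameter names are hypothetical, and the set of previously enabled languages is assumed to be known (for example from which compiled .mo files already exist).

```python
def languages_to_enable(progress, previously_enabled, threshold=0.35):
    """Return the languages to keep enabled, sorted by code.

    progress maps a language code to its translated fraction (0.0 to 1.0).
    A language stays enabled if it was enabled before, even when its
    progress has since dropped below the threshold.
    """
    return sorted(
        lang
        for lang, fraction in progress.items()
        if fraction >= threshold or lang in previously_enabled
    )
```

With this rule, a language at 20% that was enabled earlier is kept, while a new language at 20% is not added yet.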

Change 755976 had a related patch set uploaded (by Sebastian Berlin (WMSE); author: Sebastian Berlin (WMSE)):

[labs/tools/Isa@master] Maintenance script for fetching and compiling translations

https://gerrit.wikimedia.org/r/755976

Change 755976 merged by jenkins-bot:

[labs/tools/Isa@master] Maintenance script for fetching and compiling translations

https://gerrit.wikimedia.org/r/755976

Change 774855 had a related patch set uploaded (by Sebastian Berlin (WMSE); author: Sebastian Berlin (WMSE)):

[labs/tools/Isa@master] Checkout to temporary branch for compiling translations

https://gerrit.wikimedia.org/r/774855

I extended the script to work on a temporary branch when fetching new translations. This should prevent any unwanted changes unrelated to translations.
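The general idea can be sketched as below. This is an illustration, not the merged patch: the branch name, the base branch, the remote name "origin" and the helper names are all assumptions. The sketch also assumes that the update step leaves the working tree in a state where git can switch back, for instance because the compiled .mo files are not tracked.

```python
import subprocess


def run_git(*args, cwd="."):
    """Run a git command and raise if it fails."""
    subprocess.run(["git", *args], cwd=cwd, check=True)


def update_on_temporary_branch(update, git=run_git,
                               branch="tmp-translations", base="master"):
    """Call update() on a throwaway branch created from origin/<base>.

    update is any callable that fetches and compiles translations.
    The temporary branch is deleted afterwards, even if update() fails.
    """
    git("fetch", "origin")
    # -B creates the branch, or resets it if it was left over from a
    # previous run, so it always starts from the fetched remote state.
    git("checkout", "-B", branch, "origin/" + base)
    try:
        update()
    finally:
        git("checkout", "-")  # go back to the branch checked out before
        git("branch", "-D", branch)
```

Passing the git runner as a parameter keeps the function easy to exercise without touching a real repository.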

Once the patch is deployed, the only remaining step should be to add a cron job (or similar).

Change 774855 merged by jenkins-bot:

[labs/tools/Isa@master] Checkout to temporary branch for compiling translations

https://gerrit.wikimedia.org/r/774855

A cron job has been created on Toolforge that runs the update once a day.