Wikipedia Android language sync generates empty ksh translations
Closed, Resolved · Public

Description

Some bugs have recently surfaced that cause invalid translations to appear:

  • values-x-invalidLanguageCode appears to contain Chinese translations
  • values-ksh imports a file with only an XML preamble and no translations. This violates our translation minimum threshold and breaks the Android build system
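The empty-file symptom above could be caught before import with a small check like the following. This is only a sketch: count_translations is a hypothetical helper, not part of the sync script, and it assumes the usual Android layout of <string> elements directly under a <resources> root.

```python
import xml.etree.ElementTree as ET


def count_translations(xml_text: str) -> int:
    """Count <string> entries; return 0 when the file has no usable content.

    Hypothetical helper for illustration only. It ignores <plurals> and
    <string-array> entries, which a real check would also need to count.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # A file holding only the XML declaration has no root element,
        # so the parser rejects it outright.
        return 0
    return len(root.findall("string"))


# The export described in the task: a declaration and nothing else.
empty_export = '<?xml version="1.0"?>'

# A minimal well-formed export (sample data invented for this sketch).
good_export = (
    '<?xml version="1.0"?>'
    '<resources><string name="app_name">Wikipedia</string></resources>'
)

print(count_translations(empty_export))
print(count_translations(good_export))
```

A sync job could skip any language whose count falls below the project's minimum threshold instead of committing the file.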

Event Timeline

For the Chinese characters, have you already excluded character encoding problems?

For ksh, there should be translations: https://translatewiki.net/w/i.php?title=Special:MessageGroupStats&language=ksh&group=out-wikimedia-mobile-wikipedia-android-strings Maybe the issue is that some contain comments in <!-- -->.

I cleaned up a lot of unsupported HTML tags in the ksh translation, but I didn't find any HTML comments. It's still exporting just <?xml version="1.0"?>. For zh-hans, zh-hant, and nan, the export encoding is UTF-8, which is what's expected, but perhaps I'm missing your point. I also looked for malformed HTML in these translations but didn't notice any.

The empty ksh file is still being generated but values-x-invalidLanguageCode has not been generated lately.

Nikerabbit renamed this task from Wikipedia Android language sync generates invalid file and empty ksh translations to Wikipedia Android language sync generates empty ksh translations. Jan 11 2017, 6:54 PM

Warning: Entity 'nbsp' not defined in Entity, line: 2

Do HTML entities need special handling in this file format, or is this just an issue with the way I use DOMDocument?
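For anyone wanting to reproduce the warning outside PHP, here is a rough sketch using Python's standard XML parser as a stand-in for DOMDocument (an assumption; the exporter itself is PHP). Plain XML predefines only five named entities (amp, lt, gt, quot, apos), so nbsp is rejected unless a DTD declares it, while the numeric reference for the same character is accepted. The sample strings are invented.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if the text is well-formed XML for a non-validating parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# HTML named entity: not defined in plain XML, so parsing fails.
with_named_entity = '<resources><string name="s">a&nbsp;b</string></resources>'

# Numeric character reference for the same character (U+00A0): accepted.
with_numeric_ref = '<resources><string name="s">a&#160;b</string></resources>'

print(parses(with_named_entity))
print(parses(with_numeric_ref))
```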

Change 331681 had a related patch set uploaded (by Nikerabbit):
Handle ampersands in translations

https://gerrit.wikimedia.org/r/331681

AFAIK, HTML entities must be escaped unless they appear within <![CDATA[ and ]]>.
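The two options mentioned here can be compared with a short sketch. It uses Python's standard library for illustration and is not what the AndroidXmlFFS patch does; the sample string is invented. Both forms yield the same text once parsed.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Invented sample translation containing a bare ampersand.
raw = "Terms & conditions"

# Option 1: escape the ampersand as a character entity.
escaped_doc = f'<resources><string name="s">{escape(raw)}</string></resources>'

# Option 2: leave the text untouched inside a CDATA section.
# (This breaks if the text itself contains "]]>".)
cdata_doc = f'<resources><string name="s"><![CDATA[{raw}]]></string></resources>'

escaped_text = ET.fromstring(escaped_doc).find("string").text
cdata_text = ET.fromstring(cdata_doc).find("string").text

print(escape(raw))
print(escaped_text == cdata_text == raw)
```

Note that escaping turns an existing "&nbsp;" into the literal text "&amp;nbsp;", so entities already present in a translation would need converting to the actual character first.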

Change 331681 merged by jenkins-bot:
AndroidXmlFFS: Handle ampersands in translations

https://gerrit.wikimedia.org/r/331681

Nemo_bis triaged this task as Medium priority. Jan 27 2017, 12:17 PM
Nemo_bis removed a project: Patch-For-Review.

This appears to be fully fixed now. Thank you!