
Literal apostrophe incorrectly escaped in Huggle messages exported from translatewiki.net
Closed, Resolved (Public)

Description

As seen on screenshot, http://i.imgur.com/8b7YKnE.png

For some languages:
It's OK: Arabic, English, Français
It's not: Türkçe, Espanol, Cestina, Italiano, Svenska

Event Timeline

Mavrikant raised the priority of this task from to Needs Triage.
Mavrikant updated the task description. (Show Details)
Mavrikant added a project: Huggle.
Mavrikant subscribed.

This should fix it: https://github.com/huggle/huggle3-qt-lx/commit/463b4bd34808fb819c276399d02f02341ae1474b

However, translatewiki now needs to fetch it from our repo, and then the updated localization files need to be pushed back, which will take some time.

Petrb triaged this task as Medium priority.
Petrb added a project: Patch-For-Review.
Petrb set Security to None.
Petrb moved this task from Doing to Waiting on the Huggle board.

Looks like a bug in the escaping Huggle does when reading those strings? Huggle is using AndroidXmlFFS; I don't think we ever had such issues with Android apps. Can you find a specification for the format, or ensure you parse it with a library compatible with Android expectations?

Nemo_bis renamed this task from Apostrophe creating a problem in interface to Literal apostrophe incorrectly escaped by Huggle interface. Jan 22 2015, 10:42 AM
Nemo_bis added a project: I18n.

You are wrong: Huggle is rendering everything properly. The problem is that if the source en.xml contains

<string key=bla>some "</string>

translatewiki produces a localized file with

<string key=bla>some \"</string>

for some reason it fucked up the format.

There is no problem with huggle

BTW this is all irrelevant now, we already replaced all " so we are now just waiting for TWN to fix itself

If I remember correctly, a literal ' or " in the AndroidXmlFFS format will crash the compilation of an Android program. Not our fault that the file format is so fucked up.

This is not about fucked up format but the fact that translatewiki changes original " into \" and ' into \'

I don't know if this is some attempt to encode it; nonetheless, it's not working.

If you point me to the source code of TWN that handles this, I can try to fix it. The correct way to encode quotes in XML is to turn " into &quot; and ' into &apos;. Users who are translating the text should see those magic words as what they stand for, i.e. quotes.
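For illustration, the entity encoding described above can be sketched in Python with the standard library. This is only a sketch of generic XML escaping, not the code of translatewiki or Huggle; the helper name encode_message and the key "bla" are placeholders.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# escape() handles &, < and > by default; quotes must be requested explicitly.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def encode_message(text: str) -> str:
    """Encode a message for use as XML element content (hypothetical helper)."""
    return escape(text, QUOTE_ENTITIES)


original = 'don\'t say "hi"'
encoded = encode_message(original)

# Any conforming XML parser turns the entities back into plain quotes.
element = ET.fromstring(f'<string key="bla">{encoded}</string>')
decoded = element.text
```

With this encoding a plain XML parser returns the original quotes. As noted later in the thread, Android string resources follow different escaping rules, so this applies only to consumers that read the file as ordinary XML.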

Nemo_bis renamed this task from Literal apostrophe incorrectly escaped by Huggle interface to Literal apostrophe incorrectly escaped in Huggle messages exported from translatewiki.net. Jan 22 2015, 11:16 PM

translatewiki changes original " into \" and ' into \'

Right, I misread your diff (example).

If you point me to source code of TWN that handles this I can try to fix it.

https://phabricator.wikimedia.org/diffusion/ETRA/browse/master/ffs/AndroidXmlFFS.php

This is not about fucked up format but the fact that translatewiki changes original " into \" and ' into \'

Yes this is about fucked file formats (see T47354). What you have is not a valid file. Garbage in garbage out. If you are not building an Android app, this is not a good file format for you and we should be using something else.

From Nikerabbit's link:

<string name="bad_example_2">XML encodings don&apos;t work</string>

So what about switching huggle format from this android shit to https://gerrit.wikimedia.org/r/#/c/186338/

Petrb changed the task status from Open to Stalled. Jan 27 2015, 4:54 PM

This is currently blocked by a bug on translatewiki, please reopen once the bug there is fixed.

Petrb lowered the priority of this task from Medium to Low. Jan 27 2015, 4:55 PM

This is fixed in latest huggle by replacing the garbage characters back to normal ones
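A minimal sketch of what such a replacement could look like, assuming the only damage is the \" and \' sequences discussed in this task. This is not Huggle's actual code (Huggle is a C++/Qt application); the function name restore_quotes is hypothetical.

```python
def restore_quotes(text: str) -> str:
    # Turn backslash-escaped quotes back into plain quotes.
    # Assumption: a backslash directly before a quote is always export damage.
    return text.replace('\\"', '"').replace("\\'", "'")


# Raw strings keep the backslashes exactly as they appear in the exported file.
fixed_double = restore_quotes(r'some \"')
fixed_single = restore_quotes(r"don\'t")
```

One caveat of this approach: a message that legitimately contains a backslash followed by a quote would be altered as well.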