
SVG Translate: Skip unsupported text pattern and continue with the supported ones
Closed, Declined | Public | BUG REPORT

Description

As an SVG Translate user, I want to access labels, so that I can properly use the tool for translation.

Background: The user is experiencing the following issue: "This file does not have any labels available for translation. Please pick another image."

Resources:

Steps to Reproduce:
https://tools.wmflabs.org/svgtranslate/File:COVID-19_Variante.svg
https://tools.wmflabs.org/svgtranslate/File:T248252.svg

Actual Results:

[Screenshot: Screenshot_2020-03-22 SVG Translate.png]

Expected Results:

<switch>
<g systemLanguage="de">
 <text x="82.01" y="61.77" font-size="40">Coronavirus - „Worst Case Szenario“</text>
 <text x="83.05" y="116.27" font-size="36">Annahme: 60-70% der Bevölkerung wird akkumuliert infiziert</text>
 <text x="159.35" y="205.03" font-size="36">Zahl der<tspan x="159.35" y="250.03">Infizierten</tspan></text>
 <text x="155.62" y="821.84" font-size="36" text-anchor="middle">0</text>
 <text x="335.1" y="822.58" font-size="36" text-anchor="middle">100</text>
 <text x="507.61" y="821.11" font-size="36" text-anchor="middle">200</text>
 <text x="681.95" y="821.48" font-size="36" text-anchor="middle">300</text>
 <text x="858.02" y="821.84" font-size="36" text-anchor="middle">400</text>
 <text x="521.14" y="877.22" font-size="36" text-anchor="middle">Tage seit der ersten bestätigten Infektion</text>
 <text x="85.13" y="970.66" font-size="24">Quelle: DiePresse tw. basierend auf TU Wien/dwh</text>
 <text x="424.66" y="192.09" font-size="36"><tspan fill="#800" font-weight="bold">Keine Einschränkung<tspan x="424.66" y="237.09">der Sozialkontakte</tspan></tspan><tspan x="424.66" y="282.09">Höhepunkt: 25% infiziert</tspan></text>
 <text x="424.66" y="380.3" font-size="36"><tspan fill="#f60" font-weight="bold">etwas weniger<tspan x="424.66" y="425.3">Sozialkontakte</tspan></tspan><tspan x="424.66" y="470.3">ca. 15% infiziert</tspan></text>
 <text x="608.66" y="582.18" font-size="36"><tspan fill="#080" font-weight="bold">deutlich weniger<tspan x="608.66" y="627.18">Sozialkontakte</tspan></tspan><tspan x="608.66" y="672.18">ca. 5% infiziert</tspan></text>
 <text x="896.73" y="631.93" font-size="36"><tspan fill="#00f" font-weight="bold">Gesundheits-<tspan x="896.73" y="676.93">systemlimit</tspan></tspan><tspan x="896.73" y="721.93">ca. 0.6% infiziert</tspan></text>
</g>
<g>
 <text x="82.01" y="61.77" font-size="40">Coronavirus - „Worst Case Szenario“</text>
 <text x="83.05" y="116.27" font-size="36">Assumption: 60-70% of the citizen get infected</text>
 <text x="159.35" y="205.03" font-size="36">Number of<tspan x="159.35" y="250.03">current Infections</tspan></text>
 <text x="155.62" y="821.84" font-size="36" text-anchor="middle">0</text>
 <text x="335.1" y="822.58" font-size="36" text-anchor="middle">100</text>
 <text x="507.61" y="821.11" font-size="36" text-anchor="middle">200</text>
 <text x="681.95" y="821.48" font-size="36" text-anchor="middle">300</text>
 <text x="858.02" y="821.84" font-size="36" text-anchor="middle">400</text>
 <text x="521.14" y="877.22" font-size="36" text-anchor="middle">days since first infection</text>
 <text x="85.13" y="970.66" font-size="24">source: DiePresse based according to TU Wien/dwh</text>
 <text x="424.66" y="192.09" font-size="36"><tspan fill="#800" font-weight="bold">no restrictions<tspan x="424.66" y="237.09">of social contacts</tspan></tspan><tspan x="424.66" y="282.09">peak: 25% infected</tspan></text>
 <text x="424.66" y="380.3" font-size="36"><tspan fill="#f60" font-weight="bold">less<tspan x="424.66" y="425.3">social contacts</tspan></tspan><tspan x="424.66" y="470.3">ca. 15% infected</tspan></text>
 <text x="608.66" y="582.18" font-size="36"><tspan fill="#080" font-weight="bold">strong reduction of<tspan x="608.66" y="627.18">social contacts</tspan></tspan><tspan x="608.66" y="672.18">ca. 5% infected</tspan></text>
 <text x="896.73" y="631.93" font-size="36"><tspan fill="#00f" font-weight="bold">health care<tspan x="896.73" y="676.93">limit</tspan></tspan><tspan x="896.73" y="721.93">ca. 0.6% infected</tspan></text>
</g>
</switch>

Acceptance Criteria:

  • Fix label bug so that users can properly access label functionality and complete translation work in SVG Translate

Event Timeline

ifried renamed this task from "This file does not have any labels available for translation. Please pick another image." to "SVG Translate: Fix Label Bug". Mar 26 2020, 4:32 PM
ifried updated the task description. (Show Details)

I would close this ticket as Won't Fix.

The given file is far outside the scope of the SVG Translate tool. The organization of the file is planar: each language is in a g element that contains many text elements. The tool expects switch elements to hold simple text elements rather than an entire plane. The tool would have to explode each plane and then recombine the orphans into translation units. Such an operation was beyond the planned scope.

If the tool had been applied to a monolingual version (one without switch), then the tool would have inserted a switch for each text element. Tool users would have been able to add additional languages by running the tool again.

@Glrx Your description does imho not relate to this bug.

Not supporting text without tspans
https://tools.wmflabs.org/svgtranslate/File:T248252.svg
is a very serious bug!

@JoKalliauer the bug description's "Steps to Reproduce" gives the SVG file COVID-19_Variante.svg. The content of that SVG file is clearly outside the scope of the SVG Translate tool. The immediate children of the switch element are g elements and not text elements. The tool's inability to process that file is clear.

You point to another file, T248252.svg, that does not have any switch elements or systemLanguage attributes.

Looking at that file, it has some text elements with both non-empty #text nodes and tspan elements. It also has tspan elements within other tspan elements. If I fix the former (by wrapping non-empty #text with a tspan) and delete the latter text elements, then SVG Translate finds the translatable text. So at least one of those patterns causes SVG Translate to give up on translating any portion of the file. I have not investigated further; maybe only one class of the suspect text elements causes the problem.

My guess is that the tool gives up on an entire file when it runs into a pattern that it cannot process.

My understanding is that the tool's design goal was to process only simple text. It does not know how to handle phrasing content. It expects either one-line labels (a text element with only a #text node) or multiline labels (a text element with a single tspan element for each line). I do not expect it to handle phrasing such as a tspan to subscript or bold some text, so my guess is that a tspan within a tspan is outside the tool's scope.
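To make the patterns named above concrete, here is a minimal Python sketch that sorts a text element into one of them. It is not SVG Translate's actual code; the function name `classify_text` and the category labels are made up for illustration, and it uses only the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TSPAN = f"{{{SVG_NS}}}tspan"


def has_text(s):
    """True if the string holds something other than whitespace."""
    return bool(s and s.strip())


def classify_text(text_el):
    """Hypothetical classifier: 'one-line', 'multiline', 'mixed', 'nested', or 'other'."""
    children = list(text_el)
    if not children:
        # A text element with only a #text node.
        return "one-line" if has_text(text_el.text) else "other"
    if any(child.tag != TSPAN for child in children):
        return "other"
    if any(len(child) for child in children):
        # Some tspan has element children of its own.
        return "nested"
    if has_text(text_el.text) or any(has_text(c.tail) for c in children):
        # Non-empty #text nodes sit next to tspan elements.
        return "mixed"
    return "multiline"
```

Under this sketch, only "one-line" and "multiline" would count as the simple labels described above; "mixed" and "nested" correspond to the two suspect patterns in T248252.svg.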

The tool could offer a better explanation of why it gave up on a file.

JoKalliauer raised the priority of this task from High to Needs Triage. Apr 19 2020, 8:42 AM

> My guess is that the tool gives up on an entire file when it runs into a pattern that it cannot process.

@Glrx Thanks! The tool is better than I thought. :-D

But if it fails on one element (T250607), it should skip that element and continue with the next text element. That would also make bug hunting easier, even if no error message is provided.
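The skip-and-continue behavior requested here could look roughly like the following Python sketch. It is an illustration only, not the tool's implementation; `unsupported_reason` and `collect_labels` are hypothetical names, and the reason strings are invented. Unsupported elements are recorded with a reason instead of aborting the whole file.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT = f"{{{SVG_NS}}}text"
TSPAN = f"{{{SVG_NS}}}tspan"


def unsupported_reason(text_el):
    """Return None if the element looks simple, else a short reason (hypothetical)."""
    children = list(text_el)
    if any(c.tag != TSPAN for c in children):
        return "child other than tspan"
    if any(len(c) for c in children):
        return "nested tspan"
    stray = [text_el.text] + [c.tail for c in children]
    if children and any(s and s.strip() for s in stray):
        return "#text mixed with tspan"
    return None


def collect_labels(root):
    """Split text elements into translatable labels and skipped (content, reason) pairs."""
    labels, skipped = [], []
    for text_el in root.iter(TEXT):
        content = "".join(text_el.itertext())
        reason = unsupported_reason(text_el)
        if reason is None:
            labels.append(content)
        else:
            skipped.append((content, reason))
    return labels, skipped
```

The `skipped` list is what would let a user interface report why part of a file was left out, which also addresses the earlier remark that the tool could explain itself better.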

JoKalliauer renamed this task from "SVG Translate: Fix Label Bug" to "SVG Translate: Skip unsupported text pattern and continue with the supported ones". Apr 19 2020, 8:44 AM
JoKalliauer updated the task description. (Show Details)

SVG Translate made a deliberate decision to work only on SVG files with a simple format. It will only handle simple lines of plain text. Handling subscripts, color shifts, and font styling changes is beyond its ability.

SVG Translate also made the translation unit assumption. It expects to translate individual phrases rather than entire language planes. An oversimplification of that assumption is that switch elements should have only text element children.

The first file, File:COVID-19_Variante.svg, violates the translation unit assumption. I do not think it reasonable for SVG Translate to attempt to fix the file.

The file is actually just German at this point. A quick fix would be to replace the switch with its default clause. Then SVG Translate could attempt to translate the individual text elements (which will have their own problems).
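As a rough illustration of that quick fix, the Python sketch below replaces every switch with its default clause, taken here to be the first child without a systemLanguage attribute. The function name `replace_switch_with_default` is hypothetical, and the sketch ignores other conditional attributes such as requiredFeatures.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
SWITCH = f"{{{SVG_NS}}}switch"


def replace_switch_with_default(parent):
    """Replace each switch below `parent` with its first child lacking systemLanguage."""
    for index, child in enumerate(list(parent)):
        if child.tag == SWITCH:
            default = next(
                (c for c in child if "systemLanguage" not in c.attrib), None
            )
            if default is not None:
                default.tail = child.tail  # keep surrounding whitespace
                parent[index] = default    # one-for-one swap, indices stay valid
                child = default
        # Recurse only into elements that are still part of the tree.
        replace_switch_with_default(child)
```

A switch with no default clause is left untouched, since there is nothing safe to substitute for it.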

Looking at the second file, it is just the default clause of the first file. Some text elements have problems. Here is the SVG source with additional comments:

<?xml version="1.0" encoding="UTF-8"?>
<svg width="1145.5" height="980" font-family="Liberation Serif" xmlns="http://www.w3.org/2000/svg">

  <!-- acceptable -->
  <text x="82.01" y="61.77" font-size="40">Coronavirus - „Worst Case Szenario“</text>
  <text x="83.05" y="116.27" font-size="36">Annahme: 60-70% der Bevölkerung wird akkumuliert infiziert</text>

  <!-- unacceptable: #text and tspan  -->
  <text x="159.35" y="205.03" font-size="36">Zahl der<tspan x="159.35" y="250.03">Infizierten</tspan></text>

  <!-- acceptable -->
  <text x="155.62" y="821.84" font-size="36" text-anchor="middle">0</text>
  <text x="335.1" y="822.58" font-size="36" text-anchor="middle">100</text>
  <text x="507.61" y="821.11" font-size="36" text-anchor="middle">200</text>
  <text x="681.95" y="821.48" font-size="36" text-anchor="middle">300</text>
  <text x="858.02" y="821.84" font-size="36" text-anchor="middle">400</text>
  <text x="521.14" y="877.22" font-size="36" text-anchor="middle">Tage seit der ersten bestätigten Infektion</text>
  <text x="85.13" y="970.66" font-size="24">Quelle: DiePresse tw. basierend auf TU Wien/dwh</text>

  <!-- unacceptable: nested tspan -->
  <text x="424.66" y="192.09" font-size="36"><tspan fill="#800" font-weight="bold">Keine Einschränkung<tspan x="424.66" y="237.09">der Sozialkontakte</tspan></tspan><tspan x="424.66" y="282.09">Höhepunkt: 25% infiziert</tspan></text>
  <text x="424.66" y="380.3" font-size="36"><tspan fill="#f60" font-weight="bold">10% weniger<tspan x="424.66" y="425.3">Sozialkontakte</tspan></tspan><tspan x="424.66" y="470.3">ca. 15% infiziert</tspan></text>
  <text x="608.66" y="582.18" font-size="36"><tspan fill="#080" font-weight="bold">25% weniger<tspan x="608.66" y="627.18">Sozialkontakte</tspan></tspan><tspan x="608.66" y="672.18">ca. 5% infiziert</tspan></text>
  <text x="896.73" y="631.93" font-size="36"><tspan fill="#00f" font-weight="bold">Gesundheits-<tspan x="896.73" y="676.93">systemlimit</tspan></tspan><tspan x="896.73" y="721.93">ca. 0.6% infiziert</tspan></text>

</svg>

There is a potential fix for the first problematic text element (the one mixing #text and tspan). Any top-level #text nodes can simply be wrapped with tspan elements. SVG Translate will do that for a single #text node child, but the step can be more general.

That step could affect some CSS formatting, but SVG Translate has already chosen to ignore that issue.

Consequently, that problem is fixable.
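The more general wrapping step might be sketched in Python as follows. This is an assumption about how such a step could be written, not SVG Translate's code; `wrap_text_nodes` is a hypothetical name, and whitespace-only #text nodes are simply dropped.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TSPAN = f"{{{SVG_NS}}}tspan"


def wrap_text_nodes(text_el):
    """Wrap every non-empty top-level #text node of a text element in a tspan."""
    new_children = []

    def add_wrapped(s):
        if s and s.strip():
            span = ET.Element(TSPAN)  # no x/y: positioned like the original #text
            span.text = s
            new_children.append(span)

    add_wrapped(text_el.text)            # #text before the first child
    for child in list(text_el):
        tail = child.tail                # #text that followed this child
        child.tail = None
        new_children.append(child)
        add_wrapped(tail)

    text_el.text = None
    for child in list(text_el):
        text_el.remove(child)
    text_el.extend(new_children)
```

Applied to the "Zahl der" element from the file above, this yields two sibling tspan elements, which matches the multiline-label shape the tool expects.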

The text elements with nested tspan elements are more troublesome. In these cases, it is possible to hoist the nested tspan out of its parent. In general, hoisting is complicated. It requires merging the attributes of the parent into the child (we want "der Sozialkontakte" to inherit fill="#800" font-weight="bold"). That's possible to do, but it goes beyond the intended scope of SVG Translate. To do it correctly requires manipulating both attributes and CSS style information. Presentation properties such as fill can be an attribute or a CSS style property.

Consequently, I do not think it worthwhile for SVG Translate to fix nested tspan elements.
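For reference, the attribute-only part of hoisting can be sketched in Python as below. It deliberately ignores CSS (style attributes, classes, stylesheets), which is exactly the part that makes a correct implementation hard. The name `hoist` and the `POSITIONAL` set are assumptions made for this sketch; the set lists attributes that should not be copied from parent to child because they position or identify one specific element.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TSPAN = f"{{{SVG_NS}}}tspan"

# Assumed list of attributes that must not be passed down to nested tspans.
POSITIONAL = {"x", "y", "dx", "dy", "rotate", "textLength", "lengthAdjust", "id"}


def hoist(tspan, inherited=None):
    """Return a flat list of tspans equivalent to `tspan` (attributes only, no CSS)."""
    attrs = {**(inherited or {}), **tspan.attrib}   # child values override parent
    passed_down = {k: v for k, v in attrs.items() if k not in POSITIONAL}
    flat = []

    def emit(s, a):
        if s and s.strip():
            span = ET.Element(TSPAN, dict(a))
            span.text = s
            flat.append(span)

    emit(tspan.text, attrs)                  # the tspan's own leading text
    for child in tspan:
        flat.extend(hoist(child, passed_down))
        emit(child.tail, passed_down)        # text after the child flows on unpositioned
    return flat
```

With the first nested example from the file, "der Sozialkontakte" ends up as its own tspan carrying fill="#800" and font-weight="bold" plus its own x and y. Any property set through CSS would be lost by this sketch.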

Then we come to another issue: should SVG Translate offer to translate the elements it understands and ignore the others, or should it simply refuse to translate a file that has any problems? Frankly, I leaned toward the "translate what you can" camp, but now I'm no longer sure. Looking at the second file, I would be frustrated at being offered "100", "200", "300", and "400" to translate but not most of the text. I suspect it is a better result to refuse to translate the file than to offer some subset to translate. To say it another way, how could SVG Translate determine the significance of the portions it cannot help translate?

Save for wrapping #text nodes, I would close this item as Won't Fix.

Given that this issue is generally about forging ahead when problems are discovered rather than particular improvements, I would close as Won't Fix.

If any particular improvement is desired (such as stuffing non-empty #text nodes into tspan elements), then that can be a separate feature request.

Due to some faulty programming logic, SVG Translate has been silently but unintentionally forging ahead despite discovering format problems for a long time. That practice has made it difficult to detect other problems with SVG Translate. A recent pull request should stop that behavior. See T271000#8219964.

JMcLeod_WMF subscribed.

Following @Glrx's advice.