Maniphest T201207

[Epic] SVG Translate wishlist project
Closed, Resolved · Public

Assigned To

Authored By

	Niharika
	Aug 3 2018, 9:21 PM

Description

This ticket is to capture the work Community-Tech will be doing for the SVG translate project requested in the 2017 wishlist survey.

Problem:

As a user, I want to translate an SVG file with text labels so I can use it on multiple wikis. Here's an example -
SVG of Anatomy of Human Ear in English. Usage of the file on other wikis:

Proposed solution:

MVP:

A tool on ToolForge that...
- Allows a user to find a file on Commons
  - The tool accepts a file URL and auto-completes SVG file names from Commons, if the user starts typing.
  - Once the user selects the file, the tool opens the file up in the translate view, where it can be translated.
- Allows a user to add a new language translation
  - The user can select what language they want to translate from.
  - The user can select what language they want to translate to.
  - The user is presented with labels for entering the translations.
- Allows a user to view and edit previously existing translations
  - If there are existing translations for a language that user wants to translate to, they get pulled in and are modifiable in the interface.
- Allows a user to preview the file
  - The user can hit preview at any stage during the translation process and see the translated labels appear in the image. (Dependent on technical feasibility)
- Allows a user to log in to the tool using OAuth
  - The user can log in to the tool at any point, and the tool will hold on to all their translations.
- Allows a user to upload the file to Commons under their account (using OAuth)
  - This action would be disallowed until the user has added translations for all labels.
  - This action would be disallowed until the user has logged in.
  - This creates a new version of the same file (translations saved using switch) and leads the user to Commons, where they can input the description/file changes.
- Allows a user to download the file with the new translation
  - The user can download the file at any stage (even if all the labels are not translated)
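
The file search in the first MVP item could be backed by the MediaWiki Action API on Commons. The sketch below is one possible shape, not the tool's actual implementation. It assumes the `prefixsearch` list module restricted to the File namespace (6); the helper names `build_prefix_search_url` and `svg_titles` and the sample response are made up for illustration. It only builds the request URL and filters an already decoded response, so no network call is made.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_prefix_search_url(typed, limit=10):
    """Build a prefix-search request limited to the File namespace (6)."""
    params = {
        "action": "query",
        "list": "prefixsearch",
        "pssearch": typed,
        "psnamespace": 6,
        "pslimit": limit,
        "format": "json",
    }
    return COMMONS_API + "?" + urlencode(params)


def svg_titles(response):
    """Keep only the titles ending in .svg from a decoded JSON response."""
    hits = response.get("query", {}).get("prefixsearch", [])
    return [hit["title"] for hit in hits if hit["title"].lower().endswith(".svg")]


# Hand-written sample shaped like a prefixsearch reply; not real data.
sample = {"query": {"prefixsearch": [
    {"ns": 6, "title": "File:Example ear.svg", "pageid": 1},
    {"ns": 6, "title": "File:Example ear.png", "pageid": 2},
]}}

print(build_prefix_search_url("Example ear"))
print(svg_titles(sample))
```

Filtering on the file extension is done client-side here because the search itself returns every file type in the namespace.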

A gadget that...
- Generates a link to the tool for every image in this category on Commons.
- Is customizable to be used for any wiki and any category.

-----

Content below left as-is for posterity

Current translation solutions:

1. SVG Translate tool on ToolForge (code)

How it works:

User enters a file name in the
The interface contains the SVG image along with an interface to translate text strings found in the SVG:

The user can add translations and select the language the translations are for.
Once translated, the tool creates a new, translated version of the SVG.
The user then has the option of either downloading the new SVG version and uploading it to Commons themselves, or uploading from the tool itself (using OAuth).
The user then links the new translated version to the original version.

Status - Doesn't appear to work. OAuth has reportedly been broken since January 2017 (T164275).
Notes:

The tool has 5 languages hard-coded. We will need to change it to add TWN support.

2. Manual translation using graphic/text editors

In the absence of the above tool, it appears that most users use text editors to edit SVG files and upload them. This severely limits the number of people able to do this.

3. WIP TranslateSVG extension (code)

GSoC project roadmap document
Video for the working extension
Based on the video, here's my interpretation of how it works:
- User goes to Special:TranslateSVG
- Inputs the file name
- There is a page which shows all the existing translations
- Click a button to get a pop-up where user enters new language code
- User gets a form with each <text> string in original language along with an input box to translate it.
- Every string has an input for X and Y coordinates as well.
- User saves translations.
- The extension modifies the existing file and saves translations using the [<switch>](https://www.w3.org/TR/SVG11/struct.html#SwitchElement) element and the [systemLanguage](https://www.w3.org/TR/SVG11/struct.html#ConditionalProcessingSystemLanguageAttribute) attribute in the SVG.
- User can embed file using [[File:ABC|lang=de]] syntax
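
To illustrate the switch-based storage described in the steps above, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is not the extension's code: the function name `add_translations` and the sample file are hypothetical, labels are matched on their plain text only, and `<tspan>` children are ignored.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT = f"{{{SVG_NS}}}text"
SWITCH = f"{{{SVG_NS}}}switch"
ET.register_namespace("", SVG_NS)  # serialize without an ns0: prefix


def add_translations(svg_source, lang, translations):
    """Wrap matching <text> elements in <switch> and add a `lang` variant.

    `translations` maps an original label to its translation.
    """
    root = ET.fromstring(svg_source)
    for parent in list(root.iter()):
        if parent.tag == SWITCH:
            continue  # already switch-translated; left alone in this sketch
        for child in list(parent):
            if child.tag != TEXT or child.text not in translations:
                continue
            position = list(parent).index(child)
            parent.remove(child)
            switch = ET.Element(SWITCH)
            switch.tail, child.tail = child.tail, None
            # Copy the positioning attributes but not the id, to keep ids unique.
            attributes = {k: v for k, v in child.attrib.items() if k != "id"}
            variant = ET.SubElement(switch, TEXT, attributes)
            variant.set("systemLanguage", lang)
            variant.text = translations[child.text]
            switch.append(child)  # the untagged fallback must come last
            parent.insert(position, switch)
    return ET.tostring(root, encoding="unicode")


source = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text x="10" y="20">Ear</text>'
    '</svg>'
)
result = add_translations(source, "de", {"Ear": "Ohr"})
print(result)
```

The original label stays in the file as the last child of the switch, without a systemLanguage attribute, so renderers fall back to it when no language matches.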

Status: Extension is not deployed anywhere. The testwiki that was set up for the extension no longer works.

A comparison of the ToolForge tool vs the Extension:

ToolForge tool	TranslateSVG extension
Flexibility in terms of interface	Interface is less flexible (there's a special page for converting SVGs)
Any image can be translated, doesn't have to be on Commons	Only images on the wiki the extension is on can be translated
Allows for translated images to be downloaded	Translated images can't be downloaded (at the moment)
No external dependencies	Integrates with Translate extension
Allows for translate links to be shared like this one which opens the form for translation (seems broken right now)	Translate also allows for direct links to translate page
File save is complex (OAuth authentication & save or manual upload)	Saving the translated file is easy
The tool creates a new version of the file for every translation	The extension updates the original file using `switch` statements
The tool doesn't allow for modification of X and Y coordinates of the labels	The extension allows for labels to be repositioned

Open questions:

Are there any downsides to embedding multiple translations in the same file as opposed to creating a new file for each translation?
Are there any downsides to allowing users to modify string coordinates in SVGs?
Are there any downsides to allowing users to adjust text size in SVGs?
Is it possible to suggest translations for strings? (Using TranslateWiki or something else)

Other resources:

Category on Commons for translatable SVGs
- Files are added to this category by adding the {{Translation possible}} or {{Translate}} template to the file page.
Category on Commons for SVGs translatable using switch statements.
- Files that have at least one existing switch statement come here when {{Translate | switch=yes}} is added to the file page.
https://meta.wikimedia.org/wiki/Grants:Project/Glrx/SVG_i18n

Related Objects

Status	Assigned	Task
Resolved	Niharika	T201207 [Epic] SVG Translate wishlist project
Resolved	MaxSem	T184310 [Spike: 8 hours] Investigation: TranslateSvg wishlist project
Resolved	MaxSem	T202181 [Spike 4 hours] Investigate the work involved in defaulting SVGs to show wiki language if available
Resolved	Niharika	T202768 Decide the tech stack for the SVG Translate tool
Resolved	MaxSem	T202771 [8 hours] Investigate ways to handle text breaking in SVGs
Resolved	Samwilson	T203711 Prep work for SVG Translate tool
Resolved	Samwilson	T203714 Create the landing page for the SVG Translate tool
Resolved	Samwilson	T204596 Search image component for SVG Translate tool
Resolved	Samwilson	T204849 SVG Translate tool: Language settings dialog
Resolved	Samwilson	T206712 Create the Translate view
Resolved	Samwilson	T207711 SVG Parsing I - Language dropdowns in SVG Translate tool
Resolved	Samwilson	T207709 SVG Parsing II - Fetching labels and ability to switch languages
Resolved	Samwilson	T207199 Language switching behavior for SVG Translate
Resolved	Samwilson	T207203 Image preview functionality
Resolved	MaxSem	T206724 Create an API for the JavaScript to read the available translations in the SVG (and send the new translations in so the backend creates a new SVG)
Resolved	Samwilson	T210470 Fetch the requested image from Commons
Resolved	Samwilson	T210872 Show partial translations in the From language dropdown
Resolved	Samwilson	T210564 Allow user to upload translated file to Commons
Resolved	Samwilson	T214440 Release new version of mediawiki/oauthclient

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 3 2018, 9:21 PM

Niharika claimed this task. · Aug 3 2018, 9:22 PM

Niharika triaged this task as Medium priority.

Niharika mentioned this in T56214: [Epic] Review and deploy TranslateSvg extension on Wikimedia Commons.

Niharika added a subtask: T184310: [Spike: 8 hours] Investigation: TranslateSvg wishlist project. · Aug 3 2018, 9:24 PM

Niharika updated the task description. · Aug 3 2018, 10:25 PM

Glrx subscribed. · Aug 3 2018, 10:28 PM

Niharika updated the task description. · Aug 3 2018, 10:31 PM

Niharika updated the task description. · Aug 3 2018, 10:39 PM

@Nikerabbit As the person with the most knowledge about the TranslateSVG extension, I would like to seek your input on the extension and on whether you think it's a better idea to fix up the extension instead of building a new tool or fixing the existing tool. How much work do you think it would be to update the existing extension and get it in good shape for production deployment? Do you foresee any upcoming changes to the Translate extension which could cause more work on the extension? Also, could you verify whether my description of how the extension works in the task description is accurate?

The Community Tech team has a limited amount of time to work on this project and we'd like to deliver the most user impact we can during it.

Also @Jarry1250, I'd like to hear your thoughts on the subject too. :)

Niharika added a project: Community-Wishlist-Survey-2017. · Aug 3 2018, 11:17 PM

Niharika added a project: Commons.

Niharika updated the task description. · Aug 3 2018, 11:32 PM

Jdforrester-WMF subscribed. · Aug 6 2018, 7:29 PM

Prtksxna subscribed. · Aug 7 2018, 1:47 AM

Niharika updated the task description. · Aug 7 2018, 7:25 PM

Perhelion subscribed. · Aug 8 2018, 7:59 AM

@Niharika Thanks for asking. A brain dump follows...

I have seen TranslateSVG work, but there is no doubt the code has bitrotted badly. It would probably need a fair amount of clean-up and rewriting. Especially the UI integration with Special:Translate was done before the TUX project where we rewrote the interface. The new interface has not been designed or coded with extensibility in mind, so it would require some changes in Translate. However, such extensibility is wanted for Translate, so those changes would be welcome. I don't see any upcoming changes in Translate that would cause troubles with this.

As for the general approach, the good part of using Special:Translate is that it is one place translators can go to find everything there is to translate. To be honest, I don't know how many people do this compared to following links from elsewhere to translate something specific. Another benefit is a familiar interface (this does not mean parts of it cannot be adapted to better suit the content) with all the already available translation tools (automatic checkers, machine translation, translation memory, in other languages, documentation, proofreading).

Then the problematic parts. TranslateSVG saves the translations in the SVG file itself. Given the approach of having each translatable string as a separate translation unit, the updates to files can get excessive if it is updated after every update to any translation unit. Both of the alternatives (have separate update file step; treat one image as one whole translation unit) have their drawbacks and add complexity. In this sense some separate tool where you load an image, add translations, save, can work better. You however lose the ability to easily find things to translate and all the translation tools and interface. In theory Translate could be improved to better support this kind of workflow, but that's an additional effort. The part of finding SVGs to translate could easily be implemented with a tracking category though, and the translation aids are partially available via APIs already, in case someone wants to use them in a new interface. Translate also supports change tracking (source changes -> make translations fuzzy). This would be beneficial for SVG files as well, although I don't know how feasible it is as it requires stable ids or alternative way of identifying what is changed and what isn't.
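
The change-tracking idea (source changes make translations fuzzy) can be sketched as follows, assuming every translation unit carries a stable id. This is only an illustration of the principle, not how Translate implements it; `mark_fuzzy` and the sample ids are hypothetical.

```python
def mark_fuzzy(old_source, new_source, translations):
    """Flag translations whose source string changed between two file versions.

    old_source / new_source: dict of unit id -> source text
    translations: dict of unit id -> translated text
    """
    result = {}
    for unit_id, translated in translations.items():
        if unit_id not in new_source:
            continue  # the unit was removed from the source file
        changed = old_source.get(unit_id) != new_source[unit_id]
        result[unit_id] = {"text": translated, "fuzzy": changed}
    return result


old = {"t1": "Ear", "t2": "Drum"}
new = {"t1": "Ear", "t2": "Eardrum"}
german = {"t1": "Ohr", "t2": "Trommel"}
print(mark_fuzzy(old, new, german))
```

Without stable ids the two versions cannot be lined up this way, which is the feasibility concern raised above.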

One interesting question is whether we could serve translations out of band, without updating the SVG files themselves. It wouldn't be that difficult to modify the SVG files on the fly (with caching like for thumbnails) with translations of the requested language. This would keep file size down, avoid polluting the image page history and some other issues. We could still provide the ability to download a file with all the translations included.
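
A rough sketch of such on-the-fly rendering, under several assumptions: the translations already live in `<switch>` elements, language codes are matched exactly (real renderers also do prefix matching), and an in-process `functools.lru_cache` stands in for whatever cache thumbnails actually use. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from functools import lru_cache

SVG_NS = "http://www.w3.org/2000/svg"
SWITCH = f"{{{SVG_NS}}}switch"
ET.register_namespace("", SVG_NS)


def pick_variant(switch, lang):
    """Return the first child that is untagged or lists `lang`, as <switch> does."""
    for child in switch:
        codes = child.get("systemLanguage")
        if codes is None or lang in [code.strip() for code in codes.split(",")]:
            return child
    return None


@lru_cache(maxsize=256)
def localize(svg_source, lang):
    """Replace every <switch> with the single variant chosen for `lang`."""
    root = ET.fromstring(svg_source)
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != SWITCH:
                continue
            position = list(parent).index(child)
            parent.remove(child)
            chosen = pick_variant(child, lang)
            if chosen is not None:
                chosen.attrib.pop("systemLanguage", None)
                chosen.tail = child.tail
                parent.insert(position, chosen)
    return ET.tostring(root, encoding="unicode")


source = (
    '<svg xmlns="http://www.w3.org/2000/svg"><switch>'
    '<text x="10" y="20" systemLanguage="de">Ohr</text>'
    '<text x="10" y="20">Ear</text>'
    '</switch></svg>'
)
print(localize(source, "de"))
```

The served file then carries only one language, while the stored file can still be offered for download with all translations included.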

Regarding the pros/cons table, Commons already has Translate installed, so not sure why "Dependent on Translate extension" is a con. I would expect that there isn't much interest in translating files uploaded locally to Wikipedias and other non-Commons wikis. With Translate it is certainly possible to include direct links to translate a specific file.

If I remember well, TranslateSVG had some problems registering the message groups. Translate needs to know what message groups exist, so that it can track them. TranslateSVG tried to add them dynamically so that they don't really exist. That was problematic. Not difficult to fix though. But the principle is the same as for marking a page for translation: nothing is translatable by default unless explicitly enabled.

You are probably also aware of this comparison between SVG translation tools. The part I find relevant is the discussion of how to identify the translation units (e.g. superscripts should not be translated separately).

I can see the appeal of having a minimal standalone tool where you take a file, input translations and re-upload automatically or manually. That might be the solution of least effort. I also see clear benefits of using Translate+TranslateSVG acknowledging that there are some decisions to make and challenges to solve. In the long run it can have lower maintenance overhead and increase the flexibility of Translate that makes it easier to support e.g. subtitle translation in the future. It would also integrate well if we ship SVG files with translatable strings with our software, as those could use the same system except that it would be automatically integrated with version control.

KartikMistry subscribed. · Aug 8 2018, 3:14 PM

Thanks for the brain dump, @Nikerabbit! I've tried to address some of your points below:

> I have seen TranslateSVG work, but there is no doubt the code has bitrotted badly. It would probably need a fair amount of clean-up and rewriting. Especially the UI integration with Special:Translate was done before the TUX project where we rewrote the interface. The new interface has not been designed or coded with extensibility in mind, so it would require some changes in Translate. However, such extensibility is wanted for Translate, so those changes would be welcome.

If you were to estimate how much work that is for a developer, what would you guess? Both for the parts about cleaning up TranslateSVG and extending Translate. Hand-wavy rough estimations are okay. I'm looking for a general idea.

> As for the general approach, the good part of using Special:Translate is that it is one place translators can go to find everything there is to translate. To be honest, I don't know how many people do this compared to following links from elsewhere to translate something specific. Another benefit is a familiar interface (this does not mean parts of it cannot be adapted to better suit the content) with all the already available translation tools (automatic checkers, machine translation, translation memory, in other languages, documentation, proofreading).

That's true. Having a familiar interface is definitely helpful. If we're using Translate, will we be able to serve them suggestions like on TWN?

> TranslateSVG saves the translations in the SVG file itself. Given the approach of having each translatable string as a separate translation unit, the updates to files can get excessive if it is updated after every update to any translation unit.

Could we instead save the updates to the file once the user has translated all the strings they wanted to instead of after every string? What are the drawbacks to that? The interface could work like a form with inputs which aren't saved until the "Save" button is clicked so we could update all translations at once. What am I missing?

> In this sense some separate tool where you load an image, add translations, save, can work better. You however lose the ability to easily find things to translate and all the translation tools and interface.

I like this point. Even if we use a separate tool, we should think about making it easier to find which files need translations for which languages.
The TranslateSVG extension, as it stands currently, does not make finding files to translate easier though, correct?

> In theory Translate could be improved to better support this kind of workflow, but that's an additional effort. The part of finding SVGs to translate could easily be implemented with a tracking category though,

Commons has this category for tracking images which can be translated.

> and the translation aids are partially available via APIs already, in case someone wants to use them in a new interface.

Could you elaborate on this a bit? What are translation aids? Any links would be welcome!

> Translate also supports change tracking (source changes -> make translations fuzzy). This would be beneficial for SVG files as well, although I don't know how feasible it is as it requires stable ids or alternative way of identifying what is changed and what isn't.

Probably good in the long term but I expect it to be out of scope for this project.

> One interesting question is whether we could serve translations out of band, without updating the SVG files themselves. It wouldn't be that difficult to modify the SVG files on the fly (with caching like for thumbnails) with translations of the requested language. This would keep file size down, avoid polluting the image page history and some other issues. We could still provide the ability to download a file with all the translations included.

That's an interesting idea. Although it departs from the project intent a bit, it's worth exploring depending on which route we go.

> Regarding the pros/cons table, Commons already has Translate installed, so not sure why "Dependent on Translate extension" is a con. I would expect that there isn't much interest in translating files uploaded locally to Wikipedias and other non-Commons wikis. With Translate it is certainly possible to include direct links to translate a specific file.

Thanks for clarifying that. I put it as a con mainly because any work we do on the project will be dependent on making changes to Translate first. And while it's a small consideration, I think the amount of time devs spend to set it up locally won't be trivial either.

> If I remember well, TranslateSVG had some problems registering the message groups. Translate needs to know what message groups exist, so that it can track them. TranslateSVG tried to add them dynamically so that they don't really exist. That was problematic. Not difficult to fix though. But the principle is the same as for marking a page for translation: nothing is translatable by default unless explicitly enabled.

Good to know.

> You are probably also aware of this comparison between SVG translation tools. The part I find relevant is the discussion of how to identify the translation units (e.g. superscripts should not be translated separately).

I wasn't aware of that. Thanks for the link!

> I can see the appeal of having a minimal standalone tool where you take a file, input translations and re-upload automatically or manually. That might be the solution of least effort. I also see clear benefits of using Translate+TranslateSVG acknowledging that there are some decisions to make and challenges to solve. In the long run it can have lower maintenance overhead and increase the flexibility of Translate that makes it easier to support e.g. subtitle translation in the future. It would also integrate well if we ship SVG files with translatable strings with our software, as those could use the same system except that it would be automatically integrated with version control.

That's a good point which we'll be sure to keep in mind when deciding on the project direction. One of the deciding factors is the amount of time we have to devote to the project, which sometimes tends to lead to quicker, though not necessarily ideal, solutions, sadly.

Niharika updated the task description. · Aug 8 2018, 7:58 PM

@kaldari, @Bawolff, @Glrx As people who I know have opinions on this project, I'd like you all to share your thoughts on the project direction as well, while keeping in mind that this is the Community Tech team, which only has a short amount of time to devote to this project. :)

JoKalliauer subscribed. · Aug 8 2018, 8:37 PM

In T201207#4489604, @Niharika wrote:

> If you were to estimate how much work that is for a developer, what would you guess? Both for the parts about cleaning up TranslateSVG and extending Translate. Hand-wavy rough estimations are okay. I'm looking for a general idea.

Well, I would say 10 hours is too little, and 1000 hours too much. It depends a lot on the speed and scope. I would imagine it would be possible to get familiar with the code and make a small proof of concept in one week (I would gladly help with that) and more weeks for better architecture, UI and polish.

> Having a familiar interface is definitely helpful. If we're using Translate, will we be able to serve them suggestions like on TWN?

Yes!

>> TranslateSVG saves the translations in the SVG file itself. Given the approach of having each translatable string as a separate translation unit, the updates to files can get excessive if it is updated after every update to any translation unit.

> Could we instead save the updates to the file once the user has translated all the strings they wanted to instead of after every string? What are the drawbacks to that? The interface could work like a form with inputs which aren't saved until the "Save" button is clicked so we could update all translations at once. What am I missing?

That's not trivial to implement with Translate (as opposed to a separate tool) which strongly relies on the concept of independent translation units. Unless you treat the whole file as one unit like I wrote earlier.

> The TranslateSVG extension, as it stands currently, does not make finding files to translate easier though, correct?

Given it doesn't work right now, no, but if fixed in the way I recommend, the files would be listed in the message group selector.

>> and the translation aids are partially available via APIs already, in case someone wants to use them in a new interface.

> Could you elaborate on this a bit? What are translation aids? Any links would be welcome!

Translation aids are the same as the translation tools I mentioned above (translation memory, machine translation, documentation, in other languages, automatic checks, proofreading).

-jem- subscribed. · Aug 9 2018, 10:58 AM

Given the following factors...

- The translate tool on Tool Forge is already sort of working (but is buggy and uploading is broken)
- Reviving the TranslateSVG extension is likely to require a good deal of refactoring (it's 7 years old) and may require work on the Translate extension as well (which is complicated)
- We don't have a lot of time to work on this project (basically 3 months until next Wishlist Survey) and are also working on another project simultaneously
- The original Wishlist Survey proposal suggested fixing the tool rather than the extension...

I'm leaning towards working on the Tool Forge tool.

aezell updated the task description. · Aug 9 2018, 10:09 PM

Niharika updated the task description. · Aug 9 2018, 10:32 PM

Glrx updated the task description. · Aug 9 2018, 10:54 PM

@Nikerabbit Looking at the video for the extension, it seems like the translation interface is not Special:Translate. Do you know if that integration is yet to be completed? It's also possible the video is outdated.

Niharika asked for my comment on this.

@Nikerabbit has already written an extensive technical explanation and I really don't have any technical aspects to add.

Using TranslateSVG has these clear advantages:

- The Translate interface is familiar to hundreds of Wikimedia volunteer translators.
- Translate already implements a lot of features that are needed for localization: translation memory, "insertables", hints from other languages, progress handling, integrated qqq documentation, RTL handling, etc. These are things that each localization has to reinvent, and it would be nice to avoid.

I cannot estimate the engineering cost of refreshing TranslateSVG, but naive common sense tells me that if one new MediaWiki/Translate developer needed one summer to develop it, experienced MediaWiki developers shouldn't need much more time to get it to a deployable state. Also, maintaining yet another localization tool will have more future costs, whereas maintaining just one localization tool that handles various tasks should be more effective (as long as it can actually handle them well).

In T201207#4500305, @Niharika wrote:

> @Nikerabbit Looking at the video for the extension, it seems like the translation interface is not Special:Translate. Do you know if that integration is yet to be completed? It's also possible the video is outdated.

I think that video was made before the interface integration.

In T201207#4500768, @Amire80 wrote:

> Niharika asked for my comment on this.

Thanks for your comments, Amir!

> Using TranslateSVG has these clear advantages:

> The Translate interface is familiar to hundreds of Wikimedia volunteer translators.

> Translate already implements a lot of features that are needed for localization: translation memory, "insertables", hints from other languages, progress handling, integrated qqq documentation, RTL handling, etc. These are things that each localization has to reinvent, and it would be nice to avoid.

Good points. If we end up using the external tool, we will need to be cognizant of these things and maybe replicate the interface in the tool. Also, Niklas has helpfully pointed out that the translation aids are available via API, which we should make use of.

> I cannot estimate the engineering cost of refreshing TranslateSVG, but naive common sense tells me that if one new MediaWiki/Translate developer needed one summer to develop it, experienced MediaWiki developers shouldn't need much more time to get it to a deployable state.

Amir, I think that's a flawed argument to make. There's a good deal of difference between getting a piece of software to deployable state versus getting it to a place where users actually use it and love it. And naive common sense does not usually work when it comes to MediaWiki anyway. :)

In T201207#4502443, @Nikerabbit wrote:

> In T201207#4500305, @Niharika wrote:

>> @Nikerabbit Looking at the video for the extension, it seems like the translation interface is not Special:Translate. Do you know if that integration is yet to be completed? It's also possible the video is outdated.

> I think that video was made before the interface integration.

Do you know if there is any updated documentation about the features the extension implements?

I commend Nikerabbit's brain dump. I'm having trouble paging all the details back in.

A major question is why both SVGTranslate and Translate SVG failed, and if those same factors will continue to come into play with updated versions of either tool. I've forgotten the original developer of SVGTranslate, but Jarry1250 rescued it. Jarry1250 also did Translate SVG. That speaks to needing highly skilled people for such projects. The tools need both SVG and server skills, and that intersection is very small.

Nikerabbit is correct about bit rot. SVGTranslate failed at least twice because libraries it used were incompatibly modified. Translate SVG suffered a similar fate.

The least work would be to resurrect SVGTranslate. It apparently does the translation, but it currently fails trying to obtain user authorization. It may have further problems after that OAuth bug is fixed. SVGTranslate has many problems and is far from ideal. It creates forked copies of the image, and it has a poor idea of translation units. Because it creates copies, it needs to extract license information to use with the new copy. It did that with DerivativeFX. I believe that tool fell into decline; some editors wanted DerivativeFX redone as a gadget (https://commons.wikimedia.org/wiki/Commons:User_scripts); I do not know its fate.

T164275 svgtranslate tool-Not Working

SVGTranslate's output, however, can be loaded into any graphics editor and tweaked. Fixing it would restore a status quo ante.

I do not recommend fixing SVGTranslate. Such a fix would be expedient, but its forked copies are a maintenance problem.

Translate SVG is a different story. It has many strengths over SVGTranslate such as using the DOM instead of string matching and trying to be more careful with its edits. In many ways it is far better than SVGTranslate, but it has its own two-copy problem, and for that reason, I do not recommend fixing it. It keeps one set of all translations embedded in switch elements, and it keeps a second set of all translations embedded in the extension. Which set of translations controls? If somebody edits the SVG to include a new translation, what happens? If somebody updates a translation, what happens? Synching the two versions is a headache. There should be but one god.

I think the options for an SVG translation tool should be (i18n) keep all the translation information in the SVG and do not maintain a shadow translation database or (l10n) have an authoritative translation database that fleshes out an SVG skeleton. The advantage with i18n is MW can already handle SVG switch, and the disadvantage with l10n is MW would need changes. Option l10n is the typical industrial choice. It's also how MW handles template translation. The localized SVG files do not have a lot of baggage; a localized file doesn't carry 400 kB of translations that most users do not want to see.

The i18n option has some features, but it gets awkward when there are many translations of many labels. The l10n option is better for serving small SVG files, but Commons is a resource that should also provide an i18n.SVG on demand.
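
As a rough illustration of the l10n option, the sketch below fills a skeleton SVG from a translation store keyed by element id. Everything here is hypothetical (the store layout, the ids, the function name `flesh_out`), and it replaces only the direct text of each `<text>` element, ignoring `<tspan>` children.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT = f"{{{SVG_NS}}}text"
ET.register_namespace("", SVG_NS)

# Hypothetical authoritative store: language -> {element id -> label}.
TRANSLATIONS = {
    "it": {"label-sea": "Mare Adriatico"},
}


def flesh_out(skeleton, lang, store=TRANSLATIONS):
    """Return a monolingual SVG built from the skeleton and the store."""
    root = ET.fromstring(skeleton)
    labels = store.get(lang, {})
    for text_el in root.iter(TEXT):
        label = labels.get(text_el.get("id"))
        if label is not None:
            text_el.text = label  # untranslated labels keep the skeleton text
    return ET.tostring(root, encoding="unicode")


skeleton = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text id="label-sea" x="10" y="20">Adriatic Sea</text>'
    '</svg>'
)
print(flesh_out(skeleton, "it"))
```

The lookup depends on ids staying stable across edits of the skeleton, which ties in with the question of whether graphics editors preserve ids.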

When I looked at the SVG translation problem, the constraints of SVG 1.1 indicated the translation tool wants access to not just the DOM but also the SVGDOM. SVG does not break lines, so somebody else has to do it. The SVGDOM can be used, and it can be done to be compatible with SVG 1.1 and proposed SVG 2.0. To me, that suggests an i18n translation tool should be a client-side application.
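
To make the line-breaking problem concrete, here is a deliberately crude sketch that breaks a label into `<tspan>` lines. It is an assumption-heavy stand-in: it runs outside the browser and uses a character count where a client-side tool would use widths measured through the SVGDOM (for example `getComputedTextLength()`). The function name, `max_chars`, and `line_height` are made up for illustration.

```python
import textwrap
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT = f"{{{SVG_NS}}}text"
TSPAN = f"{{{SVG_NS}}}tspan"
ET.register_namespace("", SVG_NS)


def wrap_label(label, x, y, max_chars=12, line_height="1.2em"):
    """Build one <text> element whose lines are <tspan> children."""
    text_el = ET.Element(TEXT, {"x": str(x), "y": str(y)})
    lines = textwrap.wrap(label, width=max_chars, break_long_words=False)
    for number, line in enumerate(lines):
        tspan = ET.SubElement(text_el, TSPAN, {"x": str(x)})
        if number > 0:
            tspan.set("dy", line_height)  # move down one line, SVG 1.1 style
        tspan.text = line
    return text_el


label = wrap_label("Mare Adriatico", 50, 40, max_chars=10)
print(ET.tostring(label, encoding="unicode"))
```

Because the whole label stays in one `<text>` element, it also remains a single translation unit.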

That said, I think a client-side remediation/translation tool with l10n SVG processing on the server-side is a better option. It's a lot to do and requires better handling of SVG files by MW.

Many files on Commons are not suitable for translation. Many SVG files need either problem detection or remediation, and that can be helped using the SVGDOM.

A simpler approach would have somebody who knows the issues flag a file as a reasonable candidate for translation. If the file is simple, then a translation tool can work on it.

Translate SVG tries to recognize some problem SVG files, but in the intervening years many other difficult-to-translate SVG files have appeared.

Multiline translation units are a problem because SVG 1.1 and librsvg do not have any facility for automatic line breaking. When lines (translation units) are broken, Commons' SVG files may use separate text elements rather than one text with several tspans. Such files should merge the text elements into a single text element that comprises one translation unit. For example, a map might have "Adriatic" and "Sea". To make the map look right implies translating "Adriatic" to "mare" and "Sea" to "Adriatico". If the same map also had "Ionian" and "Sea", that implies translating "Ionian" to "mare" and "Sea" to "Ionio". It would be nice if a tool aided that merging and other remediations.
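
The merging remediation could look roughly like the sketch below, which folds sibling `<text>` elements into one `<text>` with a `<tspan>` per original line. It assumes a human has already decided which elements belong together and passes their ids; `merge_text_elements` and the ids are hypothetical, and styling attributes other than x and y are dropped for brevity.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT = f"{{{SVG_NS}}}text"
TSPAN = f"{{{SVG_NS}}}tspan"
ET.register_namespace("", SVG_NS)


def merge_text_elements(parent, ids):
    """Merge the <text> children of `parent` with the given ids, in order."""
    by_id = {el.get("id"): el for el in parent if el.tag == TEXT}
    parts = [by_id[unit_id] for unit_id in ids]
    merged = ET.Element(TEXT, {"id": ids[0]})
    for part in parts:
        # Each former element keeps its own position as a tspan.
        tspan = ET.SubElement(
            merged, TSPAN, {"x": part.get("x", "0"), "y": part.get("y", "0")}
        )
        tspan.text = part.text
    position = list(parent).index(parts[0])
    for part in parts:
        parent.remove(part)
    parent.insert(position, merged)
    return merged


source = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text id="sea-1" x="40" y="30">Adriatic</text>'
    '<text id="sea-2" x="40" y="45">Sea</text>'
    '</svg>'
)
root = ET.fromstring(source)
merged = merge_text_elements(root, ["sea-1", "sea-2"])
print(ET.tostring(root, encoding="unicode"))
```

After merging, a translator sees "Adriatic" and "Sea" as one unit and can reorder the words across lines.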

Many graphics contributors get the right visual result by pasting or spacing. Instead of superscripts, the artist places a separate text string above the current baseline. Color changes and font shifts may also be done that way. Instead of "The New World", the artist might overlay "The______World" with "New". Even graphic artists who produce complex and stunning images may not use text-anchor; they move left-aligned text to visually center- or right-align it. When such text is translated, the translation will be a different length and give the wrong visual effect.

The SVG files on Commons have many quirks, and neither SVGTranslate nor Translate SVG expected many of them. The French and German Graphic Workshops have made switch-translated files with planar translations. SVGTranslate sort of handles that because it translates the smallest units of text, but I think Translate SVG can get confused and start doing multilingual translations within already monolingual groups.

Are there any downsides to embedding multiple translations in the same file as opposed to creating a new file for each translation?

Yes. Once a file has been switch translated, other graphics editors may not be able to effectively edit the file. Reading a file into an editor might throw away all of the switch translations. I think Inkscape (which uses SVG/XML as its base representation) can tolerate switch-translated files, but I'm not sure about other editors. The issue needs investigation. It would be horrible if User:LadyOfHats (who uses Adobe Illustrator) could no longer edit her files with AI after a translation tool added switch translations. There should also be some investigation about whether editors keep ids or change them. Ids allow skeletons to be edited.

If one goes for i18n.SVG, then there may need to be a method of obtaining an SVG file suitable for graphics editors that can remerge translations. That is one of the strengths of the l10n approach: the nominal skeleton is editable.

See https://commons.wikimedia.org/wiki/Template:NoInkscape

Are there any downsides to allowing users to modify string coordinates in SVGs?

Yes. It is a level of control that is expedient but causes problems in the long run. If you think about just the simple case of single line text, it should be adequately placed using text-anchor. If the line is a little bit long, then the image should be adjusted to give the line more space; that helps not only the current translation, but also other, later, translations whose text is also a little longer than the original.

Are there any downsides to allowing users to adjust text size in SVGs?

Yes. The same reason: if the text is too big, then fix the underlying image rather than diddle with the string; there may be other languages with similar problems. Also, text size can be set in many ways. It may be set on the element, but it may also be set with class and CSS. Adjusting the font size confounds the CSS scheme. I think CorelDraw can handle styles; I'm not sure about AI or Inkscape; I fear that Inkscape will flatten style information.

Is it possible to suggest translations for strings? (Using TranslateWiki or something else)

Yes, but there are technical problems with getting appropriate matches. I did some experiments and found that TranslateWiki could offer few translations. Wiktionary could offer more, but there were still few hits. I do not know if it has one, but I did not find a client-side API for TranslateWiki. There is PHP code for XLIFF conversions, but apparently no API access.

Separately, there's also a problem with user interactions and semantics. Switch-translated files are i18n, but MediaWiki inclusions of switch-translated files are l10n because they specify the language to serve. See T60920. MW needs to understand the impact of that choice. Should users on de.WP have to specify [[File:abc.svg|lang=de]] when they include a file? I do not think so. When a user updates a switch-translated SVG to include French, should she really have to go to every page on the fr.WP that uses the file and add in |lang=fr? Even if a user translates an SVG to her language, her translation will not be displayed on her WP until she does that step. That hurdle should be removed. MW should not default to serving English images when more appropriate language versions are available for the wiki. The current mechanism was an expedient to use switch translations without caching identical PNGs for SVGs that do not have a translation, but instead of making that a manual process, it can happen automatically. MW has code to read SVG files and learn which langtags they support. If File:abc.svg has de in that set, then the de.WP should show the de.PNG without needing a lang=de parameter. For the icing on the cake, MW could also use its fallback language sequence to serve the best available translation.
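
To make the idea concrete, here is a rough sketch of that selection logic. It is not MediaWiki's code: the function names and the fallback chain in the usage lines are hypothetical, and the matching is a simplified exact comparison of language tags rather than the prefix matching a real user agent does for systemLanguage.

```python
import xml.etree.ElementTree as ET


def supported_languages(svg_source):
    """Collect every language tag named in a systemLanguage attribute."""
    langs = set()
    for elem in ET.fromstring(svg_source).iter():
        value = elem.get("systemLanguage")
        if value:
            # systemLanguage may hold a comma-separated list of tags.
            langs.update(tag.strip().lower() for tag in value.split(",") if tag.strip())
    return langs


def pick_render_language(svg_source, wiki_chain):
    """Return the first language in the wiki's chain that the file supports.

    `wiki_chain` is the wiki language followed by its fallbacks.
    None means no match: serve the generic rendering (the switch default).
    """
    available = supported_languages(svg_source)
    for lang in wiki_chain:
        if lang.lower() in available:
            return lang
    return None


SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg"><switch>'
    '<text systemLanguage="de">Haus</text>'
    '<text systemLanguage="fr">maison</text>'
    "<text>house</text>"
    "</switch></svg>"
)
print(pick_render_language(SVG, ["de"]))         # de
print(pick_render_language(SVG, ["gsw", "de"]))  # de (hypothetical fallback chain)
print(pick_render_language(SVG, ["ja"]))         # None
```

With something like this, only files that actually contain a matching langtag get a language-specific PNG, so untranslated files still share one cached rendering.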

In other words, MW should be making a more rational choice about which localization to serve as the src attribute in an img element: the generic one or a language specific one (which may depend on the SVG's xml:lang attribute).

There's also a parochial problem with "en" being the default language for all MW SVG, but that's for another time.

MW should also start serving small (e.g. 20 kB) SVG files directly rather than running them through librsvg to make PNGs. Or at least give the user a preference that allows direct serving of SVG. Served SVG can still use img elements, but it would be better to use object. That may mean sorting out some issues with width, height, and viewBox.

Regarding things like repositioning text-boxes, changing text size, combining text boxes, etc. my personal opinion is that that sort of work should be done in Inkscape (or other SVG editors), not a translation tool. Modifying positioning data in SVGs can actually be very complicated due to grouped objects, different methods of scaling, etc. It's not like a PNG where you only have to move pixels around. This tool should be for doing one thing and doing it well: translations (especially since everyone on the internet has access to Inkscape for all the other SVG editing tasks). Also, keeping the feature-list short means less maintenance and things to break.

Sidenote: I know many people use the "convert text (object) to path" feature to keep fonts looking how they want (without embedding the entire font in the SVG), which is probably relevant here
Kelvin13 might have good advice. He wrote this page of related guidance (that I just found) https://commons.wikimedia.org/wiki/User:Kelvin13/SVG_text_tutorial , which I wish I had seen a few days ago when I was converting an SVG! He also has a great gallery of examples on his userpage, many of which are translated. (I can't see a phabricator account, so you'll need to ping him at Commons)

Quiddy wrote:

Kelvin13 might have good advice. He wrote this page of related guidance (that I just found) https://commons.wikimedia.org/wiki/User:Kelvin13/SVG_text_tutorial , which I wish I had seen a few days ago when I was converting an SVG!

I disagree.

I have no problem with Kelvin13 complaining about the fonts on MW or deciding that he should convert his fonts to curves to get the exact display he wants. That's the artist's choice. Artists may want an exact font displayed at an exact weight at an exact position.

I have a problem when somebody uses font descriptions such as:

font-size:25px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-family:Frutiger Neue 55;
font-size:25px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-family:Frutiger Neue 75;

And complains "There are no more light or bold fonts." The given font specs were for "normal" fonts in every respect. The SVG is not telling the user agent to use a special font weight. If the user agent does not have "Frutiger Neue 75" (and most user agents will not), then it substitutes a font it has. We cannot expect the user agent to look at the family name and magically intuit that "75" means "bold". Don't blame other user agents for Inkscape's limitations with multiweight fonts and the consequences of a chosen workaround. See T25643 (especially WebType's font spec advice). Yes, most user agents have a poor implementation of font-weight, but most fonts have only two weights: normal and bold. If the difference between Light, Book, Normal, Bold, Heavy, and Black is important, then convert to curves.

The font specs are poor in other respects. My user agents substitute "Times New Roman" for "Frutiger Neue nn", so I see a serif rather than a desired sans serif font when I display the SVG on my laptop. If one is trying to make a figure that displays reasonably close to target on many user agents, a fallback font list is appropriate. If one does not like "DejaVu Sans" on MW, then maybe there is some other MW font that one does like; specify it. List appropriate fonts for MW, Unix, Apple, and Windows. If font family is important, then the last font in the list should be "serif" or "sans-serif" to at least get the serif aspect one desires.

Other aspects of the SVG have problems because the SVG depends on a particular font family's precise font metrics. That's almost a guarantee that substituted fonts will look poor.

The "Map of Long Island example" string was done with text-anchor:start, so it will not be centered in its box with other fonts unless the font metrics match. If the string had been done with text-anchor:middle, it would stay centered in the box no matter what the font metrics are. It still may overflow the box, but the left and right margins would match. One cannot expect the font metrics of two different fonts to match within the 2 percent sought in the MoLI example (26 characters; 1/2 character margin, text-anchor start). If one wants other fonts to work, there must be wider allowances. Center align the string and make the box 10 percent bigger, and many more fonts will work. Even better would be the use of textLength (not supported by librsvg but supported by browsers) which would make the line fit with the desired margins (i.e., the line would be left and right justified). That allows substantial font metric variation.
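
As a small illustration of the text-anchor point, the sketch below places a label at the horizontal centre of its box, so the left and right margins stay equal whatever font is substituted. The helper name centered_label and the box coordinates are invented for the example; they are not taken from the MoLI file.

```python
from xml.sax.saxutils import escape


def centered_label(text, box_x, box_width, baseline_y):
    """Return a <text> element anchored at the horizontal centre of a box."""
    centre = box_x + box_width / 2
    return (
        f'<text x="{centre:g}" y="{baseline_y:g}" '
        f'text-anchor="middle">{escape(text)}</text>'
    )


label = centered_label("Map of Long Island", 100, 300, 40)
print(label)
# <text x="250" y="40" text-anchor="middle">Map of Long Island</text>
```

The anchor point depends only on the box, not on the string length, which is why a translation of a different length stays centred.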

The leader line from "North shore" also depends on font metrics. If the "North shore" string had used text-anchor:end rather than text-anchor:start, then the leader line would be right for any font. The same goes for "(Like the South shore, but more north)" string beneath it. Notice that in the favored rendering, the two strings are not center aligned. A slight change to right justify both strings at the leader line start would make the strings tolerate wide font metric variations.

So use reasonable font specifications and the font weights will match better. Make allowances for font metric variations and use text-anchor to avoid layout issues that depend on precise matching of font metrics. Do that and you can have an SVG file that gives an appropriate appearance on many user agents without the extra work and file bloat that were suggested.

And Kelvin13's advice still ends up with a file that has odd, untranslatable, behavior. Change the SVG's "North shore" string to "Hello, World", and the visual image does not change at all because "Hello, World"'s opacity is 0. It also gives odd text selection when I display it in my browser; I get a cursor change, but no selection highlighting. It provides an unintuitive user interaction.

In response to @Glrx

Translate has good facilities for syncing translations, so I don't see it as a problem of storing translations twice. The second storage gives the ability to track history, authorship, review status etc. that are often difficult or even impossible to do in the original file format.
Translatewiki.net's translation memory is publicly available. Wikimedia's translation memory could be accessed with the same API, but it has not been made public (it is a simple configuration switch to change). Machine translation suggestions are available via cxserver.

Firstly, thank you so much for your thoughts, @Glrx! That's very helpful. I have some follow-up questions for you below:

Many files on Commons are not suitable for translation. Many SVG files need either problem detection or remediation, and that can be helped using the SVGDOM. A simpler approach would have somebody who knows the issues flag a file as a reasonable candidate for translation. If the file is simple, then a translation tool can work on it. Translate SVG tries to recognize some problem SVG files, but in the intervening years many other difficult-to-translate SVG files have appeared.

The SVG files on Commons have many quirks, and neither SVGTranslate nor Translate SVG expected many of them.

Can you list out some of the problems which come up? Examples would be very helpful. It'll help us be more cautious when making technical and design decisions.

Are there any downsides to embedding multiple translations in the same file as opposed to creating a new file for each translation?

Once a file has been switch translated, other graphics editors may not be able to effectively edit the file. Reading a file into an editor might throw away all of the switch translations. I think Inkscape (which uses SVG/XML as its base representation) can tolerate switch-translated files, but I'm not sure about other editors. The issue needs investigation. If one goes for i18n.SVG, then there may need to be a method of obtaining an SVG file suitable for graphics editors that can remerge translations.

That sounds problematic. How often would you say an SVG, once it's put on Commons, gets updated? If a graphic editor corrupts or throws away the translations and the user uploads the file as a new version of the image, there isn't much we would be able to do about it. If that happens, then MediaWiki needs to know and be able to warn the user.
I think the easy solution here is to have a list of recommended editors for editing SVG files for the users. Do we already have one?

Multiline translation units are a problem because SVG 1.1 and librsvg do not have any facility for automatic line breaking. When lines (translation units) are broken, Commons' SVG files may use separate text elements rather than one text with several tspans. Such files should merge the text elements into a single text element that comprises one translation unit.

That's an excellent point. I wasn't aware about this quirk of SVG 1.1.
For others watching this ticket: I did a quick search and found this example file which you can plug into https://tools.wmflabs.org/svgtranslate/ and you see:

Which brings up a related question - @Glrx do you think it's safe to assume that if there are multiple same strings in the file, they would have the same translation? svgtranslate does not seem to assume that so I'm wondering if there's a reason.

MW should not default to serving English images when more appropriate language versions are available for the wiki. The current mechanism was an expedient to use switch translations without caching identical PNGs for SVGs that do not have a translation, but instead of making that a manual process, it can happen automatically.

I understand that it's a frustrating problem and one that should be fixed. I would be hesitant in fixing that issue as part of this project though. I would try to squeeze in some time for investigating how much work fixing this would be and if it seems viable, we can include it in our work. I wonder if @brion has thoughts on this topic.

The team discussed the potential ways forward on this project yesterday and unanimously voted in favor of building a new ToolForge tool for this project (while pulling relevant code as needed from the svgtranslate codebase). Here are some of the reasons that influenced our decision:

Building an external tool would allow us to move a lot faster and deliver more impact on the project without getting bogged down by a lot of tech-debt work in fixing up outdated code.
Having a standalone codebase would allow volunteer developers to get involved and take over the project once it's complete.
It gives us more flexibility in terms of both the interface and functionality the tool can have.
Translating SVGs isn't a functionality that strictly relates to our projects. It can be used by anyone for any SVG.
As a tool, we'd be able to design it such that it works well in a mobile interface and allows for our users to make micro-contributions.

I recognize the fact that an extension would be able to provide the users with a more seamless experience but owing to the time constraints on this project and the current state of the extension, it seems more prudent to build an external tool.

In T201207#4505427, @Niharika wrote:

Firstly, thank you so much for your thoughts, @Glrx! That's very helpful. I have some follow-up questions for you below:

Many files on Commons are not suitable for translation. Many SVG files need either problem detection or remediation, and that can be helped using the SVGDOM. A simpler approach would have somebody who knows the issues flag a file as a reasonable candidate for translation. If the file is simple, then a translation tool can work on it. Translate SVG tries to recognize some problem SVG files, but in the intervening years many other difficult-to-translate SVG files have appeared.

The SVG files on Commons have many quirks, and neither SVGTranslate nor Translate SVG expected many of them.

Can you list out some of the problems which come up? Examples would be very helpful. It'll help us be more cautious when making technical and design decisions.

One problem is planar translations. Each language is effectively painted on one plane, and only one plane is enabled for display. That groups the translations by language rather than translation unit.

Translate SVG makes the desirable assumption that an SVG switch element holds exactly one translation unit. Each text element is one translation, and separate lines will be tspan child elements. That makes it easy to find translation units.

switch
- text systemLanguage=de Haus
- text systemLanguage=en house
- text systemLanguage=fr maison
- text systemLanguage=it casa
- text Haus
switch
- text systemLanguage=de Strasse
- text systemLanguage=en street
- text systemLanguage=fr rue
- text systemLanguage=it strada
- text Strasse
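
When a file follows that layout, collecting the translation units is straightforward. The sketch below writes the outline above as actual SVG and reads it back with Python's standard XML parser; the function name translation_units and the "fallback" key for the child without systemLanguage are my own labels.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

SVG = f"""<svg xmlns="{SVG_NS}">
  <switch>
    <text systemLanguage="de">Haus</text>
    <text systemLanguage="en">house</text>
    <text systemLanguage="fr">maison</text>
    <text systemLanguage="it">casa</text>
    <text>Haus</text>
  </switch>
  <switch>
    <text systemLanguage="de">Strasse</text>
    <text systemLanguage="en">street</text>
    <text systemLanguage="fr">rue</text>
    <text systemLanguage="it">strada</text>
    <text>Strasse</text>
  </switch>
</svg>"""


def translation_units(svg_source):
    """Return one {language: text} dict per switch element."""
    units = []
    root = ET.fromstring(svg_source)
    for switch in root.iter(f"{{{SVG_NS}}}switch"):
        unit = {}
        for text in switch.findall(f"{{{SVG_NS}}}text"):
            lang = text.get("systemLanguage", "fallback")
            # itertext() also gathers text held in tspan children.
            unit[lang] = "".join(text.itertext()).strip()
        units.append(unit)
    return units


units = translation_units(SVG)
print(units[0]["fr"], units[1]["it"])  # maison strada
```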

Many SVG files do not follow that assumption and try to minimize the number of switch elements and systemLanguage attributes. The switch children might be g elements that hold all the translation for one language -- text elements that will be painted all over the diagram. Consequently, collecting all translations of a translation unit is difficult.

switch
- g systemLanguage=de
  - text Haus
  - text Strasse
- g systemLanguage=en
  - text house
  - text street
- g systemLanguage=fr
  - text maison
  - text rue
- g systemLanguage=it
  - text casa
  - text strada
- g
  - text Haus
  - text Strasse

Sometimes, text elements are used to group languages and tspans for strings (which may be only one line of a multiline translation unit).

switch
- text systemLanguage=de
  - tspan Haus
  - tspan Strasse
- text systemLanguage=en
  - tspan house
  - tspan street
- text systemLanguage=fr
  - tspan maison
  - tspan rue
- text systemLanguage=it
  - tspan casa
  - tspan strada
- text
  - tspan Haus
  - tspan Strasse

So what a text element contains varies. One cannot expect an orderly system.
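
A tool could at least sort switch elements into these three shapes before touching them. The following is a rough sketch under that assumption; the labels "unit", "planar", and "ambiguous" are my own. The third case is called ambiguous because text children with tspans look the same whether the tspans are lines of one unit or separate units.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT, TSPAN, G = (f"{{{SVG_NS}}}{name}" for name in ("text", "tspan", "g"))


def classify_switch(switch):
    """Guess which of the layouts described above a switch element uses."""
    tags = {child.tag for child in switch}
    if tags == {G}:
        return "planar"      # one g per language, many strings inside
    if tags == {TEXT}:
        if any(child.find(TSPAN) is not None for child in switch):
            return "ambiguous"  # multiline unit or several units per language
        return "unit"        # one translation unit per switch
    return "unknown"


def parse_switch(inner):
    """Helper for the examples: wrap child markup in a namespaced switch."""
    return ET.fromstring(f'<switch xmlns="{SVG_NS}">{inner}</switch>')


clean = parse_switch('<text systemLanguage="en">house</text><text>Haus</text>')
planar = parse_switch(
    '<g systemLanguage="en"><text>house</text><text>street</text></g>'
    "<g><text>Haus</text><text>Strasse</text></g>"
)
spans = parse_switch(
    '<text systemLanguage="en"><tspan>house</tspan><tspan>street</tspan></text>'
    "<text><tspan>Haus</tspan><tspan>Strasse</tspan></text>"
)
print(classify_switch(clean), classify_switch(planar), classify_switch(spans))
# unit planar ambiguous
```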

Are there any downsides to embedding multiple translations in the same file as opposed to creating a new file for each translation?

Once a file has been switch translated, other graphics editors may not be able to effectively edit the file. Reading a file into an editor might throw away all of the switch translations. I think Inkscape (which uses SVG/XML as its base representation) can tolerate switch-translated files, but I'm not sure about other editors. The issue needs investigation. If one goes for i18n.SVG, then there may need to be a method of obtaining an SVG file suitable for graphics editors that can remerge translations.

That sounds problematic. How often would you say an SVG, once it's put on Commons, gets updated? If a graphic editor corrupts or throws away the translations and the user uploads the file as a new version of the image, there isn't much we would be able to do about it. If that happens, then MediaWiki needs to know and be able to warn the user.
I think the easy solution here is to have a list of recommended editors for editing SVG files for the users. Do we already have one?

I think most editors who know how to do switch-translated SVG are cautious and have not switch translated files that might see significant editing, so there are not many switch-translated SVGs being pulled back into graphics editors. Files do get edited after translation. SVGTranslate was used on File:Bicycle diagram-en (edit).svg ; there are subtle changes in the images. There was a cell diagram that was translated into Turkish. Sometime later, the original artist improved his diagram, so subsequent translations show the improvement, but the Turkish diagram still uses the old art. Diagrams get improved, borders get removed, and errors get fixed on many SVG diagrams. I see no reason that translated SVG files would not see such editing. In addition, as files get translated, there would be more pressure to adjust label positions to give more room for text. Compare "hind leg" in

I came across an SVG the other day that had been translated to Arabic; the translator had done a good job of making the translations single line and placing the labels in better locations. Consequently, I could see a majority of translated images being pulled into a graphics editor just to reposition the labels so they have more room, line up better (snap to grid), or use an appropriate text-anchor. I do those changes with a text editor because I know SVG, but many editors would prefer a graphics editor to hand editing SVG.

https://commons.wikimedia.org/wiki/File:Transformer3d_col3.svg has been around for a while, but I could see some substantial graphics cleanup. "Magnetic Flux" and "Transformer Core" are awkward (the flux turns into the core which turns back into flux!); the "Transformer Core" should be moved out of the way. There's room to make some multiline translation units single lines. Consequently, I think there's always the chance that a file will be pulled back into a graphics editor.

Many diagrams with an embedded PNG should be pulled into a graphics editor to vectorize the PNG. SVG files can be PNG raster images overlaid with SVG text labels. It's easy to translate the labels, but at some point it might be good to vectorize the PNG raster.

My gut tells me that only Inkscape will tolerate switch translations. Its native format is SVG, so it could easily adapt to switch elements. Most other editors will read an SVG file, convert it to an internal representation, work with the internal representation, and then write out the result. If the internal representation does not have conditionals, then the file won't be read, the conditionals will be ignored, or just one clause will be incorporated. I doubt AI or CorelDraw handles switch effectively. That needs research.
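
One cheap way to do that research, and possibly to warn on upload, is a round-trip check: compare the langtags in the file before and after it passes through an editor. This is only a sketch; the "after" string below is a made-up stand-in for an editor export that kept just one clause, not real output from AI or CorelDraw.

```python
import xml.etree.ElementTree as ET


def languages(svg_source):
    """Return the set of language tags used in systemLanguage attributes."""
    langs = set()
    for elem in ET.fromstring(svg_source).iter():
        value = elem.get("systemLanguage")
        if value:
            langs.update(tag.strip().lower() for tag in value.split(",") if tag.strip())
    return langs


def lost_languages(before, after):
    """List languages present in the old version but missing from the new one."""
    return sorted(languages(before) - languages(after))


BEFORE = (
    '<svg xmlns="http://www.w3.org/2000/svg"><switch>'
    '<text systemLanguage="de">Haus</text>'
    '<text systemLanguage="fr">maison</text>'
    "<text>house</text>"
    "</switch></svg>"
)
# Hypothetical export from an editor that flattened the switch.
AFTER = '<svg xmlns="http://www.w3.org/2000/svg"><text>house</text></svg>'

lost = lost_languages(BEFORE, AFTER)
print(lost)  # ['de', 'fr']
```

A non-empty result on a new upload would be a reasonable trigger for a warning to the uploader.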

Telling everybody to use Inkscape might be an option, but editors such as LadyofHats have produced featured images with Illustrator (such as the Culex Pipiens above). I do not want to tell her she must switch editors. Her work, being both good and featured, gets translated a lot.

https://commons.wikimedia.org/wiki/File:Circulatory_System_en.svg

Multiline translation units are a problem because SVG 1.1 and librsvg do not have any facility for automatic line breaking. When lines (translation units) are broken, Commons' SVG files may use separate text elements rather than one text with several tspans. Such files should merge the text elements into a single text element that comprises one translation unit.

Which brings up a related question - @Glrx do you think it's safe to assume that if there are multiple same strings in the file, they would have the same translation? svgtranslate does not seem to assume that so I'm wondering if there's a reason.

I do not think that is a safe assumption. In the example I gave above, "Sea" translated to both "Adriatico" and "Ionio". That's a weakness of SVGTranslate -- it only looked at strings. If we look at translation units, then "Adriatic Sea" would be different from "Ionian Sea".

If you don't group translation units, then there will be trouble down the road.

Even with translation units, assuming a translation unit always goes to the same value may not be a safe bet. There might be a diagram that shows 4 bricks. Each brick might be labeled "brick" in the source, but a translator might consider the labels sloppy and be more specific: "brick 1", "brick 2", "brick 3", and "brick 4". Some languages may make finer distinctions. A diagram of a car might label two objects as "bumper", but what if a target language distinguishes "front bumper" and "rear bumper"? It's better to place ids in the SVG. I do not know if ids survive graphics editors. I doubt data-* attributes survive most graphics editors. (I suspect that Inkscape shines here.)
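
A short sketch of why ids help: translations are looked up by element id instead of by source string, so two labels with the same source text can diverge. The ids, the coordinates, and the apply_by_id helper are invented for the example, and the English "front bumper" and "rear bumper" stand in for the target-language terms.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

SVG = f"""<svg xmlns="{SVG_NS}">
  <text id="label-front" x="10" y="20">bumper</text>
  <text id="label-rear" x="200" y="20">bumper</text>
</svg>"""

# Hypothetical translations keyed by element id rather than by source string.
TRANSLATIONS = {
    "label-front": "front bumper",
    "label-rear": "rear bumper",
}


def apply_by_id(svg_source, translations):
    """Replace the text of each <text> element that has a translated id."""
    root = ET.fromstring(svg_source)
    for text in root.iter(f"{{{SVG_NS}}}text"):
        new = translations.get(text.get("id"))
        if new is not None:
            text.text = new
    return root


root = apply_by_id(SVG, TRANSLATIONS)
labels = [t.text for t in root.iter(f"{{{SVG_NS}}}text")]
print(labels)  # ['front bumper', 'rear bumper']
```

A string-keyed lookup could not produce this result, because both source labels are "bumper".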

SVG 2 hopes to have line breaking and proposes to do it in a way compatible with SVG 1.1, but AFAIK, no browsers implement the feature yet. I do not know if it will be supported. In SVG 2, a text element with inline-size CSS will ignore absolute positioning commands and break text. SVG 1.1 will just ignore the CSS property and follow the absolute positioning commands.

https://svgwg.org/svg2-draft/text.html#InlineSize

MW should not default to serving English images when more appropriate language versions are available for the wiki. The current mechanism was an expedient to use switch translations without caching identical PNGs for SVGs that do not have a translation, but instead of making that a manual process, it can happen automatically.

I understand that it's a frustrating problem and one that should be fixed. I would be hesitant in fixing that issue as part of this project though. I would try to squeeze in some time for investigating how much work fixing this would be and if it seems viable, we can include it in our work. I wonder if @brion has thoughts on this topic.

I think improving SVG language handling is important to making the whole idea work. For English speaking users, it doesn't have any impact, but for non-English speaking users it means there are two hurdles: do the translation and do the inclusion fix up. Since we are talking about translations, non-English speaking users are the primary target.

Nemo_bis updated the task description. Aug 16 2018, 7:24 AM

Nemo_bis subscribed.

@Glrx Thank you for the notes about Kelvin13's page.
I don't have anything else to add, so will "bow out", but I'm really happy to see this feature being worked on. :)

Thank you for the detailed feedback, @Glrx. I appreciate it a lot. :)

I have started putting out all the information on the project page and asked people to comment on the open questions on the talk page.

@Glrx I understand that broken translation units are a big problem.

Translate SVG makes the desirable assumption that an SVG switch element holds exactly one translation unit. Each text element is one translation, and separate lines will be tspan child elements. That makes it easy to find translation units.

The tool we build will need to make an assumption too. Is this the right assumption to make?

I think besides building this tool, we should also come up with and document the best practices handling SVGs on Commons, for people who create or modify SVGs frequently. Problems will continue to come up unless there is a shared understanding of what is the right way to do something. This can be true for linking to other file versions or adding text labels etc.

In T201207#4510890, @Niharika wrote:

@Glrx I understand that broken translation units are a big problem.

Translate SVG makes the desirable assumption that an SVG switch element holds exactly one translation unit. Each text element is one translation, and separate lines will be tspan child elements. That makes it easy to find translation units.

The tool we build will need to make an assumption too. Is this the right assumption to make?

I'm not sure it is a reasonable assumption. There are too many files out there that do not follow that practice, and that means turning a translation tool loose on those files can be problematic. IIRC, Translate SVG tried to recognize some broken assumptions and quit.

The translation unit per switch should be a requirement. Consider a two-step process. Somebody looks at the SVG, cleans it up for translation, and then marks it as suitable for translation. If the translation program doesn't do line breaking, then never mark a multiline SVG file as suitable for translation. (Or mark portions of the SVG files as suitable for translation.) Part of cleaning up the SVG file is hoisting position and style attributes/properties. If you look at typical Inkscape output, the text elements are weighed down with graphics state. If you look at simple, hand-made, switch-translated, files, the translations are text elements with only the systemLanguage attribute and translation text content; all the graphics state is defaulted (e.g., x="0") or inherited (e.g., font-size and fill). That makes it easy to edit. The program can look to see if the switch element has that clean structure. BIDI handling falls out if the text content is unidirectional, but if it is mixed direction there are problems.
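
A rough sketch of both the check and a limited form of hoisting. It assumes "clean" means every switch child is a text element whose only attribute is systemLanguage, and it only hoists inherited presentation attributes that all children share. Positions are not handled here; moving x and y would need a transform on a wrapper, which this sketch leaves out. The function names and the attribute list are my own choices.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT = f"{{{SVG_NS}}}text"
# Inherited presentation attributes that can move to the parent (assumed list).
INHERITED = ("font-family", "font-size", "fill")


def is_clean_switch(switch):
    """True if every child is a <text> with no attribute besides systemLanguage."""
    children = list(switch)
    if not children:
        return False
    return all(
        child.tag == TEXT and set(child.attrib) <= {"systemLanguage"}
        for child in children
    )


def hoist_shared(switch, names=INHERITED):
    """Move attributes that all children share, with one value, onto the switch."""
    for name in names:
        values = {child.get(name) for child in switch}
        if len(values) == 1 and None not in values:
            switch.set(name, values.pop())
            for child in switch:
                del child.attrib[name]


switch = ET.fromstring(
    f'<switch xmlns="{SVG_NS}">'
    '<text systemLanguage="de" font-size="12" fill="black">Haus</text>'
    '<text systemLanguage="en" font-size="12" fill="black">house</text>'
    '<text font-size="12" fill="black">Haus</text>'
    "</switch>"
)
before = is_clean_switch(switch)
hoist_shared(switch)
after = is_clean_switch(switch)
print(before, after)  # False True
```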

The tool should also respect its:translate and SVG 2.0 translate attributes.
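
A sketch of what respecting those flags could look like. It assumes the ITS namespace http://www.w3.org/2005/11/its, treats a plain translate attribute the same way as its:translate, and lets the value inherit to descendants until overridden. The function name translatable_texts and the sample labels are illustrative only.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ITS_NS = "http://www.w3.org/2005/11/its"
TEXT = f"{{{SVG_NS}}}text"
ITS_TRANSLATE = f"{{{ITS_NS}}}translate"

SVG = f"""<svg xmlns="{SVG_NS}" xmlns:its="{ITS_NS}">
  <g its:translate="no">
    <text>ACME Corp</text>
    <text its:translate="yes">logo</text>
  </g>
  <text translate="no">H2O</text>
  <text>house</text>
</svg>"""


def translatable_texts(root):
    """Return <text> elements not excluded by its:translate or translate."""
    found = []

    def walk(elem, inherited):
        flag = elem.get(ITS_TRANSLATE) or elem.get("translate")
        # No flag on this element: keep whatever the ancestors decided.
        current = inherited if flag is None else flag.strip().lower() != "no"
        if elem.tag == TEXT and current:
            found.append(elem)
        for child in elem:
            walk(child, current)

    walk(root, True)
    return found


texts = [t.text for t in translatable_texts(ET.fromstring(SVG))]
print(texts)  # ['logo', 'house']
```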

In T201207#4511014, @Glrx wrote:

In T201207#4510890, @Niharika wrote:

@Glrx I understand that broken translation units are a big problem.

Translate SVG makes the desirable assumption that an SVG switch element holds exactly one translation unit. Each text element is one translation, and separate lines will be tspan child elements. That makes it easy to find translation units.

The tool we build will need to make an assumption too. Is this the right assumption to make?

I'm not sure it is a reasonable assumption. There are too many files out there that do not follow that practice, and that means turning a translation tool loose on those files can be problematic. IIRC, Translate SVG tried to recognize some broken assumptions and quit.

The translation unit per switch should be a requirement. Consider a two-step process. Somebody looks at the SVG, cleans it up for translation, and then marks it as suitable for translation. If the translation program doesn't do line breaking, then never mark a multiline SVG file as suitable for translation. (Or mark portions of the SVG files as suitable for translation.)

Does this currently happen on Commons when SVGs are added to the Translation possible - SVG (switch) category?
If not, do you think the commons community will be willing to start a new process for this? How many people active on Commons have the skills to do such cleanup for SVGs?

Part of cleaning up the SVG file is hoisting position and style attributes/properties. If you look at typical Inkscape output, the text elements are weighed down with graphics state. If you look at simple, hand-made, switch-translated, files, the translations are text elements with only the systemLanguage attribute and translation text content; all the graphics state is defaulted (e.g., x="0") or inherited (e.g., font-size and fill). That makes it easy to edit. The program can look to see if the switch element has that clean structure. BIDI handling falls out if the text content is unidirectional, but if it is mixed direction there are problems.
The tool should also respect its:translate and SVG 2.0 translate attributes.

Good points. Thanks.

The Translation possible - SVG category is/was an invitation to use SVGTranslate on the file or to use a graphics editor to change the strings and upload to a new file name.

The Translation possible - SVG (switch) is an invitation to use a text editor to add new children to the existing switches. It only means that translations were done using switch.

Cleaning up a file for translation requires graphics editing skills but no foreign language skills. The basic task is making sure there is lots of room for potential translations (reposition labels and leaders), the appropriate text-anchor is used, and text elements contain a translation unit (merge or split text). A significant number of graphic editors have those skills, and your Best Practices document could tell the original artist and others what needs to be done. Selecting the appropriate output options from Inkscape can keep the SVG simpler.

That does not mean all of them will be able to do it. There's been trouble getting editors to remove flowed text in Inkscape. Many editors are novices.

Very few editors can do the hoisting, but your program should do that.
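As a rough illustration of what automated hoisting could involve, the sketch below moves inherited presentation attributes that are identical on every text child up to the switch. The name `hoist_shared_attributes` and the attribute list are assumptions for this example. It does not handle properties set through a style attribute or CSS, and it leaves x and y alone because those are not inherited.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Inherited presentation attributes considered by this sketch.
INHERITED = ("font-family", "font-size", "fill", "text-anchor")


def hoist_shared_attributes(switch):
    """Move attributes shared by all <text> children onto the switch.

    Returns the list of attribute names that were hoisted.
    """
    texts = switch.findall(f"{{{SVG_NS}}}text")
    # Only touch switches whose children are all <text>; other children
    # would start inheriting the hoisted values.
    if not texts or len(texts) != len(list(switch)):
        return []
    hoisted = []
    for name in INHERITED:
        values = {t.get(name) for t in texts}
        shared = len(values) == 1 and None not in values
        if shared and switch.get(name) is None:
            switch.set(name, values.pop())
            for t in texts:
                del t.attrib[name]
            hoisted.append(name)
    return hoisted


# Hypothetical fragment: font-size is shared, fill is not.
SAMPLE = (
    '<switch xmlns="http://www.w3.org/2000/svg">'
    '<text systemLanguage="de" font-size="12" fill="black">Ohr</text>'
    '<text font-size="12" fill="red">Ear</text></switch>'
)

switch = ET.fromstring(SAMPLE)
moved = hoist_shared_attributes(switch)
```

Position attributes would need a different treatment (for example, a transform on a wrapping group), which is one reason hoisting is harder than it first looks.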

Niharika mentioned this in T202181: [Spike 4 hours] Investigate the work involved in defaulting SVGs to show wiki language if available. Aug 18 2018, 12:30 AM

Hello. As noted above I am the current (non-)maintainer on SVGTranslate and was the main contributor to TranslateSVG.

A lot to agree with in this thread. Some additional points:

TranslateSVG actually worked pretty well. It is not difficult to dream up cases where it doesn't, but I ran some mechanical analyses of SVGs and 80%+ are simple enough for it to work.
I don't disagree that TranslateSVG will have bitrotted in the last few years. Although, as I recall, @Nikerabbit, I did implement TUX support, at least for whatever version of TUX was available at the time. See example commit.
By far the biggest challenge with getting TranslateSVG working is that, as a volunteer, getting a new extension deployed into production is very difficult. Last time I seriously engaged with the attempt to get TranslateSVG deployed, we had very few people who felt confident checking image upload code to make sure it didn't introduce any new vulnerabilities. It's certainly a lot easier to get a proof of concept tool deployed.
Re GLRX: "If the same map also had "Ionian" and "Sea", that implies translating "Ionian" to "mare" and "Sea" to "Ionio". It would be nice if a tool aided that merging and other remediations." Just to note that TranslateSVG provides image previews, which can help with this. For the same reason I disagree with @kaldari. Being able to change some small decorative features enables translation to work out of the box (no knowledge of editing SVGs required). Trying to go into a translated SVG and change individual font sizes yourself is a nightmare. Relying on two people with different skill sets just to translate one diagram seems needless (at least in easy cases), particularly when we think about smaller language communities.
I don't know whether Illustrator destroys switches. I accept that would be an issue with a switch-based approach.

Of course I am a little sad that TranslateSVG is not going to be revived at the current time, but I do understand (by contrast SVGTranslate is a hunk of junk, not sad about that at all). You may still be able to pull out some of the backend methods from TranslateSVG -- many of them do have associated tests. I would like to think it is reasonably well documented and fundamentally the SVG spec has not really changed in the last 10 years!

And most of all -- good luck!

Nikerabbit awarded a token. Aug 19 2018, 7:09 PM

Thanks for all your comments, @Jarry1250. I have a couple of questions for you.

In T201207#4513125, @Jarry1250 wrote:

TranslateSVG actually worked pretty well. It is not difficult to dream up cases where it doesn't, but I ran some mechanical analyses of SVGs and 80%+ are simple enough for it to work.

Can you share more about the cases where you ran into trouble? I'm looking to compile a list of pitfalls to avoid. If you have the results from the analysis you ran documented somewhere, that would be very helpful.

I don't disagree that TranslateSVG will have bitrotted in the last few years. Although, as I recall, @Nikerabbit, I did implement TUX support, at least for whatever version of TUX was available at the time. See example commit.

By far the biggest challenge with getting TranslateSVG working is that, as a volunteer, getting a new extension deployed into production is very difficult. Last time I seriously engaged with the attempt to get TranslateSVG deployed, we had very few people who felt confident checking image upload code to make sure it didn't introduce any new vulnerabilities. It's certainly a lot easier to get a proof of concept tool deployed.

Extensions are harder to get deployed for everyone, not just volunteers. Security reviews alone take a couple of months. That's also a factor in why we decided to move forward with building a better tool. We can iterate faster on it and deliver more value for our efforts.

Re GLRX: "If the same map also had "Ionian" and "Sea", that implies translating "Ionian" to "mare" and "Sea" to "Ionio". It would be nice if a tool aided that merging and other remediations." Just to note that TranslateSVG provides image previews, which can help with this. For the same reason I disagree with @kaldari. Being able to change some small decorative features enables translation to work out of the box (no knowledge of editing SVGs required). Trying to go into a translated SVG and change individual font sizes yourself is a nightmare. Relying on two people with different skill sets just to translate one diagram seems needless (at least in easy cases), particularly when we think about smaller language communities.

I went through the process of translating an image and decreasing the font size, and even with Inkscape, it's a terrible process. Inkscape looks like something from the 90s. Doing it in a text editor is painstaking. I heartily agree that the tool should allow users to do small scale style changes.
I will ask the community what they think about letting users of the tool make basic styling changes.

Of course I am a little sad that TranslateSVG is not going to be revived at the current time, but I do understand (by contrast SVGTranslate is a hunk of junk, not sad about that at all). You may still be able to pull out some of the backend methods from TranslateSVG -- many of them do have associated tests. I would like to think it is reasonably well documented and fundamentally the SVG spec has not really changed in the last 10 years!

Thanks for all your work on this project, @Jarry1250. I'm sure we'll be able to use some of that code. Do you mind if we reuse the current toolforge instance (https://tools.wmflabs.org/svgtranslate/) to host the revised version of the tool? It'd be great if you could give me access to it (user Niharika29). You can manage maintainers for your tool on Toolsadmin.

Doing it in a text editor is painstaking. I heartily agree that the tool should allow users to do small scale style changes.

Just be aware that implementing that will be a lot of work with lots of edge cases.

In T201207#4515672, @kaldari wrote:

Doing it in a text editor is painstaking. I heartily agree that the tool should allow users to do small scale style changes.

Just be aware that implementing that will be a lot of work with lots of edge cases.

Good point. It won't be a part of the MVP, certainly. I'll be putting up a list of features we want in the MVP in the ticket description shortly.

Niharika updated the task description. Aug 20 2018, 10:36 PM

Niharika updated the task description. Aug 21 2018, 8:00 PM

Niharika updated the task description. Aug 22 2018, 1:36 AM

That's also a factor in why we decided to move forward with building a better tool. We can iterate faster on it and deliver more value for our efforts.

There's often an illusion of speed: in practice, you need to run just to stand still compared to MediaWiki and Translate, reimplementing all the features from scratch (including translation memories to reduce redundant work, as noted by various people above).

To "iterate faster", you could just have a test installation of TranslateSVG with a set of high-usage SVGs from Wikimedia Commons which need translations. The SVG files can be moved to Commons in bulk at the end of the testing phase, so there would be no wasted effort.

Niharika added a project: SVG Translate Tool. Aug 24 2018, 7:45 PM

Glrx mentioned this in T203444: Way to change word order?. Sep 6 2018, 4:34 PM

Niharika added subtasks: T203711: Prep work for SVG Translate tool, T203714: Create the landing page for the SVG Translate tool, T204596: Search image component for SVG Translate tool, T204849: SVG Translate tool: Language settings dialog. Sep 19 2018, 5:24 PM

MaxSem closed subtask T202181: [Spike 4 hours] Investigate the work involved in defaulting SVGs to show wiki language if available as Resolved. Oct 3 2018, 9:28 PM

kaldari mentioned this in T170817: Upgrade Thumbor servers to Stretch. Oct 4 2018, 6:31 PM

Niharika closed subtask T203714: Create the landing page for the SVG Translate tool as Resolved. Oct 12 2018, 11:09 PM

Niharika closed subtask T202771: [8 hours] Investigate ways to handle text breaking in SVGs as Resolved. Oct 15 2018, 8:28 PM

Mooeypoo closed subtask T203711: Prep work for SVG Translate tool as Resolved. Oct 25 2018, 7:52 PM

Niharika closed subtask T204596: Search image component for SVG Translate tool as Resolved. Nov 15 2018, 11:07 PM

Niharika closed subtask T202768: Decide the tech stack for the SVG Translate tool as Resolved. Nov 20 2018, 6:09 PM