
Investigate translation for OpenAPI specs
Closed, Resolved · Public · Spike

Description

Investigate a way to internationalize OpenAPI specs for AQS 2.0 by integrating with translatewiki.net. For reference, here's the localized API spec for Toolhub: https://toolhub.wikimedia.org/api-docs

Event Timeline

Possible equation:
Swagger + translatewiki + UI/skin + some custom code we'd need to create/fork to connect Swagger and translatewiki + community outreach and collaboration for translation = internationalized, automated OpenAPI specs

Dumping some high level data in advance of a meeting on this topic:

Toolhub's OpenAPI spec is generated by Django from what are basically code annotations. The message strings in those annotations are marked up so that Django's manage.py makemessages maintenance command collects them into the translation source GNU gettext PO file. This file is committed to git and later read by processes at translatewiki.net (TWN) to populate their translation catalog. When the base percentage of messages translated for a target language has been met, TWN starts outputting a target-language-specific PO file in its twice-weekly exports. These show up for Toolhub as Gerrit patches which are automerged by TWN-related bots. Translations are selected at runtime when generating the OpenAPI spec, based on Django's knowledge of the target user's language preference.
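The runtime side of that flow can be sketched with Python's standard-library gettext module (the domain name "toolhub", the localedir, and the message string below are invented for illustration; Toolhub actually wires this up through Django's translation machinery, not raw gettext):

```python
# Minimal sketch of runtime catalog selection, assuming compiled catalogs
# live at i18n/<lang>/LC_MESSAGES/toolhub.mo (hypothetical layout).
import gettext

def spec_summary(lang: str) -> str:
    # fallback=True returns a NullTranslations object when no catalog
    # exists for the requested language, so the English source string
    # is served unchanged instead of raising an error.
    t = gettext.translation(
        "toolhub", localedir="i18n", languages=[lang], fallback=True
    )
    # In Django the string would be marked with gettext_lazy() in the
    # annotation; makemessages collects marked strings into the PO file.
    return t.gettext("List all tools known to Toolhub")

print(spec_summary("es"))  # falls back to English when no "es" catalog exists
```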

Action API specifications are localized via TWN as well: https://en.wikipedia.org/w/api.php?uselang=es, https://en.wikipedia.org/wiki/Special:ApiSandbox?uselang=es. The translation source here is MediaWiki's manually curated JSON i18n files. The interaction with TWN is very similar, as are the runtime locale choice and content assembly.

The custom localization system that we built for the https://developer.wikimedia.org project may also provide useful clues for an OpenAPI localization project. In that project we use a couple of 3rd party Python libraries to extract translation units from source material (YAML and Markdown files) which we then track in git as GNU gettext PO files. These are exchanged with TWN and then at build time in Jenkins we output a separate set of static HTML files for each locale we have a translation catalog for. At runtime language selection is done manually by the visitor via URL manipulation and/or a language selector embedded in the site's chrome.

I have some pending updates on this task after exploring it for a while.

After the meeting that @bd808 kindly offered us, we concluded the following:

  • The way Toolhub translates its API won't work for us, because Toolhub is a Django application that generates the API definition from annotations in the code (they use drf-spectacular). In addition, they use translatewiki.net to translate the definition fields
  • The Action API also supports translation, but it works similarly to Toolhub: a tailored process reads the PHP code to generate the API documentation
  • We also talked about the Wikimedia Developer Portal (https://developer.wikimedia.org), which is translated using MkDocs, a Python library that generates a static website for every supported language. As far as I know, no API definition is included in this site.

After some additional research about other APIs, I could see the following:

  • MediaWiki Core and other APIs serve just an OAS YAML definition directly using Swagger UI, and they have no support for translation
  • Some APIs in the API Portal have a definition manually written in the portal itself, and no OAS YAML definition is offered. Since the API Portal is a MediaWiki site it has i18n support, but it seems there is no translation for its content (I have tried several languages, but the content is always offered in English)

In addition to all this, we have to keep in mind that:

  • Each AQS 2.0 service generates its own OAS YAML definition from annotations added to the code (in Go) using swag, but currently we cannot be sure that services from other projects do it that way

Later, I had another interesting meeting (a Global Session) with Amir Aharoni about the translation process, and he explained to me in detail how translatewiki.net works. So, after both meetings, what I clearly understood is that translatewiki.net is the way to support translation for any project within the Foundation (and even outside it):

  • This tool manages a key-value file that contains all the fields and messages you need to translate, and it generates a file with the right values for every specific language.
  • JSON is the recommended format for files containing fields that must be translated (gettext is older and more problematic).
  • All we have to prepare, in the project repository, is a folder called i18n with 2 files:
    • en.json: the original key-value file with all the messages in English that we need to translate
    • qqq.json: a key-value file with an explanation for each message (to give the community context for translating every field)
  • The tool will generate a new key-value pairs file for every language with translated messages
  • It's the community who translates, through the website, every field we include in the i18n key-value file we prepare. It's very similar to the way translation files are prepared in any regular application with i18n support. But in this case, translatewiki.net even has a bot that takes your i18n files from your repository, makes them available to the community to translate through translatewiki.net, and puts them back into the project repo every day with all the available updates
  • The Language team gives support if needed (#talk-to-language Slack channel). And, if we need support, we can talk with Niklas Langstrom, who belongs to this team and is close to the technical details
  • Before starting to work with translatewiki.net, we have to register the project and follow some instructions:
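To make the i18n folder concrete, here is a sketch of the two files described above, built and written with Python (the message keys and texts are invented for illustration; the real keys would come from the AQS spec fields):

```python
# Hypothetical en.json / qqq.json contents for an i18n/ folder.
import json

en = {
    "@metadata": {"authors": ["AQS team"]},
    "aqs-pageviews-summary": "Get pageview counts for a page",
    "aqs-pageviews-description": "Returns daily or monthly pageview counts.",
}

qqq = {
    "@metadata": {"authors": ["AQS team"]},
    "aqs-pageviews-summary": "Short summary shown for the pageviews endpoint.",
    "aqs-pageviews-description": "Longer description of the pageviews endpoint.",
}

# Every translatable key in en.json should have a matching explanation in
# qqq.json so translators get context for each message.
assert set(en) == set(qqq)

with open("en.json", "w") as f:
    json.dump(en, f, ensure_ascii=False, indent=4)
with open("qqq.json", "w") as f:
    json.dump(qqq, f, ensure_ascii=False, indent=4)
```

translatewiki.net would then produce sibling files such as es.json or fr.json with the same keys and translated values.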

Anyway, in addition to translatewiki.net, we would need to create something that lets users see the API definition in the language they choose. We would need a tailored process. In the end, it's basically what Virginia mentioned in an earlier comment:

  • We could explore using swagger-i18n-extension. If we merged the i18n files that translatewiki.net provides with the OAS YAML file to create the source that this tool needs, we could just run it to generate a definition for every language we include (it seems feasible). After this process we would get, for example, three definition files (definition.eng.yaml, definition.spa.yaml, definition.fre.yaml) if we had translations available for English, Spanish, and French
  • At this point, we should think about a way for users to choose the language in which they want to read the API definition (maybe we can customize something using RapiDoc or another OAS definition renderer). It's something we should explore as well
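The merging step above can be sketched as follows, under the simplifying assumption that translatable fields hold message keys that get substituted per language (swagger-i18n-extension has its own source format, and the spec fragment, keys, and strings here are invented):

```python
# Sketch: produce one localized spec dict per available language by
# substituting message keys with translated strings.
import json

spec = {
    "openapi": "3.0.0",
    "paths": {
        "/pageviews": {
            # message key used as a placeholder for the translatable field
            "get": {"summary": "aqs-pageviews-summary"}
        }
    },
}

# Per-language catalogs, e.g. as exported by translatewiki.net.
translations = {
    "en": {"aqs-pageviews-summary": "Get pageview counts for a page"},
    "es": {"aqs-pageviews-summary": "Obtener el número de visitas de una página"},
}

def localize(node, messages):
    # Recursively rebuild the spec, replacing any string that is a known
    # message key with its translation; unknown strings pass through.
    if isinstance(node, dict):
        return {k: localize(v, messages) for k, v in node.items()}
    if isinstance(node, list):
        return [localize(v, messages) for v in node]
    return messages.get(node, node) if isinstance(node, str) else node

# One definition per language, analogous to definition.eng.yaml etc.
specs = {lang: localize(spec, msgs) for lang, msgs in translations.items()}
print(specs["es"]["paths"]["/pageviews"]["get"]["summary"])
```

A real implementation would parse and re-serialize the YAML instead of working on dicts, but the substitution logic would be the same.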

Thanks for the update, @Sfaci!

Since the API portal is a MediaWiki site it has i18n support but it seems there is no translation for its content (I have tried several languages but the content is always offered in English)

This is correct. The API Portal is not currently running the MediaWiki Translate Extension, which is needed to enable translations. We do plan to enable the Translate Extension once the Portal is in a more stable state, but it won't help translate OpenAPI specs with the current tools.

Each AQS 2.0 service generates its own OAS yaml definition from some annotations added into the code (Go language) using swag but currently we cannot be sure that other services from different projects do it that way

There are a few other projects that generate an OpenAPI spec from code annotations, but they all use different tools that are specific to the programming language or framework.

Restricted Application changed the subtype of this task from "Task" to "Spike". · View Herald Transcript · Oct 3 2023, 2:00 PM
apaskulin assigned this task to Sfaci.

I think we can call this task done since Sfaci completed the investigation. I think we learned enough to conclude that, without readily available tooling for this use case for Go, there's no easy way to internationalize the API specs at the service level. Internationalizing the OpenAPI specs for the AQS services would require building custom tooling.