Page MenuHomePhabricator

i18n: Improve the workflow for developing & maintaining Codex i18n messages
Closed, ResolvedPublic3 Estimated Story Points

Description

In Codex v1.9.0, we introduced a new useI18n composable to simplify the way that translatable strings are used in components. Some documentation for how to use the useI18n composable can be found here. However, we could stand to improve the current workflow somewhat.

  • Every time a new translatable string is used in a Codex component, there is an expectation that the same string will be added to MediaWiki's i18n files simultaneously. Codex now depends on the MW codebase (implicitly) in a way that it did not in the past.
  • Correct usage is not enforced by any automated linting or testing; "failure mode" silently defaults to English text.
  • Forgetting to add new messages back to MW could result in broken interfaces in production Wikipedia for non-English-speaking users.
  • There is no single place within the Codex docs that keeps track of all i18n messages (and their default values) used across all components.

Recommendations

We should institute a new workflow for working with translatable strings in Codex that attempts to address the issues listed above. Ideally, we would automate as much of the process as possible. Some specific things we could do:

  • Consider adding all new messages to the Codex changelog on each release. When a new Codex release is deployed to MW, cross-reference the list of new messages with MW's i18n JSON files. This process could be added to the Codex release instructions.
  • Consider creating a page in the Codex docs that lists all current i18n strings and their default values, as well as which component(s) they are used in. This could be displayed in a table.
  • Consider linting/testing against every call to useI18n in the Codex codebase; using a new message key without defining it in the proper place (where?) would result in a failure.
  • Consider moving the i18n JSON files which define the component messages into the Codex itself (we'd still want them integrated into TranslateWiki however). This may be necessary if we want better linting/testing for i18n messages.

Next Steps

  • The team has discussed the above recommendations and agreed on what to implement

Codex updates

  • A new constant has been added to the constants.ts file – this should be an array of strings corresponding to each message key currently defined with useI18n
  • A new type is derived from this constant and used in the type definition for the useI18n composable. This means that Typescript should generate an error if useI18n is called with a new message key that has not yet been added to the list in the constant
  • A build script has been added to the Codex package that generates a JSON file from the new constant which contains all message keys currently in use; this file should be included in the published version of the package

MediaWiki updates (these may be moved to a follow-up task in the future)

  • The new message key definition file needs to be added to the files copied over into MW from Codex in foreign-resources.yml
  • A unit test should be added in MW Core that tests the contents of this file against the message keys in MW – any key present in this file but not present in the en.json or qqq.json files should cause an error. This test should run in CI, and it should prevent a new Codex release from being added to MW unless all necessary message keys are added in MW core
  • ResourceLoader should use this new file to automatically deliver the appropriate message keys when Codex components are loaded

Event Timeline

CCiufo-WMF renamed this task from Improve the workflow for developing & maintaining Codex i18n messages to i18n: Improve the workflow for developing & maintaining Codex i18n messages.Jul 31 2024, 4:03 PM
egardner renamed this task from i18n: Improve the workflow for developing & maintaining Codex i18n messages to [Spike] i18n: Improve the workflow for developing & maintaining Codex i18n messages.Jul 31 2024, 6:54 PM
  • Consider linting/testing against every call to useI18n in the Codex codebase; using a new message key without defining it in the proper place (where?) would result in a failure.

What about simply generating a file (maybe a .d.ts one) on release that lists all messages, and linting that in MediaWiki core? So adding a new message wouldn’t cause the Codex release to fail (it’s not actionable there anyway), but it would cause the next update in MediaWiki core to fail unless the message is added to the JSON file.

  • Consider moving the i18n JSON files which define the component messages into the Codex itself (we'd still want them integrated into TranslateWiki however). This may be necessary if we want better linting/testing for i18n messages.

Integrating into Translatewiki shouldn’t be difficult, but interpreting the JSON files the right way probably is: MediaWiki takes user settings, wiki settings and URL parameters into account when determining the requested language, and then uses fallback chains (defined in PHP files) to determine what language to use if no translation in the requested language is available. All these are available for free in mw.msg() but have to be reimplemented in JavaScript/Node.js if Codex does localization on its own.

lwatson set the point value for this task to 3.Aug 5 2024, 5:41 PM
  • Consider linting/testing against every call to useI18n in the Codex codebase; using a new message key without defining it in the proper place (where?) would result in a failure.

What about simply generating a file (maybe a .d.ts one) on release that lists all messages, and linting that in MediaWiki core? So adding a new message wouldn’t cause the Codex release to fail (it’s not actionable there anyway), but it would cause the next update in MediaWiki core to fail unless the message is added to the JSON file.

Thanks for the suggestion. We chatted about this a bit today and I think that we are going to try to implement the following workflow:

  • In Codex, we'll define a new constant array (in the constants.ts file) that lists all current message keys currently in use (similar to how we list all possible options for things like button actions).
  • We'll generate a Typescript type from this file, and that type will be used for the first argument of useI18n. In practice, this will mean that TS will complain every time you define a new message key in Codex until you add that key to the list in the constant.
  • We'll introduce a small build script that reads the value of this constant and introduces a text or JSON file listing all the keys as part of the Codex NPM package. This can be done as a prepack or postpack NPM script.
  • Finally, we'll update some code in MediaWiki core to read the content of this file and use it in a few different places:
    • some kind of unit test that can run against the Codex release commit, to make sure that all message keys defined in the Codex package are present in MW Core's own i18n files
    • Use this in the Codex ResourceLoader code which determines which messages to send to the client
    • Maybe elsewhere depending on what is needed

This should give us a decent amount of safety and predictability whenever new messages get introduced: on the Codex side, we'll have to add them to a dedicated list or TS will throw an error. On the MW side, CI will fail if the new message keys are not found in the i18n files in core.

Change #1060181 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[design/codex@main] i18n: Make the system more type-safe

https://gerrit.wikimedia.org/r/1060181

We chatted about this a bit today and I think that we are going to try to implement the following workflow:

Sounds like a plan, thanks! One thing:

In practice, this will mean that TS will complain every time you define a new message key in Codex until you add that key to the list in the constant.

…but may also complain if you use one of the existing keys, but in a such complex way that TS infers string as its type (rather than a string literal union). I’m okay with this limitation (explicitly specifying type should always work around such problems), but please confirm that you’re also okay with it.

  • Use this in the Codex ResourceLoader code which determines which messages to send to the client

Cool, I haven’t thought about even this being possible!

In practice, this will mean that TS will complain every time you define a new message key in Codex until you add that key to the list in the constant.

…but may also complain if you use one of the existing keys, but in a such complex way that TS infers string as its type (rather than a string literal union). I’m okay with this limitation (explicitly specifying type should always work around such problems), but please confirm that you’re also okay with it.

Can you provide an example of how this might happen? Right now the keys are intended only for use as arguments to the new useI18n and useI18nWithOverride composables. The patch over at https://gerrit.wikimedia.org/r/c/design/codex/+/1060181 relies on Typescript's const assertion feature (aka export const foo as const), which ensures that type I18nMessageKey type never broadens to just "string" – it enforces the exact values in the array. We've used a similar approach for things like Button actions.

I really like this feature of TS not just for the added type safety, but also because it provides helpful developer feedback in any TS-aware IDE like VSCode, where you can see all the valid values as immediate feedback in the editor.

Change #1060181 merged by jenkins-bot:

[design/codex@main] i18n: Make the system more type-safe

https://gerrit.wikimedia.org/r/1060181

Can you provide an example of how this might happen?

const type: 'first'|'last'|'next'|'prev' = 'first';
const msg = `cdx-table-pager-button-${type}-page`; // inferred as generic `string`
useI18n(msg, () => 'foo'); // doesn’t compile

As you see, the problem is with the parameter to useI18n(), not with I18nMessageKey itself – so it leads to false positives (which are annoying but safe), never false negatives. And the workaround is easy: just write explicit const msg: I18nMessageKey = ....

Can you provide an example of how this might happen?

const type: 'first'|'last'|'next'|'prev' = 'first';
const msg = `cdx-table-pager-button-${type}-page`; // inferred as generic `string`
useI18n(msg, () => 'foo'); // doesn’t compile

As you see, the problem is with the parameter to useI18n(), not with I18nMessageKey itself – so it leads to false positives (which are annoying but safe), never false negatives. And the workaround is easy: just write explicit const msg: I18nMessageKey = ....

Ah ok. I don't anticipate doing much dynamic message key generation in this way; instead I think typical usage of useI18n will be with a string literal as the first argument (with the expectation that the same string literal must be added to the constants file to satisfy the compiler).

Components which have a lot of messages (like Table, for the pagination controls) will be the exception IMO. If we need to support usage like what you are describing, then maybe we'll need to rethink some aspects of this system. For now I don't really see that happening.

Change #1060504 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[mediawiki/core@master] Update Codex from v1.10.0 to v1.11.0

https://gerrit.wikimedia.org/r/1060504

Test wiki created on Patch demo by EGardner (WMF) using patch(es) linked to this task:
https://patchdemo.wmcloud.org/wikis/e5da39d6d5/w

Change #1060504 merged by jenkins-bot:

[mediawiki/core@master] Update Codex from v1.10.0 to v1.11.0

https://gerrit.wikimedia.org/r/1060504

Change #1060926 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[design/codex@main] i18n: Build a JSON file for use in MW from Codex message keys

https://gerrit.wikimedia.org/r/1060926

Change #1060926 merged by jenkins-bot:

[design/codex@main] i18n: Build a JSON file for use in MW from Codex message keys

https://gerrit.wikimedia.org/r/1060926

Change #1060934 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[mediawiki/core@master] CodexModule: use list of i18n message keys from Codex package

https://gerrit.wikimedia.org/r/1060934

Change #1060941 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[mediawiki/core@master] tests: Add a test for Codex i18n messages

https://gerrit.wikimedia.org/r/1060941

egardner renamed this task from [Spike] i18n: Improve the workflow for developing & maintaining Codex i18n messages to i18n: Improve the workflow for developing & maintaining Codex i18n messages.Aug 8 2024, 11:23 PM

I've posted a series of patches that should cover almost all of the acceptance criteria listed in the task.

On the Codex side, this patch enables the generation of a new file during the Codex build process – messageKeys.json. This is a flat JSON array containing every string message key currently in use in the library.

On the MediaWiki side, this patch instructs the CodexModule class in ResourceLoader to read from the contents of that file in order to add all the listed messages there to the payload whenever a Codex component is used.

Finally, this follow-up patch in MW adds a "structure" tests to ensure that every message key from the messageKeys.json file is also present in the en.json and qqq.json files for Codex in MW.

The last piece of the puzzle will happen the next time we do a Codex release - at this point, the person doing the release will need to ensure that messageKeys.json gets added to the list of files in the Codex entry in foreign-resources.yml. That will overwrite the file I copied over there and ensure that we get automatic updates with each Codex release going forward.

Change #1060934 merged by jenkins-bot:

[mediawiki/core@master] CodexModule: use list of i18n message keys from Codex package

https://gerrit.wikimedia.org/r/1060934

Change #1060941 merged by jenkins-bot:

[mediawiki/core@master] tests: Add a test for Codex i18n messages

https://gerrit.wikimedia.org/r/1060941

Change #1064134 had a related patch set uploaded (by Catrope; author: Catrope):

[mediawiki/core@master] Update Codex from v1.11.0 to v1.11.1

https://gerrit.wikimedia.org/r/1064134

Change #1064134 merged by jenkins-bot:

[mediawiki/core@master] Update Codex from v1.11.0 to v1.11.1

https://gerrit.wikimedia.org/r/1064134