
JsonConfig remote-data-with-Lua-transform API query
Closed, ResolvedPublic

Assigned To
Authored By
bvibber
Mar 10 2025, 5:07 PM

Description

As part of the work on T385540 (allow Lua functions to transform data for charts), I'll be enhancing JsonConfig's remote fetching for Data: pages so that the fetched data can be run through a Lua function. This lets the transform functions, their maintenance, and their execution live on the same wiki as the Data: pages themselves (Commons in production), while still allowing for caching and rate limiting within the cluster.

Conceptual model:

  • each JsonConfig-using wiki in the farm is either the 'store' wiki for a given type of data, where the pages in the Data: namespace actually "live", or a remote wiki that fetches the Data: pages through the standard MediaWiki action API and caches them in a shared cache.
  • -> extend this so that a remote wiki can also specify a transform function to run the data through, by providing a custom query in the API
  • client wikis use this rather than trying to run their own transforms
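
To make the model above concrete, here is a minimal Python sketch of the lookup path. It is not the JsonConfig code: the function names, the plain-dict cache, and the choice to key the cache on both the page title and the transform options are all assumptions made for illustration.

```python
import json
from typing import Callable, Dict, Optional


def cache_key(title: str, transform: Optional[dict]) -> str:
    """Key on both the page and the transform options (an assumption),
    so different transforms of one page are cached separately."""
    return json.dumps([title, transform], sort_keys=True)


def get_data(
    title: str,
    transform: Optional[dict],
    is_store_wiki: bool,
    load_local: Callable[[str, Optional[dict]], dict],
    fetch_remote: Callable[[str, Optional[dict]], dict],
    shared_cache: Dict[str, dict],
) -> dict:
    """Hypothetical lookup: the store wiki serves the page itself,
    every other wiki goes through the shared cache and the API."""
    if is_store_wiki:
        # The store wiki owns the Data: page and runs the transform locally.
        return load_local(title, transform)

    key = cache_key(title, transform)
    if key in shared_cache:
        return shared_cache[key]

    # Client wikis do not run the transform; they ask the store wiki to.
    result = fetch_remote(title, transform)
    shared_cache[key] = result
    return result
```

In this sketch a client wiki that asks twice for the same page and transform triggers only one remote fetch; whether the store wiki also caches transformed output is left open here.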

Security considerations:

  • we're running code, yay! But the same code can already be run by asking the API to parse wikitext with {{#invoke:}} in it, in any of several different ways.
  • -> any abuse considerations for parsing (performance counters, rate limiters, etc.) may need to be replicated here

Data considerations:

  • to track dependencies for cache invalidation, it may be wise to pass through information from a ParserOutput; additional page usages via the Lua code will appear in the template links there and can be passed through along with the Module: page itself.

Details:

  • where does the API query belong?
  • how to pass options?
    • language for string localization can be selected via uselang
    • as a single blob, transform={blah}, with URL-encoded JSON?
    • or as separate parameters, transform=module&transformfunction=func&transformargs={...} etc?
      • we probably still want arbitrary JSON for the args, so this doesn't get us away from JSON anyway
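
The two encodings under discussion can be compared with a short Python sketch. Only the parameter names uselang, transform, transformfunction and transformargs come from the list above; the keys inside the JSON blob ("module", "function", "args") and the helper names are hypothetical, and neither form is claimed to be the final API.

```python
import json
from urllib.parse import parse_qs, urlencode


def encode_blob(module: str, function: str, args: dict, uselang: str) -> str:
    """Option 1: everything in one URL-encoded JSON blob.
    The blob's key names are made up for this sketch."""
    blob = json.dumps(
        {"module": module, "function": function, "args": args},
        separators=(",", ":"),
    )
    return urlencode({"uselang": uselang, "transform": blob})


def encode_split(module: str, function: str, args: dict, uselang: str) -> str:
    """Option 2: separate parameters; the args are still JSON."""
    return urlencode({
        "uselang": uselang,
        "transform": module,
        "transformfunction": function,
        "transformargs": json.dumps(args, separators=(",", ":")),
    })


def decode_blob(query: str) -> dict:
    """Recover the transform options from an option-1 query string."""
    return json.loads(parse_qs(query)["transform"][0])
```

Either way the arguments end up as JSON inside a query parameter; the split form mainly makes the module and function names visible without decoding the blob.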

The next step will be to wire this into JCSingleton::getContent(), either by extending it with the transform options or by creating a side interface that takes them, and to have that fetch across wikis as well as run locally.

Event Timeline

CCiufo-WMF moved this task from Backlog to Sprint 18 on the Charts board.
CCiufo-WMF edited projects, added Charts (Sprint 18); removed Charts.

Change #1127692 had a related patch set uploaded (by Bvibber; author: Bvibber):

[mediawiki/extensions/JsonConfig@master] WIP Lua filtering backend for JsonConfig Data: pages

https://gerrit.wikimedia.org/r/1127692

We have a WIP patch but it is not ready to review. We are hoping to have it ready by end of week. No blockers right now. If Brooke gets stuck she'll run to Tim to get some help. We will be seeking review from Subbu and Tim.

CCiufo-WMF lowered the priority of this task from High to Medium.Mar 24 2025, 6:34 PM

Ok this is back on the top of my stack, will update the patch for the final ADR starting tomorrow.

Renamed everything from 'filter' to 'transform'; now extending the jsondata API to run the transforms. It currently accepts transform settings as a JSON blob and runs them. Next step: hook up structured output so we can store dependencies and an expiry time.


Now extracting the templatelinks and globaljsonlinks entries, including one for the data page being transformed, and exporting them in the dependencies list, which will feed the future client side's records of cache-invalidation dependencies.

An expiry TTL is also passed through. In theory, use of some features can trigger a low expiry time; it may or may not be possible to trigger those from Lua right now.
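
As a rough illustration of how a client could use the exported dependencies and TTL, here is a hedged Python sketch. The class name, field names, and the flat list of page titles are assumptions for the example; the actual output structure is not described in this task.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TransformResult:
    """Hypothetical client-side record of one transformed fetch."""
    data: dict
    dependencies: List[str]  # titles of pages the output depends on
    ttl_seconds: int         # expiry passed through from the store wiki
    fetched_at: float = field(default_factory=time.time)

    def is_stale(
        self,
        changed_page: Optional[str] = None,
        now: Optional[float] = None,
    ) -> bool:
        """Stale once the TTL has elapsed, or when a page listed in
        the dependencies has changed."""
        now = time.time() if now is None else now
        if now - self.fetched_at >= self.ttl_seconds:
            return True
        return changed_page is not None and changed_page in self.dependencies
```

A client would then refetch when an edit touches the Data: page, the Module: page, or anything else the Lua code pulled in, and otherwise fall back on the TTL.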


bvibber renamed this task from JsonConfig remote-data-with-filter API query to JsonConfig remote-data-with-Lua-transform API query.Apr 30 2025, 6:03 PM
bvibber updated the task description.

Adding a new JCContentLoader/JCContentWrapper interface on top of data fetches to encapsulate this better; it's making the code fall together nicely. action=jsontransform is working for exporting local transforms but needs some cleanup for self-documentation. The next step is to add remote transforms via the API, then hook into that from the Charts end.


Local and remote paths now both function, at least in ideal circumstances. :D Some cleanup left but I expect to open up to code review tomorrow.


We are going to add tests and I am going to provide a more thorough review.

Change #1151747 had a related patch set uploaded (by Bvibber; author: Bvibber):

[operations/mediawiki-config@master] Enable Lua transform switch for Charts

https://gerrit.wikimedia.org/r/1151747

Change #1127692 merged by jenkins-bot:

[mediawiki/extensions/JsonConfig@master] Lua transform backend for JsonConfig Data: pages

https://gerrit.wikimedia.org/r/1127692

@bvibber could you document the new capability on https://mediawiki.org/wiki/Extension:Chart ? Thanks!

Added a listing for 'transform' and an entry in the syntax example, and dropped in a link to the detailed transform docs I prepared at https://www.mediawiki.org/wiki/Extension:Chart/Transforms

Change #1151787 had a related patch set uploaded (by Bvibber; author: Bvibber):

[mediawiki/extensions/JsonConfig@wmf/1.45.0-wmf.3] Lua transform backend for JsonConfig Data: pages

https://gerrit.wikimedia.org/r/1151787

Change #1151787 merged by jenkins-bot:

[mediawiki/extensions/JsonConfig@wmf/1.45.0-wmf.3] Lua transform backend for JsonConfig Data: pages

https://gerrit.wikimedia.org/r/1151787

Mentioned in SAL (#wikimedia-operations) [2025-05-29T16:09:05Z] <bvibber@deploy1003> Started scap sync-world: Backport for [[gerrit:1151787|Lua transform backend for JsonConfig Data: pages (T388434)]], [[gerrit:1151788|Chart-side support for Lua transforms (T388616)]]

Mentioned in SAL (#wikimedia-operations) [2025-05-29T16:33:13Z] <bvibber@deploy1003> bvibber: Backport for [[gerrit:1151787|Lua transform backend for JsonConfig Data: pages (T388434)]], [[gerrit:1151788|Chart-side support for Lua transforms (T388616)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-05-29T16:49:17Z] <bvibber@deploy1003> Finished scap sync-world: Backport for [[gerrit:1151787|Lua transform backend for JsonConfig Data: pages (T388434)]], [[gerrit:1151788|Chart-side support for Lua transforms (T388616)]] (duration: 40m 11s)