It has been proposed that we allow users to deploy Lua code which is stored in Git to multiple wikis. Let's figure out what that would look like.
We want something lightweight that still takes steps towards a full solution and does not preclude future improvements. So we need a plan.
Package granularity
When you are running code from one version of a module, and call require() to get a submodule, you should never get code from a different version of the same module. So instead of modules and submodules, we should talk about packages.
Packages should be in a consistent state during any given parse operation.
require()
We should encourage code that looks like Lua code elsewhere on the internet. Consider LuaRocks as a source of packaging conventions.
In the Lua ecosystem, require() takes a string in hierarchical dotted form, for example require "busted.modules.files.moonscript". There are no relative paths; the string is always globally unique. The first component is the package name.
LuaRocks packages give an explicit map between module names (the string passed to require) and file names (example). The top level name may be defined as a module, conventionally init.lua (example).
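As a rough illustration of that mapping, here is a minimal sketch of resolving a require() string to a file within a package. The manifest layout is an assumption, loosely modelled on the build.modules table of a LuaRocks rockspec. The names manifest, packageName and resolveFile, and the file paths, are hypothetical and not part of any existing API.

```lua
-- Hypothetical package manifest. The layout and paths are illustrative only.
local manifest = {
    package = "somepackage",
    modules = {
        -- The top-level name is itself a module, conventionally init.lua.
        ["somepackage"] = "src/somepackage/init.lua",
        ["somepackage.util"] = "src/somepackage/util.lua",
    },
}

-- The first dotted component of a module name is the package name.
local function packageName(moduleName)
    return (moduleName:match("^([^.]+)"))
end

-- Resolve a module name to a file path inside the given package.
-- Returns nil if the module belongs to another package or is not listed.
local function resolveFile(manifest, moduleName)
    if packageName(moduleName) ~= manifest.package then
        return nil
    end
    return manifest.modules[moduleName]
end
```

Because the package name is always recoverable from the string passed to require(), a loader could pick the deployed version of the package first and then look up the file inside that one version.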
Deployment scope
Packages may require() other packages. An on-wiki module may require() any other on-wiki module. Thus the scope of a set of deployed packages and their versions is the whole wiki. A wiki depends on a package. Different modules can't depend on different versions of a package.
The deployment state must be consistent within the context of a parse operation.
With global templates, you can consider the global repository to be a package with its own recursive package dependencies.
Migration and multi-version deployments
@daniel asked if it's possible to allow callers to specify package versions, like require('package@1.0'). Here is a summary of our discussion.
- Motivation: Suppose a library deploys a backwards-incompatible version of the same interface, and there are, say, 20 on-wiki callers. Instead of adding forwards-compatible branches to all 20 callers, it would be easier to explicitly require the old version, then simultaneously update the calling code and the required version by editing each caller.
- Mitigation: Discourage backwards-incompatible changes of that kind. Have a deprecation cycle and provide a new interface instead of breaking the existing interface. Remove the old interface when it has no callers. If this is not convenient, make a new package, so instead of require('package@2.0'), you would have require('package2').
- Con: Specifying the version may increase the effort required for uncomplicated updates. The dashboard proposal (T412317) has updates as one or two clicks, whereas this would potentially require editing many pages.
- Con: Specifying the version in the caller is unconventional. For example, we don't version the Lua standard library or the mw library, so we are committed to backwards compatibility in those areas.
- Con: Packages will depend on each other using version constraints in a manifest file, so they don't need to specify the version in require. Multiple packages can be updated in one deployment action. Encouraging this require syntax will complicate moving code from the wiki to packages and back.
- Con: Some packages in LuaRocks assign globals instead of returning a table of functions from require. This would not be possible if we are loading multiple versions of a library in the context of a parse.
- Con: Specifying the version complicates sandbox and pilot deployments. If a module explicitly requires an old version, under what circumstances can we render a page using a pilot or test version?
- Con: Purging of old versions becomes more difficult. If users can specify any version, then we need to cache every version.
We're not going to do this.
Development cycle
Developers must be able to see the effects of their code on a real wiki without making a merge request or deploying the change. Consider two types of development workflow:
- On-wiki. A developer works on module pages, perhaps under a sandbox hierarchy on a production wiki. When they are satisfied, they want to export the current state of the sandbox as a merge request.
- Local files. A developer checks out the package from git and edits in a local IDE. They have a local MediaWiki installation. During parse, MediaWiki fetches modules directly from the development filesystem.
It follows that there must be at least three types of repository client:
- Local sandbox
- Filesystem
- Server cache
Package registry
Users should refer to packages by name, not by URL, so that URLs can be updated globally, and so that packages can easily be overridden during local development. So we need some way to find named packages.
Options:
- Just a prefix, e.g. https://gitlab.wikimedia.org/lua-repos/$name
- Self-hosted LuaRocks server. This can solve version dependencies for us, but it brings a lot of baggage. Packages would need a rockspec file, which needs to be renamed and updated at each release.
- Some simple DIY map, like a YAML file in a git repo.
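The first and third options can be combined. The following is a minimal sketch, assuming a registry table with a default prefix plus optional explicit entries, and a per-developer override table for local development. The function packageUrl, the table layout, the "examplepkg" entry and the override path are all hypothetical.

```lua
-- Hypothetical registry: explicit entries win, otherwise fall back to the prefix.
local registry = {
    prefix = "https://gitlab.wikimedia.org/lua-repos/",
    packages = {
        -- Illustrative entry for a package that lives at a non-default location.
        examplepkg = "https://gitlab.wikimedia.org/lua-repos/examplepkg-renamed",
    },
}

-- Map a package name to a repository location.
-- overrides is an optional table of name -> location for local development.
local function packageUrl(registry, name, overrides)
    -- Accept only simple names, so that a name cannot escape the prefix.
    if not name:match("^[%w_%-]+$") then
        return nil, "invalid package name"
    end
    if overrides and overrides[name] then
        return overrides[name]
    end
    return registry.packages[name] or (registry.prefix .. name)
end
```

With this shape, moving a repository only requires changing the registry, and a developer can point a name at a local checkout without touching any calling code.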
Versions
Tagging packages with a version number before deployment enables the following features:
- Packages depending on other packages with version constraints.
- Wikis depending on packages with version constraints.
- Human-readable package versions, displayed on the wiki.
We could cope with commit hashes, but version numbers seem nice to have.
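To make "version constraints" concrete, here is a minimal sketch of checking a version against a constraint. It assumes strict three-part numeric versions such as 1.2.0 and constraints of the form "operator version". Neither format has been decided here, and the function names are hypothetical.

```lua
-- Parse "1.2.3" into {1, 2, 3}. Returns nil for anything else.
local function parseVersion(s)
    local major, minor, patch = s:match("^(%d+)%.(%d+)%.(%d+)$")
    if not major then
        return nil
    end
    return { tonumber(major), tonumber(minor), tonumber(patch) }
end

-- Returns -1, 0 or 1 when a is lower than, equal to or higher than b.
local function compareVersions(a, b)
    for i = 1, 3 do
        if a[i] ~= b[i] then
            return a[i] < b[i] and -1 or 1
        end
    end
    return 0
end

-- Check a version string against a constraint such as ">= 1.2.0".
-- Supported operators (an assumption): ==, >=, <=, >, <.
local function satisfies(version, constraint)
    local op, wanted = constraint:match("^%s*([<>=]=?)%s*(%S+)%s*$")
    local v = parseVersion(version)
    local w = wanted and parseVersion(wanted)
    if not (op and v and w) then
        return false
    end
    local c = compareVersions(v, w)
    if op == "==" then return c == 0
    elseif op == ">=" then return c >= 0
    elseif op == "<=" then return c <= 0
    elseif op == ">" then return c > 0
    elseif op == "<" then return c < 0
    end
    return false -- unknown operator, e.g. a lone "="
end
```

Components are compared numerically, so 1.10.0 sorts after 1.9.3. Commit hashes have no such ordering, which is one reason version numbers are nicer to have.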
Adding and updating deployed packages
Fetching is triggered via an unauthenticated endpoint and is allowed as long as the specified repository is under a configured prefix. Typically, fetching will be triggered by a GitLab webhook (T412320). Packages are fetched to shared tables and are available globally.
However, to use a package on a wiki, it must be deployed, and this requires a user right. A user edits a wiki page, perhaps via a dashboard special page (T412317), adding the package to the list of deployed packages. Similarly, updating the deployed version of a package is done by editing the version on this wiki page. Deployment is local to the wiki.
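The split between global fetching and per-wiki deployment can be sketched as two checks. This is only an illustration of the logic, written in Lua for consistency with the rest of this page. The data shapes, the function names isFetchAllowed and validateDeployment, and the sample contents are assumptions.

```lua
-- Fetching is allowed only for repositories under the configured prefix.
local function isFetchAllowed(url, prefix)
    return url:sub(1, #prefix) == prefix
end

-- Hypothetical contents of the shared store: name -> set of fetched versions.
local fetched = {
    somepackage = { ["1.0.0"] = true, ["1.1.0"] = true },
}

-- Validate a wiki's deployment list (name -> version) against the shared store.
-- Returns true, or false plus a sorted list of problems.
local function validateDeployment(deployment, fetched)
    local problems = {}
    for name, version in pairs(deployment) do
        if not (fetched[name] and fetched[name][version]) then
            problems[#problems + 1] = name .. " " .. version .. " has not been fetched"
        end
    end
    table.sort(problems)
    return #problems == 0, problems
end
```

A check like validateDeployment could run when the deployment page is saved, so that a wiki never depends on a version that is missing from the shared store.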
Parser integration
The simplest thing is to have no additional #invoke feature. Users can make a proxy module Module:Somepackage:
return require "somepackage"
And then {{#invoke:somepackage|func}} will work.
require() could register a link from the page being parsed to the package name, so that the parser cache can be invalidated when a new version of the package is deployed.
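A minimal sketch of that idea follows, assuming the real loader is passed in as a function. makeTrackingRequire and the used set are hypothetical names. In practice the link registration would happen on the PHP side rather than in a Lua table.

```lua
-- Wrap a loader so that each parse records which packages it touched.
-- loader is any function taking a module name and returning the module.
local function makeTrackingRequire(loader)
    local used = {}   -- set of package names used during this parse
    local loaded = {} -- per-parse module cache, so each module loads once

    local function trackingRequire(moduleName)
        -- The first dotted component is the package name.
        local pkg = moduleName:match("^([^.]+)")
        if pkg then
            used[pkg] = true
        end
        if loaded[moduleName] == nil then
            loaded[moduleName] = loader(moduleName)
        end
        return loaded[moduleName]
    end

    return trackingRequire, used
end
```

At the end of the parse, the used set would give the list of packages to record as links from the page, so that deploying a new version of any of them can invalidate the cached output.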
In the future we might like to have global creation or shadowing of these proxy pages. That would fit in well with a global templates feature. Similarly trans-wiki inclusion of a proxy module page, like {{#invoke:commons:somepackage|func}}, would fit under the global templates banner but is not necessary for the present work.
Caching repository client
For availability and performance, there should be no access to GitLab during parse.
Downloading new versions of packages and inserting them into persistent storage should be done before allowing the deployment page to be changed.
Rollback of a deployment page to a previous version is a local operation as long as the cache has not been purged.
For performance reasons when scaling this up to 1000+ wikis, package files should not be stored as pages. The persistent cache can then be shared across all wikis. There could be a package file viewer giving read access to this cache.
Gadget synergies
There is a similar request for gadgets in git (T187749). For example, the caching repository client would be useful for Gadgets. Code that can be shared between Scribunto and Gadgets could be in a separate extension.