Page MenuHomePhabricator

Installer Task abstraction
Closed, ResolvedPublic

Description

Installer and DatabaseInstaller have a lot of weird and specific things in them that make them hard to reuse. For example, Installer::__construct() resets the service container. I plan to factor out some of the logic from DatabaseInstaller into a Task class hierarchy. Task::execute() will receive a context object with a relatively narrow interface, feasible to implement outside of the installer such as in the proposed core addWiki.php.

Installation will run a list of such tasks. It will be similar to the current "steps" list except with more OOP.

Event Timeline

I think that when you have a list of tasks to execute that have side effects like creating databases and tables, you want the task to be idempotent, or - alternatively - to have a rollback mechanism in case of failure.

I'm not aware of a way to enforce idempotency in code, besides unit/integration tests. I'm not sure there is a way to do something like:

  • For each task class, run ::execute
  • If it has a ::rollback method, also execute it
  • if it doesn't, run ::execute a second time and ensure it returns the same outcome as the first time

in a generalized enough way to enforce it.

But even if not enforceable, maybe it would be good to ensure existing tasks follow this policy and it's documented.

Also, if you have a hierarchy of tasks that you want to run in a coordinated way, you should always account for the possibility one step fails for external reasons (for instance, the database master crashes suddenly), or for software bugs.

IME the best way to manage such systems is to create a finite state machine that uses task execution to move between states. If all tasks are idempotent, you can just pick up the state machine in its last known good state and continue the execution of tasks from that known good state.

There's multiple other ways to achieve the same result, of course, but given we'll have at least two processes that need to go through quite a few tasks, it might be useful to think of a unified approach at the problem of managing failures (and restarting the work at a later stage).

I think that when you have a list of tasks to execute that have side effects like creating databases and tables, you want the task to be idempotent, or - alternatively - to have a rollback mechanism in case of failure.

I don't want to have any bulk extension updates as part of this task, since that would be redundant with T258852 which I'm told is happening soon. Extension (and core) SQL schema files are typically a sequence of CREATE TABLE statements, and running the SQL file is conditional on the existence of some table in the sequence. That makes them idempotent by a narrow definition, but not atomic. You can run schema creation twice, but if it fails halfway through, that will leave the schema in a state which cannot be automatically resolved into a deterministic state. There is no simple way to implement either rollback or continuation from such a state.

This could be fixed by parsing the abstract schema files at runtime, and creating the tables one at a time, but that's not how @Ladsgroup decided to implement the abstract schema feature, and there are no plans to change it.

We could parse the SQL, but that seems out of scope.

Some tasks can theoretically be rolled back, and I wouldn't mind having a rollback method in the interface, but we can't fully implement it right now.

Change #1075668 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] installer: Task abstraction

https://gerrit.wikimedia.org/r/1075668

Change #1082116 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] installer: Move key generation from an install step to an env check

https://gerrit.wikimedia.org/r/1082116

Change #1082310 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] installer: Reverse disableStorage() by reloading service wiring

https://gerrit.wikimedia.org/r/1082310

Change #1082116 merged by jenkins-bot:

[mediawiki/core@master] installer: Move key generation to getDefaultSettings

https://gerrit.wikimedia.org/r/1082116

Change #1082310 merged by jenkins-bot:

[mediawiki/core@master] installer: Reverse disableStorage() by reloading service wiring

https://gerrit.wikimedia.org/r/1082310

Change #1075668 merged by jenkins-bot:

[mediawiki/core@master] installer: Task abstraction

https://gerrit.wikimedia.org/r/1075668