
Evolve hook system to support "filters" and "actions" only
Open, Normal, Public

Description

  • Objectives
  • Preamble
  • Problem statement
  • Proposal

Objectives

I believe the following would be desirable outcomes for this RFC. Perhaps we should call these "acceptance criteria".

Status quo statements that we'd like to uphold:

  1. Extensions can make small changes to default behaviour of a feature without significant code duplication or large boilerplates, and in a way that isn't mutually exclusive with another extension also making similar changes. (e.g. extensions can collaboratively make small changes to the same implementation).
  2. Extensions can replace an entire service. (e.g. an extension can replace a feature, in a way that is mutually exclusive with core or other extensions).

Statements we'd like to become true in the future:

  1. Most extensions can be written in a way that will remain functional in the latest stable MediaWiki release for a long time without any maintenance. ("a long time" is believed to be multiple major release cycles no shorter than the time between LTS releases; using a deprecated feature is considered functional; it was noted that while mandatory maintenance after an LTS release is acceptable, it should still be minimal in most cases)
  2. Contributors can usually improve core features without needing to break backward compatibility.
  3. Most extensions can and would by default implement their features in a way that won't significantly slow down user-facing web requests. ("can and would" is meant to signify that the performant way shouldn't require "extra care", but rather good performance should be a natural consequence of an extension author selecting the simplest approach available to them).

Preamble

First, a brief summary of the status quo. MediaWiki's current extension interfaces are:

  • Replace an implementation (high abstraction level).
  • Add an implementation (medium abstraction level).
  • Hooks (low abstraction level).

Replace an implementation

These are service-like abstractions in MediaWiki for which the implementation can be swapped.

Examples: JobQueue backend (db, redis, ..), FileRepo backend and LockManager (local filesystem, swift, ..), PoolCounter, RCFeed, Database system (mysql, postgres, sqlite, etc.).

The configuration variable has a (sometimes dynamic) default, and a site administrator would set this to a fully-qualified class name of their choosing. The class in question can either be one of the implementations that ship with core, or provided by an extension.

These services typically have only one instance, which will automatically be used by all components in core and other extensions. The reason for changing the implementation is usually to provide the same set of features but with different underlying technology. For example, to improve performance, scalability, or to share operational resources within an organisation etc.
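
For illustration, a minimal LocalSettings.php sketch of such a swap (the values are illustrative), selecting the Redis-backed JobQueue implementation that ships with core:

$wgJobTypeConf['default'] = [
    'class'       => 'JobQueueRedis',
    'redisServer' => '127.0.0.1:6379',
    'redisConfig' => [ 'password' => 'secret' ],
    'claimTTL'    => 3600,
];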

Add an implementation

Several MediaWiki components allow extensions to register additional implementations, associated with a symbolic name. Typically all implementations are available simultaneously (e.g. not mutually exclusive). It is customary (by social convention, sometimes enforced technically) for extensions not to override existing entries; rather, they are only meant to add new ones.

Examples: Special pages, Skins, API modules, Content models, Media Handlers, Parser tags, Parser functions, and Magic words.

Extensions have to implement a class interface, subclass, or function signature, and add their implementation to a registry, usually through an extension attribute or site configuration.

The core code then provides the bridge to these implementations from the rest of the system. E.g. the Preferences page automatically lists available skins by localised label, the "useskin" parameter supports rendering with any of the installed skins, the Parser automatically scans for added magic words to call the appropriate function etc.
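
For illustration, a minimal sketch of adding an implementation (the page and class names are hypothetical): a special page is registered under a symbolic name, and core then takes care of routing, listing, and output.

// In practice this mapping usually comes from the SpecialPages attribute in extension.json:
$wgSpecialPages['MyPage'] = MySpecialPage::class;

class MySpecialPage extends SpecialPage {
    public function __construct() {
        parent::__construct( 'MyPage' );
    }

    public function execute( $subPage ) {
        // Core provides the bridge: the Special:SpecialPages listing, routing, and the output page.
        $this->getOutput()->addHTML( '<p>Hello from MyPage.</p>' );
    }
}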

Hooks

A hook allows arbitrary code ("hook handlers") to run via a callback at a specific point in core code. The core code declares that point with Hooks::run( 'TheHookName', [ $a, $b ] );, then extensions implement a function somewhere that accepts parameters $a and $b (e.g. MyExtensionHooks::onTheHookName) and add it to the hooks registry (e.g. in extension.json, by adding the function name to a Hooks.TheHookName array).
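
A minimal sketch of the status quo (the hook name and handler are hypothetical):

// Core declares the extension point:
Hooks::run( 'TheHookName', [ $a, $b ] );

// An extension registers a handler (via extension.json, or equivalently in PHP):
$wgHooks['TheHookName'][] = 'MyExtensionHooks::onTheHookName';

class MyExtensionHooks {
    public static function onTheHookName( $a, $b ) {
        // Arbitrary code may run here; returning false aborts abortable hooks.
        return true;
    }
}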

See also https://www.mediawiki.org/wiki/Manual:Hooks.


Problem statement

With this RFC, I'd like to solve problems that affect MediaWiki contributors and extension authors primarily when using the Hooks system. They affect the other extension interfaces as well, but to a lesser extent.

Hooks expose a lot of (mutable) state.

Hooks often expose high-level class instances, and in their entirety (e.g. passing $this or other high-level objects, as parameters). This means a hook handler is able to vary its return value based on anything in the current state, and also to modify any of that state, and to call any of the public methods on those objects at that time or at later times (deferred).

Allowing this kind of inspection and flexibility doesn't have to be a problem. But, I believe it is currently causing problems because we _also_ don't make a distinction between what is "recommended" and what is possible "at your own risk".

A recommended interface would be officially supported, with breaking changes happening rarely and with public announcements. Ideally with much documentation, examples, and migration guides. An at-your-own-risk approach would still be possible but might require more maintenance to keep compatible, and might sometimes change multiple times between LTSes without (much) documentation.

I believe the absence of this distinction causes the following problems for maintainers:

It is difficult to limit (or selectively encourage) how extensions will use a certain hook.

Authors and maintainers of core functionality cannot know what a hook is "supposed" to be used for, or how to change the surrounding code in core in a way that is backwards compatible.

Contributors to core functionality are unable to change code without potentially breaking an extension.

Maintaining or improving MediaWiki core is difficult.

And as consequence of the above, there are problems for extension authors as well:

Extension authors cannot know which hook they should use to implement their custom behaviour in a way that will (likely) remain stable, and that they will (likely) receive support for (in the form of documentation/migration) if and when it does change.

Extension authors have to deal with breaking changes almost every release cycle.

Maintaining MediaWiki extensions is difficult and/or requires a lot of time.

Note that this last point applies both to extensions maintained by the Wikimedia Foundation and to those from third parties.


Proposal 1

Drafted by Daniel Kinzler and Timo Tijhof based on discussions during sessions on the first days of TechConf 2018. Presented in the session "Architecting Core: Extension Interface" on the final day. The below has been amended to incorporate the feedback from that session. (Notes on mediawiki.org).

We group hooks into two broad categories: actions, and filters.

Action hooks

An asynchronous event handler, whereby a handler callback does additional work in response to an action that has occurred.

Their characteristics are as follows (a short illustrative sketch follows the list):

  • Action hooks cannot modify or cancel the default result or behaviour during the user action.
  • Action hooks cannot be aborted, returning a value is considered an error.
  • Action hooks may do expensive work, which is acceptable because they are asynchronous in nature. That is, they are not part of the atomic transaction that spans the original action. Instead, they are run from a deferred update or job.
  • Action hooks are not given mutable values (e.g. value object with setters, or primitives passed by reference). Instead, they are only given immutable value objects and primitives by value, with any other read or write services obtained from the services container as needed.
  • An action hook callback cannot be prevented from running by other callbacks. (Exceptions are caught, logged, and then the hook continues; similar to deferred updates). {note: open question}
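
To make the shape concrete, a hypothetical sketch under the proposed interface (Hooks::action(), the hook name, and the service name are illustrative, not existing APIs):

// Core, after the edit has been committed:
Hooks::action( 'PageContentSavedAction', [ $titleValue, $revisionId ] );

// Extension handler, run later from a deferred update or job:
class MyExtensionHooks {
    public static function onPageContentSavedAction( TitleValue $title, $revisionId ) {
        // Only immutable values are passed in; any read/write services are
        // obtained from the service container as needed.
        $notifier = MediaWikiServices::getInstance()->getService( 'MyExtension.Notifier' );
        $notifier->notify( $title, $revisionId );
        // No return value: returning one is considered an error for action hooks.
    }
}
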
Filter hooks

A synchronous function to modify a specific value or state.

Their characteristics are as follows (see the sketch after this list):

  • Filter hooks allow the callback to modify a value (e.g. a mutable or replaceable value object, or primitive passed by reference).
  • Filter hooks may return false to prevent other callbacks for the same filter from running. For example, when an extension disables a piece of functionality it might empty the value and then return false.
  • Filter hooks must not do expensive work. Execution time will be profiled and violations above a configurable threshold will be logged as warnings.
  • Filter hooks are given only one mutable value, the other parameters must be immutable values, primitives by value, and/or read-only services.
  • Filter hooks must be deterministic and must not cause side-effects. Their behaviour should be based on the input parameters only. Other instances and services should not be obtained. {note: open question}
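
Again, a hypothetical sketch under the proposed interface (Hooks::filter() and the hook name are illustrative):

// Core: let extensions adjust the skin's body classes. $classes is the single
// mutable value; the other parameters are read-only.
Hooks::filter( 'OutputPageBodyClass', [ &$classes, $titleValue, $skinName ] );

// Extension handler: deterministic, cheap, and free of side-effects.
class MyExtensionHooks {
    public static function onOutputPageBodyClass( array &$classes, TitleValue $title, $skinName ) {
        if ( $skinName === 'minerva' ) {
            $classes[] = 'myextension-compact';
        }
        // Returning false here would stop later filters from running.
    }
}
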
PHP interface

Note: The section for the PHP interface is still an early draft and has not yet received wider input.

We'll need a way to register callbacks, and a way to run callbacks. Today that is Hooks::run(), Hooks::runWithoutAbort(), and an extension attribute holding an array of callables like Hooks.<Name>. We could follow the same names and do it as follows:

  • Register via ActionHooks.<Name> and FilterHooks.<Name>.
  • Run via Hooks::action(), Hooks::filterWithAbort() and Hooks::filter().

In the above interfaces, Hooks::action could hold logic for asserting we are in async context, catch exceptions, and report errors about return values. Hooks::filter could measure execution time (e.g. cumulatively by extension name with a shutdown handler reporting violations with Monolog).
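
As a very rough sketch of the kind of logic Hooks::filter() could carry (the getHandlers() lookup and the property are hypothetical):

class Hooks {
    /** @var float[] Cumulative filter execution time, keyed by extension name */
    private static $filterTime = [];

    public static function filter( $name, array $args ) {
        foreach ( self::getHandlers( "FilterHooks.$name" ) as $extension => $callback ) {
            $start = microtime( true );
            call_user_func_array( $callback, $args );
            // Accumulate execution time per extension; a shutdown handler can then
            // log violations above the configured threshold (e.g. via Monolog).
            self::$filterTime[$extension] = ( self::$filterTime[$extension] ?? 0 )
                + ( microtime( true ) - $start );
        }
    }
}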

Policy

Note: The section about policy is still an early draft and has not yet received wider input.

  • New hooks must be documented in a way that indicates whether they are actions, filters or legacy hooks, and whether they are abortable.
  • New hooks of type "action" or "filter" must adhere to the "must" statements above. This means that if there is not yet a value class available to represent the information in question, those classes must be written first. {note: open question}
  • ... (more?) ..

Secondary outcomes

At TechConf 2018, a number of outcomes from this proposal were noted as positive, but they are not themselves requirements for the RFC. I believe these are important for MediaWiki in the long term, and that if they are not improved by this RFC (e.g. because we amend it or go with an alternate proposal), we should instead discuss them in a separate RFC.

We should not regress on the below factors, but improving these factors is not a requirement for alternate proposals.

  • Encourage database interaction to be in a transaction separate from the main transaction that spans the "pre-send" time of web requests. (E.g. provide eventual consistency via deferred updates or job queue, instead of atomic consistency.)
  • Increase operational high-availability of MediaWiki by reducing the chance of cascading failures.

Open questions
  1. If an action hook encounters a fatal error (e.g. class undefined, out of memory, execution timeout), should we ensure other callbacks still run? If so, how? If not, how should we document this?
  2. The current phrasing allows a filter hook to interact with services not passed as parameters, as long as they don't have side-effects (because it says "should not" instead of "must not"). Should this change to a "must"? If not, what are the use cases? If we make it a "must not", what about site configuration? Should we pass Config as a parameter to all filters? Or whitelist services->getConfig as the one allowed external call? Or something else?
  3. Should we include a guideline about batching? It was raised at TechConf 2018 that consumers of "filter hooks" might prefer that they are always called for individual items (no batching within the hook), which encourages code reuse and makes pure functions easier. It was also raised that for "action hooks", we may want to standardise on always reflecting data shapes, or always reflecting user workflows. E.g. for an action hook about uploads, where a user uploaded 3 files, should that invoke the hook 3x with 1 file (file-oriented), or 1x with a list of 3 files (user-oriented)? Or could we leave this open to be decided on a case-by-case basis? Or we could encourage implementations to provide both (seems pretty simple to do).
  4. Should we require that for all new hooks, if the feature in question does not have value objects, that those are implemented first? If we don't want to require this, how do we move forward? Perhaps we can state that value objects are mandatory if using the new hook system, but that the old system may continue to be used and have new hooks added until a certain point. E.g. leave non-deprecated until 1.34, and deprecate (and disallow new hooks) in 1.35 or something like that, which would give us time to get value objects figured out over 1 year of time before it starts to block introduction of new hooks.

Event Timeline

Krinkle created this task.Dec 21 2018, 2:36 AM
Krinkle updated the task description. Dec 21 2018, 4:10 AM
  • Action hooks may do expensive work, which is acceptable because they are asynchronous in nature. That is, they are not part of the atomic transaction that spans the original action. Instead, they are run from a deferred update or job.
  • An action hook callback cannot be prevented from running by other callbacks. (Exceptions are caught, logged, and then the hook continues; similar to deferred updates). {note: open question}

It sounds like this really wants to say that when an action hook is fired, the registered callbacks aren't actually called. Instead MediaWiki actually does add a DeferredUpdate for each one. Or, alternatively, you're going to require that "Hook::runAction()" is only called from inside a DeferredUpdate and internally reimplements the catch-and-log logic DeferredUpdate already provides.

I note this may lead to hooks that should be "action hooks" being defined as "filter hooks" instead to avoid race conditions, or to avoid lost updates when a fatal error occurs after the original action but before the deferred update gets run.

We should probably make it reasonably easy for the core code to run an action's deferred updates earlier than pre-send/post-send, and possibly allow a non-standard registration of an action that is run immediately rather than being deferred if the usual behavior is to defer to post-send.

  • Action hooks are not given mutable values (e.g. value object with setters, or primitives passed by reference).

It seems like this will result in us defining a multitude of pairs of read-only and read-write versions of objects, with added logic (and associated performance hit) to create a read-only version from the standard read-write version every time a hook needs to be called. Or weird objects that can somehow get set to "read-only mode", requiring testing of the mode in every setter.

Or, if you relax the requirement a little, a multitude of read-only interfaces that are only implemented by read-write objects. Which could then be cast to the read-write object if the hook function wants to ignore the documentation.

It might be more straightforward to just use the "supported" versus "at your own risk" distinction to specify whether calling setters on the passed object is allowed.

  • Filter hooks are given only one mutable value, the other parameters must be immutable values, primitives by value, and/or read-only services.

What is this limitation supposed to gain us? I'm not sure it's a win to have to do something like

$dummy = (object)[
    'foo' => &$foo,
    'bar' => &$bar,
];
Hooks::run( 'SomeHook', [ $dummy ] );

or, more extreme,

$dummy = new SomeHookDummyMutableObject( $foo, $bar );
Hooks::run( 'SomeHook', [ $dummy ] );
$foo = $dummy->getFoo();
$bar = $dummy->getBar();

just to be able to have more than one modifiable value or service passed to the hook function.

  • Filter hooks must be deterministic and must not cause side-effects. Their behaviour should be based on the input parameters only. Other instances and services should not be obtained. {note: open question}

This seems unlikely to be feasible. Consider, for example, AbuseFilter's hook to potentially prevent an edit from being made. Besides the Config object you already have as an open question, it'll need to access the database to load its defined filters, the main stash to check throttling (which it'll also need to update with the new throttle value), a logger (which will access request state to populate some of the log fields), the Session or some other way to communicate "warning was shown" to a future request so the warning can be overridden, maybe a MessageLocalizer, and possibly other things that aren't so obvious off the top of my head.

  1. If an action hook encounters a fatal error (e.g. class undefined, out of memory, execution timeout), should we ensure other callbacks still run? If so, how? If not, how should we document this?

"If so, how?" is a really good question, considering that a fatal error by definition doesn't allow more PHP code to run. I guess we'd have to put each one as a separate job on a job queue, and avoid excessive queue lag or otherwise overloading the job queue.

  3. Should we include a guideline about batching? [...] Or could we leave this open to be decided on a case-by-case basis?

Case-by-case basis, which includes the possibility of having both per-item and batched hooks, seems most reasonable to me.

  4. Should we require that for all new hooks, if the feature in question does not have value objects, that those are implemented first? If we don't want to require this, how do we move forward? Perhaps we can state that value objects are mandatory if using the new hook system, but that the old system may continue to be used and have new hooks added until a certain point. E.g. leave undeprecated until 1.34, and deprecate (and disallow new hooks) in 1.35 or something like that, which would give us time to get value objects figured out over 1 year of time before it starts to block introduction of new hooks.

One year to completely refactor 15 years' worth of technical debt?

brion added a comment.Jan 3 2019, 6:55 AM

A few quick notes:

  • we should sketch out a few extensions in this model -- I'll take a look at some later in the week
  • aggressively pushing action hooks to deferred or jobqueue means we need to be much better about making the job queue *work* reliably (and quickly) on small installs
  • immutable interfaces are probably sufficient for most of what we want on hook params, without necessarily needing to create full value classes for everything. We want both sides of the hook to know what's promised & what's allowed & what might change from under you
  • side effects issues are something to consider more, about guarantees vs recommendations. needs some real-world testing to see where these guidelines lie
hoo added a subscriber: hoo.Jan 3 2019, 5:52 PM
Bawolff added a subscriber: Bawolff.Jan 7 2019, 9:07 PM
daniel moved this task from Inbox to Under discussion on the TechCom-RFC board.Jan 8 2019, 2:30 PM
  • Action hooks may do expensive work, which is acceptable because they are asynchronous in nature. That is, they are not part of the atomic transaction that spans the original action. Instead, they are run from a deferred update or job.
  • An action hook callback cannot be prevented from running by other callbacks. [..]

It sounds like this really wants to say that when an action hook is fired, the registered callbacks aren't actually called. Instead MediaWiki actually does add a DeferredUpdate for each one. Or, alternatively, you're going to require that "Hook::runAction()" is only called from inside a DeferredUpdate and internally reimplements the catch-and-log logic DeferredUpdate already provides.

Yep, that's the idea. In terms of implementation, I suppose it'd be simplest to reason about if we always have action hooks produce a sequence of deferred updates. One for the loop itself, and one scheduled for each callback from inside that loop; we don't need to detect whether we're already in a DU and nest updates or re-create DU's execution logic. Whether to evaluate the list of callbacks synchronously or whether to only evaluate it at pre-send is a detail I think we should leave for CR, but I'm fine with either. We could evaluate it synchronously in order to retain the current behaviour of being able to unregister/register during run-time and "skip" hooks that were absent while the hook was scheduled. Or we could evaluate at pre-send and avoid race conditions of registering late (this might be expected/desirable from an event-driven model).
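
A rough sketch of that scheduling, using today's DeferredUpdates API (the runner method and the getHandlers() lookup are hypothetical):

class Hooks {
    public static function action( $name, array $args ) {
        // One deferred update for the loop itself...
        DeferredUpdates::addCallableUpdate( function () use ( $name, $args ) {
            foreach ( self::getHandlers( "ActionHooks.$name" ) as $callback ) {
                // ...and one scheduled per callback from inside that loop, so a failure
                // in one handler is caught and logged without preventing the others.
                DeferredUpdates::addCallableUpdate( function () use ( $callback, $args ) {
                    call_user_func_array( $callback, $args );
                } );
            }
        } );
    }
}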

I note this may lead to hooks that should be "action hooks" being defined as "filter hooks" instead to avoid race conditions, or to avoid lost updates when a fatal error occurs after the original action but before the deferred update gets run.

That would violate the principle and not be allowed under this RFC (as currently drafted). But, I understand your reasoning of course, and they're important concerns to think about.

In my opinion, it is a requirement for this RFC to address these use cases at least as well as they are addressed today, and possibly better (but taking care to avoid scope creep; some of these are in my opinion orthogonal issues better improved separately).

I was greatly surprised to learn that our "execution timeout killer" also applies post-send. I'm personally less concerned about fatal errors ("application crashes"), because those should be rare and are an area where I don't think we can provide many meaningful guarantees even if we tried. The only meaningful guarantee I think we can currently give (and will continue to) is that no database writes are committed unless they all are (ensuring database integrity), and that jobs/deferred updates can be queued ahead of time in such a way that they'll only run if those writes were successful.

The guarantee I'd like to provide in addition to that, which we used to have before the execution timeouts, is that (apart from fatal errors) a deferred update will always run at least once if the request reached the post-send stage. This is imho out of scope for this RFC, but I imagine what we could do in this area is:

  • Disable the execution timeout as the last step of the pre-send handler.
  • Deprecate non-enqueuable deferred updates. (Or, terminology aside: have only "jobs", which we sometimes prioritise to run post-send, and will queue normally if they fail, or if pressured for some reason, e.g. to minimise post-send time; whether this is a class called "Job" or "Update" is TBD.)

As for race conditions. I think this is an important area not to regress in, but I'm not sure whether this is currently a guarantee already. Code that runs in-request today can certainly overlap out-of-order already. E.g. write requests A B C starting in that order, but reaching db-commit as C A B. Pre-commit they can be out of order. Pre-send/post-send can be out of order. The only one that might be in-order is onTransactionCommitOrIdle, although not even sure about that. If we assume the three writes all require the same db-row-lock and wait for each other, that would mean callback-A sees write-C, and callback-B sees write-A. Which is good. But I'm not sure it means the callback itself will be in-order and see each other's side-effects, given that the PHP threads are concurrent and won't have anything blocking them (it's post-commit).

Granted, the chances and size of the anomaly would grow if we use jobs more regularly. But I think that's a reason to not rely on this in general and rather find better patterns in general for code that currently relies on such ordering.

Let's look at the example use case of an extension mirroring edits from a page to an external system (e.g. Twitter, or Git, or another wiki) with the design requirement that the external system has activity in the "correct" order, matching the original wiki history.

If the extension just blindly broadcasts the metadata for each edit, I don't think there's a feasible and scalable way to do that in a manner that applies to all hooks in general. That would require a level of locking and orchestration that I think is possible, but not scalable for most hooks.

Instead, you'd probably want to treat each event as an instruction for the extension to "pull" up to the given point. That is, other edits would be synchronised first as part of that job (if they happened to not have run yet). And given other concurrent jobs running, the extension would likely want to use a lock within the job as well to make sure only one is active at any point in time.

We should probably make it reasonably easy for the core code to run an action's deferred updates earlier than pre-send/post-send, and possibly allow a non-standard registration of an action that is run immediately rather than being deferred if the usual behavior is to defer to post-send.

I hadn't considered allowing them to run earlier. Can you elaborate on what an example use case for this would be?

That would violate the principle and not be allowed under this RFC (as currently drafted). But, I understand your reasoning of course, and they're important concerns to think about.

Whether it's "allowed" or not, it'll probably happen if action-hooks can't satisfy existing use cases.

The guarantee I'd like to provide in addition to that, which we used to have before the execution timeouts, is that (apart from fatal errors) a deferred update will always run at least once if the request reached the post-send stage. This is imho out of scope for this RFC,

It probably shouldn't be out of scope, or this should probably depend on whatever task is for improving deferred update reliability. You're talking about pushing many more things to deferred updates, which exposes many more things to the risk that the main database update happens while the deferred update gets lost. We already seem to be running into this sort of thing as individual things are piecemeal being pushed to deferred updates.

  • Disable the execution timeout as the last step of the pre-send handler.
  • Deprecate non-enqueuable deferred updates. (Or, terminology aside: have only "jobs", which we sometimes prioritise to run post-send, and will queue normally if they fail, or if pressured for some reason, e.g. to minimise post-send time; whether this is a class called "Job" or "Update" is TBD.)

Both of those sound like good plans to me, even though the Closure-based deferred updates will likely be annoying to have to reimplement.

In particular, I think having "Jobs" and "DeferrableUpdate" being separate things is something in the current code base that makes both of them harder to work with, and combining the two concepts would be a win despite the loss of the ability to have non-serializable data in DeferrableUpdates.

As for race conditions. I think this is an important area not to regress in, but I'm not sure whether this is currently a guarantee already. Code that runs in-request today can certainly overlap out-of-order already. E.g. write requests A B C starting in that order, but reaching db-commit as C A B. Pre-commit they can be out of order. Pre-send/post-send can be out of order. The only one that might be in-order is onTransactionCommitOrIdle, although not even sure about that. If we assume the three writes all require the same db-row-lock and wait for each other, that would mean callback-A sees write-C, and callback-B sees write-A. Which is good. But I'm not sure it means the callback itself will be in-order and see each other's side-effects, given that the PHP threads are concurrent and won't have anything blocking them (it's post-commit).

I don't think the problem you laid out there will be a major issue, although it might be a source of tedious bugs. My impression is that we have fairly good support for being able to deal with callback-A seeing write-C.

Possibly more problematic would be that splitting write-A and callback-A into separate transactions means opportunity for requests B and C in between to see the intermediate state. This is the sort of thing we can encounter now, for example, when a page edit happens but the update of pagelinks and categorylinks for the page's new content has been lost or delayed.

We should probably make it reasonably easy for the core code to run an action's deferred updates earlier than pre-send/post-send, and possibly allow a non-standard registration of an action that is run immediately rather than being deferred if the usual behavior is to defer to post-send.

I hadn't considered allowing them to run earlier. Can you elaborate on what an example use case for this would be?

How about a cache invalidation? If someone edits a page that affects the UI, we'd ideally like to invalidate the cache entry pre-send so the user who made the edit sees its effect on the post-edit page. If the callback-A for the cache invalidation is forced to be done post-send, though, it'll often enough not be in time for the post-edit render to pick it up.

For core we don't use action-hooks for this sort of thing and can call MessageCache->replace() pre-send, of course. Are we sure enough that extensions will never need to do anything like this that we're ok with denying them the option?

@Krinkle wrote:
  • Action hooks are not given mutable values (e.g. value object with setters, or primitives passed by reference).

It seems like this will result in us defining a multitude of pairs of read-only and read-write versions of objects [..]
Or, if you relax the requirement a little, a multitude of read-only interfaces [..]
It might be more straightforward to just use the "supported" versus "at your own risk" distinction [..]

Agreed.

I do believe there will be cases where an existing scalar value is available that would be better than a heavy object (e.g. avoid them where we can). I also believe we should have more value classes than we currently do. But I agree it doesn't have to be a hard requirement for the hook system to mandate what the objects aren't capable of; documenting what we support would suffice.

@Krinkle wrote:
  • Filter hooks are given only one mutable value, the other parameters must be immutable values, primitives by value, and/or read-only services.

What is this limitation supposed to gain us? I'm not sure it's a win to have to do something like

$dummy = (object)[
    'foo' => &$foo,
    'bar' => &$bar,
];
Hooks::run( 'SomeHook', [ $dummy ] );

or, more extreme,

$dummy = new SomeHookDummyMutableObject( $foo, $bar );
Hooks::run( 'SomeHook', [ $dummy ] );
$foo = $dummy->getFoo();
$bar = $dummy->getBar();

just to be able to have more than one modifiable value or service passed to the hook function.

Yeah. That's not a desirable outcome. I should've given an example.

The idea of "one value per hook" was not meant to be met by merging multiple values into one. The point was to run multiple hooks. E.g. Something-foo and Something-bar. Looking at https://codex.wordpress.org/Plugin_API/Filter_Reference (note: this is outdated), virtually all of these operate on a single value, or something that can be thought of as highly related.

The point is to avoid having filters like "outputpage_render( OutputPage )", which don't relate to a tangible value but to a point in time ("event"), given a monolithic object. Instead, I think these should have more granular filters like "outputpage_htmltitle( string $title, TitleValue, string $action )", "outputpage_bodyclass", "outputpage_modules", which have a dedicated purpose and a limited set of factors by which we support their variance. The problems with monolithic hooks are reflected in the problem statement and objectives.

I think my original phrasing here isn't as clear as it could be, but I hope this clarification makes it more understandable. I'm open to finding a better wording here that better reflects this intention.

The point is to avoid having filters like "outputpage_render( OutputPage )", which don't relate to a tangible value but to a point in time ("event"), given a monolithic object. Instead, I think these should have more granular filters like "outputpage_htmltitle( string $title, TitleValue, string $action )", "outputpage_bodyclass", "outputpage_modules", which have a dedicated purpose and a limited set of factors by which we support their variance. The problems with monolithic hooks are reflected in the problem statement and objectives.

I'm glad you picked that as an example. It seems pretty likely that an extension might need to manipulate the HTML (to inject content), the modules (to style/enhance the injected content), and the body class (for selectors in its module). Having such granular hooks means an extension would need to register and coordinate across them all, and you're not giving it any parameters that it can easily use to do that, so its only option seems to be depending on global state (and hoping that no request manages to call the hooks multiple times in a way that would screw up that state).

Osnard added a subscriber: Osnard.Jan 15 2019, 6:53 AM

(unrelated to the discussion above) How would filter hooks report errors? By throwing an exception? Should that exception be handled by the hook system, letting Hooks::filter return normally, or should it bubble up into the calling code, to be handled there explicitly?

I prefer the latter, but if we do that, filter hooks should declare which exceptions they can be expected to raise.

Krinkle updated the task description. Feb 7 2019, 3:35 AM
daniel added a comment.Mar 8 2019, 1:06 PM

The proposal states:

Filter hooks must be deterministic and must not cause side-effects. Their behavior should be based on the input parameters only. Other instances and services should not be obtained. {note: open question}

I think filter hooks will need to be able to access some services (even though these should ideally be injected on construction of the handler, not pulled from global state while the hook handler is executed).

I agree however that filter hooks should never persist state. And probably they shouldn't read persistent state either. This could be enforced by setting a flag during the execution of filter hooks which is then checked by storage-layer services (perhaps simply the Database object). If the flag is set, storage access should fail. The flag could be implemented as a semaphore object owned by the Hooks service and injected into storage-layer services.

Some exceptions may be necessary to this rule to accommodate caching, logging, and profiling.
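
A rough sketch of that flag (all names here are hypothetical):

class HookSemaphore {
    private $inFilter = false;

    public function setInFilter( $inFilter ) {
        $this->inFilter = $inFilter;
    }

    public function isInFilter() {
        return $this->inFilter;
    }
}

// A storage-layer service holding an injected HookSemaphore could then refuse writes:
class GuardedStore {
    private $hookSemaphore;

    public function __construct( HookSemaphore $hookSemaphore ) {
        $this->hookSemaphore = $hookSemaphore;
    }

    public function insert( $table, array $row ) {
        if ( $this->hookSemaphore->isInFilter() ) {
            throw new LogicException( 'Storage writes are not allowed from filter hooks' );
        }
        // ... perform the write ...
    }
}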

Anomie added a comment.Mar 8 2019, 4:13 PM

@daniel: You should go read what I wrote in T212482#4841064.

daniel added a comment.Mar 8 2019, 4:23 PM

@daniel: You should go read what I wrote in T212482#4841064.

I went and did that. The AbuseFilter use case is a good example - the hook needs to know the filters, and even though they are cached, they need to be read from somewhere. So I'm retracting "they shouldn't read persistent state" above.

As to writing... not so sure. You are right that there might be a need for some persistence, but we should be restrictive about this. Perhaps we could say that it's OK to write to the session, but only to the session. Maybe cache and stash, too? And we could say that data can be persisted, but not synchronously, so it would have to go via the job queue.

Krinkle updated the task description. Mar 9 2019, 12:49 AM
Krinkle updated the task description.
Krinkle reassigned this task from Krinkle to brion.Mar 26 2019, 6:47 PM

Assigning to @brion for the next step, which is to review the hooks for an existing extension with these recommendations in mind, and how they hold up.

kchapman added a subscriber: kchapman.

@CCicalese_WMF and I are interested in having an IRC meeting on this to keep it moving. This is tied closely to work Core Platform is doing on APIs.

daniel renamed this task from RFC: MediaWiki 2018 extension interfaces to Evolve hook system to support "filters" and "actions" only.Apr 11 2019, 9:27 AM

We should have at least a basic survey of the most-used hooks, and the way they are used, before we have an IRC meeting about this.

I like the idea of separation of concerns and having a codebase that is explicit and easy to use. From my understanding,

  • the action hooks can be used to modify state (expensive calls like update user in DB, log stuff, add tags, etc.)
  • the filter hooks can be used to build/filter things (like building UserDefaultOptions array, adding JS modules).

If I didn't understand it correctly, could you provide any examples?

At this moment I have a couple of concerns. Most of the time I develop extensions (Popups, TextExtracts, MobileFrontend, the Minerva skin), and I can say that the proposed rules fit MediaWiki core well, but when it comes to extensions it looks troubling.

ActionHooks: "Instead, they are run from a deferred update or job."

If we decide to go with a deferred update, we need a way to specify whether this is done POSTSEND or PRESEND. In some cases we can encounter a race condition where the output (e.g. a page redirect) is sent to the user and the browser starts rendering a new page while the script is still processing the ActionHook. That leads to the user seeing a partially modified state. But sometimes we process stuff that is not visible to the user, and that should be executed on POSTSEND.
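
(For reference, today's DeferredUpdates API already distinguishes the two stages, so an action-hook runner would presumably need to expose the same choice:)

$callback = function () { /* ... */ };
// Runs before the response is flushed to the client:
DeferredUpdates::addCallableUpdate( $callback, DeferredUpdates::PRESEND );
// Runs after the response has been sent (the default):
DeferredUpdates::addCallableUpdate( $callback, DeferredUpdates::POSTSEND );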

FilterHooks: "Other instances and services should not be obtained" -> this might work excellent in the MediaWiki core, but extensions/skins need to access their services in some way. I'm slowly shifting old Singletons into Services (registered via ServiceWiring files).

During hook execution I need to access services so I can decide what to do next, as I cannot keep all logic inside one Hooks file. Without being able to get things via MediaWikiServices, I would need to keep old singletons like MobileContext::getSingleton(), which I want to avoid. Singletons are difficult to test as there is no easy way to inject mocks. Additionally, it's very easy to create side effects when singletons are in use.

Once we decide to start naming hooks "ActionHooks.XYZ" and "FilterHooks.XYZ", there will be no easy way to switch one to the other. That's a good thing, but it may cause some trouble in the future. Each change in core requires changes in all extensions/skins. If possible, I'd like to have a hook system that allows us to make small changes to a hook's signature without fixing all extensions/skins that use this hook.

If I understood properly what FilterHooks are, some of those hooks could be safely replaced by service tagging. There is no need to listen to a hook when we need to build things. Example:
MobileFrontend has an interface called IMobileTransform. This interface represents a single, small transformation to the DOM structure and it's used during MobileFormatter execution. We have many transforms (like MoveLeadParagraphTransform and AddMobileTocTransform). Now, if we abandon hooks in favor of service tagging, instead of a structure like this:

$transforms = [];
Hooks::filter( 'MobileFrontend.RegisterTransform', [ &$transforms ] );
/** @var IMobileTransform $transform */
foreach ( $transforms as $transform ) {
    $transform->apply( $dom );
}

We could do something IMHO much cleaner:

$transforms = MediaWikiServices::getAllByTag( 'MobileFrontend.Transform' );
/** @var IMobileTransform $transform */
foreach ( $transforms as $transform ) {
    $transform->apply( $dom );
}

This reduces side effects to zero, as there is no Hook call and no arbitrary code execution, just a function that loads sets of services from MediaWikiServices.

My last comment is regarding the general approach. We need to get control over what's going on in the hooks world.
Passing variables or references to variables, and depending on which method from Hooks is called, doesn't give us much flexibility or control. Some extensions will have to access MediaWikiServices::getService() or RequestContext::getMain() anyway to get required objects/services. Hooks cannot provide all dependencies for all extensions. Instead of passing mutable/immutable arguments and depending on which Hooks method is called (action or filter), I'd love to see us pass an object that controls what can be done and what isn't allowed. We should fully embrace an object-oriented approach to managing the execution flow of hooks. We have the Status class, which IMHO fits here perfectly.

As a first step, we could come up with a straightforward interface (let's call it HookExecution for easier reading) for hook execution with only essential methods. Then, we could do something like this:

$execution = new ActionHookExecution( $name, $data );
Hooks::run( $execution );

Hooks::run() can be a factory and, depending on the $execution type, can fire filter or action methods. The HookExecution can have methods like abort() (filters only), fail() (actions only) or getService(). This gives us control over what users can do (they can access only methods from the HookExecution; and because HookExecution provides services, there will be no need to access global variables/singletons). Inside the HookExecution we could freeze some things (like making some services read-only; when returning a service we can return a clone/frozen version, so we're sure it doesn't have side effects). Then, the most important bits of our system could have concrete HookExecution classes, something like Hooks/SpecialPagesInit that exposes an API to modify the set of special pages. By default, the HookExecution class can have get( $key ) and set( $key, $value ). Also, it can implement ArrayAccess, so in some cases no changes to extensions have to be made (when we pass an array as the first argument, for example ChangeTagsListActive).
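
A sketch of what such a base class could look like (everything here is illustrative, following the naming above):

class HookExecution {
    private $name;
    private $data;

    public function __construct( $name, array $data ) {
        $this->name = $name;
        $this->data = $data;
    }

    public function getName() {
        return $this->name;
    }

    public function get( $key ) {
        return $this->data[$key] ?? null;
    }

    public function set( $key, $value ) {
        $this->data[$key] = $value;
    }

    public function getService( $name ) {
        // Hand out (possibly read-only or cloned) services instead of globals.
        return MediaWikiServices::getInstance()->getService( $name );
    }
}

// FilterHookExecution would additionally expose abort(); ActionHookExecution would expose fail().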

Instead of passing references to variables, we should first organize the code so that it exposes APIs for modifying things, and then pass those APIs around. Changes like adding 'FilterHooks.' to the hook name and passing things as $data = [ 'mutable' => &$variableA, 'notmutable' => $variableB ] are just a short-term improvement that will not fix the system. It makes the hook system feel better, but under the hood, IMHO, it's still the same.

As final thought, I'd like to mention, that some time ago, I came up with the idea of T171515: Hooks system should support MediaWiki services. This is somewhat related to this task, in general, our Hooks system should utilize MediaWikiServices more often. That approach will help us reduce the number of static functions (at this moment most of the Hooks are static methods which are usually difficult to test), therefore it should minimize possible side effects.

moving back to "under discussion" until we have a better understanding of current hook usage.

Currently, passing values by reference makes it impossible for hook subscribers to know which types they are getting, making type hinting impossible. That causes issues like T214358.

Could the hook runner be extended to validate the types of mutable values after each subscriber returns?

The alternative would be for mutable values to be wrapped in container objects that expose an interface to modify the value. I imagine something generic like new ModifiableValue( $type, $value ), exposing get and set methods that enforce the type, is possible, but it would be far from elegant and would hide the real type.

daniel added a subscriber: BPirkle.May 7 2019, 4:45 PM

The alternative would be for mutable values to be wrapped in container objects that expose an interface to modify the value. I imagine something generic like new ModifiableValue( $type, $value ), exposing get and set methods that enforce the type, is possible, but it would be far from elegant and would hide the real type.

@BPirkle suggested using anonymous objects that expose setters tailored for just the hook, and documenting them along with the hook signature. E.g.

$foo an object that exposes setBar( $x, $y, $z ).

I really like this solution, because it allows us to define narrow, one-off interfaces that allow for readable code:

$foo->setBar( 1, 2, 3)
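
A minimal sketch of that idea (the hook name and fields are hypothetical):

$foo = new class {
    public $bar = null;

    // The only thing the hook is allowed to change:
    public function setBar( $x, $y, $z ) {
        $this->bar = [ $x, $y, $z ];
    }
};
Hooks::filter( 'SomeFilter', [ $foo ] );
$bar = $foo->bar;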

@daniel Given a filter only mutating one value (previously written as "taking one parameter"; we have since loosened this to allow any number of parameters, only one of which should be mutated by the hook), this could also be solved by using the return value.

That's what WordPress does, for example. This means it can be validated by return type hints. And if we want to enforce it in the hook runner, this could be passed to the hook runner as an option, but I'm not sure that added complexity and verbosity throughout is worth it. It would be caught naturally soon enough by parameter type hints from other handlers, other methods called after the hook by the source code, or the source's return type hint, etc.

We could progressively enforce with PHPCS that filter hook callbacks must have a return type hint, for example. That might be more DRY than having the source code also document it in multiple places, pass it around, and validate it in-process, etc.
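
A sketch of the return-value style (the hook name is illustrative; this mirrors WordPress's apply_filters):

// Caller: the filtered value is whatever the handler chain returns.
$classes = Hooks::filter( 'OutputPageBodyClass', $classes, [ $titleValue, $skinName ] );

class MyExtensionHooks {
    // PHPCS could require the return type hint, which also documents the type.
    public static function onOutputPageBodyClass( array $classes, TitleValue $title, $skinName ): array {
        $classes[] = 'myextension-compact';
        return $classes;
    }
}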

@daniel Given a filter only mutating one value (previously written as "taking one parameter"; we have since loosened this to allow any number of parameters, only one of which should be mutated by the hook), this could also be solved by using the return value.

That would mean that we'd lose the ability to break the chain of filters by returning false. This simplifies the logic, but may make it difficult to port some existing use cases. How does WordPress do chaining of filters?

If I understood properly what FilterHooks are, some of those Hooks could be safely replaced by service tagging. There is no need to listen to a hook when we need to build things. Example:
[...]
This reduces side effects to zero, as there is no Hook call and no arbitrary code execution, just a function that loads sets of services from MediaWikiServices.

Your example seems to have just pushed the hook call one level up the stack (now MediaWikiServices calls a bunch of registered callbacks instead of your code calling them directly), and possibly also requires that more things be created on every request rather than being determined when they are needed. Either way, I'm not sure that's much of an improvement.

Hooks::run() can be a factory and, depending on the $execution type, can fire filter or action methods. The HookExecution can have methods like abort() (filters only), fail() (actions only) or getService(). This gives us control over what users can do (they can access only methods from the HookExecution; and because HookExecution provides services, there will be no need to access global variables/singletons). Inside the HookExecution we could freeze some things (like making some services read-only; when returning a service we can return a clone/frozen version, so we're sure it doesn't have side effects). Then, the most important bits of our system could have concrete HookExecution classes, something like Hooks/SpecialPagesInit that exposes an API to modify the set of special pages. By default, the HookExecution class can have get( $key ) and set( $key, $value ). Also, it can implement ArrayAccess, so in some cases no changes to extensions have to be made (when we pass an array as the first argument, for example ChangeTagsListActive).

Do we really need complicated accessors wrapping everything just to try to enforce a policy on which services can be accessed from a hook callback? We discussed the same thing with respect to trying to force hook calls to use immutable objects in T212482#4841064 and T212482#4867960.

As final thought, I'd like to mention, that some time ago, I came up with the idea of T171515: Hooks system should support MediaWiki services. This is somewhat related to this task, in general, our Hooks system should utilize MediaWikiServices more often. That approach will help us reduce the number of static functions (at this moment most of the Hooks are static methods which are usually difficult to test), therefore it should minimize possible side effects.

Static functions aren't always bad. They'd probably be a better approach to wiring than the current "array full of Closure objects" we have in ServiceWiring.php, as mentioned at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/508972/5/includes/Rest/Router.php#152.

While calling hooks on services in MediaWikiServices may be a good idea when a service already exists, at what point are we putting too much into it? Would it really be a win for every extension to register a global public "FooBarHooks" service just to be able to hook hooks? Would it be better to register "private" services by allowing use of ObjectFactory specs (see T222409) when registering hooks?

Currently, passing values by reference makes it impossible for hook subscribers to know which types they are getting, making type hinting impossible. That causes issues like T214358.

The problem in T214358 seems to have been that the parameter was defined as \Language|string, was really \Language|\StubUserLang|string, and in a patch someone assumed it was just \Language. None of that seems to have much to do with passing values by reference beyond making it slightly harder to find where the string a later "filter" received was coming from.

Could the hook runner be extended to validate the types of mutable values after each subscriber returns?
The alternative would be for mutable values to be wrapped in container objects that expose an interface to modify the value. I imagine something generic like new ModifiableValue( $type, $value ), exposing get and set methods that enforce the type, is possible, but it would be far from elegant and would hide the real type.

Both of those require defining the type in the first place, and then a bunch of code running on every hook function call to check it. Does this sort of problem really come up often enough to justify all that complexity? Could we just enable some logging in such situations to log the types when they actually need to be investigated?

@daniel Given a filter only mutating one value

Is that a given, though? T212482#4869654 never received a reply.

daniel added a comment.Jun 4 2019, 9:08 PM

I just came across T157402: Provide a reliable way to pass information between hook handlers, "hooked" objects and I'm wondering how the use cases described there fit into the model proposed here.

I just came across T157402: Provide a reliable way to pass information between hook handlers, "hooked" objects and I'm wondering how the use cases described there fit into the model proposed here.

Currently, it doesn't seem this proposal does anything more to address the issue than the current system does. But,

  • As mentioned in T212482#4869654, this proposal would seem to require more such communication.
  • The current workaround is adding properties on relevant high-level objects that are passed into the hooks. Changes that result in no persistent objects being passed would invalidate that workaround.
Krinkle added a comment (edited). Jun 10 2019, 8:52 PM

The point is to avoid having filters like "outputpage_render( OutputPage )", which don't relate to a tangible value but to a point in time ("event"), given a monolithic object. Instead, I think these should have more granular filters like "outputpage_htmltitle", "outputpage_bodyclass", "outputpage_modules", which have a dedicated purpose and a limited set of factors by which we support their variance. [..]

[..] It seems pretty likely that an extension might need to manipulate the HTML [..], the modules [..], and the body class [..]. Having such granular hooks means an extension would need to register and coordinate across them all, and [..] its only option seems to be depending on global state [..]

Yeah, that'd be nasty. We already see that a lot in extensions that hook into the Parser, and that's not a pattern we'd want to encourage further. My unverified gut feeling is that these "filter" hooks would not actually need any sort of coordination, though.

By virtue of only being allowed to vary on the given parameters (as our proposed design states), that implicitly also rules out varying on the outcome of other hooks. If such an outcome is relevant, and the maintainer of the code the hooks are subject to is okay with that side-effect being officially supported, that probably means the code needs some kind of primitive to support it, which could then be passed as a parameter to the relevant hook. For the most part, though, I expect filters to be deterministic given their parameters and to generally not rely on the outcome of other hooks.

As mentioned by Brion already, I think it's time we look at a practical example to see how this would turn out. Continuing further in theory only would, I think, leave us vulnerable to either wasting time or considering cases that might not work out the way we think they would.

debt moved this task from Under discussion to Backlog on the TechCom-RFC board.Aug 14 2019, 1:16 PM
debt moved this task from Backlog to Under discussion on the TechCom-RFC board.
Izno added a subscriber: Izno.Sep 1 2019, 3:48 PM