Page MenuHomePhabricator

Hook container with strong types and DI
Open, MediumPublic

Description

Problem statement

This proposal aims to solve the following problems with the existing hook system:

  • There is no way to inject dependencies into a hook handler, so many handlers are obliged to access the global service container.
  • The types of the hook arguments are undeclared and unvalidated.
  • There is no place to put doc comments relating to hook functions; they are documented in a separate text file and on mediawiki.org. There is no editor integration with the separate documentation.
  • The global hook runner is implemented in static functions rather than a service.

Proposal summary

  • Hook interfaces: Have one interface per hook. Each interface has a single method with a name specific to the hook.
  • Indirect registration: Instead of having extensions register the function name associated with a hook, they register a handler name. Extensions also register an array of handlers, each handler being described by an ObjectFactory spec.
  • Handler dependency injection: As with the REST API, the "services" key in the ObjectFactory spec can be used to inject services into the hook handler constructor.
  • HookRunner: A HookRunner class in core provides a concrete implementation of all core hook interfaces. That is, for each hook, the same interface is used for both calling the hook and handling it. Each method consists of approximately one line of code which proxies the call to HookContainer. Extensions may implement their own HookRunner style classes along the same lines.
  • HookContainer: A service which is responsible for creating, caching and calling handler object instances.

Comparison with past work

  • T212482 Filters and Actions: The current proposal is complementary to the filters and actions RFC. The current proposal allows existing hooks to be remediated, as well as providing a precedent for registration and strong typing that can be used when filters and actions are implemented.
  • T154674 HookRunner: That proposal is conservative and leaves many of the problems with hooks unresolved. I agree with @Tgr's comment at T154674#2998357, and this proposal is very similar to the scheme he proposed there.

Tradeoffs and judgement calls

Having one interface per hook, with a unique method name for each hook, allows extensions to implement many different hooks in a single class, simplifying DI and allowing them to keep the same code structure as is currently typical. In smart code editors, adding an interface to a handler class automatically gives the developer the documentation and a method stub. The system makes it easy for extension developers to implement hooks.

One drawback of using interfaces is that it will no longer be possible to add parameters to existing hook interfaces. Instead, hook interfaces will need to be versioned.

Inevitably, some classes will have a very long "implements" clause. I think the clutter is a small price to pay. PHP might iterate over all parent interfaces in certain circumstances — we should verify that the performance is acceptable.

In T154674#2998357, @Tgr proposed that instead of interfaces, there would be one abstract class per module, for example, UploadHookRunner. This may be convenient for core callers of hooks, but it is not convenient to reuse such an abstract class to ensure strong typing in handlers. The point of using interfaces is to allow sharing of parameter signatures between callers and handlers.

On what level of modularity should HookRunner and hook interfaces be grouped? I am proposing one HookRunner class in core. I previously proposed grouping of all core interfaces under a single namespace, but the proposal has now been updated to place interfaces in the same namespace as the caller. I think it is convenient to keep masses of repetitive boilerplate code in one place. But it is not an essential point.

It's important that method names are globally unique. I suggest naming them after the hook name. If methods are not globally unique then the conflicting hooks cannot be implemented in the same handler class.

Registration

The registration system follows the way we did handler registration in the REST API, except that an extra level of indirection is added in the config, to avoid duplication of ObjectFactory specs. I think the "handler name" could be implicitly prefixed with the extension name, so that handler names would not need to be globally unique.

This sample code shows a very brief potential syntax in which handler names are brought into the same namespace as global functions. The registration system would search first for a hook handler with the corresponding name, and if one does not exist, it is assumed to be a global function. Alternatively, the namespaces could be separated by either changing the top-level "Hooks" key or by making the value be an object like {"handler":"*"}.

{
	"HookHandlers": {
		"*": {
			"class": "MyExtension\\Hooks",
			"services": [ "ActorMigration" ]
		}
	},
	"Hooks": {
		"UserLoginComplete" : "*"
	}
}

Deprecation and superseded hooks

Core will have a list of deprecated hooks, which extensions can add to via the DeprecatedHooks attribute. If an extension registers a handler for a deprecated hook, a deprecation warning will be raised when extension.json is loaded, unless the deprecation is acknowledged in extension.json:

{
	"Hooks": {
		"UserLoginComplete" : {
			"handler": "*",
			"deprecated": true
		},
		"UserLoginComplete2" : {
			"handler": "*"
		}
	}
}

Acknowledging deprecation in this way causes the handler to not be called. In this example, UserLoginComplete2 replaces UserLoginComplete, and calls to UserLoginComplete will be filtered if core is aware that UserLoginComplete is a deprecated hook.

For forwards compatibility, if a handler acknowledges deprecation but core is not aware of the hook in question being deprecated, the call will not be filtered. So in this example, the UserLoginComplete handler will still be called as long as a version of core is used in which it is not deprecated.

Sample handler code

<?php

namespace MyExtension;

use ActorMigration;
use UserLoginCompleteHook;

class Hooks implements UserLoginCompleteHook {
	private $actorMigration;
	
	public function __construct( ActorMigration $actorMigration ) {
		$this->actorMigration = $actorMigration;
	}
	
	/**
	 * @param \User $user
	 * @param string $injectedHtml
	 * @param bool $direct
	 * @return bool
	 */
	function onUserLoginComplete( \User $user, &$injectedHtml, $direct ) {
		// ...
	}
}

Sample core caller

Here is some incomplete sample code showing how legacy reference arguments could be dealt with during migration to the new system. UserLoginComplete is an existing hook which passes $user as a reference, for legacy reasons.

UserLoginComplete.php

<?php

interface UserLoginCompleteHook {
	/**
	 * @param \User $user
	 * @param string $injectedHtml
	 * @param bool $direct
	 * @return bool
	 */
	function onUserLoginComplete( \User $user, &$injectedHtml, $direct );
}

HookRunner.php

<?php

namespace MediaWiki\HookRunner;

use UserLoginCompleteHook;

class HookRunner implements
	// ... many interfaces would be added here ...
	UserLoginCompleteHook,
{
	/** @var HookContainer */
	private $hookContainer;

	public function __construct( HookContainer $hookContainer ) {
		$this->hookContainer = $hookContainer;
	}

	/**
	 * @param \User $user
	 * @param string $injectedHtml
	 * @param bool $direct
	 * @return bool
	 */
	function onUserLoginComplete( \User $user, &$injectedHtml, $direct ) {
		return $this->hookContainer->run( 'UserLoginComplete',
			[ $user, &$injectedHtml, $direct ],
			[ 'legacyRefArgs' => [ 0 ] ]
		);
	}
}

HookContainer.php

For simplicity, I have left out a lot of details present in Hooks::run(). The point is to sketch out the call flow.

<?php
namespace MediaWiki\HookRunner;

class HookContainer {
	// ...

	public function run( $name, $args, $options = [] ) {
		$legacyHandlers = $this->getLegacyHandlers( $name );
		if ( $legacyHandlers ) {
			$legacyArgs = $args;
			if ( isset( $options['legacyRefArgs'] ) ) {
				foreach ( $args as $i => $arg ) {
					if ( in_array( $i, $options['legacyRefArgs'] ) ) {
						$legacyArgs[$i] =& $args[$i];
					}
				}
			}
			foreach ( $legacyHandlers as $handler ) {
				$success = $handler( ...$legacyArgs );
				if ( !$success ) {
					return false;
				}
			}
		}

		$handlers = $this->getHandlers( $name );
		if ( $handlers ) {
			$funcName = 'on' . str_replace( ':', '_', $name );
			foreach ( $handlers as $handler ) {
				 $success = $handler->$funcName( ...$args );
				 if ( !$success ) {
					return false;
				 }
			}
		}

		return true;
	}
}

SpecialCreateAccount.php

@@ -44,6 +48,8 @@ class SpecialCreateAccount extends LoginSignupSpecialPage {
 
 	public function __construct() {
 		parent::__construct( 'CreateAccount' );
+		$this->hookRunner = new HookRunner(
+			MediaWikiServices::getInstance()->getHookContainer() );
 	}

@@ -139,14 +145,14 @@ class SpecialCreateAccount extends LoginSignupSpecialPage {
 		# Run any hooks; display injected HTML
 		$injected_html = '';
 		$welcome_creation_msg = 'welcomecreation-msg';
-		Hooks::run( 'UserLoginComplete', [ &$user, &$injected_html, $direct ] );
+		$this->hookRunner->onUserLoginComplete( $user, $injected_html, $direct );

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
cscott added a subscriber: cscott.Apr 17 2020, 4:46 PM

I found another minor regression:
https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikiArticleFeeds/+/589629/1..2/WikiArticleFeeds.php

It appears that prior to 0adc5f342888740e1b611f95f1474be329eae817 landing, the Hook system would accept a hook of the form [ $object, 'ClassName::methodName' ] for virtual methods, but afterwards (as show in the jenkins logs for PS1 of the above patch) this leads to an error trying to resolve ClassName::ClassName::methodName.

Seems straightforward to fix by using [ $object, 'methodName' ] as the hook, which ought to work for both the old hook system and the new one, but I figured it was worth mentioning in case it makes sense to add a little bit of backwards-compat code to handle this case.

Change 574266 merged by jenkins-bot:
[mediawiki/core@master] Automatically generated hook interfaces

https://gerrit.wikimedia.org/r/574266

Change 588936 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Update hook interfaces for recent additions and deprecations

https://gerrit.wikimedia.org/r/588936

It appears that prior to 0adc5f342888740e1b611f95f1474be329eae817 landing, the Hook system would accept a hook of the form [ $object, 'ClassName::methodName' ] for virtual methods, but afterwards (as show in the jenkins logs for PS1 of the above patch) this leads to an error trying to resolve ClassName::ClassName::methodName.

This is a result of switching from call_user_func_array() to $func(...$args), which was done for good reasons. I guess we can add b/c for it, although these kinds of special cases reduce the performance of the code.

Change 591238 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] HookContainer: add b/c for non-static handler with colons in the function name

https://gerrit.wikimedia.org/r/591238

Change 591238 merged by jenkins-bot:
[mediawiki/core@master] HookContainer: add b/c for non-static handler with colons in the function name

https://gerrit.wikimedia.org/r/591238

Change 593658 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] Remove LocalisationChecksBlacklistHook and LocalisationIgnoredOptionalMessagesHook

https://gerrit.wikimedia.org/r/593658

Change 593667 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] Hook interface type fixes, to fix Phan errors

https://gerrit.wikimedia.org/r/593667

I don't know whether this was already discussed (CTRL+F says no), but are there plans to remove hooks.txt? The new interfaces are self-documenting, they're all grouped in the same directory, and what's most important, they are (presumably) always up-to-date. https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/593667/ is a clear proof of that.

Change 593982 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] [WIP] Documentation for the new hook system

https://gerrit.wikimedia.org/r/593982

Change 574915 merged by Tim Starling:
[mediawiki/tools/generateHookInterfaces@master] Scripts for automatic hook migration per T240307 (initial commit)

https://gerrit.wikimedia.org/r/574915

Change 588936 merged by jenkins-bot:
[mediawiki/core@master] Update hook interfaces for recent additions and deprecations

https://gerrit.wikimedia.org/r/588936

Change 593658 merged by jenkins-bot:
[mediawiki/core@master] Remove LocalisationChecksBlacklistHook and LocalisationIgnoredOptionalMessagesHook

https://gerrit.wikimedia.org/r/593658

Change 593667 merged by jenkins-bot:
[mediawiki/core@master] Hook interface type fixes, to fix Phan errors

https://gerrit.wikimedia.org/r/593667

Change 577414 merged by jenkins-bot:
[mediawiki/core@master] Add HookRunner classes

https://gerrit.wikimedia.org/r/577414

Foxtrott removed a subscriber: Foxtrott.May 14 2020, 8:55 PM

Change 596914 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] Extension schema updates for HookContainer

https://gerrit.wikimedia.org/r/596914

Change 598554 had a related patch set uploaded (by Reedy; owner: Tim Starling):
[mediawiki/core@REL1_34] Backport docs/extension.schema.v2.json fixes

https://gerrit.wikimedia.org/r/598554

Change 598555 had a related patch set uploaded (by Reedy; owner: Tim Starling):
[mediawiki/core@REL1_33] Backport docs/extension.schema.v2.json fixes

https://gerrit.wikimedia.org/r/598555

Change 598556 had a related patch set uploaded (by Reedy; owner: Tim Starling):
[mediawiki/core@REL1_31] Backport docs/extension.schema.v2.json fixes

https://gerrit.wikimedia.org/r/598556

Change 596914 merged by jenkins-bot:
[mediawiki/core@master] Extension schema updates for HookContainer

https://gerrit.wikimedia.org/r/596914

Change 598556 merged by jenkins-bot:
[mediawiki/core@REL1_31] Backport docs/extension.schema.v2.json fixes

https://gerrit.wikimedia.org/r/598556

Change 598554 merged by jenkins-bot:
[mediawiki/core@REL1_34] Backport docs/extension.schema.v2.json fixes

https://gerrit.wikimedia.org/r/598554

Change 598555 merged by jenkins-bot:
[mediawiki/core@REL1_33] Backport docs/extension.schema.v2.json fixes

https://gerrit.wikimedia.org/r/598555

eprodromou added a subscriber: eprodromou.

Tim, just checking that this is still being worked on.

Change 593982 merged by jenkins-bot:
[mediawiki/core@master] Documentation for the new hook system

https://gerrit.wikimedia.org/r/593982

daniel added a comment.EditedJun 4 2020, 9:11 AM

We ended up not using type hints in the hook interfaces automatically generated for the migration to the new system. Here is why:

Many old hooks expose too much about the calling code. E.g. several hooks in EditPage expose the entire EditPage object. If we (finally!) refactor that class and break it into pieces, there is no good way to still supply an EditPage object to the hook. The strategy in such a case is to define a new hook, deprecate the old one, and supply the old one with a fake that emulates the methods of EditPage that are used by known callers. This wouldn't be necessary if hook signature only used narrow interfaces. But that is sadly not the case. I don't expect this kind of thing to happen often, but it is the reason I asked Tim to remove the strict type hints.

Newly defined hook interfaces can and should use type hints. Care should be taken to only expose minimal information. Ideally, new hook interfaces are designed to be either filters or listeners, and conform to the restrictions outlined in T212482 accordingly.

kostajh added a subscriber: kostajh.Jun 4 2020, 9:51 AM

Fortunately PHPStorm reads the types from the interface doc block, so code completion works. Did you consider keeping primitive types declared?

cscott added a comment.EditedJun 10 2020, 2:40 PM

I noticed a bunch of Parser-related code that I'd manually fixed up so that every known hook user had type hints got new-style hooks that don't have any types. That's unfortunate!

Especially since the new "strong typing" system now doesn't allow the implementation classes to have strong types, because you can't override a missing type with a type hint. So all of the extension code I manually fixed up to enforce strong types is now going to have its type hints stripped as well when they move to the new style hooks. See this patch for instance.

Also: my plan (T236812: Parser.php should be split into a base class and a parser implementation) is to refactor Parser into a base class (Parser) and specific implementations as subclasses (LegacyParser, ParsoidParser). So I'm happy to have Parser as the type hint on the hook, that doesn't interfere with refactoring.

Can we submit patches that individually restore type hints to the new-style hooks for specific audited cases where we've already done the work fixing up all the users of the (old-style) hook?

(Also, I'd like some discussion of T253768: No easy way to suppress hard-deprecation warnings for hooks, which is already causing trouble for 3rd-party code still using deprecated hooks w/ the new hook runner system.)

Change 604412 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/extensions/BoilerPlate@master] Update to use new hook system

https://gerrit.wikimedia.org/r/604412

Can this be closed? Should this still be in "doing" on the board?

Putting this into the text planning review column, for status assessment and planning of next steps.

Aklapper removed a subscriber: Anomie.Fri, Oct 16, 5:02 PM