Page MenuHomePhabricator

Standardize declarative object construction in MediaWiki
Open, Needs TriagePublic

Description

We have several similar but distinct methods of creating an object based on some kind of static description:

  • CentralIdLookup, SessionManager and AuthManager providers, LoggerFactory and MonologSpi, and most ObjectCache related code (cache declarations in EtcdConfig, MultiWriteBagOStuff and ReplicatedBagOStuff) use [[https://www.mediawiki.org/wiki/ObjectFactory|ObjectFactory::getObjectFromSpec]]. Also Wikibase for declaring unit conversion storage classes.
  • ObjectCache itself (ObjectCache::newFromParams) has its own syntax, which also uses class and factory properties (along with lots of caching-specific ones) but not args, calls etc, and unlike ObjectFactory passes the spec to the factory.
  • ApiModuleManager::addModules uses the class and factory subset of the ObjectFactory syntax, except it can take a class name instead of an array as a shorthand.
  • ResourceLoader::getModule / $wgResourceModules uses class / factory parameters; like ObjectCache, the spec includes other things and is passed through to the factory. (Also you can pass an already instantiated object via the object key.)
  • $wgApiFeatureUsageQueryEngineConf uses factory / class but with some specific expectations on what parameters should be passed.
  • StubObject takes a classname or callable and a parameter array. But we are trying to move away from stub objects anyway.
  • Scribunto's $wgScribuntoEngineConf also uses factory / class.

There is also declarative callback specification, most notably for hooks, which is not quite the same thing but there is a large overlap in use cases (see e.g. T171515: Hooks system should support MediaWiki services).

It would be nice to use the same syntax everywhere, so developers don't have to memorize many different ways of declaring classes. Also, many of these lack the same features (e.g. no support for getting the factory from the service container) so a single class responsible for object construction would make improving them easer. ObjectFactory seems like the obvious candidate for that.

One potential problem is that most of these use an array that mixes object construction related properties with other properties (to be passed to the class or factory) which makes adding new properties in the future unsafe.


  • Update ObjectFactory to handle all use cases (dedbaec and c4abcd1)
  • Add ObjectFactory to MediaWikiServices (a11ad5f)
  • Convert use cases to use the ObjectFactory service
    • Things using static ObjectFactory (CentralIdLookup, SessionManager, AuthManager, LoggerFactory, MonologSpi, "most ObjectCache related code")
    • ObjectCache itself
    • ApiModuleManager
    • SpecialPageFactory
    • ResourceLoader
    • StubObject?
    • ApiFeatureUsage
    • Scribunto

Related Objects

Event Timeline

Collecting the various different mechanisms into a more usable list:

  1. An array with the following keys:
    • 'factory': (callable) Code that creates the object. Parameters passed depend on 'args'.
    • 'class': (string) Effectively the same as 'factory' being function ( ...$args ) { return new $class( ...$args ); }
      • Either 'factory' or 'class' is required. 'class' overrides 'factory' if both are given.
    • 'args': (array, optional) Arguments to pass to the 'factory' callable.
    • 'closure_expansion': (bool, optional) By default, any closures in 'args' are called with no arguments and the return value is passed to the factory. Setting this false disables that.
    • 'calls': (array, optional) Map of method names to argument lists to call on the object returned by 'factory'. 'closure_expansion' applies to these argument lists too.
  2. An array with the following keys:
    • 'factory': (callable) Code that creates the object. Passed the original array, with some additional properties added.
    • 'class': (string) Effectively the same as 'factory' being function ( $spec ) { return new $class( $spec ); }
      • Either 'factory' or 'class' is required. 'factory' overrides 'class' if both are given.
    • Arbitrary other keys that 'factory' can use, and a few that affect what additional properties ObjectCache puts into the original array.
  3. A string class name.
    • Like #1 passed [ 'class' => the string, 'args' => something caller-dependent ].
  4. A callable.
    • Like #1 passed [ 'factory' => the callable, 'args' => something caller-dependent ].
  5. An array with keys 'class' and 'factory'
    • In this case 'class' is required, while 'factory' is optional. Like #3 if 'factory' wasn't provided, and #4 if it was with the additional constraint that the object must be an instance of 'class'.
    • The benefit here is that the class of the object can be determined from the wiring without having to actually instantiate it first.
  6. An array with the following keys:
    • 'object': (object) The object to return.
    • 'factory': (callable) Like #2.
    • 'class': (string) Like #2.
      • All are optional. If all are omitted, defaults to 'class' being ResourceLoaderFileModule.
      • Unless 'object' was given, ->setConfig() and ->setLogger() are always called on the resulting object.
  7. Just the object itself, already created.

ObjectFactory is #1, ObjectCache, ApiFeatureUsage, and Scribunto are all #2, ApiModuleManager accepts both #3 and #5, and ResourceLoader is #6. SpecialPageFactory::getPage() accepts #4, #3, and #7.

StubObject is not really the same thing at all, so I'm going to ignore it. While hooks are vaguely similar in being "a way to somewhat generically call some function", there are enough differences that I'm going to ignore them too.


As for unifying these, I'd probably recommend something like the following.

First, some preprocessing for things that accept cases #3 or #4:

  • If it accepts #4 and is_callable() returns true, use [ 'factory' => $value ] as the spec.
  • If it accepts #3 and the value is a string and class_exists() returns true, use [ 'class' => $value ] as the spec.

Then the actual processing step, which always gets an array $spec and optional array $extraArgs:

  1. Test whether 'spec_is_arg' is empty()
    • If it is empty, set $args = $spec['args'] ?? []. Then do ObjectFactory's closure expansion depending on 'closure_expansion'.
    • If it's not, set $args = [ $spec ].
  2. $args = array_merge( $extraArgs, $args );
  3. If 'use_di_container' is set, array_unshift( $args, $services )
    • The exact ordering of $args, $extraArgs, and $services doesn't really matter as nothing currently has more than one of these.
  4. Create the object by doing the first of the following that is possible:
    1. If 'factory' is set, call $spec['factory']( ...$args ). If 'class' is also set, assert that the returned object is an instance of the class.
    2. If 'class' is set, call new {$spec['class']}( ...$args ).
    3. Throw an exception.
  5. If 'calls' is set, perform them as ObjectFactory already does.

This should be able to accommodate everything, plus some anticipated future use cases.

  • Things using ObjectFactory can work as-is, as long as they're not passing both 'factory' and 'class'.
  • ObjectCache, ApiFeatureUsage, and Scribunto can do $spec += [ 'spec_is_arg' => true ] before calling ObjectFactory. Again, a slight behavior change if both 'factory' and 'class' are given, or if some other key being used collides with a key ObjectFactory uses.
    • ResourceLoader is similar, but a bit more complicated because it should skip ObjectFactory if 'object' is set and needs to ensure that one of 'factory' or 'class' is set.
  • ApiModuleManager needs the preprocessing step, and would pass its additional arguments as $extraArgs.
  • SpecialPageFactory can test for #7, and if that's not the case can do the preprocessing step and call ObjectFactory.

The new 'use_di_container' key is to better support DI with ObjectFactory, by letting the wiring easily specify that the DI container be passed. To properly support this we'd probably want to add the DI container as an optional parameter to ObjectFactory::getObjectFromSpec() so it only needs to do MediaWikiServices::getInstance() for callers that aren't already passing it through from their own wiring callable.


If you really want to do DI for something like ApiModuleManager that does delayed instantiation, it's not exactly an obvious problem to solve.

  • You can't instantiate every possible module to inject them into the ApiModuleManager, as that's no longer delayed instantiation.
  • You don't want to use MediaWikiServices::getInstance() inside ->getModule() to get the service container to pass to ObjectFactory, as that is contrary to DI.
  • You probably don't want to be injecting the service container itself into ApiModuleManager either, as that smells bad.

IMO the thing to do would be to inject some sort of factory into the ApiModuleManager that opaquely does the instantiation on demand, so ApiModuleManager only depends on that factory. That factory can be very generic:

/**
 * A service for delayed instantiation of objects from wiring specifications
 */
interface IObjectFactoryService {
    /**
     * @param array $spec Wiring specification
     *   (add details about the format, or refer to ObjectFactory)
     * @param array $extraArgs Extra arguments to pass to the object's constructor/factory
     * @return object
     */
    public function createObject( array $spec, array $extraArgs = [] );
}

class ObjectFactoryService implements IObjectFactoryService {
    /** @var MediaWikiServices */
    private $services;

    public function __construct( MediaWikiServices $services ) {
        $this->services = $services;
    }

    public function createObject( array $spec, array $extraArgs = [] ) {
        return ObjectFactory::getObjectFromSpec( $spec, $extraArgs, $this->services );
    }
}

I like this proposal a lot. I can't find anything missing, either.

The only minor annoyance I spotted is with this:
If 'use_di_container' is set, array_unshift( $args, $services )

Having access to the service container in a factory method (wiring code), injecting it into a constructor is bad (the constructor of ObjectFactoryService itself being a rare exception to that rule). Objects holding on to a reference to the ServiceContainer in a member field is worse (again, ObjectFactoryService being the notable exception). The documentation of use_di_container should make this very clear.

Oh, and then there is the name. I would prefer ObjectFactory, Service is redundant. We can just use the existing class, add non-static methods, and keep the static methods as deprecated stubs.

One thing that we could add would be a convenience function that allows the caller to assert that the return value actually has the correct type.

The only minor annoyance I spotted is with this:
If 'use_di_container' is set, array_unshift( $args, $services )

Having access to the service container in a factory method (wiring code), injecting it into a constructor is bad (the constructor of ObjectFactoryService itself being a rare exception to that rule). Objects holding on to a reference to the ServiceContainer in a member field is worse (again, ObjectFactoryService being the notable exception). The documentation of use_di_container should make this very clear.

Remember that ObjectFactory is intended as the helper for turning declarative wiring into objects, particularly when the details of that wiring depend on user configuration. This is, IMO, the same as the fact that every callback in includes/ServiceWiring.php is passed the service object.

Oh, and then there is the name. I would prefer ObjectFactory, Service is redundant. We can just use the existing class, add non-static methods, and keep the static methods as deprecated stubs.

Note that ObjectFactory is https://packagist.org/packages/wikimedia/object-factory, so we can't tie it too closely to MediaWiki-specific concepts. We might wind up implementing some parts of the proposal as a wrapper around that library. But sure, that wrapper could be named MediaWiki\Services\ObjectFactory.

Change 511752 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/libs/ObjectFactory@master] Enhance to be able to replace most/all MediaWiki object instantiation

https://gerrit.wikimedia.org/r/511752

The standardization part seems very reasonable (I don't really like that spec_is_arg switches between two fundamentally different behaviors, and overlapping ObjectFactory parameters with constructor/factory parameters seems like a problem waiting to happen, but I don't see how we could avoid it for now, given that both patterns are widely used). IMO use_di_container is the wrong way around. Anything that receives a service container and uses it to fetch dependencies should be in the wiring. (ObjectFactory itself holding the container is fine, as it only forwards it; it can be thought of as being in the same "package" as MediaWikiServices, even if the librarization makes it a bit awkward.) There are two fundamental goals with the wiring:

  • all service dependencies should be in a single place and easy to review;
  • fetching service dependencies should be handled by MediaWikiServices so it can handle "meta-dependencies" in the future (such as creating different service instances for foreign wikis).

If you pass in the service container to (say) MyApiModule::factory, and then that method pulls services out of it, the first goal is hampered as dependencies will now be all over the place in factories and constructors; and the second might be hampered depending on how ObjectFactory would obtain the service container (if ApiModuleManager is turned into a service, and that service gets an ObjectFactory instance injected on creation, that should work fine).

  • all service dependencies should be in a single place and easy to review;

Impossible when you consider extensions, they can't edit core's ServiceWiring.php.

Also impossible if you want to make some aspects of the wiring configurable by the end user (in MediaWiki terms, via LocalSettings.php). Usually we wind up with some "Factory" class in the wiring, with the wiring of the real class in that factory.

If you pass in the service container to (say) MyApiModule::factory, and then that method pulls services out of it,

You're looking at it wrong. The "factory" in the spec in this case would be basically the same thing you'd put in ServiceWiring.php or reference from wiring data in extension.json.

If you're using "class" in the spec so the object's constructor gets passed the service container, you're just using a service locator pattern. The fact that you're using that pattern via ObjectFactory rather than custom wiring of some sort is immaterial, and I see no reason to try to put technical barriers in the way of it.

and the second might be hampered depending on how ObjectFactory would obtain the service container (if ApiModuleManager is turned into a service, and that service gets an ObjectFactory instance injected on creation, that should work fine).

In the abstract, what you have in parentheses is exactly the plan for how non-static ObjectFactory is intended to be used. A "FooBarFactory" service would be injected with an ObjectFactory and the set of specs that represent the wiring for the FooBars. Then FooBarFactory can instantiate the FooBars on demand without itself having to take every FooBar's dependencies as its own (and without every FooBar having to take as dependencies everything that any one FooBar might need).

In the specific case of ApiModuleManager it won't work in exactly that way, but details of the composition pattern used there would probably be offtopic here.

Impossible when you consider extensions, they can't edit core's ServiceWiring.php.

Right. Call it a predictable set of locations, then.

Also impossible if you want to make some aspects of the wiring configurable by the end user (in MediaWiki terms, via LocalSettings.php).

Changes by the end user are irrelevant for code/architecure review, it's not a problem if they are all over the place. (Not a problem for us, anyway...) Extensions can also redefine the wiring via the MediaWikiServices hook, which is more problematic, but comparable to how other things work (you can use hooks to dynamically define messages, API modules etc - sometimes it's unavoidable and hopefully extensions won't do it when it's not).

Usually we wind up with some "Factory" class in the wiring, with the wiring of the real class in that factory.

I'm not sure we mean the same thing by wiring. This has all the wiring in the same place:

// ServiceWiring.php
'FooFactory' => function ( $services ) {
    $fooType = $services->getMainConfig()->get( 'FooType' );
    Assert::precondition( is_subclass_of( $fooType, 'FooBase' ) );
    $dependency = $services->getDependency;
    return new FooFactory( $fooType, $dependency );
}

// FooFactory.php
class FooFactory {
    public function create( $bar ) {
        $fooType = $this->fooType;
        return new $fooType( $dependency, $bar );
    }
}

Even though the class can be configured in LocalSettings (or wherever), looking at the wiring is enough to determine the service dependency graph.

This has wiring in a nonstandard place:

// extension.json or some other config file
"SomeFoo": {
    "factory": "SomeFoo::factory",
    "use_di_container": true
}

// Foo.php
class Foo {
    public static function factory( $services, $bar ) {
        return new $fooType( $service->getDependency(), $bar );
    }
}

Without looking at Foo.php you can't tell what services it depends on. It's not the end of the world, but IMO makes it harder to follow changes which introduce or remove dependencies.

This has all the wiring in the same place:
[...]
Even though the class can be configured in LocalSettings (or wherever), looking at the wiring is enough to determine the service dependency graph.

This has wiring in a nonstandard place:
[...]
Without looking at Foo.php you can't tell what services it depends on. It's not the end of the world, but IMO makes it harder to follow changes which introduce or remove dependencies.

In the first case you're still assuming everything comes from ServiceWiring.php in core, but extensions can't use that. And if you're debugging things in a repl, all you get when trying to find that wiring function is "Closure" (until you think to try ReflectionFunction, which thankfully works). While with the second you see "SomeFoo::factory" right there.

Also in the first case you're assuming every implementation has the same one dependency as far as being able to configure it in LocalSettings.php goes.

Change 511752 merged by jenkins-bot:
[mediawiki/libs/ObjectFactory@master] Enhance to be able to replace most/all MediaWiki object instantiation

https://gerrit.wikimedia.org/r/511752

This comment was removed by Krinkle.

Change 531710 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] Add ObjectFactory as a service

https://gerrit.wikimedia.org/r/531710

Change 531710 merged by jenkins-bot:
[mediawiki/core@master] Add ObjectFactory as a service

https://gerrit.wikimedia.org/r/531710