Page MenuHomePhabricator

'LinkRenderer' service violates to not inspect request/session
Closed, ResolvedPublic

Description

Problem

The LinkRenderer service currently violates the best practice that a service must only depend on the wiki and its site-wide configuration. Specifically, it must not logically vary by information from the current web request.

Proposal

If there is use for a generic default that is e.g. based on the default user preferences for CLI use or something like that, then we could keep it. But if not, I suggest we deprecate and remove it in favour of the LinkRendererFactory service which exists already.

Impact

From a quick look in our code (Code search), it seems LinkRenderer is mainly used in code that is user-facing, which makes sense. The majority of its calls are in classes that have a direct connection to RequestContext:

  • Action and subclasses.
  • SpecialPage and subclasses.
  • Utility classes solely used by the above (e..g ProtectionForm).

For example:

includes/EditPage.php
	public function makeTemplatesOnThisPageList( array $templates ) {
		$templateListFormatter = new TemplatesOnThisPageFormatter(
			$this->context, MediaWikiServices::getInstance()->getLinkRenderer()
		);

The EditPage is a user-facing that requires a WebRequest and session context etc. It already has a req context object handy and could use getLinkRendererFactory()->forUser( $context->getUser() ) instead.

includes/OutputPage.php
		$linkRenderer = MediaWikiServices::getInstance()->getLinkRenderer();
		return $linkRenderer->makeLink( $title, null, [], $query );

Idem. OutputPage already has getUser() and knows how to deal with it. Trivial to replace.

And then there are a few odd-ball cases like the below:

// link to the group description page, if it exists
$linkTitle = self::getGroupPage( $group );
$linkRenderer = MediaWikiServices::getInstance()->getLinkRenderer();
`

This is tricky because the class is generic (not specific to skinned output), and the caller isn't inside a method that takes a "current user" parameter. The thing is, though, this code is effectively using legacy wgUser. Which is known-bad. Except this is worse because it's white-washed behind a modern service, evoking the illusion that the only code smell is lack of dependency injection for the future use case of being able to invoke the code for other wikis. When really it's much worse.

I could only find one or two cases like this, and I would suggest we convert them to what they are by calling RequestContext::getMain()->getUser() (formerly known as $wgUser) explicitly inline, which is what the service wiring currently does. On a case-by-case basis we may be able to figure out if it really needs to depend on the current user and maybe make it use a generic instance instead, or we may find that there is a nearby LinkRenderer, context or User object we can use already.

Event Timeline

Krinkle renamed this task from 'LinkRenderer' service violates constraint of service construction inspecting request/session to 'LinkRenderer' service violates to not inspect request/session.Apr 7 2020, 9:58 PM
Krinkle updated the task description. (Show Details)
Anomie subscribed.

The LinkRenderer service currently violates the best practice that a service must only depend on the wiki and its site-wide configuration. Specifically, it must not logically vary by information from the current web request.

I note this task doesn't specify how exactly LinkRenderer does so. After reading through the code review on the referenced patch, this task is apparently about the fact that ServiceWiring for LinkRenderer does

'LinkRenderer' => function ( MediaWikiServices $services ) : LinkRenderer {
    if ( defined( 'MW_NO_SESSION' ) ) {
        return $services->getLinkRendererFactory()->create();
    } else {
        // Normally information from the current request would not be passed in here;
        // this is an exception. (See also the class documentation.)
        return $services->getLinkRendererFactory()->createForUser(
            RequestContext::getMain()->getUser()
        );
    }
},

Specifically the call to createForUser(). That method uses the specified user's stub threshold preference:

public function createForUser( User $user ) {
    $linkRenderer = $this->create();
    $linkRenderer->setStubThreshold( $user->getStubThreshold() );

    return $linkRenderer;
}

I'll leave it to others to decide whether the claimed "best practice" is actually best practice or not. A lot of things depend on the request user in various ways.

The problem is the call to RequestContext::getMain() in ServiceWiring. The file doc states that as a thing not to do.

If we allow ServiceWiring to use RequestContext::getMain() then we should not have deprecated $wgUser, $wgTitle and just embrace that code will magically sometimes do the right thing (when called in page view context), and outside that it might or might not work, and one won't be able to re-use that internal code to apply to other users, titles or wikis (either within the same web request or outside of it in jobs, updates, CLI, and static endpoints).

It is not the service wiring's responsibility to know what user/title/session the caller of getLinkRenderer is wanting to use it for. And it is most certainly not the expectation that when deep inside generic DI-controlled utility code that suddenly a obtaining service X will obtain LinkRenderer which will then initiatlise RequestContext and start running hooks that will likely break and have never been intended to run outside skinned page view contexts.

When a job is acting on a user and interacts with some generic code that takes a User parameter, that code should be using SomethingFactory::method(User) in order to work right. But will it? The generic code might not know it will one day be used by a job, and if getSomething() works fine elsewhere people will use it. Worse, in a job it either be acting on the wrong user due to not seeing the original session, or if it honours NO_SESSION it will wrongly use the site defaults. Which could then cause cache corruption, database corruption, or even rights escalation (e.g. no longer blocked etc.). For link renderer, if the application doesn't crash due to a hook or unavailable dependency first, then the impact would be low given it's just about the "stub treshold". But if we accept this pattern here principally, then there is no reason to not expect it will be accepted elsewhere as well, at which the stakes can be more more severe.

If we hope the hacky default will never be used, it should instead be a fatal exception. E.g. "if MW_ENTRY_POINT is index/ap, and we're not under NO_SESSION, use the RequestContext, otherwise throw fatal error". But then, what is the point of having a service if the caller can't rely on it always being available and thus starts poisoning other services?

If there really are cases where a generic default is intentionally needed and wanted, I suppose the factory could offer a singleton like getSomethingFactory()->forSiteDefault(). If that's commonly needed it might even be its own top-level service (e.g. one that is always site-wide generic).

Examples of what could or has gone wrong:

Effectively the same problem:

I suggest we deprecate and remove it in favour of the LinkRendererFactory service which exists already.

I suppose this means we don't deprecate the LinkRenderer, but rather MediaWikiServices::getLinkRenderer() (and any other way to get a global instance).

The proposal sounds reasonable, but we should introduce instance caching into LinkRendererFactory. Otherwise we may end up creating a great many instances.

[…] we should introduce instance caching into LinkRendererFactory. Otherwise we may end up creating a great many instances.

Yeah, if we end up with many forUser() calls then yes. I suggest we wait and see though. Skin will request an instance for sure, which it can share with OutputPage (or vice-versa) and also be passed down from there to the handful of places that need it.

We can't inject it into services anyway, and I suspect there are little to no cases where a service would need to be given a User object at run-time just to pass it to its LinkRendererFactory. If that's the case, it might be just as feasible for that method to be given the caller's LinkRenderer instead.

I suppose it will depend in the case which makes sense, and caching certainly seems fine (might want to use MapCacheLRU to limit growth). But I also think that if it ends up rare, then we don't need to invest in caching upfront :)

This seems no longer an issue, because the threshold was removed with T284917 / https://gerrit.wikimedia.org/r/c/mediawiki/core/+/709245 and LinkRenderer no longer depends on User and Request

@Krinkle: Do you know if this should be closed per last comment?

Krinkle assigned this task to Pchelolo.

Yep, all done here.