Page MenuHomePhabricator

Create a User factory
Closed, ResolvedPublic

Description

There should be a UserFactory service for constructing users

In the initial stage, it can just proxy the existing User::newFrom* methods, similar to the existing TitleFactory, which was created in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/95586dd93721d17135cd4d0dea53f62bd7b781f5

Add TitleFactory
Makes it possible to mock static Title methods in tests, where they are one of the more common reasons for not being able to use MediaWikiUnitTestCase.
Actually introducing dependency injection to Title is left for the future.

Subsequently, it will be converted to an actual factory, replacing (check mark means the logic is now in the factory)

  • User::newFromName - inject UserNameUtils
  • User::newFromId
  • User::newFromActorId
  • User::newFromIdentity
  • User::newFromAnyId
  • User::newFromConfirmationCode - inject LoadBalancer
  • User::newFromRow

This will be another step in splitting up the User "god object"

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Change 598144 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] UserFactory v.1 - wrapper for User::newFrom* static methods

https://gerrit.wikimedia.org/r/598144

eprodromou subscribed.

@DannyS712 thanks for tagging us. In the future, just tag CPT, and we'll route it to the right place on our boards!

Also maybe idFromName() ? Doesn't return a User though. :/

Change 598144 merged by jenkins-bot:
[mediawiki/core@master] UserFactory v.1 - wrapper for User::newFrom* static methods

https://gerrit.wikimedia.org/r/598144

Change 605297 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Move User::newFromIdentity to UserFactory

https://gerrit.wikimedia.org/r/605297

Also maybe idFromName() ? Doesn't return a User though. :/

Though, I wonder if that method should be replaced with something like:

newFromName( 'Example' )->getId()

Change 605297 merged by jenkins-bot:
[mediawiki/core@master] Move User::newFromIdentity logic to UserFactory

https://gerrit.wikimedia.org/r/605297

Change 630241 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Move User::newFromName logic to UserFactory

https://gerrit.wikimedia.org/r/630241

Change 630241 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Move User::newFromName logic to UserFactory

https://gerrit.wikimedia.org/r/630241

Now has more code to review

Change 630241 merged by jenkins-bot:
[mediawiki/core@master] Move User::newFromName logic to UserFactory

https://gerrit.wikimedia.org/r/630241

Change 630795 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Add UserRigorOptions interface

https://gerrit.wikimedia.org/r/630795

Change 630795 merged by jenkins-bot:
[mediawiki/core@master] Add UserRigorOptions interface

https://gerrit.wikimedia.org/r/630795

Change 631481 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Migrate more User::newFrom* methods to UserFactory

https://gerrit.wikimedia.org/r/631481

Change 631811 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Inject dependencies into PasswordReset, and cleanup

https://gerrit.wikimedia.org/r/631811

Change 631811 merged by jenkins-bot:
[mediawiki/core@master] Inject dependencies into PasswordReset, and cleanup

https://gerrit.wikimedia.org/r/631811

Change 631481 merged by jenkins-bot:
[mediawiki/core@master] Migrate more User::newFrom* methods to UserFactory

https://gerrit.wikimedia.org/r/631481

Since User::newFromRow is basically just a wrapper for User::loadFromRow, I don't think it needs to be migrated, which results in needing to create the service even though none of the dependencies would be needed

I suggest next creating a UserStore for database interactions:

  • User::createNew -> UserStore::createNewUser
  • User::addToDatabase -> UserStore::addUserToDatabase
  • User::updateActorId -> UserStore::updateActorId (private method, only needed for ::createNew and ::addToDatabase)

User::newSystemUser could go in either the UserStore or the UserFactory. Thoughts?

The patch https://gerrit.wikimedia.org/r/631481 broke the tests in the GoogleLogin extension, and probably many more. What the extension does is:

  • It creates User objects with new User() for anonymous users, and User::newFromId( 123 ) for logged in users with an arbitrary id. This is not rare. Many, many tests do this.
  • The only methods that are ever called are User::getId() and User::isAnon().
  • Both don't need anything but the id – which is already given. getId should just return the given id (or 0 for the anonymous user). isAnon should just return true or false depending on the id.
  • Note this test is marked as a unit test, i.e. it is not allowed to access globals.

What happens now:

  • InvalidArgumentException: Key "MaxNameChars" not found in input sources from a ServiceOptions class, somehow executed as part of the ServiceWiring.

I did a Git bisect and found the patch mentioned above.

In other words: We can't use User objects in unit tests any more, even if nothing we do with the User object requires any knowledge about globals or the database – i.e. we use it as a pure value object. That was possible before – as it is with several core classes. Some behave like pure value objects when you limit yourself to a small set of methods (like getId and isAnon in this example).

Now all User objects require knowledge about globals they don't even use.

Yea, I can replace that newFromId call with a mock. But why? This feels like a bug. This scenario now executes additional code that was not executed before, and is not even needed. I'm told startup time is critical (pinging Performance-Team).

PS: I can't even replace the bad calls with a mock. I mean, I can do this in the tests, but there are newFromId calls in production code as well. It became impossible to test this code in a unit test.

Change 645338 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/GoogleLogin@master] Make compatible with PHP8: can't use "match" any more

https://gerrit.wikimedia.org/r/645338

Change 645338 merged by jenkins-bot:
[mediawiki/extensions/GoogleLogin@master] Make compatible with PHP8: can't use "match" any more

https://gerrit.wikimedia.org/r/645338

Still broken: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GoogleLogin/+/646705. Hey, that's not even my codebase. Still I care. Still any response would be helpful.

Still broken: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GoogleLogin/+/646705. Hey, that's not even my codebase. Still I care. Still any response would be helpful.

@thiemowmde We looked at the patch linked and a possible way forward here would be to modify the GoogleUserMatching class to inject UserFactory as a dependency. This would modify the calls to the User static methods to using the UserFactory instead.

As for the unit tests in GoogleUserMatchingTest, you'd need to mock the UserFactory in setUp() and inject it into all calls of the GoogleUserMatching class. Also, for the two tests that are using newFromId() mock the method to return the user with the correct id.

Sorry, I'm afraid my post was misleading. I know how dependency injection works. I do have two problems:

  1. I won't update this extension. Who is going to do this?
  2. The change means that a lot of code is executed that was not executed before, and is not even needed. This concerns me because the change is done in a fundamental area of core that possibly affects the performance of billions of pageviews. It must be possible to initialize this stuff only when it's actually needed. This is how it was done before, and it worked.

In other words: We can't use User objects in unit tests any more, even if nothing we do with the User object requires any knowledge about globals or the database – i.e. we use it as a pure value object. That was possible before – as it is with several core classes. Some behave like pure value objects when you limit yourself to a small set of methods (like getId and isAnon in this example).

Ideally, this use case would be covered by a UserIdentity. If only User::getId() and User::isAnon(), that should work - but of course type hints would need to be changed. I have not checked whether this is possible here.

This concerns me because the change is done in a fundamental area of core that possibly affects the performance of billions of pageviews. It must be possible to initialize this stuff only when it's actually needed. This is how it was done before, and it worked.

For live performance, that would be a problem if the majority of requests could before be handled without loading name validation configuration and code, and now this code gets executed. As far as I know, we always needed the basic name validation to be loaded and executed, so nothing really changed.

However, I see the issue that a "pure" unit test that would previously function with a User instance (if one was careful) now no longer can: the static pseudo-constructors in the User class now need MediaWikiServices, which means they need a full MediaWiki runtime. That is indeed annoying. But if we want to move away from the old "monster" classes like User, the only way forward I can see is to use true value objects wherever possible.

EDIT: as it turns out, the relevant classes already type hint against UserIdentity. So the tests can just use PageIdentityValue, with no need for mocks. I made a patch to try this out: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GoogleLogin/+/648346. It's not much different from the previous patch, just without the need to mock User.

However, the call to User::newFromId in GoogleUserMatching::authenticatedEmailMatcher remains a problem that could only be resolved by injecting a mock UserFactory.

I understand that it may seem problematic that, while it used to be possible to instantiate a User based on an ID as a "value object" without the need of further mocking, as long as you didn't trigger the lazy loading mechanism.

However, this lazy loading adds quite a bit of complexity (especially in the case of User, it's a constant source of bugs) for little benefit, since we generally end up fully loading the data anyway.

Unfortunately, we don't have a UserManager or UserStore service or such yet - but in this context, it doesn't matter, you have to make the factory either way. The code is creating a user instance that is based on system state. If we want to get rid of global state, this means tests will have to mock the mechanism that provides that state. I see no way around this.

Since User::newFromRow is basically just a wrapper for User::loadFromRow, I don't think it needs to be migrated, which results in needing to create the service even though none of the dependencies would be needed

I suggest next creating a UserStore for database interactions:

  • User::createNew -> UserStore::createNewUser
  • User::addToDatabase -> UserStore::addUserToDatabase
  • User::updateActorId -> UserStore::updateActorId (private method, only needed for ::createNew and ::addToDatabase)

User::newSystemUser could go in either the UserStore or the UserFactory. Thoughts?

I see that T272689: Investigate design of UserStore or UserAccountStore doesn't cover User::newSystemUser. @daniel do you think User::newSystemUser should be in UserStore (and thus this task be resolved) or in the UserFactory?

User::newSystemUser could go in either the UserStore or the UserFactory. Thoughts?

I see that T272689: Investigate design of UserStore or UserAccountStore doesn't cover User::newSystemUser. @daniel do you think User::newSystemUser should be in UserStore (and thus this task be resolved) or in the UserFactory?

Maybe. The interface drafts are not intended to be exhaustive, we can add things as needed.

Note btw that system users make sense as actors, but not as user accounts.

(leaving for CPT for the last bit of User::newSystemUser)

daniel lowered the priority of this task from High to Medium.Jan 27 2021, 5:39 PM
daniel claimed this task.

Resolving. Creation of system users will be addressed separately.