Document: extensions that store user IDs or usernames need to listen to UserMerge hooks
Open, Needs Triage, Public



We're planning on deploying a global user merge tool to Wikimedia sites shortly. As the name suggests, it merges multiple users into one.

This means that if your extension stores user IDs or user names, it needs to listen to one of the UserMerge hooks (UserMergeAccountFields, MergeAccountFromTo, UserMergeAccountDeleteTables, or DeleteAccount) to make sure it doesn't refer to non-existent users. Reedy and I audited all deployed extensions last November, but new ones have been deployed since then. Please check your extension(s) and, if they need updating, file bugs that block T49918 and T69758.

This should be documented. Extensions that store user IDs and/or usernames have to handle user merges and renames on the WMF cluster and on any other wiki where the Renameuser or UserMerge extensions are installed.

  • Mention in Writing an extension for deployment
  • Document this on other pages about extension development... TBD! Create a new manual page about usernames and IDs? Mention on Database design pages? Where else?
  • Identify good extensions to link to as source code examples from the documentation.

Event Timeline

Any extension that is going to be deployed to SUL wikis on the Wikimedia cluster needs to support UserMerge (user_id, user_name) and Renameuser (user_name).

Since Renameuser is shipped in the tarball it would be a good idea for extensions in general to support it, but it's not *required* since it's not a core feature (there's a bug somewhere to merge it into core).

Some guidance on which hook to use would be nice; different hooks which seemingly perform the exact same thing create developer angst.
I'm guessing the fields/tables hooks should be preferred as they stay inside the transaction?

UserMergeAccountFields is preferred, as it automatically takes care of transactions and batching. In some cases, though, you have to use MergeAccountFromTo (for example, FlaggedRevs) because merging requires some logic beyond just updating the ID.

For some tables, we use UPDATE IGNORE in the UserMergeAccountFields hook and then delete any duplicates using the UserMergeAccountDeleteTables hook. DeleteAccount is the equivalent of MergeAccountFromTo, but for deletion.
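
To make the "UPDATE IGNORE, then delete duplicates" idea concrete, here is a rough standalone sketch in Python with an in-memory SQLite database. It is not how UserMerge itself is implemented: SQLite's UPDATE OR IGNORE stands in for MySQL's UPDATE IGNORE, and the table example_watch, its columns, and the helper merge_user_rows are all made up for illustration.

```python
import sqlite3


def merge_user_rows(conn, table, user_col, old_id, new_id):
    """Re-point rows from old_id to new_id, then drop the leftovers.

    Hypothetical helper. table and user_col are interpolated into the SQL,
    so they must be trusted identifiers, never user input.
    """
    # Step 1: move every row that can be moved. Rows that would collide
    # with an existing row of the target user are skipped, not failed.
    conn.execute(
        f"UPDATE OR IGNORE {table} SET {user_col} = ? WHERE {user_col} = ?",
        (new_id, old_id),
    )
    # Step 2: whatever still belongs to the old user is a duplicate of a
    # row the target user already has, so it can be deleted.
    conn.execute(f"DELETE FROM {table} WHERE {user_col} = ?", (old_id,))


conn = sqlite3.connect(":memory:")
# Made-up table: one row per (user, page), unique per pair.
conn.execute(
    "CREATE TABLE example_watch ("
    " ew_user INTEGER NOT NULL,"
    " ew_page INTEGER NOT NULL,"
    " PRIMARY KEY (ew_user, ew_page))"
)
conn.executemany(
    "INSERT INTO example_watch VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 20), (2, 30)],
)

# Merge user 1 into user 2 inside a single transaction.
with conn:
    merge_user_rows(conn, "example_watch", "ew_user", old_id=1, new_id=2)

rows = conn.execute(
    "SELECT ew_user, ew_page FROM example_watch ORDER BY ew_page"
).fetchall()
print(rows)
```

In this sketch the row (1, 20) cannot be moved because (2, 20) already exists, so the first statement skips it and the second removes it. No rows for user 1 are left afterwards.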

Some extensions, like Translate's sandbox feature, don't listen to the merge hooks and act only on account deletion, since we decided that retaining the data wasn't useful.