
UserMerge: Code Stewardship Review
Open, NormalPublic

Description

Rubric as requested by documentation:

A succinct problem statement to give context for why the review was initiated.

UserMerge is an extension that allows merging two accounts. While very useful, it doesn't work on Wikimedia, or better said, we are advised not to use it. To merge two user accounts, all edits and log data get merged. With Wikimedia's active development of new extensions, APIs and other features, merging a user account is no longer just a matter of merging edits and logs. UserMerge has not kept up with these changes, and using it is likely to cause incomplete or broken user account merges that will need to be fixed manually. UserMerge also performs heavy DB queries/full table scans that caused huge replag in the past when usage was attempted at dewiki and commonswiki (see data below). No local or global group is currently granted the 'usermerge' permission, so the extension is also unused.

Entry in Developers/Maintainers with:
@Legoktm is listed as maintainer.

Code Steward
None.

Maintainer (non-WMF team)
No team.

In-training
None.

Number, severity, and age of known and confirmed security issues
None.

Was it a cause of production outages or incidents? List them.

Does it have sufficient hardware resources for now and the near future (to take into account expected usage growth)?
I can't answer that question with the information I have.

Is it a frequent cause of monitoring alerts that need action, and are they addressed timely and appropriately?
No, because we don't use it: it is unreliable and/or causes huge DB load.

When it was first deployed to Wikimedia production
2012 in rOMWC57e3970888acd367e835eacb6d40556af0e57767

Usage statistics based on audience(s) served
N/A

Changes committed in last 1, 3, 6, and 12 months

Reliance on outdated platforms (e.g. operating systems)
N/A

Number and age of open patches
We currently have 9 open patches. The oldest one is https://gerrit.wikimedia.org/r/#/c/139085/ opened 3 years and 3 months ago.

Number and age of open bugs
We currently have 21 open tasks. The oldest one is from 2016.

Number of known dependencies?
N/A

Is there a replacement/alternative for the feature? Is there a plan for a replacement?
No.

Submitter's recommendation
If there are no plans, resources, or people interested in keeping up with the maintenance this extension needs in order to work on the projects, my recommendation is to undeploy UserMerge from Wikimedia production.


NOTE: Feedback can now be provided in this Task or at mediawiki.org.

Event Timeline

Restricted Application added a subscriber: Aklapper. Sep 18 2018, 6:27 PM
Aklapper updated the task description. Oct 5 2018, 1:51 PM
Tgr added a subscriber: Tgr. Dec 13 2018, 10:34 PM

Now that user data is deduplicated into its own table, maybe the extension (which does mass rewriting of revisions etc. to change the referenced user, which is fragile and hard to impossible to undo) could be rewritten to integrate with user table handling and work in a redirect-like fashion? That is, add something like User::getRawId() and have User::getId() return the merge target account's ID. (And have the extension provide a CentralIdLookup that does the same for global IDs.) DB queries where the actor ID is a condition would become slightly more complicated but otherwise it seems like a less painful approach.

(Of course that doesn't address the question of whether the extension is important enough to maintain in the first place.)
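To make the redirect-like idea concrete, here is a minimal sketch in Python rather than MediaWiki's PHP. It is a toy in-memory model, not MediaWiki code: `UserStore`, `merged_into`, `merge`, `unmerge` and `resolve` are hypothetical names, and `get_raw_id()` / `get_id()` only mirror the `User::getRawId()` / `User::getId()` split proposed above. Rows keep the old ID; only the lookup changes, which is why a merge becomes trivial to undo.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UserStore:
    """Toy stand-in for the user table plus a merge-redirect map (hypothetical)."""
    merged_into: Dict[int, int] = field(default_factory=dict)

    def resolve(self, raw_id: int) -> int:
        """Follow merge redirects until an unmerged account is reached."""
        current = raw_id
        while current in self.merged_into:
            current = self.merged_into[current]
        return current

    def merge(self, source_id: int, target_id: int) -> None:
        """Record that source_id now redirects to target_id's final account."""
        target_root = self.resolve(target_id)
        if target_root == source_id:
            raise ValueError("merge would create a redirect cycle")
        # No revision or log rows are rewritten; only the redirect is stored.
        self.merged_into[source_id] = target_root

    def unmerge(self, source_id: int) -> None:
        """Undo a merge by dropping the redirect."""
        self.merged_into.pop(source_id, None)


@dataclass
class User:
    raw_id: int
    store: UserStore

    def get_raw_id(self) -> int:
        """The ID actually stored in rows that reference this account."""
        return self.raw_id

    def get_id(self) -> int:
        """The effective ID: the merge target if this account was merged."""
        return self.store.resolve(self.raw_id)
```

In this model, merging account 2 into account 1 leaves every stored reference to 2 untouched, while `User(2, store).get_id()` reports 1 until `unmerge(2)` is called.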

MaxSem added a subscriber: MaxSem. Dec 14 2018, 5:55 AM

Now that user data is deduplicated into its own table, maybe the extension (which does mass rewriting of revisions etc. to change the referenced user, which is fragile and hard to impossible to undo) could be rewritten to integrate with user table handling and work in a redirect-like fashion?

This doesn't address the problem of other extensions storing user IDs in a billion other places, especially given our current trend of moving functionality into standalone services.

Tgr added a comment. Dec 14 2018, 6:51 AM

You'd keep storing the old user ID, but when you look it up you'd be required to use the User class and not direct DB manipulation (which is good practice anyway, a DB table should be handled by a single component), and that class would handle the redirection to the new user. A bunch of code would probably have to be rewritten for that to work, but then it would have to be rewritten anyway due to the actor migration.

Unless you want to join that old ID with something...

MarcoAurelio updated the task description.
Tgr added a comment. Dec 17 2018, 7:28 AM

Then you'd call something like $user->getIdSet() which would return an array of all the IDs and put 'user_id' => $ids in the query. That's a problem when you want a unique join, that is, when you are joining the user table with something that each user can only have a single instance of (unlike edits etc). But any such thing requires special handling on merge anyway.
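As a rough illustration of querying by an ID set, the following Python sketch uses an in-memory SQLite database. The `edits` table, its `user_id` column, and the helpers `get_id_set` and `count_edits` are assumptions made for the example (the comment above only names a hypothetical `$user->getIdSet()`); it is not MediaWiki's schema or database API.

```python
import sqlite3
from typing import Dict, List


def get_id_set(target_id: int, merged_into: Dict[int, int]) -> List[int]:
    """Return target_id plus every raw ID that redirects directly to it."""
    sources = [src for src, dst in merged_into.items() if dst == target_id]
    return sorted([target_id] + sources)


def count_edits(conn: sqlite3.Connection, ids: List[int]) -> int:
    """Count rows whose user_id is any of the given IDs (an IN condition)."""
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT COUNT(*) FROM edits WHERE user_id IN ({placeholders})"
    return conn.execute(sql, ids).fetchone()[0]


def make_demo_db() -> sqlite3.Connection:
    """Build a toy table where users 1 and 2 have edits under their old IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edits (edit_id INTEGER PRIMARY KEY, user_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO edits (user_id) VALUES (?)", [(1,), (1,), (2,), (3,)]
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    ids = get_id_set(1, {2: 1})  # user 2 was merged into user 1
    print(ids, count_edits(conn, ids))
```

A one-to-many lookup like this tolerates several IDs per person. A table that allows only one row per user would still need its rows reconciled at merge time, which is the special handling the comment refers to.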

In its current state, UserMerge isn't suitable for use on Wikimedia sites due to database performance issues, and poor integration with CentralAuth/SUL. These problems aren't insurmountable, but I don't think the time investment into fixing them is worth the ability to merge users, nor has anyone stepped up to fix them.

I would recommend/endorse undeployment until those problems can be resolved, either by improving the database schema so UserMerge doesn't have to make unindexed, expensive database update queries, or some redirect thing like Tgr has suggested.

My key takeaway from this discussion is that although we could address the issues and shortcomings of this extension as it is today, there isn't much perceived value in doing so. As a result, it appears to make the most sense to move forward with undeploying/sunsetting this extension.

Ltrlg added a subscriber: Ltrlg. Jan 30 2019, 8:09 AM

@MarcoAurelio, we're going to move forward on undeploying this extension from production. I'll create a task shortly.

Jrbranaa triaged this task as Normal priority. Feb 13 2019, 11:29 PM
Jrbranaa moved this task from In Review to Prioritized on the Code-Stewardship-Reviews board.

@MarcoAurelio, we're going to move forward on undeploying this extension from production. I'll create a task shortly.

@Jrbranaa Thank you for moving forward with this one. I'll take a look at the tasks and see if I can help in the process.

I understand that this extension is likely to fall by the wayside. I do not have any technical advice to give, although I have been able to use the extension successfully.

I also did some research because I wanted to remove the visible log entries left behind after merging and then deleting one or more users. I did not want the logs to remain visible once the spammer accounts had been merged and deleted.

I then discovered https://www.mediawiki.org/wiki/Manual:RevisionDelete/fr, which allowed me to select the UserMerge log entries and hide them with a simple checkbox.

From my reading and my limited technical understanding of MediaWiki, I gathered that the script Cleanmediawiki.sh might have been better suited to my needs: mass deletion of users, pages, and the logs that go with them.
However, I could not find this script on the official MediaWiki pages, which I find worrying; I do not know whether it is stable. (Https://github.com/ZerooCool/cleanmediawiki)

Nevertheless, and this is the reason for my participation in this thread, I find that UserMerge ranks very well on Google when I search in French for "Delete a user account".

It is also the only consistent answer I could find; without UserMerge, I would not know how to delete a spammer user.

I have read here that the API is wonderful, maintained, and does a lot of things, yet I know nothing about how to use it properly.

For these reasons, I think it would be good not to make UserMerge disappear, but to maintain it.

UserMerge is not going to disappear. It will just be undeployed from Wikimedia wikis, because too much technical debt and too many incompatibilities make the extension unusable for us on WMF projects. Third-party wikis will not be affected. The extension will remain at mediawiki.org for everyone to download and install, and patches will of course still be welcome at Gerrit from anyone willing to maintain it.

OK thank you for your answer.
In this regard, I completed the thread, since the link mediawiki extension.

Since there are experts on this subject here, I would like to ask my question, if that is possible in this thread. If UserMerge is no longer supported, how do you remove spam users with the API? As I said, I do not know the API. It's a lot of learning. Can you point me to a tutorial, or a good way to delete spammer users along with their pages and their changes?

I do not want to be redirected to the vandalism page; I have already studied it and, apart from UserMerge, I cannot find any information on how to delete a user.

Tgr added a comment. Mar 19 2019, 1:46 AM

If UserMerge is no longer supported, how do you remove spam users with the API?

Just block them, hide them, and rename them to Spammer0001, Spammer0002, etc. I don't think UserMerge solves any real problem with respect to spam.