Page MenuHomePhabricator

Code review CentralAuth's GlobalUserMerge and related UserMerge code before SUL finalization mass usage
Closed, ResolvedPublic

Description

Fix the bugs in Special:GlobalUserRename

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Event Timeline

Keegan raised the priority of this task from to Needs Triage.
Keegan updated the task description. (Show Details)
Keegan added a project: SUL-Finalization.
Keegan moved this task to To-do on the SUL-Finalization board.
Keegan changed Security from none to None.
Keegan added subscribers: Keegan, Legoktm.

This code needs review by a senior architect, and it must be tested and deployed in the first week or two of May, at the latest.

Given that we're at the end of March, I'd like to ensure that the resources needed to check on this are available. This needs to be checked before SUL finalization begins (still slated for April 15th) - we're a little under 3 weeks away. @RobLa or @Brion, pinging to draw your attention. Possible to give a timeframe?

bd808 renamed this task from Revisit GlobalUserMerge after testing to Code review CentralAuth's GlobalUserMerge and related UserMerge code before SUL finalization mass usage.Mar 27 2015, 3:04 PM
This comment was removed by Keegan.

The code, as written, works in theory but not in practice and breaks in all kinds of interesting ways with only slight pushes.

Not that we don't know that it's needed, but here's just the onwiki backlog waiting on this tool: https://meta.wikimedia.org/wiki/Steward_requests/Username_changes#Requests_with_usurps_and_complex_renames

After finalization it will likely be a hundred fold for the first couple of weeks.

I'm giving the updated extension a first-pass review now...

@brion, hows the review going so far?

The code, as written, works in theory but not in practice and breaks in all kinds of interesting ways with only slight pushes.

Is there any more information about the circumstances under which it fails, and the failure mode?

The code, as written, works in theory but not in practice and breaks in all kinds of interesting ways with only slight pushes.

Is there any more information about the circumstances under which it fails, and the failure mode?

Absolutely. Here's a couple of scenarios that occurred when I was testing it in November:

  1. Account 1 is registered and makes an edit
  2. Account 2 is registered and makes no edit
  3. Account 1 is merged into Account 2
  4. The edit vanishes from attribution but sits in the database
  1. Account 1 is registered, no edits
  2. Account 2 is registered, no edits
  3. Account 3 is registered, an edit
  4. All accounts merged into Account 1
  5. The edit shows up in Account 1's contributions, but Account 3 was deleted from the database, corrupting the move log as though the account never existed

...and other variations of the sort. GUM is having trouble staying in sync between attribution, logging, and account storage*.

*edit: in that, it will delete the local account that is being merged instead of just renaming it and attaching it to the global account

The code is not too bad. The LocalUserMergeJob potentially has a lot of work to do, which it does in a lot of separate transactions, so the work could easily be incomplete if an exception is encountered, e.g. a deadlock. LocalUserMergeJob::allowRetries() returns true (the default is not changed), which may possibly mitigate the issue depending on where the exception occurs.

I think a good way to proceed would be to test it on beta and to file detailed bugs if any issues occur. By detailed, I mean including the relevant entries from the runJobs log and the exception log.

I don't know if there will be job queue exceptions in practice. Maybe we can get away with running it as it is, i.e. without rearchitecting for robustness. Pilot deployment may help to nail down how much work we need to do here.

I have filed the bug @Keegan found as T96117.

  1. Account 1 is registered and makes an edit
  2. Account 2 is registered and makes no edit
  3. Account 1 is merged into Account 2
  4. The edit vanishes from attribution but sits in the database

I can't reproduce this. I checked the revision table on beta's metawiki for referential integrity, but the only issues I could find were caused by T96117. I tried doing it myself, but everything seems fine afterwards: http://meta.wikimedia.beta.wmflabs.org/w/index.php?title=User:Timtest2&action=history

I don't recall the circumstances at this point, I just recall talking to @Legoktm about it as it happened. It was in his very first release of the code to betalabs, so it could have very well have been a one-off bug or fixed.

Legoktm claimed this task.

At this point we're just doing testing, and nothing is majorly breaking so far...