Page MenuHomePhabricator

Update the serial mapping config for generating temporary user names on beta
Closed, ResolvedPublic1 Estimated Story Points

Description

The config on beta is currently:

$wgAutoCreateTempUser['serialMapping'] = [ 'type' => 'scramble' ];

Following T345255: Decide the numerical order for temporary accounts (Scramble/Serial), we'd like:

$wgAutoCreateTempUser['serialMapping'] = [ 'type' => 'plain-numeric' ];

Investigate

Before doing this, we should investigate whether it's OK just to switch from scramble to serial:

  • Do we risk naming conflicts?
  • Do we need to do something to make this safe, e.g. manually change the latest uas_values?
    • Answer: No longer needed after T353545 because we can offset the value if required.

Event Timeline

Investigate

Before doing this, we should investigate whether it's OK just to switch from scramble to serial:

  • Do we risk naming conflicts?

Yes, and not even really a risk but a blocker from what I can tell.

The current value being scramble means that when changing this to plain-numeric we can no longer be safe in the assumption that a new temporary account name will not have already been used. While the documentation of SerialProvider::acquireIndex says Acquire an integer such that it is unlikely to be used again, and return it, the callers to this method seem to assume that it is instead impossible and not unlikely.

For example, TempUserCreator::create will cause a fatal error if the username already exists and when called by EditPage::internalAttemptSave this will just cause the edit to not be saved.

In my local testing I found that it would give you an error about the temporary account already existing, preventing you from saving the edit. However, it would at the same time assign you the temporary account name and allow you to edit under it when you used a different edit form. This is bad...

  • Do we need to do something to make this safe, e.g. manually change the latest uas_values?

The code uses the following to provide the ID (where $value is the uas_value):

return $value * $this->numShards + $shard

Reversing the equation we get $value = ( $id - $shard ) / $this->numShards. On betawikis the value for $this->numShards is 8. The value for $shard will range between 0 and 7, and for our case the largest $value will come from the smallest $shard. Therefore, if we set all uach_value to greater than MAX ( $id ) / $this->numShards it should prevent any collisions. This change in value must be done after the configuration change is made, but must also be done quickly to prevent issues.

Therefore, if this was to change in production we would need to write a script for it to prevent temporary accounts being created while the switch is made.


On a separate note, we need to find a better way to deal with collisions. While setting uas_value to be above certain number for all of the shards works in this case, going back to scramble would again cause issues that would require another increase. Ideally, the code should use the provider with the understanding that it is unlikely to be used again but not impossible and as such should try getting another temporary account name.

In summary we can update the config, but must make the change to the global_user_autocreate_serial immediately after the configuration change is made. If this change needs to be made in production, this will require a maintenance script that both makes the configuration change and immediately updates global_user_autocreate_serial while also preventing temporary accounts from being created to prevent issues.

The alternative is to fix the users of the serial providers to assume that collisions can occur, and in these cases re-attempt using a different uas_value.

@Tchanders any thoughts and does what I've found make sense?

Thanks @Dreamy_Jazz.

In my local testing I found that it would give you an error about the temporary account already existing, preventing you from saving the edit. However, it would at the same time assign you the temporary account name and allow you to edit under it when you used a different edit form. This is bad...

...Indeed. From a very cursory glance, it looks as though we may want to call AuthManager::autoCreateUser with login false, and login after we've confirmed there are no errors... (That call is made here.)

The alternative is to fix the users of the serial providers to assume that collisions can occur, and in these cases re-attempt using a different uas_value.

I think this is a good idea - we shouldn't allow the above scenario to accidentally happen, and we can't be sure the config won't change at some point.

Therefore, if we set all uach_value to greater than MAX ( $id ) / $this->numShards it should prevent any collisions.

This is really helpful to know. Though I agree it would be nice to avoid having to do this. For the changes we're making on beta, I suspect avoiding collisions may be enough - I don't think anyone will mind if the numbering sequence isn't perfectly chronological. (If T349494: Update the format of temporary user names to include the year and hyphens were to go first, I suppose there wouldn't be any collisions, but it's still helpful to have collision handling.)

I've implemented collision handling in the subtask (T353390). As such, a change to the beta wikis config can be made once that subtask is resolved. Therefore, moving this back to 'Ready' while this is essentially blocked on that task.

Dreamy_Jazz subscribed.

Will be done as part of T349486: Change temporary user pattern configuration on beta to match the updated prefix. Doing this as part of that ticket also means we do not need to set an offset (which was added in T353545) so only the change in the ticket description is needed for this ticket.

Change 992670 had a related patch set uploaded (by STran; author: STran):

[operations/mediawiki-config@master] Update beta configs to reflect new temp account naming pattern

https://gerrit.wikimedia.org/r/992670

Change 992670 merged by jenkins-bot:

[operations/mediawiki-config@master] Update beta configs to reflect new temp account naming pattern

https://gerrit.wikimedia.org/r/992670