Update the format of temporary user names to include the year and hyphens
Closed, Resolved, Public

Description

From the parent task:

  • The prefix for temporary usernames will be ~YYYY-n where YYYY indicates the year when the temporary username is created
  • The identifying temp user number n is broken into groups of 5 digits separated by hyphens. If the digits don't divide evenly into groups of 5, the last group on the right can have fewer digits.
  • At the end of every year n is reset and the counting starts all over.
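The format described in the bullets above can be sketched as follows. This is a minimal illustration, not the code that was eventually merged: the function name formatTempUserName is hypothetical, and it assumes the serial number is a non-negative integer whose digits are grouped from the left, so that only the rightmost group can be short.

```php
<?php
/**
 * Hypothetical helper (not part of MediaWiki): builds a name of the form
 * ~YYYY-n, where n is split into groups of up to 5 digits joined by hyphens.
 */
function formatTempUserName( int $year, int $serial ): string {
	// str_split() chunks from the left, so only the last group can be short.
	$groups = str_split( (string)$serial, 5 );
	return '~' . $year . '-' . implode( '-', $groups );
}

echo formatTempUserName( 2024, 1234567 ), "\n"; // ~2024-12345-67
echo formatTempUserName( 2024, 42 ), "\n"; // ~2024-42
```

A serial with fewer than six digits produces a single group, so early names in each year contain only the hyphen after the year.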

Handled elsewhere: changing the prefix, resetting the counter each year.

Temporary user names are generated by TempUserCreator::acquireName:

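private function acquireName(): string {
	// Reserve the next unique integer index.
	$index = $this->getSerialProvider()->acquireIndex();
	// Map the index to the unique string used in the user name.
	$serialId = $this->getSerialMapping()->getSerialIdForIndex( $index );
	// Insert that string into the configured name pattern.
	return $this->config->getGeneratorPattern()->generate( $serialId );
}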

This combines the genPattern and the string generated from serialMapping, defined on the AutoCreateTempUser config.

The serialMapping option determines how to map a unique integer to a unique string for the user name.
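To make the two steps concrete, here is a rough standalone sketch of how a serial mapping and a name pattern combine. Both functions are hypothetical stand-ins rather than the MediaWiki classes, and the use of '$1' as the placeholder in the pattern is an assumption made for the example.

```php
<?php
// Hypothetical stand-in for a serial mapping: unique integer -> unique string.
// Here the mapping is simply the decimal representation of the index.
function exampleSerialIdForIndex( int $index ): string {
	return (string)$index;
}

// Hypothetical stand-in for the generator pattern. The '$1' placeholder is
// an assumption for this sketch; it is replaced by the serial string.
function exampleGenerate( string $pattern, string $serialId ): string {
	return str_replace( '$1', $serialId, $pattern );
}

echo exampleGenerate( '~$1', exampleSerialIdForIndex( 257 ) ), "\n"; // ~257
```

Because the mapping and the pattern are separate steps, a change to how the unique part is rendered (for example, grouping digits) does not need to touch the pattern, and vice versa.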

There are a few approaches we could use here:

  1. Add a hook to acquireName that WikimediaMessages can handle to add the year and the hyphens
  2. Add another serialMapping type that adds the year and hyphens
  3. Add another config option addYear that can be used with any serialMapping, and also addhyphens for specifying whether numbers will be broken up by hyphens, and how many numbers should be in each group

(1) is my preference, since it keeps the code base clean and allows for further tweaks. However, it feels a little dangerous to rely on WikimediaMessages for usernames everywhere, though I'm not quite sure why...

(2) and (3) seem brittle and as though we're adding unnecessary WMF-specific complexity to core

Chosen approach: (3), as discussed in the comments.

Event Timeline

Tchanders renamed this task from "Update the format for the unique part of temporary user names" to "Update the format of temporary user names to include the year and hyphens". (Oct 23 2023, 10:48 AM)
Tchanders updated the task description.

(1) is my preference, since it keeps the code base clean and allows for further tweaks. However, it feels a little dangerous to rely on WikimediaMessages for usernames everywhere, though I'm not quite sure why...

@kostajh asked me to expand on this, so I've thought a little more. I think it pushes the responsibility for keeping temporary user names in a consistent format out to an extension, which in practice creates a dependency (albeit not a logical one) from core on an extension.

If we were just using WikimediaMessages to alter the way the name is displayed in the UI, but not how it is stored in the DB, I think this would be more palatable.

(2) and (3) seem brittle and as though we're adding unnecessary WMF-specific complexity to core

I would argue that (2) or (3) seem better than (1). Some thoughts on why:

  • While our chosen format may be just for the WMF, I think that if our naming scheme works well then third-party wikis might want to use it. Furthermore, if we go with (3) then third-party wikis can include the year and/or hyphens in their custom naming format.
  • To add the year to the temporary account name, we need a reliable way to determine when the new year has hit that does not cause race conditions. As discussed in T349501, this probably involves adding the year as a column to user_autocreate_serial and then once the new year hits we switch to a new row that matches the current year.
    • If we need to add a column and we go with (1), we would be adding a column to core that is only used by an extension. This adds complication to the DB which isn't justified by a useful feature.

If we need to add a column and we go with (1), we would be adding a column to core that is only used by an extension. This adds complication to the DB which isn't justified by a useful feature.

Yeah I agree. I think the need for T349501: Update temporary user names to start their counter again each year means we have to admit that we're adding year handling in core.

This needs to be done at the same time as T349501: Update temporary user names to start their counter again each year, otherwise we get naming conflicts. E.g.:

+-----------+-----------+----------+
| uas_shard | uas_value | uas_year |
+-----------+-----------+----------+
|         0 |       257 |        0 |
|         0 |         1 |     2023 |
+-----------+-----------+----------+

The serial ID generated from the index 1 in the second row was already generated by the first row (back when the index was 1). If the year isn't also in the name, the names will be the same.
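The conflict can be reproduced with a small in-memory sketch. Everything here is hypothetical: the array stands in for the rows of user_autocreate_serial (keyed by year, with 0 as the legacy row), nextIndex is an illustrative helper, and the '~' prefix is assumed for the example.

```php
<?php
// Hypothetical in-memory stand-in for the table above: one counter per year.
// The legacy row (year 0) has reached 257; the 2023 row has not been used yet.
$counters = [ 0 => 257, 2023 => 0 ];

// Illustrative helper: increment and return the counter for the given year.
function nextIndex( array &$counters, int $year ): int {
	$counters[$year] = ( $counters[$year] ?? 0 ) + 1;
	return $counters[$year];
}

$index = nextIndex( $counters, 2023 ); // first index of the 2023 row: 1

$legacyName = '~1'; // name issued long ago, when the legacy counter was 1
$withoutYear = '~' . $index; // same serial, no year in the name
$withYear = '~2023-' . $index; // same serial, year included

var_dump( $withoutYear === $legacyName ); // bool(true): the names collide
var_dump( $withYear === $legacyName ); // bool(false): the year disambiguates
```

This is why restarting the counter and adding the year to the name have to land together: either change alone leaves a window in which an already-issued name can be generated again.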

Change 983900 had a related patch set uploaded (by Tchanders; author: Tchanders):

[mediawiki/core@master] WIP Use year in temporary user names and restart index each year

https://gerrit.wikimedia.org/r/983900

Change 984514 had a related patch set uploaded (by Tchanders; author: Tchanders):

[mediawiki/core@master] WIP Use year in temporary user names and restart index each year

https://gerrit.wikimedia.org/r/984514

Change 984514 abandoned by Tchanders:

[mediawiki/core@master] Use year in temporary user names and restart index each year

Reason:

Duplicate of If51acb3f4efa361ce36d919c862a52501a5a7d24

https://gerrit.wikimedia.org/r/984514

Tchanders updated the task description.

Change 983900 merged by jenkins-bot:

[mediawiki/core@master] Use year in temporary user names and restart index each year

https://gerrit.wikimedia.org/r/983900