
Disallow certain numbers from being generated in the temporary account creation process
Closed, Resolved (Public)

Description

Motivation

Certain numbers are considered defamatory, and the temporary account generation process should not produce them. These include:

  • 88
  • 737
  • 511
  • 1423
  • 1488
  • 311
  • 318
  • 336
Spec:
  • Do not generate temporary accounts with these prefixes.
Source:

Event Timeline

It might be a good idea to start the numbers quite high in general (i.e. make them totally random, 9-10 digits long) so that people cannot try to obtain specific numbers.

I don't know how much this matters, but 13 and 666 are superstitious numbers in Western culture, and might be off-putting to some users.

https://en.wikipedia.org/wiki/88_(number)#Cultural_significance indicates a wide variety of meanings. In Chinese, it's a lucky number. In amateur radio, it means "love and kisses".

Since all of these are below 1500, starting at 1500 would prevent the unique part of the username from being (only) one of these numbers.

The solutions to this problem that have been discussed are:

  1. Extending TitleBlacklist to allow a filter to block a specific format for temporary account usernames
  2. Defining a list of numbers which cannot be used in the wgAutoCreateTempUser config
  3. Adding 1500 to any generated number to avoid usernames that are defamatory, as suggested above. This would be done via the offset in the serialMapping config of wgAutoCreateTempUser
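
Option 3 can be illustrated with a minimal sketch. It is written in Python rather than MediaWiki's PHP, and `OFFSET`, `BLOCKED`, and `serial_to_number` are hypothetical names used only for illustration, not the actual serialMapping implementation. It assumes the serial provider hands out non-negative integers and that the offset is simply added before the number is placed in the username.

```python
# Hypothetical model of option 3: add a fixed offset to every serial number.
# These are the numbers from the task description that must not appear on their own.
BLOCKED = {88, 311, 318, 336, 511, 737, 1423, 1488}

# Assumed offset, chosen to be larger than every blocked number.
OFFSET = 1500


def serial_to_number(serial: int, offset: int = OFFSET) -> int:
    """Map a non-negative serial (0, 1, 2, ...) to the number used in the username."""
    if serial < 0:
        raise ValueError("serial must be non-negative")
    return serial + offset


# Every generated number is at least OFFSET, so it can never equal a blocked number.
assert max(BLOCKED) < OFFSET
print([serial_to_number(s) for s in range(3)])
```

Note that this only stops a blocked number from being the whole unique part of the username; a larger number such as 1888 would still contain 88.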

I would oppose using TitleBlacklist for this solution, because:

  1. If the extension is logging title blacklist hits, then we would need to prevent the logging for temporary account creations (as the user who tried to make the edit did not choose this username and instead the software did). If we did log the hits, then an admin reviewing the log may assume that the users are hitting the filter for a bad-faith decision they made (and not the software choosing a username for them)
    1. If we do not log these hits, then it becomes difficult to work out whether the title blacklist entry is broken, because a large volume of hits in the logs is what could indicate a problem.
  2. We do not want to require the user to re-submit their edit after the username matches, which is what would happen if the title blacklist is hit after the code that acquires the username has finished getting the username. Furthermore, if the automatically generated username is rejected we do not want to increase the rate limit on acquiring usernames. As such, we would need to define a new core hook that allows the TitleBlacklist extension to disallow certain temporary account usernames at the point they are generated so that a new one can be chosen.
  3. All of the above means we also need to have a new attribute that defines that a rule applies only to temporary account creations. This adds complexity to the code, as this new attribute also has special rules about logging and how the rule is applied.
  4. If the TitleBlacklist entries are set to prevent a specific number at the start of the username, then the code could spend a long time generating a valid username. For example, if the format ~2024-123 is blocked and the number after the - is 123000000, it would take 1 million failed attempts before a valid temporary account username is found. As such, if we were to prevent usernames based on a prefix (i.e. don't allow numbers starting with 123), we would need to handle this in a way that does not brute-force the search for a valid username.
  5. Any username which is later treated as not valid will cause the TitleBlacklist extension to prevent the autocreation on any other wiki. As such, it would be impossible for that user to edit on a new wiki without exiting their current session and starting with a new username.
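
To make point 4 concrete, here is a rough sketch of how a blocked prefix could be skipped arithmetically instead of by retrying one number at a time. It is written in Python for illustration only; `next_allowed` is a hypothetical helper, not part of MediaWiki or TitleBlacklist, and it assumes blocked prefixes are non-empty strings of decimal digits.

```python
def next_allowed(serial: int, blocked_prefixes: list[str]) -> int:
    """Return the smallest number >= serial whose decimal form starts with no blocked prefix.

    Assumes every entry in blocked_prefixes is a non-empty string of digits.
    """
    while True:
        digits = str(serial)
        hit = next((p for p in blocked_prefixes if digits.startswith(p)), None)
        if hit is None:
            return serial
        # All numbers with this many digits that start with `hit` form one
        # contiguous range, so jump directly to the first number after it.
        serial = (int(hit) + 1) * 10 ** (len(digits) - len(hit))


# The example from point 4: one jump instead of 1 million failed attempts.
print(next_allowed(123000000, ["123"]))
```

Each pass through the loop either returns or strictly increases the candidate, so the search finishes after a handful of jumps rather than one attempt per blocked number.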

I would be okay with 2 or 3.

Idea 3 seems reasonable. While the site https://www.adl.org/resources/hate-symbols/search?page=0 seems down for me, setting the number to start at 1500 appears to avoid all the numbers listed in the task description. Furthermore, this is the simplest solution and would only require adding one line in the WMF-specific config / the default config.

If we need to prevent numbers starting/ending with these numbers, then we would need to go with solution 2.
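
The difference between blocking a number outright and blocking it as a prefix or suffix can be sketched as follows. This is a Python illustration under assumptions: `DISALLOWED`, `is_disallowed`, and the `mode` values are hypothetical and do not correspond to any existing wgAutoCreateTempUser setting.

```python
# Hypothetical disallowed list for solution 2, taken from the task description.
DISALLOWED = ("88", "311", "318", "336", "511", "737", "1423", "1488")


def is_disallowed(number: int, mode: str = "exact") -> bool:
    """Check a generated number against the list.

    mode="exact": the number is rejected only if it equals a listed number.
    mode="affix": the number is also rejected if it starts or ends with one.
    """
    digits = str(number)
    if mode == "exact":
        return digits in DISALLOWED
    if mode == "affix":
        # str.startswith / str.endswith accept a tuple of candidates.
        return digits.startswith(DISALLOWED) or digits.endswith(DISALLOWED)
    raise ValueError(f"unknown mode: {mode}")
```

With exact matching, an offset of 1500 already makes every check pass; only the affix mode needs a list, because numbers such as 1888 or 8812 lie above any fixed offset.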

Change #1014526 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[operations/mediawiki-config@master] Add wgAutoCreateTempUser configuration for production

https://gerrit.wikimedia.org/r/1014526

Based on discussions held off Phabricator with @Niharika, we only need to prevent these numbers from appearing on their own, not when they appear within a larger number.

As such, I have implemented solution 3, as it is the best and simplest option. The decision can be changed as part of the review process.

While the config patch itself is not ready to be merged as it depends on a core patch which is awaiting review, the config patch can be reviewed and feedback provided.

Change #1014526 merged by jenkins-bot:

[operations/mediawiki-config@master] Add wgAutoCreateTempUser configuration for production

https://gerrit.wikimedia.org/r/1014526

Mentioned in SAL (#wikimedia-operations) [2024-04-15T13:06:04Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:1014526|Add wgAutoCreateTempUser configuration for production (T349506 T337090)]], [[gerrit:1019694|Change mul deployment on beta to limited version (T356169)]]

Mentioned in SAL (#wikimedia-operations) [2024-04-15T13:08:01Z] <urbanecm@deploy1002> urbanecm and arthurtaylor and dreamyjazz: Backport for [[gerrit:1014526|Add wgAutoCreateTempUser configuration for production (T349506 T337090)]], [[gerrit:1019694|Change mul deployment on beta to limited version (T356169)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-04-15T13:58:16Z] <urbanecm@deploy1002> sync-world aborted: Backport for [[gerrit:1014526|Add wgAutoCreateTempUser configuration for production (T349506 T337090)]], [[gerrit:1019694|Change mul deployment on beta to limited version (T356169)]] (duration: 52m 11s)

Mentioned in SAL (#wikimedia-operations) [2024-04-15T14:01:23Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:1014526|Add wgAutoCreateTempUser configuration for production (T349506 T337090)]], [[gerrit:1019694|Change mul deployment on beta to limited version (T356169)]]

Mentioned in SAL (#wikimedia-operations) [2024-04-15T14:14:15Z] <urbanecm@deploy1002> urbanecm and dreamyjazz and arthurtaylor: Backport for [[gerrit:1014526|Add wgAutoCreateTempUser configuration for production (T349506 T337090)]], [[gerrit:1019694|Change mul deployment on beta to limited version (T356169)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-04-15T14:31:35Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:1014526|Add wgAutoCreateTempUser configuration for production (T349506 T337090)]], [[gerrit:1019694|Change mul deployment on beta to limited version (T356169)]] (duration: 30m 12s)