Page MenuHomePhabricator

[SPIKE] Investigate TitleBlacklist extension to see if changes are required for IP Masking
Closed, ResolvedPublic

Description

A preliminary investigation (T326759) has found that the TitleBlacklist extension may be affected by IP Masking

Acceptance criteria
  • As a developer
    • I would like to know of all of the parts of the TitleBlacklist extension that are affected by IP Masking
    • And this should be documented once a thorough investigation has been performed

Timebox: 3 hours

Design

N/A

External dependencies
Other dependencies

https://www.mediawiki.org/wiki/Help:Temporary_accounts/How_it_works

Event Timeline

The campaigns team will take care of this extension.

MHorsey-WMF renamed this task from Prepare TitleBlacklist extension for IP Masking to [SPIKE] Investigate TitleBlacklist extension to see if changes are required for IP Masking.Jun 29 2023, 1:23 PM
MHorsey-WMF updated the task description. (Show Details)
MHorsey-WMF updated the task description. (Show Details)

Maybe on account creation temp user should bypass title blacklist to avoid that these are blocked by accident due to bad regex or such on the title blacklist.
Possible this should be handled the same as with AbuseFilter, which could also work/block on temp users account creation.

I reviewed the code of the extension, which doesn't deal with users that much. All the permission checks do not check whether the user is logged-in or logged-out, they just check specific user rights. The only thing that's potentially related to temp users are the account creation checks. Those also work correctly with temporary accounts, but as noted by Umherirrender above, TB could prevent the creation of a temp account if the account name is matched by any of the regexps in the blacklist.

Therefore, I think the only question is whether temp account creations should bypass the TB, and whether that should be unconditional, or controlled by a new attribute in the TB entries. Personally, I don't think the TB should be bypassed. If any regexp is preventing the creation of temp accounts, that will appear immediately in the TB hit log, and admins can update the TB accordingly. Still, this is a product question and I think that 1) both approaches would have their pros and cons, and 2) I'm not the one who should make a decision here.

Hello, @Niharika! It looks like a product question came out of this investigation, so I am pinging you here to see what product input you may have. Thanks!

Hello, @Tchanders & @Niharika! I'm wondering who would be the best person to respond to the product question brought up in the above comment as a result of the investigation. Thanks in advance!

Personally, I don't think the TB should be bypassed. If any regexp is preventing the creation of temp accounts, that will appear immediately in the TB hit log, and admins can update the TB accordingly.

That sounds reasonable to me. (cc @Tchanders)

Thanks for your feedback, @kostajh! Would @Tchanders be responsible for making the final decision?

Thanks for your feedback, @kostajh! Would @Tchanders be responsible for making the final decision?

I'll bring it up in one of our weekly meetings and we can document a decision.

If any regexp is preventing the creation of temp accounts, that will appear immediately in the TB hit log, and admins can update the TB accordingly

We'll probably have a similar issue like with AbuseFilter, where we need to know how to attribute TB hits, which is probably going to end up being solved with T359405: Create temporary account early in edit cycle for all edit attempts

Personally, I don't think the TB should be bypassed. If any regexp is preventing the creation of temp accounts, that will appear immediately in the TB hit log, and admins can update the TB accordingly.

That sounds reasonable to me. (cc @Tchanders)

If we run TB checks on temp account creations, then it looks as though we could hit a similar problem to T358806 in that we'd try to make a log for an IP address, then hit the CannotCreateActorException.

Based on a cursory scan of https://meta.wikimedia.org/w/index.php?title=Title_blacklist, we may be OK to skip the TB check for temp account names.

Hello, @Tchanders & @Niharika! I'm wondering who would be the best person to respond to the product question brought up in the above comment as a result of the investigation. Thanks in advance!

I guess the product question here is: is it OK for a something that matches a prohibited regexp to show up in a temp account name?

As for who has authority over temp accounts-related product decisions, in other cases it has been @Niharika, who may choose to delegate to the product owner, i.e. @ifried. If both are happy to delegate to engineers then I'm sure we can decide by consensus, as we are doing elsewhere.

Personally, I don't think the TB should be bypassed. If any regexp is preventing the creation of temp accounts, that will appear immediately in the TB hit log, and admins can update the TB accordingly.

That sounds reasonable to me. (cc @Tchanders)

+1 I think it is a good idea to not bypass TitleBlacklist.

Summary of a discussion with @Niharika:

  • TitleBlacklist hits for temp account creation don't need to be logged, since it's a system failure and not an action by a person. If it's much easier to log, the log could be attributed to a system user (e.g. TempAccountCreator). If we could silently re-attempt the creation, that would be ideal.
  • Could we use the TitleBlacklist to solve T337090?
  • Do we need to remove this from https://meta.wikimedia.org/w/index.php?title=Title_blacklist (it shouldn't be necessary once we have done T349506:

Prevent usernames similar to temporary accounts (phab:T345678)
^~\d{4}[\d-]+ <newaccountonly>

I have removed the Campaigns team tag, since AHT will be taking on this work (so they can decide what to do with this investigation ticket & next steps).

Note this comment from @Dreamy_Jazz which is relevant here: T337090#9683710.

That task was about disallowing certain numbers, and the chosen solution was to start from a higher number than the disallowed numbers. @Niharika gave the go-ahead on this, and confirmed that longer numbers that contain those disallowed numbers as substrings will be allowed. (Unfortunately that doesn't seem to be documented on the task, but that discussion would have happened later than this comment: T326929#9657849)

Note this comment from @Dreamy_Jazz which is relevant here: T337090#9683710.

That task was about disallowing certain numbers, and the chosen solution was to start from a higher number than the disallowed numbers. @Niharika gave the go-ahead on this, and confirmed that longer numbers that contain those disallowed numbers as substrings will be allowed. (Unfortunately that doesn't seem to be documented on the task, but that discussion would have happened later than this comment: T326929#9657849)

Since the variable part of a temp account name is numerical, and the disallowed numbers were fixed in T337090: Disallow certain numbers from being generated in the temporary account creation process, I think we can bypass TitleBlacklist for temp account creations. I'll close this task now based on that.