Page MenuHomePhabricator

[SPIKE] Identify areas affected by delaying purging of blocks by X days [8H]
Closed, ResolvedPublic

Description

From T293263: [SPIKE] Investigate getting number of past block for IP (including IP ranges) [8H]:

@Tchanders: AIUI you're the member of Anti-Harassment with the most experience with blocks. Would there be any other side effects of delaying the purge (other than displaying X-day-old blocks to patrollers)?

I believe there are many places where rows from the ipblocks table are selected without checking if the blocks are expired. Whether all of these only concern displaying the blocks to other users, I'd have to spend some time looking into that...

In general where actions are being performed (e.g. when attempting to block and IP, or when an anonymous user is attempting to edit) it looks as though the DatabaseBlock::newFrom* methods are largely used, which won't return expired blocks.

But we should go through this with a fine-tooth comb before delaying the purge, and probably find somewhere to communicate that the expiry should always be checked too.

If it helps:
https://codesearch.wmcloud.org/search/?q=%27ipblocks%27&i=nope&files=php&excludeFiles=&repos=

Questions to be answered:

  1. Will delaying the purge affect anything that currently checks against ipblocks and if it does, what can/should be done?
  2. Where should this be documented?

Event Timeline

@phuedx @Tchanders I went ahead and created this follow-up task but could you confirm my understanding of it? We're worried that somewhere else something is checking against ipblocks with the assumption that any block there is unexpired because it assumes expired blocks are regularly purged? So we would need to find those instances and make sure that if necessary the expiry is taken into account.

@phuedx @Tchanders I went ahead and created this follow-up task but could you confirm my understanding of it? We're worried that somewhere else something is checking against ipblocks with the assumption that any block there is unexpired because it assumes expired blocks are regularly purged? So we would need to find those instances and make sure that if necessary the expiry is taken into account.

Yes. That matches my understanding of @Tchanders' comment.

per @STran we can likely ignore the updater and test files.

ARamirez_WMF renamed this task from [SPIKE] Identify areas affected by delaying purging of blocks by X days to [SPIKE] Identify areas affected by delaying purging of blocks by X days [8H].Dec 14 2021, 4:56 PM

@phuedx @Tchanders I went ahead and created this follow-up task but could you confirm my understanding of it? We're worried that somewhere else something is checking against ipblocks with the assumption that any block there is unexpired because it assumes expired blocks are regularly purged? So we would need to find those instances and make sure that if necessary the expiry is taken into account.

Yes. That matches my understanding of @Tchanders' comment.

Yes, the task description captures what I meant - thanks.

From @Tchanders off-band:

I'm thinking that one outcome of https://phabricator.wikimedia.org/T297185 should be to decide whether we should do https://phabricator.wikimedia.org/T297947. I have concerns about changing how blocks are purged in MediaWiki core for the sake of one small feature in the initial, beta version of IPInfo - particularly if the investigation finds that it would have a wide impact

Will delaying the purge affect anything that currently checks against ipblocks and if it does, what can/should be done?

From a community tool perspective, this is a breaking change. I know I have at least two tools that check for blocks by simply JOINing on the ipblocks table, and I'm sure that other tools do the same thing. While the fix would be simple (JOIN ipblocks ON ... AND (ipb_expiry = "infinite" OR ipb_expiry < NOW())), it's still something that needs to be fixed. Accordingly, if this is the direction this looks to be going, it would be a good idea to get it in User-notice sooner rather than later.

Tchanders added subscribers: Niharika, Prtksxna.

@AntiCompositeNumber Thanks for summarising the impact on our community tools.

Impact on MediaWiki

There are a few places in the UI that would be affected by delaying the purge. For example, users with an expired block display as blocked in Special:Users and Special:ActiveUsers, as well as Special:Investigate (to give an example of an affected extension). These lists are costly to generate and involve several queries, so adding another filter to the ipblocks might be costly. (These are user blocks rather than IP blocks, but delaying the purge only for IP blocks could be unmaintainably confusing.)

There are also some public methods on the core block classes that don't filter out expired blocks, such as DatabaseBlock::getQueryInfo, used for building a query to select blocks. Callers of these methods would need to be warned.

I didn't see a discussion around how we'd handle blocks that are manually removed (e.g. via Special:Unblock), but obviously delaying the purge would not cause these to be kept. Similarly, autoblocks against the IP address of a blocked user get deleted when the user's block expires - these would also not be kept around by delaying the purge.

Recommendation

Keeping expired blocks in the ipblocks table would have a wide impact, so making this change would be a significant piece of work. I spoke with @Prtksxna about the importance of displaying expired blocks in IPInfo, and while it is undoubtedly a useful feature, an important purpose of IPInfo (at least the first iteration) is to maintain patrollers' current experience, whereas showing expired blocks would be a new feature. (And I note that it might be something that patrollers want to use outside of the context of IPInfo.)

@Niharika @Prtksxna I'd recommend that displaying expired blocks be treated as a separate project, outside of the scope of IPInfo. I'd also recommend thinking about how it could be accessed from more tools than just IPInfo.