
Purge expired data from the cuci_temp_edit and cuci_user tables
Closed, Resolved · Public · 1 Estimated Story Points

Description

Technical background

In T368151 we added the central tables cuci_temp_edit and cuci_user, to give a central index for GUC and global autoblocks.

The cuci_temp_edit table contains private data (i.e. the IP address) which needs to be purged when the associated row in the local cu_changes table has been purged. The cuci_user table does not necessarily contain private data, but we still need to purge it as it can indicate that an account has performed an action only shown in CheckUser results (such as logging in or logging out).

This should be done before we start writing to the central tables, so that we can be sure that data will be appropriately deleted.

What needs doing
  • Add code to purge data from cuci_temp_edit and cuci_user where the associated timestamp is older than the cutoff for purging data on the given wiki
  • Make a new maintenance script to do this purging or add this to purgeOldData.php
  • Ensure that purging occurs using the existing purging job
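
The purge described in the list above could be sketched as follows. This is only an illustration in Python against an in-memory SQLite database, not the extension's PHP code. The function name `purge_central_index`, the helper `make_demo_db`, and the simplified columns (`cite_wiki`, `ciu_wiki`, and the timestamp columns) are assumptions made for the example; the real schema may differ.

```python
import sqlite3


def purge_central_index(conn, wiki, cutoff):
    """Delete central index rows for `wiki` with a timestamp older than `cutoff`.

    Timestamps are assumed to be 14-digit strings (YYYYMMDDHHMMSS), so a
    plain string comparison orders them chronologically.
    Returns a dict mapping table name to the number of rows deleted.
    """
    deleted = {}
    # Hypothetical (table, timestamp column, wiki column) triples.
    targets = (
        ("cuci_temp_edit", "cite_timestamp", "cite_wiki"),
        ("cuci_user", "ciu_timestamp", "ciu_wiki"),
    )
    for table, ts_col, wiki_col in targets:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {wiki_col} = ? AND {ts_col} < ?",
            (wiki, cutoff),
        )
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted


def make_demo_db():
    """Build a small in-memory database with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cuci_temp_edit "
        "(cite_ip_hex TEXT, cite_wiki TEXT, cite_timestamp TEXT)"
    )
    conn.execute(
        "CREATE TABLE cuci_user "
        "(ciu_central_id INTEGER, ciu_wiki TEXT, ciu_timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO cuci_temp_edit VALUES (?, ?, ?)",
        [
            ("7F000001", "enwiki", "20240101000000"),  # older than the cutoff
            ("7F000002", "enwiki", "20240801000000"),  # newer than the cutoff
            ("7F000003", "dewiki", "20240101000000"),  # different wiki
        ],
    )
    conn.executemany(
        "INSERT INTO cuci_user VALUES (?, ?, ?)",
        [
            (1, "enwiki", "20240101000000"),
            (2, "enwiki", "20240801000000"),
        ],
    )
    conn.commit()
    return conn
```

Calling `purge_central_index(make_demo_db(), "enwiki", "20240601000000")` removes only the `enwiki` rows older than the cutoff and leaves rows for other wikis untouched, which matches the per-wiki cutoff described in the first bullet.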

Related Objects

Status    Assigned
Resolved  kostajh
Declined  None
Resolved  Niharika
Resolved  Madalina
Open      None
Resolved  Dreamy_Jazz
Open      None
Open      None
Open      None
Open      None
Resolved  STran
Resolved  Dreamy_Jazz
Resolved  Dreamy_Jazz
Resolved  Dreamy_Jazz
Resolved  Dreamy_Jazz
Resolved  Dreamy_Jazz
Resolved  Dreamy_Jazz
Resolved  Dreamy_Jazz
Resolved  Dreamy_Jazz

Event Timeline

Dreamy_Jazz removed Dreamy_Jazz as the assignee of this task.

Change #1064051 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Remove wgCheckUserPurgeOldClientHintsData

https://gerrit.wikimedia.org/r/1064051

Change #1064428 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Clean up code in PruneCheckUserDataJob

https://gerrit.wikimedia.org/r/1064428

Change #1064429 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] PruneCheckUserDataJob: Move static function into a private method

https://gerrit.wikimedia.org/r/1064429

Change #1064798 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Only purge in purgeOldData.php if able to acquire a lock

https://gerrit.wikimedia.org/r/1064798

Change #1064799 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Create CheckUserCentralIndexManager service

https://gerrit.wikimedia.org/r/1064799

JayCano set the point value for this task to 1. (Aug 26 2024, 10:18 AM)

Change #1066797 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Improve documentation and tests for PruneCheckUserDataJob

https://gerrit.wikimedia.org/r/1066797

Change #1064798 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] Only purge in purgeOldData.php if able to acquire a lock

https://gerrit.wikimedia.org/r/1064798

Change #1066797 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] Improve documentation and tests for PruneCheckUserDataJob

https://gerrit.wikimedia.org/r/1066797

Change #1064799 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] Create CheckUserCentralIndexManager service

https://gerrit.wikimedia.org/r/1064799

I would suggest that this ticket will be difficult to QA until T373021 and T371788 are also ready for QA, because the central index tables will not contain entries on a normal wiki until those tasks are complete.

Change #1070046 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Fix running of recent changes purge in purgeOldData.php

https://gerrit.wikimedia.org/r/1070046

Change #1070046 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] Fix running of recent changes purge in purgeOldData.php

https://gerrit.wikimedia.org/r/1070046

dom_walden subscribed.

I tested the purgeOldData maintenance script in T359560.

I also checked that rows were purged from cuci_user and cuci_temp_edit as part of the hook when editing.

Because deletions from the cu_* and cuci_* tables are done as part of the same job, if deleting from the latter fails, so will deleting from the former.
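
One way this kind of coupling can arise is when both deletions share a single transaction, so a failure in the second delete rolls back the first. Whether the job actually works this way is an assumption here; the sketch below only illustrates the behaviour in Python with SQLite. The function `prune_all` is hypothetical and the column names are used for illustration only.

```python
import sqlite3


def prune_all(conn, cutoff):
    """Delete old rows from a local and a central table in one transaction.

    Returns True if both deletes succeeded, False if either failed
    (in which case neither delete takes effect).
    """
    try:
        # The connection context manager commits on success and rolls
        # back if an exception escapes the block.
        with conn:
            conn.execute(
                "DELETE FROM cu_changes WHERE cuc_timestamp < ?", (cutoff,)
            )
            conn.execute(
                "DELETE FROM cuci_user WHERE ciu_timestamp < ?", (cutoff,)
            )
        return True
    except sqlite3.Error:
        return False
```

If the `cuci_user` delete raises (for example because the table is missing), the earlier `cu_changes` delete is rolled back and the old local rows remain until the next successful run.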