Page MenuHomePhabricator

purgeExpiredMentorStatus.php does not operate correctly when number of rows to delete is higher than batch size
Closed, ResolvedPublic

Description

Today, I discovered that purgeExpiredMentorStatus does not operate correctly when the number of deleted rows is higher than batch size.

At my dev wiki, I have the following entries for the growthexperiments-mentor-away-timestamp user property:

MariaDB [awiki]> select * from user_properties where up_property='growthexperiments-mentor-away-timestamp';
+---------+-----------------------------------------+----------------+
| up_user | up_property                             | up_value       |
+---------+-----------------------------------------+----------------+
|      33 | growthexperiments-mentor-away-timestamp | 19700101000000 |
|      34 | growthexperiments-mentor-away-timestamp | 19700101000000 |
|      37 | growthexperiments-mentor-away-timestamp | 19700101000000 |
|      41 | growthexperiments-mentor-away-timestamp | 19700101000000 |
|      42 | growthexperiments-mentor-away-timestamp | 19700101000000 |
|      43 | growthexperiments-mentor-away-timestamp | 19700101000000 |
|      44 | growthexperiments-mentor-away-timestamp | 19700101000000 |
+---------+-----------------------------------------+----------------+
7 rows in set (0.002 sec)

MariaDB [awiki]>

Running purgeExpiredMentorStatus.php's dry run with default batch size (which is 100) correctly reports it would delete 7 rows:

urbanecm@notebook  ~/unsynced/gerrit/mediawiki/extensions/GrowthExperiments
$ php maintenance/purgeExpiredMentorStatus.php --wiki=awiki --dry-run
Would delete 7 rows from user_properties.
urbanecm@notebook  ~/unsynced/gerrit/mediawiki/extensions/GrowthExperiments
$

Decreasing batch size to 5 makes the script report deletion of 20 rows:

urbanecm@notebook  ~/unsynced/gerrit/mediawiki/extensions/GrowthExperiments
$ php maintenance/purgeExpiredMentorStatus.php --wiki=awiki --dry-run --batch-size 5
Would delete 20 rows from user_properties.
urbanecm@notebook  ~/unsynced/gerrit/mediawiki/extensions/GrowthExperiments
$

This means the script tries to delete some rows multiple times. It happens because the maintenance script's batching logic does not empty $batch after yielding it, see https://gerrit.wikimedia.org/g/mediawiki/extensions/GrowthExperiments/+/f2d321081d9f302d46b6d3744a447a642b7e17f0/maintenance/purgeExpiredMentorStatus.php#63.

Event Timeline

Change 748384 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] purgeExpiredMentorStatus: Fix batching logic

https://gerrit.wikimedia.org/r/748384

Urbanecm_WMF moved this task from Incoming to Code Review on the Growth-Team (Current Sprint) board.

It's highly unlikely the wrong behavior will be invoked at production in foreseeable future (assuming normal usage of the feature). However, we're still running a script automatically in production that has a major flaw in batching (which should not happen). Triaging as high for that reason.

Change 748384 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] purgeExpiredMentorStatus: Fix batching logic

https://gerrit.wikimedia.org/r/748384

JJMC89 changed the task status from Duplicate to Resolved.Dec 21 2021, 12:11 AM