Run scanFilesInScanTable.php automatically on WMF wikis
Closed, Resolved | Public | 2 Estimated Story Points

Description

We have settled on the following command to scan the backlog of files on commonswiki:

mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose

This script is currently being run "manually" in a tmux session on maint2002. However, this means that the script:

  1. Will, in a week's time, be running on a no-longer-released version of MediaWiki.
  2. Is not resistant to interruptions, such as restarts of the maint2002 host or changes of the active DC.

Instead, this script should be run through a Puppet configuration so that the job is:

  1. Restarted semi-frequently to ensure updates to the script are applied
  2. Resistant to being randomly interrupted, as it would automatically restart
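
To illustrate the two properties listed above, here is a minimal sketch in Python of a supervisor loop. It is not the Puppet change attached to this task, and it says nothing about how that change is implemented. The function name `run_with_restarts` and its parameters are hypothetical, chosen only for this example. Each run is capped at `max_runtime` seconds, so a fresh process (and therefore the current version of the script) starts regularly, and a run that exits or is stopped is simply followed by the next one.

```python
import subprocess
import sys
import time


def run_with_restarts(command, max_runtime, max_runs, pause=0.0):
    """Run `command` repeatedly, starting a new run after each exit or timeout.

    Hypothetical helper for illustration only. Returns one entry per run:
    the process exit code, or the string "timeout" if the run was stopped
    for exceeding `max_runtime` seconds.
    """
    outcomes = []
    for _ in range(max_runs):
        try:
            # subprocess.run stops the child and raises TimeoutExpired
            # once `max_runtime` seconds have passed.
            result = subprocess.run(command, timeout=max_runtime)
            outcomes.append(result.returncode)
        except subprocess.TimeoutExpired:
            outcomes.append("timeout")
        # Optional pause between runs so a failing command is not retried
        # in a tight loop.
        time.sleep(pause)
    return outcomes


if __name__ == "__main__":
    # Harmless stand-in command; a real deployment would pass the
    # maintenance script invocation instead.
    stand_in = [sys.executable, "-c", "print('scanning')"]
    print(run_with_restarts(stand_in, max_runtime=10, max_runs=2))
```

In production, a scheduler or service manager would normally provide this behavior rather than a hand-written loop. The sketch only shows why a bounded runtime combined with automatic restarts addresses both concerns.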

Event Timeline

Discussed in a meeting just now. We'll see if this breaks in practice. If so, we'll revisit automating the script restart.

Change #1100421 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/MediaModeration@master] Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php

https://gerrit.wikimedia.org/r/1100421

Change #1100426 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[operations/mediawiki-config@master] Create a DB list for wikis with continuous MediaModeration scans

https://gerrit.wikimedia.org/r/1100426

Change #1100427 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[operations/puppet@production] [WIP] Update MediaModeration module to run scans automatically

https://gerrit.wikimedia.org/r/1100427

Change #1100426 merged by jenkins-bot:

[operations/mediawiki-config@master] Create a DB list for wikis with continuous MediaModeration scans

https://gerrit.wikimedia.org/r/1100426

Mentioned in SAL (#wikimedia-operations) [2024-12-04T11:53:04Z] <dreamyjazz@deploy2002> Started scap sync-world: Backport for [[gerrit:1100426|Create a DB list for wikis with continuous MediaModeration scans (T355169)]]

Change #1100430 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/MediaModeration@wmf/1.44.0-wmf.6] Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php

https://gerrit.wikimedia.org/r/1100430

Mentioned in SAL (#wikimedia-operations) [2024-12-04T11:59:12Z] <dreamyjazz@deploy2002> dreamyjazz: Backport for [[gerrit:1100426|Create a DB list for wikis with continuous MediaModeration scans (T355169)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Change #1100434 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/MediaModeration@wmf/1.44.0-wmf.5] Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php

https://gerrit.wikimedia.org/r/1100434

Mentioned in SAL (#wikimedia-operations) [2024-12-04T12:06:06Z] <dreamyjazz@deploy2002> Finished scap sync-world: Backport for [[gerrit:1100426|Create a DB list for wikis with continuous MediaModeration scans (T355169)]] (duration: 13m 02s)

Change #1100421 merged by jenkins-bot:

[mediawiki/extensions/MediaModeration@master] Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php

https://gerrit.wikimedia.org/r/1100421

Change #1100434 merged by jenkins-bot:

[mediawiki/extensions/MediaModeration@wmf/1.44.0-wmf.5] Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php

https://gerrit.wikimedia.org/r/1100434

Change #1100430 merged by jenkins-bot:

[mediawiki/extensions/MediaModeration@wmf/1.44.0-wmf.6] Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php

https://gerrit.wikimedia.org/r/1100430

Mentioned in SAL (#wikimedia-operations) [2024-12-04T12:59:12Z] <dreamyjazz@deploy2002> Started scap sync-world: Backport for [[gerrit:1100442|Stats: Move StatsFactory flush into emitBufferedStats (T380609)]], [[gerrit:1100434|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100430|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]]

Mentioned in SAL (#wikimedia-operations) [2024-12-04T13:00:51Z] <dreamyjazz@deploy2002> Started scap sync-world: Backport for [[gerrit:1100442|Stats: Move StatsFactory flush into emitBufferedStats (T380609)]], [[gerrit:1100434|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100430|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]]

Mentioned in SAL (#wikimedia-operations) [2024-12-04T13:28:15Z] <dreamyjazz@deploy2002> Started scap sync-world: Backport for [[gerrit:1100442|Stats: Move StatsFactory flush into emitBufferedStats (T380609)]], [[gerrit:1100434|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100430|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100449|Revert "Stats: Move StatsFactory flush into emitBufferedSta

Mentioned in SAL (#wikimedia-operations) [2024-12-04T13:33:49Z] <dreamyjazz@deploy2002> dreamyjazz: Backport for [[gerrit:1100442|Stats: Move StatsFactory flush into emitBufferedStats (T380609)]], [[gerrit:1100434|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100430|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100449|Revert "Stats: Move StatsFactory flush into emitBufferedStats"]] synced

Mentioned in SAL (#wikimedia-operations) [2024-12-04T13:42:53Z] <dreamyjazz@deploy2002> Finished scap sync-world: Backport for [[gerrit:1100442|Stats: Move StatsFactory flush into emitBufferedStats (T380609)]], [[gerrit:1100434|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100430|Fix handling of 'last-checked' as 'never' in scanFilesInScanTable.php (T355169)]], [[gerrit:1100449|Revert "Stats: Move StatsFactory flush into emitBufferedSt

Change #1100427 merged by RLazarus:

[operations/puppet@production] Update MediaModeration module to run scans automatically

https://gerrit.wikimedia.org/r/1100427

This will be hard to QA, given that the logs are private and that the only other thing to check would be that the script is scanning (which I have verified).