Page MenuHomePhabricator

[wikireplicas] add proper dry-run/diff mode to maintain-views
Open, In Progress, MediumPublic

Description

Maintain-views should have a way to detect which existing views need updating. This is both so we can have an alert when there's unapplied changes, as well as to make the script more efficient by not replacing views that don't need replacing without the need to specify --tables.

The major problem here is that SQL text format that maintain-views comes up with differs from the one produced by SHOW CREATE VIEW. Here's the definition of enwiki_p.ip_changes to demonstrate this problem:

CREATE OR REPLACE                     DEFINER= viewmaster                           VIEW `enwiki_p`.`ip_changes` AS select                        ipc_rev_id,                                        ipc_rev_timestamp,                                               ipc_hex               FROM  `enwiki`.`ip_changes`,     `enwiki`.`revision`  WHERE                        ipc_rev_id  =                      rev_id  AND                     (rev_deleted  & 4) = 0
CREATE            ALGORITHM=UNDEFINED DEFINER=`viewmaster`@`%` SQL SECURITY DEFINER VIEW `enwiki_p`.`ip_changes` AS select `enwiki`.`ip_changes`.`ipc_rev_id` AS `ipc_rev_id`,`enwiki`.`ip_changes`.`ipc_rev_timestamp` AS `ipc_rev_timestamp`,`enwiki`.`ip_changes`.`ipc_hex` AS `ipc_hex` from (`enwiki`.`ip_changes` join `enwiki`.`revision`) where `enwiki`.`ip_changes`.`ipc_rev_id` = `enwiki`.`revision`.`rev_id` and `enwiki`.`revision`.`rev_deleted` & 4  = 0

The first line is the definition from maintain-views, second one is MariaDB's SHOW CREATE VIEW format. Whitespace was formatted by me to make the side-by-side difference easier to see, otherwise that's exactly the text both of these tools came up with.

So I roughly see three options here:

  • Update maintain-views and its config to produce identical SQL text, and then compare the two strings
  • Update maintain-views to parse the SQL text somehow, at which point you can compare what they parse to instead of comparing the raw text
  • Store the currently live definition as produced by maintain-views somewhere, and compare that instead of comparing to the SHOW CREATE VIEW format.

Event Timeline

fnegri triaged this task as Low priority.Jul 1 2024, 9:20 AM
fnegri changed the task status from Open to In Progress.May 8 2025, 9:51 AM
fnegri claimed this task.

Change #1148358 had a related patch set uploaded (by FNegri; author: FNegri):

[operations/puppet@production] wikireplicas: remove dashes from script names

https://gerrit.wikimedia.org/r/1148358

Change #1148394 had a related patch set uploaded (by FNegri; author: FNegri):

[operations/puppet@production] wikireplicas scripts: setup pytest, add first test

https://gerrit.wikimedia.org/r/1148394

I split the refactoring work I've started into a subtask: T395266: [wikireplicas] Refactor maintenance scripts to allow local testing

This task will retain its original scope of "adding proper dry-run/diff mode to maintain-views".

fnegri renamed this task from add proper dry-run/diff mode to maintain-views to [wikireplicas] add proper dry-run/diff mode to maintain-views.May 26 2025, 4:24 PM
fnegri changed the status of subtask T395266: [wikireplicas] Refactor maintenance scripts to allow local testing from Open to In Progress.

I'm testing the patches above on clouddb1017, and this caused T422779: clouddb1017 reports a new database created. I didn't know we had alerts for new databases created on those hosts.

The reason is that I found a bug in https://gitlab.wikimedia.org/repos/cloud/wikireplicas-utils/-/merge_requests/9 where a \n was removed in the query builder and that caused some of the queries to fail, crashing the script. I will fix that today.

A possible fix is to always clean up the _maintain databases at the end of maintain-views, even when there was an error. Another one would be to exclude _maintain databases from the check.

I fixed the issue and it now completes successfully, and it does find some views that are not in sync:

fnegri@clouddb1017:~/wikireplicas-utils$ sudo python3 src/wikireplicas_utils/maintain_views.py --databases itwikivoyage --diff-mode --clean
2026-04-09 09:38:14,612 INFO Processing instance 's1'
2026-04-09 09:38:14,612 INFO Processing instance 's3'
2026-04-09 09:38:14,612 INFO Processing db 'itwikivoyage'
2026-04-09 09:38:14,838 INFO   Updating view actor
2026-04-09 09:38:14,894 INFO   Updating view actor_logging
2026-04-09 09:38:14,963 INFO   Updating view comment
2026-04-09 09:38:15,022 INFO   Updating view comment_logging
2026-04-09 09:38:15,384 INFO Dropping view s3.itwikivoyage_p.interwiki
2026-04-09 09:38:15,385 INFO Dropping view s3.itwikivoyage_p.user_autocreate_serial
maintain-views completed on 2 instance(s) and 1 database(s)
DIFF MODE enabled, no changes were made.
Views that would have been created: 0
Views that would have been updated: 4
Views that would have been deleted: 2

I tested one full run on all databases on clouddb1017. It's slower than the old version, but acceptable: it took 12 minutes in total.

The results are:

fnegri@clouddb1017:~/wikireplicas-utils$ sudo python3 src/wikireplicas_utils/maintain_views.py --all-databases --diff-mode --clean
[...]
maintain-views completed on 2 instance(s) and 868 database(s)
DIFF MODE enabled, no changes were made.
Views that would have been created: 608
Views that would have been updated: 3472
Views that would have been deleted: 1911
Views with SQL errors: 0

A sample of the affected views (many are repeated across all dbs, but some are only shown for some dbs):

2026-04-14 11:23:09,191 INFO Processing instance 's1'
2026-04-14 11:23:09,192 INFO Processing db 'enwiki'
2026-04-14 11:23:09,489 INFO   Updating view actor
2026-04-14 11:23:09,535 INFO   Updating view actor_logging
2026-04-14 11:23:09,603 INFO   Updating view comment
2026-04-14 11:23:09,649 INFO   Updating view comment_logging
2026-04-14 11:23:10,146 INFO Dropping view s1.enwiki_p.user_autocreate_serial
2026-04-14 11:23:10,147 INFO Dropping view s1.enwiki_p.interwiki
2026-04-14 11:23:10,147 INFO Dropping view s1.enwiki_p.flaggedrevs_tracking
[...]
2026-04-14 11:34:36,937 INFO   Creating new view wikifunctionsclient_usage
2026-04-14 11:35:53,210 INFO   Creating new view ores_classification
2026-04-14 11:35:53,211 INFO   Creating new view ores_model
[...]

I'm double checking that all the diffs displayed are correct.

Here's an example of showing a diff with --debug:

fnegri@clouddb1017:~/wikireplicas-utils$ sudo python3 src/wikireplicas_utils/maintain_views.py --database itwikivoyage --table actor --diff-mode --debug
[...]
2026-04-17 14:26:02,581 DEBUG Processing custom views for s3.itwikivoyage:
2026-04-17 14:26:02,581 DEBUG  Custom view: actor
2026-04-17 14:26:02,612 DEBUG   Difference:
--- View currently in database

+++ View defined in YAML configuration file

@@ -90,7 +90,7 @@

     WHERE
       `itwikivoyage`.`logging`.`log_actor` = `itwikivoyage`.`actor`.`actor_id`
       AND `itwikivoyage`.`logging`.`log_deleted` & 4 = 0
-      AND `itwikivoyage`.`logging`.`log_type` IN ('abusefilter', 'articlefeedbackv5', 'block', 'campus', 'close', 'contentmodel', 'course', 'create', 'delete', 'eparticle', 'gather', 'gblblock', 'gblrename', 'gblrights', 'globalauth', 'growthexperiments', 'gwtoolset', 'import', 'institution', 'instructor', 'interwiki', 'liquidthreads', 'lock', 'managetags', 'massmessage', 'merge', 'moodbar', 'move', 'mwoauthconsumer', 'newsletter', 'newusers', 'notifytranslators', 'online', 'pagelang', 'pagetranslation', 'pagetriage-copyvio', 'pagetriage-curation', 'pagetriage-deletion', 'patrol', 'protect', 'renameuser', 'review', 'rights', 'spamblacklist', 'stable', 'student', 'tag', 'thanks', 'timedmediahandler', 'translationreview', 'upload', 'usermerge')
+      AND `itwikivoyage`.`logging`.`log_type` IN ('abusefilter', 'abusefilterblockeddomainhit', 'articlefeedbackv5', 'block', 'campus', 'close', 'contentmodel', 'course', 'create', 'delete', 'eparticle', 'gather', 'gblblock', 'gblrename', 'gblrights', 'globalauth', 'growthexperiments', 'gwtoolset', 'import', 'institution', 'instructor', 'interwiki', 'liquidthreads', 'lock', 'managetags', 'massmessage', 'merge', 'moodbar', 'move', 'mwoauthconsumer', 'newsletter', 'newusers', 'notifytranslators', 'online', 'pagelang', 'pagetranslation', 'pagetriage-copyvio', 'pagetriage-curation', 'pagetriage-deletion', 'patrol', 'protect', 'renameuser', 'review', 'rights', 'spamblacklist', 'stable', 'student', 'tag', 'thanks', 'timedmediahandler', 'translationreview', 'upload', 'usermerge')
     LIMIT 1
   )
   OR EXISTS(
[...]

I verified that running the old version of the script to update that specific table clears the diff:

fnegri@clouddb1017:~/wikireplicas-utils$ sudo maintain-views --database itwikivoyage --table actor --replace
2026-04-17 14:34:18,024 INFO Full views for s3.itwikivoyage:
2026-04-17 14:34:18,024 INFO Custom views for s3.itwikivoyage:
2026-04-17 14:34:18,026 INFO [s3.itwikivoyage_p.actor]
2026-04-17 14:34:18,029 INFO [s3.itwikivoyage_p.actor_user]
2026-04-17 14:34:18,032 INFO [s3.itwikivoyage_p.actor_archive]
2026-04-17 14:34:18,034 INFO [s3.itwikivoyage_p.actor_block]
2026-04-17 14:34:18,036 INFO [s3.itwikivoyage_p.actor_ipblocks]
2026-04-17 14:34:18,039 INFO [s3.itwikivoyage_p.actor_image]
2026-04-17 14:34:18,041 INFO [s3.itwikivoyage_p.actor_oldimage]
2026-04-17 14:34:18,043 INFO [s3.itwikivoyage_p.actor_filearchive]
2026-04-17 14:34:18,045 INFO [s3.itwikivoyage_p.actor_recentchanges]
2026-04-17 14:34:18,047 INFO [s3.itwikivoyage_p.actor_logging]
2026-04-17 14:34:18,050 INFO [s3.itwikivoyage_p.actor_revision]
2026-04-17 14:34:18,053 INFO [s3.itwikivoyage_p.image_compat]
2026-04-17 14:34:18,057 INFO [s3.itwikivoyage_p.logging_compat]
2026-04-17 14:34:18,060 INFO [s3.itwikivoyage_p.oldimage_compat]
2026-04-17 14:34:18,064 INFO [s3.itwikivoyage_p.recentchanges_compat]
fnegri@clouddb1017:~/wikireplicas-utils$ sudo python3 src/wikireplicas_utils/maintain_views.py --database itwikivoyage --table actor --diff-mode
2026-04-17 14:34:24,335 INFO Processing instance 's1'
2026-04-17 14:34:24,336 INFO Processing instance 's3'
2026-04-17 14:34:24,336 INFO Processing db 'itwikivoyage'
maintain-views completed on 2 instance(s) and 1 database(s)
DIFF MODE enabled, no changes were made.
Views that would have been created: 0
Views that would have been updated: 0
Views that would have been deleted: 0
Views with SQL errors: 0