The final step when setting up labs replication for a wiki is to run maintain-replicas.pl, which creates redacted views of the replicated data.
This script is currently unmaintained, and it has several issues:
- It is written in Perl. Perl is a fine language, but it's not one we tend to use, so proficiency and tooling are scarce.
- It is poorly factored, which makes it hard to follow. The bulk of the work is done at file scope; explanatory comments are few and far between; and the source alternates between Perl and large blocks of SQL.
- There is no separation of code and configuration. The tables to be mirrored and the columns to expose or redact are all hard-coded. This hurts readability, and it is liable to break on future schema changes. Such breakage could result in leakage of private user data.
- It makes assumptions about the execution environment that are not always documented or correct. For example, the script attempts to load the database credentials from ~/.my.cnf, which it parses incorrectly.
- All the work is done in a single shot. The script does not pause or ask for confirmation before performing destructive operations. Because the work is not factored into discrete subroutines, there is no way to resume the work if the script is interrupted.
- It blows up and rebuilds all wikis on every run. There is no way to run it for just one wiki. There is no "dry run" mode.
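To illustrate what separating code from configuration could look like, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the JSON layout, the "expose"/"redact" keywords, the function name build_view_sql, and the table, column and database names. It is not how maintain-replicas.pl works today. Columns that are not listed in the configuration are left out of the view, so a column added by a future schema change is not exposed by accident.

```python
import json

# Hypothetical configuration: table -> column -> visibility.
# In practice this would live in its own file, not in the script.
EXAMPLE_CONFIG = """
{
  "user": {
    "user_id": "expose",
    "user_name": "expose",
    "user_password": "redact"
  }
}
"""


def build_view_sql(source_db, view_db, table, columns):
    """Build a CREATE VIEW statement from a column visibility mapping.

    Exposed columns are selected as-is and redacted columns are replaced
    with NULL. Columns missing from the mapping never reach the view.
    """
    select_list = []
    for name, visibility in columns.items():
        if visibility == "expose":
            select_list.append("`%s`" % name)
        elif visibility == "redact":
            select_list.append("NULL AS `%s`" % name)
        else:
            raise ValueError("unknown visibility %r for %s.%s"
                             % (visibility, table, name))
    return "CREATE OR REPLACE VIEW `%s`.`%s` AS SELECT %s FROM `%s`.`%s`" % (
        view_db, table, ", ".join(select_list), source_db, table)


config = json.loads(EXAMPLE_CONFIG)
statement = build_view_sql("enwiki", "enwiki_p", "user", config["user"])
```

A typo in the configuration raises an error instead of silently exposing a column, which seems the safer default given the risk of leaking private data.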
It should be rewritten so that:
- run-time configuration can be specified on the command line, via the environment, or by some other means that does not involve modifying the script.
- table and column definitions and visibility settings are defined in separate configuration files.
- the script asks for confirmation before performing destructive operations.
- users can preview operations with a "dry run" flag.
- users can limit the scope of operations to a subset of databases.
- the script is verbose about what it is doing.
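The requirements above could translate into a command-line front end along the lines of the following Python sketch. The flag names (--config, --databases, --dry-run, --yes), the REPLICA_CONFIG environment variable, the default file name and the helper functions are all hypothetical; they only show one way the pieces might fit together.

```python
import argparse
import logging
import os

log = logging.getLogger("maintain-views")  # hypothetical logger name


def parse_args(argv=None, environ=None):
    """Parse run-time options; every flag name here is illustrative only."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Create redacted views (sketch).")
    parser.add_argument(
        "--config",
        default=environ.get("REPLICA_CONFIG", "views.json"),
        help="file listing tables, columns and their visibility")
    parser.add_argument(
        "--databases", nargs="+", metavar="DB",
        help="limit the run to these databases (default: all)")
    parser.add_argument("--dry-run", action="store_true",
                        help="log what would be done, change nothing")
    parser.add_argument("--yes", action="store_true",
                        help="do not prompt before destructive operations")
    return parser.parse_args(argv)


def select_databases(all_dbs, requested):
    """Return the databases to operate on, rejecting unknown names."""
    if not requested:
        return list(all_dbs)
    unknown = sorted(set(requested) - set(all_dbs))
    if unknown:
        raise ValueError("unknown database(s): " + ", ".join(unknown))
    return [db for db in all_dbs if db in requested]


def confirm(prompt, assume_yes=False, ask=input):
    """Ask before a destructive step; only an explicit 'y' or 'yes' proceeds."""
    if assume_yes:
        return True
    return ask(prompt + " [y/N] ").strip().lower() in ("y", "yes")


def run_statement(sql, dry_run, execute):
    """Log every statement; run it only when this is not a dry run."""
    if dry_run:
        log.info("DRY RUN, would execute: %s", sql)
        return False
    log.info("Executing: %s", sql)
    execute(sql)
    return True
```

With these names, an invocation such as `--dry-run --databases enwiki dewiki` would log the statements for two wikis without touching either of them.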
Ideally, the script will clean up after itself in case of failure, or at least provide some useful hints about what needs to be done.
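One possible shape for the "useful hints" part is sketched below in Python: process one database at a time, stop at the first failure, and report what was completed, what failed, and what still needs to be run. The function names and the wording of the messages are assumptions made for this example, and actual cleanup of a half-built database is deliberately left out.

```python
def process_all(databases, process_one):
    """Run process_one(db) for each database, stopping at the first failure.

    Returns (done, failed, pending), where failed is None on success or a
    (database, reason) tuple, so the caller can report where the run stopped.
    """
    done = []
    for index, db in enumerate(databases):
        try:
            process_one(db)
        except Exception as exc:
            return done, (db, str(exc)), list(databases[index + 1:])
        done.append(db)
    return done, None, []


def failure_hint(done, failed, pending):
    """Format a human-readable summary of an interrupted run."""
    db, reason = failed
    lines = [
        "Failed on %s: %s" % (db, reason),
        "Completed: %s" % (", ".join(done) or "none"),
        "Views for %s may be incomplete; re-run for: %s"
        % (db, " ".join([db] + pending)),
    ]
    return "\n".join(lines)
```

Because each database is handled by a single call, the list printed in the last line can be passed straight back to the script to resume the work.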