Determine fate of sql.php/patchSql.php
Open, MediumPublic
Actions

Assigned To

None

Authored By

	Krinkle
	Apr 21 2020, 5:06 PM

Description

Objective

Have one (1) maintenance script that we recommend for the purpose of applying schema patches manually
- ... this would have the same protections and disable-guards as update.php,
- .. use a writeble/admin database connection by default,
- .. might run Installer and DatabaseUpdater-related classes and hooks.
Have one (1) maintenance script that we recommend for the purpose of ad-hoc debugging against a production database.
- ... this would use a read connection by default,
- .. is incapable of running any Installer or DatabaseUpdater-related classes or hooks.

Background

In T157651#6037453, @Jdforrester-WMF wrote:

In T157651#6037432, @Krinkle wrote:

I suggest that we:

Keep patchSql.php and to the extent not already the case make it only support running queries from a patch file (not arbitrary input), and carry over any relevant features or conveniences from sql.php.

Keep mysql.php which is what I believe most developers use already for ad-hoc querying. For example, the sql <dbname> utility that we have in WMF production uses mwscript mysql.php --wiki <dbname> underneath. For other db-backends we support in core a similar thing could be created if there is demand for it.

Remove sql.php. Or, if there is demand for it, strip it down to just the bare minimum for an interactive shell that sends read queries and nothing else (no file paths, no programmatic use, no schema options, no master/write queries).

This feels like a good set of choices (including removing sql.php and pointing users to the other two alternatives).

(Do we have an equivalent of mysql.php for sqlite and postgress?).

In T157651#6039302, @Tgr wrote:

I would suggest the opposite: keep sql.php, drop patchSql.php. I don't think many people are familiar with the latter […] and I don't think it's terribly useful - passing a file path is more user-friendly than passing a patch name. And it does not even replace the schema vars, meaning it's actively harmful to anyone who uses table prefixes or non-default table settings.

So, IMO

keep mysql.php which is indeed widely used at least in Wikimedia production for debugging, more user-friendly than sql.php (which channels query output through PHP which does weird things to it) and not problematic (it already requires a write flag for performing any changes - although that relies on a master/slave distinction so that could be improved, cf T249683#6039238 - and does not accidentally run updares).

kill patchSql.php which is IMO pretty useless. (Probably worth a wikitech question to ensure it is indeed not used.)

keep sql.php manual debugging mode, which is the only way to debug a non-MySQL server, but require an explicit --debug flag used. Do not invoke the updater (not even for variable transformations) when that's used, it seems pointless and just extra code path exposure (cf the fatal error it gave during the incident).

keep sql.php for running scripts but require a --write flag like mysql.php does for scripts that change data. (I would even separate an admin mode for schema changes and a write mode for data changes via a restricted user.)

if sql.php is invoked with no script file and --debug flag just exit with an error (and without creating a DatabaseUpdater or doing anything else nontrivial).

In T157651#6041910, @Krinkle wrote:

As said, I don't mind keeping sql.php as a cross-rdbms debug tool if there's interest for that.

However, I think use in production [cron jobs] is problematic and should not be allowed […] The fact that that a Wikibase cron job is the only thing in production not following that, I think means this is something we generally follow but just didn't enforce or document in a way that new developers would have learned.

keep sql.php for running scripts but require a --write flag like mysql.php does for scripts that change data. (I would even separate an admin mode for schema changes and a write mode for data changes via a restricted user.)

Also, when invoking sql.php in this way, it should always fatal in wmf production in the same way that update.php are disabled. Possibly by levering the same feature flag.

For the use case of patching, I think either of sql.php --write --admin --patch <filename> or patchSql.php <filename> would work. I do still prefer tha latter as it would make sql.php much easier to reason about.

We'd have a script that on the one side:

Uses the regular wiki user.

Can do read or --write.

Only works in an interative shell.

And on the other side:

Is disabled by $wgAllowSchemaUpdates.

Uses the administrative user and can execute schema changes.

Invokers Installer classes.

Takes a patch file as parameter.

Having the same maintenance script responsible for both seems like it could very well cause another outage in the future. […] in terms of developer ergonimics that a separate endpoint would be easier to explain in documentation and make use of. I do agree that the current patchSql.php script is too hard to use, we'd re-model in after sql.php instead.