Page MenuHomePhabricator

sql.php must not run LoadExtensionSchemaUpdates
Closed, ResolvedPublic

Description

tgr@deployment-tin:~$ mwscript sql.php --wiki=enwiki
...vote_ip in table securepoll_votes already modified by patch /srv/mediawiki-staging/php-master/extensions/SecurePoll/patches/patch-vote_ip-extend.sql.
...echo_subscription doesn't exist.
>

That sounds like sql.php somehow triggered the update process, which really shouldn't be happening.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 507898 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/extensions/Wikibase@master] Register schema update callbacks rather than running them on DatabaseUpdater construction

https://gerrit.wikimedia.org/r/507898

Change 511778 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/extensions/Wikibase@master] Register schema update callbacks rather than running them on DatabaseUpdater construction

https://gerrit.wikimedia.org/r/511778

hashar removed a subscriber: hashar.May 21 2019, 9:27 PM
aaron added a comment.Sep 9 2019, 9:42 PM

So, getting this test merged depends on redoing the wikibase schema hook application order for update.php. In CI, there seems to be a problem when it interacts with Flow hooks trying to make pages.

aaron removed aaron as the assignee of this task.Sep 9 2019, 9:42 PM

@aaron are you requesting code review from Core Platform or do you need something else?

aaron added a comment.Sep 30 2019, 8:05 PM

@aaron are you requesting code review from Core Platform or do you need something else?

It would need some debugging and code changes to work with Wikibase+Flow

So, getting this test merged depends on redoing the wikibase schema hook application order for update.php. In CI, there seems to be a problem when it interacts with Flow hooks trying to make pages.

Tagging Wikidata and pinging @Pablo-WMDE and @Ladsgroup for the Wikibase issue.

Anomie added a subscriber: Anomie.

Long term, this sort of thing should wind up being fixed by T191231: RFC: Abstract schemas and schema changes.

Short term, Wikibase changes are unlikely to be in scope for CPT. Looked at the core structure test patch.

Change 586473 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] maintenance: Remove sql.php temporarily

https://gerrit.wikimedia.org/r/586473

Change 586474 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@wmf/1.35.0-wmf.26] maintenance: Remove sql.php temporarily

https://gerrit.wikimedia.org/r/586474

This is merely to show that the code path involving Wikibase\…\DatabaseSchemaUpdater->doSchemaUpdate was in fact called by sql.php. The fatal error itself below is harmless in this context.

2020-04-06T23:02:13 snapshot1008  /srv/mediawiki/multiversion/MWScript.php maintenance/sql.php --wiki wikidatawiki --json --query SELECT MAX(page_id) AS max_page_id FROM page
…
Fatal error: Call to undefined method setUser()
…
#0 /srv/mediawiki/php-1.35.0-wmf.26/extensions/GlobalPreferences/includes/Hooks.php(43): GlobalPreferences\Hooks::getPreferencesFactory(User)
#1 /srv/mediawiki/php-1.35.0-wmf.26/includes/Hooks.php(174): GlobalPreferences\Hooks::onUserLoadOptions(User, array)
#2 /srv/mediawiki/php-1.35.0-wmf.26/includes/Hooks.php(202): Hooks::callHook(string, array, array, NULL)
#3 /srv/mediawiki/php-1.35.0-wmf.26/includes/user/User.php(5135): Hooks::run(string, array)
#4 /srv/mediawiki/php-1.35.0-wmf.26/includes/user/User.php(2889): User->loadOptions()
#5 /srv/mediawiki/php-1.35.0-wmf.26/includes/parser/ParserOptions.php(1177): User->getOption(string)
#6 /srv/mediawiki/php-1.35.0-wmf.26/includes/parser/ParserOptions.php(986): ParserOptions->initialiseFromUser(User, LanguageEn)
#7 /srv/mediawiki/php-1.35.0-wmf.26/includes/parser/ParserOptions.php(1011): ParserOptions->__construct(User)
#8 /srv/mediawiki/php-1.35.0-wmf.26/includes/Storage/DerivedPageDataUpdater.php(1433): ParserOptions::newFromUser(User)
#9 /srv/mediawiki/php-1.35.0-wmf.26/includes/page/WikiPage.php(2068): MediaWiki\Storage\DerivedPageDataUpdater->doUpdates()
#10 /srv/mediawiki/php-1.35.0-wmf.26/extensions/Wikibase/repo/includes/Store/Sql/SqlStore.php(336): WikiPage->doEditUpdates(MediaWiki\Revision\RevisionStoreRecord, User)
#11 /srv/mediawiki/php-1.35.0-wmf.26/extensions/Wikibase/repo/includes/Store/Sql/DatabaseSchemaUpdater.php(89): Wikibase\Repo\Store\Sql\SqlStore->rebuild()
#12 /srv/mediawiki/php-1.35.0-wmf.26/extensions/Wikibase/repo/includes/Store/Sql/DatabaseSchemaUpdater.php(61): Wikibase\Repo\Store\Sql\DatabaseSchemaUpdater->doSchemaUpdate(MysqlUpdater)
#13 /srv/mediawiki/php-1.35.0-wmf.26/includes/Hooks.php(174): Wikibase\Repo\Store\Sql\DatabaseSchemaUpdater::onSchemaUpdate(MysqlUpdater)
#14 /srv/mediawiki/php-1.35.0-wmf.26/includes/Hooks.php(202): Hooks::callHook(string, array, array, NULL)
#15 /srv/mediawiki/php-1.35.0-wmf.26/includes/installer/DatabaseUpdater.php(129): Hooks::run(string, array)
#16 /srv/mediawiki/php-1.35.0-wmf.26/includes/installer/DatabaseUpdater.php(195): DatabaseUpdater->__construct(Wikimedia\Rdbms\MaintainableDBConnRef, boolean, MwSql)
#17 /srv/mediawiki/php-1.35.0-wmf.26/maintenance/sql.php(92): DatabaseUpdater::newForDB(Wikimedia\Rdbms\MaintainableDBConnRef, boolean, MwSql)
#18 /srv/mediawiki/php-1.35.0-wmf.26/maintenance/doMaintenance.php(99): MwSql->execute()
#19 /srv/mediawiki/php-1.35.0-wmf.26/maintenance/sql.php(230): require_once(string)
#20 /srv/mediawiki/multiversion/MWScript.php(101): require_once(string)
#21 {main}

Change 586473 merged by jenkins-bot:
[mediawiki/core@master] maintenance: Remove sql.php temporarily

https://gerrit.wikimedia.org/r/586473

This is a very, very ugly accident waiting to happen.

This bug brought down everything for twenty minutes and caused an important table to vanish in production today.

Change 586474 merged by jenkins-bot:
[mediawiki/core@wmf/1.35.0-wmf.26] maintenance: Remove sql.php temporarily

https://gerrit.wikimedia.org/r/586474

Mentioned in SAL (#wikimedia-operations) [2020-04-07T01:06:52Z] <jforrester@deploy1001> Synchronized php-1.35.0-wmf.26/autoload.php: T157651 Remove sql.php from autoloader (duration: 00m 58s)

Mentioned in SAL (#wikimedia-operations) [2020-04-07T01:08:26Z] <jforrester@deploy1001> Synchronized php-1.35.0-wmf.26/maintenance/: T157651 Remove sql.php from maintenance/ (duration: 00m 58s)

Change 587010 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] restore: Add s8 instance to db1095

https://gerrit.wikimedia.org/r/587010

Change 587010 merged by Jcrespo:
[operations/puppet@production] restore: Add s8 instance to db1095

https://gerrit.wikimedia.org/r/587010

kostajh added a subscriber: kostajh.

Flow was mentioned in the comments as a blocker to this work. I rebased the structure test patch to see if the error still comes up.

daniel added a comment.EditedApr 7 2020, 9:02 AM

Looking at the code, there seems to be a whole set of dubious things going on which all together caused the disaster above:

  • sql.php connects to master per default. Slightly dubious, but ok, if you want to be bale to make modifications, you'd want that.
  • if connecting to master, sql.php constructs a DatabaseUpdater just so it can do $db->setSchemaVars( $updater->getSchemaVars() ). We could just have the logic that is currently in MysqlUpdater::getSchemaVars() in MysqlDatabase::getDefaultSchemaVars(), or introduce Database::getUpdateSchemaVars().
  • also, the only thing this does is add support for /*$wgDBTableOptions*/. Is that really needed in sql.php?
  • Anyway. DatabaseUpdater does a hole lot of scary things in the constructor which should probably be done later, probably in doUpdates(). The constructor calls $this->initOldGlobals() to modify a bunch of global variables (!), $this->loadExtensions() in case extensions are not yet loaded (why is this needed?), and Hooks::run( 'LoadExtensionSchemaUpdates', [ $this ] ) to allow extensions to inject stuff into the updater.
  • Wikibase registered DatabaseSchemaUpdater::onSchemaUpdate as a handler for the LoadExtensionSchemaUpdates hook. onSchemaUpdate() calls doSchemaUpdate(), which calls methods like dropTable() which directly drops a table, instead of dropExtensionTable, which only marks a table for being dropped later when DatabaseUpdater::doUpdates() is run. The wikibase DatabaseSchemaUpdater class seems to be a scary mix of "do it now" and "do litlater". Which is understandable, because:
  • DatabaseUpdater doesn't document the semantic different between e.g. dropTable and dropExtensionTable. It doesn't say which should be used when.
  • The documentation for the LoadExtensionSchemaUpdates hook doesn't mention that handlers for this hook should only register updates, not directly perform them. The example given in the docs uses modifyExtensionField(), but nothing there mentions that methods that directly modify the database must not be used. The documentation states ambiguously: "Fired when MediaWiki is updated to allow extensions to update the database".

Changing any of the above might have prevented the problem. I think most if not all of these things should be fixed.

daniel added a comment.EditedApr 7 2020, 9:15 AM

Flow was mentioned in the comments as a blocker to this work. I rebased the structure test patch to see if the error still comes up.

I see violations from Wikibase:

  • /workspace/src/extensions/Wikibase/repo/includes/Store/Sql/ChangesSubscriptionSchemaUpdater.php:48
  • /workspace/src/extensions/Wikibase/repo/includes/Store/Sql/DatabaseSchemaUpdater.php:78

This is a very, very ugly accident waiting to happen.

This bug brought down everything for twenty minutes and caused an important table to vanish in production today.

This is technically a wikibase bug, caused by the hook handler doing direct schema modifications, rather than registering them. But there isn't any documentation that would have made it clear that this was a bad idea (tm)...

Change 587213 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] DatabaseUpdate: warn extensions about direct modification

https://gerrit.wikimedia.org/r/587213

daniel added a project: Wikidata-Campsite.

Tagging Wikidata-Campsite for the issues with the wikibase schema updater.

Restricted Application added a project: Wikidata. · View Herald TranscriptApr 7 2020, 9:33 AM

Tagging Wikidata-Campsite for the issues with the wikibase schema updater.

I think this task will explode. What do you think of creating a subtask for Wikibase?

daniel removed a project: Wikidata-Campsite.

Tagging Wikidata-Campsite for the issues with the wikibase schema updater.

I think this task will explode. What do you think of creating a subtask for Wikibase?

Done, see T249598: Wikibase schema updaters must not modify database directly.

T249599: DatabaseUpdater should not call hooks in the constructor should address the original issue in this task's title.

mark added a project: DBA.Apr 7 2020, 12:11 PM

Looking at the code, there seems to be a whole set of dubious things going on which all together caused the disaster above

  • the only thing this does is add support for /*$wgDBTableOptions*/. Is that really needed in sql.php?

I believe the main purpose of sql.php is to apply schema patches in a way you can control as an administrator, e.g. instead of update.php. The way you do that is by passing the patch file to sql.php. For that to work, it must expand table prefixes and table options, just like update.php would.

Having a maintenance script that can do this, is important to have and keep.

But, that also gets us to a number of other problems:

  • For the use case of patching the schema, we also have maintenance/patchSql.php which makes this use case even easier. But maybe that is too abstract currently? (It lets you omit the patch- file name prefix, and the .sql suffix, and the relevant subdirectory for the current rdbms). I think for someone who doesn't use update.php their entry point is presumably the list of patch files since the previous release and wanting to apply them. Not sure what's best, but we shouldn't have both.
  • For the use case of running ad-hoc queries while debugging something in production, we also have maintenance/mysql.php which is better suited for this purpose.

In fact: the maintenance/mysql.php script says:

* Note that this will not do table prefixing or variable substitution.
* To safely run schema patches, use sql.php.

I suggest that we:

  • Keep patchSql.php and to the extent not already the case make it only support running queries from a patch file (not arbitrary input), and carry over any relevant features or conveniences from sql.php.
  • Keep mysql.php which is what I believe most developers use already for ad-hoc querying. For example, the sql <dbname> utility that we have in WMF production uses mwscript mysql.php --wiki <dbname> underneath. For other db-backends we support in core a similar thing could be created if there is demand for it.
  • Remove sql.php. Or, if there is demand for it, strip it down to just the bare minimum for an interactive shell that sends read queries and nothing else (no file paths, no programmatic use, no schema options, no master/write queries).

I suggest that we:

  • Keep patchSql.php and to the extent not already the case make it only support running queries from a patch file (not arbitrary input), and carry over any relevant features or conveniences from sql.php.
  • Keep mysql.php which is what I believe most developers use already for ad-hoc querying. For example, the sql <dbname> utility that we have in WMF production uses mwscript mysql.php --wiki <dbname> underneath. For other db-backends we support in core a similar thing could be created if there is demand for it.
  • Remove sql.php. Or, if there is demand for it, strip it down to just the bare minimum for an interactive shell that sends read queries and nothing else (no file paths, no programmatic use, no schema options, no master/write queries).

This feels like a good set of choices (including removing sql.php and pointing users to the other two alternatives.

(Do we have an equivalent of mysql.php for sqlite and postgress?).

Tgr added a comment.Apr 8 2020, 10:26 AM

I would suggest the opposite: keep sql.php, drop patchSql.php. I don't think many people are familiar with the latter (compare patchSql vs sql docs for example) and I don't think it's terribly useful - passing a file path is more user-friendly than passing a patch name. And it does not even replace the schema vars, meaning it's actively harmful to anyone who uses table prefixes or non-default table settings.

So, IMO

  • keep mysql.php which is indeed widely used at least in Wikimedia production for debugging, more user-friendly than sql.php (which channels query output through PHP which does weird things to it) and not problematic (it already requires a write flag for performing any changes - although that relies on a master/slave distinction so that could be improved, cf T249683#6039238 - and does not accidentally run updares).
  • kill patchSql.php which is IMO pretty useless. (Probably worth a wikitech question to ensure it is indeed not used.)
  • keep sql.php manual debugging mode, which is the only way to debug a non-MySQL server, but require an explicit --debug flag used. Do not invoke the updater (not even for variable transformations) when that's used, it seems pointless and just extra code path exposure (cf the fatal error it gave during the incident).
  • keep sql.php for running scripts but require a --write flag like mysql.php does for scripts that change data. (I would even separate an admin mode for schema changes and a write mode for data changes via a restricted user.)
  • if sql.php is invoked with no script file and --debug flag just exit with an error (and without creating a DatabaseUpdater or doing anything else nontrivial).
Krinkle added a comment.EditedApr 9 2020, 12:40 AM

As said, I don't mind keeping sql.php as a cross-rdbms debug tool if there's interest for that.

However, I think use in production is problematic and should not be allowed for the same reason that echo <something> | mwscript eval.php shouldn't be used as part of production workflows. Such code will likely be untested, not reviewed by relevant maintainers, and not seen when change are made to the underlying code. We don't index the Puppet repository as part of "WMF-deployed MediaWiki code" and I don't think we should start. Data from a db table should generally be read via a PHP class that abstracts it. But short of that, at the very least with a minimal WMF-specific script in the WikimediaMaintenance extension.

The fact that that a Wikibase cron job is the only thing in production not following that, I think means this is something we generally follow but just didn't enforce or document in a way that new developers would have learned.

  • keep sql.php for running scripts but require a --write flag like mysql.php does for scripts that change data. (I would even separate an admin mode for schema changes and a write mode for data changes via a restricted user.)

Also, when invoking sql.php in this way, it should always fatal in wmf production in the same way that update.php are disabled. Possibly by levering the same feature flag.

For the use case of patching, I think either of sql.php --write --admin --patch <filename> or patchSql.php <filename> would work. I do still prefer tha latter as it would make sql.php much easier to reason about.

We'd have a script that on the one side:

  • Uses the regular wiki user.
  • Can do read or --write.
  • Only works in an interative shell.

And on the other side:

  • Is disabled by $wgAllowSchemaUpdates.
  • Uses the administrative user and can execute schema changes.
  • Invokers Installer classes.
  • Takes a patch file as parameter.

Having the same maintenance script responsible for both seems like it could very well cause another outage in the future. Because so long as sql.php contains code that engages administrator credentials and initialises Installer classes, all it takes is one bad typo in a commit that changes sql.php and then the next person invoking it could be facing similar problems in the future. This is essentially Murphey's Law, which we've seen up close in T249565 where half a dozen edge cases lined up in just the right way to make it happen. Leaving this unplugged by itself likely won't cause any problems, but it's just something that could easily regress and we'd not know for many months until sometimes else ends up using it by mistake.

It also seems in terms of developer ergonimics that a separate endpoint would be easier to explain in documentation and make use of. I do agree that the current patchSql.php script is too hard to use, we'd model in after sql.php instead.

Change 587389 had a related patch set uploaded (by Krinkle; owner: Tim Starling):
[mediawiki/core@master] Revert "maintenance: Remove sql.php temporarily"

https://gerrit.wikimedia.org/r/587389

Change 587389 merged by jenkins-bot:
[mediawiki/core@master] Revert "maintenance: Remove sql.php temporarily"

https://gerrit.wikimedia.org/r/587389

I would suggest the opposite: keep sql.php, drop patchSql.php. I don't think many people are familiar with the latter (compare patchSql vs sql docs for example) and I don't think it's terribly useful - passing a file path is more user-friendly than passing a patch name. And it does not even replace the schema vars, meaning it's actively harmful to anyone who uses table prefixes or non-default table settings.

So, IMO

  • keep mysql.php which is indeed widely used at least in Wikimedia production for debugging, more user-friendly than sql.php (which channels query output through PHP which does weird things to it) and not problematic (it already requires a write flag for performing any changes - although that relies on a master/slave distinction so that could be improved, cf T249683#6039238 - and does not accidentally run updares).
  • kill patchSql.php which is IMO pretty useless. (Probably worth a wikitech question to ensure it is indeed not used.)
  • keep sql.php manual debugging mode, which is the only way to debug a non-MySQL server, but require an explicit --debug flag used. Do not invoke the updater (not even for variable transformations) when that's used, it seems pointless and just extra code path exposure (cf the fatal error it gave during the incident).
  • keep sql.php for running scripts but require a --write flag like mysql.php does for scripts that change data. (I would even separate an admin mode for schema changes and a write mode for data changes via a restricted user.)
  • if sql.php is invoked with no script file and --debug flag just exit with an error (and without creating a DatabaseUpdater or doing anything else nontrivial).

This all makes sense to me. Since DDLs are not transactional in mariadb and so on, it adds an extra layer of disruption over "DELETE WHERE 1=1". Permissions with temp tables could get annoying though (Database.php read-only enforcement already has to look out for them). Starting off with a --write flag seems doable for now. It would be nice to make it use a replica DB by default (if --write is missing and --replicadb is not already set to a host or *).

daniel claimed this task.Apr 15 2020, 4:09 PM

Change 587213 merged by jenkins-bot:
[mediawiki/core@master] DatabaseUpdate: warn extensions about direct modification

https://gerrit.wikimedia.org/r/587213

Change 511778 abandoned by Daniel Kinzler:
[DNM] Register schema updates as callbacks during DatabaseUpdater construction

Reason:
Superceeded by I84f5fe6836a89352c3ff19c9a985ca359a5bf74d

https://gerrit.wikimedia.org/r/511778

It seems like all essential patches are merged. Can this be closed now?

Krinkle renamed this task from sql.php runs LoadExtensionSchemaUpdates to sql.php must not run LoadExtensionSchemaUpdates.EditedApr 21 2020, 4:17 PM

[mediawiki/core@master] Structure test for wrong calls in LoadExtensionSchemaUpdates
https://gerrit.wikimedia.org/r/475065

@daniel I noticed a patch that makes some methods inaccesible to extensions. Does that mean this structure test is no longer needed? Or are there some dangerous methods exposed still?

Also, does the sql.php script now never run this hook?

@daniel I noticed a patch that makes some methods inaccesible to extensions. Does that mean this structure test is no longer needed? Or are there some dangerous methods exposed still?

See my comment on the patch: the structure test could still ensure that no dangerous methods are called on the Database object (rather than on DatabaseUpdater). The dangerous methods on DatabaseUpdater are no longer accessible to extensions, but they could still call Database::dropTable() directly.

Also, does the sql.php script now never run this hook?

The hook call was removed from the constructor of DatabaseUpdater, and is now triggered from doUpdates(). I see no way for sql.php to trigger it.

Krinkle closed this task as Resolved.Apr 21 2020, 4:58 PM

Change 507898 abandoned by Aaron Schulz:
Register schema update callbacks rather than running them on DatabaseUpdater construction

Reason:
obsolete

https://gerrit.wikimedia.org/r/507898

Aklapper removed a subscriber: Anomie.Oct 16 2020, 5:40 PM