Page MenuHomePhabricator

Make rdbms "$defaultSchemas" extendable
Open, LowPublic

Description

So when an extension adds a new database type, it's impossible to add the new dbtype to this array, causing Notice: Undefined index: *** in .../includes/db/Database.php on line 876


Version: 1.24rc
Severity: normal

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 3:17 AM
bzimport set Reference to bz64365.
bzimport added a subscriber: Unknown Object (MLST).

skizzerz wrote:

There's definitely a change that needs to be made here (at the very least it needs to gracefully fall back to no schema if the $dbType isn't in the list without emitting an E_NOTICE for not finding the index), but I'm not certain that a config variable is necessary for this, for a couple of reasons.

  1. You can pass the desired schema in as part of the $options array to DatabaseBase::factory(), e.g. $dbw = DatabaseBase::factory( 'customdb', array( 'schema' => 'foobar' ) );
  2. By making it a configuration variable, that indicates that changing the default values for existing database types is supported, when in reality that is not the case (trying to set a schema for MySQL will end badly in all cases, and even though Postgres and Oracle support them, the MediaWiki drivers for them assume that there isn't one/it doesn't need to be specified so things may subtly break).

An alternative would be a new hook to define the default schema for a new database type, it would be passed in $dbType and &$schema (which allows for setting a single schema for that database, defaulting to null). This way we avoid the pitfalls of point 2 by not allowing the defaults for existing drivers to be changed, while also avoiding the extra boilerplate required in the solution presented in point 1 in the event a db object needs to be constructed multiple times.

Change 134562 had a related patch set uploaded by Skizzerz:
(bug 64365) Gracefully fail when getting the default schema if we get a non-standard db type instead of throwing notices. Also add a hook to allow extensions that add non-standard types to configure said default

https://gerrit.wikimedia.org/r/134562

Change 134562 had a related patch set uploaded by Krinkle:
database: Gracefully fail when non-standard dbType has no default schema

https://gerrit.wikimedia.org/r/134562

Change 134562 had a related patch set uploaded (by Liangent):
database: Gracefully fail when non-standard dbType has no default schema

https://gerrit.wikimedia.org/r/134562

Change 134562 abandoned by Krinkle:
database: Gracefully fail when non-standard dbType has no default schema

https://gerrit.wikimedia.org/r/134562

Krinkle renamed this task from $defaultSchemas is not available as a $wg to Make rdbms "$defaultSchemas" extendable .Jul 28 2018, 11:34 PM
Krinkle moved this task from Untriaged to Rdbms library on the MediaWiki-libs-Rdbms board.
Krinkle removed a project: Patch-For-Review.

This code now lives in MWLBFactory, where based on the private getDbTypesWithSchemas(), the DBmwschema config is passed to the specified IDatabase implementation.

The registry for IDatabase classes is open via DBtype, but the list of getDbTypesWithSchemas() is private.

Rather than exposing that internal array via a hook, I'd rather get rid of it entirely. Perhaps we can just pass it to Database::factory unconditionally and let some backends ignore it?

Krinkle lowered the priority of this task from Medium to Low.Jul 23 2019, 5:26 PM

IMO we should probably rethink the whole concept of "schemas" in MediaWiki. Starting with defining terms:

  • "section": A set of replicas of a server.
  • "server": A process running on the hardware or VM.
  • "database": A collection of schemas that you can access via one TCP connection / in one SQL query.
  • "schema": Separate namespaces within a database that can be accessed over one connection / in one SQL query.

MySQL has only one database per server. What it names "database" is actually a schema by these definitions, which is the root of all the confusion on this topic.

PostgreSQL has multiple databases per server. Unfortunately when someone originally implemented PG support they didn't realize that what MediaWiki was calling a "database" is actually what PG calls a schema, leading to the confusion on this topic.

SQLite doesn't have a server. Each file is basically one schema, and each schema you want to use needs to be explicitly opened (see https://www.sqlite.org/lang_naming.html).

I don't know about MSSQL and Oracle, and I don't want to spend the time researching them to work it out.

MediaWiki generally expects to work out of one schema at a time. The Database class basically represents a connection to a schema; we generally try not to do cross-schema queries to keep flexibility for scaling by moving schemas to different sections. LoadBalancer/LBFactory try to manage connections to databases by changing a spare Database object's default schema instead of reconnecting.

TL;DR: Everything MediaWiki calls a "schema" is bogus. We need to rework the PG abstraction layer to map MediaWiki's idea of a "database" to a PG schema, then we should be able to drop MediaWiki's idea of "schemas" entirely.