Cannot create a new wiki on beta cluster
Closed, Resolved · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

Log in to the Beta Cluster and try to add a new wiki.

mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org

What happens?:

System fails with an error

pmiazga@deployment-deploy03:~$ mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org
Creating database test2wiki for en.wikipedia (English)
Initialising tables
RuntimeException from line 2739 of /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php: Could not open "/srv/mediawiki-staging/php-master/extensions/Math/sql/mysql/mathoid.sql"
#0 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(119): Wikimedia\Rdbms\Database->sourceFile('/srv/mediawiki-...')
#1 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(810): Wikimedia\Rdbms\DBConnRef->__call('sourceFile', Array)
#2 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(310): Wikimedia\Rdbms\DBConnRef->sourceFile('/srv/mediawiki-...')
#3 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(128): AddWiki->createMainClusterSchema(Object(Wikimedia\Rdbms\DBConnRef), 'test2wiki', 'wikipedia')
#4 /srv/mediawiki-staging/php-master/maintenance/includes/MaintenanceRunner.php(698): AddWiki->execute()
#5 /srv/mediawiki-staging/php-master/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#6 /srv/mediawiki-staging/multiversion/MWScript.php(158): require_once('/srv/mediawiki-...')
#7 {main}

What should have happened instead?:

Wiki should be created.

Other information (browser name/version, screenshots, etc.):

The script fails because it cannot find the mathoid.sql file needed to initialize the new DB. This file was removed in T349442 by the patch "Remove explicit DB access".

I don't know whether this is only a Math extension issue or a problem with the addWiki script.

Event Timeline

As a quick solution, I think we could bring back the SQL files that were removed in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Math/+/975432

Another suggestion from @matmarex is to update the AddWiki script - after reading T349442 looks like those SQL files were removed intentionally.

I'll update the addWiki script to not parse those SQL files.

https://gerrit.wikimedia.org/g/mediawiki/extensions/WikimediaMaintenance/+/08e7e94988ed3ea542b54e7b9dc330bf3b042ec2/addWiki.php#310

pmiazga changed the task status from Open to In Progress.Feb 22 2024, 4:07 PM
pmiazga claimed this task.
pmiazga triaged this task as High priority.

Another suggestion from @matmarex is to update the AddWiki script - after reading T349442 looks like those SQL files were removed intentionally.

I'll update the addWiki script to not parse those SQL files.

https://gerrit.wikimedia.org/g/mediawiki/extensions/WikimediaMaintenance/+/08e7e94988ed3ea542b54e7b9dc330bf3b042ec2/addWiki.php#310

^^ That would be the better approach.

Change 1005533 had a related patch set uploaded (by Pmiazga; author: Pmiazga):

[mediawiki/extensions/WikimediaMaintenance@master] Remove Math tables from

https://gerrit.wikimedia.org/r/1005533

@Physikerwelt thanks for joining the conversation. I quickly created the ticket so I don't forget and started investigating this issue. We found out that those changes were intentional and we should update the add wiki script.

Change 1005533 merged by jenkins-bot:

[mediawiki/extensions/WikimediaMaintenance@master] Remove Math tables from

https://gerrit.wikimedia.org/r/1005533

After updating the addWiki script I was able to run it, but it kept failing due to a partially created DB. The previous run had already created some tables, so I had to manually drop the database.

Now it fails on another extension.

pmiazga@deployment-deploy03:~$ mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org
Creating database test2wiki for en.wikipedia (English)
Initialising tables
RuntimeException from line 2739 of /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php: Could not open "/srv/mediawiki-staging/php-master/extensions/Linter/sql/tables-generated.sql"
#0 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(119): Wikimedia\Rdbms\Database->sourceFile('/srv/mediawiki-...')
#1 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(810): Wikimedia\Rdbms\DBConnRef->__call('sourceFile', Array)
#2 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(315): Wikimedia\Rdbms\DBConnRef->sourceFile('/srv/mediawiki-...')
#3 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(128): AddWiki->createMainClusterSchema(Object(Wikimedia\Rdbms\DBConnRef), 'test2wiki', 'wikipedia')
#4 /srv/mediawiki-staging/php-master/maintenance/includes/MaintenanceRunner.php(698): AddWiki->execute()
#5 /srv/mediawiki-staging/php-master/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#6 /srv/mediawiki-staging/multiversion/MWScript.php(158): require_once('/srv/mediawiki-...')
#7 {main}

After updating the Add Wiki script - I was able to run the script - but it kept failing due to partially created DB. Previous run already created some tables so I had to manually drop the database.

I recommend running the addWiki script with something like --skipclusters=main (or similar; the full set of clusters is main,extstore,echo,growth,mediamoderation), which will cause the script to finish up, but skip initializing the DB. Then, you can manually load the tables from the right places via mwscript sql.php or similar, and update addWiki.php to work well for its next user. HTH!

It looks like Linter needs someone to run maintenance/generateSchemaSql.php against its sql/tables.sql source file and commit that to the repo.

@bd808 in Linter, the SQL script was moved to the sql/mysql folder to match how we have it in other extensions.

@Urbanecm_WMF thanks for the pointer, that's helpful. I checked the addWiki script for all remaining SQL files and it looks like they are in the right place. I'm going to submit another PR to fix the Linter SQL file location, and then hopefully it's going to work.
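
Both failures so far came from a schema file that no longer existed at the path addWiki.php expected, and each was discovered only after some tables had already been created. A preflight check along the following lines would list every missing file before touching the database. This is a minimal sketch, not code from addWiki.php: the helper name findMissingSchemaFiles() is hypothetical, and the new Linter path (sql/mysql/tables-generated.sql) is an assumption based on the comment above.

```php
<?php
// Hypothetical helper, not part of addWiki.php: return every schema file
// that cannot be read, so the run can abort before any table is created.
function findMissingSchemaFiles( array $paths ): array {
	return array_values( array_filter(
		$paths,
		static function ( string $path ): bool {
			return !is_readable( $path );
		}
	) );
}

// Illustrative usage: a temporary directory stands in for $IP/extensions.
$base = sys_get_temp_dir() . '/addwiki-preflight-' . uniqid();
mkdir( "$base/Linter/sql/mysql", 0777, true );
file_put_contents( "$base/Linter/sql/mysql/tables-generated.sql", "-- schema\n" );

$missing = findMissingSchemaFiles( [
	"$base/Linter/sql/mysql/tables-generated.sql", // assumed new location, present
	"$base/Linter/sql/tables-generated.sql",       // old location, absent
	"$base/Math/sql/mysql/mathoid.sql",            // removed upstream, absent
] );

foreach ( $missing as $path ) {
	echo "Missing schema file: $path\n";
}
```

Failing early like this avoids the partially created database that later had to be dropped by hand.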

Change 1005534 had a related patch set uploaded (by Pmiazga; author: Pmiazga):

[mediawiki/extensions/WikimediaMaintenance@master] Update location of Linter sql files

https://gerrit.wikimedia.org/r/1005534

Change 1005534 merged by jenkins-bot:

[mediawiki/extensions/WikimediaMaintenance@master] Update location of Linter sql files

https://gerrit.wikimedia.org/r/1005534

After all the fixes the script still fails, but this time on something new, which I think may be a result of running addWiki.php and dropping the database multiple times.

pmiazga@deployment-deploy03:~$ mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org
Creating database test2wiki for en.wikipedia (English)
Initialising tables
Initialising external storage cluster1...
Writing main page to Main_Page
Wikimedia\Rdbms\DBQueryError from line 1203 of /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php: Error 1049: Unknown database 'test2wiki'
Function: Wikimedia\Rdbms\DatabaseMySQL::doSelectDomain
Query: USE `test2wiki`

#0 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php(1187): Wikimedia\Rdbms\Database->getQueryException('Unknown databas...', 1049, 'USE `test2wiki`', 'Wikimedia\\Rdbms...')
#1 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php(1161): Wikimedia\Rdbms\Database->getQueryExceptionAndLog('Unknown databas...', 1049, 'USE `test2wiki`', 'Wikimedia\\Rdbms...')
#2 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DatabaseMySQL.php(204): Wikimedia\Rdbms\Database->reportQueryError('Unknown databas...', 1049, 'USE `test2wiki`', 'Wikimedia\\Rdbms...')
#3 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php(1514): Wikimedia\Rdbms\DatabaseMySQL->doSelectDomain(Object(Wikimedia\Rdbms\DatabaseDomain))
#4 /srv/mediawiki-staging/php-master/includes/libs/rdbms/loadbalancer/LoadBalancer.php(896): Wikimedia\Rdbms\Database->selectDomain(Object(Wikimedia\Rdbms\DatabaseDomain))
#5 /srv/mediawiki-staging/php-master/includes/libs/rdbms/loadbalancer/LoadBalancer.php(775): Wikimedia\Rdbms\LoadBalancer->reuseOrOpenConnectionForNewRef(2, Object(Wikimedia\Rdbms\DatabaseDomain), 0)
#6 /srv/mediawiki-staging/php-master/includes/libs/rdbms/loadbalancer/LoadBalancer.php(767): Wikimedia\Rdbms\LoadBalancer->getServerConnection(2, 'test2wiki', 0)
#7 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(103): Wikimedia\Rdbms\LoadBalancer->getConnectionInternal(-1, Array, 'test2wiki', 0)
#8 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(117): Wikimedia\Rdbms\DBConnRef->ensureConnection()
#9 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(369): Wikimedia\Rdbms\DBConnRef->__call('selectRow', Array)
#10 /srv/mediawiki-staging/php-master/includes/libs/rdbms/querybuilder/SelectQueryBuilder.php(771): Wikimedia\Rdbms\DBConnRef->selectRow(Array, Array, Array, 'MediaWiki\\User\\...', Array, Array)
#11 /srv/mediawiki-staging/php-master/includes/user/User.php(835): Wikimedia\Rdbms\SelectQueryBuilder->fetchRow()
#12 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(211): MediaWiki\User\User::newSystemUser('Maintenance scr...', Array)
#13 /srv/mediawiki-staging/php-master/maintenance/includes/MaintenanceRunner.php(698): AddWiki->execute()
#14 /srv/mediawiki-staging/php-master/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#15 /srv/mediawiki-staging/multiversion/MWScript.php(158): require_once('/srv/mediawiki-...')
#16 {main}
pmiazga@deployment-deploy03:~$

@Urbanecm_WMF do you know what could be wrong now? It fails when selecting the test2wiki database. I'm using aawiki, so I assume that something uses a different DB which doesn't have the test2wiki database. We don't have multiple shards on beta, so I'm a bit lost on what's causing this issue.

Did you try with the --skipclusters option that @Urbanecm_WMF suggested in T358236#9569124?

I'm also confused about this message in the output: Creating database test2wiki for en.wikipedia (English). What does that mean? Why would there be a test2wiki database for en.wikipedia?

@kostajh nope, we didn't use --skipclusters. The first run failed with a missing migration; we fixed the script and executed it again, but the second execution couldn't get through, so we dropped test2wiki on the beta cluster DB (db11).

The --skipclusters option came up later, after the drop was already executed.

For the log: we're creating a new test2wiki env on the beta cluster. This env wasn't created due to numerous issues with the addWiki.php script. Now we're trying to solve the broken replication.

Recreating a new wiki from scratch fails with:

pmiazga@deployment-deploy03:~$ mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org
Creating database test2wiki for en.wikipedia (English)
Initialising tables
Initialising external storage cluster1...
Wikimedia\Rdbms\DBLanguageError from line 87 of /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Query.php: Wikimedia\Rdbms\Query::isWriteQuery called with incorrect flags parameter
#0 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php(678): Wikimedia\Rdbms\Query->isWriteQuery()
#1 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/Database.php(643): Wikimedia\Rdbms\Database->executeQuery(Object(Wikimedia\Rdbms\Query), 'ExternalStoreDB...', 8)
#2 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(119): Wikimedia\Rdbms\Database->query(Object(Wikimedia\Rdbms\Query), 'ExternalStoreDB...', 8)
#3 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(302): Wikimedia\Rdbms\DBConnRef->__call('query', Array)
#4 /srv/mediawiki-staging/php-master/includes/externalstore/ExternalStoreDB.php(266): Wikimedia\Rdbms\DBConnRef->query('-- Blobs table ...', 'ExternalStoreDB...', 8)
#5 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(419): ExternalStoreDB->initializeTable('cluster1')
#6 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(188): AddWiki->createExternalStoreClusterSchema('test2wiki', Object(Wikimedia\Rdbms\LBFactoryMulti))
#7 /srv/mediawiki-staging/php-master/maintenance/includes/MaintenanceRunner.php(698): AddWiki->execute()
#8 /srv/mediawiki-staging/php-master/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#9 /srv/mediawiki-staging/multiversion/MWScript.php(158): require_once('/srv/mediawiki-...')
#10 {main}

That part worked earlier. After a quick check, it looks like it's new code introduced in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/995371.

@aaron @Ladsgroup could you advise on what to do here? The addWiki.php script uses the ExternalStoreDB to create tables. ExternalStoreDB::initializeTable uses the SQL file to create the schema.

ExternalStoreDB calls $dbw->query() with QUERY_IGNORE_DBO_TRX:

		$rawTable = $this->getTable( $dbw, $cluster ); // e.g. "blobs_cluster23"
		$encTable = $dbw->tableName( $rawTable );
		$dbw->query(
			str_replace(
				[ '/*$wgDBprefix*/blobs', '/*_*/blobs' ],
				[ $encTable, $encTable ],
				$sql
			),
			__METHOD__,
			$dbw::QUERY_IGNORE_DBO_TRX
		);

But it looks like isWriteQuery() doesn't check for QUERY_IGNORE_DBO_TRX; none of the flags match, so it throws an exception (https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/b0d1abbcbfe10d210c10003b0b08bd1255186ba0/includes/libs/rdbms/database/Query.php#69).

Probably the query that creates the table needs a write flag, but I want to confirm with you first.

Change 1011145 had a related patch set uploaded (by Pmiazga; author: Pmiazga):

[mediawiki/core@master] externalstore: use default flags when initializing tables

https://gerrit.wikimedia.org/r/1011145

The logic of handling an SQL string in Database::query() looks solid to me, but...

#2 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(119): Wikimedia\Rdbms\Database->query(Object(Wikimedia\Rdbms\Query), 'ExternalStoreDB...', 8)
#3 /srv/mediawiki-staging/php-master/includes/libs/rdbms/database/DBConnRef.php(302): Wikimedia\Rdbms\DBConnRef->__call('query', Array)
#4 /srv/mediawiki-staging/php-master/includes/externalstore/ExternalStoreDB.php(266): Wikimedia\Rdbms\DBConnRef->query('-- Blobs table ...', 'ExternalStoreDB...', 8)

...it looks like a query string is transformed into a Query object in the process of a simple call proxying, and Database::query() is already receiving an object (presumably with invalid flags)? No idea what's going on there.

@Tgr Now, if the flags don't match one of SQLPlatform::QUERY_CHANGE_NONE, SQLPlatform::QUERY_CHANGE_TRX, SQLPlatform::QUERY_CHANGE_LOCKS, SQLPlatform::QUERY_CHANGE_ROWS, SQLPlatform::QUERY_CHANGE_SCHEMA, or SQLPlatform::QUERY_PSEUDO_PERMANENT, we throw an exception,

where previously we were doing a fallback:

return QueryBuilderFromRawSql::buildQuery( $this->sql, 0 )->isWriteQuery();

@aaron suggested using the SQLPlatform::QUERY_CHANGE_SCHEMA flag in ExternalStoreDB, which totally makes sense. I'm going to check whether the other queries in the addWiki.php script are fine, and once the PR gets merged I'm going to run the script again.
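
To make the difference between the old fallback and the new strict check concrete, here is a simplified, self-contained model. It is a sketch under stated assumptions, not MediaWiki's implementation: the function names are hypothetical, the regular expression is a stand-in for whatever QueryBuilderFromRawSql actually matches, and the constant values are illustrative (only 8 for QUERY_IGNORE_DBO_TRX is taken from the flags value visible in the stack trace).

```php
<?php
// Simplified model of the flag check discussed above; not MediaWiki's code.
// 8 matches the flags value in the stack trace; the other values are
// illustrative bit masks chosen for this sketch.
const QUERY_IGNORE_DBO_TRX = 8;
const QUERY_CHANGE_NONE = 32;
const QUERY_CHANGE_ROWS = 128;
const QUERY_CHANGE_SCHEMA = 256;

// Old behaviour (modelled): unknown flags fall back to inspecting the SQL text.
function isWriteQueryWithFallback( string $sql, int $flags ): bool {
	if ( $flags & QUERY_CHANGE_NONE ) {
		return false;
	}
	if ( $flags & ( QUERY_CHANGE_ROWS | QUERY_CHANGE_SCHEMA ) ) {
		return true;
	}
	// Assumed heuristic: anything that is not an obvious read counts as a write.
	return !preg_match( '/^\s*(SELECT|SHOW|EXPLAIN|USE)\b/i', $sql );
}

// New behaviour (modelled): flags that name no change type are rejected.
function isWriteQueryStrict( int $flags ): bool {
	if ( $flags & QUERY_CHANGE_NONE ) {
		return false;
	}
	if ( $flags & ( QUERY_CHANGE_ROWS | QUERY_CHANGE_SCHEMA ) ) {
		return true;
	}
	throw new InvalidArgumentException( 'isWriteQuery called with incorrect flags parameter' );
}

$sql = "-- Blobs table\nCREATE TABLE blobs_cluster1 (blob_id INT)";

// With the fallback, QUERY_IGNORE_DBO_TRX alone is tolerated.
$fallbackResult = isWriteQueryWithFallback( $sql, QUERY_IGNORE_DBO_TRX );

// Without it, the same flags throw.
$threw = false;
try {
	isWriteQueryStrict( QUERY_IGNORE_DBO_TRX );
} catch ( InvalidArgumentException $e ) {
	$threw = true;
}

// Adding QUERY_CHANGE_SCHEMA, as suggested, satisfies the strict check.
$fixedResult = isWriteQueryStrict( QUERY_IGNORE_DBO_TRX | QUERY_CHANGE_SCHEMA );

var_dump( $fallbackResult, $threw, $fixedResult );
```

In this model, combining the two flags is enough to avoid the exception, which mirrors the intent of the proposed ExternalStoreDB fix.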

Database::query() uses QueryBuilderFromRawSql::buildQuery() to convert raw SQL strings to query objects, and that method will always add either QUERY_CHANGE_NONE or QUERY_CHANGE_ROWS. But that stack trace doesn't match that happening - Database::query() already receives a Query object, somehow. Manually overriding the flags might paper over the problem, which might not be the best approach here. What if the Query object is corrupted in other ways as well?

@Tgr In ExternalStoreDB::initializeTable() we were passing QUERY_IGNORE_DBO_TRX only, which anyway was incorrect - the schema initialization code should pass QUERY_CHANGE_SCHEMA which is also a flag that isWriteQuery() would consider.

Because ExternalStoreDB::initializeTable() was passing only QUERY_IGNORE_DBO_TRX, this didn't match any of the checks in isWriteQuery(), which caused the exception to be thrown.

This used to work because, if no flags matched, the system fell back to QueryBuilderFromRawSql::isWriteQuery(), which doesn't check flags but does a preg_match to detect whether the query is doing any updates.

Right now, we call Database::query() and we pass $flags (see my fix https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1011145/2/includes/externalstore/ExternalStoreDB.php).

The place where string $sql magically becomes a Query object is here - https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/b5c913fd00760f712e98df81351e4a75f1374be1/includes/libs/rdbms/database/Database.php#634 - which builds the Query object with $flags (which come from ExternalStoreDB::initializeTable()).

Change 1011304 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/core@master] Fix QueryBuilderFromRawSql::buildQuery()

https://gerrit.wikimedia.org/r/1011304

Change 1011309 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/core@master] rdbms: Improve QueryBuilderFromRawSql flag logic

https://gerrit.wikimedia.org/r/1011309

@Tgr In ExternalStoreDB::initializeTable() we were passing QUERY_IGNORE_DBO_TRX only, which anyway was incorrect - the schema initialization code should pass QUERY_CHANGE_SCHEMA which is also a flag that isWriteQuery() would consider.

IDatabase::query() doesn't require change flags to be passed (it doesn't really require anything, as the parameter is mostly undocumented, but other calls don't pass change flags either), so that wasn't really incorrect. The logic setting those flags was just buggy. (Although ideally we'd pass a Query object in the first place.)

The place where string $sql magically becomes a Query object is here - https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/b5c913fd00760f712e98df81351e4a75f1374be1/includes/libs/rdbms/database/Database.php#634 - which builds the Query object with $flags (which come from ExternalStoreDB::initializeTable()).

That doesn't match the stack trace which says Database::query() was already called with a Query object.

Change 1011145 merged by jenkins-bot:

[mediawiki/core@master] externalstore: use QUERY_CHANGE_SCHEMA when initializing tables

https://gerrit.wikimedia.org/r/1011145

Change 1011304 merged by jenkins-bot:

[mediawiki/core@master] Fix QueryBuilderFromRawSql::buildQuery()

https://gerrit.wikimedia.org/r/1011304

Although ideally we'd pass a Query object in the first place.

Yes, any call to Database::query should send a Query object, not raw SQL. That's the whole idea. If you send raw SQL, you end up in the buggy radioactive swamp that's called QueryBuilderFromRawSql.

Even more ideally, it shouldn't call ::query() directly.

Change #1011309 merged by jenkins-bot:

[mediawiki/core@master] rdbms: Improve QueryBuilderFromRawSql flag logic

https://gerrit.wikimedia.org/r/1011309

Change #1015038 had a related patch set uploaded (by Pmiazga; author: Pmiazga):

[mediawiki/core@master] externalstore: Pass Query object when initializing tables

https://gerrit.wikimedia.org/r/1015038

Even more ideally, it shouldn't call ::query() directly.

I tried not to call ::query() directly and to use sourceStream() instead:

$sqlFilePath = "$IP/maintenance/storage/blobs.sql";
$handler = fopen( $sqlFilePath, 'r');
if ( $handler === false ) {
	throw new RuntimeException( "Failed to read '$sqlFilePath'." );
}

$rawTable = $this->getTable( $dbw, $cluster ); // e.g. "blobs_cluster23"
$encTable = $dbw->tableName( $rawTable );
$dbw->sourceStream(
	$handler,
	null,
	null,
	__METHOD__,
	function( $sql ) use ( $encTable ) {
		return str_replace(
			[ '/*$wgDBprefix*/blobs', '/*_*/blobs' ],
			[ $encTable, $encTable ],
			$sql
		);
	}
);
fclose( $handler );

But this doesn't work, as the $inputCallback callback is executed after variable replacement ($this->platform->replaceVars( $cmd );), so by the time the callback runs, the /*$wgDBprefix*/ and /*_*/ markers have already been stripped from $sql and the str_replace() has nothing left to match.
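
The ordering problem can be reproduced in isolation. The sketch below is a simplified model, not the rdbms code: replaceVarsModel() and sourceStatementModel() are hypothetical stand-ins that only mimic the order of operations described above (variable replacement first, callback second), and the table prefix is assumed to be empty.

```php
<?php
// Simplified stand-in for the variable replacement step: placeholder
// comments are swapped for the table prefix (empty in this sketch).
function replaceVarsModel( string $sql, string $prefix = '' ): string {
	return str_replace( [ '/*$wgDBprefix*/', '/*_*/' ], $prefix, $sql );
}

// Model of the ordering described above: variables are replaced first,
// and only then is the caller's callback applied.
function sourceStatementModel( string $sql, callable $inputCallback ): string {
	return $inputCallback( replaceVarsModel( $sql ) );
}

$encTable = '`blobs_cluster1`';
$rename = static function ( string $sql ) use ( $encTable ): string {
	return str_replace(
		[ '/*$wgDBprefix*/blobs', '/*_*/blobs' ],
		[ $encTable, $encTable ],
		$sql
	);
};

$statement = 'CREATE TABLE /*_*/blobs (blob_id INT)';

// Callback after replacement: the marker is gone, so nothing is renamed.
$afterReplaceVars = sourceStatementModel( $statement, $rename );

// Callback before replacement: the marker is still there and gets renamed.
$beforeReplaceVars = replaceVarsModel( $rename( $statement ) );

echo $afterReplaceVars, "\n";
echo $beforeReplaceVars, "\n";
```

In the first case the table keeps the generic name blobs, which is exactly why the callback approach could not produce a per-cluster table name.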

I feel I fell into a rabbit hole, so for now I decided to leave the ExternalStoreDB initialization the way it is. The small fix I proposed is to pass the Query object instead of raw SQL; tested locally, it looks like it works. For now we will have to live with it calling ::query() directly.

Let me know once you can create a new wiki so I can start creating some new ones in production

Change #1015038 merged by jenkins-bot:

[mediawiki/core@master] externalstore: Pass Query object when initializing tables

https://gerrit.wikimedia.org/r/1015038

Today I executed the script once again. It looks like the DB creation is finally successful, but the script failed again.

pmiazga@deployment-deploy03:~$ mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki --skipclusters=main,echo,growth,mediamoderation en wikipedia test2wiki test2.wikipedia.beta.wmcloud.org
Creating database test2wiki for en.wikipedia (English)
Initialising external storage cluster1...
Wikimedia\Services\CannotReplaceActiveServiceException from line 278 of /srv/mediawiki-staging/php-master/vendor/wikimedia/services/src/ServiceContainer.php: Cannot replace an active service: RevisionStore
#0 /srv/mediawiki-staging/php-master/extensions/WikimediaMaintenance/addWiki.php(198): Wikimedia\Services\ServiceContainer->redefineService('RevisionStore', Object(Closure))
#1 /srv/mediawiki-staging/php-master/maintenance/includes/MaintenanceRunner.php(698): AddWiki->execute()
#2 /srv/mediawiki-staging/php-master/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#3 /srv/mediawiki-staging/multiversion/MWScript.php(158): require_once('/srv/mediawiki-...')
#4 {main}

@Ladsgroup you worked on https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaMaintenance/+/539147/1/addWiki.php#158 - do you have any idea why this stopped working now?

I found some recent changes to RevisionStore and ServiceWirings that may cause this issue - https://gerrit.wikimedia.org/r/c/mediawiki/core/+/979991

I managed to run this script locally in a docker env, but there it gets past that line, so there has to be something initialising the RevisionStore in the configs.

I came up with this ugly hack (copied over from the Installer source file):

		// T212881: Redefine the RevisionStore service to explicitly use the new DB name.
		// Otherwise, ExternalStoreDB would be instantiated with an implicit database domain,
		// causing it to use the DB name of the wiki the script is running on due to T200471.
		// T358236: Something on BetaCluster causes RevisionStore to be already initialized,
		// therefore we need to reset entire container.
		MediaWikiServices::resetGlobalInstance();
		$services = MediaWikiServices::getInstance();
		$services->redefineService(
			'RevisionStore',
			static function ( MediaWikiServices $services ) use ( $dbName ): RevisionStore {
				return $services->getRevisionStoreFactory()->getRevisionStore( $dbName );
			}
		);

(see the resetGlobalInstance() line and the ones after it). It's ugly, but most likely it's going to fix the problem. @daniel what do you think?

Change #1016370 had a related patch set uploaded (by Pmiazga; author: Pmiazga):

[mediawiki/extensions/WikimediaMaintenance@master] Reset entire container after initializing the Database

https://gerrit.wikimedia.org/r/1016370

The most recent error is caused by Echo and Thanks extensions.

  1. During initialisation, Setup.php is iterating over $wgExtensionFunctions to initialize all extensions.
  2. That executes the Echo extension's Echo/Hooks::initEchoExtension(), which later on triggers the BeforeCreateEchoEvent hook.
  3. HookRunner is initialising all handlers that listen to BeforeCreateEchoEvent. We have 12 listeners on BetaCluster.
  4. Thanks extension main hook listener listens to BeforeCreateEchoEvent, therefore, gets initialised.
  5. This listener requires the RevisionLookup service which is defined as RevisionStore - this leads to RevisionStore initialisation even before the addWiki.php gets executed.
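
The chain above boils down to one rule: a lazily created service can be redefined only until something asks for it. The following sketch models that rule with a tiny container. It is an illustration under stated assumptions, not Wikimedia\Services\ServiceContainer: the class TinyContainer and the string "services" are hypothetical, and only the error message is taken from the log above.

```php
<?php
// Minimal model of a lazy service container; not Wikimedia\Services code.
class TinyContainer {
	/** @var callable[] Factories keyed by service name */
	private array $factories = [];
	/** @var array Instantiated ("active") services keyed by name */
	private array $instances = [];

	public function defineService( string $name, callable $factory ): void {
		$this->factories[$name] = $factory;
	}

	public function getService( string $name ) {
		// First access runs the factory and makes the service active.
		if ( !isset( $this->instances[$name] ) ) {
			$this->instances[$name] = ( $this->factories[$name] )( $this );
		}
		return $this->instances[$name];
	}

	public function redefineService( string $name, callable $factory ): void {
		if ( isset( $this->instances[$name] ) ) {
			throw new RuntimeException( "Cannot replace an active service: $name" );
		}
		$this->factories[$name] = $factory;
	}
}

$forAawiki = static function () {
	return 'store for aawiki';
};
$forTest2wiki = static function () {
	return 'store for test2wiki';
};

// Case 1: a hook handler with an injected dependency is built during setup,
// which instantiates the service before the maintenance script runs.
$container = new TinyContainer();
$container->defineService( 'RevisionStore', $forAawiki );
$container->getService( 'RevisionStore' );

$error = null;
try {
	$container->redefineService( 'RevisionStore', $forTest2wiki );
} catch ( RuntimeException $e ) {
	$error = $e->getMessage();
}
echo $error, "\n";

// Case 2: nothing touched the service yet, so redefining it works.
$fresh = new TinyContainer();
$fresh->defineService( 'RevisionStore', $forAawiki );
$fresh->redefineService( 'RevisionStore', $forTest2wiki );
$freshStore = $fresh->getService( 'RevisionStore' );
echo $freshStore, "\n";
```

Case 1 corresponds to the Beta Cluster run, where an extension hook handler pulled in the service early; case 2 corresponds to the local docker run, where nothing had requested it yet.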

The change that introduced service injection is pretty recent (March 15th): "Inject services in Hooks and MobileFrontendHandler".

There are three ways to solve this problem:
a) Fix addWiki.php to reset the entire container and override configs. This is the most future-proof option.
b) Fix the Thanks extension: introduce a second EchoHook handler for the BeforeCreateEchoEvent hook, as this hook has no dependencies. This should fix the current scenario but won't protect us from similar issues in the future.
c) Revert Idff34ebce914ad37bcaea8de04b3ef5e01d7d98d. Not ideal; it's a pretty good change and we should do service injection everywhere. If we decide to work on Thanks, I prefer option b).

@daniel @Krinkle any thoughts? I assume the safest would be to do a) and b), and there is no need to initialize multiple services so early ( MainConfig, GenderCache, PermissionManager, RevisionLookup, UserFactory, UserOptionsManager ).

Change #1017067 had a related patch set uploaded (by Pmiazga; author: Pmiazga):

[mediawiki/extensions/Thanks@master] Move Echo hooks to new EchoHooks handler that has no dependencies

https://gerrit.wikimedia.org/r/1017067

Change #1017067 merged by jenkins-bot:

[mediawiki/extensions/Thanks@master] Move Echo hooks to new EchoHooks handler that has no dependencies

https://gerrit.wikimedia.org/r/1017067

The addWiki.php script execution was finally successful.

Let me know once you can create a new wiki so I can start creating some new ones in production

@Ladsgroup It works now. There is still a question of whether we want to make the addWiki.php more future-proof and reset the container with db override.

Change #1016370 abandoned by Pmiazga:

[mediawiki/extensions/WikimediaMaintenance@master] Reset entire container after initializing the Database

Reason:

in favour of I0e348872ba5dc313325e3f4f296fd84bfb2c785b - went with short term fix and made Thanks extension not break previous flow. For long-term support we might need to revisit this.

https://gerrit.wikimedia.org/r/1016370