Page MenuHomePhabricator

Implement support for IDatabase::ALL_ROWS in UpdateQueryBuilder
Closed, ResolvedPublic

Description

As far as I can tell, a query that updates all rows cannot be changed to use UpdateQueryBuilder. Take this example from EntitySchema’s SqlIdGenerator (operating on a table with only one row):

pre-UQB code
$success = $database->update(
	$this->tableName,
	[ 'id_value' => $id ],
	IDatabase::ALL_ROWS,
	__METHOD__
);

This replacement doesn’t work:

where( ALL_ROWS )
$success = $database->newUpdateQueryBuilder()
	->update( $this->tableName )
	->set( [ 'id_value' => $id ] )
	->where( $database::ALL_ROWS )
	->caller( __METHOD__ )->execute()

Wikimedia\Rdbms\DBQueryError: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '*)' at line 1
Function: EntitySchema\DataAccess\SqlIdGenerator::generateNewId
Query: UPDATE unittest_entityschema_id_counter SET id_value = 2 WHERE (*)

This is because UpdateQueryBuilder::where() always makes $this->conds an array, but ISQLPlatform::ALL_ROWS ('*') must be passed into $conds directly, not as an array element.

Note that omitting the ->where() call is also not allowed:

no where()
$success = $database->newUpdateQueryBuilder()
	->update( $this->tableName )
	->set( [ 'id_value' => $id ] )
	->caller( __METHOD__ )->execute();

UnexpectedValueException: Wikimedia\Rdbms\UpdateQueryBuilder::execute expects at least one condition to be set

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

It turns out that SelectQueryBuilder also doesn’t support ->where( $dbr::ALL_ROWS ), though it’s not as much of a problem there because you can just omit the ->where() call.

I don't recall if the omission was intentional by @tstarling and @Ladsgroup. It do think it stands out as suspicious so if the transition to the Builder pattern can serve as a carrot to improve code to not do this, I think that's worthwhile a few temporary exemptions. It depends on whether they'd be temporary.

The cited example from Wikibase SqlIdGenerator does not look to me like it needs to update all or even multiple rows. It fetches a single row from an unsorted select query, and then makes an unbounded update. It also locks the rows in question. I'm guessing this is a single row table in practice. If so, then this query seems to be feeding uncertainty. Could this not update based on the increment value it selected? Even better would be to also select a primary key and update explicitly by that.

Note: I'm aware of Wikibase SqlIdGenerator having been subject of much contention and iteration, so I don't expect the above to be a practical suggestion, rather I'm aiming to understand it through a straw example.

Yeah to me it really feels like it should be avoided and I see it as a good opportunity to discourage it. You can set the id_type in the where condition?

You’re both talking about the wrong extension / class. EntitySchema’s SqlIdGenerator works on a database table that only has one row, and doesn’t have an id_type column. The id_value is the only column in the table – there’s nothing else we even could use to address any specific row.

I also feel there’s a logical inconsistency here. If queries aren’t supposed to ever operate on all rows, then there shouldn’t be a dedicated constant, ISQLPlatform::ALL_ROWS, specifically for addressing all rows; conversely, since that constant exists, I expect it’s understood that there are situations, as rare as they may be, where this is a useful thing to do, and that the rdbms library features proper support for this mode.

You’re both talking about the wrong extension / class. EntitySchema’s SqlIdGenerator works on a database table that only has one row, and doesn’t have an id_type column. The id_value is the only column in the table – there’s nothing else we even could use to address any specific row.

Ah that mess. I remember now. Any plans to move it to actually the correct table? having one table with one column and one row is wrong... Maybe wikibase repo could provide an interface to allow setting this in the correct table?

I also feel there’s a logical inconsistency here. If queries aren’t supposed to ever operate on all rows, then there shouldn’t be a dedicated constant, ISQLPlatform::ALL_ROWS, specifically for addressing all rows; conversely, since that constant exists, I expect it’s understood that there are situations, as rare as they may be, where this is a useful thing to do, and that the rdbms library features proper support for this mode.

That constant is an artifact of the past. I would like to eventually get rid of it but it takes time.

Just ran into this again while trying to convert

class ChangeDeletionNotificationJobTest extends RecentChangesModificationTest {

	private function countRevisions( array $conditions = null ): int {
		return $this->db->selectRowCount(
			'recentchanges',
			'*',
			$conditions,
			__METHOD__
		);
	}

to

		return $this->db->newSelectQueryBuilder()
			->table( 'recentchanges' )
			->where( $conditions ?? $this->db::ALL_ROWS )
			->caller( __METHOD__ )->fetchRowCount();

which doesn’t work because ALL_ROWS isn’t supported as a ->where() condition. (I can’t just pass in $conditions either because it’s nullable and ->where( null ) also doesn’t do the right thing.) Instead, I have to do something much uglier:

		$selectQueryBuilder = $this->db->newSelectQueryBuilder();
		$selectQueryBuilder->table( 'recentchanges' );
		if ( $conditions !== null ) {
			$selectQueryBuilder->where( $conditions );
		}
		return $selectQueryBuilder->caller( __METHOD__ )->fetchRowCount();

(Edit: the same issue also exists in DispatchChangeDeletionNotificationJobTest::countArchive().)

Krinkle renamed this task from UpdateQueryBuilder does not support IDatabase::ALL_ROWS updates to Implement support for IDatabase::ALL_ROWS in UpdateQueryBuilder.Jun 2 2023, 8:59 PM

@Lucas_Werkmeister_WMDE Regarding SelectQueryBuilder, I believe the signature of SelectQueryBuilder::where() should already prevent use of null from passing static validation, is that right? If I understand correctly, fetchRowCount() and SelectQueryBuilder already work correctly in your example when skipping where.

I agree skipping a chainable method call is suboptimal, but I don't see a way to support that without a compromise. E.g. making the method nullable would introduce a significant surface for mistakes (footgun) with surprising side-effects that may be difficult to detect prior to affecting production.

It seems unlikely though that the same code would be interested in both a full table row count, and the count of a specific query, yet still be abstracted, and not have other conditionals, or other reasons to separate concerns in a different way. In this case, it seems like the conditional is helpful to signify that this can in fact produce two very different queries.

Regarding UpdateQueryBuilder, the task description mentions a case where adoption of the query builder is actually not possible in Wikibase's SqlIdGenerator. That's a more pressing issue in my view. My previous comment suggests there is a way to make this work since SqlIdGenerator does not actually need to operate on multiple rows. Do I understand that correctly?

Unless we find cases in prod code where ALL_ROWS is actually needed, I tend to agree with Amir that it may be best to not bring this feature along to the new query builders. The direct query methods will remain for a long time (or perhaps indefinitely), and that seems to be a good way to incentive the handful of callers to re-evaluate their code to find a more explicit way to performt their query whenever possible.

I agree skipping a chainable method call is suboptimal, but I don't see a way to support that without a compromise. E.g. making the method nullable would introduce a significant surface for mistakes (footgun) with surprising side-effects that may be difficult to detect prior to affecting production.

I’m not asking for the argument to be nullable, I’m asking for it to support the existing IDatabase::ALL_ROWS constant. In my opinion, the line ->where( $conditions ?? $this->db::ALL_ROWS ) has an obvious meaning to any human reader, and that meaning is not “produce an inexplicable SQL syntax error when $conditions is null”.

@Ladsgroup’s Migrate most calls of Database::delete to DeleteQueryBuilder also ran into this issue (SQLBagOStuff, PS4), because DeleteQueryBuilder has the same problem:

@@ -1514,7 +1514,10 @@ public function deleteAll() {
 			try {
 				$db = $this->getConnection( $shardIndex );
 				for ( $i = 0; $i < $this->numTableShards; $i++ ) {
-					$db->delete( $this->getTableNameByShard( $i ), '*', __METHOD__ );
+					$db->newDeleteQueryBuilder()
+						->delete( $this->getTableNameByShard( $i ) )
+						->where( '*' )
+						->caller( __METHOD__ )->execute();
 				}
 			} catch ( DBError $e ) {
 				$this->handleDBError( $e, $shardIndex );

Change 932193 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/core@master] rdbms: Support ISQLPlatform::ALL_ROWS in query builders

https://gerrit.wikimedia.org/r/932193

It strikes me that everything would be so much simpler if the actual value of the ALL_ROWS constant was something that evaluates to true in SQL, e.g. 1=1, instead of something that results in invalid syntax. You wouldn't need to special case it in any way, it would just work.

It strikes me that everything would be so much simpler if the actual value of the ALL_ROWS constant was something that evaluates to true in SQL, e.g. 1=1, instead of something that results in invalid syntax. You wouldn't need to special case it in any way, it would just work.

I’m not convinced this would be an improvement. That way, the following code would “work”:

$dbw->newDeleteQueryBuilder() // or UpdateQueryBuilder
	->delete( 'some_table' )
	->where( $dbw::ALL_ROWS )
	->andWhere( [ 'some_column' => 1 ] )
	->caller( __METHOD__ )->execute();

but the “all rows” would be misleading in that case. I think it’s better if this code produces some kind of error, and keeping ALL_ROWS = '*' is a relatively simple way to achieve that (because WHERE (*) AND some_column = 1 is a syntax error).

Change 934994 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] rdbms: Add support for ALL_ROWS in two query builders

https://gerrit.wikimedia.org/r/934994

Change 934994 merged by jenkins-bot:

[mediawiki/core@master] rdbms: Add support for ALL_ROWS in two query builders

https://gerrit.wikimedia.org/r/934994

Krinkle assigned this task to Ladsgroup.

Change 932193 abandoned by Lucas Werkmeister (WMDE):

[mediawiki/core@master] rdbms: Support ISQLPlatform::ALL_ROWS in query builders

Reason:

https://gerrit.wikimedia.org/r/932193

It turns out that SelectQueryBuilder also doesn’t support ->where( $dbr::ALL_ROWS ), though it’s not as much of a problem there because you can just omit the ->where() call.

Would be nice to fix this as well, though (the cases listed in T332329#8898506 used SQB). WDYT?

(Uploaded core and EntitySchema changes to take advantage of the new support and convert some code to DQB/UQB.)

It turns out that SelectQueryBuilder also doesn’t support ->where( $dbr::ALL_ROWS ), though it’s not as much of a problem there because you can just omit the ->where() call.

Would be nice to fix this as well, though (the cases listed in T332329#8898506 used SQB). WDYT?

⇒ T341870: Support IDatabase::ALL_ROWS in SelectQueryBuilder

(I tried putting a change for this a few days ago but didn’t upload it – I think I wasn’t sure where best to make the change and thought it was better for the rdbms maintainers to make that decision.)