Page MenuHomePhabricator

Deprecate Database::nextSequenceValue()
Closed, ResolvedPublic

Description

After T164898, only Oracle will require Database::nextSequenceValue() to be called prior to inserting a row into a table with an auto-increment column. I propose emulating MySQL's behaviour in a trigger. The trigger could store the sequence name or value somewhere so that it can be retrieved in insertId() without the caller having to know the sequence name. Or Database::insert() could store the last-used table name, and trigger names could be derivable from table names. Then the CURRVAL pseudocolumn could be used.

Then Database::nextSequenceValue() can be deprecated. There will be no remaining references to sequences and sequence names except as implementation details hidden inside Database subclasses.

Event Timeline

Then Database::nextSequenceValue() can be deprecated.

Wow, this will break our extension [0] at first sight.

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/13c068907df6e7a2a44358e25513c5e7654cdd33/includes/storage/SQLStore/SMW_Sql3SmwIds.php#L894

PS: I wish WMF would make more thorough investigations before throwing around deprecation.

It's just a proposal. This is the start of the investigation, thank you for helping.

No, this is fine, you are not depending on nextSequenceValue(). Actually this proposed change would have avoided the referenced bug (T44659). Once the dependencies are met, that call to nextSequenceValue(), like all others, can trivially be removed.

No, this is fine, you are not depending on nextSequenceValue

I think we are depending on it for Postgres [0] with IDs easily causing "duplicate key value violates unique constraint" (if I remember correctly) [0] therefore I'm a bit hesitant to agree with your position yet and as we are currently struggling to run our test suite on 1.29+/Postgres (due to T149454) any change to this method or Postgres should be accompanied by a pool of integration tests (not unit tests).

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/src/SQLStore/TableBuilder/PostgresTableBuilder.php#L361-L375

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/src/SQLStore/TableBuilder/PostgresTableBuilder.php#L361-L375

That's fine too, it will continue to work. T164898 is probably more relevant for you, this task is mostly about Oracle support. PostgreSQL will continue to use sequences, and you can reference them in PostgreSQL-specific queries if you wish to do so. The point of T164898 is to harmonize PostgreSQL with MySQL, so that extension developers will not have to work so hard to achieve PostgreSQL support. The PostgreSQL schema's sequences will continue to exist, but they will be transparent, requiring no developer awareness.

this task is mostly about Oracle support. PostgreSQL will continue to use sequences, and you can reference them in PostgreSQL-specific queries if you wish to do so.

Please clarify, when you talk about Database::nextSequenceValue as being "can be deprecated" then you mean the inner workings in regards to Oracle but the method as such remains part of the interface, right? Because in your initial proposal you sounded like to deprecate the method Database::nextSequenceValue as a whole which would implicitly mean to be removed at one point and certainly break extensions hence my initial opposition.

Please clarify, when you talk about Database::nextSequenceValue as being "can be deprecated" then you mean the inner workings in regards to Oracle but the method as such remains part of the interface, right? Because in your initial proposal you sounded like to deprecate the method Database::nextSequenceValue as a whole which would implicitly mean to be removed at one point and certainly break extensions hence my initial opposition.

My proposal is to make it do nothing, calling it will be pointless. Simultaneously, it will be soft-deprecated. That means it will have @deprecated in the doc comment, but no warning will be raised if you call it. Migration will be trivial, just delete the lines of code that call it. The point of doing all this is to allow the calls to be removed, simplifying the calling code and eliminating bugs which occur due to code that works on MySQL but not PostgreSQL. Thus, the change will benefit extensions like yours. Yes, the method may eventually be removed, but I don't think there is any need to do that before the number of usages in Gerrit-hosted extensions falls to zero.

My proposal is to make it do nothing, calling it will be pointless.

As I pointed out earlier, $sequenceValue = $db->nextSequenceValue( $this->getIdTable() . '_smw_id_seq' ); returns an ID, so it does something and making it do nothing or remove it raises my opposition.

Migration will be trivial, just delete the lines of code that call it.

I wonder, because someone has to clean-up the code and make sure that IDs are generated as before without impacting existing MySQL or Postgres installations.

I note PG doesn't accept NULL when inserting into SERIAL column like MySQL does for AUTO_INCREMENT, so we'll have to do something to keep things working for old code that still calls the do-nothing nextSequenceValue() instead of just omitting the column in the call to insert().

A few options come to mind:

  1. Have nextSequenceValue() return some placeholder that insert() (or makeList() or whatever) turns into the DEFAULT keyword.
  2. Have nextSequenceValue() return some placeholder that causes insert() to not include the column at all in the SQL INSERT statement.
    • I don't think using PHP null for either #1 or #2 would be a good idea, it's possible for a column to be nullable but have a non-null default.
  3. Have a before-insert trigger on every SERIAL column that turns NULL into the appropriate sequence value.

I see only a few places in core and deployed extensions where the value returned from nextSequenceValue() is used for something other than being passed to insert(). In all cases, it's assuming a non-null value is the ID it needs and skipping the call to insertId().

  • Here in Echo's EventMapper.php
  • Here in Revision.php
  • Here in LogEntry.php
  • Here in LogPage.php

Probably everyone here is aware of the code, using Database::nextSequenceValue in Postgres is distinct from DatabaseMysql::nextSequenceValue (where it just returns null) therefore the "do nothing" approach seems misplaced and not what consumers of nextSequenceValue expect, at least not on Postgres.

The problem is with the $seqName that carries special meaning (as noted before) which neither insertId or insert is aware of. Of course, it could be that I'm missing something here, so please clarify.

Database (incl. DatabaseMysql)

public function nextSequenceValue( $seqName ) {
	return null;
}

DatabasePostgres

public function nextSequenceValue( $seqName ) {
	$safeseq = str_replace( "'", "''", $seqName );
	$res = $this->query( "SELECT nextval('$safeseq')" );
	$row = $this->fetchRow( $res );
	$this->mInsertId = $row[0];

	return $this->mInsertId;
}

Of course, it could be that I'm missing something here, so please clarify.

If users of IDatabase::nextSequenceValue() are using it for anything other than for retrieving an opaque value to pass to IDatabase::insert() as the value of a column that is using the named sequence to generate auto-incrementing values[1] so that IDatabase::insertId() will correctly return the resulting auto-incremented value, they're using it for something outside of its intended and documented purpose. Such (mis)use does not seem common, and all of the misuses identified so far are easily fixed by not skipping the insertId() call.

If the intended use is the only thing you're using it for, then you should have nothing to worry about. When DatabasePostgres::nextSequenceValue() starts returning null, it will be doing so because DatabasePostgres::insert() and DatabasePostgres::insertId() are prepared to correctly handle receiving null for the column.

[1]: In PostgreSQL terms, for 'column' when that is declared column INTEGER NOT NULL DEFAULT nextval('table_column_seq'::regclass) a.k.a. column SERIAL.

  1. Have nextSequenceValue() return some placeholder that insert() (or makeList() or whatever) turns into the DEFAULT keyword.

This was the idea I had. It would be like Database::anyChar(). If any callers intractably need to have array elements for inserting into autoincrement columns, we could introduce Database::defaultValue() which would work the same way except it would not require a sequence name.

  1. Have a before-insert trigger on every SERIAL column that turns NULL into the appropriate sequence value.

:P

I see only a few places in core and deployed extensions where the value returned from nextSequenceValue() is used for something other than being passed to insert(). In all cases, it's assuming a non-null value is the ID it needs and skipping the call to insertId().

Those will need to be fixed.

Change 353182 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] Fix usage of $db->nextSequenceValue()

https://gerrit.wikimedia.org/r/353182

Change 353183 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/extensions/Echo@master] Fix usage of $db->nextSequenceValue()

https://gerrit.wikimedia.org/r/353183

Change 353183 merged by jenkins-bot:
[mediawiki/extensions/Echo@master] Fix usage of $db->nextSequenceValue()

https://gerrit.wikimedia.org/r/353183

Change 353182 merged by jenkins-bot:
[mediawiki/core@master] Fix usage of $db->nextSequenceValue()

https://gerrit.wikimedia.org/r/353182

Change 375011 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] Deprecate IDatabase::nextSequenceValue()

https://gerrit.wikimedia.org/r/375011

Change 375011 merged by jenkins-bot:
[mediawiki/core@master] Deprecate IDatabase::nextSequenceValue()

https://gerrit.wikimedia.org/r/375011

Anomie claimed this task.

Well, it's deprecated now, so let's mark this as Resolved.