Flow's database schema should use BYTEA in postgres
Open, Needs Triage · Public

Description

The MySQL schema uses BINARY(11) columns to store Flow's 88-bit UUIDs (11 bytes, or 22 hexadecimal digits).

Historically, MediaWiki has used MySQL binary data types for text to avoid collation issues with UTF-8.

In this case, however, the column does not hold UTF-8 text; it stores real binary values.
To store binary values in Postgres, Flow should use the BYTEA data type (together with pg_escape_bytea/pg_unescape_bytea).

The schema should be adjusted to use BYTEA, but the abstract schema does not allow that at the moment.
See T257755 and T298692.

An alternative is to use a different storage format in Postgres, such as the alphadecimal format, because the data types there have no length constraints.
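
To illustrate the alternative, here is a minimal Python sketch of how one 11-byte value maps to its hexadecimal and alphadecimal text forms and back. It assumes the alphadecimal form is the base-36 rendering of the same 88-bit integer. The function names are hypothetical and are not Flow's API.

```python
# Illustrative sketch only: names are hypothetical, not Flow's actual API.
BIN_LEN = 11  # bytes, matching BINARY(11) in the MySQL schema
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def binary_to_hex(raw: bytes) -> str:
    """Render the 11-byte value as 22 hexadecimal digits."""
    if len(raw) != BIN_LEN:
        raise ValueError(f"expected {BIN_LEN} bytes, got {len(raw)}")
    return raw.hex()


def binary_to_alnum(raw: bytes) -> str:
    """Render the 11-byte value as base-36 text (assumed alphadecimal form)."""
    if len(raw) != BIN_LEN:
        raise ValueError(f"expected {BIN_LEN} bytes, got {len(raw)}")
    n = int.from_bytes(raw, "big")
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def alnum_to_binary(text: str) -> bytes:
    """Parse base-36 text back into the fixed-width 11-byte value."""
    return int(text, 36).to_bytes(BIN_LEN, "big")
```

Because the text form contains only ASCII letters and digits, it could be stored in an ordinary text column without any bytea escaping, at the cost of a conversion on every read and write.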

Current test failures are:

21:40:32 1) Flow\Tests\Api\ApiFlowEditHeaderTest::testEditHeader
21:40:32 Flow\Exception\InvalidInputException: Unknown input to UUID class

21:40:32 2) Flow\Tests\Api\ApiFlowEditTopicSummaryTest::testEditTopicSummary
21:40:32 Flow\Exception\InvalidInputException: Unknown input to UUID class

Event Timeline

Restricted Application added a subscriber: Aklapper.
Tgr subscribed.

How would this work, interface-wise, given that the data to insert or compare against can be in the middle of a query array structure? Something like 'workflow_id' => new SqlBinary( $id ), with the query builder then unwrapping that into pg_escape_bytea or nothing, depending on the engine? And how would unescaping work?
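
As a language-neutral way to make the question concrete, the following Python sketch shows one possible shape for such a wrapper. Everything here is an assumption for illustration: the SqlBinary class, the encode_conds and decode_bytea helpers, and the engine names are hypothetical and do not exist in MediaWiki. The Postgres branch uses the bytea hex format (a backslash, then "x", then hex digits) instead of calling pg_escape_bytea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlBinary:
    """Hypothetical marker: 'this value is raw binary, not text'."""
    data: bytes


def encode_value(value, engine: str):
    """Unwrap a SqlBinary marker according to the target engine."""
    if isinstance(value, SqlBinary):
        if engine == "postgres":
            # bytea hex input format: backslash, 'x', then hex digits
            return "\\x" + value.data.hex()
        # Other engines (e.g. MySQL BINARY columns) take the raw bytes as-is
        return value.data
    return value


def encode_conds(conds: dict, engine: str) -> dict:
    """Walk a flat condition array and unwrap any binary markers in it."""
    return {key: encode_value(val, engine) for key, val in conds.items()}


def decode_bytea(text: str) -> bytes:
    """Reverse direction: turn a bytea hex-format string back into bytes."""
    if not text.startswith("\\x"):
        raise ValueError("not in bytea hex format")
    return bytes.fromhex(text[2:])
```

In this sketch, encoding can be driven by the marker in the query array. Decoding has no such marker on result rows, so the caller (or column metadata) would have to know which fields are binary. That is the open part of the question above.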