Currently the Database and SQLPlatform classes contain horrors such as:
    static $regexes = null;
    if ( $regexes === null ) {
        // Regex with a group for quoted table 0 and a group for quoted tables 1..N
        $qts = '((?:\w+|`\w+`|\'\w+\'|"\w+")(?:\s*,\s*(?:\w+|`\w+`|\'\w+\'|"\w+"))*)';
        // Regex to get query verb, table 0, and tables 1..N
        $regexes = [
            // DML write queries
            "/^(INSERT|REPLACE)\s+(?:\w+\s+)*?INTO\s+$qts/i",
            "/^(UPDATE)(?:\s+OR\s+\w+|\s+IGNORE|\s+ONLY)?\s+$qts/i",
            "/^(DELETE)\s+(?:\w+\s+)*?FROM(?:\s+ONLY)?\s+$qts/i",
            // DDL write queries
            "/^(CREATE)\s+TEMPORARY\s+TABLE(?:\s+IF\s+NOT\s+EXISTS)?\s+$qts/i",
            "/^(DROP)\s+(?:TEMPORARY\s+)?TABLE(?:\s+IF\s+EXISTS)?\s+$qts/i",
            "/^(TRUNCATE)\s+(?:TEMPORARY\s+)?TABLE\s+$qts/i",
            "/^(ALTER)\s+TABLE\s+$qts/i"
        ];
    }

    $queryVerb = null;
    $queryTables = [];
    foreach ( $regexes as $regex ) {
        if ( preg_match( $regex, $sql, $m, PREG_UNMATCHED_AS_NULL ) ) {
            $queryVerb = $m[1];
            $allTables = preg_split( '/\s*,\s*/', $m[2] );
            foreach ( $allTables as $quotedTable ) {
                $queryTables[] = trim( $quotedTable, "\"'`" );
            }
            break;
        }
    }
or
    return !preg_match( '/^\s*(BEGIN|ROLLBACK|COMMIT|SAVEPOINT|RELEASE|SET|SHOW|EXPLAIN|USE)\b/i', $sql );
These are extremely hard to understand and error-prone, and since they run on every query MediaWiki makes (and I haven't even listed all the regexes), they are also quite costly.
A fairly easy solution is to introduce a new value class called Query that carries the query verb as an attribute, set correctly by whatever builds the query (e.g. selectSqlText). We will probably have to keep these regexes for a while, until string support in ::query() is deprecated and removed, but at least they would no longer run in the majority of cases. From there we can gradually make the Query object carry the query text and its values separately and start using prepared statements.
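To make the proposal concrete, here is a minimal sketch of what such a value class could look like. Everything in it is an assumption for illustration: the method names, `buildSelect()` (a stand-in for a builder like selectSqlText) and `getQueryVerb()` are hypothetical and not existing MediaWiki API, and the fallback regex is a simplified placeholder for the real ones quoted above.

```php
<?php
// Hypothetical sketch: class, method, and function names are illustrative,
// not existing MediaWiki API.
final class Query {
	private string $sql;
	private string $verb;

	public function __construct( string $sql, string $verb ) {
		$this->sql = $sql;
		// Normalize once so callers can compare against upper-case verbs.
		$this->verb = strtoupper( $verb );
	}

	public function getSQL(): string {
		return $this->sql;
	}

	public function getVerb(): string {
		return $this->verb;
	}
}

// Stand-in for a builder such as selectSqlText: it already knows the verb,
// so nothing has to be parsed back out of the SQL text later.
function buildSelect( string $table, string $field ): Query {
	return new Query( "SELECT $field FROM $table", 'SELECT' );
}

// Accepts both forms during the transition; only legacy strings pay for regex parsing.
function getQueryVerb( $query ): ?string {
	if ( $query instanceof Query ) {
		return $query->getVerb();
	}
	// Simplified placeholder for the existing regex-based detection.
	return preg_match( '/^\s*(\w+)/', $query, $m ) ? strtoupper( $m[1] ) : null;
}
```

With this shape, `getQueryVerb( buildSelect( 'page', 'page_id' ) )` never touches a regex, while a raw string such as `'delete from page'` still goes through the legacy path until string support is dropped.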
Does that sound like a good idea?