Page MenuHomePhabricator

Enable MariaDB/MySQL's Strict Mode
Open, Stalled, LowPublic

Description

If you are looking for the full RFC, go to T112637

This task's scope is:

  • switching to MariaDB's Strict Mode, first on CI (T119371), then on production
  • Make it so that DB insert/update that would overflow would error out rather than be silently truncated
  • Generally promote better use of MariaDB by making many things fail precommit and/or fail loudly in production rather than failing quietly

Options discussed:

  • sql_mode = 'TRADITIONAL' (equivalent to STRICT_TRANS_TABLES, STRICT_ALL_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER)
  • sql_mode = ONLY_FULL_GROUP_BY

Often, our non-strict usage is a fountain of bugs and security issues.

Slide by @jcrespo that explains it: https://www.slideshare.net/jynus/query-optimization-with-mysql-57-and-mariadb-10-even-newer-tricks/43 and https://www.slideshare.net/jynus/query-optimization-with-mysql-57-and-mariadb-10-even-newer-tricks/44

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.
StatusSubtypeAssignedTask
StalledNone
Resolvedhashar
ResolvedKrinkle
OpenNone
ResolvedNone
ResolvedNikerabbit
Resolved RobLa-WMF
OpenNone
ResolvedNikerabbit
ResolvedEjegg
OpenNone
ResolvedMarostegui
OpenNone
ResolvedTK-999
ResolvedNone
OpenBUG REPORTNone
OpenBUG REPORTNone
ResolvedNone
ResolvedBUG REPORTDiesel_kapasule
OpenNone
ResolvedBUG REPORTapasternak

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

One potential issue I see with ONLY_FULL_GROUP_BY is that the optimizer still sucks at dealing with it:

Without ONLY_FULL_GROUP_BY
mysql:wikiadmin@db1083 [enwiki]> explain SELECT page_namespace, page_title, page_id, page_is_redirect FROM page WHERE page_namespace=0 AND page_title>='@' GROUP BY page_title LIMIT 11;
+------+-------------+-------+------+---------------+------------+---------+-------+----------+------------------------------------+
| id   | select_type | table | type | possible_keys | key        | key_len | ref   | rows     | Extra                              |
+------+-------------+-------+------+---------------+------------+---------+-------+----------+------------------------------------+
|    1 | SIMPLE      | page  | ref  | name_title    | name_title | 4       | const | 20324890 | Using index condition; Using where |
+------+-------------+-------+------+---------------+------------+---------+-------+----------+------------------------------------+
1 row in set (0.00 sec)

mysql:wikiadmin@db1083 [enwiki]> SELECT page_namespace, page_title, page_id, page_is_redirect FROM page WHERE page_namespace=0 AND page_title>='@' GROUP BY page_title LIMIT 11;
+----------------+----------------------+----------+------------------+
| page_namespace | page_title           | page_id  | page_is_redirect |
+----------------+----------------------+----------+------------------+
|              0 | @                    |  2214518 |                1 |
|              0 | @!                   | 37696354 |                1 |
|              0 | @$$                  | 21089863 |                1 |
|              0 | @&*!_Smilers         | 20573031 |                1 |
|              0 | @-Bristol            | 23905089 |                1 |
|              0 | @-symbol             | 25176688 |                1 |
|              0 | @440                 | 18159966 |                1 |
|              0 | @ANCAlerts           | 47685646 |                0 |
|              0 | @Andheri             | 42866631 |                0 |
|              0 | @Andheri_(2014_film) | 42914733 |                1 |
|              0 | @BarackObama         | 35944823 |                1 |
+----------------+----------------------+----------+------------------+
11 rows in set (0.09 sec)

mysql:wikiadmin@db1083 [enwiki]> SHOW STATUS LIKE 'Hand%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_commit             | 2      |
| Handler_delete             | 0      |
| Handler_discover           | 0      |
| Handler_external_lock      | 0      |
| Handler_icp_attempts       | 497545 |
| Handler_icp_match          | 16     |
| Handler_mrr_init           | 0      |
| Handler_mrr_key_refills    | 0      |
| Handler_mrr_rowid_refills  | 0      |
| Handler_prepare            | 0      |
| Handler_read_first         | 0      |
| Handler_read_key           | 1      |
| Handler_read_last          | 0      |
| Handler_read_next          | 11     |
| Handler_read_prev          | 0      |
| Handler_read_rnd           | 0      |
| Handler_read_rnd_deleted   | 0      |
| Handler_read_rnd_next      | 168    |
| Handler_rollback           | 0      |
| Handler_savepoint          | 0      |
| Handler_savepoint_rollback | 0      |
| Handler_tmp_update         | 0      |
| Handler_tmp_write          | 166    |
| Handler_update             | 0      |
| Handler_write              | 0      |
+----------------------------+--------+
25 rows in set (0.00 sec)
With ONLY_FULL_GROUP_BY
mysql:wikiadmin@db1083 [enwiki]> SET SESSION sql_mode = 'TRADITIONAL,ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.00 sec)

mysql:wikiadmin@db1083 [enwiki]> explain SELECT page_namespace, page_title, page_id, page_is_redirect FROM page WHERE page_namespace=0 AND page_title>='@' GROUP BY page_title LIMIT 11;
ERROR 1055 (42000): 'enwiki.page.page_namespace' isn't in GROUP BY
mysql:wikiadmin@db1083 [enwiki]> explain SELECT page_namespace, page_title, page_id, page_is_redirect FROM page WHERE page_namespace=0 AND page_title>='@' GROUP BY page_namespace, page_title, page_id, page_is_redirect LIMIT 11;
+------+-------------+-------+------+---------------+------------+---------+-------+----------+----------------------------------------------------+
| id   | select_type | table | type | possible_keys | key        | key_len | ref   | rows     | Extra                                              |
+------+-------------+-------+------+---------------+------------+---------+-------+----------+----------------------------------------------------+
|    1 | SIMPLE      | page  | ref  | name_title    | name_title | 4       | const | 20324891 | Using index condition; Using where; Using filesort |
+------+-------------+-------+------+---------------+------------+---------+-------+----------+----------------------------------------------------+
1 row in set (0.00 sec)

mysql:wikiadmin@db1083 [enwiki]> SELECT page_namespace, page_title, page_id, page_is_redirect FROM page WHERE page_namespace=0 AND page_title>='@' GROUP BY page_namespace, page_title, page_id, page_is_redirect LIMIT 11;
+----------------+----------------------+----------+------------------+
| page_namespace | page_title           | page_id  | page_is_redirect |
+----------------+----------------------+----------+------------------+
|              0 | @                    |  2214518 |                1 |
|              0 | @!                   | 37696354 |                1 |
|              0 | @$$                  | 21089863 |                1 |
|              0 | @&*!_Smilers         | 20573031 |                1 |
|              0 | @-Bristol            | 23905089 |                1 |
|              0 | @-symbol             | 25176688 |                1 |
|              0 | @440                 | 18159966 |                1 |
|              0 | @ANCAlerts           | 47685646 |                0 |
|              0 | @Andheri             | 42866631 |                0 |
|              0 | @Andheri_(2014_film) | 42914733 |                1 |
|              0 | @BarackObama         | 35944823 |                1 |
+----------------+----------------------+----------+------------------+
11 rows in set (32.67 sec)

mysql:wikiadmin@db1083 [enwiki]> SHOW STATUS LIKE 'Hand%';
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| Handler_commit             | 2        |
| Handler_delete             | 0        |
| Handler_discover           | 0        |
| Handler_external_lock      | 0        |
| Handler_icp_attempts       | 12637141 |
| Handler_icp_match          | 12139612 |
| Handler_mrr_init           | 0        |
| Handler_mrr_key_refills    | 0        |
| Handler_mrr_rowid_refills  | 0        |
| Handler_prepare            | 0        |
| Handler_read_first         | 0        |
| Handler_read_key           | 1        |
| Handler_read_last          | 0        |
| Handler_read_next          | 12139611 |
| Handler_read_prev          | 0        |
| Handler_read_rnd           | 0        |
| Handler_read_rnd_deleted   | 0        |
| Handler_read_rnd_next      | 168      |
| Handler_rollback           | 1        |
| Handler_savepoint          | 0        |
| Handler_savepoint_rollback | 0        |
| Handler_tmp_update         | 0        |
| Handler_tmp_write          | 166      |
| Handler_update             | 0        |
| Handler_write              | 0        |
+----------------------------+----------+
25 rows in set (0.00 sec)

That was a silly toy example and it still took a lot longer to return the 11 rows. The optimizer seems to fail in the same way if the GROUP BY contains any fields that aren't part of the index being used for grouping (or that are but are constant in the WHERE clause).

The particularly fun part about this example is that "WHERE page_namespace=0 GROUP BY page_title" already uniquely specifies a result row since name_title is a unique index.

I suppose we could always cheat...

mysql:wikiadmin@db1083 [enwiki]> SELECT 0 AS page_namespace, page_title, MAX(page_id) AS page_id, MAX(page_is_redirect) AS page_is_redirect FROM page WHERE page_namespace=0 AND page_title>='@' GROUP BY page_title LIMIT 11;
+----------------+----------------------+----------+------------------+
| page_namespace | page_title           | page_id  | page_is_redirect |
+----------------+----------------------+----------+------------------+
|              0 | @                    |  2214518 |                1 |
|              0 | @!                   | 37696354 |                1 |
|              0 | @$$                  | 21089863 |                1 |
|              0 | @&*!_Smilers         | 20573031 |                1 |
|              0 | @-Bristol            | 23905089 |                1 |
|              0 | @-symbol             | 25176688 |                1 |
|              0 | @440                 | 18159966 |                1 |
|              0 | @ANCAlerts           | 47685646 |                0 |
|              0 | @Andheri             | 42866631 |                0 |
|              0 | @Andheri_(2014_film) | 42914733 |                1 |
|              0 | @BarackObama         | 35944823 |                1 |
+----------------+----------------------+----------+------------------+
11 rows in set (0.09 sec)

This was fixed on 5.7: http://mysqlserverteam.com/mysql-5-7-only_full_group_by-improved-recognizing-functional-dependencies-enabled-by-default/ which given the other large issue we have with the optimizer we should seriously consider (I would have to check MariaDB 10.1).

My rationale is that 5.7, for external users, has this mode enabled by default, and we should be compatible with it. Enforcing it would be a second thought. But given the past experiences, I think it would fix more issues than it break things.

Let's give a chance to beta, which should be stricter than production, and not the other way round!

I have myself have no idea how to enable strict mode / which part of puppet config files have to be tweaked. The beta cluster database are also still running on Precise instance and MariaDB 5.5 , we might want to migrate them to migrate them to Jessie and MariaDB 5.10 first (that is T138778)

This is not blocked by that- it only needs to be enabled on config (see my comment on the child ticket).

Thanks @Huji for noting the SecurePoll strict mode problems (T147875).

I'll note this task still seems to be unassigned. @hashar, is the Beta cluster running in strict mode?

The beta cluster databases have been migrated to Jessie / MariaDB 5.10 T138778 and their config file is https://github.com/wikimedia/operations-puppet/blob/d3a05afd154/templates/mariadb/beta.my.cnf.erb

On beta I would guess it is about setting $wgSQLMode via operations/mediawiki-config , maybe that already has been done. For CI the task is T119371.

I am lurking on this task but reasonably have zero free time to lead the effort.

https://gerrit.wikimedia.org/r/#/c/429386/ would enable sql_mode = TRADITIONAL via includes/DevelopmentSettings.php which is loaded by CI. Tracked by sub task T119371.

The need from Wikimedia-Rdbms was to add the option and any additional support. I believe that part is done. (Ideally that would've been a subtask, but alas, better next time.)

Joe changed the task status from Open to Stalled.Mar 20 2023, 11:17 AM
Joe lowered the priority of this task from Medium to Low.
Joe added a subscriber: Joe.

I fail to see how this task is related to incident followup, or sustainability. So removing the tag. Adding the sprint week tag as I was triaging this task for the SRE sprint week.

Additionally, given the low activity:

  • Set as stalled
  • Lower priority.

Additionally: strict mode is on by default since mariadb 10.2, this has nothing to do with production usage at this point. Maybe another task should be opened?

I fail to see how this task is related to incident followup, or sustainability. So removing the tag. Adding the sprint week tag as I was triaging this task for the SRE sprint week.

I want to remember this was detected as a mitigation to a MW incident/bug in which, if the flag was enabled on testing, the issue would have been caught earlier. But I don't remember the details.

Additionally: strict mode is on by default since mariadb 10.2, this has nothing to do with production usage at this point. Maybe another task should be opened?

I believe ONLY_FULL_GROUP_BY is not enabled, which was part of the above scope. Strict is technically only STRICT_TRANS_TABLES.