
Audit MySQL configurations
Closed, Resolved · Public

Description

The new masters in eqiad might not have all the configurations of the old ones.

Double check and ensure that they are properly configured.
First things to check that come to mind, or that I've seen differ between db1041 and db1033 while investigating T133309:

  • semi-sync replication: rpl_semi_sync_%
  • sync_binlog (and similar)
  • binlog expiration: expire_logs_days
  • connect_timeout
  • wait_timeout
  • userstat

Evaluate the need for having local_infile enabled on misc/core services (T214248)
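A quick way to audit these settings host by host is to query them directly; a minimal sketch (run on each server and diff the outputs):

```sql
-- List the audited settings on one host; compare the output across hosts.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN
      ('sync_binlog', 'expire_logs_days', 'connect_timeout',
       'wait_timeout', 'userstat', 'local_infile')
   OR Variable_name LIKE 'rpl_semi_sync%';
```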

Event Timeline

Volans moved this task from Pending comment to In progress on the DBA board.

Change 285649 had a related patch set uploaded (by Volans):
MariaDB: Load and enable semi-sync replication

https://gerrit.wikimedia.org/r/285649

Change 285664 had a related patch set uploaded (by Volans):
Add the semi_sync parameter

https://gerrit.wikimedia.org/r/285664

Change 285664 merged by Volans:
Add the semi_sync parameter

https://gerrit.wikimedia.org/r/285664

Current configuration for eqiad:

shard  server                userstat    performance_schema   wait_timeout connect_timeout expire_logs_days sync_binlog
m1-M   db1001.eqiad.wmnet    ON          OFF                  28800        3               30               1
m1     db1016.eqiad.wmnet    ON          OFF                  3600         3               7                1
m2     db1020.eqiad.wmnet    ON          OFF                  3600         3               2                1
m3-M   db1043.eqiad.wmnet    ON          OFF                  10           3               7                1
m3     db1048.eqiad.wmnet    ON          OFF                  60           3               7                1
m4     db1046.eqiad.wmnet    ON          OFF                  3600         10              7                1
m5     db1009.eqiad.wmnet    ON          OFF                  3600         3               7                1
s1     db1051.eqiad.wmnet    ON          OFF                  3600         3               7                1
s1     db1052.eqiad.wmnet    OFF          ON                  3600         3               7                1
s1     db1053.eqiad.wmnet    OFF          ON                  3600         3               7                1
s1     db1055.eqiad.wmnet    ON          OFF                  3600         3               7                1
s1-M   db1057.eqiad.wmnet    OFF          ON                  3600         3               7                1
s1     db1065.eqiad.wmnet    OFF          ON                  3600         3               7                1
s1     db1066.eqiad.wmnet    ON          OFF                  3600         5               7                0
s1     db1072.eqiad.wmnet    ON          OFF                  3600         3               7                0
s1     db1073.eqiad.wmnet    ON          OFF                  3600         3               7                0
s1,2m4 db1047.eqiad.wmnet    ON          OFF                  3600         10              7                0
s2-M   db1018.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1021.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1024.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1036.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1054.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1060.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1063.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1067.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1074.eqiad.wmnet    OFF          ON                  3600         3               7                1
s2     db1076.eqiad.wmnet    OFF          ON                  3600         3               7                1
s3     db1015.eqiad.wmnet    OFF          ON                  3600         3               7                1
s3     db1027.eqiad.wmnet    OFF          ON                  3600         3               7                1
s3     db1035.eqiad.wmnet    OFF          ON                  3600         3               7                1
s3     db1038.eqiad.wmnet    ON          OFF                  28800        5               30               1
s3     db1044.eqiad.wmnet    OFF          ON                  3600         3               7                1
s3-M   db1075.eqiad.wmnet    OFF          ON                  3600         3               7                1
s3     db1077.eqiad.wmnet    OFF          ON                  3600         3               7                1
s3     db1078.eqiad.wmnet    OFF          ON                  3600         3               7                1
s4     db1019.eqiad.wmnet    ON          OFF                  3600         3               7                0
s4     db1040.eqiad.wmnet    ON          OFF                  28800        5               30               0
s4-M   db1042.eqiad.wmnet    OFF          ON                  3600         3               7                1
s4     db1056.eqiad.wmnet    ON          OFF                  3600         5               7                0
s4     db1059.eqiad.wmnet    ON          OFF                  3600         5               7                0
s4     db1064.eqiad.wmnet    ON          OFF                  3600         3               7                1
s4     db1068.eqiad.wmnet    ON          OFF                  3600         10              7                0
s5     db1026.eqiad.wmnet    ON          OFF                  3600         3               7                0
s5     db1045.eqiad.wmnet    ON          OFF                  3600         3               7                0
s5-M   db1049.eqiad.wmnet    OFF          ON                  3600         3               7                1
s5     db1058.eqiad.wmnet    ON          OFF                  28800        5               30               1
s5     db1070.eqiad.wmnet    OFF          ON                  3600         3               7                1
s5     db1071.eqiad.wmnet    OFF          ON                  3600         3               7                1
s6     db1022.eqiad.wmnet    OFF          ON                  3600         3               7                1
s6     db1023.eqiad.wmnet    ON          OFF                  28800        5               30               0
s6     db1030.eqiad.wmnet    ON          OFF                  3600         3               7                0
s6     db1037.eqiad.wmnet    ON          OFF                  3600         3               7                0
s6-M   db1050.eqiad.wmnet    OFF          ON                  3600         3               7                1
s6     db1061.eqiad.wmnet    ON          OFF                  3600         5               7                0
s6(ex) db1010.eqiad.wmnet    ON          OFF                  3600         3               7                0
s7     db1028.eqiad.wmnet    OFF          ON                  3600         3               7                1
s7     db1033.eqiad.wmnet    ON          OFF                  28800        5               30               1
s7     db1034.eqiad.wmnet    ON          OFF                  3600         3               7                1
s7     db1039.eqiad.wmnet    ON          OFF                  3600         5               7                0
s7-M   db1041.eqiad.wmnet    OFF          ON                  3600         5               7                0
s7     db1062.eqiad.wmnet    ON          OFF                  3600         10              7                0
td     db1011.eqiad.wmnet    OFF         OFF                  28800        3               0                0
x1     db1029.eqiad.wmnet    ON          OFF                  28800        3               30               1
x1-M   db1031.eqiad.wmnet    OFF          ON                  3600         3               7                1

@jcrespo: Proposed runtime changes:

  • set sync_binlog = 1 for all the hosts that have it in their my.cnf, except db1047 and db1011 (tendril)
  • set expire_logs_days = 7 for old masters that have it increased to 30
  • set expire_logs_days = 30 for new masters that still have it at the default 7
  • connect_timeout: too many values are in use; let's agree on one between 3 and 5 and use it everywhere unless there are exceptions (db1046, db1047 maybe?)
  • set wait_timeout = 3600 for old masters that have it increased to 28800
  • set wait_timeout = 28800 for new masters ONLY IF we still have a reason to have it different on the masters
  • leave userstat and performance_schema untouched, given that when one is on the other is off, with the only exception of tendril, which has both off.
  • You cannot change userstat and performance_schema dynamically; the differences are due to the lack of a restart after the change. In some cases we do not really need the performance_schema overhead, such as dbstores/labs/sanitarium, etc.
  • Let's keep non-core production servers untouched (db1046, db1047, db1011, db1069, etc.) - we do not really care about data consistency there, and some were modified due to past hardware/load issues
  • ok with the expire_logs_days plan, but let's check the available space on the current masters so we do not FAIL to notice low disk space
  • I do not care much about wait_timeout: 3600 or 28K makes no difference (watchdogs should kill threads faster), except on m3 hosts (multiple past issues with Phabricator thread handling)
  • re: connect_timeout: difficult issue. HHVM's default was increased from 1 to 3, so having it >3 will not make a huge difference. But look at my last comment.

In general, ok with the changes, but let's do them slowly.

  • have a look at innodb_flush_log_at_trx_commit too; sometimes both sync_binlog and that one are changed during a lag "emergency"
  • we may also want to have a look at the pool config, as you mentioned
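Most of the variables in the plan above are dynamic, so the runtime part can be sketched as plain SET GLOBAL statements (illustrative values only, chosen per host category; userstat and performance_schema are left alone per the comments above):

```sql
-- Example for an old master (values from the plan above):
SET GLOBAL sync_binlog = 1;        -- skip db1047 and db1011 (tendril)
SET GLOBAL expire_logs_days = 7;   -- 30 instead on the new masters
SET GLOBAL connect_timeout = 3;    -- pending agreement on 3 vs 5
SET GLOBAL wait_timeout = 3600;
```

These changes do not persist across a restart; my.cnf has to be kept in sync separately.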

BTW, we really need this on our monitoring in a nice dashboard (tendril or graphite): T112473 T119619

Mentioned in SAL [2016-04-28T12:31:53Z] <volans> Increase eqiad masters expire_logs_days (according to available space) T133333

expire_logs_days

  • set to 15 days on s2, s4, s7 masters (db1018, db1042, db1041) due to limited space
  • set to 30 days on: s1, s3, s5, s6, x1, es2, es3 masters (db1057, db1075, db1049, db1050, db1031, es1015, es1019)
  • left untouched the old 5.5 masters that have 30 days, because they are going to be reimaged/upgraded
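The space check mentioned earlier can be done from MySQL itself before raising the retention; a sketch using the 15-day value applied to the space-limited masters:

```sql
-- One row per binlog file, with File_size in bytes: sum to estimate usage.
SHOW BINARY LOGS;
-- Apply the retention chosen for this host:
SET GLOBAL expire_logs_days = 15;
-- Optionally purge old logs now instead of waiting for the next rotation:
PURGE BINARY LOGS BEFORE NOW() - INTERVAL 15 DAY;
```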

Change 285649 merged by Volans:
MariaDB: Load and enable semi-sync replication

https://gerrit.wikimedia.org/r/285649

Change 287394 had a related patch set uploaded (by Volans):
MariaDB: tune thread-pool to avoid Aborted_connects

https://gerrit.wikimedia.org/r/287394

Change 288195 had a related patch set uploaded (by Volans):
Add the replication_role parameter

https://gerrit.wikimedia.org/r/288195

Change 288195 merged by Volans:
Add the replication_role parameter

https://gerrit.wikimedia.org/r/288195

Change 287394 merged by Volans:
MariaDB: tune thread-pool to avoid Aborted_connects

https://gerrit.wikimedia.org/r/287394

Mentioned in SAL [2016-05-12T09:47:18Z] <volans> Apply at runtime thread_pool_max_threads=2000 for all coredb masters (Gerrit 287394, T133333)

Mentioned in SAL [2016-05-12T09:51:38Z] <volans> Apply at runtime thread_pool_stall_limit=10 for all coredb masters (Gerrit 287394, T133333)

Mentioned in SAL [2016-05-19T16:31:47Z] <volans> Set runtime value for max_allowed_packet, innodb_buffer_pool_dump_at_shutdown, innodb_buffer_pool_load_at_startup to their configured values for s1-s7, es1-es3, x1 T133333

@jcrespo below are the relevant remaining differences between runtime and configuration for s1-s7, es1-es3, x1.

My proposal is to leave untouched the hosts that will hopefully be decommissioned soon (< db1050); for the remaining ones:

  1. align connect_timeout to 3 seconds as per configuration (monitoring wLogDBError) unless it was changed for specific reasons.
  2. slowly set sync_binlog and innodb_flush_log_at_trx_commit to 1 monitoring replication lag.
  3. set back_log back to 500 as per configuration.

What about:

  1. max_connections: probably the best way is to puppetize it dynamically based on the server resources. Thoughts?
  2. innodb_use_native_aio? Some servers with old kernels have it ON while the config says 0.
  3. event_scheduler off on db1041?
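Of the differences listed below, only some can be fixed at runtime; a hedged sketch of the live part (back_log and innodb_use_native_aio are not dynamic, so those diffs only go away at the next restart):

```sql
-- Dynamic variables, to be aligned one host at a time while watching lag:
SET GLOBAL connect_timeout = 3;
SET GLOBAL sync_binlog = 1;
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
SET GLOBAL event_scheduler = ON;   -- db1041, if it is meant to be on
-- back_log (500) and innodb_use_native_aio (0) from my.cnf will only
-- take effect at the next restart.
```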

= S7

Variable                   db1062 /etc/my.cnf
========================== ====== ===========
connect_timeout            10     3
innodb_use_native_aio      ON     0
sync_binlog                0      1

Variable                   db1041 /etc/my.cnf
========================== ====== ===========
connect_timeout            5      3
event_scheduler            OFF    1
innodb_flush_log_at_trx... 0      1
sync_binlog                0      1

Variable                   db1039 /etc/my.cnf
========================== ====== ===========
back_log                   150    500
connect_timeout            5      3
innodb_use_native_aio      ON     0
sync_binlog                0      1

= S6

Variable                   db1061 /etc/my.cnf
========================== ====== ===========
connect_timeout            5      3
innodb_use_native_aio      ON     0
sync_binlog                0      1
thread_pool_stall_limit    500    100

Variable                   db1030 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
sync_binlog                0      1

Variable                   db1037 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
sync_binlog                0      1


= S5

Variable                   db1049 /etc/my.cnf
========================== ====== ===========
max_connections            10000  5000
thread_pool_stall_limit    20     10

Variable                   db1026 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
sync_binlog                0      1

Variable                   db1045 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
sync_binlog                0      1

= S4

Variable                   db1068 /etc/my.cnf
========================== ====== ===========
connect_timeout            10     3
innodb_flush_log_at_trx... 0      1
sync_binlog                0      1
thread_pool_stall_limit    50     100

Variable                   db1056 /etc/my.cnf
========================== ====== ===========
connect_timeout            5      3
innodb_use_native_aio      ON     0
sync_binlog                0      1
thread_pool_stall_limit    500    100

Variable                   db1019 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
sync_binlog                0      1

Variable                   db1059 /etc/my.cnf
========================== ====== ===========
back_log                   150    500
connect_timeout            5      3
innodb_use_native_aio      ON     0
sync_binlog                0      1
thread_pool_stall_limit    500    100

= S1

Variable                   db1072 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
max_connections            10000  5000
sync_binlog                0      1

Variable                   db1073 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
max_connections            10000  5000
sync_binlog                0      1

Variable                   db1066 /etc/my.cnf
========================== ====== ===========
connect_timeout            5      3
innodb_flush_log_at_trx... 2      1
innodb_use_native_aio      ON     0
sync_binlog                0      1

Variable                   db1051 /etc/my.cnf
========================== ====== ===========
innodb_flush_log_at_trx... 0      1
use_stat_tables            NEVER  preferably

I would set lock_wait_timeout to 60 instead of the infinite time it has now to avoid metadata lock issues. More queries will fail, but better than pileups.

Rationale: T135809#2311768 https://gerrit.wikimedia.org/r/#/c/289820/1/dbtools/osc_host.sh
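As a sketch, this is a single dynamic variable (the SESSION form only affects the current connection; new connections pick up the GLOBAL value):

```sql
-- lock_wait_timeout caps waits on metadata locks (DDL, FLUSH TABLES, ...).
-- Capping it makes a blocked ALTER fail after 60s instead of piling up
-- behind a long-running query.
SET GLOBAL lock_wait_timeout = 60;
SET SESSION lock_wait_timeout = 60;
```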

Mentioned in SAL [2016-05-23T15:47:27Z] <volans> testing thread_pool_max_threads=2000 on db1076 (s2) T133333

Mentioned in SAL [2016-05-23T16:01:59Z] <volans> testing thread_pool_max_threads=2000 on db1072 (s1) [instead of db1076 (s2)] T133333

Monitoring MySQL status for threads, aborted connects and connection statistics, I didn't find any difference in the patterns between db1073 and db1072, which have almost the same load in the same shard.

Leaving the configuration live for a longer test.

Mentioned in SAL [2016-05-27T08:57:29Z] <volans> Set sync_binlog=1 on db2011 (m2) and monitoring it. T133333

Mentioned in SAL [2016-05-27T10:11:10Z] <volans> restarting MySQL on db2038 to test change 286858 - T133333

Volans renamed this task from Audit new eqiad masters configuration to Audit MySQL configurations.May 27 2016, 11:44 AM

Mentioned in SAL [2016-05-27T12:22:29Z] <volans> Align runtime MySQL configurations on codfw slaves with the my.cnf ones T133333

Aligned thread_pool_max_threads and max_connections on all codfw slaves.
thread_pool_size is not dynamic, and most codfw hosts have 40 cores, so at restart they will have 40 instead of the current 32.

Mentioned in SAL [2016-05-27T16:34:28Z] <volans> Align runtime MySQL max_connections on codfw masters with the my.cnf ones T133333

Change 291265 had a related patch set uploaded (by Volans):
Use 0/1 instead of off/on for read_only

https://gerrit.wikimedia.org/r/291265

Change 291265 merged by Volans:
Use 0/1 instead of off/on for read_only

https://gerrit.wikimedia.org/r/291265

Change 291299 had a related patch set uploaded (by Volans):
MariaDB: use 0/1 instead of off/on for read_only

https://gerrit.wikimedia.org/r/291299

Change 291299 merged by Volans:
MariaDB: use 0/1 instead of off/on for read_only

https://gerrit.wikimedia.org/r/291299

Mentioned in SAL [2016-05-30T08:44:34Z] <volans> Align thread_pool_max_threads to my.cnf value on 1 slave/shard in eqiad (db1065,db1076,db1078,db1040,db1026,db1061,db1039) T133333

How to check

To check the cluster status and the differences between runtime and my.cnf, use commands like this one:

sudo salt -C 'G@cluster:mysql and G@site:codfw' cmd.run 'pt-config-diff --defaults-file=/root/.my.cnf --ignore-variables=optimizer_switch,innodb_buffer_pool_load_at_startup,ssl_ca,ssl_cert,ssl_key,ssl_cipher,thread_pool_size --report-width=200 h=localhost /etc/my.cnf | grep -v performance_schema'

Rationale for the ignored variables:

  • optimizer_switch: always different, because the runtime value shows a lot of additional flags
  • innodb_buffer_pool_load_at_startup: my.cnf-only, it does not show up at runtime
  • ssl_*: those are tracked in T111654
  • thread_pool_size: we recently changed it to a value set dynamically from the number of CPU cores; some hosts with many cores have it lower at runtime, but this parameter is not dynamic.
  • grep -v performance_schema: those are tracked in T99485

CORE DBs

max_connections / thread_pool_max_threads

STATUS: codfw done for both, eqiad done thread_pool_max_threads in one slave per shard (db1065,db1076,db1078,db1040,db1026,db1061,db1039).
TODO: slowly continue to align runtime values to the new my.cnf ones (10000/2000).
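The per-host alignment is two dynamic variables (to be applied one replica per shard at a time, as above):

```sql
SET GLOBAL max_connections = 10000;
SET GLOBAL thread_pool_max_threads = 2000;
```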

sync_binlog / innodb_flush_log_at_trx_commit / innodb_use_native_aio

STATUS: a bunch of servers have one, two or all of those turned off, probably done at runtime when they were lagging.
TODO: on newer (>db1050) servers, evaluate/try to set them back on if they can keep up.

connect_timeout

STATUS: some servers have a value greater than the configured 3 from my.cnf.
TODO: evaluate why it was increased and put it back to its configured value, or adapt the configuration if it needs to stay high.

back_log

STATUS: some hosts have it set to 150 instead of the 500 from my.cnf.
TODO: evaluate why it was changed and possibly put it back to its my.cnf value.

EXPECTED DIFFS

expire_logs_days

The masters have 30 days in my.cnf but that was reduced at runtime on some masters due to space limitations.

read_only

The coredb masters have read_only = 0 at runtime but = 1 in my.cnf because master election is done by orchestration.

OTHER DBs

db2011 (m2 codfw host)

STATUS: runtime value of read_only is OFF while my.cnf value is 1.
TODO: Ensure that it's not used for writes and it's ok to put it back RO.

DBSTORE

STATUS: my.cnf sets innodb_buffer_pool_size and tokudb_cache_size to 24G, but dbstore1001 and dbstore2001 both have them at 32G at runtime.
TODO: evaluate aligning the other two hosts to 32G as well, or differentiate them in the configuration so that my.cnf is in sync with the runtime values.

binlog_format

STATUS: db1001 and labsdb1005 have ROW at runtime but STATEMENT or MIXED in my.cnf.
TODO: align the configuration if the runtime value is the expected one.
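If ROW is indeed the intended value, a minimal check before touching my.cnf (flipping binlog_format on a live server is unsafe with open sessions or temporary tables, so aligning the config to the runtime value is the safer direction):

```sql
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
-- Align my.cnf to this value rather than changing it at runtime,
-- to avoid a silent flip back at the next restart.
```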

OTHER DIFFS

There are other minor differences, related to max_allowed_packet, interactive_timeout, wait_timeout, event_scheduler and log_warnings, that need to be checked on a case-by-case basis.

Volans moved this task from In progress to Pending comment on the DBA board.
Marostegui lowered the priority of this task from High to Medium.Nov 10 2016, 11:50 AM
LSobanski claimed this task.
LSobanski added a subscriber: LSobanski.

Closing as this is likely to be out of sync with the current configuration (last update was almost 5 years ago).