
implement performance_schema for mysql monitoring
Closed, Resolved · Public

Description

  • Review and validate this implementation plan (DONE)
  • Investigate p_s overhead in production (DONE, under constant review)
  • Find a good configuration trade-off between features and performance impact (DONE)
  • Slowly roll it out throughout the fleet (DONE, only a few exceptions that are not blockers)
  • Implement collectors to graphite
  • Implement graphs in grafana/tendril

Event Timeline

Springle created this task. May 18 2015, 3:51 AM
Springle assigned this task to jcrespo.
Springle raised the priority of this task to Normal.
Springle updated the task description. (Show Details)
Springle added projects: acl*sre-team, DBA.
Springle added a subscriber: Springle.
Restricted Application added a subscriber: Aklapper. May 18 2015, 3:51 AM

This is a bug I hit on pc1003, but not on the other hosts: https://bugs.launchpad.net/percona-server/+bug/1329772

The procedure means that, when upgrading from .14 to .16 or similar, the server should be restarted once more, just in case, so the changes take effect on these in-memory tables. Note that this could be a fork-only problem, as the MariaDB and MySQL P_S versions are not in sync and are only updated from time to time (P_S .20 -> .24 was updated in MariaDB 10.0.18).

Production was not affected at all; only P_S queries failed, and we do not use those.

jcrespo moved this task from Triage to In progress on the DBA board. Jun 2 2015, 2:21 PM
jcrespo added a comment. Edited Jun 3 2015, 7:13 AM

In order to check how p_s affects us, I want to run https://wikitech.wikimedia.org/wiki/MariaDB/query_performance on some test machines for a week. For example, db1018.
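For reference, the kind of report pasted in the comments below can be generated with pt-query-digest. A minimal sketch of the idea, assuming a tcpdump capture of the MySQL port (the interface, duration and file names are illustrative; the authoritative procedure is the one on the wikitech page above):

# Capture client traffic on the MySQL port during the measurement window
tcpdump -i eth0 -s 65535 -x -nn -q -tttt 'port 3306' > /tmp/mysql.tcpdump
# Aggregate the captured queries into a digest report
pt-query-digest --type tcpdump /tmp/mysql.tcpdump > /tmp/query_report.txt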

jcrespo moved this task from In progress to Backlog on the DBA board. Jun 18 2015, 10:24 AM
jcrespo moved this task from Backlog to In progress on the DBA board. Jun 25 2015, 8:57 AM
jcrespo moved this task from In progress to Backlog on the DBA board. Jul 2 2015, 1:40 PM
Restricted Application added a subscriber: Matanya. Jul 2 2015, 1:40 PM
jcrespo added a comment. Edited Jul 6 2015, 2:24 PM

P_S OFF:

# 1094.7s user time, 9.4s system time, 141.22M rss, 205.21M vsz
# Current date: Wed Jul  1 07:32:28 2015
# Hostname: db1018
# Files: STDIN
# Overall: 4.66M total, 640 unique, 53.47 QPS, 0.02x concurrency _________
# Time range: 2015-06-30 07:00:10 to 2015-07-01 07:11:37
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time          1320s     1us      3s   283us   332us     3ms   152us
# Lock time           238s       0    13ms    51us    93us    39us    52us
# Rows sent          5.02M       0   4.16k    1.13    1.96    8.69    0.99
# Rows examine       9.50M       0 232.93k    2.14    3.89  261.15    0.99
# Merge passes           0       0       0       0       0       0       0
# Query size         1.06G      17  67.89k  243.89  511.45  368.99  192.76
# Boolean:
# Filesort       8% yes,  91% no
# Full scan     94% yes,   5% no
# Priority que   3% yes,  96% no
# Tmp table     29% yes,  70% no
# Tmp table on   1% yes,  98% no

P_S ON:

# 1200.2s user time, 11.3s system time, 140.67M rss, 198.67M vsz
# Current date: Tue Jun 23 06:49:06 2015
# Hostname: db1018
# Files: STDIN
# Overall: 5.22M total, 671 unique, 65.24 QPS, 0.02x concurrency _________
# Time range: 2015-06-22 07:59:07 to 2015-06-23 06:11:48
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time          1531s     1us      7s   293us   348us     6ms   159us
# Lock time           280s       0     5ms    53us    98us    39us    54us
# Rows sent          5.51M       0   8.28k    1.11    1.96    9.70    0.99
# Rows examine       9.66M       0 232.12k    1.94    3.89  202.02    0.99
# Merge passes           0       0       0       0       0       0       0
# Query size         1.18G      17  67.95k  243.84  511.45  330.28  192.76
# Boolean:
# Filesort      14% yes,  85% no
# Full scan     95% yes,   4% no
# Priority que   2% yes,  97% no
# Tmp table     31% yes,  68% no
# Tmp table on   1% yes,  98% no

These preliminary results could mean about 4.5% overhead, but:

  • This is expected given the default configuration: with 5000 max_connections and our particular mysql tuning, performance_schema needs its own tuning so that it is not over-dimensioned. See https://bugs.mysql.com/bug.php?id=68514 We are probably the perfect example of the issues mentioned there.
  • The query profile is not exactly the same: 8% vs 14% filesorts, which could account for some of the difference.
  • user_stats has not been disabled, even though it is redundant with P_S.

I would like to try again with more conservative performance schema variables.
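As a note on the last bullet point, userstat is a single boolean server variable in MariaDB/Percona, so disabling it would be a one-line change; a minimal my.cnf sketch, assuming we do decide it is redundant with P_S:

# Disable the user/client/table/index statistics tables, redundant once P_S is in use
userstat = 0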

jcrespo renamed this task from investigate performance_schema for wmf prod to implement performance_schema for wmf prod. Sep 30 2015, 1:25 PM
jcrespo raised the priority of this task from Normal to High.
jcrespo updated the task description. (Show Details)
jcrespo set Security to None.
jcrespo edited subscribers, added: ori; removed: Springle.

With its default configuration, and due to our extreme settings for the following variables:

#max_connections                = 5000
#table_open_cache               = 50000
#table_definition_cache         = 40000
#open-files-limit               = 200000

P_S autosizes to consume 1.6GB of memory.

If we set those to the defaults, it only consumes 36MB.

This is fixed in 5.7 with more dynamic memory allocation, but for now we have to set the P_S configuration variables explicitly.
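One way to verify the footprint for a given configuration is the standard engine status command (the figures above were measured in this spirit, although not necessarily with this exact command):

-- The last row of the output, "performance_schema.memory", is the total
-- number of bytes pre-allocated by the performance_schema engine
SHOW ENGINE PERFORMANCE_SCHEMA STATUS;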

By setting the following configuration:

# Activating P_S by default
performance_schema = 1
# downsizing performance schema memory usage: T99485
performance_schema_digests_size = 1000
performance_schema_max_thread_instances = 500
performance_schema_max_cond_instances = 1000
performance_schema_accounts_size = 10
performance_schema_events_statements_history_size = 10
performance_schema_events_statements_history_long_size = 1000
performance_schema_events_waits_history_size = 10
performance_schema_events_waits_history_long_size = 1000
performance_schema_events_stages_history_size = 10
performance_schema_events_stages_history_long_size = 1000
performance_schema_max_mutex_instances = 5000
performance_schema_max_rwlock_instances = 2000
performance_schema_max_socket_instances = 500
performance_schema_max_table_instances = 1000

The memory usage goes down to 80MB. That may be too conservative, but we can check whether we need to increase some of these options by putting the server in production and monitoring:

MariaDB PRODUCTION s6 localhost (none) > SHOW GLOBAL status like 'performance%';
+-----------------------------------------------+-------+
| Variable_name                                 | Value |
+-----------------------------------------------+-------+
| Performance_schema_accounts_lost              | 0     |
| Performance_schema_cond_classes_lost          | 0     |
| Performance_schema_cond_instances_lost        | 0     |
| Performance_schema_digest_lost                | 179   |
| Performance_schema_file_classes_lost          | 0     |
| Performance_schema_file_handles_lost          | 0     |
| Performance_schema_file_instances_lost        | 0     |
| Performance_schema_hosts_lost                 | 0     |
| Performance_schema_locker_lost                | 0     |
| Performance_schema_mutex_classes_lost         | 0     |
| Performance_schema_mutex_instances_lost       | 0     |
| Performance_schema_rwlock_classes_lost        | 0     |
| Performance_schema_rwlock_instances_lost      | 0     |
| Performance_schema_session_connect_attrs_lost | 0     |
| Performance_schema_socket_classes_lost        | 0     |
| Performance_schema_socket_instances_lost      | 0     |
| Performance_schema_stage_classes_lost         | 0     |
| Performance_schema_statement_classes_lost     | 0     |
| Performance_schema_table_handles_lost         | 0     |
| Performance_schema_table_instances_lost       | 0     |
| Performance_schema_thread_classes_lost        | 0     |
| Performance_schema_thread_instances_lost      | 0     |
| Performance_schema_users_lost                 | 0     |
+-----------------------------------------------+-------+
23 rows in set (0.00 sec)

For example, within just a few minutes we have already lost some statement digests (Performance_schema_digest_lost = 179 above).
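As a small convenience on top of the SHOW GLOBAL STATUS output above, the same counters can be filtered so that only the ones actually losing data show up (on MariaDB 10.0 they are also exposed through information_schema):

-- Show only the P_S sizing counters that have lost events so far
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM information_schema.GLOBAL_STATUS
WHERE VARIABLE_NAME LIKE 'Performance_schema%lost'
  AND VARIABLE_VALUE <> '0';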

jcrespo added a comment. Edited Oct 9 2015, 3:40 PM

Setting performance_schema_digests_size = 5000

Edit: this is rejected as an invalid value and gets resized to 200, which is not enough and forces digests to be lost.

This seems to work OK for now, with a ~100MB footprint:

# Activating P_S by default
performance_schema = 1
# downsizing performance schema memory usage: T99485
performance_schema_digests_size = -1
performance_schema_max_thread_instances = 500
performance_schema_max_cond_instances = 1000
performance_schema_accounts_size = 100
performance_schema_events_statements_history_size = 10
performance_schema_events_statements_history_long_size = 1000
performance_schema_events_waits_history_size = 10
performance_schema_events_waits_history_long_size = 1000
performance_schema_events_stages_history_size = 10
performance_schema_events_stages_history_long_size = 1000
performance_schema_max_mutex_instances = 5000
performance_schema_max_rwlock_instances = 2000
performance_schema_max_socket_instances = 500
performance_schema_max_table_instances = 1000

We'll see with more realistic loads later.

Change 247615 had a related patch set uploaded (by Jcrespo):
Enabling performance schema experimentally on db1018

https://gerrit.wikimedia.org/r/247615

jcrespo updated the task description. (Show Details) Oct 20 2015, 5:44 PM
jcrespo moved this task from Backlog to In progress on the DBA board.

Change 247615 merged by Jcrespo:
Enabling performance schema experimentally on db1018

https://gerrit.wikimedia.org/r/247615

Change 249372 had a related patch set uploaded (by Jcrespo):
Enabling performance schema experimentally on db1022

https://gerrit.wikimedia.org/r/249372

Change 249372 merged by Jcrespo:
Enabling performance schema experimentally on db1022

https://gerrit.wikimedia.org/r/249372

When putting db1022 into production, we lost some accounts and hosts:

mysql> SHOW GLOBAL STATUS like 'performance%';
+-----------------------------------------------+-------+
| Variable_name                                 | Value |
+-----------------------------------------------+-------+
| Performance_schema_accounts_lost              | 36530 |
[...]
| Performance_schema_hosts_lost                 | 36260 |
[...]
23 rows in set (0.00 sec)

We must tune that.
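The knobs involved are performance_schema_accounts_size and performance_schema_hosts_size; a hedged my.cnf sketch of the kind of change needed (the values here are guesses and would have to be validated against the *_lost counters after a restart):

# Hypothetical sizes: large enough for all account/host combinations so that
# Performance_schema_accounts_lost and Performance_schema_hosts_lost stay at 0
performance_schema_accounts_size = 500
performance_schema_hosts_size    = 500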

jcrespo updated the task description. (Show Details) Oct 28 2015, 7:19 PM
jcrespo renamed this task from implement performance_schema for wmf prod to implement performance_schema for mysql monitoring. Nov 25 2015, 11:58 AM
jcrespo updated the task description. (Show Details)

Adding @fgiunchedi so he is in the loop, although there are no actionables for him here yet.

performance_schema has been rolled out to all of s3. This is an achievement for several reasons:

  • All servers of a single shard are covered with detailed monitoring
  • s3 is one of the most problematic shards, and therefore the one that most needs monitoring
  • The fact that it was possible to roll it out here with no appreciable performance impact, even though s3 has the most objects per server, is a huge step
jcrespo moved this task from In progress to Backlog on the DBA board. Dec 1 2015, 5:23 PM
jcrespo moved this task from Backlog to In progress on the DBA board. Feb 20 2016, 7:33 PM
jcrespo moved this task from In progress to Backlog on the DBA board. Mar 9 2016, 11:09 AM
jcrespo moved this task from Backlog to Next on the DBA board. Mar 23 2016, 2:15 PM

jcrespo moved this task to Next on the DBA workboard.

Why is this "Next" when T69223 isn't?

jcrespo moved this task from Next to In progress on the DBA board. Apr 1 2016, 11:14 AM
jcrespo moved this task from In progress to Backlog on the DBA board.
Volans added a subscriber: Volans. Apr 20 2016, 1:24 PM

Performance schema is set in the configuration for all core databases (s1-s7); it will be active at the next restart, or earlier if manually activated.

jcrespo updated the task description. (Show Details) Apr 22 2016, 9:53 AM

p_s is generally available on all masters and many slaves; it only needs a restart on some pending servers, but new collectors can already be programmed and used (a rough sketch of one is shown below).
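As an illustration only of what such a collector could look like (the metric naming, the Graphite endpoint and the chosen metrics are assumptions, not the collector that will actually be deployed):

#!/bin/bash
# Sketch: summarize performance_schema statement digests per schema and push
# the totals to Graphite over the plaintext protocol.
GRAPHITE_HOST=graphite.example.wmnet   # placeholder endpoint
GRAPHITE_PORT=2003
PREFIX="mysql.$(hostname -s).performance_schema"
NOW=$(date +%s)

mysql --defaults-file=/root/.my.cnf --batch --skip-column-names -e "
    SELECT SCHEMA_NAME, SUM(COUNT_STAR), SUM(SUM_TIMER_WAIT)
    FROM performance_schema.events_statements_summary_by_digest
    WHERE SCHEMA_NAME IS NOT NULL
    GROUP BY SCHEMA_NAME" |
while read schema exec_count total_timer_wait; do
    # SUM_TIMER_WAIT is reported in picoseconds; convert to seconds
    total_latency=$(echo "scale=6; $total_timer_wait / 1000000000000" | bc)
    echo "$PREFIX.$schema.statements.count $exec_count $NOW"
    echo "$PREFIX.$schema.statements.total_latency $total_latency $NOW"
done | nc -q 1 "$GRAPHITE_HOST" "$GRAPHITE_PORT"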

The following eqiad production hosts are still pending p_s (they report @@global.performance_schema = 0):

db1066.eqiad.wmnet
db1068.eqiad.wmnet
db1062.eqiad.wmnet
db1059.eqiad.wmnet
db1064.eqiad.wmnet
db1061.eqiad.wmnet
db1051.eqiad.wmnet
db1019.eqiad.wmnet
db1072.eqiad.wmnet
db1037.eqiad.wmnet
db1026.eqiad.wmnet
db1073.eqiad.wmnet
db1045.eqiad.wmnet
db1039.eqiad.wmnet
db1030.eqiad.wmnet
db1055.eqiad.wmnet

Pending ones:

$ sudo salt --output=txt -C 'G@mysql_group:core' cmd.run 'mysql --defaults-file=/root/.my.cnf --batch --skip-column-names -e "SELECT @@global.performance_schema"' | grep ': 0'
db1066.eqiad.wmnet: 0
db1051.eqiad.wmnet: 0
db1037.eqiad.wmnet: 0
db1045.eqiad.wmnet: 0
db1026.eqiad.wmnet: 0
db1030.eqiad.wmnet: 0
es1016.eqiad.wmnet: 0
es1012.eqiad.wmnet: 0
es1018.eqiad.wmnet: 0
db1039.eqiad.wmnet: 0

Change 319806 had a related patch set uploaded (by Jcrespo):
Allow SSL (TLS) and performance_schema on misc servers

https://gerrit.wikimedia.org/r/319806

Change 319806 merged by Jcrespo:
Allow SSL (TLS) and performance_schema on misc servers

https://gerrit.wikimedia.org/r/319806

jcrespo closed this task as Resolved. Feb 14 2017, 3:16 PM

Performance schema is finally enabled on all servers, and so is the sys schema. I will close this because the remaining pending tasks are old; most likely, the metrics backend will end up being a private prometheus instance.
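As a quick example of what this enables everywhere, assuming the standard sys views are installed, the most expensive normalized statements can be pulled straight from sys (the view is already sorted by total latency):

-- Top normalized statements by total latency, via the sys schema
SELECT query, db, exec_count, total_latency
FROM sys.statement_analysis
LIMIT 10;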