Page MenuHomePhabricator

Test tendril events and tokudb on 10.4
Closed, DeclinedPublic

Description

Let's make sure all the events on tendril will work fine on 10.4
Apart from this know upstream bug T247728, let's make sure all the events start correctly and there are no reported errors.

Update: this ticket has drifted into checking TokuDB on mariadb 10.4 and buster.
There are issues with jemalloc 5 and TokuDB compilation

Event Timeline

Marostegui triaged this task as Medium priority.Mar 31 2020, 8:28 AM
Marostegui moved this task from Triage to Pending comment on the DBA board.

Change 584882 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Reimage db2093 as buster

https://gerrit.wikimedia.org/r/584882

Change 584882 merged by Marostegui:
[operations/puppet@production] install_server: Reimage db2093 as buster

https://gerrit.wikimedia.org/r/584882

Change 584886 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Allow db2093 reimage without formating /srv

https://gerrit.wikimedia.org/r/584886

Change 584886 merged by Marostegui:
[operations/puppet@production] install_server: Allow db2093 reimage without formating /srv

https://gerrit.wikimedia.org/r/584886

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db2093.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202003310853_marostegui_87010.log.

Completed auto-reimage of hosts:

['db2093.codfw.wmnet']

and were ALL successful.

So, this wasn't as easy as it looked like :-)
db2093 (and tendril) uses tokudb, and the latest packages we are compiling we do not compile tokudb.
I have built a new package for 10.4.12 with tokudb enabled:

cmake . -DBUILD_CONFIG=mysql_release -DWITH_SSL=system -DPLUGIN_MROONGA=NO -DPLUGIN_OQGRAPH=NO -DPLUGIN_AWS_KEY_MANAGEMENT=NO -DPLUGIN_TOKUDB=yes -DWITH_JEMALLOC=system -DWITH_SYSTEMD=yes

However, we still have a problem when loading the plugin:

mysql:root@localhost [(none)]> INSTALL SONAME 'ha_tokudb';
ERROR 1126 (HY000): Can't open shared library '/opt/wmf-mariadb104/lib/plugin/ha_tokudb.so' (errno: 2, /lib/x86_64-linux-gnu/libjemalloc.so.2: cannot allocate memory in static TLS block)

Which seems related to: https://jira.mariadb.org/browse/MDEV-16183
We need tokudb not only for tendril but for dbstore hosts, as one of their databases uses Toku still, so we need to support it.
@jcrespo - I know you've dealt with TokuDB compilation issues in the past, is this something you've experienced in some other way?

One of the solutions I have read on a related ticket is to downgrade jemalloc, which is a no-go.

This comment is interesting

Permalink
mschorm
Michal Schorm added a comment - 2019-05-10 03:54
Can be closed.

I finally found a solution.

The "WITH_JEMALLOC" is likethe only option, that does not accept uppercase values, beacuse it's not CMake bool, but a string, which does not have those uppercase values defined.

When using "yes" or "no", it works as expected.

—

Offtopic question:
Why the ha_tokudb.so plugin does not link the jemalloc library right away and the LD_PRELOAD must be configured?

So looks like adding this to the systemd unit works:

[Service]
Environment="LD_PRELOAD=/lib/x86_64-linux-gnu/libjemalloc.so.2"
mysql:root@localhost [(none)]> INSTALL SONAME 'ha_tokudb';
Query OK, 0 rows affected (0.001 sec)

mysql:root@localhost [(none)]> pager grep Toku
PAGER set to 'grep Toku'
mysql:root@localhost [(none)]> show engines;
| TokuDB             | YES     | Percona TokuDB Storage Engine with Fractal Tree(tm) Technology                                  | YES          | YES  | YES        |
11 rows in set (0.000 sec)

Change 584958 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] control-mariadb-10.4: Increase package version

https://gerrit.wikimedia.org/r/584958

From 10.2 (https://mariadb.com/kb/en/changes-improvements-in-mariadb-102/):

MariaDB is no longer compiled with jemalloc
TokuDB is now a separate package, not part of the server RPM (because TokuDB still needs jemalloc).

I have also tried to pass it via [mysqld_safe] using malloc-lib=jemalloc instead of using the systemd unit but it doesn't make any difference.
There's a bug (not exactly the same) filed with mariadb https://jira.mariadb.org/browse/MDEV-16676 which is confirmed and set to be fixed on 10.4
But it is not exactly the same, as for us the flag is not making any difference.

This is with the LD_PRELOAD on the systemd unit:

root@db2093:/usr# lsof -n -p 12371 | grep lib | grep jemall
mysqld  12371 mysql  mem    REG                9,0     646352     798878 /usr/lib/x86_64-linux-gnu/libjemalloc.so.2

Via mysqld_safe:

root@db2093:/usr# lsof -n -p 14156 | grep lib | grep jemall
root@db2093:/usr#

So even though the server starts fine and mysqld_safe starts fine, it is not picking that flag, and it is listed at https://mariadb.com/kb/en/mysqld_safe/ could this be a mariadb bug?

Ok, so this works:

root@db2093:/opt/wmf-mariadb104# ./bin/mysqld_safe --malloc-lib=/lib/x86_64-linux-gnu/libjemalloc.so.2
200331 16:35:36 mysqld_safe Adding '/lib/x86_64-linux-gnu/libjemalloc.so.2' to LD_PRELOAD for mysqld
200331 16:35:36 mysqld_safe Logging to '/srv/sqldata/db2093.err'.
200331 16:35:36 mysqld_safe Starting mysqld daemon with databases from /srv/sqldata
mysql:root@localhost [(none)]> INSTALL SONAME 'ha_tokudb';
Query OK, 0 rows affected (0.014 sec)

mysql:root@localhost [(none)]>

Looks like that for the hosts that will require TokuDB (tendril and analytics dbstore) we need to find a way to use jemalloc on start if we want to keep using TokuDB

I just realised that dbstore hosts do not use TokuDB anymore.
The only tables we have with tokudb are on tendril and they are:

+-----------------------+--------------+
| table_name            | table_schema |
+-----------------------+--------------+
| client_statistics_log | tendril      |
| general_log_sampled   | tendril      |
| global_status_log     | tendril      |
| global_status_log_5m  | tendril      |
| index_statistics_log  | tendril      |
| innodb_locks_log      | tendril      |
| innodb_trx_log        | tendril      |
| queries               | tendril      |
| queries_seen_log      | tendril      |
| query_footprint_log   | tendril      |
| slave_status_log      | tendril      |
| slow_log_sampled      | tendril      |
| table_statistics_log  | tendril      |
| user_statistics_log   | tendril      |
+-----------------------+--------------+
14 rows in set (0.23 sec)

I think it is probably better to work towards converting them to InnoDB rather than keep maintaining TokuDB just for a few tables on Tendril?
Some of them are empty or apparently not in use:

client_statistics_log
+-----------------------+---------------------+
| table_name            | update_time         |
+-----------------------+---------------------+
| client_statistics_log | 2020-04-01 05:21:49 |
+-----------------------+---------------------+
general_log_sampled
+---------------------+---------------------+
| table_name          | update_time         |
+---------------------+---------------------+
| general_log_sampled | 2020-04-01 00:00:28 |
+---------------------+---------------------+
global_status_log
+-------------------+---------------------+
| table_name        | update_time         |
+-------------------+---------------------+
| global_status_log | 2020-04-01 05:21:48 |
+-------------------+---------------------+
global_status_log_5m
+----------------------+---------------------+
| table_name           | update_time         |
+----------------------+---------------------+
| global_status_log_5m | 2020-04-01 05:20:56 |
+----------------------+---------------------+
index_statistics_log
+----------------------+---------------------+
| table_name           | update_time         |
+----------------------+---------------------+
| index_statistics_log | 2020-04-01 05:21:50 |
+----------------------+---------------------+
innodb_locks_log
+------------------+---------------------+
| table_name       | update_time         |
+------------------+---------------------+
| innodb_locks_log | 2020-04-01 05:21:47 |
+------------------+---------------------+
innodb_trx_log
+----------------+---------------------+
| table_name     | update_time         |
+----------------+---------------------+
| innodb_trx_log | 2020-04-01 05:21:49 |
+----------------+---------------------+
queries
+------------+---------------------+
| table_name | update_time         |
+------------+---------------------+
| queries    | 2018-02-22 08:36:23 |
+------------+---------------------+
queries_seen_log
+------------------+---------------------+
| table_name       | update_time         |
+------------------+---------------------+
| queries_seen_log | 2020-04-01 00:00:34 |
+------------------+---------------------+
query_footprint_log
+---------------------+---------------------+
| table_name          | update_time         |
+---------------------+---------------------+
| query_footprint_log | 2014-05-30 02:44:55 |
+---------------------+---------------------+
slave_status_log
+------------------+---------------------+
| table_name       | update_time         |
+------------------+---------------------+
| slave_status_log | 2020-04-01 05:21:51 |
+------------------+---------------------+
slow_log_sampled
+------------------+---------------------+
| table_name       | update_time         |
+------------------+---------------------+
| slow_log_sampled | 2020-04-01 00:00:49 |
+------------------+---------------------+
table_statistics_log
+----------------------+---------------------+
| table_name           | update_time         |
+----------------------+---------------------+
| table_statistics_log | 2020-04-01 05:21:50 |
+----------------------+---------------------+
user_statistics_log
+---------------------+---------------------+
| table_name          | update_time         |
+---------------------+---------------------+
| user_statistics_log | 2020-04-01 05:21:50 |
+---------------------+---------------------+

I would assume most of those tables are tokuDB just for compression issues? I don't know the history, but that is my guess.
However, db1115 has plenty of space available:

root@db1115:/home/marostegui# df -hT  /srv
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/md2       xfs   3.6T  1.9T  1.8T  51% /srv

Some of them though, look huge:

341G	_tendril_sql_5174_62f9bc_key_i1_afef_1_1a_B_0.tokudb
339G	_tendril_sql_5174_62f9bc_key_i2_afef_1_1a_B_1.tokudb
805G	_tendril_sql_5174_62f9bc_main_afef_1_1a_B_2.tokudb

I guess they've never been purged.

Should we start by converting the smaller tables to InnoDB compressed and see what breaks and how much space they take on disk?

Marostegui renamed this task from Test tendril events on 10.4 to Test tendril events and tokudb on 10.4.Apr 1 2020, 5:54 AM
Marostegui updated the task description. (Show Details)

Looks like that for the hosts that will require TokuDB (tendril and analytics dbstore) we need to find a way to use jemalloc on start if we want to keep using TokuDB

Per Sergey's comment on https://jira.mariadb.org/browse/MDEV-15034, the LDPRELOAD hack wil be the only option, apparently they used to dlopen() libjemalloc at runtime, but the comment from the jemalloc upstream developer at https://github.com/jemalloc/jemalloc/issues/937#issuecomment-311242450 explains why that no longer works with 5.x which is in Buster.

Maybe we can start by purging global_log* tables more often to reduce their size:

root@db1115.eqiad.wmnet[tendril]> show create event tendril_purge_global_status_log_5m\G
*************************** 1. row ***************************
               Event: tendril_purge_global_status_log_5m
            sql_mode:
           time_zone: SYSTEM
        Create Event: CREATE DEFINER=`root`@`208.80.154.151` EVENT `tendril_purge_global_status_log_5m` ON SCHEDULE EVERY 5 MINUTE STARTS '2014-05-23 00:00:00' ON COMPLETION NOT PRESERVE ENABLE DO begin

    if (get_lock('tendril_purge_global_status_log_5m', 1) = 0) then
      signal sqlstate value '45000' set message_text = 'get_lock';
    end if;

    select @stamp := now() - interval 7 day;
    delete from global_status_log_5m where stamp < @stamp limit 10000;

    do release_lock('tendril_purge_global_status_log_5m');
  end
character_set_client: utf8
collation_connection: utf8_general_ci
  Database Collation: latin1_swedish_ci
1 row in set (0.00 sec)

Not sure if we really need 7 days there.

For global_status maybe we don't need a whole month of data:

root@db1115.eqiad.wmnet[tendril]> select * from global_status_log order by stamp asc limit 1;
+-----------+---------------------+---------+-----------+
| server_id | stamp               | name_id | value     |
+-----------+---------------------+---------+-----------+
|      1273 | 2020-03-03 00:00:00 |     421 | 180367401 |
+-----------+---------------------+---------+-----------+
1 row in set (0.73 sec)

Looks like that for the hosts that will require TokuDB (tendril and analytics dbstore) we need to find a way to use jemalloc on start if we want to keep using TokuDB

Per Sergey's comment on https://jira.mariadb.org/browse/MDEV-15034, the LDPRELOAD hack wil be the only option, apparently they used to dlopen() libjemalloc at runtime, but the comment from the jemalloc upstream developer at https://github.com/jemalloc/jemalloc/issues/937#issuecomment-311242450 explains why that no longer works with 5.x which is in Buster.

Yeah, looks like that's the only workaround for now, if we need to stick to TokuDB. Thank you for taking a look.

Change 585115 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] tendril.pp: Set default basedir depending on the OS

https://gerrit.wikimedia.org/r/585115

Yep, this is known, tendril doesn't work on buster anyway, so it is a non-issue.

Yep, this is known, tendril doesn't work on buster anyway, so it is a non-issue.

Non-issue? Key tables for tendril are on TokuDB

Non-issue? Key tables for tendril are on TokuDB

Yes, the format of the tables doesn't matter if the application cannot even run on buster- see db2093. I sent an email to you and mark mentioning this back on October 0:-), very worried. The issue with tokudb doesn't matter, because the application doesn't even start with buster. So no reason to upgrade/do maintenance on something that cannot even run in the first place. This was tracked on T224589 and warned by email at "Finish current goals and add goal in the middle of the quarter? (backups/tendril)". This was considered a low prio issue/I shouldn't work on it. That is why I say it is a non-issue- and mark supports this view. Not worth fixing what it is broken beyond repair.

Maybe we will get something out of Google Summer of Code?

Change 584958 abandoned by Marostegui:
control-mariadb-10.4: Increase package version

Reason:
https://phabricator.wikimedia.org/T248957#6017575

https://gerrit.wikimedia.org/r/584958

Change 585115 merged by Marostegui:
[operations/puppet@production] tendril.pp: Set default basedir depending on the OS

https://gerrit.wikimedia.org/r/585115