
db1069: convert user_groups table to InnoDB across all the wikis
Closed, Resolved (Public)


To avoid further crashes on this table, we should alter the user_groups table across all the wikis to convert it to InnoDB.

Event Timeline

root@db1069:/srv# find . -name user_groups.frm  | awk -F "." '{print $3}' | awk -F "/" '{print $1}' | uniq -c | sort -k2
      1 s1
     17 s2
    830 s3
      1 s4
      2 s5
      3 s6
     12 s7
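
The pipeline above takes the third "."-separated field of each path and then the first "/"-separated field of that, which leaves the shard name. Since uniq only collapses adjacent duplicates, a variant that sorts first does not depend on the order in which find returns the paths. A minimal sketch, assuming paths of the form ./<prefix>.<shard>/<db>/user_groups.frm (the function name, the "sqldata" prefix and the wiki names are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: read .frm paths on stdin, print a count per shard.
count_by_shard() {
    # Splitting on "." makes field 3 look like "sN/<db>/user_groups";
    # splitting that on "/" keeps only "sN".
    awk -F '.' '{print $3}' | awk -F '/' '{print $1}' | sort | uniq -c
}

# Example with made-up paths (the real layout on db1069 may differ).
printf '%s\n' \
    './sqldata.s2/examplewiki_a/user_groups.frm' \
    './sqldata.s1/examplewiki_b/user_groups.frm' \
    './sqldata.s2/examplewiki_c/user_groups.frm' | count_by_shard
```

On the real host the input would come from the same find command: find . -name user_groups.frm | count_by_shard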

@jcrespo I assume we want these tables converted to InnoDB across all the shards, right? And also, replicate that alter downstream.

Yes, ideally everything would be on InnoDB, but we can probably only do it on a subset of tables for now. These should be fine size-wise (they should be small), but check the available disk space on both db1069 and labsdb100[13]. Also check that replication filtering is not broken in the process (it should not be for a simple engine change). Just be careful about the replication filtering, the triggers on sanitarium, and the existing views on labs.

Do not worry about lag, the conversion should be fast and user queries usually create worse issues. Just do plain alters and let them replicate.
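
A "plain alter" here would be a simple ALTER TABLE ... ENGINE=InnoDB per wiki database. A minimal sketch of generating those statements from a list of database names; the function name is hypothetical and the script only prints SQL, it does not connect to any server:

```shell
#!/bin/sh
# Hypothetical helper: print one ALTER statement per database name on stdin.
# Nothing is executed; the output is meant to be reviewed before use.
gen_alters() {
    while IFS= read -r db; do
        # Skip blank lines in the input list.
        [ -n "$db" ] || continue
        printf 'ALTER TABLE `%s`.`user_groups` ENGINE=InnoDB;\n' "$db"
    done
}

# Example: three wiki database names.
printf '%s\n' frwiki jawiki ruwiki | gen_alters
```

Printing the statements instead of piping them straight into a client leaves room to check the list before anything is replicated downstream.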

So far I have only converted:

S1: enwiki/user_groups

It all looked fine, but I do not want to convert more tables at the end of the week, in case we find something weird, given that we will be at an offsite next week.
The table is 12M, so disk space shouldn't be an issue if they are all around that size.
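
As a rough sanity check on disk space, the per-shard counts from the find output can be multiplied by the 12M figure. This is only a sketch under the assumption stated above, that every table is about as large as the enwiki one; the real sizes were not measured here.

```shell
#!/bin/sh
# Per-shard table counts taken from the find output earlier in this task.
tables=$((1 + 17 + 830 + 1 + 2 + 3 + 12))
# Assumed size per table in MB (the enwiki figure quoted above).
per_table_mb=12
echo "$tables tables, about $((tables * per_table_mb)) MB if all are 12M"
```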

I think user_groups on s3 failed today:

*************************** 8. row ***************************
      Id: 3472432
    User: system user
      db: urwiki
 Command: Connect
    Time: 11074
   State: update
    Info: INSERT /* User::addGroup  */ IGNORE INTO `user_groups` (ug_user,ug_group) VALUES ('XXXXX
Progress: 0.000
jcrespo triaged this task as High priority. Sep 28 2016, 5:15 PM

I have just converted the S2 user_groups tables to InnoDB.

Note: Percona has not replied yet to the bug report after I sent the stacktraces 9 days ago.

Mentioned in SAL (#wikimedia-operations) [2016-10-03T06:30:21Z] <marostegui> altering S3,S4,S5,S6,S7 user_groups tables in sanitarium to avoid tokudb bug - T146121

S3 tables have been converted to InnoDB.

S4 (commonswiki database) table has been converted

S5 (dewiki and wikidatawiki) table converted

S6 (frwiki, jawiki, ruwiki) converted

S7 tables converted

All the user_groups tables across all the shards now use the InnoDB engine instead of TokuDB.
It will be difficult to provide further stack traces for the open Percona/MariaDB bug, as the crashes are likely to be resolved by converting everything to InnoDB.
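
One way to double-check the end state is to list any user_groups table whose engine is still something other than InnoDB. A minimal sketch, assuming tab-separated "schema, engine" lines as input, for example from a query on information_schema.tables run with a batch-mode client; the function name and the sample rows are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read "schema<TAB>engine" lines on stdin and print
# the schemas whose engine is not InnoDB (case-insensitive comparison).
not_innodb() {
    awk -F '\t' 'tolower($2) != "innodb" {print $1}'
}

# Possible source of the input (not run here):
#   mysql -BN -e "SELECT table_schema, engine FROM information_schema.tables
#                 WHERE table_name = 'user_groups'"
# Example with made-up rows:
printf 'examplewiki_a\tInnoDB\nexamplewiki_b\tTokuDB\n' | not_innodb
```

An empty result would mean no user_groups table is left on another engine on that host.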