Page MenuHomePhabricator

Fix dbstore2001 and dbstore2002
Closed, ResolvedPublic

Description

Aria engine on dbstore2002 crashed, the symptom is that the replica was broken with:

160316 14:13:56 [ERROR] Master 's3': Slave SQL: Error 'Got error 122 "Internal (unspecified) error in handler" from storage engine Aria' on query. Default database: 'bnwikisource'. Query: 'INSERT /* ShortUrlUtils::encodeTitle [...OMISSIS...] */ IGNORE INTO `shorturls`...
5527782, Internal MariaDB error code: 1030
160316 14:13:56 [Warning] Master 's3': Slave: Got error 122 "Internal (unspecified) error in handler" from storage engine Aria Error_code: 1030

A check of the table revealed that it was crashed:

MariaDB DBSTORE localhost bnwikisource > CHECK TABLE shorturls;
+------------------------+-------+----------+----------------------------------------------------------------------------------------------------+
| Table                  | Op    | Msg_type | Msg_text                                                                                           |
+------------------------+-------+----------+----------------------------------------------------------------------------------------------------+
| bnwikisource.shorturls | check | error    | Page 3010:  Wrong data in bitmap.  Page_type: 1  full: 0  empty_space: 2916  Bitmap-bits: 4 'full' |
| bnwikisource.shorturls | check | error    | Corrupt                                                                                            |
+------------------------+-------+----------+----------------------------------------------------------------------------------------------------+
2 rows in set (0.51 sec)

A repair worked fine:

MariaDB DBSTORE localhost bnwikisource > REPAIR NO_WRITE_TO_BINLOG TABLE shorturls;
+------------------------+--------+----------+----------------------------------------------+
| Table                  | Op     | Msg_type | Msg_text                                     |
+------------------------+--------+----------+----------------------------------------------+
| bnwikisource.shorturls | repair | warning  | Number of rows changed from 216621 to 216622 |
| bnwikisource.shorturls | repair | status   | OK                                           |
+------------------------+--------+----------+----------------------------------------------+
2 rows in set (1.47 sec)

But as soon as the slave was restarted the table crashed again. In fact all the tables with Aria engine were seen as crashed leading to the suggestion that the Aria engine itself was crashed.

dbstore2001 is probably corrupt, too. Both need a reload (TokuDB -> InnoDB compressed anyway).

Event Timeline

After having stopped all the slaves (is with multi-source replication) I'm attempting a clean shutdown

jcrespo triaged this task as Medium priority.Mar 16 2016, 4:59 PM

Low to normal- this server is unused.

Unable to stop after more that 2 hours, looks like it's in a deadlock, this is a ~10 seconds strace

root@dbstore2002:/srv/sqldata# strace -fF -p 4068
Process 4068 attached with 39 threads
[pid 64429] futex(0x17282cc, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 47289] futex(0x7fa94f4cf0cc, FUTEX_WAIT_PRIVATE, 5, NULL <unfinished ...>
[pid  4244] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  4243] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  4242] futex(0x7fbdb78fc744, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  4241] select(0, NULL, NULL, NULL, {0, 106130} <unfinished ...>
[pid  4240] select(0, NULL, NULL, NULL, {0, 177439} <unfinished ...>
[pid  4236] futex(0x7fbdb78fc5c4, FUTEX_WAIT_PRIVATE, 225430733, NULL <unfinished ...>
[pid  4232] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  4105] futex(0x7fbdb7b6f844, FUTEX_WAIT_PRIVATE, 11083109, NULL <unfinished ...>
[pid  4234] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  4233] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid  4104] futex(0x7fbdb7b6f7c4, FUTEX_WAIT_PRIVATE, 10951849, NULL <unfinished ...>
[pid  4103] futex(0x7fbdb7b6f744, FUTEX_WAIT_PRIVATE, 10945787, NULL <unfinished ...>
[pid  4102] futex(0x7fbdb7b6f6c4, FUTEX_WAIT_PRIVATE, 11054577, NULL <unfinished ...>
[pid  4101] futex(0x7fbdb7b6f644, FUTEX_WAIT_PRIVATE, 11031009, NULL <unfinished ...>
[pid  4100] futex(0x7fbdb7b6f5c4, FUTEX_WAIT_PRIVATE, 11047271, NULL <unfinished ...>
[pid  4099] futex(0x7fbdb7b6f544, FUTEX_WAIT_PRIVATE, 11056801, NULL <unfinished ...>
[pid  4098] futex(0x7fbdb7b6f4c4, FUTEX_WAIT_PRIVATE, 11372307, NULL <unfinished ...>
[pid  4097] futex(0x7fbdb7b6f444, FUTEX_WAIT_PRIVATE, 2029805, NULL <unfinished ...>
[pid  4096] futex(0x7fbdb7b6f3c4, FUTEX_WAIT_PRIVATE, 2034503, NULL <unfinished ...>
[pid  4095] futex(0x7fbdb7b6f344, FUTEX_WAIT_PRIVATE, 2028781, NULL <unfinished ...>
[pid  4094] futex(0x7fbdb7b6f2c4, FUTEX_WAIT_PRIVATE, 2046561, NULL <unfinished ...>
[pid  4093] futex(0x7fbdb7b6f244, FUTEX_WAIT_PRIVATE, 2040015, NULL <unfinished ...>
[pid  4092] futex(0x7fbdb7b6f1c4, FUTEX_WAIT_PRIVATE, 2036883, NULL <unfinished ...>
[pid  4091] futex(0x7fbdb7b6f144, FUTEX_WAIT_PRIVATE, 2085673, NULL <unfinished ...>
[pid  4090] futex(0x7fbdb7b6f0c4, FUTEX_WAIT_PRIVATE, 2054163, NULL <unfinished ...>
[pid  4089] futex(0x7fbdb78fdfc4, FUTEX_WAIT_PRIVATE, 2072907, NULL <unfinished ...>
[pid  4088] futex(0x7fbdb78fdf44, FUTEX_WAIT_PRIVATE, 2070597, NULL <unfinished ...>
[pid  4086] futex(0x7fbdb78fde44, FUTEX_WAIT_PRIVATE, 2081527, NULL <unfinished ...>
[pid  4087] futex(0x7fbdb78fdec4, FUTEX_WAIT_PRIVATE, 2074451, NULL <unfinished ...>
[pid  4085] futex(0x7fbdb78fddc4, FUTEX_WAIT_PRIVATE, 2057155, NULL <unfinished ...>
[pid  4082] futex(0x7fbdb78fdc44, FUTEX_WAIT_PRIVATE, 2067365, NULL <unfinished ...>
[pid  4084] futex(0x7fbdb78fdd44, FUTEX_WAIT_PRIVATE, 2057405, NULL <unfinished ...>
[pid  4083] futex(0x7fbdb78fdcc4, FUTEX_WAIT_PRIVATE, 2095117, NULL <unfinished ...>
[pid  4081] futex(0x7fbdb78fdbc4, FUTEX_WAIT_PRIVATE, 295713, NULL <unfinished ...>
[pid  4080] futex(0x7fbdb78fdb44, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid  4068] futex(0x13d7fe4, FUTEX_WAIT_PRIVATE, 7, NULL <unfinished ...>
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4233] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091087, {1458150749, 307508000}, ffffffff <unfinished ...>
[pid  4232] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091171, {1458150749, 629917000}, ffffffff <unfinished ...>
[pid  4234] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid  4234] futex(0x7fbdb78fc680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4234] futex(0x7fbdb78fc6c4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 418261, {1458150753, 758313000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4243] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid  4243] futex(0x7fb54639d580, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4243] futex(0x7fb54639d5c4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 341649, {1458150759, 100453000}, ffffffff <unfinished ...>
[pid  4244] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid  4244] futex(0x7fb4f5499080, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4244] futex(0x7fb4f54990c4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 418261, {1458150754, 136585000}, ffffffff <unfinished ...>
[pid  4233] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091089, {1458150750, 307861000}, ffffffff <unfinished ...>
[pid  4232] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091173, {1458150750, 630235000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {0, 999000} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4233] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091091, {1458150751, 308187000}, ffffffff <unfinished ...>
[pid  4232] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091175, {1458150751, 630545000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4233] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091093, {1458150752, 308495000}, ffffffff <unfinished ...>
[pid  4232] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091177, {1458150752, 630864000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4233] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091095, {1458150753, 308824000}, ffffffff <unfinished ...>
[pid  4232] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091179, {1458150753, 631179000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4233] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091097, {1458150754, 309142000}, ffffffff <unfinished ...>
[pid  4232] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091181, {1458150754, 631515000}, ffffffff <unfinished ...>
[pid  4234] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4234] futex(0x7fbdb78fc680, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4234] futex(0x7fbdb78fc6c4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 418263, {1458150758, 758640000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4244] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4244] futex(0x7fb4f5499080, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4244] futex(0x7fb4f54990c4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 418263, {1458150759, 136911000}, ffffffff <unfinished ...>
[pid  4233] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091099, {1458150755, 309506000}, ffffffff <unfinished ...>
[pid  4232] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091183, {1458150755, 631810000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4240] <... select resumed> )      = 0 (Timeout)
[pid  4240] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4233] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4233] futex(0x7fbdb78fc600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4233] futex(0x7fbdb78fc644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091101, {1458150756, 309808000}, ffffffff <unfinished ...>
[pid  4232] <... futex resumed> )       = -1 ETIMEDOUT (Connection timed out)
[pid  4232] futex(0x7fb54639c600, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4232] futex(0x7fb54639c644, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 2091185, {1458150756, 632132000}, ffffffff <unfinished ...>
[pid  4235] <... select resumed> )      = 0 (Timeout)
[pid  4235] select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid  4241] <... select resumed> )      = 0 (Timeout)
[pid  4241] select(0, NULL, NULL, NULL, {1, 0}^CProcess 4068 detached
Process 4080 detached
Process 4081 detached
Process 4082 detached
Process 4083 detached
Process 4084 detached
Process 4085 detached
Process 4086 detached
Process 4087 detached
Process 4088 detached
Process 4089 detached
Process 4090 detached
Process 4091 detached
Process 4092 detached
Process 4093 detached
Process 4094 detached
Process 4095 detached
Process 4096 detached
Process 4097 detached
Process 4098 detached
Process 4099 detached
Process 4100 detached
Process 4101 detached
Process 4102 detached
Process 4103 detached
Process 4104 detached
Process 4105 detached
Process 4232 detached
Process 4233 detached
Process 4234 detached
Process 4235 detached
Process 4236 detached
Process 4240 detached
Process 4241 detached
 <detached ...>
Process 4242 detached
Process 4243 detached
Process 4244 detached
Process 47289 detached
Process 64429 detached

It required a manual kill -9 of both mysqld_safe and mysqld processes, after have tried with sigint.

At restart Aria failed to start, following the instruction in the log to check first all Aria tables and repair them if needed:

160316 18:08:25 mysqld_safe Starting mysqld daemon with databases from /srv/sqldata
160316 18:08:25 [Note] /opt/wmf-mariadb10/bin/mysqld (mysqld 10.0.22-MariaDB) starting as process 44350 ...
2016-03-16 18:08:25 7faded9a27c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
160316 18:08:25 [Note] InnoDB: Using mutexes to ref count buffer pool pages
160316 18:08:25 [Note] InnoDB: The InnoDB memory heap is disabled
160316 18:08:25 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
160316 18:08:25 [Note] InnoDB: Memory barrier is not used
160316 18:08:25 [Note] InnoDB: Compressed tables use zlib 1.2.8
160316 18:08:25 [Note] InnoDB: Using CPU crc32 instructions
160316 18:08:25 [Note] InnoDB: Initializing buffer pool, size = 32.0G
160316 18:08:27 [Note] InnoDB: Completed initialization of buffer pool
160316 18:08:27 [Note] InnoDB: Highest supported file format is Barracuda.
160316 18:08:27 [Note] InnoDB: The log sequence numbers 37139497232102 and 37139497232102 in ibdata files do not match the log sequence number 37872262767273 in the ib_logfiles!
160316 18:08:27 [Note] InnoDB: Database was not shutdown normally!
160316 18:08:27 [Note] InnoDB: Starting crash recovery.
160316 18:08:27 [Note] InnoDB: Reading tablespace information from the .ibd files...
160316 18:12:37 [Note] InnoDB: Restoring possible half-written data pages
160316 18:12:37 [Note] InnoDB: from the doublewrite buffer...
160316 18:12:43 [Note] InnoDB: 128 rollback segment(s) are active.
160316 18:12:43 [Note] InnoDB: Waiting for purge to start
160316 18:12:43 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.26-74.0 started; log sequence number 37872262767273
2016-03-16 18:12:43 7fa505bf9700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
160316 18:12:43 [Note] mysqld: Aria engine: starting recovery
recovered pages: 0% 10% 20%
Got error 127 when executing record redo_insert_row_head
160316 18:12:46 [ERROR] mysqld: Aria engine: Redo phase failed
tables to flush: 3 2 1 0
160316 18:12:46 [ERROR] mysqld: Aria recovery failed. Please run aria_chk -r on all Aria tables and delete all aria_log.######## files
160316 18:12:46 [ERROR] Plugin 'Aria' init function returned error.
160316 18:12:46 [ERROR] Plugin 'Aria' registration as a STORAGE ENGINE failed.
160316 18:12:46 [Note] Plugin 'FEEDBACK' is disabled.
160316 18:12:47 [Note] CONNECT: Version 1.03.0007 October 20, 2015
160316 18:12:47 [ERROR] Unknown/unsupported storage engine: Aria
160316 18:12:47 [ERROR] Aborting

160316 18:12:48 [Note] InnoDB: FTS optimize thread exiting.
160316 18:12:48 [Note] InnoDB: Starting shutdown...
2016-03-16 18:12:48 7fa505bf9700 InnoDB: Buffer pool(s) load completed at 160316 18:12:48
2016-03-16 18:12:48 7fa505bf9700 InnoDB: Dumping buffer pool(s) to .//ib_buffer_pool
2016-03-16 18:12:48 7fa505bf9700 InnoDB: Buffer pool(s) dump completed at 160316 18:12:48
160316 18:12:51 [Note] InnoDB: Shutdown completed; log sequence number 37872262770226
160316 18:12:51 [Note] /opt/wmf-mariadb10/bin/mysqld: Shutdown complete

160316 18:12:51 mysqld_safe mysqld from pid file /srv/sqldata/dbstore2002.pid ended

./bnwikisource/shorturls.MAI was marked as crashed but was actually ok, I had already it repaired before the shutdown, just checking it updated it's status to ok.

The only other crashed table was:

Checking Aria file: ./log/EchoInteraction_15180901.MAI
Data records: 27141009   Deleted blocks:       0
- check file-size
- check key delete-chain
- check index reference
- check data record references index: 1
- check data record references index: 2
aria_chk: error: Found 27140064 keys of 27141009
- check record links
aria_chk: error: Page   1059761:  Wrong data in bitmap.  Page_type: 1  full: 0  empty_space: 6949  Bitmap-bits: 4 'full'
State updated
Aria table './log/EchoInteraction_15180901.MAI' is corrupted
Fix it using switch "-r" or "-o"

Repaired with:

root@dbstore2002:/srv/sqldata# /opt/wmf-mariadb10/bin/aria_chk -r ./log/EchoInteraction_15180901.MAI
- recovering (with sort) Aria-table './log/EchoInteraction_15180901.MAI'
Data records: 27141009
- Fixing index 1
- Fixing index 2
Data records: 27141010
State updated

Mentioned in SAL [2016-03-16T19:02:31Z] <volans> restarted mysql on dbstore2002 and all replicas except x1, still investigating T130128

Restarted MySQL skipping slave start and got this log:

160316 18:53:32 mysqld_safe Starting mysqld daemon with databases from /srv/sqldata
160316 18:53:32 [Note] /opt/wmf-mariadb10/bin/mysqld (mysqld 10.0.22-MariaDB) starting as process 55024 ...
2016-03-16 18:53:32 7efe142677c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
160316 18:53:32 [Note] InnoDB: Using mutexes to ref count buffer pool pages
160316 18:53:32 [Note] InnoDB: The InnoDB memory heap is disabled
160316 18:53:32 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
160316 18:53:32 [Note] InnoDB: Memory barrier is not used
160316 18:53:32 [Note] InnoDB: Compressed tables use zlib 1.2.8
160316 18:53:32 [Note] InnoDB: Using CPU crc32 instructions
160316 18:53:32 [Note] InnoDB: Initializing buffer pool, size = 32.0G
160316 18:53:35 [Note] InnoDB: Completed initialization of buffer pool
160316 18:53:35 [Note] InnoDB: Highest supported file format is Barracuda.
160316 18:53:39 [Note] InnoDB: 128 rollback segment(s) are active.
160316 18:53:39 [Note] InnoDB: Waiting for purge to start
160316 18:53:39 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.26-74.0 started; log sequence number 37872262778652
2016-03-16 18:53:39 7ef5363f8700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
160316 18:53:40 [ERROR] mysqld: File '/srv/sqldata/aria_log.00000163' not found (Errcode: 2 "No such file or directory")
160316 18:53:40 [Note] Plugin 'FEEDBACK' is disabled.
160316 18:53:40 [Note] CONNECT: Version 1.03.0007 October 20, 2015
160316 18:53:40 [Note] Recovering after a crash using tc.log
160316 18:53:40 [Note] Starting crash recovery...
160316 18:53:40 [Note] Crash recovery finished.
160316 18:53:40 [Note] Server socket created on IP: '::'.
160316 18:53:40 [Note] Event Scheduler: scheduler thread started with id 1
160316 18:53:40 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--log-basename=#' or '--relay-log=dbstore2002-relay-bin' to avoid this problem.
160316 18:53:40 [Note] /opt/wmf-mariadb10/bin/mysqld: ready for connections.
Version: '10.0.22-MariaDB'  socket: '/tmp/mysql.sock'  port: 3306  MariaDB Server
2016-03-16 18:53:42 7ef5363f8700 InnoDB: Buffer pool(s) load completed at 160316 18:53:42

And after that started logging every few seconds:

160316 18:54:00 [ERROR] Event Scheduler: [root@208.80.154.151][log.delete_schedule2] Table 'log.MobileWebClickTracking_5830671' doesn't exist
160316 18:54:00 [Note] Event Scheduler: [root@208.80.154.151].[log.delete_schedule2] event execution failed.

I started manually all the replicas, s1-s7 went well and are now replicating, x1 failed with:

160316 18:57:30 [Note] Master 'x1': Slave SQL thread initialized, starting replication in log 'db2009-bin.000837' at position 378850, relay log './dbstore2002-relay-bin-x1.000831' position: 379138
160316 18:57:30 [Note] Master 'x1': Slave I/O thread: connected to master 'repl@db2009.codfw.wmnet:3306',replication started in log 'db2009-bin.000837' at position 378850
160316 18:57:30 [ERROR] Master 'x1': Slave SQL: Error executing row event: 'Table 'wikishared.echo_unread_wikis' doesn't exist', Gtid 0-101618-190114053, Internal MariaDB error code: 1146
160316 18:57:30 [Warning] Master 'x1': Slave: Table 'wikishared.echo_unread_wikis' doesn't exist Error_code: 1146
160316 18:57:30 [ERROR] Master 'x1': Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'db2009-bin.000837' position 378850

Looking at the past in the log and in the git log/blame of the file I found that the wikishared replication was added on Dec 28th 2015 and the next restart was for the heartbeat stuff on March 4th, where the same error popped up.

Hence I guess that DB was added later given is not at all on dbstore2002, @jcrespo can you confirm?

I've also found that the replication filter show twice all the entries, from show slave status\G it was:

Replicate_Wild_Do_Table: flowdb.%,wikishared.%,heartbeat.%,flowdb.%,wikishared.%,heartbeat.%

Looks something like http://bugs.mysql.com/bug.php?id=19546 but it's a 10 year old bug... I'll investigate more another time of this.

I've changed it to:

MariaDB DBSTORE localhost (none) > SET default_master_connection='x1';
Query OK, 0 rows affected (0.00 sec)

MariaDB DBSTORE localhost (none) > show variables like '%Replicate_Wild_Do_Table%';
+-------------------------+---------------------------------------------------------------------+
| Variable_name           | Value                                                               |
+-------------------------+---------------------------------------------------------------------+
| replicate_wild_do_table | flowdb.%,wikishared.%,heartbeat.%,flowdb.%,wikishared.%,heartbeat.% |
+-------------------------+---------------------------------------------------------------------+
1 row in set (0.01 sec)

MariaDB DBSTORE localhost (none) > set global replicate_wild_do_table = 'flowdb.%,heartbeat.%';
Query OK, 0 rows affected (0.00 sec)

start slave;

And everything went back to normal. We still need to decide if we can trust this DB or in any case we want to restore it from another one.

In any case we need to fix puppet or import the wikishared DB in order to have it restart properly

DB was added later given is not at all on dbstore2002, @jcrespo can you confirm?

Yes, as I mentioned, x1 is not fully available on all dbstores due to space and/or capacity reasons.

I've also found that the replication filter show twice all the entries

A wonderful mariadb bug, but gladly, inconsequential.

We still need to decide if we can trust this DB or in any case we want to restore it from another one.

After a slave crashes, there is a 99% of it getting corrupted due to non-transactional replication replication control position. Bonus points if it involves non-transactional tables. We need to reimport it.

This crashed today after restart, needs full wipe and repopulation.

I've ignored "enwiki.pagelinks" on dbstore2002 replication, so it can at least continue replicating until reimported.

Corruption is certain on s2, s3 and s6 (recentchanges deletions failed), but I have skipped events to avoid running out of space (relay log).

jcrespo renamed this task from Fix dbstore2002 to Fix dbstore2001 and dbstore2002.Aug 26 2016, 7:33 AM
jcrespo updated the task description. (Show Details)

Change 328671 had a related patch set uploaded (by Jcrespo):
dbstore: configuration changes to make InnoDB the main storage

https://gerrit.wikimedia.org/r/328671

Change 332228 had a related patch set uploaded (by Marostegui):
mariadb: Split dbstore role classes

https://gerrit.wikimedia.org/r/332228

For the record: I have restarted dbstore2001 mysql to manually apply the variables to make innodb the default storage engine as well as increasing its buffer_pool size as I didn't wait for this code review as we need to see if the server catches up nicely with this flags changed.

This is what have changed:

innodb_buffer_pool_size        = 100G
default-storage-engine = InnoDB
#plugin-load = ha_tokudb
#tokudb_cache_size = 24G
#tokudb_fs_reserve_percent = 1

For the record: I have restarted dbstore2001 mysql to manually apply the variables to make innodb the default storage engine as well as increasing its buffer_pool size as I didn't wait for this code review as we need to see if the server catches up nicely with this flags changed.

This is what have changed:

default-storage-engine = InnoDB
#plugin-load = ha_tokudb
#tokudb_cache_size = 24G
#tokudb_fs_reserve_percent = 1

After this change dbstore2001 is catching up quite nicely :-)
So once: https://gerrit.wikimedia.org/r/#/c/332228/ is +1'ed those flags will be deployed with puppet and I will also restart dbstore2002 now.

After this change dbstore2001 is catching up quite nicely :-)
So once: https://gerrit.wikimedia.org/r/#/c/332228/ is +1'ed those flags will be deployed with puppet and I will also restart dbstore2002 now.

It keeps catching up and since yesterday it was around 70k seconds delayed and today it is 19k :-)
Same goes for s5 which decreased from 70k to 14k

This is what have changed:

default-storage-engine = InnoDB
#plugin-load = ha_tokudb
#tokudb_cache_size = 24G
#tokudb_fs_reserve_percent = 1

After this change dbstore2001 is catching up quite nicely :-)

You changed the buffer pool, too https://grafana.wikimedia.org/dashboard/db/mysql?panelId=1&fullscreen&from=1484655472342&to=1484788753698&var-dc=codfw%20prometheus%2Fops&var-server=dbstore2001 which is the real improvement here :-)

This is what have changed:

default-storage-engine = InnoDB
#plugin-load = ha_tokudb
#tokudb_cache_size = 24G
#tokudb_fs_reserve_percent = 1

After this change dbstore2001 is catching up quite nicely :-)

You changed the buffer pool, too https://grafana.wikimedia.org/dashboard/db/mysql?panelId=1&fullscreen&from=1484655472342&to=1484788753698&var-dc=codfw%20prometheus%2Fops&var-server=dbstore2001 which is the real improvement here :-)

Ah yes, looks like I forgot to copy that line :-) I will edit the orginal comment

All good!

root@dbstore2001:~# mysql --skip-ssl -e "show all slaves status\G" | egrep "Connection|Seconds"
              Connection_name: m3
        Seconds_Behind_Master: 0
              Connection_name: s1
        Seconds_Behind_Master: 0
              Connection_name: s2
        Seconds_Behind_Master: 0
              Connection_name: s3
        Seconds_Behind_Master: 0
              Connection_name: s4
        Seconds_Behind_Master: 0
              Connection_name: s5
        Seconds_Behind_Master: 87
              Connection_name: s6
        Seconds_Behind_Master: 0
              Connection_name: s7
        Seconds_Behind_Master: 0

Change 332228 merged by Marostegui:
mariadb: Split dbstore role classes

https://gerrit.wikimedia.org/r/332228

I have deployed the change that split the classes into dbstore (tokuDB) and dbstore2 (InnoDB)

I have enabled delayed replicationon dbstore2001- it is not dangerous, as worse case scenario it just stops the replication thread more than it should. In particular, it was badly/not completely documented. Aside from the things on:

events_dbstore_delayed.sql

A view has to be created with:

CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `ss` AS select `slave_status`.`Connection_name` AS `con`,substring_index(`slave_status`.`Master_Host`,'.',1) AS `master`,`slave_status`.`Master_Port` AS `port`,`slave_status`.`Slave_IO_Running` AS `io`,`slave_status`.`Slave_SQL_Running` AS `sql`,`slave_status`.`Seconds_Behind_Master` AS `lag` from `slave_status`;

and the connect engine has to be enabled- which is a no-go for non dedicated stuff. Aside from that, it is now getting delayed, so it seems to work. We will see if it maintains the delay during the weekend. The events have its own log at: ops.event_log.

Mentioned in SAL (#wikimedia-operations) [2017-02-10T18:08:56Z] <jynus> renabling delayed replication for dbstore2001 T130128

I have enabled delayed replicationon dbstore2001- it is not dangerous, as worse case scenario it just stops the replication thread more than it should. In particular, it was badly/not completely documented. Aside from the things on:

events_dbstore_delayed.sql

A view has to be created with:

CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `ss` AS select `slave_status`.`Connection_name` AS `con`,substring_index(`slave_status`.`Master_Host`,'.',1) AS `master`,`slave_status`.`Master_Port` AS `port`,`slave_status`.`Slave_IO_Running` AS `io`,`slave_status`.`Slave_SQL_Running` AS `sql`,`slave_status`.`Seconds_Behind_Master` AS `lag` from `slave_status`;

and the connect engine has to be enabled- which is a no-go for non dedicated stuff. Aside from that, it is now getting delayed, so it seems to work. We will see if it maintains the delay during the weekend. The events have its own log at: ops.event_log.

So it is going fine!

root@dbstore2001:~# mysql --skip-ssl -e "show all slaves status\G" | grep Seconds
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: 86491
        Seconds_Behind_Master: 86633
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: NULL
root@dbstore2001:~# bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
86448/3600
24
86653/3600
24

It happened by chance to just start some replication threads again:

root@dbstore2001:~# mysql --skip-ssl -e "show all slaves status\G" | egrep "Connection_name|Seconds"
              Connection_name: s1
        Seconds_Behind_Master: 86635
              Connection_name: s2
        Seconds_Behind_Master: 86673
              Connection_name: s3
        Seconds_Behind_Master: NULL
              Connection_name: s4
        Seconds_Behind_Master: NULL
              Connection_name: s5
        Seconds_Behind_Master: 86717
              Connection_name: s6
        Seconds_Behind_Master: 86441
              Connection_name: s7
        Seconds_Behind_Master: NULL

And they are "catching up" for the 24h delay which doesn't look far :-)

I have enabled delayed replicationon dbstore2001- it is not dangerous, as worse case scenario it just stops the replication thread more than it should. In particular, it was badly/not completely documented. Aside from the things on:

events_dbstore_delayed.sql

A view has to be created with:

CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `ss` AS select `slave_status`.`Connection_name` AS `con`,substring_index(`slave_status`.`Master_Host`,'.',1) AS `master`,`slave_status`.`Master_Port` AS `port`,`slave_status`.`Slave_IO_Running` AS `io`,`slave_status`.`Slave_SQL_Running` AS `sql`,`slave_status`.`Seconds_Behind_Master` AS `lag` from `slave_status`;

and the connect engine has to be enabled- which is a no-go for non dedicated stuff. Aside from that, it is now getting delayed, so it seems to work. We will see if it maintains the delay during the weekend. The events have its own log at: ops.event_log.

So it is going fine!

root@dbstore2001:~# mysql --skip-ssl -e "show all slaves status\G" | grep Seconds
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: 86491
        Seconds_Behind_Master: 86633
        Seconds_Behind_Master: NULL
        Seconds_Behind_Master: NULL
root@dbstore2001:~# bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
86448/3600
24
86653/3600
24

It happened by chance to just start some replication threads again:

root@dbstore2001:~# mysql --skip-ssl -e "show all slaves status\G" | egrep "Connection_name|Seconds"
              Connection_name: s1
        Seconds_Behind_Master: 86635
              Connection_name: s2
        Seconds_Behind_Master: 86673
              Connection_name: s3
        Seconds_Behind_Master: NULL
              Connection_name: s4
        Seconds_Behind_Master: NULL
              Connection_name: s5
        Seconds_Behind_Master: 86717
              Connection_name: s6
        Seconds_Behind_Master: 86441
              Connection_name: s7
        Seconds_Behind_Master: NULL

And they are "catching up" for the 24h delay which doesn't look far :-)

And got back to NULL, so looking good!

root@dbstore2001:~# mysql --skip-ssl -e "show all slaves status\G" | egrep "Connection_name|Seconds"
              Connection_name: s1
        Seconds_Behind_Master: NULL
              Connection_name: s2
        Seconds_Behind_Master: NULL
              Connection_name: s3
        Seconds_Behind_Master: NULL
              Connection_name: s4
        Seconds_Behind_Master: NULL
              Connection_name: s5
        Seconds_Behind_Master: NULL
              Connection_name: s6
        Seconds_Behind_Master: 86577
              Connection_name: s7
        Seconds_Behind_Master: 86631

@Marostegui You can look at the delay easier* by using the heartbeat table:

$ mysql -h dbstore2001.codfw.wmnet heartbeat -e "SELECT shard, greatest(0, TIMESTAMPDIFF(MICROSECOND, max(ts), UTC_TIMESTAMP(6)) - 500000)/1000000 AS lag FROM heartbeat.heartbeat WHERE datacenter='eqiad' GROUP BY shard HAVING shard like 's%'"
+-------+------------+
| shard | lag        |
+-------+------------+
| s1    | 86489.5628 |
| s2    | 86517.5631 |
| s3    | 86655.5631 |
| s4    | 86461.5634 |
| s5    | 86448.5633 |
| s6    | 86690.5632 |
| s7    | 86391.5633 |
+-------+------------+

The problem with the current setup is that you cannot shutdown it cleanly easily- specially without GTID you have to stop all slaves, disable events, stop slaves again, wait some minutes to check pending events weren't executed, stop slaves again, and then shutdown.

Change 328671 abandoned by Jcrespo:
dbstore: configuration changes to make InnoDB the main storage

https://gerrit.wikimedia.org/r/328671

Marostegui claimed this task.

We believe this is solved!