labsdb1004 MySQL crash
Closed, Resolved, Public

Description

labsdb1004 is crashing due to:

170303 20:04:26 [ERROR] InnoDB: Table s51412__data/book in the InnoDB data dictionary has tablespace id 48786, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_book in the InnoDB data dictionary has tablespace id 48790, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_external_beacon in the InnoDB data dictionary has tablespace id 48792, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_image in the InnoDB data dictionary has tablespace id 48791, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_normdaten in the InnoDB data dictionary has tablespace id 48793, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_pd_aemter in the InnoDB data dictionary has tablespace id 48794, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_spiegel in the InnoDB data dictionary has tablespace id 48795, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_templatedata in the InnoDB data dictionary has tablespace id 48803, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_templatedata_pages in the InnoDB data dictionary has tablespace id 48797, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/dewiki_wartung_bkl_in_pd in the InnoDB data dictionary has tablespace id 48798, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/log in the InnoDB data dictionary has tablespace id 48799, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
170303 20:04:26 [ERROR] InnoDB: Table s51412__data/misc_data in the InnoDB data dictionary has tablespace id 48800, but tablespace with that id or name does not exist. Have you deleted or moved .ibd files? This may also be a table created with CREATE TEMPORARY TABLE whose .ibd and .frm files MySQL automatically removed, but the table still exists in the InnoDB internal data dictionary.
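
For anyone triaging a similar log, here is a minimal sketch of how the affected tables and tablespace ids could be pulled out of the error lines above. It assumes the message format shown in this task; the function names are made up for illustration and are not part of any existing tooling.

```python
import re

# Matches the InnoDB "missing tablespace" error lines quoted in this task.
# Assumption: the wording is the same as in the log excerpt above.
MISSING_TABLESPACE = re.compile(
    r"\[ERROR\] InnoDB: Table (?P<schema>[^/\s]+)/(?P<table>\S+) "
    r"in the InnoDB data dictionary has tablespace id (?P<space_id>\d+)"
)


def missing_tablespaces(log_lines):
    """Return (schema, table, tablespace_id) for each matching error line."""
    found = []
    for line in log_lines:
        match = MISSING_TABLESPACE.search(line)
        if match:
            found.append(
                (match["schema"], match["table"], int(match["space_id"]))
            )
    return found


def count_by_schema(entries):
    """Count affected tables per schema, to see whether one database dominates."""
    counts = {}
    for schema, _table, _space_id in entries:
        counts[schema] = counts.get(schema, 0) + 1
    return counts
```

Passing an open error log file (or any iterable of lines) to `missing_tablespaces` and then to `count_by_schema` would show whether all errors belong to a single database, as they do here with s51412__data.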

I have seen this:

mysql:root@localhost [s51412__data]> show tables;
ERROR 1018 (HY000): Can't read dir of './s51412__data/' (errno: 13 "Permission denied")
root@labsdb1004:/srv/labsdb/data# ls -lh | grep s51412__data
drwx------ 2 root  root  4.0K Dec 16 10:40 s51412__data

I am not completely aware of why this database is owned by root, so I won't touch it on a Friday evening.
@jcrespo, @chasemp or @yuvipanda might have more context and know whether it is safe to give it back to the mysql user?
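
To see whether other database directories are in the same state, something like the following sketch could be used. It assumes each database is a subdirectory of the data directory and that the expected owner is a user named mysql; the function name is hypothetical.

```python
import os
import pwd


def unexpected_owners(datadir, expected_user="mysql"):
    """Return (name, owner) for each subdirectory not owned by expected_user."""
    mismatches = []
    with os.scandir(datadir) as entries:
        for entry in entries:
            # Only database directories are of interest, not plain files.
            if not entry.is_dir(follow_symlinks=False):
                continue
            uid = entry.stat(follow_symlinks=False).st_uid
            try:
                owner = pwd.getpwuid(uid).pw_name
            except KeyError:
                owner = str(uid)  # uid without a passwd entry
            if owner != expected_user:
                mismatches.append((entry.name, owner))
    return sorted(mismatches)
```

Run against /srv/labsdb/data, this would be expected to report s51412__data with owner root, matching the `ls -lh` output above.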

Restricted Application added a subscriber: Aklapper. Mar 3 2017, 8:11 PM

Just to be clear, the server is UP, but I have left replication stopped so it doesn't crash the whole server when it comes to that transaction :-)

I honestly have no idea, I don't think I've had any contact with this setup. I hope @jynus or @yuvipanda has some wisdom. This seems super weird.

Looks like that database has been involved in a few things already:
https://phabricator.wikimedia.org/T150759
https://phabricator.wikimedia.org/T131897

So maybe those privileges were there as a precaution against big imports / overloads.
What we can do is chown that directory to mysql, let replication catch up, and then restore the original ownership.
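
The "change owner, do the work, change it back" idea could be sketched as a small context manager. This is only an illustration under assumptions: it touches the directory itself and not the files inside it, it would need to run as root on a real host, and the name `temporary_owner` is made up.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_owner(path, uid, gid):
    """Chown path to uid:gid for the duration of the block, then restore it."""
    original = os.stat(path)  # remember the current owner and group
    os.chown(path, uid, gid)
    try:
        yield
    finally:
        # Restore the original ownership even if the block raised.
        os.chown(path, original.st_uid, original.st_gid)
```

The body of the `with` block would be whatever waits for replication to catch up; the `finally` clause makes sure the directory goes back to its previous owner afterwards.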

@Marostegui That log by itself should not make the server crash. s51412__data having wrong permissions is something we did on purpose during the failover, so I would expect some of those errors to have happened in the past.

I have restarted the slave, ignoring a couple of extra heavy hitters; that should avoid crashes for now. I have also enabled GTID to mitigate problems on crash.

BTW, labsdb1005 ran out of /tmp space during the weekend; we need to move it somewhere under /srv. Both cases probably have the same origin: too much load from one or several users.
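
A simple check for a temporary directory that is about to fill up could look like the sketch below. The 10% threshold and the function names are assumptions for illustration, not values taken from any existing monitoring.

```python
import shutil


def free_fraction(path):
    """Return the fraction of the filesystem holding path that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_low_on_space(path, min_free_fraction=0.10):
    """True if the filesystem holding path has less free space than the threshold."""
    return free_fraction(path) < min_free_fraction
```

Calling `is_low_on_space("/tmp")` before and after a heavy query would show whether temporary tables are eating the partition.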

jcrespo moved this task from Triage to Next on the DBA board. Mar 5 2017, 1:11 PM

@Marostegui That log by itself should not make the server crash. s51412__data having wrong permissions is something we did on purpose during the failover, so I would expect some of those errors to have happened in the past.

Yes, but I thought we restored the original permissions.

Thanks for the workaround!

jcrespo claimed this task. Mar 6 2017, 10:24 AM
jcrespo moved this task from Next to In progress on the DBA board.
jcrespo triaged this task as "Normal" priority. Mar 6 2017, 12:47 PM

Change 341503 had a related patch set uploaded (by jynus):
[operations/puppet] Move tmpdir to /srv/labsdb/tmp to avoid filling up / partition

https://gerrit.wikimedia.org/r/341503

Change 341503 merged by Jcrespo:
[operations/puppet] Move tmpdir to /srv/labsdb/tmp to avoid filling up / partition

https://gerrit.wikimedia.org/r/341503

I am going to drop s51412__data and the other filtered databases from labsdb1004 only, to avoid confusion about where the data is up to date.

Mentioned in SAL (#wikimedia-operations) [2017-03-08T10:34:56Z] <jynus> restarting labsdb1004's mariadb T159572

jcrespo closed this task as "Resolved". Mar 8 2017, 10:42 AM

This is fixed for labsdb1004, except for https://gerrit.wikimedia.org/r/341551. labsdb1005 still requires a restart to apply the new changes.