No access to mysql from stat1007
Closed, Resolved (Public)

Description

From stat1007:

mysql --defaults-file=/etc/mysql/conf.d/analytics-research-client.cnf -h analytics-slave.eqiad.wmnet -A -e "use log; show tables from log like '%ServerSideAccountCreation%';"
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 2 "No such file or directory"

The same error occurs with the analytics-mysql utility (see: https://wikitech.wikimedia.org/wiki/Analytics/Systems/MariaDB):

analytics-mysql dewiki -e 'show tables';
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 2 "No such file or directory"

Please advise.

Event Timeline

Interesting, thanks for the report. I am able to connect fine on dbstore1003 for dewiki, but I can see Aborted_connects in MariaDB's show status increasing when attempting to connect from stat1007. The connection also seems to time out at around the 5s mark.
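To make the Aborted_connects check repeatable, here is a minimal Python sketch that compares two snapshots of the counter taken before and after a connection attempt. It assumes the snapshots are the tab-separated output of something like `mysql -B -e "SHOW GLOBAL STATUS LIKE 'Aborted_connects'"` run on the database host. The function names and the sample values are hypothetical and are not taken from dbstore1003.

```python
def parse_status(output: str) -> dict[str, int]:
    """Parse tab-separated `Variable_name<TAB>Value` rows into a dict.

    Rows whose value is not a plain integer (for example the header row
    printed by the mysql client in batch mode) are skipped.
    """
    status: dict[str, int] = {}
    for line in output.strip().splitlines():
        name, _, value = line.partition("\t")
        value = value.strip()
        if value.isdigit():
            status[name.strip()] = int(value)
    return status


def aborted_connects_delta(before: str, after: str) -> int:
    """Return how much Aborted_connects grew between two snapshots."""
    key = "Aborted_connects"
    return parse_status(after).get(key, 0) - parse_status(before).get(key, 0)


if __name__ == "__main__":
    # Hypothetical snapshots, for illustration only.
    before = "Variable_name\tValue\nAborted_connects\t120\n"
    after = "Variable_name\tValue\nAborted_connects\t123\n"
    print(aborted_connects_delta(before, after))
```

A positive delta right after a failed attempt from stat1007 would be consistent with the server counting that attempt as aborted, although other clients can increment the same counter in the meantime.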

I also tried temporarily raising the connect_timeout global variable on dbstore1003's s5 instance (even though the 'reading authorization packet', system error: 2 message points to something different), but it did not help much.

As a quick workaround, please use stat1004. stat1007 seems to be under heavy load, though I am not sure why that leads to this mysql error.

@elukey Thank you for the support. Let's leave the ticket open until it is figured out what is happening on stat1007, what do you think?

Yes, we are still troubleshooting the issue with stat1007.

fdans triaged this task as High priority.
fdans moved this task from Incoming to Operational Excellence on the Analytics board.
fdans added a project: Analytics-Kanban.

We are almost sure that this issue is caused by network/system overload from a large rsync that is currently running on stat1007. It should complete in one or two days, and we will know then whether we were right. Since it is a very important rsync, please use stat1004/5 as an interim solution. We have opened T234229 to resolve the root cause and (hopefully) avoid it in the future.

@elukey Thank you. I was able to collect the data needed for T234036 from stat1004, so I will close this ticket as resolved and leave the battle in T234229 to you. Good luck!