
Can't connect to Beta Cluster database deployment-db1 or deployment-db2 (MariaDB down)
Closed, Resolved, Public

Description

Loading the main page:
(Cannot access the database: Can't connect to MySQL server on '10.68.17.94' (4) (10.68.17.94))

In VE:


Cannot open any page in Betalabs, getting error "Error loading data from server: internal_api_error_DBConnectionError: [8c78efd3] Exception Caught: DB connection error: Can't connect to MySQL:

Event Timeline

Ryasmeen created this task. (Apr 22 2015, 8:05 PM)
Ryasmeen raised the priority of this task from to Needs Triage.
Ryasmeen updated the task description.
Ryasmeen added a subscriber: Ryasmeen.
Restricted Application added a subscriber: Aklapper. (Apr 22 2015, 8:05 PM)
Ryasmeen renamed this task from Cannot open any page in Betalabs, getting error "Error loading data from server: internal_api_error_DBConnectionError: [8c78efd3] Exception Caught: DB connection error: Can't connect to MySQL: to Cannot open any page with VE in Betalabs, getting error "Error loading data from server: internal_api_error_DBConnectionError: [8c78efd3] Exception Caught: DB connection error: Can't connect to MySQL:. (Apr 22 2015, 8:05 PM)
Ryasmeen added a project: VisualEditor.
Ryasmeen set Security to None.
Ryasmeen triaged this task as High priority. (Apr 22 2015, 8:32 PM)
Aklapper raised the priority of this task from High to Unbreak Now!. (Apr 23 2015, 9:31 AM)

Loading VE on http://en.wikipedia.beta.wmflabs.org/wiki worked for me once; twice I got either a 503 or the error (Cannot access the database: Can't connect to MySQL server on '10.68.16.193' (4) (10.68.16.193)).

Jdforrester-WMF renamed this task from Cannot open any page with VE in Betalabs, getting error "Error loading data from server: internal_api_error_DBConnectionError: [8c78efd3] Exception Caught: DB connection error: Can't connect to MySQL: to Cannot open pages in Beta Cluster, getting error "Error loading data from server: internal_api_error_DBConnectionError: [8c78efd3] Exception Caught: DB connection error: Can't connect to MySQL:. (Apr 23 2015, 2:43 PM)
Jdforrester-WMF removed a project: VisualEditor.
greg renamed this task from Cannot open pages in Beta Cluster, getting error "Error loading data from server: internal_api_error_DBConnectionError: [8c78efd3] Exception Caught: DB connection error: Can't connect to MySQL: to Can't connect to Beta Cluster database. (Apr 23 2015, 3:18 PM)
greg updated the task description.
hashar renamed this task from Can't connect to Beta Cluster database to Can't connect to Beta Cluster database deployment-db1. (Apr 24 2015, 9:34 AM)
hashar renamed this task from Can't connect to Beta Cluster database deployment-db1 to Can't connect to Beta Cluster database deployment-db1 or deployment-db2 (MariaDB down). (Apr 24 2015, 9:36 AM)
hashar added a subscriber: hashar. (Apr 24 2015, 9:51 AM)

The beta cluster has two database instances, 10.68.17.94 and 10.68.16.193, which are deployment-db2 and deployment-db1 respectively.
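For reading the error messages quoted in this task, the pairing above can be written down as a small lookup. This is only a sketch: the function name db_instance_for_ip is made up for illustration, and the two addresses are the ones stated in this comment, which may change if the instances are moved again.

```shell
#!/bin/sh
# Hypothetical helper: map a Beta Cluster database IP, as it appears in the
# "Can't connect to MySQL server on '...'" errors, to its instance name.
# The pairing is taken from this task and is not looked up anywhere.
db_instance_for_ip() {
    case "$1" in
        10.68.17.94)  echo "deployment-db2" ;;
        10.68.16.193) echo "deployment-db1" ;;
        *)            echo "unknown"; return 1 ;;
    esac
}

# The two addresses seen in the errors reported on this task.
db_instance_for_ip 10.68.17.94
db_instance_for_ip 10.68.16.193
```

So the main page error in the description points at deployment-db2, and the later 503 report points at deployment-db1.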

The instances were moved to new hosts, which caused intermittent failures (T97033). According to SAL, @thcipriani restarted MariaDB on both on April 22nd:

April 22nd
21:40 thcipriani: restarted mariadb on deployment-db{1,2}

The error logs are written to /mnt/sqldata/deployment-db1.err and /mnt/sqldata/deployment-db2.err respectively, but there is nothing suspicious in them.

db1 was rebooted on Apr 23, according to last:

reboot   system boot  3.2.0-59-virtual Thu Apr 23 23:53 - 09:47  (09:54)

Maybe the service does not start automatically on boot; it does not show up in /var/log/boot.log.

Didn't see mysql in /etc/rc[1-5].d/ anywhere.

Added mysql to both deployment-db{1,2} using:

sudo update-rc.d mysql defaults
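To confirm that the command above actually created the boot links, a check along these lines could be run on each instance. It is a minimal sketch, assuming the usual SysV layout where start links are named S<NN><service> under /etc/rc2.d to /etc/rc5.d; the function name has_start_links and its optional root argument are hypothetical and exist only so the check can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a SysV init service has a start link
# (S<NN><service>) in any multi-user runlevel directory.
# $1 = service name, $2 = optional root prefix (assumption, for dry runs).
has_start_links() {
    service="$1"
    root="${2:-}"
    for link in "$root"/etc/rc[2-5].d/S[0-9][0-9]"$service"; do
        # An unmatched glob is left as a literal string, so check that the
        # entry really exists (as a file or as a symlink).
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 0
        fi
    done
    return 1
}

if has_start_links mysql; then
    echo "mysql has start links and should come up at boot"
else
    echo "mysql has no start links and will not start at boot"
fi
```

This only inspects the runlevel directories; it does not say whether MariaDB is running right now.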

I think this is typically done in the upstream package; however, we're using our own apt repo for mariadb precise instances on beta.

This should fix the issue for subsequent reboots, but we should review the puppet role and/or check with ops.

To be verified: it is quite possible that in production we intentionally prevent mysql from starting automatically, either via the deb package or puppet.

On beta we might want to autostart it.

This issue is still occurring. Any update on this?

Not occurring for me.

Ryasmeen added a comment. (Edited, Apr 28 2015, 7:08 PM)

The issue has evolved a bit. I am no longer getting the "Can't connect to MySQL" errors, but VE is silently failing to load there. There is one error in the console though: TypeError: mw.log.error is not a function

Ryasmeen added a subscriber: greg. (Apr 28 2015, 7:21 PM)

That error looks like it could be related to https://gerrit.wikimedia.org/r/#/c/206033/

(So yes, I see the error as well, but VE still functions)

Krenair, try saving your edit and re-opening it multiple times. It didn't occur for me on the first try either.

hashar closed this task as Resolved. (May 5 2015, 7:36 PM)

The databases have been up and running since the last manual intervention on Apr 24th. So that is fixed :-)