
Beta Cluster is down
Closed, Duplicate (Public)

Description

To reproduce, open https://en.wikipedia.beta.wmflabs.org/

Error

Our servers are currently under maintenance or experiencing a technical problem. Please try again in a few minutes.

See the error message at the bottom of this page for more information.

If you report this error to the Wikimedia System Administrators, please include the details below.

Request from - via deployment-cache-text06.deployment-prep.eqiad.wmflabs, ATS/8.0.8
Error: 502, Next Hop Connection Failed at 2020-11-09 12:56:10 GMT

beta.png (821×1 px, 150 KB)
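
When the same failure comes back later, it helps to compare the error footers mechanically instead of by eye. The following is a minimal sketch that parses the two-line footer quoted above into its parts. The names `EdgeError` and `parse_error_footer` are hypothetical, and the regular expression only assumes the footer layout shown in this report; it is not an official format description for ATS.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class EdgeError(NamedTuple):
    """Fields of the error footer (hypothetical helper type)."""
    cache_host: str
    server: str
    status: int
    reason: str
    when: datetime


# Assumed layout, taken only from the footer quoted in this task:
#   Request from <client> via <cache host>, <server/version>
#   Error: <status>, <reason> at <YYYY-MM-DD HH:MM:SS> GMT
_FOOTER = re.compile(
    r"Request from \S+ via (?P<host>\S+), (?P<server>\S+)\s+"
    r"Error: (?P<status>\d{3}), (?P<reason>.+?) at "
    r"(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) GMT"
)


def parse_error_footer(text: str) -> Optional[EdgeError]:
    """Return the parsed footer, or None if the text has no such footer."""
    m = _FOOTER.search(text)
    if m is None:
        return None
    # The footer states GMT, so attach UTC explicitly.
    when = datetime.strptime(m["when"], "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return EdgeError(m["host"], m["server"], int(m["status"]), m["reason"], when)


if __name__ == "__main__":
    footer = (
        "Request from - via deployment-cache-text06.deployment-prep.eqiad.wmflabs, ATS/8.0.8\n"
        "Error: 502, Next Hop Connection Failed at 2020-11-09 12:56:10 GMT"
    )
    print(parse_error_footer(footer))
```

Two footers with the same `cache_host`, `status`, and `reason` but different timestamps point to a recurrence behind the same cache host rather than a new kind of failure.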

Event Timeline

I'm not sure if this is UBN, but it's at least high. 😬

Nintendofan885 renamed this task from en.wikipedia.beta.wmflabs.org down to Beta Cluster down. Nov 9 2020, 2:43 PM
Nintendofan885 renamed this task from Beta Cluster down to Beta Cluster is down.
Nintendofan885 subscribed.

It's affecting the other beta.wmflabs.org subdomains as well.

This one might not be a Varnish problem. I see php-fpm complaining on deployment-mediawiki-07:

PHP Fatal error:  Uncaught RuntimeException: RedisConnectionPool requires a Redis client library. See https://www.mediawiki.org/wiki/Redis#Setup in /srv/mediawiki/php-master/includes/libs/redis/RedisConnectionPool.php:84
thcipriani claimed this task.

Restarting php7.2-fpm on deployment-mediawiki-07 fixed this one.

Looks like it was fixed for a while, but I'm getting the same error again. 😕

jenkins.png (466×752 px, 95 KB)

The job in the screenshot is https://integration.wikimedia.org/ci/view/Selenium/job/selenium-daily-beta-MediaWiki/

Yeah, also failing on page views:
https://meta.wikimedia.beta.wmflabs.org/wiki/Special:Gadgets

Request from - via deployment-cache-text06.deployment-prep.eqiad.wmflabs, ATS/8.0.8
Error: 502, Next Hop Connection Failed at 2020-11-11 17:10:16 GMT
hashar subscribed.

It is back. The root cause was that it was stuck on Varnish 5 while the configuration files were meant for Varnish 6. Upgrading to Varnish 6 fixed it.