T127964 is a report of users being unable to log in on the beta Commons wiki. Looking at logstash, I noticed a surge of messages of the following types starting at 11:51am:
Memcached error for key "{memcached-key}" on server "{memcached-server}": SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY
Memcached error for key "{memcached-key}" on server "{memcached-server}": CONNECTION FAILURE
The puppet run on deployment-mediawiki02 shows a diff to /etc/nutcracker/nutcracker.yml at that time:
Info: Applying configuration version '1456314632'
Notice: /Stage[main]/Nutcracker/File[/etc/nutcracker/nutcracker.yml]/content:
--- /etc/nutcracker/nutcracker.yml 2015-10-08 00:50:36.322423911 +0000
+++ /tmp/puppet-file20160224-24403-qa3hxp 2016-02-24 11:50:56.379159631 +0000
@@ -35,8 +35,6 @@
   server_failure_limit: 3
   server_retry_timeout: 30000
   servers:
-  - 10.68.16.177:6379:1
-  - 10.68.16.231:6379:1
   timeout: 1000
 redis_eqiad:
   auto_eject_hosts: true
@@ -49,6 +47,6 @@
   server_failure_limit: 3
   server_retry_timeout: 30000
   servers:
-  - 10.68.16.177:6379:1
-  - 10.68.16.231:6379:1
+  - 10.68.16.177:6379:1 "shard01"
+  - 10.68.16.231:6379:1 "shard02"
   timeout: 1000
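
The first hunk of the diff leaves a `servers:` key with no entries under it. Whether that is what triggers the errors above is an assumption, but a check along the following lines could flag such a pool before the file is deployed. This is only a sketch: it works on the already-parsed configuration as a plain dict, `pools_without_servers` is a hypothetical helper, and `first_pool` is a placeholder because the diff does not show that pool's name. Placing the second hunk under `redis_eqiad` is also an assumption.

```python
def pools_without_servers(config):
    """Return the names of pools whose 'servers' list is missing or empty."""
    empty = []
    for name, settings in config.items():
        # An empty "servers:" key in YAML parses to None, so treat None,
        # a missing key, and an empty list the same way.
        servers = (settings or {}).get("servers")
        if not servers:
            empty.append(name)
    return empty


# Parsed form of the file after the puppet run, as far as the diff shows it.
# "first_pool" is a placeholder name; the diff does not include the real one.
config = {
    "first_pool": {
        "server_failure_limit": 3,
        "server_retry_timeout": 30000,
        "servers": None,
        "timeout": 1000,
    },
    "redis_eqiad": {
        "auto_eject_hosts": True,
        "server_failure_limit": 3,
        "server_retry_timeout": 30000,
        "servers": [
            '10.68.16.177:6379:1 "shard01"',
            '10.68.16.231:6379:1 "shard02"',
        ],
        "timeout": 1000,
    },
}

print(pools_without_servers(config))
```

With the data above, only the placeholder pool is reported. In practice the dict would come from loading the real nutcracker.yml with a YAML parser.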