Fetch 3 node(s) from search_codfw to perform rolling restart on
Scheduling downtime on Icinga server icinga1001.wikimedia.org for hosts: elastic[2030,2041,2043].codfw.wmnet
Disabling Puppet with reason "reboot for JVM + kernel upgrade - gehel@cumin2001" on 3 hosts: elastic[2030,2041,2043].codfw.wmnet
Freezing writes on [<spicerack.elasticsearch_cluster.ElasticsearchCluster object at 0x7f1836f31898>, <spicerack.elasticsearch_cluster.ElasticsearchCluster object at 0x7f1836f31710>, <spicerack.elasticsearch_cluster.ElasticsearchCluster object at 0x7f1836f31780>]
Freezing all indices in <Elasticsearch([{'host': 'search.svc.codfw.wmnet', 'port': 9243, 'use_ssl': True}])>
Freezing all indices in <Elasticsearch([{'host': 'search.svc.codfw.wmnet', 'port': 9443, 'use_ssl': True}])>
Freezing all indices in <Elasticsearch([{'host': 'search.svc.codfw.wmnet', 'port': 9643, 'use_ssl': True}])>
Wait for a minimum time of 60sec to make sure all CirrusSearch writes are terminated
Stopping elasticsearch replication in a safe way on search_codfw
stopping replication on [<spicerack.elasticsearch_cluster.ElasticsearchCluster object at 0x7f1836f31898>, <spicerack.elasticsearch_cluster.ElasticsearchCluster object at 0x7f1836f31710>, <spicerack.elasticsearch_cluster.ElasticsearchCluster object at 0x7f1836f31780>]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/cookbook.py", line 414, in _run
    ret = self.module.run(args, self.spicerack)
  File "/srv/deployment/spicerack/cookbooks/sre/elasticsearch/rolling-reboot.py", line 31, in run
    reboot
  File "/srv/deployment/spicerack/cookbooks/sre/elasticsearch/__init__.py", line 99, in execute_on_clusters
    nodes.pool_nodes()
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python3/dist-packages/spicerack/elasticsearch_cluster.py", line 212, in stopped_replication
    yield [stack.enter_context(cluster.stopped_replication()) for cluster in self._clusters]
  File "/usr/lib/python3.5/contextlib.py", line 360, in __exit__
    raise exc_details[1]
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python3/dist-packages/spicerack/elasticsearch_cluster.py", line 362, in stopped_replication
    yield
  File "/usr/lib/python3.5/contextlib.py", line 345, in __exit__
    if cb(*exc_details):
  File "/usr/lib/python3.5/contextlib.py", line 261, in _exit_wrapper
    return cm_exit(cm, *exc_details)
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python3/dist-packages/spicerack/elasticsearch_cluster.py", line 364, in stopped_replication
    self._start_replication()
  File "/usr/lib/python3/dist-packages/spicerack/elasticsearch_cluster.py", line 379, in _start_replication
    value='all', wait_for_completion=False)
  File "/usr/lib/python3/dist-packages/spicerack/elasticsearch_cluster.py", line 391, in _do_cluster_routing
    cluster_routing.do_action()
  File "/usr/lib/python3/dist-packages/curator/actions.py", line 394, in do_action
    report_failure(e)
  File "/usr/lib/python3/dist-packages/curator/utils.py", line 173, in report_failure
    'Exception: {0}'.format(exception)
curator.exceptions.FailedExecution: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectio