Upgrade codfw cluster to Elasticsearch 7.10.2
Closed, Resolved · Public

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2022-08-30T20:43:06Z] <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-30T20:43:19Z] <ryankemper@cumin2002> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Change 828092 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/puppet@production] elastic: upgrade codfw elasticsearch to 7.10.2

https://gerrit.wikimedia.org/r/828092

Change 828092 merged by Ryan Kemper:

[operations/puppet@production] elastic: upgrade codfw elasticsearch to 7.10.2

https://gerrit.wikimedia.org/r/828092

Mentioned in SAL (#wikimedia-operations) [2022-08-30T23:50:21Z] <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-30T23:50:28Z] <ryankemper@cumin2002> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-30T23:55:06Z] <ryankemper> T316719 Merged https://phabricator.wikimedia.org/T316719; running puppet across codfw fleet: ryankemper@cumin2002:~$ sudo -E cumin -b 6 'A:elastic-codfw' 'run-puppet-agent'

Mentioned in SAL (#wikimedia-operations) [2022-08-31T00:08:44Z] <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T00:14:09Z] <ryankemper@cumin2002> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T00:14:50Z] <ryankemper> T316719 First elastic host upgraded properly. Cancelling cookbook to kick off a new rolling upgrade that will go 3 nodes at a time (first run was just one host as a sanity check)

Mentioned in SAL (#wikimedia-operations) [2022-08-31T00:15:20Z] <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T02:49:27Z] <ryankemper@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T02:50:04Z] <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T03:17:23Z] <ryankemper@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T03:23:50Z] <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T03:23:54Z] <ryankemper@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T18:56:04Z] <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T19:21:48Z] <ryankemper@cumin2002> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

Mentioned in SAL (#wikimedia-operations) [2022-08-31T19:30:43Z] <ryankemper> T316719 Rolling upgrade operation complete; all of elastic codfw is now on 7.10.2. Next week our related cirrus changes will go out with the mediawiki deploy train in 1.39.0-wmf.28
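
The final entry states that every codfw node is now on 7.10.2. One way to spot-check a claim like that is to ask the cluster for each node's version through the standard `_cat/nodes` API. The following is a minimal sketch, not the procedure used in this task: the function names are illustrative, and the base URL in the usage comment is a hypothetical placeholder for whatever endpoint the cluster actually exposes.

```python
import json
from urllib.request import urlopen

TARGET_VERSION = "7.10.2"  # version named in the log entry above


def nodes_not_on_version(nodes, target=TARGET_VERSION):
    """Return the sorted names of nodes whose reported version differs from target.

    `nodes` is a list of dicts with "name" and "version" keys, which is the
    shape returned by GET /_cat/nodes?format=json&h=name,version.
    """
    return sorted(n["name"] for n in nodes if n.get("version") != target)


def fetch_nodes(base_url):
    """Fetch name/version pairs for every node via the _cat/nodes API.

    `base_url` is an assumption supplied by the caller, for example
    "http://localhost:9200"; it is not taken from this task.
    """
    url = f"{base_url}/_cat/nodes?format=json&h=name,version"
    with urlopen(url) as resp:
        return json.load(resp)


# Usage (hypothetical endpoint, requires network access to the cluster):
#   stragglers = nodes_not_on_version(fetch_nodes("http://localhost:9200"))
#   print(stragglers or "all nodes report the target version")
```

An empty result means every node reports the target version; any names returned are nodes that still need attention.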