Elasticsearch 7.10.2 rollout plan
Closed, Resolved (Public)

Description

Prerequisites:

Schedule:

  • Starting week: Aug 29, 2022
  • Train version: Expecting 1.39.0-wmf.28 with branch cut on Sept 5.

Once all prerequisites are verified, define the starting week for the rollout:

Plan:

  • Week 1
    • Monday: Upgrade cloudelastic to Elasticsearch 7.10.2 and verify that updates are flowing properly
    • Tuesday: Merge the apifeatureusage prep. Start the rollout on codfw and monitor updates
    • After the branch cut:
      • Merge the es710 branch into master in the vendor, CirrusSearch, and Elastica repositories
    • Note the MediaWiki version they'll land in and update the mw-config patch (https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/824787) with it
    • Add these patches as risky in a comment on the corresponding deployment blockers Phab task
    • Soon after merging, upgrade deployment-prep to Elasticsearch 7.10.2 and verify search functions/error logs
    • Once codfw is fully upgraded, deploy the mw-config patch to switch search traffic based on the MW train version
  • Week 2
    • Wait for the train to roll out everywhere and verify search functions/error logs. Search traffic will start flowing to codfw as the train rolls forward. Be available to releng when syncing wiki groups
  • Week 3 (after the expected train version is running everywhere and unlikely to be rolled back)
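
As a rough illustration of what "switch search traffic based on the MW train version" could mean, here is a minimal Python sketch. It is not the actual mw-config patch (which is PHP configuration). The function names, the `eqiad` fallback for older versions, and the use of 1.39.0-wmf.28 as the threshold are assumptions taken from the schedule above.

```python
import re

# Assumption: the es710 code first ships in this train version (per the schedule above).
ES7_MIN_VERSION = "1.39.0-wmf.28"


def parse_mw_version(version):
    """Turn a string like '1.39.0-wmf.28' into a comparable tuple (1, 39, 0, 28)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-wmf\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised MediaWiki train version: {version!r}")
    return tuple(int(part) for part in match.groups())


def pick_search_cluster(mw_version):
    """Hypothetical routing rule used only for illustration.

    Wikis already on the ES7-capable train version query the upgraded
    codfw cluster; wikis still on an older version keep using eqiad.
    """
    if parse_mw_version(mw_version) >= parse_mw_version(ES7_MIN_VERSION):
        return "codfw"
    return "eqiad"


if __name__ == "__main__":
    for version in ("1.39.0-wmf.27", "1.39.0-wmf.28", "1.40.0-wmf.1"):
        print(version, "->", pick_search_cluster(version))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as "wmf.9" sorting after "wmf.28".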

Things to watch out for:

  • CirrusSearch maintenance scripts: completion suggester, dump index, saneitizer

Event Timeline

Change 825874 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/cookbooks@master] elastic: clear old es_6 resources during upgrade

https://gerrit.wikimedia.org/r/825874

Change 825883 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/puppet@production] elastic: es7 removed bulk threadpool

https://gerrit.wikimedia.org/r/825883

Change 825874 merged by Bking:

[operations/cookbooks@master] elastic: clear old es_6 resources during upgrade

https://gerrit.wikimedia.org/r/825874

Change 825883 merged by Bking:

[operations/puppet@production] elastic: es7 removed bulk threadpool

https://gerrit.wikimedia.org/r/825883

Change 826383 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] opensearch: replace outdated config

https://gerrit.wikimedia.org/r/826383

Change 826386 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/puppet@production] elastic: use our jvm not elasticsearch's jvm

https://gerrit.wikimedia.org/r/826386

Change 826386 merged by Bking:

[operations/puppet@production] elastic: use our jvm not elasticsearch's jvm

https://gerrit.wikimedia.org/r/826386

Change 826396 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/puppet@production] elastic: don't start es7 unit until we tell it

https://gerrit.wikimedia.org/r/826396

Change 826396 merged by Bking:

[operations/puppet@production] elastic: don't start es7 unit until we tell it

https://gerrit.wikimedia.org/r/826396

Change 826640 had a related patch set uploaded (by Bking; author: Bking):

[operations/cookbooks@master] elastic: fix string concatenation

https://gerrit.wikimedia.org/r/826640

Change 826647 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/cookbooks@master] elastic: no need to run puppet during es 7 upgrade

https://gerrit.wikimedia.org/r/826647

Change 826647 merged by jenkins-bot:

[operations/cookbooks@master] elastic: no need to run puppet during es 7 upgrade

https://gerrit.wikimedia.org/r/826647

Change 826640 merged by jenkins-bot:

[operations/cookbooks@master] elastic: fix string concatenation

https://gerrit.wikimedia.org/r/826640

Change 826651 had a related patch set uploaded (by Bking; author: Bking):

[operations/cookbooks@master] elastic: use correct systemd command

https://gerrit.wikimedia.org/r/826651

Change 826651 merged by jenkins-bot:

[operations/cookbooks@master] elastic: use correct systemd command

https://gerrit.wikimedia.org/r/826651

Change 827204 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/Elastica@master] Revert "Revert "Switch to Elastica 7.1.5""

https://gerrit.wikimedia.org/r/827204

Change 827544 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/vendor@master] Upgrade to Elastica 7.1.5

https://gerrit.wikimedia.org/r/827544

Change 827545 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] Merge remote-tracking branch 'gerrit/es710'

https://gerrit.wikimedia.org/r/827545

Change 828086 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@es710] Merge remote-tracking branch 'gerrit/es710' into es710

https://gerrit.wikimedia.org/r/828086

Change 828086 abandoned by Ebernhardson:

[mediawiki/extensions/CirrusSearch@es710] Merge remote-tracking branch 'gerrit/es710' into es710

Reason:

should be uploaded to master branch, not es710

https://gerrit.wikimedia.org/r/828086

Change 827545 merged by Ebernhardson:

[mediawiki/extensions/CirrusSearch@master] Merge remote-tracking branch 'gerrit/es710' into es710

https://gerrit.wikimedia.org/r/827545

Change 827544 merged by Ebernhardson:

[mediawiki/vendor@master] Upgrade to Elastica 7.1.5

https://gerrit.wikimedia.org/r/827544

Change 827204 merged by Ebernhardson:

[mediawiki/extensions/Elastica@master] Switch to Elastica 7.1.5 [re-apply]

https://gerrit.wikimedia.org/r/827204

Change 828403 had a related patch set uploaded (by Ryan Kemper; author: DCausse):

[operations/puppet@production] Relax elasticsearch master node detection

https://gerrit.wikimedia.org/r/828403

Change 828403 merged by Ryan Kemper:

[operations/puppet@production] Relax elasticsearch master node detection

https://gerrit.wikimedia.org/r/828403

Rollout is currently progressing. Found one regression from logs:

Search backend error during comp_suggest search for 'mediaw' after 131: x_content_parse_exception: [1:516] [terms_lookup] unknown field [1]

This looks to be an issue with a PHP array being encoded as a JSON object. A patch (https://gerrit.wikimedia.org/r/c/830214/) is scheduled for today's backport window. Otherwise it all looks reasonable so far.
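
For readers unfamiliar with this class of bug: PHP's `json_encode` emits a JSON array only when the array keys are exactly 0..n-1, and otherwise emits a JSON object keyed by the indices. The Python sketch below mimics that rule to show how a list with a missing index 0 could turn into an object with a field named `1`. The helper names and sample data are hypothetical, and this is not the content of the actual patch.

```python
import json


def php_style_json_encode(array):
    """Mimic PHP json_encode for an array modelled as a dict of int keys.

    Sequential keys 0..n-1 produce a JSON array; anything else
    produces a JSON object keyed by the stringified indices.
    """
    keys = list(array.keys())
    if keys == list(range(len(keys))):
        return json.dumps(list(array.values()))
    return json.dumps({str(key): value for key, value in array.items()})


def reindex(array):
    """Renumber keys from 0, similar in spirit to PHP's array_values()."""
    return dict(enumerate(array.values()))


if __name__ == "__main__":
    # Hypothetical term list after a filter removed the entry at index 0.
    terms = {1: "mediaw"}

    print(php_style_json_encode(terms))           # {"1": "mediaw"}
    print(php_style_json_encode(reindex(terms)))  # ["mediaw"]
```

A strict parser that expects an array of terms would see the first form as an object containing an unknown field `1`, which is consistent with the error message above.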

eqiad is now running Elasticsearch 7 as well. The completion indices will rebuild overnight; we should be able to verify them tomorrow and switch traffic back to its local datacenter instead of routing it all to codfw.

Change 832323 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] Revert "cirrus: Handle transition to elasticsearch 7.10"

https://gerrit.wikimedia.org/r/832323

Change 832323 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "cirrus: Handle transition to elasticsearch 7.10"

https://gerrit.wikimedia.org/r/832323

Mentioned in SAL (#wikimedia-operations) [2022-09-15T20:07:40Z] <thcipriani@deploy1002> Started scap: Backport for [[gerrit:832323|Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]]

Mentioned in SAL (#wikimedia-operations) [2022-09-15T20:08:00Z] <thcipriani@deploy1002> thcipriani and dcausse: Backport for [[gerrit:832323|Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-09-15T20:15:19Z] <thcipriani@deploy1002> Finished scap: Backport for [[gerrit:832323|Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]] (duration: 07m 39s)

Change 826383 merged by Ryan Kemper:

[operations/puppet@production] opensearch: replace outdated config

https://gerrit.wikimedia.org/r/826383

Gehel subscribed.

Update has been sent to discovery@ and wikitech-l@. This can be closed.

Change 845054 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/puppet@production] elastic: don't block on /root/allow_es7 existing

https://gerrit.wikimedia.org/r/845054

Change 845055 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):

[operations/cookbooks@master] elastic: no more /root/allow_es7

https://gerrit.wikimedia.org/r/845055

Change 845054 merged by Ryan Kemper:

[operations/puppet@production] elastic: don't block on /root/allow_es7 existing

https://gerrit.wikimedia.org/r/845054

Change 845055 merged by Ryan Kemper:

[operations/cookbooks@master] elastic: no more /root/allow_es7

https://gerrit.wikimedia.org/r/845055