
Upgrade elasticsearch to 5.6.14
Closed, Resolved · Public

Description

To ease migration to ES6 we should migrate to 5.6.14 first.

Event Timeline

dcausse created this task. Feb 12 2019, 5:18 PM
Restricted Application added a subscriber: Aklapper. Feb 12 2019, 5:18 PM
dcausse triaged this task as High priority. Feb 12 2019, 5:20 PM
dcausse moved this task from needs triage to [epic] on the Discovery-Search board.

Change 491482 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] elasticsearch: relforge now uses elastic56 apt component

https://gerrit.wikimedia.org/r/491482

Change 491482 merged by Gehel:
[operations/puppet@production] elasticsearch: relforge now uses elastic56 apt component

https://gerrit.wikimedia.org/r/491482

Change 491485 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] elasticsearch/relforge: fix typo in hiera param for elasticsearch version

https://gerrit.wikimedia.org/r/491485

Change 491485 merged by Gehel:
[operations/puppet@production] elasticsearch/relforge: fix typo in hiera param for elasticsearch version

https://gerrit.wikimedia.org/r/491485

Mentioned in SAL (#wikimedia-operations) [2019-02-19T14:29:57Z] <gehel> rolling upgrade of elasticsearch on relforge - T215931

Change 491746 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] elasticsearch: upgrade elasticsearch / cirrus / codfw to 5.6.14

https://gerrit.wikimedia.org/r/491746

Change 491746 merged by Gehel:
[operations/puppet@production] elasticsearch: upgrade elasticsearch / cirrus / codfw to 5.6.14

https://gerrit.wikimedia.org/r/491746

Mentioned in SAL (#wikimedia-operations) [2019-02-20T13:59:26Z] <gehel> rolling upgrade of elasticsearch / cirrus / codfw to 5.6.14 - T215931

Mentioned in SAL (#wikimedia-operations) [2019-02-21T13:18:30Z] <gehel> restarting rolling upgrade on elasticsearch / cirrus / codfw - T215931

Change 492044 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[operations/mediawiki-config@master] [cirrus] Switch production search traffic to codfw (1/2)

https://gerrit.wikimedia.org/r/492044

Change 492045 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[operations/mediawiki-config@master] [cirrus] Switch production search traffic to codfw (1/2)

https://gerrit.wikimedia.org/r/492045

Change 492044 merged by jenkins-bot:
[operations/mediawiki-config@master] [cirrus] Switch production search traffic to codfw (1/2)

https://gerrit.wikimedia.org/r/492044

Mentioned in SAL (#wikimedia-operations) [2019-02-22T00:17:14Z] <ebernhardson@deploy1001> sync-file aborted: T215931 (duration: 00m 00s)

Mentioned in SAL (#wikimedia-operations) [2019-02-22T00:18:11Z] <ebernhardson@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T215931 [cirrus] Switch production search traffic to codfw (1/2) (duration: 00m 46s)

Mentioned in SAL (#wikimedia-operations) [2019-02-22T00:21:11Z] <ebernhardson@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T215931 [cirrus] Switch production search traffic to codfw (1/2) (duration: 00m 45s)

Change 492045 merged by jenkins-bot:
[operations/mediawiki-config@master] [cirrus] Switch production search traffic to codfw (2/2)

https://gerrit.wikimedia.org/r/492045

Mentioned in SAL (#wikimedia-operations) [2019-02-22T00:45:23Z] <ebernhardson@deploy1001> sync-file aborted: T215931 [cirrus] Switch production search traffic to codfw (2/2) (duration: 00m 05s)

Mentioned in SAL (#wikimedia-operations) [2019-02-22T00:46:23Z] <ebernhardson@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T215931 [cirrus] Switch production search traffic to codfw (2/2) (duration: 00m 46s)

EBernhardson added a subscriber: EBernhardson. Edited · Feb 22 2019, 12:53 AM

Prior to the switchover I ran a few queries against all indices to warm codfw up. I don't think one occurrence of this working is enough to call it a win, but we should try again when we switch eqiad back. Elasticsearch percentiles showed no noticeable latency spike when traffic moved from eqiad to codfw.

Grafana dashboard for time in question: https://grafana.wikimedia.org/d/000000455/elasticsearch-percentiles?panelId=22&fullscreen&orgId=1&from=1550795053678&to=1550796853679&var-cluster=eqiad&var-smoothing=1&var-exported_cluster=search

At this point omega and psi were already serving traffic from codfw. The small latency spikes prior to the switchover are those queries slowing down while I ran the warmup queries. The warmup query was run multiple times with combinations of the following words, which hopefully appear in many languages: a or the wiki wmf mediawiki wikipedia la to and

Query issued:

{
    "query": {
        "multi_match": {
            "query": "wikipedia",
            "operator": "or",
            "fields": ["all", "all.plain", "title", "title.plain", "category", "category.plain", "heading.plain", "heading", "auxiliary_text.plain", "auxiliary_text", "file_text", "file_text.plain", "redirect.title.plain", "redirect.title", "text", "text.plain", "opening_text.plain", "opening_text", "all_near_match", "template", "template.plain"]
        }
    },
    "size": 9000
}
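
For anyone repeating the warmup when eqiad is switched back, here is a rough Python sketch of how the query bodies could be generated from the word list above. The comment does not say which word combinations were used or how the requests were sent, so the helper names, the `max_terms` limit, and the `send_query` transport are all assumptions for illustration, not the procedure that was actually run.

```python
import itertools
import json
import urllib.request

# Words and fields copied from the comment above.
WARMUP_WORDS = ["a", "or", "the", "wiki", "wmf", "mediawiki",
                "wikipedia", "la", "to", "and"]
FIELDS = ["all", "all.plain", "title", "title.plain", "category",
          "category.plain", "heading.plain", "heading",
          "auxiliary_text.plain", "auxiliary_text", "file_text",
          "file_text.plain", "redirect.title.plain", "redirect.title",
          "text", "text.plain", "opening_text.plain", "opening_text",
          "all_near_match", "template", "template.plain"]


def build_warmup_query(terms, size=9000):
    """Build one multi_match body in the same shape as the query above."""
    return {
        "query": {
            "multi_match": {
                "query": " ".join(terms),
                "operator": "or",
                "fields": FIELDS,
            }
        },
        "size": size,
    }


def warmup_queries(words=WARMUP_WORDS, max_terms=2):
    """Yield a body for every combination of 1..max_terms words.

    max_terms is a hypothetical knob; the original comment only says
    "combinations" without giving a limit.
    """
    for n in range(1, max_terms + 1):
        for combo in itertools.combinations(words, n):
            yield build_warmup_query(combo)


def send_query(base_url, body, timeout=60):
    """POST a body to <base_url>/_search (no index name = all indices).

    Assumed transport; base_url is a placeholder supplied by the caller.
    """
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

# Example loop (http://localhost:9200 is a placeholder host):
#   for body in warmup_queries():
#       send_query("http://localhost:9200", body)
```

With the defaults this produces 55 query bodies (10 single words plus 45 pairs). Raising `max_terms` grows the count quickly, so it is worth keeping small while production traffic is still being served.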

Change 492266 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] elasticsearch: upgrade elasticsearch / cirrus to 5.6.14

https://gerrit.wikimedia.org/r/492266

Change 492266 merged by Gehel:
[operations/puppet@production] elasticsearch: upgrade elasticsearch / cirrus to 5.6.14

https://gerrit.wikimedia.org/r/492266

Mentioned in SAL (#wikimedia-operations) [2019-02-22T09:16:34Z] <gehel> starting rolling upgrade on elasticsearch / cirrus / eqiad - T215931

Mentioned in SAL (#wikimedia-operations) [2019-02-22T18:02:43Z] <gehel> rolling upgrade on elasticsearch / cirrus / eqiad completed - T215931

debt closed this task as Resolved. Mar 8 2019, 6:02 PM
debt claimed this task.
debt awarded a token.