Convert Collab services with multiple backends to active-active where possible
Open, Low, Public

Description

When chart-museum was introduced, Alex did some work that made it possible to add services to service.yaml and have them use discovery records without an LVS setup.

There are a couple of misc. services that already have multiple backends (1 eqiad, 1 codfw) and a discovery.wmnet CNAME. They could be active-active but are effectively using just one backend, with the other one commented out in DNS.

In some cases this was different when we used varnish: those services were already active/active, but the new discovery setup was introduced when we switched to ATS and then added envoy for TLS termination behind it in T210411.

  • go through the list of services after the "; misc web services with multiple backends but without geoip" comment in templates/wmnet in DNS and check which can be active-active
  • convert them following the same pattern already used for chart-museum and then releases.wikimedia.org
  • releases.wikimedia.org
  • phabricator - active-active not supported
  • planet.wikimedia.org
  • webserver_misc_apps (various sites sharing VMs) - migrated to K8s
  • doc.wikimedia.org
  • peopleweb
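The first step above (going through the entries after the marker comment and spotting services with a commented-out second backend) could be sketched roughly as below. This is only an illustration: the record layout, the hostnames in `SAMPLE`, and the helper name `candidates` are assumptions and are not taken from the real templates/wmnet file. The section is also assumed to run to the end of the file.

```python
import re

# Marker comment quoted in the task description.
MARKER = "; misc web services with multiple backends but without geoip"

# Hypothetical zone snippet; hostnames and record layout are assumptions.
SAMPLE = """\
; some other section
foo 300 IN CNAME foo1001.eqiad.wmnet.
; misc web services with multiple backends but without geoip
planet 300 IN CNAME planet1002.eqiad.wmnet.
;planet 300 IN CNAME planet2002.codfw.wmnet.
people 300 IN CNAME people1003.eqiad.wmnet.
;people 300 IN CNAME people2002.codfw.wmnet.
solo 300 IN CNAME solo1001.eqiad.wmnet.
"""

# A CNAME line, optionally disabled with a leading ';'.
RECORD = re.compile(
    r"^(?P<off>;\s*)?(?P<name>[\w.-]+)\s+.*\bCNAME\s+(?P<target>\S+)\s*$"
)


def candidates(zone_text, marker=MARKER):
    """Return names that have one active and one commented-out backend."""
    in_section = False
    backends = {}
    for line in zone_text.splitlines():
        stripped = line.strip()
        if not in_section:
            # Skip everything until the marker comment is seen.
            in_section = stripped == marker
            continue
        match = RECORD.match(stripped)
        if not match:
            continue
        entry = backends.setdefault(
            match["name"], {"active": [], "commented": []}
        )
        key = "commented" if match["off"] else "active"
        entry[key].append(match["target"])
    return sorted(
        name
        for name, entry in backends.items()
        if entry["active"] and entry["commented"]
    )


if __name__ == "__main__":
    for name in candidates(SAMPLE):
        print(name)
```

A name showing up in this list only means that a second backend exists in DNS. Whether the service itself supports active-active still has to be checked per service, as the phabricator entry in the list shows.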

Create separate tasks for

  • mwmaint (noc.wikimedia.org) owned by ServiceOps

Event Timeline

Change 628963 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] service/planet: turn planet into an active-active service using discovery

https://gerrit.wikimedia.org/r/628963

Change 628964 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] add discovery records for planet

https://gerrit.wikimedia.org/r/628964

Change 628965 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] switch debmonitor to discovery records

https://gerrit.wikimedia.org/r/628965

Change 628966 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] service/debmonitor: turn debmonitor into an active-active service

https://gerrit.wikimedia.org/r/628966

Dzahn triaged this task as Medium priority. Sep 21 2020, 11:26 PM
Dzahn updated the task description.
Dzahn moved this task from Incoming 🐫 to Doing 😎 on the serviceops-deprecated board.

Change 628966 abandoned by Dzahn:
[operations/puppet@production] service/debmonitor: turn debmonitor into an active-active service

Reason:

https://gerrit.wikimedia.org/r/628966

Change 628965 abandoned by Dzahn:
[operations/dns@master] switch debmonitor to discovery records

Reason:

https://gerrit.wikimedia.org/r/628965

Change 628963 abandoned by Dzahn:

[operations/puppet@production] service/planet: turn planet into an active-active service using discovery

Reason:

several reasons: planet software must be replaced, then this should be moved to kubernetes. not going to do this intermediate solution right now.

https://gerrit.wikimedia.org/r/628963

Change 628964 abandoned by Dzahn:

[operations/dns@master] add discovery records for planet

Reason:

several reasons: planet software must be replaced, then this should be moved to kubernetes. not going to do this intermediate solution right now.

https://gerrit.wikimedia.org/r/628964

LSobanski updated the task description.
LSobanski updated the task description.

Change 891894 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] service-catalog: add planet service

https://gerrit.wikimedia.org/r/891894

Change 891895 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] service-catalog: add people service

https://gerrit.wikimedia.org/r/891895

Change 891730 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/dns@master] add metafo records for planet

https://gerrit.wikimedia.org/r/891730

Change 891731 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/dns@master] add metafo records for people.wikimedia.org

https://gerrit.wikimedia.org/r/891731

This was once also part of T330091, but we did not actually do it yet, so this is basically a duplicate of that. The other ticket was just about switching DCs once and is resolved.

There are 4 pending patches here that were kind of WIP but may be ready:

a) add planet to service catalog: https://gerrit.wikimedia.org/r/c/operations/puppet/+/891894

b) add people to service catalog: https://gerrit.wikimedia.org/r/c/operations/puppet/+/891895

c) add metafo record to DNS for planet https://gerrit.wikimedia.org/r/c/operations/dns/+/891730/

d) add metafo record to DNS for people https://gerrit.wikimedia.org/r/c/operations/dns/+/891731/

cc: @Jelto @eoghan As agreed with Lukasz, I am going to abandon them but ALSO link them here. They are still valid and it's just a click to "restore" them. If you ever get to that, it would be cool if you restore / amend / merge them. They are free to take, but I also didn't want to leave open stuff in Gerrit.

Change 891731 abandoned by Dzahn:

[operations/dns@master] add metafo records for people.wikimedia.org

Reason:

https://phabricator.wikimedia.org/T263506#8996173

https://gerrit.wikimedia.org/r/891731

Change 891730 abandoned by Dzahn:

[operations/dns@master] add metafo records for planet

Reason:

https://phabricator.wikimedia.org/T263506#8996173

https://gerrit.wikimedia.org/r/891730

Change 891894 abandoned by Dzahn:

[operations/puppet@production] service-catalog: add planet service

Reason:

https://phabricator.wikimedia.org/T263506#8996173

https://gerrit.wikimedia.org/r/891894

Change 891895 abandoned by Dzahn:

[operations/puppet@production] service-catalog: add people service

Reason:

https://phabricator.wikimedia.org/T263506#8996173

https://gerrit.wikimedia.org/r/891895

LSobanski renamed this task from "convert misc services with multiple backends to active-active where possible" to "Convert Collab services with multiple backends to active-active where possible". Oct 9 2023, 4:09 PM
LSobanski lowered the priority of this task from Medium to Low. Mon, May 11, 3:34 PM
LSobanski updated the task description.