
load-balance horizon between labweb1001 and 1002
Closed, Resolved
Public

Description

In theory it should be simple to have horizon be active/active between the two labweb hosts. They should only need to share memcached access for tokens.
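As a rough illustration of what "sharing memcached access" could look like on the Horizon side, here is a minimal sketch of a Django settings fragment of the kind Horizon reads from `local_settings.py`. The host names, the port, and the `build_caches` helper are assumptions made for this example and are not taken from the actual puppet change. The backend string is the one Django shipped at the time; newer Django releases use a different memcached backend.

```python
# Sketch of a Horizon local_settings.py fragment.
# Host names and port below are hypothetical placeholders.

MEMCACHED_PORT = 11211  # memcached's default port (assumed here)


def build_caches(hosts, port=MEMCACHED_PORT):
    """Return a Django CACHES dict that lists every memcached host.

    Hosts are sorted so that each web server ends up with the same
    ordered server list, and therefore maps a given key to the same
    memcached instance.
    """
    return {
        "default": {
            "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
            "LOCATION": ["%s:%d" % (host, port) for host in sorted(hosts)],
        }
    }


# Hypothetical host names standing in for the two labweb servers.
LABWEB_HOSTS = ["labweb1001.example", "labweb1002.example"]

CACHES = build_caches(LABWEB_HOSTS)

# Keep sessions (and with them the login token) in the shared cache.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
```

With identical settings on both hosts, a session written through one labweb should be readable through the other, which is the property active/active needs.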

Event Timeline

Andrew triaged this task as Medium priority. Feb 15 2018, 8:47 PM
Andrew created this task.

Change 411546 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb horizon: share memcached among labwebs

https://gerrit.wikimedia.org/r/411546

I misunderstood this a bit -- I assumed that the misc-web varnishes would load-balance for us, but it turns out we would need to add an additional layer of load-balancing (and a service IP) in front of the labwebs to make them active/active. I'm not totally sure this is worth the additional complexity (especially when debugging future issues), given that one active host can easily support current traffic loads.

I think setting up an LVS is pretty easy these days (at least for the folk who know how to do it). I poked around and found this example config change -- https://gerrit.wikimedia.org/r/#/c/378956/.

Change 413169 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] add lvs ip for labweb services

https://gerrit.wikimedia.org/r/413169

Change 413171 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: add lvs service in front of labweb1001/1002

https://gerrit.wikimedia.org/r/413171

Change 413169 merged by Andrew Bogott:
[operations/dns@master] add lvs ip for labweb services

https://gerrit.wikimedia.org/r/413169

Change 413171 merged by Andrew Bogott:
[operations/puppet@production] labweb: add lvs service in front of labweb1001/1002

https://gerrit.wikimedia.org/r/413171

Change 411546 merged by Andrew Bogott:
[operations/puppet@production] labweb horizon: share memcached among labwebs

https://gerrit.wikimedia.org/r/411546

Change 413178 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] horizon: add a missing arg

https://gerrit.wikimedia.org/r/413178

Change 413178 merged by Andrew Bogott:
[operations/puppet@production] horizon: add a missing arg

https://gerrit.wikimedia.org/r/413178

Change 413186 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] horizon memcache: fix an issue with erb var resolution

https://gerrit.wikimedia.org/r/413186

Change 413186 merged by Andrew Bogott:
[operations/puppet@production] horizon memcache: fix an issue with erb var resolution

https://gerrit.wikimedia.org/r/413186

Change 413188 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: inclued role::lvs::realserver on labweb hosts

https://gerrit.wikimedia.org/r/413188

Change 413188 merged by Andrew Bogott:
[operations/puppet@production] labweb: inclued role::lvs::realserver on labweb hosts

https://gerrit.wikimedia.org/r/413188

Change 413194 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] horizon/labweb: open firewall to internal IPs for port 80

https://gerrit.wikimedia.org/r/413194

Change 413194 merged by Andrew Bogott:
[operations/puppet@production] horizon/labweb: open firewall to internal IPs for port 80

https://gerrit.wikimedia.org/r/413194

> I think setting up an LVS is pretty easy these days (at least for the folk who know how to do it). I poked around and found this example config change -- https://gerrit.wikimedia.org/r/#/c/378956/.

UPDATE: nope!

Change 413239 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: remove specific memcached port

https://gerrit.wikimedia.org/r/413239

Change 413239 merged by Andrew Bogott:
[operations/puppet@production] labweb: remove specific memcached port

https://gerrit.wikimedia.org/r/413239

Horizon is now load-balanced on both hosts. I still need to figure out cache sharing for striker and wikitech.

Change 413275 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: install nutcracker

https://gerrit.wikimedia.org/r/413275

Change 413275 merged by Andrew Bogott:
[operations/puppet@production] labweb: install nutcracker

https://gerrit.wikimedia.org/r/413275

Change 413394 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb nutcracker: re-use profile::mediawiki::nutcracker

https://gerrit.wikimedia.org/r/413394

Change 413396 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb nutcracker: further attempt to pass in memcached_pools correctly

https://gerrit.wikimedia.org/r/413396

Change 413394 merged by Andrew Bogott:
[operations/puppet@production] labweb nutcracker: re-use profile::mediawiki::nutcracker

https://gerrit.wikimedia.org/r/413394

Change 413413 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb nutcracker: add non-functional redis host list

https://gerrit.wikimedia.org/r/413413

Change 413413 merged by Andrew Bogott:
[operations/puppet@production] labweb nutcracker: add non-functional redis host list

https://gerrit.wikimedia.org/r/413413

Change 413490 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: create /var/run/nutcracker

https://gerrit.wikimedia.org/r/413490

Change 413490 merged by Andrew Bogott:
[operations/puppet@production] labweb: create /var/run/nutcracker

https://gerrit.wikimedia.org/r/413490

demon removed a subscriber: demon. Feb 22 2018, 10:13 PM

Change 413396 abandoned by Andrew Bogott:
labweb nutcracker: further attempt to pass in memcached_pools correctly

Reason:
I did this a different way

https://gerrit.wikimedia.org/r/413396

nutcracker is now running on both labwebs, sharing a memcached pool. In theory it's also configured for redis but redis isn't actually running on either host. We'll see if we turn out to need it.
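To illustrate why a shared pool gives both hosts the same view of the cache: any client or proxy that hashes a key against the same ordered server list picks the same server. The sketch below uses plain modulo hashing purely for illustration. It is not claimed to be the distribution nutcracker is configured with here, and the pool entries and the `pick_server` helper are hypothetical.

```python
import hashlib

# Hypothetical pool; the real host names and ports are not given in this task.
POOL = ["labweb1001.example:11211", "labweb1002.example:11211"]


def pick_server(key, pool=POOL):
    """Map a cache key to one server in the pool.

    Hash the key, then take the result modulo the pool size. Two hosts
    running this with the same pool, in the same order, always agree.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(pool)
    return pool[index]
```

If the two hosts were given differently ordered lists, the same key could land on different servers, and a token stored via one labweb might not be found via the other.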

Andrew closed this task as Resolved. Feb 23 2018, 6:47 PM
bd808 moved this task from Doing to Done on the cloud-services-team (Kanban) board.