load-balance horizon between labweb1001 and 1002
Closed, Resolved (Public)

Description

In theory it should be simple to have horizon be active/active between the two labweb hosts. They should only need to share memcached access for tokens.
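As a rough illustration of what "share memcached access" could look like on the Horizon side, here is a minimal sketch of a Django settings fragment of the kind Horizon reads from its local_settings.py. The host names and port are placeholders, not the real labweb configuration, and the idea that tokens live in the session is an assumption about how Horizon stores login state. The backend path shown is the python-memcached backend that Django shipped at the time; newer Django releases use `PyMemcacheCache` instead.

```python
# Hypothetical fragment of a Horizon local_settings.py (plain Django settings).
# Host names and port below are placeholders chosen for illustration only.
MEMCACHED_HOSTS = [
    "labweb1001.example:11211",
    "labweb1002.example:11211",
]

# Keep session data in the cache instead of per-host state, so either
# web host can serve a request for a session started on the other one.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"

CACHES = {
    "default": {
        # Backend name as of 2018-era Django; later releases replaced it
        # with django.core.cache.backends.memcached.PyMemcacheCache.
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        # Both web hosts must list the same servers in the same order.
        "LOCATION": MEMCACHED_HOSTS,
    }
}
```

The important property is that both hosts carry an identical server list; if the lists differ, a session written through one host may not be found through the other.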

Event Timeline

Andrew triaged this task as Medium priority. Feb 15 2018, 8:47 PM
Andrew created this task.

Change 411546 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb horizon: share memcached among labwebs

https://gerrit.wikimedia.org/r/411546

I misunderstood this a bit -- I assumed that the misc-web varnishes would load-balance for us, but it turns out we would need to add an additional layer of load-balancing (and a service ip) in front of the labwebs to have them be active/active. I'm not totally sure this is worth the additional complexity (especially the complexity of debugging future issues) given that we'll have no trouble supporting current traffic loads with one active host.

I think setting up an LVS is pretty easy these days (at least for the folk who know how to do it). I poked around and found this example config change -- https://gerrit.wikimedia.org/r/#/c/378956/.

Change 413169 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] add lvs ip for labweb services

https://gerrit.wikimedia.org/r/413169

Change 413171 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: add lvs service in front of labweb1001/1002

https://gerrit.wikimedia.org/r/413171

Change 413169 merged by Andrew Bogott:
[operations/dns@master] add lvs ip for labweb services

https://gerrit.wikimedia.org/r/413169

Change 413171 merged by Andrew Bogott:
[operations/puppet@production] labweb: add lvs service in front of labweb1001/1002

https://gerrit.wikimedia.org/r/413171

Change 411546 merged by Andrew Bogott:
[operations/puppet@production] labweb horizon: share memcached among labwebs

https://gerrit.wikimedia.org/r/411546

Change 413178 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] horizon: add a missing arg

https://gerrit.wikimedia.org/r/413178

Change 413178 merged by Andrew Bogott:
[operations/puppet@production] horizon: add a missing arg

https://gerrit.wikimedia.org/r/413178

Change 413186 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] horizon memcache: fix an issue with erb var resolution

https://gerrit.wikimedia.org/r/413186

Change 413186 merged by Andrew Bogott:
[operations/puppet@production] horizon memcache: fix an issue with erb var resolution

https://gerrit.wikimedia.org/r/413186

Change 413188 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: inclued role::lvs::realserver on labweb hosts

https://gerrit.wikimedia.org/r/413188

Change 413188 merged by Andrew Bogott:
[operations/puppet@production] labweb: inclued role::lvs::realserver on labweb hosts

https://gerrit.wikimedia.org/r/413188

Change 413194 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] horizon/labweb: open firewall to internal IPs for port 80

https://gerrit.wikimedia.org/r/413194

Change 413194 merged by Andrew Bogott:
[operations/puppet@production] horizon/labweb: open firewall to internal IPs for port 80

https://gerrit.wikimedia.org/r/413194

I think setting up an LVS is pretty easy these days (at least for the folk who know how to do it). I poked around and found this example config change -- https://gerrit.wikimedia.org/r/#/c/378956/.

UPDATE: nope!

Change 413239 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: remove specific memcached port

https://gerrit.wikimedia.org/r/413239

Change 413239 merged by Andrew Bogott:
[operations/puppet@production] labweb: remove specific memcached port

https://gerrit.wikimedia.org/r/413239

Horizon is now load-balanced on both hosts. I still need to figure out cache sharing for striker and wikitech.

Change 413275 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: install nutcracker

https://gerrit.wikimedia.org/r/413275

Change 413275 merged by Andrew Bogott:
[operations/puppet@production] labweb: install nutcracker

https://gerrit.wikimedia.org/r/413275

Change 413394 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb nutcracker: re-use profile::mediawiki::nutcracker

https://gerrit.wikimedia.org/r/413394

Change 413396 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb nutcracker: further attempt to pass in memcached_pools correctly

https://gerrit.wikimedia.org/r/413396

Change 413394 merged by Andrew Bogott:
[operations/puppet@production] labweb nutcracker: re-use profile::mediawiki::nutcracker

https://gerrit.wikimedia.org/r/413394

Change 413413 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb nutcracker: add non-functional redis host list

https://gerrit.wikimedia.org/r/413413

Change 413413 merged by Andrew Bogott:
[operations/puppet@production] labweb nutcracker: add non-functional redis host list

https://gerrit.wikimedia.org/r/413413

Change 413490 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labweb: create /var/run/nutcracker

https://gerrit.wikimedia.org/r/413490

Change 413490 merged by Andrew Bogott:
[operations/puppet@production] labweb: create /var/run/nutcracker

https://gerrit.wikimedia.org/r/413490

Change 413396 abandoned by Andrew Bogott:
labweb nutcracker: further attempt to pass in memcached_pools correctly

Reason:
I did this a different way

https://gerrit.wikimedia.org/r/413396

nutcracker is now running on both labwebs, sharing a memcached pool. In theory it's also configured for redis but redis isn't actually running on either host. We'll see if we turn out to need it.
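To show why a proxy on each host can still present one shared pool, here is a small sketch of deterministic key-to-server mapping. It is not nutcracker's actual algorithm (its hashing and distribution are set in its own config and are not reproduced here); the server names and the `pick_server` helper are hypothetical. The point is only that two hosts given the same server list send the same key to the same memcached instance.

```python
import hashlib

# Placeholder server list; both proxies would need the same list in the same order.
SERVERS = ["labweb1001.example:11211", "labweb1002.example:11211"]


def pick_server(key, servers=SERVERS):
    """Map a cache key to one server using a simple hash-modulo scheme."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
```

Because the mapping depends only on the key and the server list, a value cached through one labweb host would be looked up on the same backend when the request lands on the other host.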