Setup two elasticsearch clusters on relforge to test multi-instance
Closed, Resolved (Public)

Description

Most likely we will have missed a couple of things while refactoring puppet. Set up two clusters on the relforge servers as a testbed before setting up multiple clusters on the regular prod clusters.

Event Timeline

EBernhardson triaged this task as Medium priority. Jun 27 2018, 7:32 PM
EBernhardson created this task.
EBjune moved this task from needs triage to Up Next on the Discovery-Search board. Jul 5 2018, 5:15 PM

Change 466591 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] relforge: setup 2 instances to validate multi-instance configuration

https://gerrit.wikimedia.org/r/466591

Change 467684 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] tlsproxy: allow multiple default servers on different ports

https://gerrit.wikimedia.org/r/467684

Gehel added subscribers: BBlack, Joe. Edited Oct 16 2018, 3:52 PM

The current puppet code for tlsproxy::localssl does not allow for multiple $default_server, even when on different ports.

This led to an interesting conversation on how to differentiate the elasticsearch instances, besides using different ports. Options:

  1. differentiate on TCP port only (first instance on 9243, second on 9443)
  2. differentiate on server names
  3. differentiate on IP
  4. a combination of some of the above

Assessment of each option:

  1. seems the simplest solution: it matches the expectations of the clients and does not have any significant drawback that we could find
  2. SAN / SNI support in HTTP libraries is often broken, if supported at all. Our current clients might be OK (unchecked), but if we can avoid the pain, we should
  3. it would work fine, but since we want (at least at some point) to have elasticsearch listening only on localhost, elasticsearch will be on different ports already. Exposing this mapping to the TLS endpoint as well seems simpler and less surprising. (Yes, we could use different lo: aliases, but that seems even more confusing for not much gain.)
  4. why not?

With all that, option 1) seems the most interesting, but requires a change to tlsproxy::localssl. @Joe / @BBlack: you seem to be the main contributors to that class, feedback welcomed.
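To make the constraint behind option 1 concrete, here is a minimal Python sketch of the rule being asked for: more than one default server is acceptable as long as no two of them share a TLS port. This is not the puppet implementation of tlsproxy::localssl; `ProxyDef`, `conflicting_defaults`, and the instance names are hypothetical and exist only for illustration. The ports are the ones mentioned in option 1.

```python
from collections import defaultdict
from typing import NamedTuple


class ProxyDef(NamedTuple):
    """Hypothetical stand-in for one TLS proxy definition."""
    name: str
    tls_port: int
    default_server: bool


def conflicting_defaults(proxies):
    """Return {port: [names]} for every port with more than one default server."""
    by_port = defaultdict(list)
    for proxy in proxies:
        if proxy.default_server:
            by_port[proxy.tls_port].append(proxy.name)
    return {port: names for port, names in by_port.items() if len(names) > 1}


# Two default servers on different ports: allowed under option 1.
proxies = [
    ProxyDef("instance-a", 9243, True),
    ProxyDef("instance-b", 9443, True),
]
print(conflicting_defaults(proxies))
```

With one default server per port the result is empty; two default servers on the same port would be reported under that port.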

Change 467684 merged by Gehel:
[operations/puppet@production] tlsproxy: allow multiple default servers on different ports

https://gerrit.wikimedia.org/r/467684

Change 468320 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] tlsproxy::localssl: allow mutliple proxies with the same certificate

https://gerrit.wikimedia.org/r/468320

Change 468320 merged by Gehel:
[operations/puppet@production] tlsproxy::localssl: allow mutliple proxies with the same certificate

https://gerrit.wikimedia.org/r/468320

Change 466591 merged by Gehel:
[operations/puppet@production] relforge: setup 2 instances to validate multi-instance configuration

https://gerrit.wikimedia.org/r/466591

It seems to work well. The problems identified so far are:

  • hotthread script (T209030)
  • firewall config, mwmaint1002 is unable to talk to relforge:
sudo iptables -n -L | grep 10.64.16.77
ACCEPT     tcp  --  10.64.16.77          0.0.0.0/0            tcp dpt:9243

This is very probably specific to relforge, but I suggest checking whether we have other ad hoc rules that need to be adjusted for the new ports.
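A check for such rules could be sketched roughly as below. It is a hedged illustration, not an existing tool: it parses text in the `iptables -n -L` format shown above and lists which required ports have no ACCEPT rule for a given source address. The function names are hypothetical, port 9443 is assumed to be the second instance's port (taken from option 1 earlier in this task), and rules written as port ranges or multiport matches are not handled.

```python
import re

# Sample taken from the iptables output quoted in this task.
SAMPLE = """\
ACCEPT     tcp  --  10.64.16.77          0.0.0.0/0            tcp dpt:9243
"""


def accepted_ports(iptables_output, source_ip):
    """Collect destination ports that have an ACCEPT rule for source_ip."""
    ports = set()
    for line in iptables_output.splitlines():
        # Columns: target, protocol, options, source, destination, extras.
        fields = line.split()
        if len(fields) < 4 or fields[0] != "ACCEPT" or fields[3] != source_ip:
            continue
        match = re.search(r"dpt:(\d+)", line)
        if match:
            ports.add(int(match.group(1)))
    return ports


def missing_ports(iptables_output, source_ip, required):
    """Return the required ports that source_ip is not allowed to reach."""
    return sorted(set(required) - accepted_ports(iptables_output, source_ip))


# 9443 is an assumed port for the second instance.
print(missing_ports(SAMPLE, "10.64.16.77", [9243, 9443]))
```

Run against the sample, this reports 9443 as missing, which matches the symptom described: only the first instance's port is open to that host.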

Change 475745 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] [cirrus] Use normal config for labswiki

https://gerrit.wikimedia.org/r/475745

Change 475745 merged by jenkins-bot:
[operations/mediawiki-config@master] [cirrus] Use normal config for labswiki

https://gerrit.wikimedia.org/r/475745

debt closed this task as Resolved. Nov 29 2018, 7:50 PM