Maniphest T198352

Setup two elasticsearch clusters on relforge to test multi-instance
Closed, Resolved, Public

Assigned To

	Gehel

Authored By

	EBernhardson
	Jun 27 2018, 7:32 PM

Description

We have most likely missed a couple of things while refactoring puppet. Set up two clusters on the relforge servers as a testbed before setting up multiple clusters on the regular production servers.

Details

Subject	Repo	Branch	Lines +/-
[cirrus] Use normal config for labswiki	operations/mediawiki-config	master	+0 -7
relforge: setup 2 instances to validate multi-instance configuration	operations/puppet	production	+15 -10
tlsproxy::localssl: allow mutliple proxies with the same certificate	operations/puppet	production	+21 -13
tlsproxy: allow multiple default servers on different ports	operations/puppet	production	+9 -5


Related Objects

Status	Assigned	Task
Resolved	EBernhardson	T183281 [epic] ELK upgrade to 6.x (elasticsearch, kibana, logstash)
Resolved	None	T183282 [epic] Search cluster upgrade to 6.x
Resolved	debt	T193654 [epic] Run multiple elasticsearch clusters on same hardware
Resolved	Gehel	T198352 Setup two elasticsearch clusters on relforge to test multi-instance

Event Timeline

EBernhardson triaged this task as Medium priority. Jun 27 2018, 7:32 PM

EBernhardson created this task.

• EBjune moved this task from needs triage to Up Next on the Discovery-Search board. Jul 5 2018, 5:15 PM

Gehel edited projects, added Discovery-Search (Current work); removed Discovery-Search. Oct 11 2018, 1:05 PM

Gehel moved this task from Incoming to not in use - please delete on the Discovery-Search (Current work) board.

Change 466591 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] relforge: setup 2 instances to validate multi-instance configuration

https://gerrit.wikimedia.org/r/466591

gerritbot added a project: Patch-For-Review. Oct 11 2018, 1:06 PM

EBernhardson assigned this task to Gehel. Oct 11 2018, 6:16 PM

Change 467684 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] tlsproxy: allow multiple default servers on different ports

https://gerrit.wikimedia.org/r/467684

The current puppet code for tlsproxy::localssl does not allow more than one $default_server, even when they are on different ports.

This led to an interesting conversation on how to differentiate the elasticsearch instances, besides using different ports. Options:

1) differentiate on TCP port only (first instance on 9243, second on 9443)
2) differentiate on server names
3) differentiate on IP
4) a combination of some of the above

Assessment of each option:

1) This seems the simplest solution: it matches the expectations of the clients and has no significant drawback that we could find.
2) SAN / SNI support in HTTP libraries is often broken, if supported at all. Our current clients might be OK (unchecked), but if we can avoid the pain, we should.
3) It would work fine, but since we want (at least at some point) to have elasticsearch listening only on localhost, the elasticsearch instances will already be on different ports. Exposing this mapping at the TLS endpoint as well seems simpler and less surprising. (Yes, we could use different lo:aliases, but that seems even more confusing for not much gain.)
4) Why not?

With all that, option 1) seems the most interesting, but it requires a change to tlsproxy::localssl. @Joe / @BBlack: you seem to be the main contributors to that class, feedback welcome.
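
To illustrate what port-only differentiation means for a client, here is a minimal Python sketch. It assumes the two TLS ports quoted above (9243 and 9443); the instance names `first` and `second`, the host name `relforge.example`, and the helper `instance_url` are hypothetical and not taken from the puppet code.

```python
from urllib.parse import urlunsplit

# TLS ports as quoted in the discussion; the instance names are hypothetical.
INSTANCE_TLS_PORTS = {"first": 9243, "second": 9443}


def instance_url(host: str, instance: str, path: str = "/") -> str:
    """Build the HTTPS URL for one elasticsearch instance on a shared host.

    The host name (and therefore the certificate) is the same for every
    instance; only the TCP port selects which cluster the client reaches.
    """
    port = INSTANCE_TLS_PORTS[instance]
    return urlunsplit(("https", f"{host}:{port}", path, "", ""))


if __name__ == "__main__":
    for name in INSTANCE_TLS_PORTS:
        print(name, instance_url("relforge.example", name, "/_cluster/health"))
```

With this scheme a client needs no SNI or per-instance host name, which is the main argument made for option 1).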

Gehel mentioned this in T207195: Configure LVS endpoints for new elasticsearch clusters. Oct 16 2018, 5:11 PM

Change 467684 merged by Gehel:
[operations/puppet@production] tlsproxy: allow multiple default servers on different ports

https://gerrit.wikimedia.org/r/467684

Change 468320 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] tlsproxy::localssl: allow mutliple proxies with the same certificate

https://gerrit.wikimedia.org/r/468320

Change 468320 merged by Gehel:
[operations/puppet@production] tlsproxy::localssl: allow mutliple proxies with the same certificate

https://gerrit.wikimedia.org/r/468320

Change 466591 merged by Gehel:
[operations/puppet@production] relforge: setup 2 instances to validate multi-instance configuration

https://gerrit.wikimedia.org/r/466591

Gehel moved this task from not in use - please delete to Needs review on the Discovery-Search (Current work) board. Nov 6 2018, 6:25 PM

It seems to work well. The problems identified so far are:

hotthread script (T209030)
firewall config, mwmaint1002 is unable to talk to relforge:

sudo iptables -n -L | grep 10.64.16.77
ACCEPT     tcp  --  10.64.16.77          0.0.0.0/0            tcp dpt:9243

This is very probably specific to relforge, but I suggest checking whether we have other ad hoc rules that need to be adjusted for the new ports.
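
One way to look for such rules is sketched below in Python. It assumes the `iptables -n -L` line format shown above and that every source allowed on 9243 should also be allowed on 9443 (the second instance's TLS port from the earlier comment). The function name `sources_missing_ports` is hypothetical, and rules written with port ranges or multiport matches are not handled.

```python
import re
from collections import defaultdict

# Matches lines such as:
# ACCEPT     tcp  --  10.64.16.77          0.0.0.0/0            tcp dpt:9243
RULE_RE = re.compile(
    r"^ACCEPT\s+tcp\s+--\s+(?P<src>\S+)\s+\S+\s+tcp dpt:(?P<port>\d+)\s*$"
)


def sources_missing_ports(iptables_output, required_ports=(9243, 9443)):
    """Return {source: [missing ports]} for sources allowed on only some ports.

    A source is reported only if it already has an ACCEPT rule for at least
    one of the required ports but lacks a rule for another one.
    """
    ports_by_source = defaultdict(set)
    for line in iptables_output.splitlines():
        match = RULE_RE.match(line.strip())
        if match:
            ports_by_source[match.group("src")].add(int(match.group("port")))

    required = set(required_ports)
    return {
        src: sorted(required - ports)
        for src, ports in ports_by_source.items()
        if ports & required and required - ports
    }


if __name__ == "__main__":
    import sys

    # Usage: sudo iptables -n -L | python3 this_script.py
    for src, missing in sources_missing_ports(sys.stdin.read()).items():
        print(f"{src} is missing rules for ports {missing}")
```

Fed the single rule shown above, the sketch reports 10.64.16.77 as missing port 9443.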

EBernhardson moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. Nov 15 2018, 4:59 PM

Change 475745 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] [cirrus] Use normal config for labswiki

https://gerrit.wikimedia.org/r/475745

Change 475745 merged by jenkins-bot:
[operations/mediawiki-config@master] [cirrus] Use normal config for labswiki

https://gerrit.wikimedia.org/r/475745

debt closed this task as Resolved. Nov 29 2018, 7:50 PM
