Set up an experimental, user-accessible (read+write) ES cluster for Tool Labs
Closed, Resolved, Public

Description

We should set up an experimental Elasticsearch (ES) cluster on Tool Labs that is read-for-all and, to begin with, write-by-whitelist. Several tools would benefit from such a thing (originally suggested by @Halfak and @Capt_Swing for a Teahouse search project), and the cluster that stashbot uses can simply become part of Labs. This will also give us experience to build on for future logging experiments.

Event Timeline

yuvipanda raised the priority of this task to Needs Triage.
yuvipanda updated the task description.

a 3 node cluster with an authenticating proxy in front!

read-for-all

This should be easy, but don't forget that all the horrible things one can do to DoS or crash an Elasticsearch node are query-based. Will we want to expose Elasticsearch outside of the tools project?

authenticating proxy in front

What will we auth against? Labs LDAP?

I'm assuming that at least the end goal here will be to isolate write access by index (e.g. stashbot can write to stashbot-* but not teahouse-*). Do we need to care about that at all at first?
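To make the per-index isolation idea concrete, here is a minimal sketch of how such a check could look. The rule table, the `may_write` name, and the use of shell-style wildcards are all assumptions for illustration; nothing in this task specifies how the eventual proxy would store or evaluate its rules.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: tool name -> index patterns that tool may write to.
WRITE_RULES = {
    "stashbot": ["stashbot-*"],
    "teahouse": ["teahouse-*"],
}


def may_write(tool, index, rules=WRITE_RULES):
    """Return True if `tool` may write to `index` under the given rules.

    Unknown tools have no patterns, so they are denied by default.
    """
    return any(fnmatchcase(index, pattern) for pattern in rules.get(tool, []))
```

With this table, `may_write("stashbot", "stashbot-2015.12")` is allowed while `may_write("stashbot", "teahouse-2015.12")` is denied. Denying anything that is not listed keeps the default safe while the whitelist is still small.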

bd808 triaged this task as Medium priority.
bd808 moved this task from Backlog to In Progress on the Toolforge board.

Right, so by default let's not expose this outside the tools project, and make up DoS protections as we go along :D

Ah, so the other things that do authentication authenticate against... identd. I would say we can even get away with something like basic auth to begin with and then decide what to do afterwards.

Current plan:

  • 3 XL jessie instances in the tools project
  • new ::role::toollabs::elasticsearch Puppet role to provision:
    • Elasticsearch
    • reverse proxy that restricts writing to Elasticsearch instances

The initial reverse proxy will probably be nginx with basic auth protection for POST requests. That will let us get started with other bits. Eventually this would be replaced with a custom proxy that allows more fine-grained restrictions.
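The access rule described above (anyone may read, writes need basic auth) can be sketched as a small decision function. This is not the nginx configuration from the patch, only an illustration of the logic. The function names and the plain-text credential table are hypothetical, and treating every method other than GET and HEAD as a write is an assumption that goes beyond the plan, which only mentions POST.

```python
import base64
import hmac

# Assumption: only these methods are treated as reads; all others need credentials.
READ_METHODS = {"GET", "HEAD"}


def parse_basic_auth(header):
    """Return (user, password) from a Basic Authorization header, or None."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        raw = base64.b64decode(header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return None
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None


def is_allowed(method, auth_header, credentials):
    """Allow reads for everyone; allow other methods only with valid basic auth.

    `credentials` is a hypothetical dict of user -> password.
    """
    if method.upper() in READ_METHODS:
        return True
    parsed = parse_basic_auth(auth_header)
    if parsed is None:
        return False
    user, password = parsed
    expected = credentials.get(user)
    if expected is None:
        return False
    # Constant-time comparison to avoid leaking password contents via timing.
    return hmac.compare_digest(password.encode("utf-8"), expected.encode("utf-8"))
```

One caveat with a method-based rule like this: Elasticsearch also accepts search requests sent as POST with a JSON body, so a real proxy would have to decide whether those count as reads.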

Change 256618 had a related patch set uploaded (by BryanDavis):
[WIP] Elasticsearch with proxy for tool labs

https://gerrit.wikimedia.org/r/256618

I've got the basic cluster up and running on tools-elastic-0[123]. The status of the cluster can be seen from https://tools.wmflabs.org/bd808-test/elastic.php

My next step will be to load some data in and make sure querying works as expected through the nginx reverse proxy layer.

Change 256618 merged by Yuvipanda:
Elasticsearch with proxy for tool labs

https://gerrit.wikimedia.org/r/256618

The stashbot, sal and bash tools are all now using the tools cluster!

I saw this task mentioned securing Elasticsearch. I don't know how useful it is, but here are some notes I took at Elasticon 16 from their securing Elasticsearch talk (which was kind of a "why a proxy only sort of works and you should pay for Shield" pitch, but still good info): P2678

Helpful stuff @EBernhardson, thanks.