Set up an experimental, user accessible (read+write) ES cluster for Tool Labs
Closed, Resolved · Public

Description

We should set up an experimental Elasticsearch (ES) cluster on Tool Labs that is read-for-all and, to begin with, write-by-whitelist. Several tools would benefit from such a thing (originally suggested by @Halfak and @Capt_Swing for a Teahouse search project), and the cluster that stashbot uses can simply become part of Labs. This will also give us experience to build on for future logging experiments.

Event Timeline

yuvipanda raised the priority of this task from to Needs Triage.
yuvipanda updated the task description.
Restricted Application added subscribers: StudiesWorld, Aklapper. Dec 2 2015, 12:18 AM

a 3 node cluster with an authenticating proxy in front!

bd808 added a subscriber: bd808. Dec 2 2015, 12:31 AM

read-for-all

This should be easy, but don't forget that all the horrible things one can do to DoS or crash an Elasticsearch node are query-based. Will we want to expose Elasticsearch outside of the tools project?

authenticating proxy in front

What will we auth against? Labs LDAP?

I'm assuming that at least the end goal here will be to isolate write access by index (e.g. stashbot can write to stashbot-* but not teahouse-*). Do we need to care about that at all at first?
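
A minimal sketch of what per-index write isolation could look like, assuming each tool is granted a list of glob patterns for the indices it may write to. The `WRITE_GRANTS` table and the `can_write` helper are hypothetical names used only for illustration; nothing in this task defines them.

```python
from fnmatch import fnmatchcase

# Hypothetical grant table: tool name -> index patterns it may write to.
WRITE_GRANTS = {
    "stashbot": ["stashbot-*"],
    "teahouse": ["teahouse-*"],
}


def can_write(tool: str, index: str) -> bool:
    """Return True if `tool` holds a grant whose pattern matches `index`."""
    patterns = WRITE_GRANTS.get(tool, [])
    return any(fnmatchcase(index, pattern) for pattern in patterns)
```

With this table, `can_write("stashbot", "stashbot-2015.12")` is allowed while `can_write("stashbot", "teahouse-posts")` is refused, and a tool with no entry cannot write anywhere.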

bd808 claimed this task. Dec 2 2015, 12:34 AM
bd808 triaged this task as Medium priority.
bd808 moved this task from Triage to In Progress on the Toolforge board.
Restricted Application added a project: User-bd808. Dec 2 2015, 12:34 AM
bd808 moved this task from To Do to In Dev/Progress on the User-bd808 board. Dec 2 2015, 12:35 AM

Right, so by default let's not expose this outside the tools project, and make up DoS protections as we go along :D

Ah, so the other things that authenticate all authenticate against... identd. I would say we can even get away with something like basic auth to begin with and then decide what to do afterwards.

bd808 added a comment. Dec 2 2015, 3:54 AM

Current plan:

  • 3 XL jessie instances in the tools project
  • new ::role::toollabs::elasticsearch Puppet role to provision:
    • Elasticsearch
    • reverse proxy that restricts writing to Elasticsearch instances

The initial reverse proxy will probably be nginx with basic auth protection for POST requests. That will let us get started with the other bits. Eventually this would be replaced with a custom proxy that allows more fine-grained restrictions.
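
The rule the proxy is meant to enforce can be sketched in Python as below. This is only a model of the decision logic, not the nginx configuration from the patch. The names `WRITE_METHODS`, `check_basic_auth` and `is_allowed`, and the in-memory credentials dict, are assumptions made for illustration. The set holds only POST because that is what the plan names; a real deployment would probably also need PUT and DELETE, since those modify data in the Elasticsearch REST API as well.

```python
import base64
import hmac

# Assumption: only POST is protected, as in the plan above.
WRITE_METHODS = {"POST"}


def check_basic_auth(header, credentials):
    """Validate an HTTP Basic Authorization header against a user -> password dict."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    expected = credentials.get(user)
    if expected is None:
        return False
    # Constant-time comparison to avoid leaking how much of the password matched.
    return hmac.compare_digest(password.encode("utf-8"), expected.encode("utf-8"))


def is_allowed(method, auth_header, credentials):
    """Reads pass through unauthenticated; protected methods need valid basic auth."""
    if method.upper() not in WRITE_METHODS:
        return True
    return check_basic_auth(auth_header, credentials)
```

Keeping the protected methods in one set makes it easy to widen the rule later without touching the authentication check.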

Change 256618 had a related patch set uploaded (by BryanDavis):
[WIP] Elasticsearch with proxy for tool labs

https://gerrit.wikimedia.org/r/256618

bd808 added a comment. Dec 4 2015, 1:56 PM

I've got the basic cluster up and running on tools-elastic-0[123]. The status of the cluster can be seen from https://tools.wmflabs.org/bd808-test/elastic.php

My next step will be to load some data in and make sure querying works as expected through the nginx reverse proxy layer.

Change 256618 merged by Yuvipanda:
Elasticsearch with proxy for tool labs

https://gerrit.wikimedia.org/r/256618

bd808 closed this task as Resolved. Dec 31 2015, 6:08 AM

The stashbot, sal and bash tools are all now using the tools cluster!

bd808 moved this task from Needs Review/Feedback to Done on the User-bd808 board. Dec 31 2015, 6:09 AM
bd808 moved this task from Done to Archive on the User-bd808 board. Feb 21 2016, 8:53 PM

i saw this mentioned securing elasticsearch. I dunno how useful it is, but here are some notes i took at elasticon 16 from their securing elasticsearch talk (which was kind-of a why a proxy only kinda/sorta works and you should pay for shield...but still good info): P2678

bd808 added a comment. Feb 26 2016, 5:16 PM

i saw this mentioned securing elasticsearch. I dunno how useful it is, but here are some notes i took at elasticon 16 from their securing elasticsearch talk (which was kind-of a why a proxy only kinda/sorta works and you should pay for shield...but still good info): P2678

Helpful stuff @EBernhardson, thanks.