
Provide a status page (list) of all active proxy definitions
Closed, Resolved, Public

Description

To compare the list of running webservices with the list of all active proxy definitions, one currently needs to log into tools-webproxy, connect to the Redis server, list the keys, and then compare them to the output of qstat.
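The comparison step itself amounts to two set differences. A minimal sketch, assuming the tool names have already been extracted from the Redis keys and from the qstat output (the function name `compare` and the labels `stale` and `missing` are made up for illustration):

```python
def compare(proxy_entries, running_services):
    """Compare tool names that have a proxy entry with tool names that
    have a running web service. Both arguments are iterables of names;
    how they are obtained (Redis, qstat) is outside this sketch."""
    proxied = set(proxy_entries)
    running = set(running_services)
    return {
        # Proxy entry exists, but no web service is running.
        "stale": sorted(proxied - running),
        # Web service is running, but no proxy entry points at it.
        "missing": sorted(running - proxied),
    }
```

Anything reported under `stale` would be a proxy entry without a matching web service.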

It would be much nicer if this list were available directly from nginx. For starters, composing the list from just the prefixes should a) be enough and b) not cause any security concerns. It should look something like this (http://tools.wmflabs.org/?some-magic-url):

bub
dschwenbot
pathoschild-contrib
[…]

Optionally/additionally, http://tools.wmflabs.org/?some-magic-url.json would produce the list in JSON format.
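To make the two proposed output formats concrete, here is a rough sketch in Python (not the Lua that would run inside nginx). The key layout `prefix:<toolname>` and all function names are assumptions for illustration only:

```python
import json


def prefixes_from_keys(keys, key_prefix="prefix:"):
    """Return the sorted tool prefixes found in a list of proxy keys.
    The "prefix:<toolname>" key layout is an assumption of this sketch."""
    return sorted(k[len(key_prefix):] for k in keys if k.startswith(key_prefix))


def render_plain(prefixes):
    """One prefix per line, as in the plain-text example."""
    return "".join(p + "\n" for p in prefixes)


def render_json(prefixes):
    """The same list as a JSON array."""
    return json.dumps(prefixes)
```

Only the prefixes are emitted, never the backend host or port, which matches the point about avoiding security concerns.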

Event Timeline

scfc raised the priority of this task from to Needs Triage.
scfc updated the task description.
scfc added a project: Toolforge.
scfc subscribed.

Yup, should be done in Lua on nginx, I think.

Forgot to claim. "?some-magic-url" is up for bikeshedding, of course.

On second thought …

proxylistener has the benefit of deleting a proxy entry when the attached web service terminates and closes the socket. On the other hand, it deletes the proxy entry only when the socket is closed, which causes mayhem when proxylistener itself crashes or shuts down and a lot of stale entries are left behind. In practice, I think it would be more stable to replace this open-socket procedure with a wrapper script that a) deletes all existing proxy entries for a tool (because usually only one web service per tool is running at any time), b) adds the proxy entry for the web service, c) runs the web service and d) deletes the proxy entry.
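The four wrapper steps could be sketched roughly as follows. This is an illustration only: a plain dict stands in for the real proxy store, and the function name, the registry layout and the host names are all hypothetical.

```python
import subprocess
import sys


def run_with_proxy_entry(registry, tool, backend, command):
    """Run `command` while `tool` is registered in `registry`.

    `registry` is a plain dict (tool name -> backend) standing in for
    the real proxy store; it is an assumption of this sketch.
    """
    registry.pop(tool, None)      # a) delete existing entries for the tool
    registry[tool] = backend      # b) add the entry for this web service
    try:
        return subprocess.call(command)   # c) run the web service
    finally:
        registry.pop(tool, None)  # d) delete the entry when it ends


if __name__ == "__main__":
    # Hypothetical host:port values; the "web service" here exits at once.
    entries = {"bub": "old-host:1234"}
    code = run_with_proxy_entry(
        entries, "bub", "new-host:9999", [sys.executable, "-c", "pass"]
    )
    print(code, entries)
```

Note that the `finally` clause only covers the case where the wrapper itself survives. If the wrapper process is killed outright, step d) never runs and the entry stays behind.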

If the socket no longer needs to be kept open, I think the interface on the proxy side could be integrated into nginx as a Lua script instead of being a standalone Python daemon. In that case, it makes sense to set up this interface on a different port so that there is a clear distinction between proxying (accessible from outside Labs) and proxy management (accessible only to Tools instances).

As a first step in this direction, I'll set up a different nginx site (proxymanager) under port 8081 for this task.

@scfc: The reason we have the socket setup is to prevent one particular class of race conditions that perhaps let one tool pretend to be another tool for a short period of time. Consider:

  1. Tool A is bound to port 9999, is serving
  2. Tool A crashes
  3. Tool X, malicious, is continuously trying to grab Tool A's port, and succeeds
  4. Now, until tool A comes back up, people hitting Tool A's URL are actually hitting Tool X, and they have no idea.
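The scenario can be replayed with a toy model, where one dict plays the routing table and another records which process currently holds each port. All names here are made up for illustration; nothing below talks to a real proxy:

```python
def timeline():
    """Replay the four steps and record who answers requests for tool-a."""
    routes = {"tool-a": 9999}      # proxy entry: URL prefix -> backend port
    listeners = {9999: "tool-a"}   # which process currently holds each port

    def serve(prefix):
        # The proxy only knows the port, not who is listening on it.
        return listeners.get(routes.get(prefix))

    seen = [serve("tool-a")]       # 1. tool A is bound to 9999 and serving
    del listeners[9999]            # 2. tool A crashes; the route is left behind
    seen.append(serve("tool-a"))
    listeners[9999] = "tool-x"     # 3. tool X grabs the freed port
    seen.append(serve("tool-a"))   # 4. tool A's URL now reaches tool X
    return seen
```

The routing entry never changes during the replay; only the owner of the port does, which is why the stale entry is the problem.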

This could be made worse, for example by crashing a tool until bigbrother refuses to restart it, and then still getting all the URLs because of the stale routing entry...

Of course, if we have a reliable way of executing code when the tool's webprocess ends, that changes everything :)

Well, at the moment the stale proxy entries are up for grabs by anyone until an admin intervenes. I thought about putting the shutdown code in an SGE epilogue script which I think is executed even in case of OOM & Co.

There should be no stale proxy entries now tho - proxylistener is supposed to clean them up when the connection breaks...

… except when the instance crashes due to an outage, for example :-) (currently "wikistream" has a proxy entry, but no web service running on tools-webgrid-generic-01:51321).

Ah of course :)

Yeah, if we can get epilogue scripts verifiably working I think we can switch to those :)

scfc triaged this task as High priority. Apr 6 2015, 7:14 AM
scfc set Security to None.
scfc moved this task from Backlog to In Progress on the Toolforge board.

Change 203313 had a related patch set uploaded (by Tim Landscheidt):
dynamicproxy: Provide list of active proxy entries for urlproxy

https://gerrit.wikimedia.org/r/203313

scfc changed the task status from Open to Stalled. Apr 13 2015, 7:25 AM
scfc updated the task description.

Change 203313 merged by Yuvipanda:
dynamicproxy: Provide list of active proxy entries for urlproxy

https://gerrit.wikimedia.org/r/203313

Note that we can also have a list of webservices that *should* be running by looking at service manifests :)

I've backported Jessie's lua-json to tools' local trusty repo and all is well.

Should this be closed now or should it also provide a publicly viewable page?

Change 204770 had a related patch set uploaded (by Tim Landscheidt):
dynamicproxy: Open firewall for proxymanager

https://gerrit.wikimedia.org/r/204770

Change 204770 merged by Yuvipanda:
dynamicproxy: Open firewall for proxymanager

https://gerrit.wikimedia.org/r/204770

scfc changed the task status from Stalled to Open. Apr 18 2015, 5:16 PM
scfc closed this task as Resolved.
scfc updated the task description.