There's also currently only one Tomcat node, so when it goes down, all Java webservices are dead.
|Open||None||T90534 Make toolforge reliable enough (tracking)|
|Open||None||T91068 Set up a schedule for doing failover exercises for toollabs|
|Resolved||Andrew||T90542 Make sure that toollabs can function fully even with one virt* host fully down|
|Resolved||yuvipanda||T91066 Retire 'tomcat' node, make Java apps run on the generic webgrid|
How much memory are we saving by having separate nodes for lighttpd-based tasks and overprovisioning them? (If that is still true; modules/toollabs/manifests/node/web/lighttpd.pp doesn't mention anything specific.)
Otherwise, why not only have "generic execution nodes" (or "Precise exec node" and "Trusty exec node"), so we don't have to have two of each type? Fewer instances to worry about.
No, I didn't mean "generic web node", but "generic execution node". All jobs on all nodes, no nodes that only run jobs in a subset of queues. (Maybe prioritize web queues over others so that the start-up time of a webservice is in the interactive range.) Only one type of execution node = only one type of node to spread over the virtual servers.
It looks like the only tools still running on the -tomcat node are ones that were started with qsub and hence have no filters restricting them to the exec nodes. I'll just get rid of the node in a couple of hours, once nothing is running on it.
yuvipanda@tools-bastion-01:~$ qconf -de tools-webgrid-tomcat.eqiad.wmflabs
Host object "tools-webgrid-tomcat.eqiad.wmflabs" is still referenced in cluster queue "webgrid-generic".
Which is strange because I don't see tools-webgrid-tomcat in the webgrid-generic queue.
qhost -j (NB: qhost, no hostname) shows:
[…]
tools-webgrid-tomcat.eqiad.wmflabs lx26-amd64 8 - 15.7G - 1.9G -
   9804789 0.30012 opentask tools.sugges r 04/10/2015 05:45:04 webgrid-ge MASTER
Did you delete the instance with the job still running?
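To make that check repeatable, here is a minimal sketch that pulls the job lines for one host out of `qhost -j` output. The function name `jobs_on_host` is hypothetical, and the parsing assumes the layout seen in the excerpt above: host lines start in the first column, and job lines are indented beneath them and begin with a numeric job ID.

```shell
#!/bin/sh
# jobs_on_host HOST (hypothetical helper, not part of gridengine):
# reads `qhost -j` output on stdin and prints the job lines listed under HOST.
# Assumption: host lines are unindented, job lines are indented and start
# with a numeric job ID; indented header or separator lines are skipped.
jobs_on_host() {
    awk -v host="$1" '
        # An unindented line starts a new host entry (or is a table header).
        /^[^ \t]/ { current = $1; next }
        # Indented line under the requested host whose first field is a job ID.
        current == host && $1 ~ /^[0-9]+$/ { sub(/^[ \t]+/, ""); print }
    '
}

# Usage (on a bastion with gridengine installed):
#   qhost -j | jobs_on_host tools-webgrid-tomcat.eqiad.wmflabs
```

If this prints anything, the host still has jobs attached, which would explain why qconf reports it as referenced by a queue.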
scfc@tools-bastion-01:~$ qconf -de tools-webgrid-tomcat.eqiad.wmflabs
email@example.com removed "tools-webgrid-tomcat.eqiad.wmflabs" from execution host list
scfc@tools-bastion-01:~$
scfc@tools-bastion-01:~$ qconf -ds tools-webgrid-tomcat.eqiad.wmflabs
firstname.lastname@example.org removed "tools-webgrid-tomcat.eqiad.wmflabs" from submit host list
scfc@tools-bastion-01:~$
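For future node retirements, the steps in this thread could be written down as a small plan generator. This is only a sketch: `retire_host_plan` is a hypothetical name, and the first two steps (disabling the host's queue instances with `qmod -d` and re-checking with `qhost -j -h`) are assumptions added here rather than something that was run above. It only prints the commands so they can be reviewed and run one at a time.

```shell
#!/bin/sh
# retire_host_plan HOST (hypothetical helper): print, without executing,
# the commands for removing an execution host from the grid.
# Steps 1 and 2 are assumed precautions; steps 3 and 4 are the commands
# used in this thread.
#   1. qmod -d      disable all queue instances on the host (no new jobs)
#   2. qhost -j -h  list jobs still attached to the host; wait until empty
#   3. qconf -de    remove the host from the execution host list
#   4. qconf -ds    remove the host from the submit host list
retire_host_plan() {
    host=$1
    printf '%s\n' \
        "qmod -d '*@$host'" \
        "qhost -j -h $host" \
        "qconf -de $host" \
        "qconf -ds $host"
}

# Usage:
#   retire_host_plan tools-webgrid-tomcat.eqiad.wmflabs
```

Printing rather than executing is deliberate: `qconf -de` refuses to remove a host that is still referenced by a queue, so the output of step 2 needs a human look before going further.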