Description

Nodes are configured by /etc/ores/99-main.yaml to use ores1001.eqiad.wmnet:6379 as the Celery broker and ores1001.eqiad.wmnet:6380 as the score cache, but the Redis servers on ores1001 are listening only on 127.0.0.1, so these connections are refused.
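A quick way to confirm the symptom (a sketch only; the output below is illustrative, but the hostnames and ports are the ones named above):

```
# From any ORES worker node: both Redis ports on ores1001 refuse connections.
$ redis-cli -h ores1001.eqiad.wmnet -p 6379 ping
Could not connect to Redis at ores1001.eqiad.wmnet:6379: Connection refused
$ redis-cli -h ores1001.eqiad.wmnet -p 6380 ping
Could not connect to Redis at ores1001.eqiad.wmnet:6380: Connection refused

# On ores1001 itself: both instances are bound to the loopback interface only,
# so they are unreachable from the other cluster nodes.
$ ss -tlnp | grep -E ':(6379|6380) '
LISTEN 0 128 127.0.0.1:6379 0.0.0.0:* users:(("redis-server",...))
LISTEN 0 128 127.0.0.1:6380 0.0.0.0:* users:(("redis-server",...))
```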
| Status | Assigned | Task |
|---|---|---|
| Resolved | akosiaris | T162039 Prepare to service applications from kubernetes |
| Resolved | akosiaris | T162041 Expand the infrastructure to codfw |
| Resolved | RobH | T161700 CODFW: (4) hardware access request for kubernetes |
| Resolved | RobH | T142578 codfw/eqiad: (9+9) hardware access request for ORES |
| | | Unknown Object (Task) |
| Resolved | akosiaris | T165170 rack/setup/install ores2001-2009 |
| Resolved | None | T176324 Scoring platform team FY18 Q2 |
| Declined | None | T179501 Use external dsh group to list pooled ORES nodes |
| Resolved | akosiaris | T168073 Switch ORES to dedicated cluster |
| Resolved | Halfak | T185901 Preliminary deployment of ORES to new cluster |
| Resolved | akosiaris | T171851 Reimage ores* hosts with Debian Stretch |
| Resolved | awight | T169246 Stress/capacity test new ores* cluster |
| Resolved | awight | T181806 Problem with Redis server configuration on new ORES cluster |
Event Timeline
So the boxes were rebooted some 18 days ago, and since the stress test was not supposed to last that long, the Redis configuration was never properly puppetized; on boot the wrong Redis was started. I've fixed that manually for now, but the fix will not persist.
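As a rough illustration of the kind of manual intervention described above — none of the file paths, unit names, or bind addresses below come from the task; they are assumptions about a typical multi-instance Redis host:

```
# Hypothetical manual fix on ores1001; paths, unit names and addresses are
# illustrative, not taken from the task.

# See which Redis instances actually came up after the reboot.
$ systemctl list-units 'redis*' --no-pager

# Check what each instance is bound to; loopback-only explains the refusals.
$ sudo grep -n '^bind' /etc/redis/*.conf

# Re-point the broker (6379) and score-cache (6380) instances at a reachable
# address and restart them. Until this is puppetized, a puppet run or the next
# reboot can undo it.
$ sudo sed -i 's/^bind 127\.0\.0\.1/bind 0.0.0.0/' /etc/redis/tcp_6379.conf /etc/redis/tcp_6380.conf
$ sudo systemctl restart redis-instance-tcp_6379 redis-instance-tcp_6380
```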