
RESTBase service exits on Parsoid ECONNREFUSED
Closed, Resolved (Public)

Description

Just noticed this strange behaviour on deployment-restbase0x: when a call to Parsoid results in ECONNREFUSED, RESTBase exits cleanly:

root@deployment-restbase02:/# systemctl status restbase
● restbase.service - LSB: REST storage API and backend orchestration layer
   Loaded: loaded (/etc/init.d/restbase)
   Active: active (exited) since Mon 2015-05-18 08:47:26 UTC; 37min ago
  Process: 11558 ExecStart=/etc/init.d/restbase start (code=exited, status=0/SUCCESS)

What makes it really peculiar is that if RESTBase is started manually (even as the restbase user), this does not occur, i.e. RESTBase keeps running.

In both cases RESTBase returns the appropriate error JSON, which suggests that the exit occurs after the client connection has been closed.

Further investigation suggests this may be the fault of the gelf logger: when logging to stdout/syslog (regardless of how the service is started - manually or via systemd), RESTBase remains alive, but as soon as gelf is introduced, the service exits right after ECONNREFUSED is received from Parsoid. The explanation might lie in the fact that the generated error-message JSON sent to the logger is 1950 bytes long.

More investigation to follow.

Event Timeline

mobrovac raised the priority of this task from to Medium.
mobrovac updated the task description.
mobrovac added a project: RESTBase.
mobrovac subscribed.
mobrovac raised the priority of this task from Medium to Unbreak Now!.
mobrovac updated the task description.
mobrovac set Security to None.
mobrovac removed a subscriber: Aklapper.

Indeed, the problem is gelf. More concretely, it tries to send its messages to deployment-logstash1.eqiad.wmflabs, but that host is, strangely, missing a DNS entry (cf. T99521).
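For anyone wanting to confirm the missing DNS entry independently of RESTBase, a minimal sketch in Python follows. It only checks whether the log host name resolves; it does not reproduce the exit itself. The port (12201, the conventional GELF UDP port) and the helper name `host_resolves` are assumptions for illustration, not taken from the RESTBase configuration.

```python
import socket

# Host name taken from the comment above. The port is an assumption:
# 12201 is the conventional GELF UDP port, not confirmed for this setup.
LOG_HOST = "deployment-logstash1.eqiad.wmflabs"
LOG_PORT = 12201


def host_resolves(hostname, port, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address.

    `resolver` is injectable (hypothetical parameter) so the check can be
    exercised without touching real DNS.
    """
    try:
        return len(resolver(hostname, port, type=socket.SOCK_DGRAM)) > 0
    except socket.gaierror:
        # Raised by getaddrinfo when the name cannot be resolved.
        return False
```

Calling `host_resolves(LOG_HOST, LOG_PORT)` on the affected instance would be expected to return `False` while the DNS entry is missing, and `True` once T99521 is resolved.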

Change 211724 had a related patch set uploaded (by Mobrovac):
Beta: RESTBase: Switch to deployment-logstash1's IP address

https://gerrit.wikimedia.org/r/211724

Change 211724 abandoned by Mobrovac:
Beta: RESTBase: Switch to deployment-logstash1's IP address

Reason:
The culprit, https://phabricator.wikimedia.org/T99521, has been resolved, so this change is no longer needed.

https://gerrit.wikimedia.org/r/211724