Page MenuHomePhabricator

puppet keep trying to restart redis because upstart track wrong PID
Closed, ResolvedPublic

Description

T118704 / https://gerrit.wikimedia.org/r/#/c/253295/ introduced upstart support for Redis but it is not properly tracking the process PID:

#  start redis-instance-tcp_6379
redis-instance-tcp_6379 start/running, process 31771

# status redis-instance-tcp_6379
redis-instance-tcp_6379 stop/waiting

Tracking PID is wrong:

# cat /var/run/redis/tcp_6379.pid 
31762
# pidof redis-server
31762

The puppet service stanza mentions the service is managed by upstart and it thus uses status to verify whether it is working. Since the PID is wrong, status is always false and puppet always attempt to start it:

Notice: /Stage[main]/Role::Ci::Slave::Browsertests/Redis::Instance[6379]
    /Base::Service_unit[redis-instance-tcp_6379]/Service[redis-instance-tcp_6379]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Role::Ci::Slave::Browsertests/Redis::Instance[6379]
    /Base::Service_unit[redis-instance-tcp_6379]/Service[redis-instance-tcp_6379]: Unscheduling refresh on Service[redis-instance-tcp_6379]

The root cause is redis-server fork, thus changing the PID. The Upstart config need to be instructed to follow exactly one fork.

Event Timeline

hashar created this task.Dec 14 2015, 1:52 PM
hashar assigned this task to yuvipanda.
hashar raised the priority of this task from to Normal.
hashar updated the task description. (Show Details)
hashar added subscribers: aaron, Aklapper, ori and 5 others.

Change 258972 had a related patch set uploaded (by Hashar):
redis: upstart should track PID after one fork

https://gerrit.wikimedia.org/r/258972

hashar claimed this task.Dec 14 2015, 1:55 PM
hashar added a subscriber: yuvipanda.
hashar removed hashar as the assignee of this task.Dec 14 2015, 2:16 PM
hashar moved this task from To Triage to Done on the Beta-Cluster-Infrastructure board.
hashar moved this task from Untriaged to Done on the Continuous-Integration-Infrastructure board.

Fixed on Beta cluster / CI. Not sure of how to handle the change in production though where killall redis-server might not be wanted.

scfc added a subscriber: scfc.Dec 14 2015, 2:27 PM

Actually, @Joe's 16794d5a56bae609d7a6fe85382cfe5e475063cb removed expect fork with the comment:

redis: the service is not daemonized by default on ubuntu

Oh good finding @scfc , right now the instances do have daemonize yes though:

/etc/redis/tcp_6379.conf:daemonize yes

@ori refactored the redis class last week, apparently we no more use the default Ubuntu redis-server and always use redis::instance which comes with daemonize yes.

Change 258972 merged by Yuvipanda:
redis: upstart should track PID after one fork

https://gerrit.wikimedia.org/r/258972

hashar closed this task as Resolved.Feb 19 2016, 8:23 PM
hashar claimed this task.

Should be fixed via https://gerrit.wikimedia.org/r/258972 Beta/CI got fixed by cherry picking the patch.