Increase celery workers to 40 per scb node
Closed, Resolved (Public)

Assigned To

Authored By

	Halfak
	Aug 16 2016, 2:08 PM

Description

Before we deployed the recent ORES refactor, we were using 950MB per uwsgi worker and 1130MB per celery worker.

Currently, we have 24 + 1 celery processes (as of 8/09):

25 * 1130MB = 27.6GB

Currently, we have 48 + 1 uwsgi processes (as of 8/09):

49 * 950MB = 45.5GB

As of the refactor, the RES of celery has stayed the same, but the RES of uwsgi has fallen to 550 MB per process:

49 * 550MB = 26.3GB

In summary, we used to use 27.6 + 45.5 = 73.1GB of RES memory. Now we use 27.6 + 26.3 = 53.9GB, so we've freed 19.2GB. At 1130MB per celery worker, that is room for about 17 more workers, so we should safely be able to add 16 new celery workers.

(Note that all memory estimates were gathered using the RES column in ps)
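The arithmetic above can be re-derived with a short Python sketch. It assumes 1GB = 1024MB (which matches the rounded figures in the description) and uses illustrative names that do not come from the ORES code base. Like the estimates above, it treats RES as a worst case and ignores any memory shared between processes.

```python
MB_PER_GB = 1024  # assumption: the task's GB figures match 1 GB = 1024 MB


def total_gb(processes: int, mb_per_process: float) -> float:
    """Worst-case memory for a pool of processes, in GB, from per-process RES."""
    return processes * mb_per_process / MB_PER_GB


CELERY_MB = 1130        # RES per celery worker (unchanged by the refactor)
UWSGI_MB_BEFORE = 950   # RES per uwsgi worker before the refactor
UWSGI_MB_AFTER = 550    # RES per uwsgi worker after the refactor

celery_gb = total_gb(24 + 1, CELERY_MB)              # about 27.6
uwsgi_before_gb = total_gb(48 + 1, UWSGI_MB_BEFORE)  # about 45.5
uwsgi_after_gb = total_gb(48 + 1, UWSGI_MB_AFTER)    # about 26.3

# Memory freed by the uwsgi reduction, and how many celery workers fit in it.
freed_mb = (48 + 1) * (UWSGI_MB_BEFORE - UWSGI_MB_AFTER)
max_new_workers = freed_mb // CELERY_MB

print(f"freed: {freed_mb / MB_PER_GB:.1f} GB")
print(f"upper bound on new celery workers: {max_new_workers}")
```

The upper bound comes out at 17 workers, so adding 16 leaves a small margin.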

Details

	Subject	Repo	Branch	Lines +/-
	ores: increase ores workers to 40 per node	operations/puppet	production	+1 -1


Related Objects

Mentioned In: rOPUP0ddb9e2d62cf: ores: increase ores workers to 40 per node
rOPUP16a0144ae062: ores: increase ores workers to 40 per node

Event Timeline

Halfak created this task. Aug 16 2016, 2:08 PM

Restricted Application added a subscriber: Aklapper. Aug 16 2016, 2:08 PM

Actually, we have 32 celery workers, not 24. So we are using some 36GB of RES memory for celery, and 64GB in total. However, the great thing seems to be that the initialization of those workers (the loading of the models) happens before they fork, so thanks to COW (copy-on-write) the workers don't actually consume that much memory. So, I am fine with adding another 8 workers.
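One way to check the copy-on-write point is to compare RSS with PSS (proportional set size), which divides each shared page among the processes that map it. The sketch below is an illustration rather than anything used in this task: it sums the `Rss:` and `Pss:` lines of a Linux `/proc/<pid>/smaps` file, and the helper names `sum_kb` and `read_smaps` are hypothetical.

```python
def sum_kb(smaps_text: str, field: str) -> int:
    """Sum every '<field>: <n> kB' line found in the text of /proc/<pid>/smaps."""
    total = 0
    for line in smaps_text.splitlines():
        parts = line.split()
        # Exact match on the field name, so 'Pss' does not also count 'Pss_Anon'.
        if len(parts) >= 3 and parts[0] == field + ":" and parts[2] == "kB":
            total += int(parts[1])
    return total


def read_smaps(pid: int) -> str:
    """Read the smaps file for a process (Linux only; needs permission to read it)."""
    with open(f"/proc/{pid}/smaps") as f:
        return f.read()


def rss_and_pss_kb(pid: int) -> tuple[int, int]:
    """Return (RSS, PSS) in kB for one process."""
    text = read_smaps(pid)
    return sum_kb(text, "Rss"), sum_kb(text, "Pss")
```

If the models really are loaded before the fork, the PSS summed over all celery workers should come out well below the number of workers times 1130MB.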

Change 305201 had a related patch set uploaded (by Ladsgroup):
ores: increase ores workers to 40 per node

https://gerrit.wikimedia.org/r/305201

gerritbot added a project: Patch-For-Review. Aug 17 2016, 9:11 AM

Ladsgroup claimed this task. Aug 17 2016, 9:11 AM

Restricted Application added a project: User-Ladsgroup. Aug 17 2016, 9:11 AM

Ladsgroup moved this task from Parked to Review on the Machine-Learning-Team (Active Tasks) board. Aug 17 2016, 9:13 AM

Ladsgroup moved this task from Incoming to Blocked on others on the User-Ladsgroup board.

Ladsgroup mentioned this in rOPUP16a0144ae062: ores: increase ores workers to 40 per node. Aug 17 2016, 9:18 AM

Change 305201 merged by Alexandros Kosiaris:
ores: increase ores workers to 40 per node

https://gerrit.wikimedia.org/r/305201

akosiaris mentioned this in rOPUP0ddb9e2d62cf: ores: increase ores workers to 40 per node. Aug 17 2016, 9:34 AM

Mentioned in SAL [2016-08-17T10:07:14Z] <Amir1> ladsgroup@scb[12]00[12]:~$ sudo service celery-ores-worker restart (T143105)

Ladsgroup moved this task from Blocked on others to Done on the User-Ladsgroup board. Aug 17 2016, 10:12 AM

Ladsgroup moved this task from Review to Completed on the Machine-Learning-Team (Active Tasks) board. Aug 18 2016, 12:12 AM

Ladsgroup closed this task as Resolved. Aug 22 2016, 6:22 PM
