
Increase the CPU count for proton[12]00[12]
Closed, Resolved · Public

Description

The machines currently have 2 vCPUs each. The current PDF request rate is around 2 per second. If we assume an average rendering time of 15s (very generous) and a slightly elevated rate of 2.5 requests per second, we would need 19 concurrent renderings to be happening. Under the assumption that one worker can fire two chromium instances in parallel, the worker count would need to be 10.
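The estimate above can be reproduced with a short sketch. Multiplying the request rate by the rendering time (Little's law) gives 37.5 renderings in flight overall. The description does not spell out how it arrives at 19, so the `machines=2` split below is an assumption: it treats the load as shared by the two machines in one DC, which matches the "10 on each machine" option. The function and parameter names are hypothetical.

```python
import math


def required_workers(requests_per_second, avg_render_seconds,
                     renders_per_worker, machines=1):
    """Estimate concurrency and worker count for a rendering service.

    Returns (total renderings in flight, renderings per machine,
    workers per machine). The per-machine split is an assumption,
    not something stated in the task description.
    """
    # Little's law: items in flight = arrival rate * time in system.
    in_flight = requests_per_second * avg_render_seconds
    per_machine = in_flight / machines
    workers = math.ceil(per_machine / renders_per_worker)
    return in_flight, math.ceil(per_machine), workers


# Figures from the description: 2.5 req/s, 15 s per render,
# two chromium instances per worker. machines=2 is assumed.
total, per_machine, workers = required_workers(2.5, 15, 2, machines=2)
print(total, per_machine, workers)
```

Under that reading the sketch yields 19 concurrent renderings and 10 workers per machine. With `machines=1` the same inputs would call for roughly twice as many workers on a single host.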

Possible options:

  • Increase the vCPU count to 10 on each machine
  • Add (2?) more VMs per DC
  • Have more workers than vCPUs

Probably the best option would be a combination of all of the above. Concretely, I suggest we add 4 more VMs (2 per DC), increase the vCPU count to 4, and have 6 workers running on each VM, bringing the total worker count to 24 per DC.
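As a rough check of these totals, the sketch below compares raising the vCPU count alone with the combined proposal. It assumes two existing VMs per DC (as the hostname pattern proton[12]00[12] suggests) and one worker per vCPU when only the vCPU count is raised. The `Plan` class and its field names are illustrative, not part of any Proton configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """One capacity option for a single DC (illustrative names only)."""
    name: str
    vms_per_dc: int
    vcpus_per_vm: int
    workers_per_vm: int

    @property
    def workers_per_dc(self) -> int:
        # Total workers available in one DC.
        return self.vms_per_dc * self.workers_per_vm

    @property
    def workers_per_vcpu(self) -> float:
        # Values above 1.0 mean more workers than vCPUs on a VM.
        return self.workers_per_vm / self.vcpus_per_vm


# Assumption: 2 existing VMs per DC, and one worker per vCPU
# when only the vCPU count is increased.
vcpus_only = Plan("10 vCPUs, no new VMs",
                  vms_per_dc=2, vcpus_per_vm=10, workers_per_vm=10)
combined = Plan("2 more VMs per DC, 4 vCPUs, 6 workers",
                vms_per_dc=4, vcpus_per_vm=4, workers_per_vm=6)

for plan in (vcpus_only, combined):
    print(f"{plan.name}: {plan.workers_per_dc} workers per DC, "
          f"{plan.workers_per_vcpu:.1f} workers per vCPU")
```

The combined plan reaches 24 workers per DC with 16 vCPUs by running 1.5 workers per vCPU, whereas raising the vCPU count alone gives 20 workers per DC on 20 vCPUs.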

Event Timeline

mobrovac triaged this task as High priority. Jun 21 2018, 12:36 PM
mobrovac created this task.
Restricted Application added a subscriber: Aklapper. Jun 21 2018, 12:36 PM
mobrovac updated the task description. Jun 21 2018, 1:14 PM

Mentioned in SAL (#wikimedia-operations) [2018-06-25T10:32:44Z] <akosiaris> increase CPU count for proton machines from 2 to 10. T197862

Agreed. While the proposed solution is probably the best overall, I went ahead with option A (increase the vCPU count to 10) alone for now, so that we can move forward without blocking on adding more machines to the cluster. This gives us a total worker count of 20 per DC, which is pretty close to the proposed one, and I am hopeful this will resolve the issues experienced. I'll keep the task open, however, in order to implement the proposed approach later on.

akosiaris lowered the priority of this task from High to Low. Jun 25 2018, 10:36 AM

Lowering priority to reflect that we have increased the CPU count considerably, but the task is not yet resolved.

Vvjjkkii renamed this task from Increase the CPU count for proton[12]00[12] to 5iaaaaaaaa. Jul 1 2018, 1:02 AM
Vvjjkkii raised the priority of this task from Low to High.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
mobrovac renamed this task from 5iaaaaaaaa to Increase the CPU count for proton[12]00[12]. Jul 1 2018, 10:47 AM
mobrovac lowered the priority of this task from High to Low.
mobrovac updated the task description.
ovasileva moved this task from Triage to Backlog on the Proton board. Feb 22 2019, 3:21 PM
Tgr added a subscriber: Tgr. Apr 10 2019, 7:39 PM

This is currently a subtask of T210651: Switch all PDF render traffic to new Proton service - does that mean it is seen as a blocker? If not, what's the expected impact / timeline?

mobrovac closed this task as Resolved. Apr 10 2019, 7:41 PM
mobrovac claimed this task.

Given that overall proton has been (relatively) stable in the current config and that a plan to move it to k8s exists, I'll go ahead and resolve this.