Change Details

**Old**

We started the Nodepool project with 10 Jessie instances with an upper quota of 20. I would like to bump the quota for a few reasons:

* we have migrated almost all npm jobs
* added more ruby jobs
* in the process of migrating Zend 5.5 and HHVM jobs
* soonish we will migrate browser tests jobs, which are long running and occupy an instance for quite a while

I would like to have the upper limit doubled, with a base pool of 20 instances (10 Jessie, 10 Trusty) and allowing up to 40 instances.

We spawn `m1.medium` which are:

| RAM | 4GB
| VCPU | 2
| Disk | 40GB

| Metric | Base | Max | Future | Future Max
|---|---|---|---|---
| Instances | 10 | 20 | 20 | **40**
| RAM | 40G | 80G | 80G | **160G**
| VCPU | 20 | 40 | 40 | **80**
| Disk | 400G | 800G | 800G | **1.6TB**

https://grafana.wikimedia.org/dashboard/db/labs-capacity-planning shows there is plenty of room.

There might be concern with disk space consumption. Though as I understand it, disk space is copy on write and not filled until the instance fills the disk. Looking at the instances:

**Trusty**
| Filesystem | Size | Used | Avail | Use% | Mounted on
|--|--|--|--|--|--
| /dev/vda1 | 38G | 1.8G | 35G | 5% | /

**Jessie**
| Filesystem | Size | Used | Avail | Use% | Mounted on
|--|--|--|--|--|--
| /dev/vda1 | 38G | 2.1G | 34G | 6% | /

So for 80 instances that would be ~160GBytes disk consumed + whatever the jobs are writing to disk.

The original quotas are described in 31f12ddd1386bcf236508c65d2e269ec7238456d

**New**

Nodepool is currently limited to 12 instances. I would like to get it raised to 20 instances.

That will let us migrate the Zend 5.5 / HHVM jobs that are currently running on Ubuntu Trusty. An example load is F4708299 ([[ https://integration.wikimedia.org/ci/label/UbuntuTrusty/load-statistics | live link ]]), which seems to indicate that 5 instances will cover it. I am adding a couple more to help with the contention we have observed during peak hours (SF morning / Europe evening) and to reach a round number of 20 instances.

We have already deleted 9 m1.large instances from the pool of permanent slaves (T148183) and will be able to delete a couple more once the HHVM/PHP jobs are moved.

We spawn `m1.medium` which have:

| RAM | 4GB
| VCPU | 2
| Disk | 40GB

The Nodepool limit (`max-server`) would be bumped from 12 to 20. On the OpenStack side, the quota of instances has to take into account the automatic refresh of snapshot images, i.e. two more instances.

| Metric | Current | New
|---|---|---
| Nodepool `max-server` | 12 | 20
| OpenStack quota | |
| Instances | 15 | 22
| RAM | 100G | 100G
| VCPU | 40 | 44

There might be concern with disk space consumption. Though as I understand it, disk space is copy on write and not filled until the instance fills the disk. Looking at the instances:

**Trusty**
| Filesystem | Size | Used | Avail | Use% | Mounted on
|--|--|--|--|--|--
| /dev/vda1 | 38G | 2.4G | 34G | 7% | /

**Jessie**
| Filesystem | Size | Used | Avail | Use% | Mounted on
|--|--|--|--|--|--
| /dev/vda1 | 38G | 3.6G | 33G | 11% | /

So for 80 instances that would be ~160GBytes disk consumed + whatever the jobs are writing to disk.

The original quotas are described in 31f12ddd1386bcf236508c65d2e269ec7238456d
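
The quota figures can be cross-checked with a little arithmetic. The following is only a rough sketch in Python, not part of the request itself: the `Flavor` class and the `footprint` and `disk_used_gb` helpers are hypothetical names introduced for illustration, and the 2.0 GB per-instance base usage is an assumption taken from the ballpark of the `df` output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flavor:
    """Resources of one instance flavor (hypothetical helper type)."""
    ram_gb: int
    vcpus: int
    disk_gb: int


# m1.medium as described in this task: 4GB RAM, 2 VCPU, 40GB disk.
M1_MEDIUM = Flavor(ram_gb=4, vcpus=2, disk_gb=40)


def footprint(flavor: Flavor, instances: int) -> dict:
    """Total resources allocated when `instances` copies of `flavor` run."""
    return {
        "ram_gb": flavor.ram_gb * instances,
        "vcpus": flavor.vcpus * instances,
        "disk_gb": flavor.disk_gb * instances,
    }


def disk_used_gb(instances: int, used_per_instance_gb: float) -> float:
    """Disk actually consumed, assuming copy-on-write images (an assumption)."""
    return instances * used_per_instance_gb


# Nodepool limit plus the two instances needed for snapshot image refresh.
NODEPOOL_MAX_SERVER = 20
SNAPSHOT_REFRESH_INSTANCES = 2
OPENSTACK_INSTANCES = NODEPOOL_MAX_SERVER + SNAPSHOT_REFRESH_INSTANCES

requested = footprint(M1_MEDIUM, OPENSTACK_INSTANCES)
print(requested)

# Base disk usage for 80 instances at an assumed ~2 GB each.
print(disk_used_gb(80, 2.0))
```

Under these assumptions, 22 `m1.medium` instances need 88G of RAM, which fits in the 100G RAM quota, and 44 VCPU, which matches the requested VCPU quota. The 880G disk figure is allocated space only; with copy on write the consumed space is much lower, in line with the ~160GBytes estimate for 80 instances.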