
Beta cluster has reached its quota
Open, Needs Triage · Public

Description

I can't create new instances anymore. The project has currently reached its quota for the number of VCPUs.

I'd recommend that people go through the instances they have created and delete the ones they no longer need (I found one for chromium, two for sentry, one or two for fluorine, two mx nodes, etc.).

We can also increase the quota.

List of instances

  • deployment-acme-chief03
  • deployment-acme-chief04
  • deployment-aqs01
  • deployment-aqs02
  • deployment-aqs03
  • deployment-cache-text06
  • deployment-cache-upload06
  • deployment-changeprop
    • Deleted
  • deployment-chromium01
    • Needed
  • deployment-chromium02
    • Deleted
  • deployment-cpjobqueue
    • Deleted
  • deployment-cumin02
  • deployment-cumin
  • deployment-db05
    • Needed
  • deployment-db06
    • Needed
  • deployment-deploy01
    • Needed
  • deployment-deploy02
  • deployment-docker-changeprop01
    • Needed
  • deployment-docker-citoid01
    • Needed
  • deployment-docker-cpjobqueue01
    • Needed
  • deployment-docker-cxserver01
    • Needed
  • deployment-docker-mathoid01
    • Needed
  • deployment-echostore01
    • Needed
  • deployment-elastic05
  • deployment-elastic06
  • deployment-elastic07
  • deployment-etcd-01
  • deployment-eventgate-3
    • Needed
  • deployment-eventlog05
  • deployment-eventstreams-1
    • Needed
  • deployment-fluorine02
    • Needed
  • deployment-imagescaler01
  • deployment-imagescaler02
  • deployment-imagescaler03
  • deployment-ircd
  • deployment-jobrunner03
    • Needed
  • deployment-kafka-jumbo-1
  • deployment-kafka-jumbo-2
  • deployment-kafka-main-1
  • deployment-kafka-main-2
  • deployment-logstash03
  • deployment-logstash2
  • deployment-mailman01
  • deployment-maps05
    • Deleted
  • deployment-mcs01
    • Needed
  • deployment-mdb01
    • Needed
  • deployment-mediawiki-07
    • Needed
  • deployment-mediawiki-09
    • Needed
  • deployment-memc04
  • deployment-memc05
  • deployment-memc06
  • deployment-memc07
  • deployment-memc08
  • deployment-ms-be05
  • deployment-ms-be06
  • deployment-ms-fe03
  • deployment-mwmaint01
  • deployment-mx02
  • deployment-ores01
  • deployment-parsoid11
  • deployment-poolcounter06
  • deployment-prometheus02
  • deployment-puppetdb03
    • Needed
  • deployment-puppetmaster04
    • Needed
  • deployment-push-notifications01
    • Needed
  • deployment-restbase01
  • deployment-restbase02
  • deployment-restbase03
  • deployment-sca01
    • Needed
  • deployment-sca02
    • Needed
  • deployment-sca04
  • deployment-schema-2
    • Needed
  • deployment-sentry01
    • Deleted
  • deployment-sessionstore03
  • deployment-snapshot01
  • deployment-urldownloader02
    • Needed
  • deployment-wdqs01
  • deployment-webperf11
    • Needed
  • deployment-webperf12
    • Needed
  • deployment-wikifeeds01
    • Needed
  • deployment-xhgui01
    • Needed at least another 1-2 weeks, will delete after that.
  • deployment-xhgui02
    • Needed
  • deployment-zookeeper02
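
As a rough tally of what the cleanup has freed so far, the sketch below sums the VCPUs of the instances marked "Deleted" in the list above. The VCPU counts are copied from the instance table posted in the first comment; the variable names are illustrative only, and the result says nothing about the actual quota value, which is not stated in this task.

```python
# VCPU counts per instance are copied from the instance table in the
# first comment; the "Deleted" status comes from the task description.
deleted_vcpus = {
    "deployment-changeprop": 1,
    "deployment-chromium02": 1,
    "deployment-cpjobqueue": 2,
    "deployment-maps05": 4,
    "deployment-sentry01": 2,
}

# Total VCPUs returned to the project by the deletions so far.
freed = sum(deleted_vcpus.values())
print(f"{len(deleted_vcpus)} deleted instances free {freed} VCPUs")
```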

Event Timeline

Ladsgroup created this task. Jul 4 2020, 3:17 PM
Restricted Application added a subscriber: Aklapper. Jul 4 2020, 3:17 PM
Majavah added a subscriber: Majavah. Jul 4 2020, 3:17 PM

List of instances:

| Instance Name | VCPUs | RAM (MB) | Disk (GB) | Usage (Hours) | Age (Seconds) | State |
| --- | --- | --- | --- | --- | --- | --- |
| deployment-acme-chief03 | 1 | 2048 | 20 | 39.46 | 40926995 | Active |
| deployment-acme-chief04 | 1 | 2048 | 20 | 39.46 | 40924947 | Active |
| deployment-aqs01 | 2 | 4096 | 40 | 39.46 | 43385428 | Active |
| deployment-aqs02 | 2 | 4096 | 40 | 39.46 | 43372228 | Active |
| deployment-aqs03 | 2 | 4096 | 40 | 39.46 | 43372210 | Active |
| deployment-cache-text06 | 2 | 4096 | 40 | 39.46 | 7260971 | Active |
| deployment-cache-upload06 | 2 | 4096 | 40 | 39.46 | 7261651 | Active |
| deployment-changeprop | 1 | 2048 | 20 | 39.46 | 51182405 | Stopped |
| deployment-chromium01 | 1 | 2048 | 20 | 39.46 | 51215972 | Active |
| deployment-chromium02 | 1 | 2048 | 20 | 39.46 | 51224204 | Active |
| deployment-cpjobqueue | 2 | 4096 | 40 | 39.46 | 51210191 | Stopped |
| deployment-cumin02 | 1 | 2048 | 20 | 39.46 | 38848120 | Active |
| deployment-cumin | 1 | 2048 | 20 | 39.46 | 1055004 | Active |
| deployment-db05 | 8 | 16384 | 160 | 39.46 | 43792600 | Active |
| deployment-db06 | 8 | 16384 | 160 | 39.46 | 40429759 | Active |
| deployment-deploy01 | 8 | 8192 | 60 | 39.46 | 51223631 | Active |
| deployment-deploy02 | 8 | 8192 | 60 | 39.46 | 51223229 | Active |
| deployment-docker-changeprop01 | 1 | 2048 | 20 | 39.46 | 4413318 | Active |
| deployment-docker-citoid01 | 1 | 2048 | 20 | 39.46 | 36043832 | Active |
| deployment-docker-cpjobqueue01 | 1 | 2048 | 20 | 39.46 | 2071846 | Active |
| deployment-docker-cxserver01 | 1 | 2048 | 20 | 39.46 | 35988148 | Active |
| deployment-docker-mathoid01 | 1 | 2048 | 20 | 39.46 | 36472431 | Active |
| deployment-echostore01 | 1 | 2048 | 20 | 39.46 | 17170994 | Active |
| deployment-elastic05 | 4 | 8192 | 80 | 39.46 | 51154703 | Active |
| deployment-elastic06 | 4 | 8192 | 80 | 39.46 | 51155828 | Active |
| deployment-elastic07 | 4 | 8192 | 80 | 39.46 | 51152564 | Active |
| deployment-etcd-01 | 1 | 2048 | 20 | 39.46 | 51201269 | Active |
| deployment-eventgate-3 | 1 | 2048 | 20 | 39.46 | 21411224 | Active |
| deployment-eventlog05 | 4 | 8192 | 80 | 39.46 | 51208642 | Active |
| deployment-eventstreams-1 | 1 | 2048 | 20 | 39.46 | 15548061 | Active |
| deployment-fluorine02 | 1 | 2048 | 80 | 39.46 | 51193096 | Active |
| deployment-imagescaler01 | 2 | 4096 | 40 | 39.46 | 51190311 | Active |
| deployment-imagescaler02 | 2 | 4096 | 40 | 39.46 | 51202702 | Active |
| deployment-imagescaler03 | 2 | 4096 | 40 | 39.46 | 49250051 | Active |
| deployment-ircd | 1 | 2048 | 20 | 39.46 | 51194397 | Active |
| deployment-jobrunner03 | 4 | 8192 | 80 | 39.46 | 51218925 | Active |
| deployment-kafka-jumbo-1 | 1 | 2048 | 80 | 39.46 | 51203682 | Active |
| deployment-kafka-jumbo-2 | 1 | 2048 | 80 | 39.46 | 51201263 | Active |
| deployment-kafka-main-1 | 1 | 2048 | 20 | 39.46 | 51213003 | Active |
| deployment-kafka-main-2 | 1 | 2048 | 20 | 39.46 | 51218210 | Active |
| deployment-logstash03 | 8 | 16384 | 160 | 39.46 | 37060051 | Active |
| deployment-logstash2 | 8 | 16384 | 160 | 39.46 | 51182252 | Active |
| deployment-mailman01 | 1 | 2048 | 20 | 39.46 | 2419715 | Active |
| deployment-maps05 | 4 | 8192 | 80 | 39.46 | 39724237 | Active |
| deployment-mcs01 | 1 | 2048 | 20 | 39.46 | 51195371 | Active |
| deployment-mdb01 | 1 | 2048 | 80 | 39.46 | 2122503 | Active |
| deployment-mediawiki-07 | 4 | 8192 | 80 | 39.46 | 51211839 | Active |
| deployment-mediawiki-09 | 4 | 8192 | 80 | 39.46 | 51216632 | Active |
| deployment-memc04 | 2 | 4096 | 40 | 39.46 | 51191933 | Active |
| deployment-memc05 | 2 | 4096 | 40 | 39.46 | 51183858 | Active |
| deployment-memc06 | 2 | 4096 | 40 | 39.46 | 51184243 | Active |
| deployment-memc07 | 2 | 4096 | 40 | 39.46 | 51204969 | Active |
| deployment-memc08 | 2 | 4096 | 40 | 39.46 | 22661725 | Active |
| deployment-ms-be05 | 8 | 16384 | 160 | 39.46 | 38761194 | Active |
| deployment-ms-be06 | 8 | 16384 | 160 | 39.46 | 38761194 | Active |
| deployment-ms-fe03 | 1 | 2048 | 20 | 39.46 | 38232895 | Active |
| deployment-mwmaint01 | 1 | 2048 | 80 | 39.46 | 51223997 | Active |
| deployment-mx02 | 1 | 2048 | 20 | 39.46 | 51211782 | Active |
| deployment-ores01 | 4 | 8192 | 80 | 39.46 | 51219547 | Active |
| deployment-parsoid11 | 2 | 4096 | 40 | 39.46 | 10590936 | Active |
| deployment-poolcounter06 | 1 | 2048 | 20 | 39.46 | 18754786 | Active |
| deployment-prometheus02 | 4 | 8192 | 80 | 39.46 | 46161569 | Active |
| deployment-puppetdb03 | 1 | 2048 | 20 | 39.46 | 14229837 | Active |
| deployment-puppetmaster04 | 2 | 4096 | 40 | 39.46 | 12932186 | Active |
| deployment-push-notifications01 | 1 | 2048 | 20 | 39.46 | 2832947 | Active |
| deployment-restbase01 | 4 | 8192 | 80 | 39.46 | 51180967 | Active |
| deployment-restbase02 | 4 | 8192 | 80 | 39.46 | 51189611 | Active |
| deployment-restbase03 | 4 | 8192 | 80 | 39.46 | 7229633 | Active |
| deployment-sca01 | 1 | 2048 | 20 | 39.46 | 51184163 | Active |
| deployment-sca02 | 1 | 2048 | 20 | 39.46 | 51165143 | Active |
| deployment-sca04 | 2 | 4096 | 40 | 39.46 | 51198853 | Active |
| deployment-schema-2 | 1 | 2048 | 20 | 39.46 | 34200080 | Active |
| deployment-sentry01 | 2 | 4096 | 40 | 39.46 | 51183953 | Active |
| deployment-sessionstore03 | 1 | 2048 | 20 | 39.46 | 17171015 | Active |
| deployment-snapshot01 | 2 | 4096 | 40 | 39.46 | 51207377 | Active |
| deployment-urldownloader02 | 1 | 2048 | 20 | 39.46 | 51224813 | Active |
| deployment-wdqs01 | 4 | 8192 | 80 | 39.46 | 21527681 | Active |
| deployment-webperf11 | 1 | 2048 | 20 | 39.46 | 51219955 | Active |
| deployment-webperf12 | 1 | 2048 | 20 | 39.46 | 51221874 | Active |
| deployment-wikifeeds01 | 1 | 2048 | 20 | 39.46 | 33600161 | Active |
| deployment-xhgui01 | 1 | 2048 | 20 | 39.46 | 19310473 | Active |
| deployment-xhgui02 | 1 | 2048 | 20 | 39.46 | 2042306 | Active |
| deployment-zookeeper02 | 1 | 2048 | 20 | 39.46 | 17299874 | Active |

We have two appservers but five memcached nodes, that seems off.

Deleted deployment-sentry01 according to T106915#6279270

Joe added subscribers: hnowlan, Joe. Jul 5 2020, 7:06 AM

> We have two appservers but five memcached nodes, that seems off.

Not really, given the amount of traffic beta receives and the fact that we don't have load balancing, which makes it hard to distribute requests any further.

Most of the VMs you listed above are still in use, even if no one has logged into them or touched them for some time.

I would ask why we have mailman VMs in deployment-prep, OTOH. It seems quite off-topic there.

The two stopped changeprop VMs can be removed, I think, but I'd ask @hnowlan to confirm that's the case.

> I would ask why we have mailman VMs in deployment-prep, OTOH. It seems quite off-topic there.

It's the Mailman v3 testing instance as per T52864.

> Most of the VMs you listed above are still in use, even if no one has logged into them or touched them for some time.

Indeed, but this project is huge and this is the list of all its VMs. If we just audit and clean up 10-20%, that frees up 20-40 VCPUs. For comparison, the "meet" project has a quota of 8 VCPUs.
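
The 20-40 figure can be checked against the instance table in the first comment. The sketch below is only a back-of-the-envelope check: the per-size instance counts were tallied by hand from that table, the names are hypothetical, and it assumes the reclaimed share is measured in VCPUs (or, equivalently, that the removed VMs are of average size).

```python
# Instance counts per VCPU size, tallied from the table in the first comment.
# Keys are VCPUs per instance, values are how many instances have that size.
instances_by_vcpus = {1: 42, 2: 19, 4: 14, 8: 8}

total_instances = sum(instances_by_vcpus.values())
total_vcpus = sum(size * count for size, count in instances_by_vcpus.items())


def vcpus_freed(fraction):
    """VCPUs freed if `fraction` of the project's VCPUs is reclaimed."""
    return round(total_vcpus * fraction)


print(f"{total_instances} instances use {total_vcpus} VCPUs")
print(f"10% frees {vcpus_freed(0.10)} VCPUs, 20% frees {vcpus_freed(0.20)} VCPUs")
```

With 83 instances using 200 VCPUs in total, reclaiming 10-20% corresponds to 20-40 VCPUs, which matches the estimate above.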

> I would ask why we have mailman VMs in deployment-prep, OTOH. It seems quite off-topic there.

Yup, that's what I'm working on (https://lists-beta.wmflabs.org). I can also request a dedicated project if you think that's better.

dpifke added a subscriber: dpifke. Jul 6 2020, 11:32 PM
Ladsgroup updated the task description. Jul 6 2020, 11:44 PM
dpifke updated the task description. Jul 6 2020, 11:47 PM
hnowlan updated the task description. Jul 7 2020, 9:18 AM
Mholloway updated the task description. Jul 7 2020, 1:59 PM
Mholloway updated the task description. Jul 7 2020, 2:19 PM
Mholloway updated the task description. Jul 7 2020, 3:27 PM