Page MenuHomePhabricator

hhvm-staging.hhvm.eqiad.wmflabs has high user/system CPU
Closed, ResolvedPublic

Description

hhvm-staging.hhvm.eqiad.wmflabs has high system and user CPU. Most probably there is a wild process on it ?

https://grafana-labs.wikimedia.org/dashboard/db/labs-project-board?panelId=41&fullscreen&orgId=1&var-project=hhvm&var-server=All&from=now-1y&to=now

hhvm-staging_cpu.png (531×998 px, 45 KB)

Would be great to clean up and reclaim some CPU capacity :)

Event Timeline

I never used that VM and we don't have any HHVM hosts in production apart from silver, so from my PoV this can be dropped.

The root partition here was full. That looks to have been caused by the hhvm-alt upstart service being in an infinite restart loop. That in turn seems to be caused by permissions with the /run directory. I cleaned up a bunch of logs and then shutdown the hhvm-alt process with service hhvm-alt stop. The load avg has dropped to 0.02.

I never used that VM and we don't have any HHVM hosts in production apart from silver, so from my PoV this can be dropped.

I think that @MoritzMuehlenhoff probably meant that we don't have (many) Trusty HHVM hosts (?).

hhvm-staging.hhvm.eqiad.wmflabs VM was built by @Joe on 2016-02-29. It looks like he and Ori are the only folks who have ever logged in there as non-root users. The latest artifacts there look to be from around 2017-01-20.

I think that @MoritzMuehlenhoff probably meant that we don't have (many) Trusty HHVM hosts (?).

That's what I meant; all HHVM hosts in production are running jessie for a while now, the only exception still on trusty is silver at this point (with a forthcoming replacement as labweb*).

Mentioned in SAL (#wikimedia-cloud) [2017-11-16T19:03:06Z] <bd808> Joined project as admin to cleanup vm (T179380)

Mentioned in SAL (#wikimedia-cloud) [2017-11-16T19:06:33Z] <bd808> Deleted hhvm-staging.hhvm.eqiad.wmflabs (T179380)

bd808 claimed this task.