Remove Diamond?
Closed, Resolved (Public)

Description

Diamond was removed from production in 2019, and we stopped adding it to cloud images in October 2020 via https://gerrit.wikimedia.org/r/c/operations/puppet/+/632477
Since it was never ported to Python 3, it is also not available on Bullseye instances; we have an OS check in profile::wmcs::instance for this.

There are also no WMF-specific Diamond collectors left in puppet (these were all removed or replaced by Prometheus over the past years). As such, Diamond only collects the base metrics provided by the diamond package itself (CPU, network, load, disk space, memory, etc.).

Is there any reason not to simply remove it for good, by setting diamond::remove: true in Hiera for all of WMCS and eventually removing the Diamond puppet code?
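
A minimal sketch of the Hiera override proposed above. The key name is taken from the description; which Hiera file or hierarchy level would apply it to all of WMCS is not stated here and is left as an assumption.

```yaml
# Hiera data applied to all WMCS instances.
# Assumption: the exact file and hierarchy level are not specified in this task.
diamond::remove: true
```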

Related Objects

Status    Assigned
Resolved  taavi
Resolved  taavi
Open      dcaro
Resolved  taavi
Resolved  taavi
Resolved  taavi
Resolved  JHedden
Resolved  JHedden
Resolved  Bstorm
Resolved  bd808
Resolved  Andrew
Declined  None
Resolved  nskaggs
Resolved  taavi
Resolved  jbond
Resolved  taavi
Resolved  taavi
Resolved  taavi
Resolved  taavi
Resolved  dcaro
Resolved  Andrew

Event Timeline

This could move forward today, I think. The basic instance metrics are now stored in Prometheus, although there isn't a replacement dashboard yet (T264920).

Change 935103 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] Uninstall Diamond everywhere

https://gerrit.wikimedia.org/r/935103

Change 935103 merged by Filippo Giunchedi:

[operations/puppet@production] Uninstall Diamond everywhere

https://gerrit.wikimedia.org/r/935103

Mentioned in SAL (#wikimedia-cloud) [2023-07-17T12:45:45Z] <taavi> removing diamond from remaining buster instances T317032

I'm going to call this done. The Puppet code that ensures Diamond is absent is still there, but with so many instances having broken Puppet config, I think we can just leave it in place until all of the buster instances have been removed.