Decommission servermon
Closed, ResolvedPublic

Description

All the use cases formerly provided by servermon/packages are now superseded by debmonitor, and everything else should be covered by other components (e.g. puppetboard to detect hosts where puppet hasn't run recently).

So I think we can remove servermon (and eventually the netmon1003 instance). This should be double-checked by raising it in the SRE meeting or in a mail to the ops list.

Event Timeline

Restricted Application added a subscriber: Aklapper.Jul 6 2018, 8:09 AM
MoritzMuehlenhoff triaged this task as Normal priority.Jul 9 2018, 9:49 AM

FWIW, servermon allows searching for specific facts (for specific hosts as well) under the fact query menu, which makes it relatively easy to slice the information present in puppetdb (actually in the mysql activerecord db, but this is unimportant). That is not something I see as possible in puppetboard. There is a query tab but I get

What you were looking for has been disabled by the administrator.

It's a pretty useful functionality to me.

Volans added a comment.Jul 9 2018, 5:14 PM

There is a query tab but I get

What you were looking for has been disabled by the administrator.

We decided to disable access to the query tab because it allows querying everything, including catalogs, and hence also exposes secrets :(
See https://github.com/voxpupuli/puppetboard/blob/master/screenshots/query.png

The inventory page can be customized from the configuration with different facts, so it is basically a static query:
https://puppetboard.wikimedia.org/inventory

What we could do is try to write a patch to send upstream that allows a more granular configuration for queries, making it possible to specify which kinds of objects can be queried.

I'm open for suggestions

I'm using servermon for fact query regularly, but I think I'm one of the very few :) I admit I haven't played around much with puppetboard to adjust my use cases, so that may be something that could potentially work (with the caveats that Riccardo mentioned above, however).

That said, I'm actually using it for the Netbox migration, and I'd like to avoid changing that until we migrate to it, so let's at least postpone for a month or two please!

There is a query tab but I get

What you were looking for has been disabled by the administrator.

We decided to disable access to the query tab because it allows querying everything, including catalogs, and hence also exposes secrets :(
See https://github.com/voxpupuli/puppetboard/blob/master/screenshots/query.png
The inventory page can be customized from the configuration with different facts, so it is basically a static query:
https://puppetboard.wikimedia.org/inventory
What we could do is try to write a patch to send upstream that allows a more granular configuration for queries, making it possible to specify which kinds of objects can be queried.

Sure, if we can disable catalogs/reports in puppetboard (I guess this is the problem?), that'd be great.

I'm open for suggestions

Ack @akosiaris, I've opened https://github.com/voxpupuli/puppetboard/issues/475 for now. I'll see if I can find the time to send a patch; it doesn't seem overly complicated to just filter the queryable endpoints.

Volans added a comment.EditedJul 18 2018, 10:26 AM

I've submitted https://github.com/voxpupuli/puppetboard/pull/477 upstream, and went ahead and applied it to our puppetboard installation, so that we can enable some query endpoints. As a start I propose to enable:

['facts', 'factsets', 'fact-contents', 'fact-paths', 'nodes']
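
As a rough illustration of the allowlist idea (a sketch only, not the code from the upstream pull request): queries are accepted only for endpoints that appear in a configured list, so catalogs and reports stay out of reach. The names KNOWN_ENDPOINTS, ENABLED_ENDPOINTS, allowed_endpoints and check_endpoint are hypothetical, and the list of known endpoints is just an illustrative subset.

```python
from typing import Iterable, List

# Illustrative subset of PuppetDB query endpoints (not exhaustive).
KNOWN_ENDPOINTS = [
    "catalogs", "reports", "resources",
    "facts", "factsets", "fact-contents", "fact-paths", "nodes",
]

# The allowlist proposed in this comment.
ENABLED_ENDPOINTS = ["facts", "factsets", "fact-contents", "fact-paths", "nodes"]


def allowed_endpoints(known: Iterable[str], enabled: Iterable[str]) -> List[str]:
    """Return the known endpoints that are enabled, keeping the original order."""
    enabled_set = set(enabled)
    return [endpoint for endpoint in known if endpoint in enabled_set]


def check_endpoint(endpoint: str, enabled: Iterable[str] = ENABLED_ENDPOINTS) -> str:
    """Return the endpoint if it may be queried, otherwise raise PermissionError."""
    if endpoint not in set(enabled):
        raise PermissionError(f"querying {endpoint!r} is disabled")
    return endpoint


if __name__ == "__main__":
    # Endpoints that would be offered in the query tab under this allowlist.
    print(allowed_endpoints(KNOWN_ENDPOINTS, ENABLED_ENDPOINTS))
```

With this shape, exposing another endpoint later is a one-line configuration change, while anything not listed stays disabled by default.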

Change 446564 had a related patch set uploaded (by Volans; owner: Volans):
[operations/puppet@production] puppetboard: enable some query endpoints

https://gerrit.wikimedia.org/r/446564

Change 446564 merged by Volans:
[operations/puppet@production] puppetboard: enable some query endpoints

https://gerrit.wikimedia.org/r/446564

The query tab is now enabled, limited to the above endpoints. This should cover most cases, although Puppetboard's query support is not great to be honest. @faidon might have something in store...

jcrespo added a comment.EditedNov 28 2018, 3:43 PM

Please, please, if this happens in any way, remember not to close this as resolved without doing or requesting the substantial cleanup related to the database.

Is anyone still using Servermon at this point?

jbond added a subscriber: jbond.Jan 25 2019, 1:30 PM
jijiki added a subscriber: jijiki.Feb 27 2019, 7:48 PM

Is anyone still using Servermon at this point?

I can say I haven't in a pretty long time. If @faidon doesn't either, I think we can shut it down.

Volans added a comment.Apr 4 2019, 4:53 PM

And when we do, can we also drop the package_updates custom fact?

Sure.

Change 502171 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] turn netmon1003 into a spare, delete servermon role

https://gerrit.wikimedia.org/r/502171

Change 502172 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] mariadb: revoke servermon grants

https://gerrit.wikimedia.org/r/502172

Change 502173 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] deployment_server: remove servermon

https://gerrit.wikimedia.org/r/502173

Change 502174 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] delete servermon module

https://gerrit.wikimedia.org/r/502174

Change 502175 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] puppetmaster: remove servermon report

https://gerrit.wikimedia.org/r/502175

Change 502176 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] mariadb::ferm_misc: remove firewall rule for servermon

https://gerrit.wikimedia.org/r/502176

Dzahn added a subscriber: Dzahn.Apr 11 2019, 5:13 PM

I'm using servermon for fact query regularly, but I think I'm one of the very few :) I admit I haven't played around much with puppetboard to adjust my use cases, so that may be something that could potentially work (with the caveats that Riccardo mentioned above, however).
That said, I'm actually using it for the Netbox migration, and I'd like to avoid changing that until we migrate to it, so let's at least postpone for a month or two please!

Hi @faidon, what do you think about this today? Would you be OK with decom'ing servermon now?

Dzahn assigned this task to faidon.Apr 17 2019, 9:48 PM
Dzahn changed the status of subtask T220355: decom netmon1003 from Open to Stalled.Jul 9 2019, 8:01 PM
Dzahn mentioned this in T220355: decom netmon1003.

gentle ping

There were some discussions during the SRE offsite regarding this. @faidon and @Volans have the details, but the gist is that servermon still provides one piece of functionality that puppetboard does not: the ability to query a set of hosts and obtain an arbitrary set of facts for those hosts in a tabular format.
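
To make that use case concrete, here is a minimal sketch of turning per-fact records into a host-by-fact table. It assumes records shaped like the results of PuppetDB's facts query endpoint (one object per fact with certname, name and value keys). The function names, hostnames and fact values are made up for illustration, and this is not how servermon implemented it.

```python
from typing import Dict, Iterable, List, Mapping, Sequence

# Made-up sample data in the assumed certname/name/value shape.
SAMPLE_RECORDS = [
    {"certname": "host1.example.org", "name": "lsbdistcodename", "value": "stretch"},
    {"certname": "host1.example.org", "name": "virtual", "value": "kvm"},
    {"certname": "host2.example.org", "name": "lsbdistcodename", "value": "jessie"},
    {"certname": "host3.example.org", "name": "lsbdistcodename", "value": "stretch"},
]


def facts_table(
    records: Iterable[Mapping[str, object]],
    hosts: Sequence[str],
    fact_names: Sequence[str],
    missing: str = "-",
) -> List[List[str]]:
    """Pivot fact records into rows: one header row, then one row per host."""
    wanted_hosts = set(hosts)
    wanted_facts = set(fact_names)
    by_host: Dict[str, Dict[str, str]] = {host: {} for host in hosts}
    for record in records:
        host, name = record["certname"], record["name"]
        if host in wanted_hosts and name in wanted_facts:
            by_host[host][name] = str(record["value"])
    header = ["certname", *fact_names]
    rows = [[host, *(by_host[host].get(name, missing) for name in fact_names)]
            for host in hosts]
    return [header, *rows]


def render(table: Sequence[Sequence[str]]) -> str:
    """Render the table as plain text with space-padded, aligned columns."""
    widths = [max(len(row[i]) for row in table) for i in range(len(table[0]))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )


if __name__ == "__main__":
    hosts = ["host1.example.org", "host2.example.org"]
    print(render(facts_table(SAMPLE_RECORDS, hosts, ["lsbdistcodename", "virtual"])))
```

Hosts that lack a requested fact get a placeholder cell, so every row has the same columns.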

We discussed this in the SRE Infrastructure Foundations meeting; given that there are other issues with Servermon blocking the Buster migration of the Puppet masters, servermon/netmon1003 can go away now. An alternative solution will be found for the use case described by Alex when the need comes up again.

Change 521903 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] puppetmaster: remove severmon custom reporter

https://gerrit.wikimedia.org/r/521903

Dzahn claimed this task.Jul 10 2019, 8:10 PM

Change 502173 merged by Dzahn:
[operations/puppet@production] deployment_server: remove servermon

https://gerrit.wikimedia.org/r/502173

Change 502175 abandoned by Dzahn:
puppetmaster: remove servermon report

Reason:
duplicate of https://gerrit.wikimedia.org/r/c/operations/puppet/+/521903

https://gerrit.wikimedia.org/r/502175

Change 521903 merged by Jbond:
[operations/puppet@production] puppetmaster: remove severmon custom reporter

https://gerrit.wikimedia.org/r/521903

Change 502171 merged by Dzahn:
[operations/puppet@production] remove netmon1003 from site.pp

https://gerrit.wikimedia.org/r/502171

Mentioned in SAL (#wikimedia-operations) [2019-07-11T22:59:36Z] <mutante> netmon1003 - removing servermon - servermon.wikimedia.org is being decom'ed (T198939)

Change 502174 merged by Dzahn:
[operations/puppet@production] delete servermon role and module

https://gerrit.wikimedia.org/r/502174

Dzahn changed the status of subtask T220355: decom netmon1003 from Stalled to Open.Jul 12 2019, 11:31 PM

Change 502176 merged by Jcrespo:
[operations/puppet@production] mariadb::ferm_misc: remove firewall rule for servermon

https://gerrit.wikimedia.org/r/502176

What about the puppet database on m1?

Change 502172 merged by Jcrespo:
[operations/puppet@production] mariadb: revoke servermon grants

https://gerrit.wikimedia.org/r/502172

Also the passwords have to be removed from the private repo (and possibly from labs/private).

Mentioned in SAL (#wikimedia-operations) [2019-07-16T08:44:41Z] <jynus> droping servermon accounts from m1 dbs T198939

What about the puppet database on m1?

The database should all be ephemeral data about past server state, so no need to retain, but adding @akosiaris for confirmation given he's the primary upstream author.

Absolutely correct. Feel free to delete it.

Could you also confirm that all puppet grants (mysql database is understood, of course) on the puppet database are no longer needed? You can find them in the misc production grants.

Change 523702 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Remove puppet mysql grants for m1 misc databases

https://gerrit.wikimedia.org/r/523702

Mentioned in SAL (#wikimedia-operations) [2019-07-16T16:55:47Z] <mutante> netmon1003: shutdown -h now | ganeti1001: gnt-instance shutdown netmon1003.wikmedia.org - removed from icinga T198939 T220355

Change 523797 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[labs/private@master] delete passwords::servermon (T198939)

https://gerrit.wikimedia.org/r/523797

Change 523797 merged by Dzahn:
[labs/private@master] delete passwords::servermon (T198939)

https://gerrit.wikimedia.org/r/523797

Dzahn added a comment.Jul 16 2019, 8:19 PM

Also the passwords have to be removed from the private repo (and possibly from labs/private).

Done in the private repo and in labs/private as well.

https://gerrit.wikimedia.org/r/c/labs/private/+/523797

Change 523799 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] delete servermon.wikimedia.org

https://gerrit.wikimedia.org/r/523799

Change 523799 merged by Dzahn:
[operations/dns@master] delete servermon.wikimedia.org

https://gerrit.wikimedia.org/r/523799

Dzahn added a comment.Jul 16 2019, 8:26 PM

This should be done. The service is down. The Ganeti VM is stopped and destroyed. The password class is deleted from private and labs/private. The DNS entry is removed. The string "servermon" no longer appears in puppet or the other repos...

Dzahn closed this task as Resolved.Jul 16 2019, 8:26 PM