Migrate dbmonitor hosts to Buster
Closed, ResolvedPublic

Description

The dbmonitor* hosts are currently running jessie:

  • dbmonitor1001.wikimedia.org

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2019-10-22T12:14:59Z] <jynus> reimage to buster dbmonitor2001.wikimedia.org T224589

Change 545273 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] dbmonitor: Install the right apache module packages for >jessie

https://gerrit.wikimedia.org/r/545273

Change 545273 merged by Jcrespo:
[operations/puppet@production] dbmonitor: Install the right apache module packages for >jessie

https://gerrit.wikimedia.org/r/545273

2 blockers:

  • Exec of /usr/sbin/a2enmod php7.0 fails, as the right module would be php7.3. No support for buster on the http module? Httpd/Httpd::Mod_conf[php7.0]/Exec[ensure_present_mod_php7.0]
  • Error: '/usr/bin/git clone -b master https://gerrit.wikimedia.org/r/operations/software/tendril /srv/tendril' returned 1 instead of one of [0] /srv/tendril/.git: Permission denied, (why?)

2 blockers:

  • Exec of /usr/sbin/a2enmod php7.0 fails, as the right module would be php7.3. No support for buster on the http module? Httpd/Httpd::Mod_conf[php7.0]/Exec[ensure_present_mod_php7.0]

That would be the php module, or whatever else you're using, not the httpd module, that has no support for php7.3.

Change 545282 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] dbmonitor: Deploy git repo as mwdeploy, otherwise no write permission

https://gerrit.wikimedia.org/r/545282

Change 545286 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] dbmonitor: Install the right apache modules for buster

https://gerrit.wikimedia.org/r/545286

Thanks, joe. I didn't see your comment, so it took me more time than I thought to find it. The above 2 patches should fix it?

Change 545282 merged by Jcrespo:
[operations/puppet@production] dbmonitor: Deploy git repo as mwdeploy, otherwise no write permission

https://gerrit.wikimedia.org/r/545282

Change 545286 merged by Jcrespo:
[operations/puppet@production] dbmonitor: Install the right apache modules for buster

https://gerrit.wikimedia.org/r/545286

🤔

Notice: /Stage[main]/Httpd/Httpd::Conf[defaults]/File[/etc/apache2/conf-enabled/00-defaults.conf]/ensure: created
Info: /Stage[main]/Httpd/Httpd::Conf[defaults]/File[/etc/apache2/conf-enabled/00-defaults.conf]: Scheduling refresh of Service[apache2]                                                                                                         
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[rewrite]/Exec[ensure_present_mod_rewrite]/returns: executed successfully
Info: /Stage[main]/Httpd/Httpd::Mod_conf[rewrite]/Exec[ensure_present_mod_rewrite]: Scheduling refresh of Service[apache2]                                                                                                                      
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[headers]/Exec[ensure_present_mod_headers]/returns: executed successfully
Info: /Stage[main]/Httpd/Httpd::Mod_conf[headers]/Exec[ensure_present_mod_headers]: Scheduling refresh of Service[apache2]                                                                                                                      
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[ssl]/Exec[ensure_present_mod_ssl]/returns: executed successfully
Info: /Stage[main]/Httpd/Httpd::Mod_conf[ssl]/Exec[ensure_present_mod_ssl]: Scheduling refresh of Service[apache2]
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: ERROR: Module mpm_event is enabled - cannot proceed due to conflicts. It needs to be disabled first!
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: ERROR: Could not enable dependency mpm_prefork for php7.3, aborting
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: Considering dependency mpm_prefork for php7.3:
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: Considering conflict mpm_event for mpm_prefork:
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: Considering conflict mpm_worker for mpm_prefork:
Error: '/usr/sbin/a2enmod php7.3' returned 1 instead of one of [0]
Error: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: change from 'notrun' to ['0'] failed: '/usr/sbin/a2enmod php7.3' returned 1 instead of one of [0]
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[authnz_ldap]/Exec[ensure_present_mod_authnz_ldap]/returns: executed successfully

I think I know what happened: initially the Puppet code was wrong and libapache2-mod-php (which needs mpm_prefork) didn't get installed. But apache2 still got installed; its postinst detects a new installation in is_fresh_install() and then sets up mpm_event as the default MPM. Now that Puppet is fixed, you can run "dpkg --purge apache2 apache2-bin apache2-data apache2-utils libapache2-mod-php libapache2-mod-php7.3"; re-running Puppet afterwards should set this up correctly.

That was more or less what I tried before, but it installs the event MPM rather than prefork. Just to be sure, I tried your exact purges again, and I got the same error:

Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: ERROR: Module mpm_event is enabled - cannot proceed due to conflicts. It needs to be disabled first!
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: ERROR: Could not enable dependency mpm_prefork for php7.3, aborting
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: Considering dependency mpm_prefork for php7.3:
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: Considering conflict mpm_event for mpm_prefork:
Notice: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: Considering conflict mpm_worker for mpm_prefork:
Error: '/usr/sbin/a2enmod php7.3' returned 1 instead of one of [0]
Error: /Stage[main]/Httpd/Httpd::Mod_conf[php7.3]/Exec[ensure_present_mod_php7.3]/returns: change from 'notrun' to ['0'] failed: '/usr/sbin/a2enmod php7.3' returned 1 instead of one of [0]

My suspicion is that event or one of its used modules is chosen somewhere in the httpd class dependencies?

I ran manually a2dismod mpm_event and now it worked. I will check if this happens again on a clean install of dbmonitor1001 and add code to handle it. You (we) may still be right, but the config may be hidden in a -common package or something.
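For readers hitting the same conflict, the manual fix can be sketched as a small check over Apache's mods-enabled directory. This is only an illustration, not what the Puppet httpd module does: the function names and the `.load`-file heuristic are assumptions made for this example, and the script only prints the `a2dismod`/`a2enmod` commands it would suggest rather than running them.

```python
from __future__ import annotations

from pathlib import Path


def enabled_modules(mods_enabled: Path) -> set[str]:
    """Return the module names that have a .load file in mods-enabled."""
    return {p.stem for p in mods_enabled.glob("*.load")}


def plan_php_enable(mods_enabled: Path, php_module: str = "php7.3") -> list[str]:
    """List the commands needed before mod_php can be enabled.

    mod_php depends on mpm_prefork, which conflicts with mpm_event and
    mpm_worker, so those have to be disabled first.
    """
    enabled = enabled_modules(mods_enabled)
    commands = []
    for mpm in ("mpm_event", "mpm_worker"):
        if mpm in enabled:
            commands.append(f"a2dismod {mpm}")
    if "mpm_prefork" not in enabled:
        commands.append("a2enmod mpm_prefork")
    if php_module not in enabled:
        commands.append(f"a2enmod {php_module}")
    return commands


if __name__ == "__main__":
    # Default Debian location; adjust if your layout differs.
    for command in plan_php_enable(Path("/etc/apache2/mods-enabled")):
        print(command)
```

On a host in the state shown in the Puppet log above (mpm_event enabled, php7.3 not yet enabled), the plan would start with `a2dismod mpm_event`, which matches the manual step that worked here.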

Now we "only" need to fix the PHP, which I would prefer not to do, not because it would be difficult, but because it would be a waste of time; I would prefer to create a simple flask + d3 microsite, especially for dbtree:

[Wed Oct 23 09:20:25.549861 2019] [php7:error] [pid 23096] [client 208.80.153.74:36964] PHP Fatal error:  Uncaught Error: Call to undefined function mysql_connect() in /srv/dbtree/index.php:31\nStack trace:\n#0 /srv/dbtree/inc/sanity.php(279): db()\n#1 /srv/dbtree/inc/sanity.php(951): sql->__construct('tendril.servers')\n#2 /srv/dbtree/inc/tree.php(88): sql::query('tendril.servers')\n#3 /srv/dbtree/index.php(41): Tree->generate()\n#4 {main}\n  thrown in /srv/dbtree/index.php on line 31

Change 545505 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/software/dbtree@master] mysql: Migrate away from long-deprecated mysql module to mysqli

https://gerrit.wikimedia.org/r/545505

Change 545505 merged by Jcrespo:
[operations/software/dbtree@master] mysql: Migrate away from long-deprecated mysql module to mysqli

https://gerrit.wikimedia.org/r/545505

Mentioned in SAL (#wikimedia-operations) [2019-10-23T10:11:03Z] <jynus> deploying new version of dbtree T224589

Mentioned in SAL (#wikimedia-operations) [2019-10-23T10:13:47Z] <jynus> reverting dbtree revision to HEAD~1 T224589

[Wed Oct 23 10:17:48.055752 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_select_db() expects exactly 2 parameters, 1 given in /srv/dbtree/index.php on line 33
[Wed Oct 23 10:17:48.055976 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_real_escape_string() expects exactly 2 parameters, 1 given in /srv/dbtree/inc/sanity.php on line 286
[Wed Oct 23 10:17:48.056122 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_query() expects parameter 1 to be mysqli, string given in /srv/dbtree/inc/sanity.php on line 783
[Wed Oct 23 10:17:48.056213 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_errno() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 788
[Wed Oct 23 10:17:48.056297 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_error() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 789
[Wed Oct 23 10:17:48.056377 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_fetch_array() expects parameter 1 to be mysqli_result, null given in /srv/dbtree/inc/sanity.php on line 852
[Wed Oct 23 10:17:48.060413 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_select_db() expects exactly 2 parameters, 1 given in /srv/dbtree/index.php on line 33
[Wed Oct 23 10:17:48.060750 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_real_escape_string() expects exactly 2 parameters, 1 given in /srv/dbtree/inc/sanity.php on line 286
[Wed Oct 23 10:17:48.060941 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_query() expects parameter 1 to be mysqli, string given in /srv/dbtree/inc/sanity.php on line 783
[Wed Oct 23 10:17:48.061083 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_errno() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 788
[Wed Oct 23 10:17:48.061236 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_error() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 789
[Wed Oct 23 10:17:48.061352 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_fetch_array() expects parameter 1 to be mysqli_result, null given in /srv/dbtree/inc/sanity.php on line 852
[Wed Oct 23 10:17:48.062785 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_select_db() expects exactly 2 parameters, 1 given in /srv/dbtree/index.php on line 33
[Wed Oct 23 10:17:48.062926 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_real_escape_string() expects exactly 2 parameters, 1 given in /srv/dbtree/inc/sanity.php on line 286
[Wed Oct 23 10:17:48.063107 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_query() expects parameter 1 to be mysqli, string given in /srv/dbtree/inc/sanity.php on line 783
[Wed Oct 23 10:17:48.063231 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_errno() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 788
[Wed Oct 23 10:17:48.063341 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_error() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 789
[Wed Oct 23 10:17:48.063453 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_fetch_array() expects parameter 1 to be mysqli_result, null given in /srv/dbtree/inc/sanity.php on line 852
[Wed Oct 23 10:17:48.064791 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_select_db() expects exactly 2 parameters, 1 given in /srv/dbtree/index.php on line 33
[Wed Oct 23 10:17:48.064942 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_real_escape_string() expects exactly 2 parameters, 1 given in /srv/dbtree/inc/sanity.php on line 286
[Wed Oct 23 10:17:48.065104 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_real_escape_string() expects exactly 2 parameters, 1 given in /srv/dbtree/inc/sanity.php on line 286
[Wed Oct 23 10:17:48.065229 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_real_escape_string() expects exactly 2 parameters, 1 given in /srv/dbtree/inc/sanity.php on line 286
[Wed Oct 23 10:17:48.065380 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_query() expects parameter 1 to be mysqli, string given in /srv/dbtree/inc/sanity.php on line 783
[Wed Oct 23 10:17:48.065489 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_errno() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 788
[Wed Oct 23 10:17:48.065596 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_error() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 789
[Wed Oct 23 10:17:48.065707 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_fetch_array() expects parameter 1 to be mysqli_result, null given in /srv/dbtree/inc/sanity.php on line 852
[Wed Oct 23 10:17:48.067187 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_select_db() expects exactly 2 parameters, 1 given in /srv/dbtree/index.php on line 33
[Wed Oct 23 10:17:48.067320 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_real_escape_string() expects exactly 2 parameters, 1 given in /srv/dbtree/inc/sanity.php on line 286
[Wed Oct 23 10:17:48.067507 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_query() expects parameter 1 to be mysqli, string given in /srv/dbtree/inc/sanity.php on line 783
[Wed Oct 23 10:17:48.067633 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_errno() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 788
[Wed Oct 23 10:17:48.067751 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_error() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 789
[Wed Oct 23 10:17:48.067896 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_fetch_array() expects parameter 1 to be mysqli_result, null given in /srv/dbtree/inc/sanity.php on line 852
[Wed Oct 23 10:17:48.070537 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_select_db() expects exactly 2 parameters, 1 given in /srv/dbtree/index.php on line 33
[Wed Oct 23 10:17:48.070809 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_query() expects parameter 1 to be mysqli, string given in /srv/dbtree/inc/sanity.php on line 783
[Wed Oct 23 10:17:48.070914 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_errno() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 788
[Wed Oct 23 10:17:48.071025 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_error() expects exactly 1 parameter, 0 given in /srv/dbtree/inc/sanity.php on line 789
[Wed Oct 23 10:17:48.071139 2019] [:error] [pid 10017] [client 10.64.32.67:64543] PHP Warning:  mysqli_fetch_array() expects parameter 1 to be mysqli_result, null given in /srv/dbtree/inc/sanity.php on line 852

So normally the fix for the above would be trivial, but given the design decision of making the sql class a singleton, it is in my opinion not worth fixing, because it would force either a deeper refactoring or a global-scope hack. I would prefer to spend more time refactoring tendril so that it no longer uses the Google API: T96499
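The wall of warnings above comes down to a handful of call sites where a mysqli function is called without the explicit connection argument that the mysqli procedural API requires. A minimal sketch for condensing such a log, assuming the Apache error-log line format shown in this task (the regular expression and the function name are made up for this example):

```python
import re
from collections import Counter

# Matches the "PHP Warning:  func() <message> in <file> on line <n>" tail
# of an Apache error-log line.
WARNING_RE = re.compile(
    r"PHP Warning:\s+(?P<func>\w+)\(\) (?P<msg>.*?) in (?P<file>\S+) on line (?P<line>\d+)"
)


def summarize_warnings(lines):
    """Count PHP warnings per (function, file, line) call site."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            key = (match["func"], match["file"], int(match["line"]))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < /var/log/apache2/error.log
    for (func, path, lineno), n in summarize_warnings(sys.stdin).most_common():
        print(f"{n:5d}  {func}()  {path}:{lineno}")
```

Grouping by call site makes it easier to see how few places in index.php and inc/sanity.php would need the connection handle passed in, which is where the singleton design gets in the way.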

jcrespo changed the task status from Open to Stalled. Nov 12 2019, 9:39 AM
jcrespo removed jcrespo as the assignee of this task.
MoritzMuehlenhoff renamed this task from Migrate dbmonitor hosts to Stretch/Buster to Migrate dbmonitor hosts to Buster. May 29 2020, 8:47 AM
MoritzMuehlenhoff updated the task description. (Show Details)

Please don't consider dbmonitor2001 as upgraded, as the application doesn't work after the OS upgrade. We could downgrade it back to jessie to make it work.

I ran manually a2dismod mpm_event and now it worked.

Confirmed. This happens to me all the time and the fix is manually running a2dismod mpm_event.

I once made a change to try and fix that, but I gave up on it because I could never get a review:

https://gerrit.wikimedia.org/r/c/operations/puppet/+/451206

Just to be clear- work on this is stalled because the expected solution is to kill tendril, not to fix it. Manuel is right now working on that but it will take time.

merging in T262085 as a duplicate. I just wish it would actually merge the content like RT did

Feel free to add more context; the 500s were known, but the description here was not very explicit, probably out of frustration at the many hours sunk into this.

Just to be clear- work on this is stalled because the expected solution is to kill tendril, not to fix it. Manuel is right now working on that but it will take time.

Thanks for making that clear. I will not worry about changing things in related puppet code then for now.

The problem described here is - I think - the same as the one in the "simplelap" role: you're not including the httpd::mpm class that is designed to take care of things for you.

Change 625583 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] tendril::webserver: configure mpm

https://gerrit.wikimedia.org/r/625583

Change 625583 merged by Giuseppe Lavagetto:
[operations/puppet@production] tendril::webserver: configure mpm

https://gerrit.wikimedia.org/r/625583

Joe removed Joe as the assignee of this task. Sep 23 2020, 8:03 AM

With orchestrator in place, can these be removed now? Support for jessie will cease in two weeks.

Unfortunately not, we don't have all the sections in Orchestrator yet. We also need to see how we'll replace the query activity view; we have some ideas on that front, but they have yet to be implemented/deployed.

Ouch, let's move dbmonitor to Stretch, then? If PHP 5 is the blocker (I vaguely remember some issues with PHP 7), I can make a stretch-wikimedia build of php5, but this really, really needs to move away from jessie: jessie has been EOL for three quarters now, we spend a lot of time internally on backporting security fixes for jessie-wikimedia, and this really needs to end now.

The problem was indeed mysqli, we can try to see if we can run php5 on stretch as you propose.
@jcrespo took a deep look at this a couple of years ago I think, so maybe he can give more context on what he saw at the time (other than what's already on this task) at T224589#5597729 and T224589#5598014

Ok! Given that we have a long term solution with Orchestrator I think simply lifting PHP 5.5 to stretch is a good compromise, I'll check out a build later.

@jcrespo took a deep look at this a couple of years ago I think

I gave up: it is not just a question of a mysql->mysqli rewrite, which would be trivial; it also uses global variables in a way that is not portable, and spending more time on a codebase that has many other issues (tokudb, etc.) was not worth it.

I am talking about porting the code, it should work with any php version that supports the old mysql driver.

Change 672976 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add thirdparty/php56

https://gerrit.wikimedia.org/r/672976

Change 672977 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Adapt package installation for tendril/buster to pull in PHP 5.6

https://gerrit.wikimedia.org/r/672977

Change 672976 merged by Muehlenhoff:
[operations/puppet@production] Add thirdparty/php56

https://gerrit.wikimedia.org/r/672976

Mentioned in SAL (#wikimedia-operations) [2021-03-17T09:59:40Z] <moritzm> imported PHP 5.6.40 to thirdparty/php56 T224589

Change 672977 merged by Muehlenhoff:
[operations/puppet@production] Adapt package installation for tendril/buster to pull in PHP 5.6

https://gerrit.wikimedia.org/r/672977

Change 674021 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Point dbtree.w.o to dbmonitor1002

https://gerrit.wikimedia.org/r/674021

Change 674021 merged by Muehlenhoff:
[operations/puppet@production] Point dbtree.w.o to dbmonitor1002

https://gerrit.wikimedia.org/r/674021

Change 674026 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] tendril.sql: Add dbmonitor1002 grants

https://gerrit.wikimedia.org/r/674026

Change 674026 merged by Marostegui:
[operations/puppet@production] tendril.sql: Add dbmonitor1002 grants

https://gerrit.wikimedia.org/r/674026

Change 674303 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/dns@master] Failover tendril to dbmonitor1002

https://gerrit.wikimedia.org/r/674303

Change 674303 merged by Muehlenhoff:
[operations/dns@master] Failover tendril to dbmonitor1002

https://gerrit.wikimedia.org/r/674303

tendril.w.o and dbtree.w.o are now served from dbmonitor1002.wikimedia.org running Buster. If there are any issues, we can fall back to dbmonitor1001 by reverting https://gerrit.wikimedia.org/r/674303

Hey, @MoritzMuehlenhoff, I see no ongoing issues, but I see some things running 10x faster now!

I came back from holidays, @Kormat @LSobanski anything wrong during the last few days with dbmonitor running buster?
From Jaime's comment, everything looked good from his side, but did you notice something or can we go ahead and kill the jessie hosts?

Marostegui changed the task status from Stalled to Open. Apr 5 2021, 8:29 AM

Mentioned in SAL (#wikimedia-operations) [2021-04-07T10:51:42Z] <marostegui> Stop apache on dbmonitor1001 T224589

I have stopped apache on dbmonitor1001 (and done chmod -x to apache2 binary so puppet doesn't bring it up), let's leave it till next week and if nothing breaks, let's decommission it

I'll start with decom in a bit

Change 678799 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove dbmonitor1001 from Puppet

https://gerrit.wikimedia.org/r/678799

Change 678800 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove grant for dbmonitor1001

https://gerrit.wikimedia.org/r/678800

cookbooks.sre.hosts.decommission executed by jmm@cumin1001 for hosts: dbmonitor1001.wikimedia.org

  • dbmonitor1001.wikimedia.org (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox

Probably known (sorry) but the other alert I saw recently was: "CRITICAL: the following (6) node(s) change every puppet run: dbmonitor1001.wikimedia.org,...". Probably related to this?

Yeah, Manuel had rendered Apache non-startable (to rule out that anything still accidentally uses the old host), so Puppet tried to start it in vain with every Puppet run.

Change 678799 merged by Muehlenhoff:

[operations/puppet@production] Remove dbmonitor1001 from Puppet

https://gerrit.wikimedia.org/r/678799

MoritzMuehlenhoff claimed this task.

Tendril and dbtree are now running on a new Buster instance, dbmonitor1002.wikimedia.org, with PHP 5.6 packages from sury.org (since Tendril needs the mysql extension, which was dropped in PHP 7), and dbmonitor1001/jessie has been removed.

Change 679318 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] tendril.sql: Remove dbmonitor1001 grants

https://gerrit.wikimedia.org/r/679318

Change 679318 merged by Marostegui:

[operations/puppet@production] tendril.sql: Remove dbmonitor1001 grants

https://gerrit.wikimedia.org/r/679318

Change 678800 abandoned by Muehlenhoff:

[operations/puppet@production] Remove grant for dbmonitor1001

Reason:

Already done in 679318

https://gerrit.wikimedia.org/r/678800