While running scap for a security deploy, the php-fpm restart failed on mw2448.codfw.wmnet. I ran scap a second time and received the same error. sync-apaches also failed on mw2448.codfw.wmnet; in both steps the reported error is an SSH connection timeout on port 22. Log is below:
22:41:05 Started sync-apaches
22:44:27 ['/usr/bin/scap', 'pull', '--no-php-restart', '--no-update-l10n', '--include', 'php-1.42.0-wmf.9', '--include', 'php-1.42.0-wmf.9/extensions', '--include', 'php-1.42.0-wmf.9/extensions/PageTriage', '--include', 'php-1.42.0-wmf.9/extensions/PageTriage/modules', '--include', 'php-1.42.0-wmf.9/extensions/PageTriage/modules/***', 'mw2289.codfw.wmnet', 'mw1366.eqiad.wmnet', 'mw1420.eqiad.wmnet', 'deploy2002.codfw.wmnet', 'mw2300.codfw.wmnet', 'mw2259.codfw.wmnet', 'deploy1002.eqiad.wmnet', 'mw1404.eqiad.wmnet', 'mw1486.eqiad.wmnet', 'mw1398.eqiad.wmnet'] (ran as mwdeploy@mw2448.codfw.wmnet) returned [255]: ssh: connect to host mw2448.codfw.wmnet port 22: Connection timed out
22:44:27 sync-apaches: 100% (in-flight: 0; ok: 358; fail: 1; left: 0)
22:44:27 Per-host sync duration: average 2.0s, median 1.0s
22:44:27 rsync transfer: average 0 bytes/host, total 0 bytes
22:44:27 1 apaches had sync errors
22:44:27 Finished sync-apaches (duration: 03m 22s)
22:44:27 Started php-fpm-restarts
22:44:27 Running '/usr/local/sbin/restart-php-fpm-all php7.4-fpm 9223372036854775807' on 295 host(s)
22:48:27 /usr/bin/sudo -u root -- /usr/local/sbin/restart-php-fpm-all php7.4-fpm 9223372036854775807 (ran as mwdeploy@mw2448.codfw.wmnet) returned [255]: ssh: connect to host mw2448.codfw.wmnet port 22: Connection timed out
22:48:27 php-fpm-restart: 100% (in-flight: 0; ok: 294; fail: 1; left: 0)
22:48:27 1 hosts had failures restarting php-fpm
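
For triage, it can help to group the failures in a log like this by host and error message. The following is a minimal sketch, assuming the log has been split into one entry per line and that failures follow the "(ran as user@host) returned [code]: message" shape seen above. The function name `failed_hosts` and the regular expression are illustrative only and are not part of scap.

```python
import re
from collections import Counter

# Assumed failure shape, taken from the log lines in this report:
#   "... (ran as USER@HOST) returned [CODE]: MESSAGE"
FAILURE_RE = re.compile(
    r"\(ran as (?P<user>[^@\s]+)@(?P<host>[^)\s]+)\) "
    r"returned \[(?P<code>\d+)\]: (?P<msg>.*)$"
)


def failed_hosts(log_lines):
    """Count failures per (host, error message) pair.

    Lines that do not match the assumed failure shape, or that report
    exit code 0, are ignored.
    """
    failures = Counter()
    for line in log_lines:
        match = FAILURE_RE.search(line)
        if match and match.group("code") != "0":
            key = (match.group("host"), match.group("msg").strip())
            failures[key] += 1
    return failures
```

Run against the log above, both the sync-apaches and php-fpm-restarts failures would be counted under the same host and the same SSH timeout message, which suggests the host itself was unreachable rather than php-fpm misbehaving.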