Blacklist apache from unattended-upgrades on tools puppetmaster
Closed, ResolvedPublic

Description

Puppet failed on the tools puppetmaster on 28/2 because apache failed to start with the error below, causing puppet failures across tools:

Feb 28 06:49:53 tools-puppetmaster-02 apache2[22628]: Starting web server: apache2 failed!
Feb 28 06:49:53 tools-puppetmaster-02 apache2[22628]: The apache2 configtest failed. ... (warning).
Feb 28 06:49:53 tools-puppetmaster-02 apache2[22628]: Output of config test was:
Feb 28 06:49:53 tools-puppetmaster-02 apache2[22628]: AH00526: Syntax error on line 8 of /etc/apache2/sites-enabled/50-puppetmaster-wikimedia-org.conf:
Feb 28 06:49:53 tools-puppetmaster-02 apache2[22628]: Invalid command 'SSLOpenSSLConfCmd', perhaps misspelled or defined by a module not included in the server configuration
Feb 28 06:49:53 tools-puppetmaster-02 apache2[22628]: Action 'configtest' failed.
Feb 28 06:49:53 tools-puppetmaster-02 apache2[22628]: The Apache error log may have more information.
Feb 28 06:49:53 tools-puppetmaster-02 systemd[1]: apache2.service: control process exited, code=exited status=1
Feb 28 06:49:53 tools-puppetmaster-02 systemd[1]: Failed to start LSB: Apache2 web server.
Feb 28 06:49:53 tools-puppetmaster-02 systemd[1]: Unit apache2.service entered failed state.

_joe_ and moritzm helped fix it. This was the underlying issue (from moritzm on IRC):

1:46 AM hi labs team, so apache on tools-puppetmaster-02 failed to start and that was caused by the use of unattended-upgrades in labs:
1:46 AM the standard apache package in Debian is built against openssl 1.0.1
1:49 AM but custom Diffie-Hellman parameters can only be configured when apache is linked against openssl 1.0.2 (as required by "logjam"), so we're using a custom rebuild of apache on apt.wikimedia.org
1:49 AM so if Debian issues a new apache package it gets rebuilt internally and pushed to apt.wikimedia.org
1:51 AM but since labs uses unattended upgrades, the stock Debian version gets installed in the time window between Debian release and rebuild (DSA for apache happened Sunday evening and I pushed the rebuild yesterday at around 11am)
1:51 AM and since the native Debian version is not built against 1.0.2, it failed to start with an error since it doesn't provide SSLOpenSSLConfCmd
1:52 AM I suggest to blacklist apache from unattended-upgrades on at least the puppet masters using Unattended-Upgrade::Package-Blacklist: http://askubuntu.com/questions/193773/can-i-configure-unattended-upgrades-to-not-upgrade-packages-that-require-a-reboo
1:53 AM this is only needed for jessie, on trusty/precise the stock apache package is used
1:53 AM and stretch as well

Event Timeline

(FTR: To fix other instances with this problem (= every host with apache2 running), apt-get install apache2 is enough.)
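
To find hosts that need this fix, one option is to check whether the installed apache2 lost its +wmf suffix while a +wmf candidate is available. The sketch below parses the text printed by apt-cache policy apache2; the function names are hypothetical and the sample text is shortened from the output quoted later in this task.

```python
import re
from typing import Tuple


def installed_and_candidate(policy_output: str) -> Tuple[str, str]:
    """Extract the Installed and Candidate versions from `apt-cache policy <pkg>` output."""
    installed = re.search(r"^\s*Installed:\s*(\S+)", policy_output, re.MULTILINE)
    candidate = re.search(r"^\s*Candidate:\s*(\S+)", policy_output, re.MULTILINE)
    if not installed or not candidate:
        raise ValueError("no Installed/Candidate lines found")
    return installed.group(1), candidate.group(1)


def needs_reinstall(policy_output: str) -> bool:
    """True when the installed build lacks the +wmf suffix but the candidate has it."""
    installed, candidate = installed_and_candidate(policy_output)
    return "+wmf" not in installed and "+wmf" in candidate


# Shortened sample, taken from the apt-cache policy output quoted in this task.
SAMPLE = """apache2:
  Installed: 2.4.10-10+deb8u9
  Candidate: 2.4.10-10+deb8u9+wmf1
"""

if __name__ == "__main__":
    print(needs_reinstall(SAMPLE))
```

When this reports True, apt-get install apache2 should move the host back to the rebuilt package, as noted above.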

If apache2 is blacklisted, who will inform Cloud-Services/Toolforge administrators of the need to manually update apache2?

@scfc: Updates for Apache in jessie are relatively rare, I could simply drop a note in the labs channel when that happens?

Would pinning apache to our repo for jessie in the Puppet class handle this?

Unfortunately not. unattended-upgrades would still upgrade the packages. The pinning might downgrade this again, but that would only lead to "upgrade battles" between unattended-upgrades and puppet.

This just happened again, and puppet broke across tools.

We need the following to be added to the unattended-upgrades config for jessie puppetmasters in labs:

// Skip apache updates (T159254)
Unattended-Upgrade::Package-Blacklist {
"apache2";
"apache2-bin";
"apache2-data";
"apache2-dbg";
"apache2-dev";
"apache2-doc";
"apache2-mpm-event";
"apache2-mpm-itk";
"apache2-mpm-prefork";
"apache2-mpm-worker";
"apache2-suexec";
"apache2-suexec-custom";
"apache2-suexec-pristine";
"apache2-utils";
"apache2.2-bin";
"apache2.2-common";
"libapache2-mod-macro";
"libapache2-mod-proxy-html";
};

(Typically /etc/apt/apt.conf.d/50unattended-upgrades, unless this has been customised in puppet)
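
A side note on the length of that list: as far as I know, unattended-upgrades treats each Package-Blacklist entry as a Python regular expression matched against the start of the package name. If that holds for the jessie version in use (worth confirming with unattended-upgrade --dry-run --debug), a much shorter list would cover the same packages. The sketch below only models that assumed matching rule; it is not the actual unattended-upgrades code, and is_blacklisted and SHORT_BLACKLIST are hypothetical names.

```python
import re
from typing import Iterable

# Assumption: entries are regular expressions anchored at the start of the
# package name only (re.match semantics), so "apache2" also covers "apache2-bin".
SHORT_BLACKLIST = ["apache2", "libapache2-mod-macro", "libapache2-mod-proxy-html"]


def is_blacklisted(pkgname: str, patterns: Iterable[str]) -> bool:
    """Return True if any pattern matches the beginning of pkgname."""
    return any(re.match(pattern, pkgname) for pattern in patterns)


if __name__ == "__main__":
    for name in ["apache2", "apache2-bin", "apache2.2-common",
                 "libapache2-mod-macro", "openssl"]:
        print(name, is_blacklisted(name, SHORT_BLACKLIST))
```

Listing every package explicitly, as above, stays correct whichever matching rule applies, so the long form is the safer choice if the behaviour cannot be verified.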

The beta cluster puppet master (deployment-puppetmaster02.deployment-prep.eqiad.wmflabs) has been hit by that one.

apt-cache policy apache2
apache2:
  Installed: 2.4.10-10+deb8u9
  Candidate: 2.4.10-10+deb8u9+wmf1
  Version table:
     2.4.10-10+deb8u9+wmf1 0
       1001 http://apt.wikimedia.org/wikimedia/ jessie-wikimedia/main amd64 Packages
 *** 2.4.10-10+deb8u9 0
        500 http://security.debian.org/ jessie/updates/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.10-10+deb8u8 0
        500 http://httpredir.debian.org/debian/ jessie/main amd64 Packages

/var/log/apt/history.log
Start-Date: 2017-07-03  06:47:24
Commandline: /usr/bin/unattended-upgrade
Upgrade: libgcrypt20:amd64 (1.6.3-2+deb8u3, 1.6.3-2+deb8u4)
End-Date: 2017-07-03  06:47:25

Start-Date: 2017-07-04  06:36:09
Commandline: /usr/bin/unattended-upgrade
Upgrade: apache2-utils:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9), apache2-data:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9), apache2:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9), apache2-bin:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9)
End-Date: 2017-07-04  06:36:14

/var/log/apt/term.log
Log started: 2017-07-03  06:47:24
(Reading database ... 65615 files and directories currently installed.)
Preparing to unpack .../libgcrypt20_1.6.3-2+deb8u4_amd64.deb ...
Unpacking libgcrypt20:amd64 (1.6.3-2+deb8u4) over (1.6.3-2+deb8u3) ...
Setting up libgcrypt20:amd64 (1.6.3-2+deb8u4) ...
Processing triggers for libc-bin (2.19-18+deb8u10) ...
Log ended: 2017-07-03  06:47:25

Log started: 2017-07-04  06:36:09
(Reading database ... 65615 files and directories currently installed.)
Preparing to unpack .../apache2_2.4.10-10+deb8u9_amd64.deb ...
Unpacking apache2 (2.4.10-10+deb8u9) over (2.4.10-10+deb8u8+wmf1) ...
Preparing to unpack .../apache2-bin_2.4.10-10+deb8u9_amd64.deb ...
Unpacking apache2-bin (2.4.10-10+deb8u9) over (2.4.10-10+deb8u8+wmf1) ...
Preparing to unpack .../apache2-utils_2.4.10-10+deb8u9_amd64.deb ...
Unpacking apache2-utils (2.4.10-10+deb8u9) over (2.4.10-10+deb8u8+wmf1) ...
Preparing to unpack .../apache2-data_2.4.10-10+deb8u9_all.deb ...
Unpacking apache2-data (2.4.10-10+deb8u9) over (2.4.10-10+deb8u8+wmf1) ...
Processing triggers for systemd (215-17+deb8u7) ...
Processing triggers for man-db (2.7.0.2-5) ...
Setting up apache2-bin (2.4.10-10+deb8u9) ...
Setting up apache2-utils (2.4.10-10+deb8u9) ...
Setting up apache2-data (2.4.10-10+deb8u9) ...
Setting up apache2 (2.4.10-10+deb8u9) ...
Job for apache2.service failed. See 'systemctl status apache2.service' and 'journalctl -xn' for details.
invoke-rc.d: initscript apache2, action "restart" failed.
Log ended: 2017-07-04  06:36:14
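
The same pattern can be spotted automatically in /var/log/apt/history.log. Below is a minimal sketch, assuming the Upgrade: line format shown above, that reports every package whose +wmf rebuild was replaced by a stock version; lost_wmf_builds is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# One upgraded package looks like: name:arch (old_version, new_version)
UPGRADE_ITEM = re.compile(r"(\S+?):(\S+) \(([^,]+), ([^)]+)\)")


def lost_wmf_builds(history_text: str) -> List[Tuple[str, str, str]]:
    """Return (package, old, new) for upgrades that went from a +wmf build to a stock one."""
    results = []
    for line in history_text.splitlines():
        if not line.startswith("Upgrade:"):
            continue
        for name, _arch, old, new in UPGRADE_ITEM.findall(line):
            if "+wmf" in old and "+wmf" not in new:
                results.append((name, old, new))
    return results


# Sample lines copied from the history.log excerpt above.
SAMPLE_LOG = (
    "Upgrade: libgcrypt20:amd64 (1.6.3-2+deb8u3, 1.6.3-2+deb8u4)\n"
    "Upgrade: apache2-utils:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9), "
    "apache2-data:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9), "
    "apache2:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9), "
    "apache2-bin:amd64 (2.4.10-10+deb8u8+wmf1, 2.4.10-10+deb8u9)\n"
)

if __name__ == "__main__":
    for package, old, new in lost_wmf_builds(SAMPLE_LOG):
        print(f"{package}: {old} -> {new}")
```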

Per Moritz's explanation, the failing directive comes from the Apache config:

SSLOpenSSLConfCmd DHParameters "/etc/ssl/dhparam.pem"

This directive is only available when Apache is built against OpenSSL 1.0.2+.

https://httpd.apache.org/docs/trunk/en/mod/mod_ssl.html#sslcertificatefile states that the DH parameters can be added to the file of the first SSLCertificateFile. On the puppet master, though, that is probably a file generated and managed by puppet itself:

SSLCertificateFile      /var/lib/puppet/server/ssl/certs/deployment-puppetmaster02.deployment-prep.eqiad.wmflabs.pem

I thought about hacking:

  SSLHonorCipherOrder On
- SSLOpenSSLConfCmd DHParameters "/etc/ssl/dhparam.pem"
+ SSLCertificateFile /etc/ssl/dhparam.pem
   SSLCertificateFile      /var/lib/puppet/server/ssl/certs/deployment-puppetmaster02.deployment-prep.eqiad.wmflabs.pem
   SSLCertificateKeyFile   /var/lib/puppet/server/ssl/private_keys/deployment-puppetmaster02.deployment-prep.eqiad.wmflabs.pem

But Apache fails:

AH02561: Failed to configure certificate deployment-puppetmaster02.deployment-prep.eqiad.wmflabs:8140:0, check /etc/ssl/dhparam.pem
SSL Library Error: error:0906D06C:PEM routines:PEM_read_bio:no start line (Expecting: CERTIFICATE) -- Bad file contents or format - or even just a forgotten SSLCertificateKeyFile?

As to why unattended-upgrades installs 2.4.10-10+deb8u9 over 2.4.10-10+deb8u8+wmf1: the unattended-upgrades configuration file apparently applies a different policy than apt-get:

/etc/apt/apt.conf.d/50unattended-upgrades
Unattended-Upgrade::Origins-Pattern {
        "origin=Debian,codename=${distro_codename},label=Debian-Security";
};
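
To illustrate the effect of that pattern, here is a simplified model of the selection, not the real unattended-upgrades algorithm. It assumes only versions from allowed origins are considered and the newest such version above the installed one wins. The origin and label values used for apt.wikimedia.org ("Wikimedia") are placeholders, and pick_upgrade is a hypothetical name.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Version(NamedTuple):
    version: str
    origin: str
    label: str


def pick_upgrade(table: List[Version], installed: str,
                 allowed: Set[Tuple[str, str]]) -> Optional[Version]:
    """table is ordered newest first, like the apt-cache policy version table.

    Return the newest entry above the installed version that comes from an
    allowed (origin, label) pair, or None if there is none.
    """
    for entry in table:
        if entry.version == installed:
            return None  # reached the installed version, nothing newer qualifies
        if (entry.origin, entry.label) in allowed:
            return entry
    return None


# Version table after the rebuild was published; Wikimedia origin/label are assumed.
TABLE = [
    Version("2.4.10-10+deb8u9+wmf1", "Wikimedia", "Wikimedia"),
    Version("2.4.10-10+deb8u9", "Debian", "Debian-Security"),
    Version("2.4.10-10+deb8u8+wmf1", "Wikimedia", "Wikimedia"),
    Version("2.4.10-10+deb8u8", "Debian", "Debian"),
]
INSTALLED = "2.4.10-10+deb8u8+wmf1"
SECURITY_ONLY = {("Debian", "Debian-Security")}
WITH_WIKIMEDIA = SECURITY_ONLY | {("Wikimedia", "Wikimedia")}

if __name__ == "__main__":
    print(pick_upgrade(TABLE, INSTALLED, SECURITY_ONLY))
    print(pick_upgrade(TABLE, INSTALLED, WITH_WIKIMEDIA))
    # Rebuild not yet published: drop the first row of the table.
    print(pick_upgrade(TABLE[1:], INSTALLED, WITH_WIKIMEDIA))
```

Under this model the security-only pattern always selects the stock build. Allowing the Wikimedia origin selects the rebuild once it is published, but the stock build can still win during the window between the Debian release and the rebuild.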

On the CI puppetmaster (integration-puppetmaster01.integration.eqiad.wmflabs) I went with some custom rules:

https://gerrit.wikimedia.org/r/#/c/315079/
https://gerrit.wikimedia.org/r/#/c/315084/

Although abandoned, both are still applied on the puppet master. Might be worth revisiting both patches?

On integration-puppetmaster, the custom unattended-upgrades configuration selects the proper upgrade, (2.4.10-10+deb8u9+wmf1) over (2.4.10-10+deb8u8+wmf1):

/var/log/apt/term.log
Log started: 2017-07-03  06:30:09
(Reading database ... 65776 files and directories currently installed.)
Preparing to unpack .../libgcrypt20_1.6.3-2+deb8u4_amd64.deb ...
Unpacking libgcrypt20:amd64 (1.6.3-2+deb8u4) over (1.6.3-2+deb8u3) ...
Setting up libgcrypt20:amd64 (1.6.3-2+deb8u4) ...
Processing triggers for libc-bin (2.19-18+deb8u10) ...
Log ended: 2017-07-03  06:30:10

Log started: 2017-07-04  06:49:02
(Reading database ... 65776 files and directories currently installed.)
Preparing to unpack .../apache2_2.4.10-10+deb8u9+wmf1_amd64.deb ...
Unpacking apache2 (2.4.10-10+deb8u9+wmf1) over (2.4.10-10+deb8u8+wmf1) ...
Preparing to unpack .../apache2-bin_2.4.10-10+deb8u9+wmf1_amd64.deb ...
Unpacking apache2-bin (2.4.10-10+deb8u9+wmf1) over (2.4.10-10+deb8u8+wmf1) ...
Preparing to unpack .../apache2-utils_2.4.10-10+deb8u9+wmf1_amd64.deb ...
Unpacking apache2-utils (2.4.10-10+deb8u9+wmf1) over (2.4.10-10+deb8u8+wmf1) ...
Preparing to unpack .../apache2-data_2.4.10-10+deb8u9+wmf1_all.deb ...
Unpacking apache2-data (2.4.10-10+deb8u9+wmf1) over (2.4.10-10+deb8u8+wmf1) ...
Processing triggers for systemd (215-17+deb8u7) ...
Processing triggers for man-db (2.7.0.2-5) ...
Setting up apache2-bin (2.4.10-10+deb8u9+wmf1) ...
Setting up apache2-utils (2.4.10-10+deb8u9+wmf1) ...
Setting up apache2-data (2.4.10-10+deb8u9+wmf1) ...
Setting up apache2 (2.4.10-10+deb8u9+wmf1) ...
Log ended: 2017-07-04  06:49:12

Mentioned in SAL (#wikimedia-releng) [2017-07-04T14:10:56Z] <hashar> manually upgraded apache2 on deployment-puppetmaster02 see T159254

Change 315084 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] contint: unattended upgrade from distro

https://gerrit.wikimedia.org/r/315084

Mentioned in SAL (#wikimedia-operations) [2017-09-21T08:35:36Z] <moritzm> unbreak deployment-puppetmaster02 in deployment-prep (broken by unattended-upgrades update of apache T159254)

Change 315084 abandoned by Hashar:
contint: unattended upgrade from distro

Reason:
Generalized by Arturo with https://gerrit.wikimedia.org/r/#/c/389480/

https://gerrit.wikimedia.org/r/315084

Mentioned in SAL (#wikimedia-cloud) [2018-04-06T11:23:24Z] <arturo> manually upgrade apache2 on tools-puppemaster for T159254

We just got hit by this again.

In this case, it was something like a timing issue. The first upgrade was:

Unpacking apache2 (2.4.10-10+deb8u12) over (2.4.10-10+deb8u11+wmf1)

But by the time I checked, we had 2.4.10-10+deb8u12 and 2.4.10-10+deb8u12+wmf1:

aborrero@tools-puppetmaster-01:~$ aptitude versions apache2
Package apache2:
p   2.4.10-10+deb8u11        oldstable          500
i   2.4.10-10+deb8u12        oldstable          500
p   2.4.10-10+deb8u12+wmf1   jessie-wikimedia   1001
[...]

So a second run of unattended-upgrades would have picked the right apache2 version. Note that the +wmf package has higher apt pinning priority.

Mentioned in SAL (#wikimedia-cloud) [2018-04-06T14:30:20Z] <arturo> add puppet class toollabs::apt_pinning to tools-puppetmaster-01 using horizon, to add some apt pinning related to T159254

Change 424603 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toollabs: apt_pinning: add apache2 package pinning

https://gerrit.wikimedia.org/r/424603

Change 424603 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toollabs: apt_pinning: add apache2 package pinning

https://gerrit.wikimedia.org/r/424603

aborrero claimed this task.