
Make apache/maintenance hosts TLS connections to mariadb work
Open, MediumPublic

Description

There are no puppet CA-signed client certs on these hosts that non-root users can read, for use from MediaWiki or the mysql client. These will be needed for the ssl_set() call in DatabaseMysqli.php.

EDIT: Actually CA-only (by directory or CA file) is supported by mysqli, though not really documented (there's a bug somewhere about that), so this should work without client certs/keys.

Event Timeline

aaron created this task.Sep 12 2017, 8:43 AM
aaron updated the task description.

I can help with this, but I will need supervision to understand the whole privileges model for application servers so that I don't do something risky.

Gilles moved this task from Inbox to Radar on the Performance-Team board.Sep 12 2017, 9:14 AM
Gilles edited projects, added Performance-Team (Radar); removed Performance-Team.
jcrespo moved this task from Triage to Backlog on the DBA board.Sep 12 2017, 4:21 PM
faidon added a subscriber: faidon.Sep 26 2017, 12:05 PM

I may be missing something, but why do we need client certificates? Just setting the CA path to /etc/ssl/certs and the rest of the arguments to NULL should suffice, I think?

We do not need client certs; we need the "public" CA to be available to the clients. That is trivial to do, but it needs a puppet patch that I have not yet gotten around to writing.

That isn't needed. We import the puppet CA to the host's certificate store in base and it should thus be available as /etc/ssl/certs/Puppet_Internal_CA.pem. Instead of using that though, the preferred, future-proof way to support it would be just using the (c_rehashed) /etc/ssl/certs as the CA path (in OpenSSL applications), or as /etc/ssl/certs/ca-certificates.crt (in GnuTLS/NSS applications).
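The CA-only idea above (trust a CA directory or bundle, pass no client key or certificate) can be sketched with Python's standard `ssl` module. This only illustrates the principle; it is not the mysqli code path discussed in this task, and the helper name `build_ca_only_context` is hypothetical.

```python
import ssl


def build_ca_only_context(cafile=None, capath=None):
    """Return a client-side TLS context that verifies the server against
    trusted CAs and presents no client certificate.

    With both arguments left as None, the system default trust store is used.
    """
    # create_default_context() turns on certificate and hostname verification
    # for Purpose.SERVER_AUTH. load_cert_chain() is never called, so no
    # client key or certificate is presented to the server.
    return ssl.create_default_context(
        purpose=ssl.Purpose.SERVER_AUTH, cafile=cafile, capath=capath
    )


if __name__ == "__main__":
    # On the hosts discussed here one would pass capath="/etc/ssl/certs"
    # (assumption: the directory exists and is c_rehashed, as stated above).
    ctx = build_ca_only_context()
    print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

Passing `capath` rather than one specific CA file follows the "future-proof" suggestion above: the client keeps working if the CA file is renamed or another CA is added to the store.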

Joe added a subscriber: Joe.Sep 26 2017, 12:24 PM

No puppet patch is needed, if you just need the CA cert available to clients. It is, in fact, readable by all:

oblivian@mw1254:~$ ls -la /etc/ssl/certs/Puppet_Internal_CA.pem
lrwxrwxrwx 1 root root 55 Sep 16  2016 /etc/ssl/certs/Puppet_Internal_CA.pem -> /usr/local/share/ca-certificates/Puppet_Internal_CA.crt
oblivian@mw1254:~$ ls -la /usr/local/share/ca-certificates/Puppet_Internal_CA.crt
-r--r--r-- 1 root root 1895 Sep 16  2016 /usr/local/share/ca-certificates/Puppet_Internal_CA.crt

Specifically:

oblivian@mw1254:~$ sudo -u www-data cat /etc/ssl/certs/Puppet_Internal_CA.pem
-----BEGIN CERTIFICATE-----
[CUT]
-----END CERTIFICATE-----
Joe added a comment.EditedSep 26 2017, 12:33 PM

Looking at http://php.net/manual/en/mysqli.ssl-set.php, I would think you'd only need to set capath=/etc/ssl/certs, while setting all other parameters to NULL (except maybe cipher, as I have no idea what is the actual default cipherlist for mysqli on HHVM).

@faidon, cool! Less work for us :-) As you can imagine, I didn't have much time to take a proper look at it or at how puppet handles its certs.

@Joe so s/puppet patch/mediawiki config patch/ :-)

In fact, as this is not going to be enabled on all hosts yet, performance already has access on mwdebug/terbium so this is already resolved, right? @aaron You only need "directions"?

aaron added a comment.Sep 26 2017, 3:28 PM

Looking at http://php.net/manual/en/mysqli.ssl-set.php, I would think you'd only need to set capath=/etc/ssl/certs, while setting all other parameters to NULL (except maybe cipher, as I have no idea what is the actual default cipherlist for mysqli on HHVM).

I tried that first but it yields "SSL connection error: SSL_CTX_set_default_verify_paths failed (10.192.32.108)".

I can connect using python just by pointing to the CA cert: https://gerrit.wikimedia.org/r/#/c/354206/1/wmfmariadbpy/WMFMariaDB.py (line 51), so I haven't dreamed it :-)

There could be some problems here:

  • the password is in an old format/other server issues (I can check that)
  • the client is not linked against/compatible with a modern OpenSSL implementation (for example, old versions of the mysql client are linked against yaSSL, which has very weak ciphers). My wmf-client package has both the binary clients and C connectors with a working implementation
  • Something else is not working right
aaron added a comment.Oct 3 2017, 11:01 PM

Looking at http://php.net/manual/en/mysqli.ssl-set.php, I would think you'd only need to set capath=/etc/ssl/certs, while setting all other parameters to NULL (except maybe cipher, as I have no idea what is the actual default cipherlist for mysqli on HHVM).

I tried that first but it yields "SSL connection error: SSL_CTX_set_default_verify_paths failed (10.192.32.108)".

I forgot to mention, using the CA file parameter instead of the CA directory (ca=/etc/ssl/certs/Puppet_Internal_CA.pem) gives "SSL connection error: unknown error number (10.192.32.108)".

aaron renamed this task from Make client certs available for apache/maintenance hosts for TLS connections to mariadb to Make apache/maintenance hosts TLS connections to mariadb work.Oct 4 2017, 7:07 PM
Joe added a comment.Oct 4 2017, 7:54 PM

So what I extract from the errors is you're trying to connect to db2048 by IP and not by hostname, and the certificates we expose for mysql do not include verification information for the ip address in its SAN. In fact, I don't think we ever did add that info to our certs.

So if we had the hostname instead of the IP in db-codfw.php, it should work. I think performance was a reason for using IPs instead of hostnames there, so we might need to reissue the certificates if we want to keep using IPs. I think the implications for DBAs would be a huge amount of maintenance work.
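The reasoning above (a certificate whose SAN lists only DNS names cannot be verified against a bare IP address) can be illustrated with a small check. This is a simplified sketch, not the verification code any real client uses: it does exact matching only, the helper name `san_covers` is hypothetical, and the example certificate contents (including the FQDN) are assumptions made for illustration.

```python
import ipaddress


def san_covers(cert, target):
    """Return True if `target` (hostname or IP) appears in the certificate's
    subjectAltName. Exact matches only; wildcard names are not handled."""
    try:
        wanted_ip = ipaddress.ip_address(target)
    except ValueError:
        wanted_ip = None  # target is a hostname, not an IP literal

    for kind, value in cert.get("subjectAltName", ()):
        if wanted_ip is not None:
            # IP targets may only match "IP Address" entries, never DNS ones.
            if kind == "IP Address" and ipaddress.ip_address(value) == wanted_ip:
                return True
        elif kind == "DNS" and value.lower() == target.lower():
            return True
    return False


# Hypothetical certificate with a DNS entry only, in the dict shape that
# ssl.SSLSocket.getpeercert() returns.
EXAMPLE_CERT = {"subjectAltName": (("DNS", "db2048.codfw.wmnet"),)}

if __name__ == "__main__":
    print(san_covers(EXAMPLE_CERT, "db2048.codfw.wmnet"))
    print(san_covers(EXAMPLE_CERT, "10.192.32.108"))
```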

aaron updated the task description.Oct 4 2017, 8:33 PM
aaron added a comment.EditedOct 4 2017, 8:47 PM

So what I extract from the errors is you're trying to connect to db2048 by IP and not by hostname, and the certificates we expose for mysql do not include verification information for the ip address in its SAN. In fact, I don't think we ever did add that info to our certs.

So if we had the hostname instead of the IP in db-codfw.php, it should work. I think performance was a reason for using IPs instead of hostnames there, so we might need to reissue the certificates if we want to keep using IPs. I think the implications for DBAs would be a huge amount of maintenance work.

I'm directly using the Database class. Trying the IP gives the same error unfortunately.

jcrespo added a comment.EditedOct 5 2017, 4:20 PM

I would suggest setting up a proxysql instance to move this forward, maybe on terbium itself, as a test? That way we can unblock this without actually making any changes to mediawiki or the databases themselves. Would that work for you? After all, this was plan B, if plan A didn't scale.

aaron added a comment.Oct 5 2017, 10:06 PM

We discussed proxies in the last performance meeting and we're OK with that (it would cut down on handshake latency anyway).

I will set that up and ping you. Independently of this, not supporting TLS 1.2 (if that is the issue) is a huge problem and something we should investigate and fix, upgrade, or work around separately.

Mentioned in SAL (#wikimedia-operations) [2017-10-17T11:32:18Z] <jynus> test-installing proxysql on wasat T175672

Change 384695 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] proxysql: Setup proxysql on terbium/wasat as a test

https://gerrit.wikimedia.org/r/384695

Looking at http://php.net/manual/en/mysqli.ssl-set.php, I would think you'd only need to set capath=/etc/ssl/certs, while setting all other parameters to NULL (except maybe cipher, as I have no idea what is the actual default cipherlist for mysqli on HHVM).

I tried that first but it yields "SSL connection error: SSL_CTX_set_default_verify_paths failed (10.192.32.108)".

I think this is because the version of yaSSL that MySQL bundles is blatantly buggy when the filename is null but the path is not null:

int SSL_CTX_load_verify_locations(SSL_CTX* ctx, const char* file,
                                  const char* path)
{
    int       ret = SSL_FAILURE;
    const int HALF_PATH = 128;

    if (file) ret = read_file(ctx, file, SSL_FILETYPE_PEM, CA);

    if (ret == SSL_SUCCESS && path) {
        // call read_file for each reqular file in path

In the current wolfSSL, ret is initially WOLFSSL_SUCCESS, so path can be used when file is NULL.
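The difference described above can be modelled in a few lines. This is a toy model of the control flow in the quoted snippet, not the real library code: `read_file_stub` is a hypothetical stand-in that always succeeds, the constant values are placeholders, and the only thing that varies is the initial value of `ret`.

```python
SSL_FAILURE = 0  # placeholder values for illustration
SSL_SUCCESS = 1


def read_file_stub(name):
    """Stand-in for the library's read_file(); pretends every load succeeds."""
    return SSL_SUCCESS


def load_verify_locations(file, path, initial_ret):
    """Model of the control flow quoted above.

    initial_ret is SSL_FAILURE for the bundled yaSSL and SSL_SUCCESS for the
    behaviour described for current wolfSSL.
    """
    ret = initial_ret
    if file is not None:
        ret = read_file_stub(file)
    # The CA directory is only scanned when ret is already SSL_SUCCESS.
    if ret == SSL_SUCCESS and path is not None:
        ret = read_file_stub(path)
    return ret


if __name__ == "__main__":
    # file=None, path set: the yaSSL-style initial value never reaches the
    # directory branch, while the wolfSSL-style initial value does.
    print(load_verify_locations(None, "/etc/ssl/certs", SSL_FAILURE))
    print(load_verify_locations(None, "/etc/ssl/certs", SSL_SUCCESS))
```

With a NULL file and a non-NULL path, the yaSSL-style call returns failure without ever looking at the directory, which is consistent with the capath-only error reported in this task.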

I forgot to mention, using the CA file parameter instead of the CA directory (ca=/etc/ssl/certs/Puppet_Internal_CA.pem) gives "SSL connection error: unknown error number (10.192.32.108)".

strace shows that the server in this case returns "#08S01 Bad handshake". This could indeed be due to protocol version, since even the current version of yaSSL does not support TLS 1.2. HHVM is not affected by this, and connects successfully, presumably because it bundles its own MySQL client library rather than using the one from Debian jessie. I've confirmed that the libmysqlclient that we use for PHP 5.6 does not link to libssl, so it was presumably compiled without OpenSSL support. A workaround would be to recompile libmysqlclient with WITH_SSL=system, or to stop using PHP 5.6 for maintenance scripts, instead migrating to HHVM or PHP 7.0.

I think this is because the version of yaSSL that MySQL bundles

since even the current version of yaSSL does not support TLS 1.2

We compile MariaDB Server with OpenSSL only, so I assume you are referring to the client-only compilation. I highly recommend moving to OpenSSL; it is not possible to provide a sane setup otherwise. I have client packages ready (called wmf-mariadb101-client), or

stop using PHP 5.6 for maintenance scripts, instead migrating to HHVM or PHP 7.0

As Tim says.

Change 384695 merged by Jcrespo:
[operations/puppet@production] proxysql: Setup proxysql on terbium/wasat as a test

https://gerrit.wikimedia.org/r/384695

Change 392651 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] proxysql: Make proxy configuration non-readable for all users

https://gerrit.wikimedia.org/r/392651

Change 392651 merged by Jcrespo:
[operations/puppet@production] proxysql: Make proxy configuration non-readable for all users

https://gerrit.wikimedia.org/r/392651

@aaron the proxy is installed but unconfigured. We still have to fix some issues with the start and process, but do you want me to point it to the real master? Or do you want me to point it to a soon-to-be-set-up master test host?

Change 392674 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] proxysql: Enable systemd support so it starts as non-root

https://gerrit.wikimedia.org/r/392674

Change 392674 merged by Jcrespo:
[operations/puppet@production] proxysql: Enable systemd support so it starts as non-root

https://gerrit.wikimedia.org/r/392674

Change 392685 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] proxysql: fix /etc/proxysql.cfg permissions

https://gerrit.wikimedia.org/r/392685

Change 392685 merged by Jcrespo:
[operations/puppet@production] proxysql: fix /etc/proxysql.cfg permissions

https://gerrit.wikimedia.org/r/392685

Change 392689 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] proxysql: Fix /var/lib/proxysql permissions and move .my.cnf to profile

https://gerrit.wikimedia.org/r/392689

Change 392689 merged by Jcrespo:
[operations/puppet@production] proxysql: Fix /var/lib/proxysql permissions and move .my.cnf to profile

https://gerrit.wikimedia.org/r/392689

Change 392694 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] proxysql: Fix invalid puppet dependency cycle

https://gerrit.wikimedia.org/r/392694

Change 392694 merged by Jcrespo:
[operations/puppet@production] proxysql: Fix invalid puppet dependency cycle

https://gerrit.wikimedia.org/r/392694

Blocked on getting answers to the questions at T175672#3778177.

@aaron - see note from Jaime above, he's waiting on answers from you about where to point the proxy.

aaron added a comment.Dec 12 2017, 6:29 PM

@aaron the proxy is installed but unconfigured. We still have to fix some issues with the start and process, but do you want me to point it to the real master? Or do you want me to point it to a soon-to-be-set-up master test host?

I don't see a reason that it has to be a real master. A local and foreign replica would do (as long as I know which one it's going to), though it's the first case that I care most about. Maybe I could just run it in screen with the remote and foreign case config, if that's a hassle (e.g. needing two proxysql instances or something).

A local and foreign replica would do

It is installed on both maintenance servers (eqiad and codfw), so that is the easiest to do. I will hopefully get it set up tomorrow.

Change 398007 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Create profile::client for non-root mariadb clients

https://gerrit.wikimedia.org/r/398007

Change 398007 merged by Jcrespo:
[operations/puppet@production] mariadb: Create profile::client for non-root mariadb clients

https://gerrit.wikimedia.org/r/398007

Change 398023 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] proxysql: Add proxysql user to mysql group for tls certs access

https://gerrit.wikimedia.org/r/398023

Change 398023 merged by Jcrespo:
[operations/puppet@production] proxysql: Add proxysql user to mysql group for tls certs access

https://gerrit.wikimedia.org/r/398023

Change 398056 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1067 for maintenance

https://gerrit.wikimedia.org/r/398056

Change 398056 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Depool db1067 for maintenance

https://gerrit.wikimedia.org/r/398056

@aaron After spending a whole day on this: proxysql is technically installed on terbium and wasat, both pointing to db1067 from 2 separate datacenters, but I am unable to make it speak TLS 1.2 or to downgrade the mysql server to speak a common pre-TLS 1.2 protocol. I think the easiest way would be to use the latest OpenSSL-compiled library found at /opt/wmf-mariadb101-client/lib. I do not think we are going to get proxysql to speak TLS 1.2 any time soon: https://github.com/sysown/proxysql/issues/1247

jcrespo added a comment.EditedDec 13 2017, 4:33 PM

I think because maintenance hosts use php (?) it should be easier for it to work:

jynus@wasat:~$ sudo php test.php 
Array
(
    [@@ssl_cipher] => TLSv1.2
)
jynus@wasat:~$ cat test.php
<?php

$mysqli = new mysqli();
$mysqli->init();

$mysqli->ssl_set('/etc/mysql/ssl/server.key', '/etc/mysql/ssl/cert.pem', '/etc/ssl/certs/Puppet_Internal_CA.pem', NULL, 'TLSv1.2');

$mysqli->real_connect('db1067.eqiad.wmnet', 'wikiadmin', '<wikiadmin-pass>', 'information_schema', 3306) or die ('Could not connect');

$res = $mysqli->query('SELECT @@ssl_cipher');

print_r($res->fetch_assoc());

$mysqli->close();

I have left a copy of cert.pem and server.key (client certificates) in your home directory, as right now it is hardcoded under the mysql user and I do not know which user they should be made available to, so that they can be read by the app but not by everyone else. /etc/ssl/certs/Puppet_Internal_CA.pem should already be public.

Change 431742 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/software@master] mariadb: Upgrade proxysql package

https://gerrit.wikimedia.org/r/431742

Change 431742 merged by Jcrespo:
[operations/software@master] mariadb: Upgrade proxysql package

https://gerrit.wikimedia.org/r/431742

mobrovac added a subscriber: mobrovac.

Would the next step here be puppetising the generation/dissemination of certs?