
Upgrade deployment-prep MediaWiki clusters to Buster
Open, Needs Triage, Public

Description

Per T245757 and T278641

Progress

  • deployment-mediawiki-07
  • deployment-mediawiki-09
    • mediawiki-[07-09]: replaced by deployment-mediawiki11
  • deployment-jobrunner03
    • replaced by deployment-jobrunner04
  • deployment-parsoid11 (see T268524)
    • replaced by deployment-parsoid12
    • might need to be reimaged again, due to parsoid/js leftovers

Event Timeline

Mentioned in SAL (#wikimedia-releng) [2021-03-29T05:40:07Z] <Majavah> move role::labs::lvm::srv puppet classes from deployment-mediawiki- prefix to current individual appservers, T278664

I installed deployment-mediawiki10, and it is returning HTTP 500 errors:

PHP fatal error: <br/> Uncaught ConfigException: Failed to load configuration from etcd: (curl error: 60) SSL peer certificate or SSH remote key was not OK in /srv/mediawiki/php-master/includes/config/EtcdConfig.php:205

Trying to connect to deployment-etcd02 fails:

root@deployment-mediawiki10:/srv/mediawiki# curl https://deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud:2379
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

root@deployment-mediawiki10:/srv/mediawiki# openssl s_client -showcerts deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud:2379
CONNECTED(00000003)
depth=0 CN = deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 CN = deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud
verify error:num=21:unable to verify the first certificate
verify return:1
---
Certificate chain
 0 s:CN = deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud
   i:CN = Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs
-----BEGIN CERTIFICATE-----
MIIFyTCCA7GgAwIBAgICAO0wDQYJKoZIhvcNAQELBQAwTTFLMEkGA1UEAwxCUHVw
cGV0IENBOiBkZXBsb3ltZW50LXB1cHBldG1hc3RlcjAzLmRlcGxveW1lbnQtcHJl
cC5lcWlhZC53bWZsYWJzMB4XDTIxMDMwNDEzMzkzMFoXDTI2MDMwNDEzMzkzMFow
QzFBMD8GA1UEAww4ZGVwbG95bWVudC1ldGNkMDIuZGVwbG95bWVudC1wcmVwLmVx
aWFkMS53aWtpbWVkaWEuY2xvdWQwggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIK
AoICAQDxTVcgzuqjXc1hluHrFe9LyyQLp1U67/wQnEqc/gc6wQuCIpQ3yysyaEc9
Wq1o4m/nNov6cnch72rTAeJg6LJ8ayCIk9MraEvRleTQVD0b+xXDnOeO9k07X+bF
/L84ZBzwusmKmPa+Gp6aIatKu7A3e30z2ttaIx/huOE3qQd7jtcxEgQbyiAG3Mha
f/FB/fOLWkCsf5ffLts9U6Faeb7UI+FGbm603MBa7qXdQFH33c7k14JUuPZhG83r
K95CUKt6vOK3ER950oPlpj+Xaykvo26ig1JA67oxiilP3oE4xwKmBl8j9edOKu3F
O7DcXwtvDXyDcEKtgRbdKkQDMBjpzjkCTs1ivO8b2yFC2kwtlJyxFvho24+ZYQoR
OiKoZQtEFTFAxyJQPwYY3v0i10hAQDlITfgGzKDNyoSX1xW08lC7KJz1aXCplWhb
7ff/iOo5gMdnOjz08FuV56tkFkz9WqGHGvGM+jsCwijCmOs30t/n/fFjAO3BDaeZ
leYBVQuVt6D8kQwe+p/IVu+2h7d84NuFGUl8vLWGDcPjqtyoM0NAUpkbsDp6KvXg
jSwN4fUaXaAOSJk14uHCwsY5x8DmpZfTdu3iKILq6lIz0ObRD5yB/0sj5w0YRFnu
fsLgyZsgI4VHZ51oFq6QWAPnoiwzbqTpp/+YCLK9podfEU+xQQIDAQABo4G8MIG5
MDcGCWCGSAGG+EIBDQQqDChQdXBwZXQgUnVieS9PcGVuU1NMIEludGVybmFsIENl
cnRpZmljYXRlMA4GA1UdDwEB/wQEAwIFoDAgBgNVHSUBAf8EFjAUBggrBgEFBQcD
AQYIKwYBBQUHAwIwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQUGMk90Y7PIc/sAkOx
ujQF+YIsFnMwHwYDVR0jBBgwFoAUr+w3dsHuo92YwFYgNbSe6yBvw9swDQYJKoZI
hvcNAQELBQADggIBAEfnCRHepFiboxLudPbP/Z4srJ5DxJYvLwc7xWkTNTJxsW3p
VHAZ3KSpp6V5BYDPJ7trdwiFK35GpROKGqryVADqwyi6ygsqpAf2bS/EXai5dk6o
ngFBgAFEADDCCcFDs4Sq3CxBUpOU3H3BA5xIZXKQxbfJrLNFTPZ+Ylv8Xl1O5u1S
IFBX0js852G0jrbayoTPPARYdhyj3o91Qln9WoE6TW/omoCgYQ8ehpvPHuhJj6Q+
LttQjsl+m+CYYAbhjygGWWnDChouY0hSC9jA3UiG5dSb6bIy1uLLs+NOf4BHxCrx
JnpG8dsGVac12P67rmtD6ynTsYFxckvXlBLUWBF/Bxg5o4mstVSd34WjJDYDPgc3
HU+reKlCN9+8kOZjCnHyxQVUTyeFQTG1uOF+pEE6nd3cIWuuO9AJnYFCLrlePrJm
oq5f2MHphH2aquikU5KDrJ5fbLovks5KFyNTzQlu1qEKC4D8vsUwjhdks4mU2YGq
VHmGv0qnxFHNV2iCIR739jTdluNK6KU0/L2f4vqIun3WRScJAm222PiWpqEhyTZj
cXhaDgR/N7uomT74EIjN1dIEC5MC/vr4qNR9muAIe2upf2WkLdlcxzftnv9C52xy
A7fN3FIuKv+3dysprZK3rb7nMYpIGNamObEO9wPM+m7S6aqS7SxXi1zJoNgS
-----END CERTIFICATE-----
---
Server certificate
subject=CN = deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud

issuer=CN = Puppet CA: deployment-puppetmaster03.deployment-prep.eqiad.wmflabs

---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 2314 bytes and written 441 bytes
Verification error: unable to verify the first certificate
---
New, TLSv1.2, Cipher is ECDHE-RSA-AES128-GCM-SHA256
Server public key is 4096 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES128-GCM-SHA256
    Session-ID: 70C3F449C8BABC63A22E0F4B965F9E4AD374B54DD2BD5B35721DA49B8DA138BC
    Session-ID-ctx:
    Master-Key: C0F90B0BA41954363E6EFE6AE98EDE05A46CBC9380C2ED64ADFCE6C0C18278C8A1861A065FD0FA3FDB8ED93E93FD34E0
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket:
    0000 - ff 16 a3 e8 6e 09 21 01-2d 9b e8 74 91 97 29 4b   ....n.!.-..t..)K
    0010 - a6 25 0e b7 db 7b 1c 5f-b9 75 cf c3 5b 00 c5 f8   .%...{._.u..[...
    0020 - 42 e8 58 0b 1d d2 98 d0-ae 4a 60 c6 2f 92 af 66   B.X......J`./..f
    0030 - 81 93 2b 67 02 2a 10 b9-08 b2 74 fb 04 7c f4 05   ..+g.*....t..|..
    0040 - ea ca 4a 4b 0f 08 44 c8-0a 58 31 92 17 51 50 6d   ..JK..D..X1..QPm
    0050 - 1f 90 b7 86 29 22 23 55-8a 54 5d 6c 3e 35 7d 9b   ....)"#U.T]l>5}.
    0060 - ff 53 ca 9b b9 e6 88 e6-5e ff ec ea 0f d2 cd 29   .S......^......)
    0070 - 8a 02 5e 15 80 48 a4 2a-                          ..^..H.*

    Start Time: 1617001122
    Timeout   : 7200 (sec)
    Verify return code: 21 (unable to verify the first certificate)
    Extended master secret: no
---

The same request works fine on other servers:

taavi@deployment-puppetmaster04:~$ curl https://deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud:2379
404 page not found

taavi@deployment-mediawiki-07:~$ curl https://deployment-etcd02.deployment-prep.eqiad1.wikimedia.cloud:2379
404 page not found
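
For readers unfamiliar with verify errors 20 and 21 in the output above: they generally mean the client could not find the certificate's issuer (here the Puppet CA) in its local trust store. The sketch below reproduces that situation offline with a throwaway CA, so it does not touch deployment-prep at all. The names `Example Puppet CA` and `etcd.example.test` are made up for illustration, and this is not a statement of how the issue on deployment-mediawiki10 was actually fixed.

```shell
#!/bin/sh
# Offline sketch: reproduce "unable to get local issuer certificate"
# using a throwaway CA. All names below are hypothetical.
set -eu

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# 1. Create a self-signed CA (stands in for the Puppet CA).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=Example Puppet CA" \
    -keyout "$tmp/ca.key" -out "$tmp/ca.pem" 2>/dev/null

# 2. Create a key and signing request for a leaf (stands in for the etcd host).
openssl req -new -newkey rsa:2048 -nodes \
    -subj "/CN=etcd.example.test" \
    -keyout "$tmp/leaf.key" -out "$tmp/leaf.csr" 2>/dev/null

# 3. Sign the leaf certificate with the throwaway CA.
openssl x509 -req -in "$tmp/leaf.csr" \
    -CA "$tmp/ca.pem" -CAkey "$tmp/ca.key" -CAcreateserial \
    -days 1 -out "$tmp/leaf.pem" 2>/dev/null

# 4. Verify without the CA in the trust store (expected to fail),
#    then with the CA passed explicitly (expected to succeed).
untrusted_out=$(openssl verify "$tmp/leaf.pem" 2>&1 || true)
trusted_out=$(openssl verify -CAfile "$tmp/ca.pem" "$tmp/leaf.pem" 2>&1 || true)

echo "without CA: $untrusted_out"
echo "with CA:    $trusted_out"
```

On a real host, the analogous check would be to pass the issuing CA to `openssl s_client` with `-CAfile` (or to `curl` with `--cacert`) and see whether verification then succeeds. Where that CA file lives on a given instance is not stated in this task.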

The certificate issues were solved, and deployment-mediawiki10 was deleted after we used it to debug them. deployment-mediawiki11 is now running, and I'm planning to switch traffic to it in a moment.

Mentioned in SAL (#wikimedia-releng) [2021-03-29T13:04:32Z] <Majavah> cherry pick https://gerrit.wikimedia.org/r/c/operations/puppet/+/675503/ on deployment-puppetmaster04 (T278664), also apply same change on horizon. this will switch traffic from deployment-mediawiki-07 to deployment-mediawiki11

Mentioned in SAL (#wikimedia-releng) [2021-03-30T07:26:09Z] <Majavah> shutoff deployment-mediawiki-09 T278664

Mentioned in SAL (#wikimedia-releng) [2021-04-07T14:27:16Z] <Majavah> delete deployment-mediawiki-07 and deployment-parsoid11 T278664

Mentioned in SAL (#wikimedia-releng) [2021-04-09T09:51:25Z] <Majavah> deleting deployment-jobrunner03 T278664