See parent task for more details, but in short we need to make sure SSL libraries are up to date on our platforms.
- Cloud VPS should be dealt with by unattended upgrades
- Toolforge containers
- PAWS user container
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | TheDJ | T283164 Let's Encrypt issuance chains update |
| Resolved | | SLyngshede-WMF | T283165 OpenSSL < 1.1.0 compatibility issues with new LE issuance chain |
| Resolved | | None | T291387 Ensure Cloud Services platforms will accept new LE issuance chain |
| Resolved | BUG REPORT | Legoktm | T292263 Expired certificate error from CropTool when attempting OAuth login to mediawiki.org |
| Resolved | | aborrero | T292289 Toolforge mono version on stretch grid doesn't trust latest LE certs |
| Resolved | BUG REPORT | Chicocvenancio | T292355 video2commons login is broken by the LE cert expiry (py2.7) |
The expected version numbers are
openssl1.0: 1.0.2u-1~deb9u5
gnutls28: 3.5.8-5+deb9u6
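A quick way to compare a host or container against these versions is to ask dpkg directly. The following is only a sketch under stated assumptions: the binary package names `libssl1.0.2` and `libgnutls30` are my guess at what the `openssl1.0` and `gnutls28` source packages install on stretch, and the helper names are made up for illustration.

```python
import subprocess

# Expected minimum versions from this task. The binary package names are an
# assumption (source packages openssl1.0 and gnutls28 on stretch).
EXPECTED = {
    "libssl1.0.2": "1.0.2u-1~deb9u5",
    "libgnutls30": "3.5.8-5+deb9u6",
}


def installed_version(package):
    """Return the installed version of a Debian package, or None."""
    try:
        proc = subprocess.run(
            ["dpkg-query", "-W", "-f=${Version}", package],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except FileNotFoundError:
        return None  # dpkg-query is missing, so this is not a dpkg-based system
    if proc.returncode != 0:
        return None  # dpkg does not know this package
    return proc.stdout.strip() or None


def is_at_least(installed, expected):
    """Compare two Debian version strings using dpkg's own ordering."""
    proc = subprocess.run(
        ["dpkg", "--compare-versions", installed, "ge", expected]
    )
    return proc.returncode == 0


def check_tls_packages(expected=EXPECTED):
    """Map each package to 'ok', 'outdated', or 'not installed'."""
    report = {}
    for package, minimum in expected.items():
        current = installed_version(package)
        if current is None:
            report[package] = "not installed"
        elif is_at_least(current, minimum):
            report[package] = "ok"
        else:
            report[package] = "outdated"
    return report
```

Calling `check_tls_packages()` inside a container returns one status per package. Version ordering is delegated to `dpkg --compare-versions` because Debian version strings (with `~` and `+`) do not sort correctly as plain text.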
We will probably need to do something like T194665: Provide an up-to-date mono environment on toolforge to get a mono that works. h/t @Legoktm for the past link.
The Jessie-based containers in Toolforge are likely all broken by the change and not worth the effort to fix. I was able to "fix" T292243: POI/marker disappeared on Wikivoyage maps generated on Toolforge by moving the tool from the php5.6 container to a php7.4 container.
Jessie container use in Toolforge as of 2021-09-30T20:40Z (from https://k8s-status.toolforge.org/images/):
| image | active pods |
|---|---|
| node6-sssd-base | 1 |
| node6-sssd-web | 13 |
| php5-sssd-web | 216 |
| python2-sssd-base | 4 |
| python2-sssd-web | 24 |
| python34-sssd-web | 88 |
| ruby21-sssd-web | 2 |
| TOTAL | 348 |
@aborrero Do you have time to do the needful for this? We need a mono on the bastions and grid that can handle the LE trust changes. I'm not sure what version that will be, but I'd hope it mostly comes down to which TLS library it is linked against.
Alternately we could try using cert-sync to just update the trust store (https://www.mono-project.com/docs/faq/security/).
Re: Mono - In case there ends up being no great option within Mono itself: you could also configure a generic outbound HTTPS proxy on the same host, using a proxy with a working HTTPS implementation.
I've gotten reports of users (two students in Wiki Education courses) getting an invalid cert error from visiting en.wikipedia.org. Is that perhaps because of some gadget traffic going to Toolforge while the user is on en.wikipedia.org, or are there also possible related cert problems on Wikipedia itself?
Yes, the production wikis also use Let's Encrypt certificates at some of our edge servers. That is actually the cause of most of the Toolforge/Cloud VPS issues (tools failing to connect to the content wikis). See https://meta.wikimedia.org/wiki/HTTPS/2021_Let%27s_Encrypt_root_expiry for information about expected problems (which, funnily enough, you will have a hard time viewing if you are affected).
Thanks, that's helpful. So this is probably breaking Wikipedia for like ~2% of Mac OS users, ~0.3% of overall desktop users? Yikes! (Plus about that many who are still on even older versions, which I guess were already incompatible with Wikipedia.)
Change 725486 had a related patch set uploaded (by Bstorm; author: Bstorm):
[operations/docker-images/toollabs-images@master] openssl: update stretch container TLS libraries before using LE certs
Change 725486 merged by jenkins-bot:
[operations/docker-images/toollabs-images@master] openssl: update stretch container TLS libraries before using LE certs
Mentioned in SAL (#wikimedia-cloud) [2021-10-03T21:29:13Z] <bstorm> rebuilt stretch containers for potential issues with LE cert updates T291387
Mentioned in SAL (#wikimedia-cloud) [2021-10-03T21:30:58Z] <bstorm> rebuilding buster containers since they are also affected T291387 T292355
Yes, Buster is affected if the image was built before the Buster 10.10 point release (i.e. before mid-June); see https://phabricator.wikimedia.org/T283165#7365637
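To confirm from inside a rebuilt container that the local trust store accepts a given endpoint, something like the following sketch can help. It uses only the Python 3 standard library; the function name and the three return values are made up for illustration, and it says nothing about other TLS stacks in the image (mono, GnuTLS clients, py2).

```python
import socket
import ssl


def check_tls(host, port=443, timeout=5.0):
    """Try a verified TLS handshake using the system trust store.

    Returns "ok", "tls-error" (handshake or certificate verification
    failed), or "unreachable" (DNS failure, refused connection, timeout).
    """
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"
    try:
        with context.wrap_socket(sock, server_hostname=host):
            return "ok"
    except (ssl.SSLError, ssl.CertificateError):
        return "tls-error"
    except OSError:
        return "unreachable"
    finally:
        sock.close()
```

For example, `check_tls("meta.wikimedia.org")` run from a tool's container shows whether that container's Python 3 can complete a verified handshake with the wikis.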
Hi there, just wanted to share that I worked around this issue in the py2 web situation by switching to PyOpenSSL, which brings along a newer version of OpenSSL. The changes were pretty minimal and can be seen here: https://github.com/hatnote/montage/commit/1be5d09ff5b80e2a57eb71802096fc1fcb98e60f
More papertrail available here.
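For readers who want to try the same approach, a minimal sketch of routing urllib3 (and therefore requests) through pyOpenSSL is shown below. This is an assumption about the general technique, not a copy of the linked commit: it relies on `urllib3.contrib.pyopenssl.inject_into_urllib3()`, and on very old requests releases that vendor urllib3 the module path may instead be `requests.packages.urllib3.contrib.pyopenssl`. The function name `use_pyopenssl` is made up.

```python
def use_pyopenssl():
    """Route urllib3 (and therefore requests) TLS through pyOpenSSL.

    Returns True if the injection happened, False if pyOpenSSL or the
    urllib3 contrib module is not usable in this environment.
    """
    try:
        import urllib3.contrib.pyopenssl as pyopenssl

        pyopenssl.inject_into_urllib3()
    except (ImportError, AttributeError):
        # ImportError: pyOpenSSL, cryptography, or the contrib module is missing.
        # AttributeError: mismatched pyOpenSSL/cryptography installs.
        return False
    return True
```

Call `use_pyopenssl()` once at startup, before the first HTTPS request is made.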
A technical detail which may be of some help: The Python on the Jessie image we were using was linking against OpenSSL 1.0.0, even though 1.0.2 was available, but openssl-dev appears to have been removed from the Wikimedia apt repo, so it was nontrivial to rebuild against the newer SSL.
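To see which OpenSSL a given interpreter is linked against, the standard `ssl` module exposes the version directly. A small sketch (the helper name is made up; the 1.1.0 threshold comes from T283165):

```python
import ssl


def linked_openssl():
    """Return (version string, version tuple) for the OpenSSL used by ssl."""
    return ssl.OPENSSL_VERSION, ssl.OPENSSL_VERSION_INFO


version, info = linked_openssl()
print(version)
# T283165 tracks compatibility problems for OpenSSL older than 1.1.0.
print("older than 1.1.0:", info[:3] < (1, 1, 0))
```

Note that this reports the library behind the standard `ssl` module only; pyOpenSSL ships its own OpenSSL through the `cryptography` package.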
Also surfacing a note on the workaround: this breaks some requests timeout behavior, so if you're relying on requests' timeout parameter, you may see some system errors (EAGAIN) instead of your expected behavior. Hope this helps!
That is a neat hack. Please do be aware that Python 2.7 was the last ever release of the 2.x series and it reached end of life upstream on 2020-01-01 (~22 months ago). Chances are really, really good that things like this will keep breaking for your Python 2.x projects. See https://docs.python.org/3/howto/pyporting.html for tips on porting py2 code to py3.
Hi Mahmoud,
We do have a -dev package for OpenSSL 1.1 on Jessie, but it's called libssl11-dev, not libssl-dev. For background: Jessie originally only provided OpenSSL 1.0.2, but back then we needed OpenSSL 1.1 to provide modern crypto for our TLS terminators. And since the APIs changed between OpenSSL 1.0 and 1.1, we had to provide a separate -dev package so that we could opt select applications in to OpenSSL 1.1.