operations/puppet | production | +1 -1 | icinga/planet: use letsencrypt check command for https cert monitoring
Indeed, it's auto-renewed by acme-chief, so we should tune those checks.
The new cert has been issued already and it's being staged to avoid client-side clock skew issues:
Jul 15 12:00:02 acmechief1001 acme-chief-backend: Staging_time will be enforced for unified / rsa-2048 till 2021-07-22 08:02:06
Should I just remove those checks or adjust them to stop caring about cert expiry? Or should they be kept but with a lower threshold? If traffic doesn't need those alerts anymore, we can just remove them.
@Vgutierrez: A good first task is a self-contained, non-controversial task with a clear approach. It should be well-described with pointers to help a completely new contributor. Given the current short task description I'm removing the good first task tag. Please add details on what exactly has to happen, where, and how for a new contributor, and then add back the good first task project tag. Thanks a lot in advance!
So, we still want to monitor whether TLS works on planet and phabricator; we just don't want to deal with cert expiry anymore. We probably need to create a new check command. That's one of the less obvious parts to get right in Icinga/Puppet, so I'm not sure about the "good first task" tag, but I will take it.
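To illustrate what "monitor that TLS works without alerting on upcoming expiry" could mean, here is a rough Python sketch. It is not the actual Icinga check command or anything from operations/puppet; the function name `check_tls` and the injectable `connect` parameter are made up for this example. It completes a verified TLS handshake and returns a Nagios-style exit code, with no days-until-expiry threshold. Note that a certificate that has already expired would still fail verification and therefore be reported as CRITICAL.

```python
import socket
import ssl

# Nagios/Icinga plugin exit codes.
OK = 0
CRITICAL = 2


def check_tls(host, port=443, timeout=10.0, connect=socket.create_connection):
    """Return (exit_code, message) after attempting a verified TLS handshake.

    Hypothetical helper for illustration only. There is deliberately no
    "expires in N days" threshold: the check only fails if the connection
    or the handshake (including chain and hostname verification) fails.
    `connect` is injectable so the logic can be exercised without network.
    """
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with connect((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return OK, f"OK - {host}:{port} negotiated {tls.version()}"
    except OSError as exc:
        # ssl.SSLError, certificate verification errors and socket
        # timeouts are all subclasses of OSError.
        return CRITICAL, f"CRITICAL - {host}:{port} TLS check failed: {exc}"
```

Something like `check_tls("en.planet.wikimedia.org")` would be the intended call; a real plugin would print the message and exit with the returned code.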
But the planet cert (https://en.planet.wikimedia.org/ and the other language subdomains) is still a DigiCert cert and not a Let's Encrypt cert.
That's why the fix isn't just replacing "check_ssl_http" with "check_ssl_http_letsencrypt", which is all it would have taken if the cert were a Let's Encrypt one.