Ensure your team is alerted well in advance of any SSL certificates that are expiring soon.
Whether you use an automated SSL certificate renewal system such as Let's Encrypt via certbot or cert-manager, manually renew certificates on your own via your domain registrar or another Certificate Authority, or rely on a hosting platform like Render or Cloudflare to manage SSL certificate renewal for you, an expired SSL certificate will create a total outage, blocking your https:// users from using your website and API.
Fortunately, it's easy to configure Heii On-Call to continuously monitor and notify your team about any soon-to-expire SSL certificates, so you have plenty of time to take action before there's a customer-visible outage.
Monitoring your SSL certificates is essential to anyone with an https:// website or API. Here's what might break:
http://example.com/.well-known/acme-challenge/*, a CA outage, a root certificate issue on your ACME client, an IPv6 change gone wrong, a Kubernetes issue breaking cert-manager, etc.
In modern web browsers, an expired SSL certificate results in your users seeing a big, scary warning when they try to visit your website:
API clients are rejected as well: their HTTP libraries raise exceptions such as ssl.SSLCertVerificationError or OpenSSL::SSL::SSLError, which their code may not be prepared to handle:
In both cases, this is a severe user-facing outage that makes your website or API inaccessible to your users. The outage continues until the SSL certificate is renewed by your team.
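As a rough illustration of the API-client side, here is a minimal Python sketch, using only the standard library, of a client that reports a certificate failure explicitly instead of crashing. The function name fetch_status and its injectable opener parameter are hypothetical names chosen for this example; they are not part of Heii On-Call or of any particular client library.

```python
import ssl
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return (ok, detail) for a GET request, naming certificate failures explicitly.

    `opener` is a hypothetical injection point so the error path can be
    exercised without a network connection.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The TLS handshake succeeded, but the server answered with an error status.
        return False, f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # urllib wraps connection-level failures; the underlying error is exc.reason.
        if isinstance(exc.reason, ssl.SSLCertVerificationError):
            return False, f"certificate rejected: {exc.reason}"
        return False, f"connection failed: {exc.reason}"
```

Client code that distinguishes "certificate rejected" from other failures can at least log a clear message, but it cannot work around the outage: the request still fails until the certificate is renewed.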
SSL certificates, other than those for internal use only, are issued by a centralized, publicly-trusted organization known as a CA (Certificate Authority). The CA is responsible for "domain validation", ensuring that a certificate for https://example.com/ is only issued to an entity that has legitimate control of the example.com domain. Since domains can expire and be bought and sold, and certificates and their associated private keys can be unintentionally leaked and compromised in cybersecurity incidents, the certificates are only considered valid for a limited amount of time. The limited validity time forces periodic revalidation by the CA, similar to how you have to periodically renew your passport for international travel.
A consortium of major web browser makers has decided that the maximum allowed lifetime of an SSL certificate will decrease substantially:
The expiration dates on your SSL certificates are likely one of the only ticking time bombs in your web application stack: if nothing happens (i.e. if the certificate is not renewed and replaced in time), your site is guaranteed to stop working on that date.
Heii On-Call's outbound probes continuously monitor a URL, such as your health check endpoint. By default, the optional "SSL Certificate Minimum Expiration Duration" setting is blank, so any valid and non-expired SSL certificate is considered acceptable. With a valid SSL certificate, Heii On-Call considers the URL to be up, and the dashboard shows the certificate expiration date as highlighted below:
With an expired SSL certificate, Heii On-Call's dashboard shows an ssl_error and considers the URL to be down, so the on-call individual will be notified (after the defined timeout):
That's fine for most outbound probes, and it matches browser / API client behavior, so we recommend sticking with that for most of your probes.
However, for the special case of monitoring SSL certificate expiration, you'd like to be notified before the certificate expires. If the optional "SSL Certificate Minimum Expiration Duration" setting is specified:
then the site will be considered down with reason ssl_certificate_expires_soon if the SSL certificate is valid for less than the specified duration, even if the HTTPS request is otherwise successful:
The site shown in the screenshot above would still work just fine in a browser, but Heii On-Call considers this probe to be down because the "SSL Certificate Minimum Expiration Duration" is set to 60 days while the received certificate is only valid for another 55 days. This is a stricter acceptability criterion than the blank default (i.e. 0 days), and stricter than the one browsers use. This allows you to set up alerts for your team if the certificate hasn't expired yet but will soon, so you can take action to avoid user-facing downtime.
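The decision described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this article, not Heii On-Call's actual implementation: the function name evaluate_certificate is hypothetical, and a blank setting is modeled here as a zero-length duration.

```python
from datetime import datetime, timedelta, timezone


def evaluate_certificate(not_after, now, min_expiration=timedelta(0)):
    """Return (state, reason) for a certificate that expires at `not_after`.

    A blank "minimum expiration duration" is modeled as timedelta(0),
    so any certificate that has not yet expired is accepted.
    """
    remaining = not_after - now
    if remaining <= timedelta(0):
        # Already expired: browsers and API clients would reject it too.
        return "down", "ssl_error"
    if remaining < min_expiration:
        # Still valid, but for less time than the configured minimum.
        return "down", "ssl_certificate_expires_soon"
    return "up", None


# The scenario from the text: 55 days of validity left.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
expires = now + timedelta(days=55)

print(evaluate_certificate(expires, now))                      # blank default
print(evaluate_certificate(expires, now, timedelta(days=60)))  # 60-day minimum
```

With the blank default the 55-day certificate is accepted; with a 60-day minimum the same certificate is reported as down with reason ssl_certificate_expires_soon.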
Creating an SSL certificate monitoring plan is straightforward: decide which certificates you want to monitor, and decide how early and how loudly you want your team to be notified.
The first step is to make a list of domains / SSL certificates you wish to monitor. If your site is at https://example.com/, certainly, start there! But you may also wish to monitor the SSL certificates of other subdomains, such as www.example.com (see also: A Practical Guide to Monitoring Website Redirects), api.example.com, cdn.example.com, app.example.com, etc. if these apply to you. In many cases, these will be different certificates, renewed separately from your primary domain. If it's mission critical that you don't have an outage on that subdomain, be sure to monitor it separately.
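While drawing up that list, it can help to see when each host's current certificate expires. The following is a minimal standard-library Python sketch for a one-off check; the function names are hypothetical, the hostnames you pass in are your own, and it is not a substitute for continuous monitoring.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_remaining(not_after, now=None):
    """Days until a certificate's notAfter value, e.g. 'Jan  5 09:34:43 2018 GMT'."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


def fetch_not_after(hostname, port=443, timeout=10):
    """Connect over TLS and return the served certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def report(hostnames):
    """Print one line per host with the days of validity left."""
    for host in hostnames:
        try:
            print(f"{host}: {days_remaining(fetch_not_after(host)):.1f} days left")
        except OSError as exc:
            # Covers DNS failures, timeouts, and TLS errors such as an expired certificate.
            print(f"{host}: check failed ({exc})")
```

Calling report(["example.com", "www.example.com", "api.example.com"]) with your own hostnames prints one line per host. Hosts that share a wildcard or multi-domain certificate will show the same expiration; hosts that show different dates are renewed separately and deserve their own monitor.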
The second step is to decide when you'd like to be alerted. Tools like certbot will attempt to automatically renew your certificate when one third of the certificate lifetime remains. For a 90-day certificate as currently issued by Let's Encrypt, that means you should never see a certificate that is valid for less than approximately 30 days. In this case, it's safest to set the minimum expiration duration to at most 27 or 28 days, to allow for uncertainty about when certbot runs and to give it time to recover on its own from a few days of outage on the CA's side. You can also decide to require a shorter minimum certificate lifetime, like 14 days, if you're relatively confident that this is still long enough for your team to diagnose and fix the issue. Using 14 days also lets you avoid having false positives in the coming years as certificate lifetimes decrease.
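The arithmetic behind these numbers can be sketched as follows. This assumes the renewal client keeps renewing when one third of the lifetime remains, as described above; the 47-day lifetime is included purely as an example of a shorter certificate, and the function names are hypothetical.

```python
def renewal_point_days(lifetime_days):
    """Days of validity left when renewal is expected (one third of the lifetime)."""
    return lifetime_days / 3


def threshold_is_safe(threshold_days, lifetime_days, margin_days=0):
    """True if the alert threshold sits at or below the renewal point, minus a margin.

    A threshold above the renewal point would alert on every healthy renewal cycle.
    """
    return threshold_days <= renewal_point_days(lifetime_days) - margin_days


for lifetime in (90, 47):
    point = renewal_point_days(lifetime)
    print(
        f"{lifetime}-day certificate: renewal expected with ~{point:.1f} days left; "
        f"28-day alert safe: {threshold_is_safe(28, lifetime)}, "
        f"14-day alert safe: {threshold_is_safe(14, lifetime)}"
    )
```

Under these assumptions a 28-day threshold works for a 90-day certificate but would produce false positives for a much shorter-lived one, while a 14-day threshold stays below the renewal point in both cases.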
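Here's what we recommend creating within Heii On-Call, for each subdomain:
A non-critical trigger with "SSL Certificate Minimum Expiration Duration" set to 14 days
A critical trigger with "SSL Certificate Minimum Expiration Duration" set to 2 days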
Heii On-Call allows configuring both critical and non-critical triggers. Critical ones use special permissions we've been granted for our iOS and Android apps to blast through Do-Not-Disturb and volume settings to play a loud sound you can't ignore, just like carrying a dedicated pager, repeating until acknowledged (or, if escalation strategies are configured, escalated to another member of your organization). The combination recommended above gives the person on-call a non-critical alert when you're 14 days from expiration, which can be dealt with at a convenient time. But if that goes unfixed, then your team will start getting critical alerts when the countdown is below 48 hours.
As for other settings on these triggers: the URL should just be a health check endpoint on that subdomain, and the timeout can be relatively long, like 1 hour, so brief outages (which you may want to monitor with other triggers) don't bother these longer-time-period SSL certificate monitors.
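To make the two-tier behavior concrete, here is a small Python sketch of which alert level applies for a given amount of remaining validity. The thresholds are the ones recommended above; the data layout and the function name alert_level are illustrative only and do not reflect how Heii On-Call stores or evaluates triggers.

```python
from datetime import timedelta

# Thresholds recommended in the text; ordered most severe first.
TRIGGERS = [
    ("critical", timedelta(days=2)),
    ("non-critical", timedelta(days=14)),
]


def alert_level(remaining):
    """Return the most severe level whose threshold exceeds the remaining validity."""
    for level, threshold in TRIGGERS:
        if remaining < threshold:
            return level
    return None  # enough validity left: no alert


print(alert_level(timedelta(days=20)))   # plenty of time
print(alert_level(timedelta(days=10)))   # inside the 14-day window
print(alert_level(timedelta(hours=36)))  # inside the final 48 hours
```

The three calls correspond to no alert, a non-critical alert that can be handled at a convenient time, and a critical alert once fewer than 48 hours remain.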
https URL.
down when the certificate will expire in under 14 days.
www.example.com.
See Quickstart: SSL Certificate Monitoring in Heii On-Call for a version of these steps with screenshots.
You can set up these triggers with "SSL Certificate Minimum Expiration Duration" on Heii On-Call's free plan in just a few minutes. Sleep easier knowing you'll be alerted well in advance of an easily avoidable customer-facing outage.
Enjoy your continuous SSL certificate expiration monitoring!