SSL Certificate Expiration Monitoring

Ensure your team is alerted well in advance of any SSL certificates that are expiring soon.

Author
Mike Robbins

Whether you use an automated SSL certificate renewal system such as Let's Encrypt via certbot or cert-manager, manually renew certificates on your own via your domain registrar or another Certificate Authority, or rely on a hosting platform like Render or Cloudflare to manage SSL certificate renewal for you, an expired SSL certificate will create a total outage, blocking your https:// users from using your website and API.

Fortunately, it's easy to configure Heii On-Call to continuously monitor and notify your team about any soon-to-expire SSL certificates, so you have plenty of time to take action before there's a customer-visible outage.

Monitoring your SSL certificates is essential to anyone with an https:// website or API. Here's what might break:

  • For automatically renewed SSL certificates (e.g. Let's Encrypt): protect against a cron job that doesn't run, a Python dependency change that breaks certbot, a DNS-01 provider authentication issue with your registrar / DNS provider / Route53 IAM, an HTTP-01 integration issue with serving http://example.com/.well-known/acme-challenge/*, a CA outage, a root certificate issue on your ACME client, an IPv6 change gone wrong, a Kubernetes issue breaking cert-manager, etc.
  • For manually rotated SSL certificates: protect against the person who renewed the certs last year going on vacation, forgetting to do it in time, requesting the renewal via CSR but not deploying it to production, deploying the wrong certificate file, etc.
  • For platform-managed SSL certificates: protect against your domain getting lost in some edge case of the platform's SSL certificate renewal code, the platform's upstream CA rate limits being hit, etc.

Impact of Expired SSL Certificates

In modern web browsers, an expired SSL certificate results in your users seeing a big, scary warning when they try to visit your website:

Expired SSL Certificate Warning in Firefox, Safari, Chrome

API clients are rejected as well: their HTTP libraries raise exceptions such as ssl.SSLCertVerificationError (Python) or OpenSSL::SSL::SSLError (Ruby), which the calling code may not be prepared to handle:

Expired SSL Certificate in Python, Ruby, Curl

In both cases, this is a severe user-facing outage that makes your website or API inaccessible to your users. The outage continues until the SSL certificate is renewed by your team.
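For API clients, the failure mode described above can at least be made visible in logs. The following is a minimal Python sketch using only the standard library; the function names describe_failure and fetch_status are hypothetical, and the sketch assumes the client uses urllib, which typically wraps the TLS error inside URLError.reason.

```python
import ssl
import urllib.error
import urllib.request


def describe_failure(exc: Exception) -> str:
    """Turn a request failure into a short, loggable description."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"http error: {exc.code}"
    # urllib usually wraps the underlying TLS failure in URLError.reason.
    cause = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(cause, ssl.SSLCertVerificationError):
        return f"certificate rejected: {cause}"
    return f"connection failed: {cause}"


def fetch_status(url: str, timeout: float = 10.0) -> str:
    """Request `url` and describe the outcome instead of raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"ok: HTTP {response.status}"
    except OSError as exc:  # URLError and ssl.SSLError are both OSError subclasses
        return describe_failure(exc)
```

Catching the error does not restore service; it only makes the cause of the outage obvious to whoever reads the client's logs.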

When (and Why) Do SSL Certificates Expire?

SSL certificates, other than those for internal use only, are issued by a centralized, publicly-trusted organization known as a CA (Certificate Authority). The CA is responsible for "domain validation", ensuring that a certificate for https://example.com/ is only issued to an entity that has legitimate control of the example.com domain. Since domains can expire and be bought and sold—and certificates and their associated private keys can be unintentionally leaked and compromised in cybersecurity incidents—the certificates are only considered valid for a limited amount of time. The limited validity time forces periodic revalidation by the CA, similar to how you have to periodically renew your passport for international travel.

The CA/Browser Forum, a consortium of certificate authorities and major web browser makers, has decided that the maximum allowed lifetime of an SSL certificate will decrease substantially:

  • 398 days, until 2026-03-15
  • 200 days, until 2027-03-15
  • 100 days, until 2029-03-15
  • 47 days, after 2029-03-15
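The schedule above can be expressed as a small lookup, which is handy when sanity-checking alert thresholds against future certificate lifetimes. This is a sketch only: the function name max_lifetime_days is hypothetical, and it assumes each new limit applies to certificates issued on or after the boundary date, which the list above does not spell out.

```python
from datetime import date

# (first issuance date the limit applies to, maximum lifetime in days).
# Assumption: each limit takes effect on the boundary date itself.
LIFETIME_SCHEDULE = [
    (date(2029, 3, 15), 47),
    (date(2027, 3, 15), 100),
    (date(2026, 3, 15), 200),
]
DEFAULT_MAX_DAYS = 398  # limit before the first reduction


def max_lifetime_days(issued_on: date) -> int:
    """Return the maximum certificate lifetime for a given issuance date."""
    # The schedule is ordered newest-first, so the first match wins.
    for effective_from, max_days in LIFETIME_SCHEDULE:
        if issued_on >= effective_from:
            return max_days
    return DEFAULT_MAX_DAYS
```

For example, under these assumptions max_lifetime_days(date(2028, 1, 1)) returns 100.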

The expiration dates on your SSL certificates are likely one of the few true ticking time bombs in your web application stack: if nothing happens (i.e. if the certificate is not renewed and replaced in time), your site is guaranteed to stop working on that date.

Heii On-Call's "SSL Certificate Minimum Expiration Duration" Setting

Heii On-Call's outbound probes continuously monitor a URL, such as your health check endpoint. By default, the optional "SSL Certificate Minimum Expiration Duration" setting is blank, so any valid and non-expired SSL certificate is considered acceptable. With a valid SSL certificate, Heii On-Call considers the URL to be up, and the dashboard shows the certificate expiration date as highlighted below:

Outbound probe with valid certificate

With an expired SSL certificate, Heii On-Call's dashboard shows an ssl_error and considers the URL to be down, so the on-call individual will be notified (after the defined timeout):

Outbound probe with expired SSL certificate

That's fine for most outbound probes, and it matches browser / API client behavior, so we recommend sticking with the blank default for most of your probes.

However, for the special case of monitoring SSL certificate expiration, you'd like to be notified before the certificate expires. If the optional "SSL Certificate Minimum Expiration Duration" setting is specified:

SSL Certificate Minimum Expiration Duration configuration field on Heii On-Call

then the site will be considered down with reason ssl_certificate_expires_soon if the SSL certificate is valid for less than the specified duration, even if the HTTPS request is otherwise successful:

ssl_certificate_expires_soon showing on Heii On-Call

The site shown in the screenshot above would still work just fine in a browser, but Heii On-Call considers this probe to be down because the "SSL Certificate Minimum Expiration Duration" is set to 60 days while the received certificate is only valid for another 55 days. This is a stricter acceptability criterion than the blank default (i.e. 0 days), and stricter than the one browsers use. It lets you alert your team when the certificate hasn't expired yet but will soon, so you can take action to avoid user-facing downtime.
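To make the comparison concrete, here is a rough Python sketch of the same decision. It is not Heii On-Call's implementation: the function names certificate_expiry and probe_status are hypothetical, and only the status strings ssl_error and ssl_certificate_expires_soon come from the dashboard described above. Note that certificate_expiry uses normal certificate verification, so it raises an error rather than returning a date if the certificate has already expired.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone


def certificate_expiry(hostname: str, port: int = 443, timeout: float = 10.0) -> datetime:
    """Fetch the server certificate's notAfter time as an aware UTC datetime."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def probe_status(not_after: datetime, now: datetime,
                 minimum: timedelta = timedelta(0)) -> str:
    """Classify a certificate; a zero `minimum` mirrors the blank default."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "ssl_error"  # already expired
    if remaining < minimum:
        return "ssl_certificate_expires_soon"  # valid, but not for long enough
    return "up"
```

With 55 days remaining and a 60-day minimum, probe_status returns ssl_certificate_expires_soon; with the default minimum of zero, the same certificate is reported as up.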

Designing an SSL Certificate Expiration Monitoring Plan

Creating an SSL certificate monitoring plan is straightforward: decide which certificates you want to monitor, and decide how early and how loudly you want your team to be notified.

The first step is to make a list of domains / SSL certificates you wish to monitor. If your site is at https://example.com/, certainly, start there! But you may also wish to monitor the SSL certificates of other subdomains, such as www.example.com (see also: A Practical Guide to Monitoring Website Redirects), api.example.com, cdn.example.com, app.example.com, etc. if these apply to you. In many cases, these will be different certificates, renewed separately from your primary domain. If it's mission critical that you don't have an outage on that subdomain, be sure to monitor it separately.

The second step is to decide when you'd like to be alerted. Tools like certbot will attempt to automatically renew your certificate when 1/3 of the certificate lifetime remains. For a 90-day certificate as currently issued by Let's Encrypt, that means you should never see a certificate that is valid for less than approximately 30 days. In this case, it's safest to set the minimum to at most 27 or 28 days, which allows for uncertainty about exactly when certbot runs and gives it time to recover on its own from a few days of outage on the CA's side. You can also decide to require a shorter minimum certificate lifetime, like 14 days, if you're relatively confident that this is still long enough for your team to diagnose and fix the issue. Using 14 days also lets you avoid false positives in the coming years as certificate lifetimes decrease.
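The arithmetic behind these numbers is simple enough to sketch in Python. The function names are hypothetical, and the sketch assumes the renew-at-one-third rule continues to apply unchanged to shorter certificate lifetimes, which may not hold for every ACME client or configuration.

```python
def days_remaining_at_renewal(lifetime_days: float) -> float:
    """Days of validity left when a renew-at-one-third client first tries to renew."""
    return lifetime_days / 3


def headroom_days(threshold_days: float, lifetime_days: float) -> float:
    """Days of failed renewal attempts tolerated before the alert threshold is hit.

    A negative result means a healthy, freshly due certificate would already
    trip the alert, i.e. a false positive.
    """
    return days_remaining_at_renewal(lifetime_days) - threshold_days


if __name__ == "__main__":
    for lifetime in (90, 47):
        for threshold in (28, 14):
            print(f"lifetime={lifetime}d threshold={threshold}d "
                  f"headroom={headroom_days(threshold, lifetime):.1f}d")
```

Under these assumptions, a 28-day threshold leaves 2 days of headroom on a 90-day certificate but goes negative on a 47-day certificate, while a 14-day threshold stays positive for both.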

Here's what we recommend creating within Heii On-Call, for each subdomain:

  • A non-critical ("It can wait until Monday") outbound probe trigger with "SSL Certificate Minimum Expiration Duration" set to 14 days
  • A critical ("Wake somebody up ASAP!") outbound probe trigger with "SSL Certificate Minimum Expiration Duration" set to 2 days

Heii On-Call allows configuring both critical and non-critical triggers. Critical ones use special permissions we've been granted for our iOS and Android apps to blast through Do-Not-Disturb and volume settings to play a loud sound you can't ignore, just like carrying a dedicated pager, repeating until acknowledged (or, if escalation strategies are configured, escalated to another member of your organization). The combination recommended above gives the person on-call a non-critical alert when you're 14 days from expiration, which can be dealt with at a convenient time. But if that goes unfixed, then your team will start getting critical alerts when the countdown is below 48 hours.
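The combined effect of the two triggers can be summarized as a single function of the time remaining. This is an illustrative Python sketch with a hypothetical function name, not Heii On-Call's logic; the 14-day and 2-day values are the recommendations from above.

```python
from datetime import timedelta
from typing import Optional

NON_CRITICAL_THRESHOLD = timedelta(days=14)  # "It can wait until Monday"
CRITICAL_THRESHOLD = timedelta(days=2)       # "Wake somebody up ASAP!"


def alert_level(remaining: timedelta) -> Optional[str]:
    """Return the most severe alert that applies, or None if all is well."""
    # Below 2 days both triggers are down; the critical one dominates.
    if remaining < CRITICAL_THRESHOLD:
        return "critical"
    if remaining < NON_CRITICAL_THRESHOLD:
        return "non-critical"
    return None
```

For example, a certificate with 10 days remaining yields "non-critical", and one with 47 hours remaining yields "critical".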

As for other settings on these triggers: the URL should just be a health check endpoint on that subdomain, and the timeout can be relatively long, like 1 hour, so brief outages (which you may want to monitor with other triggers) don't bother these longer-time-period SSL certificate monitors.

Quickstart

  1. Within Heii On-Call, go to a Service page, or click "New Service" and call it "SSL Certificates" to group these monitors together. Assign this Service to whichever on-call Rotation you'd like to get alerts.
  2. Click "New Trigger".
  3. Click "Outbound Probe" as the type of trigger.
  4. Click "It can wait until Monday" to request non-critical alerts.
  5. In the "Name" field, enter a friendly name like: "example.com SSL cert expiry 14+ days".
  6. In the "Timeout" field, enter: "1 hour" so that temporary outages don't cause false alarms.
  7. In the "URL" field, enter a health check endpoint URL, such as: "https://example.com/healthz". Double-check to be sure it's an https URL.
  8. In the "SSL Certificate Minimum Expiration Duration" field, enter: "14 days" so this trigger will be considered down when the certificate will expire in under 14 days.
  9. Click "Create Trigger".
  10. Repeat steps 2-9 for all other combinations you'd like to monitor, such as a critical alert at "2 days", or other subdomains like www.example.com.
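If you have several subdomains, it can help to write out the full list of triggers before clicking through the steps above. The Python sketch below only generates a checklist of field values to enter by hand; it does not call any Heii On-Call API, and the hostnames, the /healthz path, and the function name trigger_plan are placeholders or assumptions.

```python
from itertools import product

# Placeholder hostnames; replace with the domains you actually serve.
HOSTNAMES = ["example.com", "www.example.com", "api.example.com"]

# (urgency button label, minimum expiration duration in days)
TIERS = [("It can wait until Monday", 14), ("Wake somebody up ASAP!", 2)]


def trigger_plan(hostnames, tiers, health_path="/healthz"):
    """Build one checklist entry per (hostname, tier) combination."""
    plan = []
    for hostname, (urgency, days) in product(hostnames, tiers):
        plan.append({
            "Urgency": urgency,
            "Name": f"{hostname} SSL cert expiry {days}+ days",
            "Timeout": "1 hour",
            "URL": f"https://{hostname}{health_path}",
            "SSL Certificate Minimum Expiration Duration": f"{days} days",
        })
    return plan


if __name__ == "__main__":
    for entry in trigger_plan(HOSTNAMES, TIERS):
        print(entry["Name"], "|", entry["URL"], "|", entry["Urgency"])
```

Three hostnames and two tiers produce six triggers, one pass through steps 2-9 for each.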

See Quickstart: SSL Certificate Monitoring in Heii On-Call for a version of these steps with screenshots.

Next Steps

You can set up these triggers with "SSL Certificate Minimum Expiration Duration" on Heii On-Call's free plan in just a few minutes. Sleep easier knowing you'll be alerted well in advance of an easily avoidable customer-facing outage.

Enjoy your continuous SSL certificate expiration monitoring!