In this step-by-step guide I cover how to integrate Heii On-Call with a Prometheus Alertmanager instance. Integration takes a matter of minutes, and you can start leveraging Prometheus's extensive set of features in your website monitoring and on-call rotations.
Prometheus is a popular open source monitoring system that collects and aggregates metrics exposed by your infrastructure and applications. If you are willing to put in the work to configure and maintain it, it can be a great alternative to expensive Application Performance Monitoring solutions like Datadog and Better Uptime. At Heii On-Call we like to keep our operations as lean as possible, so we are big fans of Prometheus and Grafana for monitoring. If you are new to Prometheus, I suggest you follow our guide to a minimal Prometheus setup.
Heii On-Call natively integrates with Prometheus Alertmanager through Alertmanager's webhook receiver. Once set up, you can route Prometheus alerts to a Heii On-Call trigger, and a triggered alert will even be resolved automatically once the alerting condition clears.
The first step is to set up the Heii On-Call trigger that will receive a webhook from your Prometheus Alertmanager instance. Create a new trigger and choose Prometheus from the mechanism dropdown.
We also need to create an API key so that Prometheus can authenticate with Heii On-Call. Click on API Keys from an Organization's home page, and create a new API key for Prometheus.
Note the API key and the Trigger ID. We will use both of these in the Prometheus configuration coming up.
Now head over to where you keep your Alertmanager configuration. If you are using Kubernetes this is likely a ConfigMap definition; if you are running Alertmanager directly, it is a file on the Alertmanager instance. Somewhere in the configuration file you should have a receivers: key that holds the list of receivers alerts can be routed to. Add a new receiver that looks like this:
receivers:
  - name: heii-on-call
    webhook_configs:
      - url: "https://api.heiioncall.com/triggers/YOUR-TRIGGER-ID-HERE/prometheus"
        http_config:
          follow_redirects: false
          authorization:
            credentials: "YOUR-HEII-ON-CALL-API-KEY-HERE"
The block follows the specification for a webhook_config in Alertmanager. See the Alertmanager documentation for further customization. If your configuration lives in version control, we recommend using credentials_file instead of credentials to store the Heii On-Call API key.
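As a rough sketch of the credentials_file variant, the receiver might look like the following. The file path is a hypothetical placeholder, not something Heii On-Call or Alertmanager prescribes; point it at wherever you mount the secret, and make sure the file contains only the API key.

```yaml
receivers:
  - name: heii-on-call
    webhook_configs:
      - url: "https://api.heiioncall.com/triggers/YOUR-TRIGGER-ID-HERE/prometheus"
        http_config:
          follow_redirects: false
          authorization:
            # Hypothetical path (an assumption for illustration): a file,
            # kept out of version control, that holds only the API key.
            credentials_file: /etc/alertmanager/secrets/heii-on-call-api-key
```

On Kubernetes, one way to provide such a file is to mount a Secret into the Alertmanager pod at that path, so the key never appears in the ConfigMap.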
Now we need to set up a route: block to send alerts to the receiver we just created. Alertmanager uses a "tree" of routes in which the first route is the root node. Every alert enters at the root and travels down the tree, and is delivered to the receiver of a route whose matchers directive matches the alert's labels. Like many configuration options in Prometheus, this is often overkill for small to medium sized teams. In the example below we set the root route to our heii-on-call receiver, so that every alert gets sent to the Heii On-Call trigger.
route:
  receiver: heii-on-call
  group_wait: 10s
  repeat_interval: 30m
  routes: []
If you had multiple Heii On-Call receivers, you could set up additional leaf routes, one for each Heii On-Call trigger, and match a label on the alert to route it to the right team.
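To illustrate the multi-receiver case, here is a minimal sketch with two leaf routes. The receiver names, the team label, and its values are assumptions made up for this example; use whatever labels your alerting rules actually attach. Alerts that match neither leaf route fall back to the root receiver.

```yaml
route:
  # Fallback for alerts that match none of the leaf routes below.
  receiver: heii-on-call-backend
  group_wait: 10s
  repeat_interval: 30m
  routes:
    # Hypothetical "team" label; match on labels your alert rules set.
    - receiver: heii-on-call-backend
      matchers:
        - 'team="backend"'
    - receiver: heii-on-call-frontend
      matchers:
        - 'team="frontend"'

receivers:
  # One receiver per Heii On-Call trigger; each URL carries its own trigger ID.
  - name: heii-on-call-backend
    webhook_configs:
      - url: "https://api.heiioncall.com/triggers/YOUR-BACKEND-TRIGGER-ID-HERE/prometheus"
        http_config:
          follow_redirects: false
          authorization:
            credentials: "YOUR-HEII-ON-CALL-API-KEY-HERE"
  - name: heii-on-call-frontend
    webhook_configs:
      - url: "https://api.heiioncall.com/triggers/YOUR-FRONTEND-TRIGGER-ID-HERE/prometheus"
        http_config:
          follow_redirects: false
          authorization:
            credentials: "YOUR-HEII-ON-CALL-API-KEY-HERE"
```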
That's it! With this simple configuration, alerts will be routed to whoever is currently on call for the Service you set up in Heii On-Call. If you have a more complicated use case that our current integration does not cover, don't hesitate to reach out to us at heii@heiioncall.com. We will be happy to help.
Remember that monitoring through metrics is only part of a comprehensive uptime and monitoring plan. In addition to gathering metrics from your infrastructure you should also be monitoring uptime from an external source.
Happy Monitoring!