Prometheus and AlertManager

Robusta can improve your existing Prometheus alerts. It can also execute Remediation Actions in response to alerts.

Prerequisites

AlertManager must be connected to Robusta. Refer to Integrating AlertManager and Prometheus.

Triggers

The following triggers are available for Prometheus alerts:

on_prometheus_alert

on_prometheus_alert fires when a Prometheus alert starts or stops firing.

Example

Run the ps aux command when HostHighCpuLoad fires for an alert labeled deployment=nginx. The output is sent as a Robusta notification. The node on which the command runs is selected according to the alert labels.

customPlaybooks:
- triggers:
  - on_prometheus_alert:
      alert_name: HostHighCpuLoad
      scope:
        include:
          - labels:
              - "deployment=nginx"
  actions:
  - node_bash_enricher:
      bash_command: ps aux

on_prometheus_alert supports the following parameters:

optional:
alert_name (str)
status (str) = firing

one of "firing", "resolved", or "all"

pod_name_prefix (str)
namespace_prefix (str)
instance_name_prefix (str)
k8s_providers (str list)
scope (complex)

each entry contains:

optional:
include (complex list)
exclude (complex list)

The scope filtering mechanism works exactly as it does for sinks (see Routing Alerts To Specific Sinks for more information), but in this case you can only match on labels and annotations.
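The simpler parameters above can be read as a chain of filters that an alert must pass before the playbook runs. The following is a minimal sketch of that reading, not Robusta's implementation: AlertSketch and trigger_matches are hypothetical names, and it assumes that alert_name is compared exactly against the alertname label and that the prefix parameters are plain string-prefix checks on the pod and namespace labels. The scope parameter is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AlertSketch:
    """Hypothetical stand-in for an incoming alert; not a Robusta class."""
    status: str  # "firing" or "resolved"
    labels: Dict[str, str] = field(default_factory=dict)


def trigger_matches(
    alert: AlertSketch,
    alert_name: Optional[str] = None,
    status: str = "firing",
    pod_name_prefix: Optional[str] = None,
    namespace_prefix: Optional[str] = None,
) -> bool:
    """Return True if the alert passes every filter that was set."""
    labels = alert.labels
    # Assumption: alert_name is compared exactly against the "alertname" label.
    if alert_name is not None and labels.get("alertname") != alert_name:
        return False
    # "all" accepts both firing and resolved alerts.
    if status != "all" and alert.status != status:
        return False
    # Assumption: prefix filters are plain string-prefix checks on labels.
    if pod_name_prefix is not None and not labels.get("pod", "").startswith(pod_name_prefix):
        return False
    if namespace_prefix is not None and not labels.get("namespace", "").startswith(namespace_prefix):
        return False
    return True


firing = AlertSketch(
    "firing",
    {"alertname": "HostHighCpuLoad", "pod": "nginx-7d9", "namespace": "default"},
)
resolved = AlertSketch("resolved", dict(firing.labels))
```

With the default status of firing, trigger_matches(resolved, alert_name="HostHighCpuLoad") is False in this sketch, while passing status="all" accepts both alerts.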

There are dedicated playbook actions for on_prometheus_alert.

Additionally, almost all Event Enrichment actions support on_prometheus_alert.

Running Python Code in Response to an Alert

If the built-in actions are insufficient, you can extend Robusta with your own actions that respond to Prometheus alerts.

Example action

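from robusta.api import PrometheusKubernetesAlert, action

@action
def my_action(alert: PrometheusKubernetesAlert):
    # alert.pod is only set when the alert has a pod label and the pod still exists
    if alert.pod is None:
        return
    print(f"The alert {alert.alert_name} fired on pod {alert.pod.metadata.name}")
    print("The pod has these processes:", alert.pod.exec("ps aux"))
    print(f"The pod has {len(alert.pod.spec.containers)} containers")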

alert.pod is a Kubernetes pod object. It will exist if the Prometheus alert had a pod label and the pod is alive when the playbook runs. There are also node, deployment, and daemonset fields.
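Because any of these fields may be unset for a given alert, an action usually has to check which one is populated before using it. The following is a minimal sketch of one way to do that. first_resource is a hypothetical helper, and the alert is imitated with SimpleNamespace so the snippet runs without a cluster; inside a real action you would pass the alert argument instead.

```python
from types import SimpleNamespace
from typing import Optional, Tuple

# Field names taken from the text above; checked in this order.
RESOURCE_FIELDS = ("pod", "node", "deployment", "daemonset")


def first_resource(alert) -> Optional[Tuple[str, object]]:
    """Return (field_name, resource) for the first populated field, or None."""
    for name in RESOURCE_FIELDS:
        resource = getattr(alert, name, None)
        if resource is not None:
            return name, resource
    return None


# Hypothetical stand-in for an alert whose pod is gone but whose node is known.
fake_alert = SimpleNamespace(
    pod=None,
    node=SimpleNamespace(metadata=SimpleNamespace(name="worker-1")),
    deployment=None,
    daemonset=None,
)

found = first_resource(fake_alert)
if found is not None:
    kind, resource = found
    print(f"Alert is attached to {kind} {resource.metadata.name}")
```

The order of RESOURCE_FIELDS is a design choice in this sketch: it prefers the most specific resource (the pod) and falls back to broader ones.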

Refer to Developing New Actions for more details.