Prometheus and AlertManager
Robusta can improve your existing Prometheus alerts. It can also execute Remediation Actions in response to alerts.
Prerequisites
AlertManager must be connected to Robusta. Refer to Integrating AlertManager and Prometheus.
Triggers
The following triggers are available for Prometheus alerts:
on_prometheus_alert
on_prometheus_alert fires when a Prometheus alert starts or stops firing.
Example
Run the ps aux command when HostHighCpuLoad fires. The output is sent as a Robusta notification. The node on which the command runs is selected according to the alert labels.
customPlaybooks:
- triggers:
  - on_prometheus_alert:
      alert_name: HostHighCpuLoad
      scope:
        include:
          - labels:
            - "deployment=nginx"
  actions:
  - node_bash_enricher:
      bash_command: ps aux
on_prometheus_alert supports the following parameters:
- alert_name (str)
- status (str) = firing
  one of "firing", "resolved", or "all"
- pod_name_prefix (str)
- namespace_prefix (str)
- instance_name_prefix (str)
- k8s_providers (str list)
- scope (complex)
  each entry contains:
  optional:
  - include (complex list)
  - exclude (complex list)
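To make the parameter list concrete, here is a minimal Python sketch of how filters like these could be evaluated against an incoming alert. It is an illustrative model, not Robusta's implementation: the Alert and TriggerParams classes and the matches function are hypothetical names, and the sketch assumes that unset parameters match everything and that each prefix parameter is compared against the alert's pod, namespace, or instance label. The k8s_providers and scope parameters are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Alert:
    """Hypothetical stand-in for an AlertManager alert."""
    name: str
    status: str  # "firing" or "resolved"
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class TriggerParams:
    """Hypothetical mirror of the simple on_prometheus_alert parameters."""
    alert_name: Optional[str] = None
    status: str = "firing"  # "firing", "resolved", or "all"
    pod_name_prefix: Optional[str] = None
    namespace_prefix: Optional[str] = None
    instance_name_prefix: Optional[str] = None


def matches(params: TriggerParams, alert: Alert) -> bool:
    """Return True if the alert passes every filter that is set."""
    if params.alert_name is not None and alert.name != params.alert_name:
        return False
    if params.status != "all" and alert.status != params.status:
        return False
    # Assumption: each prefix is checked against the label of the same kind.
    prefix_checks = [
        (params.pod_name_prefix, "pod"),
        (params.namespace_prefix, "namespace"),
        (params.instance_name_prefix, "instance"),
    ]
    for prefix, label in prefix_checks:
        if prefix is not None and not alert.labels.get(label, "").startswith(prefix):
            return False
    return True


params = TriggerParams(alert_name="HostHighCpuLoad", namespace_prefix="prod")
firing = Alert("HostHighCpuLoad", "firing", {"namespace": "prod-eu"})
resolved = Alert("HostHighCpuLoad", "resolved", {"namespace": "prod-eu"})
print(matches(params, firing))    # True
print(matches(params, resolved))  # False, because status defaults to "firing"
```

In this model, setting status to "all" would let the same trigger react both when the alert starts firing and when it resolves.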
The scope filtering mechanism works exactly as it does for sinks (see Routing Alerts To Specific Sinks for more information), but in this case you can only match on labels and annotations.
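The include/exclude idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Robusta's code: it assumes exclude takes precedence over include, that an entry matches when every field it names (labels or annotations) has at least one matching selector, and that a selector such as "k1=v1,k2=v2" requires every pair to match exactly. The function names are hypothetical, and the authoritative rules are the ones described in Routing Alerts To Specific Sinks.

```python
from typing import Dict, List, Optional

ScopeEntry = Dict[str, List[str]]        # e.g. {"labels": ["deployment=nginx"]}
AlertFields = Dict[str, Dict[str, str]]  # {"labels": {...}, "annotations": {...}}


def selector_matches(selector: str, actual: Dict[str, str]) -> bool:
    """Assumption: every comma-separated key=value pair must match exactly."""
    for pair in selector.split(","):
        key, _, value = pair.partition("=")
        if actual.get(key.strip()) != value.strip():
            return False
    return True


def entry_matches(entry: ScopeEntry, alert: AlertFields) -> bool:
    """An entry matches when each named field has at least one matching selector."""
    for field_name, selectors in entry.items():
        actual = alert.get(field_name, {})
        if not any(selector_matches(s, actual) for s in selectors):
            return False
    return True


def in_scope(
    alert: AlertFields,
    include: Optional[List[ScopeEntry]] = None,
    exclude: Optional[List[ScopeEntry]] = None,
) -> bool:
    """Assumption: exclude is checked first, then include (if any) must match."""
    if exclude and any(entry_matches(e, alert) for e in exclude):
        return False
    if include:
        return any(entry_matches(e, alert) for e in include)
    return True


alert = {
    "labels": {"deployment": "nginx", "severity": "warning"},
    "annotations": {"team": "web"},
}
include = [{"labels": ["deployment=nginx"]}]
exclude = [{"annotations": ["team=db"]}]
print(in_scope(alert, include=include, exclude=exclude))            # True
print(in_scope(alert, include=[{"labels": ["deployment=redis"]}]))  # False
```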
Recommended Actions
There are dedicated playbook actions for on_prometheus_alert.
Additionally, almost all Event Enrichment actions support on_prometheus_alert.
Running Python Code in Response to an Alert
If the built-in actions are insufficient, you can extend Robusta with your own actions that respond to Prometheus alerts.
Example action
from robusta.api import action, PrometheusKubernetesAlert

@action
def my_action(alert: PrometheusKubernetesAlert):
    # alert.pod is only set when the alert has a pod label and the pod is alive
    print(f"The alert {alert.alert_name} fired on pod {alert.pod.metadata.name}")
    print("The pod has these processes:", alert.pod.exec("ps aux"))
    print(f"The pod has {len(alert.pod.spec.containers)} containers")
alert.pod is a Kubernetes pod object. It exists if the Prometheus alert has a pod label and the pod is alive when the playbook runs. There are also node, deployment, and daemonset fields.
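Because these fields are only populated when the matching object can be resolved, an action may want to check which one is present before using it. The following is a minimal sketch of that check, assuming unresolved fields are None. The describe_subject and fake_alert helpers are hypothetical, and fake_alert is only a stand-in for PrometheusKubernetesAlert so the example runs without a cluster.

```python
from types import SimpleNamespace


def describe_subject(alert) -> str:
    """Name the first Kubernetes object attached to the alert, if any."""
    # Assumption: fields that could not be resolved are None.
    for kind in ("pod", "node", "deployment", "daemonset"):
        obj = getattr(alert, kind, None)
        if obj is not None:
            return f"{kind} {obj.metadata.name}"
    return "no Kubernetes object attached"


def fake_alert(**fields):
    """Hypothetical stand-in for PrometheusKubernetesAlert, used only to run the sketch."""
    defaults = dict(pod=None, node=None, deployment=None, daemonset=None)
    defaults.update(fields)
    return SimpleNamespace(**defaults)


node = SimpleNamespace(metadata=SimpleNamespace(name="worker-1"))
print(describe_subject(fake_alert(node=node)))  # node worker-1
print(describe_subject(fake_alert()))           # no Kubernetes object attached
```

Inside a real action, the same pattern would guard calls such as alert.pod.exec so they only run when alert.pod is set.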
Refer to Developing New Actions for more details.