Remediate Prometheus Alerts
Robusta can respond to Prometheus alerts and automatically remediate them.
Using Kubernetes Jobs for Alert Remediation
A popular remediation method involves running Kubernetes Jobs when alerts fire.
Add the following to your customPlaybooks:
customPlaybooks:
- triggers:
  - on_prometheus_alert:
      alert_name: TestAlert
  actions:
  - alert_handling_job:
      # you can access information from the alert using environment variables
      command:
      - sh
      - -c
      - echo \"$ALERT_NAME $ALERT_LABEL_REGION dumping all available environment variables, which include alert metadata and labels\" && env && sleep 60
      image: busybox
      notify: true
      wait_for_completion: true
      completion_timeout: 100
      env:
      - name: GITHUB_SECRET
        valueFrom:
          secretKeyRef:
            name: robusta-github-key
            key: githubapikey
Perform a Helm Upgrade.
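For reference, the upgrade might look like the sketch below. The release name robusta, the chart reference robusta/robusta, and the values file generated_values.yaml are assumptions based on a default installation; substitute whatever your installation actually uses. The guard around the command is only there so the snippet exits cleanly when Helm or the values file is missing.

```shell
# Assumed names: adjust to match your own installation.
RELEASE="robusta"
CHART="robusta/robusta"
VALUES_FILE="generated_values.yaml"

# Run the upgrade only when helm and the values file are both available.
if command -v helm >/dev/null 2>&1 && [ -f "$VALUES_FILE" ]; then
  helm upgrade "$RELEASE" "$CHART" --values "$VALUES_FILE"
else
  echo "helm or $VALUES_FILE not found; skipping upgrade" >&2
fi
```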
Note
alert_name should be the exact name of the Prometheus alert. For example, CrashLoopBackOff will not work, because the actual Prometheus alert is KubePodCrashLooping.
Alert labels are added as environment variables in the format ALERT_LABEL_{LABEL_NAME}. For example, a label named foo becomes ALERT_LABEL_FOO.
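As a rough illustration of how a Job's script could consume these variables, here is a minimal sketch. The function names (label_to_env_name, describe_alert, list_alert_labels) and the fallback values are hypothetical, and the label mapping assumes plain uppercasing, which is all the foo example above confirms.

```shell
#!/bin/sh
# Hypothetical helpers for a remediation script run inside the Job's container.

# Map a label name to its environment variable name.
# Assumption: the mapping is a plain uppercase of the label name.
label_to_env_name() {
  printf 'ALERT_LABEL_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

# Summarize the alert, falling back to placeholder values
# when a variable is not set (for example, a missing region label).
describe_alert() {
  name="${ALERT_NAME:-unknown}"
  region="${ALERT_LABEL_REGION:-unset}"
  echo "alert=$name region=$region"
}

# Print every label-derived variable present in the environment.
list_alert_labels() {
  env | sort | grep '^ALERT_LABEL_' || echo "no alert labels set"
}

describe_alert
list_alert_labels
```

Checking for missing variables up front keeps the script from acting on an empty value when an alert does not carry the expected label.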
Test this playbook by simulating a Prometheus alert:
robusta playbooks trigger prometheus_alert alert_name=TestAlert
Running Bash Commands for Alert Remediation
Alerts can also be remediated with bash commands:
customPlaybooks:
- triggers:
  - on_prometheus_alert:
      alert_name: SomeAlert
  actions:
  - node_bash_enricher:
      bash_command: do something
Further Reading
Reference for the alert_handling_job action
Reference for the node_bash_enricher action