Remediate Prometheus AlertsΒΆ
Robusta can respond to Prometheus alerts and automatically remediate them.
Using Kubernetes Jobs for Alert RemediationΒΆ
A popular remediation method involves running Kubernetes Jobs when alerts fire.
Add the following to your customPlaybooks:
customPlaybooks:
- triggers:
- on_prometheus_alert:
alert_name: TestAlert
actions:
- alert_handling_job:
# you can access information from the alert using environment variables
command:
- sh
- -c
- echo \"$ALERT_NAME $ALERT_LABEL_REGION dumping all available environment variables, which include alert metadata and labels\" && env && sleep 60
image: busybox
notify: true
wait_for_completion: true
completion_timeout: 100
env:
- name: GITHUB_SECRET
valueFrom:
secretKeyRef:
name: robusta-github-key
key: githubapikey
Perform a Helm Upgrade.
Note
alert_name
should be the exact name of the Prometheus Alert. For example,CrashLoopBackOff
will not work, because the actual Prometheus Alert isKubePodCrashLooping
.Alert labels are added as environment variables in the following format
ALERT_LABEL_{LABEL_NAME}
. For example a label namedfoo
becomesALERT_LABEL_FOO
Test this playbook by simulating a Prometheus alert:
robusta playbooks trigger prometheus_alert alert_name=TestAlert
Running Bash Commands for Alert RemediationΒΆ
Alerts can also be remediated with bash commands:
customPlaybooks:
- triggers:
- on_prometheus_alert:
alert_name: SomeAlert
actions:
- node_bash_enricher:
bash_command: do something
Further ReadingΒΆ
Reference for the alert_handling_job action
Reference for the node_bash_enricher action