Remediation
Robusta includes actions that modify Kubernetes resources in your cluster.
Alert handling job
Playbook Action: alert_handling_job
Creates a Kubernetes job with the specified parameters.
In addition, the job pod receives the following alert parameters as environment variables:
ALERT_NAME
ALERT_STATUS
ALERT_OBJ_KIND - one of pod, deployment, node, job, or daemonset, or None if the kind is unknown
ALERT_OBJ_NAME
ALERT_OBJ_NAMESPACE (If present)
ALERT_OBJ_NODE (If present)
ALERT_LABEL_{LABEL_NAME} for every label on the alert. For example, a label named foo becomes ALERT_LABEL_FOO.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - alert_handling_job:
      command:
      - perl
      - -Mbignum=bpi
      - -wle
      - print bpi(2000)
      image: string
  triggers:
  - on_prometheus_alert: {}
The above is an example. Try customizing the trigger and parameters.
- image (str)
The job image.
- command (str list)
The job command, as an array of strings.
- name (str) = robusta-action-job
Custom name for the job and job container.
- namespace (str) = default
The namespace in which the job is created.
- service_account (str)
Job pod service account. If omitted, default is used.
- restart_policy (str) = OnFailure
Job container restart policy.
- job_ttl_after_finished (int) = 120
Time to live, in seconds, after which a finished job is deleted. If omitted, jobs are not deleted automatically.
- notify (bool)
Send a notification when the job is created.
- wait_for_completion (bool) = True
Wait for the job to complete and attach its output. Only relevant when notify=true.
- completion_timeout (int) = 300
Maximum number of seconds to wait for the job to complete. Only relevant when wait_for_completion=true.
- backoff_limit (int)
Specifies the number of retries before marking this job failed. Defaults to 6.
- active_deadline_seconds (int)
Specifies the duration, in seconds, relative to the startTime that the job may be active before the system tries to terminate it. The value must be a positive integer.
- env (envvar list)
Inject environment variables and secrets just like you do with a Kubernetes Job.
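The parameters and environment variables above can be combined in a single playbook. The following is a minimal sketch, not a recommended setup. The busybox image, the job name alert-echo-job, the EXAMPLE_SETTING variable, and the KubePodCrashLooping alert name are assumptions chosen for illustration, and the alert_name filter on on_prometheus_alert is assumed to be available in your Robusta version.

```yaml
customPlaybooks:
- actions:
  - alert_handling_job:
      # Hypothetical image and job name, used only for illustration.
      image: busybox
      name: alert-echo-job
      namespace: default
      # The shell expands the ALERT_* variables that the job pod receives.
      command:
      - sh
      - -c
      - 'echo "Handling $ALERT_NAME ($ALERT_STATUS) on $ALERT_OBJ_KIND/$ALERT_OBJ_NAME"'
      # Send a notification and attach the job output once it completes.
      notify: true
      wait_for_completion: true
      completion_timeout: 120
      backoff_limit: 1
      # Extra variables use the same name/value shape as a Kubernetes Job.
      env:
      - name: EXAMPLE_SETTING
        value: "demo"
  triggers:
  - on_prometheus_alert:
      # Assumed alert name; replace it with an alert defined in your cluster.
      alert_name: KubePodCrashLooping
```

Because notify and wait_for_completion are both enabled in this sketch, the echoed line should appear in the notification as the job output, provided the job finishes within completion_timeout.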
Delete pod
Playbook Action: delete_pod
Deletes a pod.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - delete_pod: {}
  triggers:
  - on_pod_all_changes: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger delete_pod name=POD_NAME namespace=POD_NAMESPACE
Delete job
Playbook Action: delete_job
Deletes the job from the cluster.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - delete_job: {}
  triggers:
  - on_job_failure: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger delete_job name=JOB_NAME namespace=JOB_NAMESPACE
Alert on HPA reached limit
Playbook Action: alert_on_hpa_reached_limit
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - alert_on_hpa_reached_limit: {}
  triggers:
  - on_horizontalpodautoscaler_update: {}
The above is an example. Try customizing the trigger and parameters.
- increase_pct (int) = 20
Increase the HPA max_replicas by this percentage.
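As an illustration of the parameter, the fragment below sets increase_pct to 50 instead of the default 20. Under that setting, an HPA whose max_replicas is 10 would be raised by 50 percent, to 15. How fractional results are rounded is not stated here, so treat that detail as unknown.

```yaml
customPlaybooks:
- actions:
  - alert_on_hpa_reached_limit:
      # Illustrative value; the documented default is 20.
      increase_pct: 50
  triggers:
  - on_horizontalpodautoscaler_update: {}
```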
Rollout restart
Playbook Action: rollout_restart
Performs a rollout restart on a Kubernetes workload. Supports events related to deployments, deploymentconfigs, daemonsets, and statefulsets.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - rollout_restart: {}
  triggers:
  - on_prometheus_alert: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger rollout_restart kind=RESOURCE_KIND name=RESOURCE_NAME
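An on_prometheus_alert trigger with no filters is not limited to a particular alert, so a restart action usually benefits from a narrower trigger. The following is a minimal sketch, assuming the alert_name filter of on_prometheus_alert is available in your Robusta version. MyAppHighMemory is a hypothetical alert name.

```yaml
customPlaybooks:
- actions:
  - rollout_restart: {}
  triggers:
  - on_prometheus_alert:
      # Hypothetical alert name; replace it with an alert defined in your cluster.
      alert_name: MyAppHighMemory
```

The alert presumably needs to relate to one of the supported workload kinds for the restart to have a target.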
Restart named rollout
Playbook Action: restart_named_rollout
Performs a rollout restart on a named Argo rollout.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - restart_named_rollout:
      name: string
      namespace: string
  triggers:
  - on_prometheus_alert: {}
The above is an example. Try customizing the trigger and parameters.
- name (str)
Resource name
- namespace (str)
Resource namespace
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger restart_named_rollout name=NAME namespace=NAMESPACE
Node
Cordon
Playbook Action: cordon
Cordons a node, marking it as unschedulable.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - cordon: {}
  triggers:
  - on_node_create: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger cordon name=NODE_NAME
Uncordon
Playbook Action: uncordon
Uncordons a node, marking it as schedulable.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - uncordon: {}
  triggers:
  - on_node_create: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger uncordon name=NODE_NAME
Drain
Playbook Action: drain
Drains a node: marks it as unschedulable and evicts all pods from it. DaemonSet pods are skipped, as they tolerate unschedulable nodes by default.
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - drain: {}
  triggers:
  - on_node_create: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger drain name=NODE_NAME