Remediation

These actions run remediation procedures in your cluster.

Alert handling job

Playbook Action: alert_handling_job

Creates a Kubernetes job with the specified parameters.

In addition, the job pod receives the following alert parameters as environment variables:

ALERT_NAME

ALERT_STATUS

ALERT_OBJ_KIND - one of pod, deployment, node, job, or daemonset, or None if the kind is unknown

ALERT_OBJ_NAME

ALERT_OBJ_NAMESPACE (if present)

ALERT_OBJ_NODE (if present)

Add this to your Robusta configuration (Helm values.yaml):

actions:
- alert_handling_job:
    command:
    - perl
    - -Mbignum=bpi
    - -wle
    - print bpi(2000)
    image: string
triggers:
- on_prometheus_alert: {}

The above is an example. Try customizing the trigger and parameters.
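
As a rough sketch of how a job could consume the alert environment variables listed above, the variation below runs a shell one-liner that echoes a few of them. The busybox image and the sh -c command are illustrative assumptions, not part of the original example. Any image that ships a POSIX shell should behave the same way.

```yaml
actions:
- alert_handling_job:
    # Assumption: busybox is used only for illustration.
    image: busybox
    command:
    - sh
    - -c
    # The shell expands the alert parameters as ordinary environment variables.
    - 'echo "alert=$ALERT_NAME status=$ALERT_STATUS object=$ALERT_OBJ_KIND/$ALERT_OBJ_NAME"'
triggers:
- on_prometheus_alert: {}
```

ALERT_OBJ_NAMESPACE and ALERT_OBJ_NODE are set only when present, so a script that reads them should tolerate their absence.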

required:
image (str)

The job image.

command (str list)

The job command, as an array of strings.

optional:
name (str) = robusta-action-job

Custom name for the job and job container.

namespace (str) = default

The namespace in which the job is created.

service_account (str)

Service account for the job pod. If omitted, the default service account is used.

restart_policy (str) = OnFailure

Restart policy for the job container.

job_ttl_after_finished (int)

Time to live, in seconds, after which a finished job is deleted. If omitted, finished jobs are not deleted automatically.

notify (bool)

Add a notification when the job is created.

backoff_limit (int)

The number of retries before the job is marked as failed. Defaults to 6.

active_deadline_seconds (int)

The duration, in seconds, relative to the job's startTime that the job may be active before the system tries to terminate it. The value must be a positive integer.
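
To show how the optional parameters fit together, here is a minimal sketch that sets most of them at once. The image, command, job name, namespace, and numeric values are all hypothetical and chosen only for illustration; adjust them to your environment.

```yaml
actions:
- alert_handling_job:
    image: busybox                 # assumption: illustrative image
    command:
    - sh
    - -c
    - echo handling alert
    name: my-alert-job             # hypothetical job and container name
    namespace: robusta             # hypothetical namespace; must already exist
    restart_policy: Never          # overrides the OnFailure default
    job_ttl_after_finished: 600    # delete the finished job after 10 minutes
    notify: true                   # add a notification when the job is created
    backoff_limit: 2               # retry at most twice before marking the job failed
    active_deadline_seconds: 120   # terminate the job if it is still active after 2 minutes
triggers:
- on_prometheus_alert: {}
```

Setting job_ttl_after_finished is usually worthwhile for jobs created on every alert, since finished jobs are otherwise left in the cluster.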

Delete pod

Playbook Action: delete_pod

Deletes a pod.

Add this to your Robusta configuration (Helm values.yaml):

actions:
- delete_pod: {}
triggers:
- on_container_oom_killed: {}

The above is an example. Try customizing the trigger and parameters.

No action parameters

This action can be manually triggered using the Robusta CLI:

robusta playbooks trigger delete_pod name=POD_NAME namespace=POD_NAMESPACE 

Alert on HPA reached limit

Playbook Action: alert_on_hpa_reached_limit

Notifies when an HPA reaches its maximum replica count and offers to increase the limit.

Add this to your Robusta configuration (Helm values.yaml):

actions:
- alert_on_hpa_reached_limit: {}
triggers:
- on_horizontalpodautoscaler_update: {}

The above is an example. Try customizing the trigger and parameters.

optional:
increase_pct (int) = 20

Increase the HPA max_replicas by this percentage.
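
As an illustration of overriding the default, the sketch below sets increase_pct to 50. The value 50 is an arbitrary choice for this example.

```yaml
actions:
- alert_on_hpa_reached_limit:
    # Raise max_replicas by 50% instead of the default 20%.
    increase_pct: 50
triggers:
- on_horizontalpodautoscaler_update: {}
```

With this setting, an HPA whose max_replicas is 10 would be raised to 15.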