Remediation

These actions actively run procedures in your cluster.

Alert handling job

Playbook Action: alert_handling_job

Creates a Kubernetes job with the specified parameters.

In addition, the job pod receives the following alert parameters as environment variables:

ALERT_NAME

ALERT_STATUS

ALERT_OBJ_KIND: one of pod, deployment, node, job, or daemonset, or None if the kind is unknown

ALERT_OBJ_NAME

ALERT_OBJ_NAMESPACE (if present)

ALERT_OBJ_NODE (if present)

Add this to your Robusta configuration (Helm values.yaml):

customPlaybooks:
- actions:
  - alert_handling_job:
      command:
      - perl
      - -Mbignum=bpi
      - -wle
      - print bpi(2000)
      image: perl  # any image that provides the perl binary used by the command above
  triggers:
  - on_prometheus_alert: {}

The above is an example. Try customizing the trigger and parameters.

required:
image (str)

The job image.

command (str list)

The job command, as an array of strings.

optional:
name (str) = robusta-action-job

Custom name for the job and job container.

namespace (str) = default

The created job namespace.

service_account (str)

Service account for the job pod. If omitted, the default service account is used.

restart_policy (str) = OnFailure

Restart policy for the job container.

job_ttl_after_finished (int)

Time to live (TTL), in seconds, after which a finished job is deleted. If omitted, finished jobs are not deleted automatically.

notify (bool)

Send a notification when the job is created.

backoff_limit (int)

The number of retries before the job is marked as failed. Defaults to 6.

active_deadline_seconds (int)

The duration, in seconds, relative to the job's startTime, that the job may be active before the system tries to terminate it. The value must be a positive integer.
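As a rough sketch of how the optional parameters and the alert environment variables fit together, the configuration below runs a short-lived job that echoes the alert details. The busybox image, the job name, and the alert name HighCPUUsage are placeholders chosen for illustration, and the alert_name filter on on_prometheus_alert is assumed to be available in your Robusta version.

```yaml
customPlaybooks:
- actions:
  - alert_handling_job:
      # Hypothetical image; any image with a POSIX shell would do.
      image: busybox
      command:
      - sh
      - -c
      # The ALERT_* variables are injected into the job pod by the action.
      - echo "alert=$ALERT_NAME status=$ALERT_STATUS kind=$ALERT_OBJ_KIND name=$ALERT_OBJ_NAME"
      name: alert-echo-job          # placeholder job name
      namespace: default
      restart_policy: Never
      backoff_limit: 2              # retry at most twice
      active_deadline_seconds: 120  # terminate the job after two minutes
      job_ttl_after_finished: 300   # delete the finished job after five minutes
      notify: true
  triggers:
  - on_prometheus_alert:
      alert_name: HighCPUUsage      # placeholder alert name (assumed filter)
```

Setting job_ttl_after_finished keeps finished jobs from accumulating, since jobs are otherwise not deleted automatically.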

Delete pod

Playbook Action: delete_pod

Deletes a pod.

Add this to your Robusta configuration (Helm values.yaml):

customPlaybooks:
- actions:
  - delete_pod: {}
  triggers:
  - on_pod_crash_loop: {}

The above is an example. Try customizing the trigger and parameters.

No action parameters

This action can be manually triggered using the Robusta CLI:

robusta playbooks trigger delete_pod name=POD_NAME namespace=POD_NAMESPACE 
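Because deleting pods is destructive, it may be worth narrowing the trigger. The sketch below assumes that on_pod_crash_loop accepts the namespace_prefix, restart_reason, and restart_count filters; check the trigger reference for your Robusta version before relying on them. The staging namespace prefix is a placeholder.

```yaml
customPlaybooks:
- actions:
  - delete_pod: {}
  triggers:
  - on_pod_crash_loop:
      namespace_prefix: staging        # placeholder; assumed filter
      restart_reason: CrashLoopBackOff # assumed filter
      restart_count: 5                 # assumed filter
```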

Delete job

Playbook Action: delete_job

Deletes the job from the cluster.

Add this to your Robusta configuration (Helm values.yaml):

customPlaybooks:
- actions:
  - delete_job: {}
  triggers:
  - on_job_failure: {}

The above is an example. Try customizing the trigger and parameters.

No action parameters

This action can be manually triggered using the Robusta CLI:

robusta playbooks trigger delete_job name=JOB_NAME namespace=JOB_NAMESPACE 

Alert on HPA reached limit

Playbook Action: alert_on_hpa_reached_limit

Notifies when an HPA reaches its maximum replica count and offers a way to raise the limit.

Add this to your Robusta configuration (Helm values.yaml):

customPlaybooks:
- actions:
  - alert_on_hpa_reached_limit: {}
  triggers:
  - on_horizontalpodautoscaler_update: {}

The above is an example. Try customizing the trigger and parameters.

optional:
increase_pct (int) = 20

Increase the HPA max_replicas by this percentage.
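To illustrate the optional parameter, the sketch below raises the limit by 50 percent instead of the default 20. Assuming the percentage is applied to the current maximum, an HPA capped at 10 replicas would be raised to about 15; the exact rounding behavior is not stated here.

```yaml
customPlaybooks:
- actions:
  - alert_on_hpa_reached_limit:
      increase_pct: 50   # illustrative value; the default is 20
  triggers:
  - on_horizontalpodautoscaler_update: {}
```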