List of built-in playbooks¶
Application Visibility and Troubleshooting¶
Restart loop reporter¶
Playbook Action
When a pod is in a restart loop, debug the issue, fetch the logs, and send useful information about the restart.

This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- restart_loop_reporter: {}
triggers:
- on_pod_all_changes: {}
The above is an example. Try customizing the trigger and parameters.
- rate_limit (int) = 3600
Rate limit the execution of this action, in seconds.
- restart_reason (str)
Report only restart loops with this specific restart reason. If omitted, all restart reasons are included.
on_pod_create
on_pod_all_changes
on_pod_delete
on_prometheus_alert
on_pod_update
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger restart_loop_reporter name=POD_NAME namespace=POD_NAMESPACE
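As a sketch of how the parameters above might be customized, the following variant reports only one restart reason and lowers the rate limit. The `CrashLoopBackOff` value and the 600-second limit are illustrative assumptions rather than values taken from this page; the trigger is one of the supported triggers listed above.

```yaml
actions:
- restart_loop_reporter:
    # Assumed example value; replace with the restart reason you care about.
    restart_reason: CrashLoopBackOff
    # Rate limit of 600 seconds instead of the default 3600.
    rate_limit: 600
triggers:
- on_pod_update: {}
```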
Pod ps¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- pod_ps: {}
triggers:
- on_pod_all_changes: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
on_pod_create
on_pod_all_changes
on_pod_delete
on_prometheus_alert
on_pod_update
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger pod_ps name=POD_NAME namespace=POD_NAMESPACE
Kubernetes Error Handling¶
Node health watcher¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- node_health_watcher: {}
triggers:
- on_node_all_changes: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
on_node_delete
on_node_all_changes
on_node_update
on_node_create
Alert on hpa reached limit¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- alert_on_hpa_reached_limit: {}
triggers:
- on_horizontalpodautoscaler_delete: {}
The above is an example. Try customizing the trigger and parameters.
- increase_pct (int) = 20
Increase the HPA max_replicas by this percentage.
on_horizontalpodautoscaler_delete
on_horizontalpodautoscaler_all_changes
on_horizontalpodautoscaler_create
on_horizontalpodautoscaler_update
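A possible customization, shown as a sketch only: the action is attached to HPA updates instead of deletions, and the suggested increase is raised from the default 20 percent. The value 50 is an arbitrary illustration.

```yaml
actions:
- alert_on_hpa_reached_limit:
    # Suggest raising max_replicas by 50% instead of the default 20%.
    increase_pct: 50
triggers:
- on_horizontalpodautoscaler_update: {}
```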
Scale hpa callback¶
Playbook Action
Update the max_replicas of this HPA to the specified value.
Usually used as a callback action, when the HPA reaches the max_replicas limit.
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- scale_hpa_callback:
    max_replicas: 1
triggers:
- on_horizontalpodautoscaler_delete: {}
The above is an example. Try customizing the trigger and parameters.
- max_replicas (int)
New max_replicas to set this HPA to.
on_horizontalpodautoscaler_delete
on_horizontalpodautoscaler_all_changes
on_horizontalpodautoscaler_create
on_horizontalpodautoscaler_update
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger scale_hpa_callback name=HORIZONTALPODAUTOSCALER_NAME namespace=HORIZONTALPODAUTOSCALER_NAMESPACE max_replicas=MAX_REPLICAS
Kubernetes Events¶
Event report¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- event_report: {}
triggers:
- on_kubernetes_warning_event: {}
The above is an example. Try customizing the trigger and parameters.
- rate_limit (int) = 3600
Rate limit the execution of this action, in seconds.
- finding_key (str) = DEFAULT
Specify the finding identifier, to reference it in other actions.
on_kubernetes_warning_event
on_event_all_changes
on_event_create
on_event_delete
on_event_update
Event resource events¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- event_resource_events: {}
triggers:
- on_kubernetes_warning_event: {}
The above is an example. Try customizing the trigger and parameters.
- finding_key (str) = DEFAULT
Specify the finding identifier, to reference it in other actions.
on_kubernetes_warning_event
on_event_all_changes
on_event_create
on_event_delete
on_event_update
Kubernetes Monitoring¶
Git change audit¶
Playbook Action
Audit Kubernetes resources from the cluster to Git as YAML files (cluster/namespace/resources hierarchy). Monitor resource changes and save them to a dedicated Git repository.
Using this audit repository, you can easily detect unplanned changes on your clusters.

This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- git_change_audit:
    cluster_name: string
    git_key: '********'
    git_url: git@github.com:arikalon1/robusta-audit.git
triggers:
- on_service_delete: {}
The above is an example. Try customizing the trigger and parameters.
- cluster_name (str)
This cluster name. Changes will be audited under this cluster name.
- git_url (str)
Audit Git repository url.
- git_key (str)
Git repository deployment key with write access. To set this up, generate a private/public key pair for GitHub.
- ignored_changes (str list)
List of changes that shouldn't be audited.
on_kubernetes_any_resource_all_changes
on_kubernetes_any_resource_create
on_kubernetes_any_resource_update
on_kubernetes_any_resource_delete
- Or any other inheriting trigger. See Triggers for details
on_pod_all_changes
on_job_delete
on_statefulset_all_changes
...
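To audit every resource type rather than only deleted services, the action can presumably be paired with the broadest trigger from the list above. This is a minimal sketch; `my-cluster` and the repository URL are hypothetical placeholders, and the key is masked as in the example above.

```yaml
actions:
- git_change_audit:
    # Hypothetical cluster name; changes are stored under this name.
    cluster_name: my-cluster
    # Deployment key with write access (masked here).
    git_key: '********'
    # Hypothetical audit repository; replace with your own.
    git_url: git@github.com:MY_ORG/MY_AUDIT_REPO.git
triggers:
- on_kubernetes_any_resource_all_changes: {}
```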
Deployment status report¶
Playbook Action
Collect screenshots of predefined Grafana panels after a deployment change. The report is generated at the intervals configured in the 'delays' parameter. When the report is ready, it is sent to the configured sinks.

This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- deployment_status_report:
    delays:
    - 1
    - 1
    grafana_api_key: '********'
    reports_panel_urls:
    - http://MY_GRAFANA/d-solo/SOME_OTHER_DASHBOARD/.../?orgId=1&from=now-1h&to=now&panelId=3
triggers:
- on_deployment_delete: {}
The above is an example. Try customizing the trigger and parameters.
- grafana_api_key (str)
Grafana API key.
- delays (int list)
List of intervals, in seconds, at which to generate this report. Specifying [60, 60] generates the report twice: 60 seconds and 120 seconds after the change.
- reports_panel_urls (str list)
List of panel URLs included in this report. It's highly recommended to use relative time arguments rather than absolute ones, e.g. from=now-1h&to=now
- report_name (str) = Deployment change report
The name of the report.
- fields_to_monitor (str list) = ['image']
List of yaml attributes to monitor. Any field that contains one of these strings will match.
on_deployment_all_changes
on_deployment_delete
on_deployment_update
on_deployment_create
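The `delays` semantics described above can be illustrated with a sketch that produces two reports, 60 and 120 seconds after a deployment update. The report name is a hypothetical example; the panel URL is the placeholder from the example above, and `fields_to_monitor` is set to its documented default for clarity.

```yaml
actions:
- deployment_status_report:
    # Two reports: 60 seconds and 120 seconds after the change.
    delays:
    - 60
    - 60
    grafana_api_key: '********'
    # Hypothetical report name (default is "Deployment change report").
    report_name: Image rollout report
    # Documented default: react to changes in fields containing "image".
    fields_to_monitor:
    - image
    reports_panel_urls:
    - http://MY_GRAFANA/d-solo/SOME_OTHER_DASHBOARD/.../?orgId=1&from=now-1h&to=now&panelId=3
triggers:
- on_deployment_update: {}
```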
Resource babysitter¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- resource_babysitter: {}
triggers:
- on_service_delete: {}
The above is an example. Try customizing the trigger and parameters.
- fields_to_monitor (str list) = ['spec']
List of yaml attributes to monitor. Any field that contains one of these strings will match.
- omitted_fields (str list) = ['status', 'metadata.generation', 'metadata.resourceVersion', 'metadata.managedFields', 'spec.replicas']
List of yaml attributes changes to ignore.
on_kubernetes_any_resource_all_changes
on_kubernetes_any_resource_create
on_kubernetes_any_resource_update
on_kubernetes_any_resource_delete
- Or any other inheriting trigger. See Triggers for details
on_pod_all_changes
on_job_delete
on_statefulset_all_changes
...
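A narrower configuration might watch only deployment updates and only part of the spec. This sketch assumes that `on_deployment_update` counts as one of the inheriting triggers mentioned above, and that `spec.template` is a useful field prefix for your workloads; neither is confirmed by this page.

```yaml
actions:
- resource_babysitter:
    # Any changed field whose path contains this string will match.
    fields_to_monitor:
    - spec.template
triggers:
- on_deployment_update: {}
```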
Incluster ping¶
Playbook Action
Check network connectivity in your cluster using ping. Pings a hostname from within the cluster.
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- incluster_ping:
    hostname: string
triggers:
- on_pod_create: {}
The above is an example. Try customizing the trigger and parameters.
- hostname (str)
Ping target host name.
Any trigger
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger incluster_ping hostname=HOSTNAME
Integrations¶
Argo app sync¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- argo_app_sync:
    argo_app_name: string
    argo_token: '********'
    argo_url: https://my-argo-cd.com
triggers:
- on_pod_create: {}
The above is an example. Try customizing the trigger and parameters.
- argo_url (str)
http(s) Argo CD server url.
- argo_token (str)
Argo CD authentication token.
- argo_app_name (str)
Argo CD application that needs syncing.
- argo_verify_server_cert (bool) = True
Verify the Argo CD server certificate. Defaults to True.
- rate_limit_seconds (int) = 1800
Rate limit for this playbook, in seconds. Defaults to 1800 seconds.
Any trigger
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger argo_app_sync argo_url=ARGO_URL argo_token=ARGO_TOKEN argo_app_name=ARGO_APP_NAME
Kubernetes Optimization¶
Config ab testing¶
Playbook Action
Apply YAML configurations to Kubernetes resources for limited periods of time.
Adds Grafana annotations showing when each configuration was applied.
The execution schedule is defined by the playbook trigger (every X seconds).
- Commonly used for:
Troubleshooting - Finding the first version a production bug appeared by iterating over image tags
Cost/performance optimization - Comparing the cost or performance of different deployment configurations
- Note:
Only changing attributes that already exist in the active configuration is supported.
For example, you can change resources.requests.cpu, if that attribute already exists in the deployment.

This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- config_ab_testing:
    configuration_sets:
    - config_items: '"spec.template.spec.containers[0].resources.requests.cpu": 250m,
        "spec.template.spec.containers[0].resources.requests.memory": 128Mi'
      config_set_name: string
    - config_items: '"spec.template.spec.containers[0].resources.requests.cpu": 250m,
        "spec.template.spec.containers[0].resources.requests.memory": 128Mi'
      config_set_name: string
    grafana_api_key: '********'
    grafana_dashboard_uid: 09ec8aa1e996d6ffcd6817bbaff4db1b
    grafana_url: http://grafana.namespace.svc
    kind: string
    name: string
triggers:
- on_schedule: {}
The above is an example. Try customizing the trigger and parameters.
- grafana_api_key (str)
Grafana API key with write permissions.
- grafana_dashboard_uid (str)
Dashboard ID as it appears in the dashboard's URL.
- kind (str)
The kind of the tested resource, e.g. 'Deployment' or 'StatefulSet'.
- name (str)
The name of the tested resource.
- configuration_sets (complex list)
List of test configurations.
Each entry contains:
required:
- config_set_name (str)
The name of this configuration set.
- config_items (str dict)
The YAML attribute values for this configuration set.
- grafana_url (str)
http(s) URL of Grafana, or None for autodetection of an in-cluster Grafana.
- api_version (str) = v1
The api version of the tested resource.
- namespace (str) = default
The namespace of the tested resource.
on_schedule
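The generated example above renders `config_items` as a single quoted string. Since the parameter is documented as a str dict, it can presumably be written as a YAML mapping instead; this is an assumption, not something this page confirms. The sketch below compares two CPU requests on a hypothetical Deployment named `my-app`. The set names are also hypothetical, and the trigger's schedule parameters are left empty because they are not documented here.

```yaml
actions:
- config_ab_testing:
    kind: Deployment
    # Hypothetical resource name; namespace defaults to "default".
    name: my-app
    grafana_api_key: '********'
    grafana_dashboard_uid: 09ec8aa1e996d6ffcd6817bbaff4db1b
    configuration_sets:
    # Hypothetical set names; each set changes an attribute that
    # must already exist in the active configuration.
    - config_set_name: low-cpu
      config_items:
        "spec.template.spec.containers[0].resources.requests.cpu": 250m
    - config_set_name: high-cpu
      config_items:
        "spec.template.spec.containers[0].resources.requests.cpu": 500m
triggers:
- on_schedule: {}
```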
Disk benchmark¶
Playbook Action
Run a disk benchmark in your cluster. The benchmark creates a PVC, using the configured storage class, and runs the benchmark using fio. For more details: https://fio.readthedocs.io/en/latest/

This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- disk_benchmark:
    storage_class_name: string
triggers:
- on_pod_create: {}
The above is an example. Try customizing the trigger and parameters.
- storage_class_name (str)
PVC storage class, from the available cluster storage classes (standard, fast, etc.).
- pvc_name (str) = robusta-disk-benchmark
Name of the PVC created for the benchmark.
- test_seconds (int) = 20
The benchmark duration, in seconds.
- namespace (str) = robusta
Namespace used for the benchmark.
- disk_size (str) = 10Gi
The size of the PVC used for the benchmark.
Any trigger
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger disk_benchmark storage_class_name=STORAGE_CLASS_NAME
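A longer benchmark on a larger volume could be configured along these lines. This is only a sketch: `standard` is one of the storage class names mentioned above and may not exist in your cluster, and the duration and size are arbitrary.

```yaml
actions:
- disk_benchmark:
    # Must match a storage class available in your cluster.
    storage_class_name: standard
    # Run for 60 seconds instead of the default 20.
    test_seconds: 60
    # Use a 20Gi PVC instead of the default 10Gi.
    disk_size: 20Gi
triggers:
- on_pod_create: {}
```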
Stress Testing and Chaos Engineering¶
Generate high cpu¶
Playbook Action
Create a pod with high CPU on the cluster for 60 seconds. Can be used to simulate alerts or other high CPU load scenarios.
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- generate_high_cpu: {}
triggers:
- on_pod_create: {}
The above is an example. Try customizing the trigger and parameters.
No action parameters
Any trigger
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger generate_high_cpu
Http stress test¶
Playbook Action
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- http_stress_test:
    url: string
triggers:
- on_pod_create: {}
The above is an example. Try customizing the trigger and parameters.
- url (str)
In-cluster target URL.
- n (int) = 1000
Number of requests to run.
Any trigger
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger http_stress_test url=URL
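For illustration, a configuration that sends more requests than the default might look like the following. The service URL is a hypothetical in-cluster address, and 5000 is an arbitrary request count.

```yaml
actions:
- http_stress_test:
    # Hypothetical in-cluster service address.
    url: http://my-service.default.svc.cluster.local:8080
    # Send 5000 requests instead of the default 1000.
    n: 5000
triggers:
- on_pod_create: {}
```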
Prometheus alert¶
Playbook Action
Simulate a Prometheus alert sent to the Robusta runner. Can be used for testing when implementing actions triggered by Prometheus alerts.
This action can be run automatically.
Add this to your Robusta configuration (values.yaml when installing with Helm):
actions:
- prometheus_alert:
    alert_name: string
    pod_name: string
triggers:
- on_pod_create: {}
The above is an example. Try customizing the trigger and parameters.
- alert_name (str)
Simulated alert name.
- pod_name (str)
Pod name, for a simulated pod alert.
- namespace (str) = default
Pod namespace, for a simulated pod alert.
- status (str) = firing
Simulated alert status: firing or resolved.
- severity (str) = error
Simulated alert severity.
- description (str) = simulated prometheus alert
Simulated alert description.
- generator_url (str)
Prometheus generator_url. Some enrichers use this attribute to query Prometheus.
Any trigger
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger prometheus_alert alert_name=ALERT_NAME pod_name=POD_NAME
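When testing how a playbook handles a resolved alert, the optional parameters above can be set explicitly. In this sketch the alert name, pod name, and description are hypothetical placeholders; `status` uses one of the two documented values.

```yaml
actions:
- prometheus_alert:
    # Hypothetical alert and pod names, for illustration only.
    alert_name: MyTestAlert
    pod_name: my-app-pod
    namespace: default
    # Documented values are firing (default) and resolved.
    status: resolved
    description: simulated resolved alert for playbook testing
triggers:
- on_pod_create: {}
```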