Playbook Basics

A playbook is an automation rule for detecting, investigating, or fixing problems in your cluster.

For a gentle introduction, see What are Playbooks?

Overview

Every playbook consists of a condition (trigger) and instructions (actions) defining the response.

Playbooks behave like pipelines:

  1. Events come into Robusta and are checked against triggers.

  2. When there is a match, a trigger fires.

  3. The relevant playbook runs.

  4. All playbook actions execute, receiving the event as context.

  5. If notifications were generated by the playbook, they are sent to sinks.

Defining Custom Playbooks

Using a custom playbook, we can get notified in Slack whenever a Pod's Liveness probe fails.

Use the customPlaybooks Helm value:

customPlaybooks:
- triggers:
    - on_kubernetes_warning_event_create:
        include: ["Liveness"]   # fires on failed Liveness probes
  actions:
    - create_finding:
        severity: HIGH
        title: "Failed liveness probe: $name"
    - event_resource_events: {}

Perform a Helm Upgrade to apply the custom playbook.

Next time a Liveness probe fails, you will get notified.

Run the following command to simulate a failing liveness probe:

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/liveness_probe_fail/failing_liveness_probe.yaml

Let's explore each part of the above playbook in depth.

Understanding Triggers

Triggers are event-driven, firing at specific moments when something occurs in your cluster. Even a Kubernetes cluster doing nothing generates a constant stream of events. Using triggers, you can find and react to the events that matter.

Going back to the above example, we saw the trigger on_kubernetes_warning_event_create. Breaking down the name, you'll notice the format on_<resource_type>_<operation>. This is a general pattern. on_kubernetes_warning_event_create fires when new Warning Events (kubectl get events --all-namespaces --field-selector type=Warning) are created.

The trigger also had an include filter, limiting which Warning Events cause the playbook to run. In this case, it matches Liveness probe events. See each trigger's documentation to learn which filters are supported.
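To illustrate the on_<resource_type>_<operation> naming pattern with a different resource, here is a minimal sketch that reacts to Deployment updates. It is not taken from the example above: it assumes the on_deployment_update trigger can be used without filters, that create_finding is compatible with the event it outputs, and that INFO is a valid severity. Check the Triggers Reference and Actions Reference before relying on it.

```yaml
customPlaybooks:
- triggers:
    # resource_type = deployment, operation = update
    - on_deployment_update: {}
  actions:
    # assumption: create_finding works with this trigger's event
    - create_finding:
        severity: INFO
        title: "Deployment updated: $name"
```

Without a filter, a playbook like this would run on every Deployment update, so in practice you would likely narrow it with whichever filters the trigger supports.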

Common Triggers

Popular triggers include:

All triggers can be found under Triggers Reference.

Understanding Actions

Actions perform tasks in response to triggers, such as collecting information, investigating issues, or fixing problems.

In the above example, there were two actions. When playbooks contain multiple actions, they are executed in order:

  • create_finding - this generates the notification message

  • event_resource_events - this is a specific action for on_kubernetes_warning_event_create which attaches relevant events to the notification

The latter action has a funny name, which reflects that it takes a Kubernetes Warning Event as input, finds the related Kubernetes resource (e.g. a Pod), and then fetches all the related Kubernetes Warning Events for that resource.

Actions, Enrichers, and Silencers

Many actions in Robusta were written for a specific purpose, like enriching alerts or silencing them.

These actions are called enrichers and silencers, but the names are only a convention.

Under the hood, enrichers and silencers are plain old actions, nothing more.

Common Actions

Popular actions include:

All actions can be found under Actions Reference.

Understanding Notifications

In Robusta, notifications are called Findings, as they represent something the playbook discovered.

In the above example, a Finding was generated by the create_finding action. Refer to Creating Notifications for more details.
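As a rough sketch of how a Finding's content might be customized, the playbook below extends the earlier create_finding action. The description parameter and the $namespace template variable are assumptions not shown elsewhere on this page; confirm them in the create_finding documentation.

```yaml
customPlaybooks:
- triggers:
    - on_kubernetes_warning_event_create:
        include: ["Liveness"]
  actions:
    - create_finding:
        severity: HIGH
        title: "Failed liveness probe: $name"
        # assumed parameter and template variable
        description: "Liveness probe failed in namespace $namespace"
```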

Matching Actions to Triggers

Triggers output typed events when they fire. For example:

  • The on_prometheus_alert trigger outputs a PrometheusAlert event

  • The on_pod_update trigger outputs a PodChangeEvent event

Each action is compatible with a subset of event types.

For instance, logs_enricher requires an event with a Pod object, such as PrometheusAlert, PodEvent, or PodChangeEvent.
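A compatible pairing might look like the sketch below, which attaches Pod logs to a Prometheus alert. The alert_name filter and the KubePodCrashLooping alert are illustrative assumptions, not taken from the text above; the alert must relate to a Pod for logs_enricher to have something to fetch.

```yaml
customPlaybooks:
- triggers:
    # outputs a PrometheusAlert event
    - on_prometheus_alert:
        alert_name: KubePodCrashLooping   # assumed filter and alert name
  actions:
    # requires an event with a Pod object
    - logs_enricher: {}
```

Pairing logs_enricher with a trigger whose event carries no Pod, by contrast, would not be a valid combination.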

Refer to the documentation for each action to see which events it supports.