Enhanced Prometheus Alerts
Robusta enriches Prometheus alerts by correlating them with other observability data.
Robusta has two primary sources of alerts:
Prometheus alerts, forwarded by AlertManager to Robusta
APIServer alerts, generated by Robusta itself (e.g. for OOMKilled pods)
Let's see each type of alert in action.
Testing out Prometheus alerts
Deploy a broken pod that will be stuck in the Pending state:
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/pending_pods/pending_pod_resources.yaml
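Before triggering the alert, it can help to confirm that the pod really is stuck. The following is a minimal sketch, not part of Robusta: `pending_pods` is a hypothetical helper name, and it assumes the demo pod was created in the `default` namespace. It relies only on the standard `--field-selector` flag of `kubectl get`.

```shell
# Hypothetical helper (not part of Robusta): list pods in the Pending phase.
# The namespace defaults to "default", which is an assumption about where
# the demo manifest creates its pod.
pending_pods() {
  kubectl get pods --namespace "${1:-default}" --field-selector=status.phase=Pending
}

# Usage: pending_pods            # checks the "default" namespace
#        pending_pods my-ns      # checks another namespace
```

If the demo pod is listed, it is waiting to be scheduled and the test alert has a target to report on.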
Trigger a Prometheus alert immediately, skipping the normal delays:
robusta playbooks trigger prometheus_alert alert_name=KubePodCrashLooping namespace=default pod_name=example-pod
Testing out APIServer alerts
Let's deploy a crashing pod:
kubectl apply -f https://gist.githubusercontent.com/robusta-lab/283609047306dc1f05cf59806ade30b6/raw
Verify that the pod is actually crashing:
$ kubectl get pods -A | grep crashpod
NAME READY STATUS RESTARTS AGE
crashpod-64d8fbfd-s2dvn 0/1 CrashLoopBackOff 1 7s
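Rather than re-running the command by hand, the restart count can be polled. This is a rough sketch under stated assumptions: `restart_count` and `wait_for_restarts` are hypothetical helper names, the pod is assumed to live in the current namespace, and the parsing assumes the default `kubectl get pods` column order (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: print the RESTARTS value (4th column) of the first pod
# whose name starts with the given prefix. Assumes the default column order
# NAME READY STATUS RESTARTS AGE in the current namespace.
restart_count() {
  kubectl get pods --no-headers |
    awk -v prefix="$1" '!found && index($1, prefix) == 1 { print $4; found = 1 }'
}

# Hypothetical helper: poll every 5 seconds until the pod has restarted at
# least the target number of times (default: 2).
wait_for_restarts() {
  prefix="$1"
  target="${2:-2}"
  while :; do
    count="$(restart_count "$prefix")"
    # An empty count (pod not found yet) is treated as 0.
    if [ "${count:-0}" -ge "$target" ]; then
      echo "$prefix restarted $count times"
      return 0
    fi
    sleep 5
  done
}

# Usage: wait_for_restarts crashpod 2
```

The loop has no timeout, so interrupt it with Ctrl-C if the pod never starts crashing.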
Once the pod has restarted twice, you'll be notified in Slack (or in whichever alternative integration you configured).
Now open the Robusta UI and look for the same message there.
Finally, clean up the crashing pod:
kubectl delete deployment crashpod