Monitor Kubernetes from Scratch

Estimated time: 5 minutes

Set up Kubernetes monitoring from scratch by installing Robusta, Prometheus, and Grafana on Kubernetes using Helm. This all-in-one package is the recommended way to monitor your cluster.

Prerequisites

Have questions?

Ask on Slack or open a GitHub issue

Generate a Config

Robusta needs settings to work. For example, if you use Slack then Robusta needs a Slack API key. These settings are configured as Helm values.

Use the robusta CLI to generate the Helm values. You can install the CLI with pip or run it inside a prebuilt container.

Using pip:

pip install -U robusta-cli --no-cache
robusta gen-config --enable-prometheus-stack

Using the prebuilt container:

A Docker daemon and bash are required.

On Windows, use bash inside WSL.

curl -fsSL -o robusta https://docs.robusta.dev/master/_static/robusta
chmod +x robusta
./robusta gen-config --enable-prometheus-stack

You should now have a generated_values.yaml file with a Robusta config. Save this file! You'll need it to install Robusta on new clusters.

This file contains sensitive values. Refer to Managing Secrets for tips on protecting them.
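
As a small precaution while the file lives on your workstation, you can restrict it to your own user. The following is a minimal sketch, not part of the robusta CLI: `protect_values_file` is a hypothetical helper name, and it assumes the file is in the current directory unless a path is passed. It does not replace the approaches described in Managing Secrets.

```shell
# Sketch: restrict a generated values file to its owner.
# protect_values_file is a hypothetical helper, not a robusta command.
protect_values_file() {
    file="${1:-generated_values.yaml}"
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 1
    fi
    chmod 600 "$file"   # owner read/write only
    echo "protected: $file"
}
```

Call it as `protect_values_file ./generated_values.yaml` after running `gen-config`.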

Install with Helm

Copy the commands below, replacing the <YOUR_CLUSTER_NAME> placeholder.

On some clusters this can take a while, so don't panic if it appears stuck:

helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
helm install robusta robusta/robusta -f ./generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>

On EKS, ensure storage is enabled on your cluster to use all Robusta features. If necessary, refer to the EKS documentation and install the EBS CSI add-on.

How do I know if my cluster has storage enabled?

Try installing Robusta. If storage is not configured, you'll receive an error:

PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition

Running kubectl get pvc -A will also show PersistentVolumeClaims in Pending state.

In this case, follow the instructions above and enable storage for your cluster.
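
To spot stuck claims without reading the whole table, you can filter the `kubectl get pvc -A` output. This is a minimal sketch: `pending_pvcs` is a hypothetical helper, and it assumes the standard column order of `kubectl get pvc -A --no-headers` (NAMESPACE, NAME, STATUS, ...).

```shell
# pending_pvcs is a hypothetical helper. It reads the output of
# `kubectl get pvc -A --no-headers` on stdin (columns: NAMESPACE NAME STATUS ...)
# and prints namespace/name for every claim whose STATUS is Pending.
pending_pvcs() {
    awk '$3 == "Pending" { print $1 "/" $2 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pvc -A --no-headers | pending_pvcs
```

Empty output suggests that no claims are waiting for a volume.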

helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
helm install robusta robusta/robusta -f ./generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>

Due to GKE Autopilot restrictions, some components are disabled for Robusta's bundled Prometheus. Don't worry, everything will still work.

helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
helm install robusta robusta/robusta -f ./generated_values.yaml \
    --set clusterName=<YOUR_CLUSTER_NAME> \
    --set kube-prometheus-stack.coreDns.enabled=false \
    --set kube-prometheus-stack.kubeControllerManager.enabled=false \
    --set kube-prometheus-stack.kubeDns.enabled=false \
    --set kube-prometheus-stack.kubeEtcd.enabled=false \
    --set kube-prometheus-stack.kubeProxy.enabled=false \
    --set kube-prometheus-stack.kubeScheduler.enabled=false \
    --set kube-prometheus-stack.nodeExporter.enabled=false \
    --set kube-prometheus-stack.prometheusOperator.kubeletService.enabled=false

Install as usual, then grant relevant permissions.

helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
helm install robusta robusta/robusta -f ./generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>

Test clusters tend to have fewer resources. To lower Robusta's resource requests, set isSmallCluster=true.

helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
helm install robusta robusta/robusta -f ./generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME> --set isSmallCluster=true

Verifying Installation

Confirm that two Robusta pods are running with no errors in the logs:

kubectl get pods -A | grep robusta
robusta logs
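
If the pod list is long, a filter can surface only the pods that need attention. This is a minimal sketch: `unhealthy_robusta_pods` is a hypothetical helper, it assumes the column order of `kubectl get pods -A --no-headers` (NAMESPACE, NAME, READY, STATUS, ...), and it treats any status other than Running or Completed as worth a look.

```shell
# unhealthy_robusta_pods is a hypothetical helper. It reads the output of
# `kubectl get pods -A --no-headers` on stdin and prints "namespace/name status"
# for each pod with "robusta" in its name that is neither Running nor Completed.
unhealthy_robusta_pods() {
    awk '$2 ~ /robusta/ && $4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -A --no-headers | unhealthy_robusta_pods
```

Empty output means every matching pod reported Running or Completed. You still need to check the logs for errors.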

See Robusta in action

Deploy a crashing pod:

kubectl apply -f https://gist.githubusercontent.com/robusta-lab/283609047306dc1f05cf59806ade30b6/raw

Verify the pod is crashing:

$ kubectl get pods | grep -E 'NAME|crashpod'
NAME                            READY   STATUS             RESTARTS   AGE
crashpod-64d8fbfd-s2dvn         0/1     CrashLoopBackOff   1          7s

Once the pod restarts twice, you'll get notified in your configured sink.
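
If no notification arrives, it can help to confirm that the restart count has actually reached two. This is a minimal sketch: `has_restarted_twice` is a hypothetical helper, and it assumes the column order of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE) in the namespace where the crashing pod was deployed.

```shell
# has_restarted_twice is a hypothetical helper. It reads the output of
# `kubectl get pods --no-headers` on stdin and exits 0 if any pod whose
# name starts with "crashpod" has a RESTARTS value of 2 or more.
has_restarted_twice() {
    awk '$1 ~ /^crashpod/ && $4 + 0 >= 2 { found = 1 } END { exit !found }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods --no-headers | has_restarted_twice && echo "restarted twice"
```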

Example Slack Message

Now open the Robusta UI and look for the same message there.

Finally, clean up the crashing pod:

kubectl delete deployment crashpod

Next Steps

See how Robusta improves Prometheus.