Scans
Robusta includes built-in actions to scan and get insights on Kubernetes clusters.
These actions can be triggered:
- Automatically, on a schedule.
- On demand, via the Robusta UI.
- On demand, via a CLI command.
KRR - Prometheus-based Kubernetes Resource Recommendations
Robusta's KRR is a CLI tool for optimizing resource allocation in Kubernetes clusters. It gathers pod usage data from Prometheus and recommends requests and limits for CPU and memory, which reduces costs and improves performance. By default, every instance of Robusta that's connected to the UI runs a KRR scan on startup. Further KRR scans can be triggered in the UI, and all scans can be viewed there.
With or without the UI, you can configure additional scans on a schedule. The results can be sent as a PDF to Slack or to the Robusta UI.
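A scheduled KRR scan could be configured along the lines of the sketch below. It uses the on_schedule trigger with fixed_delay_repeat and sends results to a sink named robusta_ui_sink. The one-week interval and the sink name are illustrative assumptions; substitute a sink that is defined in your own configuration.

```yaml
customPlaybooks:
- triggers:
  - on_schedule:
      fixed_delay_repeat:
        repeat: -1             # -1 runs forever; a positive number limits the runs
        seconds_delay: 604800  # 7 days * 24 hours * 3600 seconds
  actions:
  - krr_scan: {}               # default parameters
  sinks:
  - "robusta_ui_sink"          # assumed sink name, replace with one of your sinks
```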
KRR scan
Playbook Action: krr_scan
Displays a KRR scan report.
You can trigger a KRR scan at any time, by running the following command:
robusta playbooks trigger krr_scan
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - krr_scan:
      prometheus_additional_labels:
        cluster: cluster-2-test
        env: prod
      prometheus_auth: Basic YWRtaW46cGFzc3dvcmQ=
      prometheus_url: http://prometheus-k8s.monitoring.svc.cluster.local:9090
      prometheus_url_query_string: demo-query=example-data
  triggers:
  - on_schedule: {}
The above is an example. Try customizing the trigger and parameters.
- custom_annotations (str dict)
  Custom annotations to be used for the running pod/job.
- prometheus_url (str)
  Prometheus URL. If omitted, Robusta tries to find a Prometheus instance in the same cluster.
- prometheus_auth (str)
  Prometheus auth header to be used in the Authorization header. If omitted, no auth header is added.
- prometheus_url_query_string (str)
  Additional query string parameters to be appended to the Prometheus connection URL.
- prometheus_additional_labels (str dict)
  A dictionary of additional labels needed for multi-cluster Prometheus.
- add_additional_labels (bool) = True
  Adds the additional labels (if defined) to the query.
- prometheus_graphs_overrides (complex list)
  Each entry contains:
  required:
  - resource_type (str)
  - item_type (str)
  - query (str)
  - values_format (str)
- serviceAccountName (str) = robusta-runner-service-account
  The account name to use for the KRR scan job.
- strategy (str) = simple
- args (str)
  Deprecated - KRR CLI arguments.
- krr_args (str)
  KRR CLI arguments.
- timeout (int) = 3600
  Time span for yielding the scan.
- max_workers (int) = 3
  Number of concurrent workers used in KRR.
- krr_job_spec (dict)
  A dictionary for passing spec params such as tolerations and nodeSelector.
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger krr_scan
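Several of the parameters listed above can be combined directly on the action. The following is a minimal sketch, not a recommended configuration: the daily interval and all values are illustrative, and it assumes that timeout is expressed in seconds.

```yaml
customPlaybooks:
- triggers:
  - on_schedule:
      fixed_delay_repeat:
        repeat: -1            # run forever
        seconds_delay: 86400  # 24 hours * 3600 seconds
  actions:
  - krr_scan:
      prometheus_url: http://prometheus-k8s.monitoring.svc.cluster.local:9090
      krr_args: "--cpu-min 15 --mem-min 200"  # passed through to the KRR CLI
      max_workers: 2                          # below the default of 3
      timeout: 1800                           # assumed to be seconds
```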
Taints, Tolerations and NodeSelectors
To set custom tolerations or a nodeSelector, update your generated_values.yaml file as follows:
globalConfig:
  krr_job_spec:
    tolerations:
    - key: "key1"
      operator: "Exists"
      effect: "NoSchedule"
    nodeSelector:
      nodeName: "your-selector"
Customizing Efficiency Recommendations in the Robusta UI
You can tweak KRR's recommendation algorithm to suit your environment using the krr_args setting in Robusta's Helm chart.
Add the following config to the top of your generated_values.yaml with your custom values. KRR will use these values every time it sends data to the Robusta UI or other destinations.
If you are having performance issues, specifically with Prometheus using a lot of memory, lower max_workers to reduce memory usage. KRR uses 3 workers by default.
globalConfig:
  krr_args: "--cpu-min 15 --mem-min 200 --cpu_percentile 90 --memory_buffer_percentage 25"
  max_workers: 2
Common KRR Settings
| Argument | Type | Used for | Default value |
|---|---|---|---|
| --cpu-min | INTEGER | Sets the minimum recommended CPU value in millicores. | 10 |
| --mem-min | INTEGER | Sets the minimum recommended memory value in MB. | 100 |
| --history_duration | TEXT | The duration of the history data to use (in hours). | 336 |
| --timeframe_duration | TEXT | The step for the history data (in minutes). | 1.25 |
| --cpu_percentile | TEXT | The percentile to use for the CPU recommendation. | 99 |
| --memory_buffer_percentage | TEXT | The percentage of added buffer to the peak memory usage for memory recommendation. | 15 |
| --points_required | TEXT | The number of data points required to make a recommendation for a resource. | 100 |
Popeye - A Kubernetes Cluster Sanitizer
Popeye is a utility that scans live Kubernetes clusters and reports potential issues with resources and configurations. By default, every instance of Robusta that's connected to the UI will run a Popeye scan on startup. Further Popeye scans can be triggered in the UI, and all scans can be viewed there.
With or without the UI, you can configure additional scans on a schedule as shown below.
customPlaybooks:
- triggers:
  - on_schedule:
      fixed_delay_repeat:
        repeat: 1 # number of times to run or -1 to run forever
        seconds_delay: 604800 # 1 week
  actions:
  - popeye_scan:
      spinach: |
        popeye:
          excludes:
            v1/pods:
            - name: rx:kube-system
  sinks:
  - "robusta_ui_sink"
The results can be sent as a PDF to Slack or to the Robusta UI.
Note
Other sinks like MSTeams are not supported yet.
Popeye scan
Playbook Action: popeye_scan
Displays a Popeye scan report.
You can trigger a Popeye scan at any time, by running the following command:
robusta playbooks trigger popeye_scan
Add this to your Robusta configuration (Helm values.yaml):
customPlaybooks:
- actions:
  - popeye_scan: {}
  triggers:
  - on_schedule: {}
The above is an example. Try customizing the trigger and parameters.
- custom_annotations (str dict)
  Custom annotations to be used for the running pod/job.
- service_account_name (str) = robusta-runner-service-account
  The account name to use for the Popeye scan job.
- args (str)
  Deprecated - Popeye CLI arguments.
- popeye_args (str) = -s no,ns,po,svc,sa,cm,dp,sts,ds,pv,pvc,hpa,pdb,cr,crb,ro,rb,ing,np,psp
  Popeye CLI arguments.
- spinach (str) =
    popeye:
      excludes:
        apps/v1/daemonsets:
        - name: rx:kube-system
        apps/v1/deployments:
        - name: rx:kube-system
        v1/configmaps:
        - name: rx:kube-system
        v1/pods:
        - name: rx:.*
          codes:
          - 106
          - 107
        - name: rx:kube-system
        v1/services:
        - name: rx:kube-system
        v1/namespaces:
        - name: kube-system
  Spinach.yaml config file to supply to the scan.
- timeout (int) = 300
  Time span for yielding the scan.
- popeye_job_spec (dict)
  A dictionary for passing spec params such as tolerations and nodeSelector.
This action can be manually triggered using the Robusta CLI:
robusta playbooks trigger popeye_scan
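As an illustration of the parameters above, the sketch below narrows the scan to a subset of the default -s section list and raises the timeout. The chosen sections, the daily interval, and the timeout value are illustrative, and the timeout is assumed to be in seconds.

```yaml
customPlaybooks:
- triggers:
  - on_schedule:
      fixed_delay_repeat:
        repeat: -1            # run forever
        seconds_delay: 86400  # 24 hours * 3600 seconds
  actions:
  - popeye_scan:
      popeye_args: "-s po,dp,sts,ds"  # subset of the default section list
      timeout: 600                    # assumed to be seconds; default is 300
```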
Taints, Tolerations and NodeSelectors
To set custom tolerations or a nodeSelector, update your generated_values.yaml file as follows:
globalConfig:
  popeye_job_spec:
    tolerations:
    - key: "key1"
      operator: "Exists"
      effect: "NoSchedule"
    nodeSelector:
      kubernetes.io/arch: "amd64"
      nodeName: "your-selector"
Note
Popeye does not support Arm nodes yet. If your cluster has both Arm and x64 nodes, add kubernetes.io/arch: "amd64" as a node selector to schedule Popeye jobs on the x64 nodes.
Troubleshooting Popeye
Popeye scans run as Jobs in your cluster. If there are issues with a scan, troubleshoot as follows:
Events
To find errors with the Popeye job, run:
kubectl get events --all-namespaces --field-selector=type!=Normal | grep popeye-job
Logs
Additional errors can sometimes be found in the Robusta runner logs:
robusta logs
Known issues
couldn't get resource list for external.metrics.k8s.io/v1beta1
This is a known issue. A workaround exists that involves deploying a dummy workload. Read more about it here.
exec /bin/sh: exec format error
At the moment, Popeye Docker images are only compiled for the linux/amd64 os/arch. This error suggests you are running the Popeye image on a node with a different os/arch.