Getting Support
Ask for help, or just say hi!
Commercial Support
Contact support@robusta.dev for details.
Common Errors
This list contains some common errors we have encountered over time.
Robusta CLI tool
Errors installing the Robusta CLI config-generation tool. Not relevant when using the Web Installation method.
command not found: robusta (CLI not in path)
Determine where the Robusta CLI binary is located:
find / -regex '.*/bin/robusta' 2>/dev/null
Add the path you found (e.g. /opt/homebrew/bin/) to your PATH. To do so, find your shell config file (~/.profile, ~/.bash_profile, ~/.zshrc, etc.) and append the following:
export PATH="$PATH:<new-path>"
Reopen the terminal or run:
source <your-shell-config-file>
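For example, assuming a zsh shell and a Homebrew-installed CLI at /opt/homebrew/bin/ (substitute the path that find returned and your own shell config file):
# Append the CLI's directory to PATH and reload the shell config
echo 'export PATH="$PATH:/opt/homebrew/bin/"' >> ~/.zshrc
source ~/.zshrc
robusta --help   # the CLI should now be found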
Alternative Solution
Instead of modifying PATH, run Robusta commands via the python3 binary: python3 -m robusta.cli.main gen-config
SSL certificate errors on macOS
This means Python's SSL certificates are missing on your system.
To fix it, run /Applications/Python 3.9/Install Certificates.command
For more info see: https://stackoverflow.com/questions/52805115/certificate-verify-failed-unable-to-get-local-issuer-certificate
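If that script is not present (its path varies with your Python version), upgrading the certifi package is a commonly suggested alternative in the thread above (assumes pip3 points at the Python you use to run Robusta):
# Alternative fix: refresh the CA bundle Python uses for SSL verification
pip3 install --upgrade certifi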
Helm installation fails
Problems when running the helm install command or installing via GitOps.
unknown field in com.coreos.monitoring.v1.Prometheus.spec, ValidationError(Prometheus.spec)
This indicates a mismatch between the version of Prometheus you are trying to install and the version of the Prometheus CRDs already present in your cluster.
Follow this guide for upgrading CRDs from an older version.
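As a rough sketch of that upgrade (the CRD manifests live in the prometheus-operator repository; the exact version and the full list of CRDs to re-apply depend on your kube-prometheus-stack release, so check its upgrade notes):
# Re-apply the Prometheus CRD matching your chart's operator version
# (<OPERATOR_VERSION> is a placeholder, e.g. a v0.x.y tag)
kubectl apply --server-side --force-conflicts -f \
  https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/<OPERATOR_VERSION>/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml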
at least one sink must be defined
Verify sinksConfig is defined in your Robusta values file, with at least one sink like Slack, Teams or Robusta UI ("robusta_sink"). If it's your first time installing, the fastest solution is to start config creation from scratch; a minimal example is shown after the error message below.
Error: UPGRADE FAILED: execution error at (robusta/templates/playbooks-config.yaml:9:7): At least one sink must be defined!
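For illustration (the sink name, channel and API key are placeholders, and your generated_values.yaml may define different sinks), a minimal sinksConfig with a single Slack sink looks roughly like this:
sinksConfig:
# at least one sink must be present
- slack_sink:
    name: main_slack_sink
    slack_channel: my-alerts-channel
    api_key: <YOUR_SLACK_API_KEY>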
Robusta runner, Prometheus or Holmes failures
robusta-runner pod is in Pending state due to memory issues
If your cluster has 20 nodes or fewer, set robusta-runner's memory request to 512Mi in Robusta's Helm values:
runner:
  resources:
    requests:
      memory: 512Mi
    limits:
      memory: 512Mi
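After updating the values file, apply the change with a Helm upgrade (release name and values file follow the defaults used elsewhere in this guide):
helm upgrade robusta robusta/robusta -f ./generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>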
Prometheus' pods are in Pending state due to memory issues
If your cluster has 20 nodes or fewer, set Prometheus' memory request to 1Gi in Robusta's Helm values:
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      resources:
        requests:
          memory: 1Gi
        limits:
          memory: 1Gi
If using a test cluster like Kind/Colima, re-install Robusta with the isSmallCluster=true property:
helm install robusta robusta/robusta -f ./generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME> --set isSmallCluster=true
robusta-runner isn't working or has exceptions
Start by checking the logs for errors:
kubectl get pods -A | grep robusta-runner # get the name and the namespace of the robusta pod
kubectl logs -n <NAMESPACE> <ROBUSTA-RUNNER-POD-NAME> # get the logs
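If the pod is crash-looping or the logs are empty, the previous container's logs and the pod's events are also worth checking:
kubectl logs -n <NAMESPACE> <ROBUSTA-RUNNER-POD-NAME> --previous   # logs from the last crashed container, if any
kubectl describe pod -n <NAMESPACE> <ROBUSTA-RUNNER-POD-NAME>      # events, restart reasons, OOMKilled, etc.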
Discovery Error
2023-04-17 23:37:43.019 ERROR Discovery process internal error
2023-04-17 23:37:43.022 INFO Initialized new discovery pool
2023-04-17 23:37:43.022 ERROR Failed to run publish discovery for robusta_ui_sink
Traceback (most recent call last):
File "/app/src/robusta/core/sinks/robusta/robusta_sink.py", line 175, in __discover_resources
results: DiscoveryResults = Discovery.discover_resources()
File "/app/src/robusta/core/discovery/discovery.py", line 288, in discover_resources
raise e
File "/app/src/robusta/core/discovery/discovery.py", line 280, in discover_resources
return future.result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 446, in result
return self.__get_result()
File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
This error might be due to memory issues. Increase the memory request in Robusta's Helm values:
runner:
  resources:
    requests:
      memory: 2048Mi
    limits:
      memory: 2048Mi
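To confirm memory is the bottleneck, check the runner's live usage before and after the change (requires the metrics-server):
kubectl top pod -n <NAMESPACE> | grep robusta-runner   # current CPU and memory usage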
Blocked by firewall / HTTP proxy
If your Kubernetes cluster is behind an HTTP proxy or firewall, follow the instructions in Deploying Behind Proxies to ensure Robusta has the necessary access.
Error in Holmes: binascii.a2b_base64(s, strict_mode=validate)
If the Holmes pod fails to start with this exception:
2024-09-20 15:37:57.961 INFO loading config /etc/robusta/config/active_playbooks.yaml
Traceback (most recent call last):
File "/app/server.py", line 65, in <module>
dal = SupabaseDal()
^^^^^^^^^^^^^
File "/app/holmes/core/supabase_dal.py", line 38, in __init__
self.enabled = self.__init_config()
^^^^^^^^^^^^^^^^^^^^
File "/app/holmes/core/supabase_dal.py", line 68, in __init_config
robusta_token = self.__load_robusta_config()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/holmes/core/supabase_dal.py", line 61, in __load_robusta_config
return RobustaToken(**json.loads(base64.b64decode(token)))
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/base64.py", line 88, in b64decode
return binascii.a2b_base64(s, strict_mode=validate)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
binascii.Error: Invalid base64-encoded string: number of data characters (21) cannot be 1 more than a multiple of 4
It's often because the Robusta UI Token is pulled from a secret, and Holmes cannot read it.
See Reading the Robusta UI Token from a secret in HolmesGPT to configure Holmes to read the token.
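As a quick sanity check (secret name, key, and namespace below are hypothetical; use whatever your values file references), verify that the stored value is the actual Robusta UI token rather than an unresolved placeholder:
# Kubernetes stores secret data base64-encoded; decoding once should print the UI token
# (itself a long base64 string), not a short placeholder string
kubectl get secret <SECRET_NAME> -n <NAMESPACE> -o jsonpath='{.data.<KEY>}' | base64 -d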
AlertManager is not working
Not getting AlertManager alerts
Receiver URL has namespace TBD
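For reference, a webhook receiver that forwards alerts to Robusta typically looks like the sketch below; the namespace segment of the URL must match the namespace Robusta is installed in (names here are illustrative):
receivers:
- name: 'robusta'
  webhook_configs:
  - url: 'http://robusta-runner.<NAMESPACE>.svc.cluster.local/api/alerts'
    send_resolved: true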
Tip
If you're using the Robusta UI, you can test alert routing by Simulating an alert.
AlertManager Silences are Disappearing
This happens when AlertManager does not have persistent storage enabled.
When using Robusta's embedded Prometheus Stack, persistent storage is enabled by default.
For other Prometheus distributions, set the following Helm value (or its equivalent):
# this is the setting in kube-prometheus-stack
# the exact setting will differ for other Prometheus distributions
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
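To verify persistence is actually enabled, check that a PersistentVolumeClaim exists for AlertManager (namespace and claim name depend on your installation):
kubectl get pvc -n <NAMESPACE> | grep alertmanager   # a Bound PVC means silences survive restarts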