PrometheusΒΆ
By enabling this toolset, HolmesGPT will be able to generate graphs from prometheus metrics as well as help you write and validate prometheus queries. HolmesGPT can also detect memory leak patterns, CPU throttling, lagging queues, and high latency issues.
Prior to generating a PromQL query, HolmesQPT tends to list the available metrics. This is done to ensure the metrics used in PromQL are actually available.
ConfigurationΒΆ
holmes:
toolsets:
prometheus/metrics:
enabled: true
config:
prometheus_url: http://<prometheus host>:9090
headers:
Authorization: "Basic <base_64_encoded_string>"
Update your Helm values (generated_values.yaml) with the above configuration and run a Helm upgrade:
helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>
Add the following to ~/.holmes/config.yaml, creating the file if it doesn't exist:
toolsets:
prometheus/metrics:
enabled: true
config:
prometheus_url: http://<prometheus host>:9090
headers:
Authorization: "Basic <base_64_encoded_string>"
It is also possible to set the PROMETHEUS_URL
environment variable instead of the above prometheus_url
config key.
Advanced configuration
Below is the full list of options for this toolset:
prometheus/metrics:
enabled: true
config:
prometheus_url: http://<prometheus host>:9090
headers:
Authorization: "Basic <base_64_encoded_string>"
metrics_labels_time_window_hrs: 48 # default value
metrics_labels_cache_duration_hrs: 12 # default value
fetch_labels_with_labels_api: false # default value
fetch_metadata_with_series_api: false # default value
tool_calls_return_data: true # default value
prometheus_url A base URL for prometheus. This should include the protocol (e.g. https) and the port.
headers Extra headers to pass to all prometheus http requests. Use this to pass authentication. Prometheus supports basic authentication.
metrics_labels_time_window_hrs Represents the time window, in hours, over which labels are fetched. This avoids fetching obsolete labels. Set it to
null
to let HolmesGPT fetch labels regardless of when they were generated.metrics_labels_cache_duration_hrs How long are labels cached, in hours. Set it to
null
to disable caching.fetch_labels_with_labels_api Uses prometheus labels API to fetch labels instead of the series API. In some cases setting to True can improve the performance of the toolset, however there will be an increased number of HTTP calls to prometheus. You can experiment with both as they are functionally identical.
fetch_metadata_with_series_api Uses the series API instead of the metadata API. You should only set this value to true if the metadata API is disabled or not working. HolmesGPT's ability to select the right metric will be negatively impacted because the series API does not return key metadata like the metrics/series description or their type (gauge, histogram, etc.).
tool_calls_return_data Defaults to
true
. Iffalse
, no prometheus data will be returned to HolmesGPT. Set it tofalse
if you frequently reach the token limit when using this toolset. Setting this setting tofalse
will also disable HolmesGPT's ability to analyze prometheus data.
Finding the prometheus URL
The best way to find the prometheus URL is to use "ask holmes". This only works if your cluster is live and already connected to Robusta.
If not, follow these steps:
Run
kubectl get services -n <monitoring-namespace>
to list all services. Replace<monitoring-namespace>
with the namespace where Prometheus is deployed. This is oftenmonitoring
orprometheus
. You can also runkubectl get services -A
which will list all services in all namespaces.Identify which are the namespace and name of your Prometheus service. You can set up port forwarding to test if the service is correct and if Prometheus is reachable.
Run
kubectl describe service <service-name> -n <namespace>
to get details about the service, including the cluster IP and port.Set the DNS or the cluster IP as well as the port to the configuration field
prometheus_url
as mentioned above.
CapabilitiesΒΆ
The table below describes the specific capabilities provided by this toolset. HolmesGPT can decide to invoke any of these capabilities when answering questions or investigating issues.
Tool Name |
Description |
---|---|
list_available_metrics |
List all the available metrics to query from prometheus, including their types (counter, gauge, histogram, summary) and available labels. |
execute_prometheus_instant_query |
Execute an instant PromQL query |
execute_prometheus_range_query |
Execute a PromQL range query |