---
layout: default
navsection: admin
title: Metrics
...
{% comment %}
Copyright (C) The Arvados Authors. All rights reserved.

SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}

Some Arvados services publish Prometheus/OpenMetrics-compatible metrics at @/metrics@. Metrics can help you understand how components perform under load, find performance bottlenecks, and detect and diagnose problems.

To access metrics endpoints, services must be configured with a "management token":management-token.html. When accessing a metrics endpoint, prefix the management token with @"Bearer "@ and supply it in the @Authorization@ request header.

<pre>curl -sfH "Authorization: Bearer your_management_token_goes_here" "https://0.0.0.0:25107/metrics"
</pre>

The plain text export format includes "help" messages with a description of each reported metric.

When configuring Prometheus, use a @bearer_token@ or @bearer_token_file@ option to authenticate requests.

<pre>scrape_configs:
  - job_name: keepstore
    bearer_token: your_management_token_goes_here
    static_configs:
    - targets:
      - "keep0.ClusterID.example.com:25107"
</pre>

table(table table-bordered table-condensed table-hover).
|_. Component|_. Metrics endpoint|
|arvados-api-server||
|arvados-controller|✓|
|arvados-dispatch-cloud|✓|
|arvados-git-httpd||
|arvados-node-manager||
|arvados-ws|✓|
|composer||
|keepproxy||
|keepstore|✓|
|keep-balance|✓|
|keep-web|✓|
|workbench1||
|workbench2||

h2. Node manager

The node manager does not export Prometheus-style metrics, but its @/status.json@ endpoint provides a snapshot of its internal status as of the most recent wishlist update.

<pre>curl -sfH "Authorization: Bearer your_management_token_goes_here" "http://0.0.0.0:8989/status.json"
</pre>

table(table table-bordered table-condensed).
|_. Attribute|_. Type|_. Description|
|nodes_booting|int|Number of nodes in booting state|
|nodes_unpaired|int|Number of nodes in unpaired state|
|nodes_busy|int|Number of nodes in busy state|
|nodes_idle|int|Number of nodes in idle state|
|nodes_fail|int|Number of nodes in fail state|
|nodes_down|int|Number of nodes in down state|
|nodes_shutdown|int|Number of nodes in shutdown state|
|nodes_wish|int|Number of nodes in the current wishlist|
|node_quota|int|Current node count ceiling due to cloud quota limits|
|config_max_nodes|int|Configured max node count|

h3. Example

<pre>
{
  "actor_exceptions": 0,
  "idle_times": {
    "compute1": 0,
    "compute3": 0,
    "compute2": 0,
    "compute4": 0
  },
  "create_node_errors": 0,
  "destroy_node_errors": 0,
  "nodes_idle": 0,
  "config_max_nodes": 8,
  "list_nodes_errors": 0,
  "node_quota": 8,
  "Version": "1.1.4.20180719160944",
  "nodes_wish": 0,
  "nodes_unpaired": 0,
  "nodes_busy": 4,
  "boot_failures": 0
}
</pre>
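
A monitoring script could poll this endpoint and reduce the snapshot to a few numbers. The following is a minimal Python sketch, not part of Arvados. The function names (@build_status_request@, @fetch_status@, @summarize_status@) are hypothetical. It assumes that the node states listed in the table are mutually exclusive, so their counts can be summed, and that the effective ceiling is the lower of @node_quota@ and @config_max_nodes@. Attributes missing from a snapshot, as several are in the example above, are treated as zero.

```python
import json
import urllib.request

# Attribute names taken from the table above; a snapshot may omit some of them.
NODE_STATE_KEYS = [
    "nodes_booting", "nodes_unpaired", "nodes_busy", "nodes_idle",
    "nodes_fail", "nodes_down", "nodes_shutdown",
]


def build_status_request(base_url, management_token):
    """Return a Request for /status.json that carries the management token."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/status.json",
        headers={"Authorization": "Bearer " + management_token},
    )


def fetch_status(base_url, management_token, timeout=10):
    """Fetch and decode the status snapshot (performs a network call)."""
    request = build_status_request(base_url, management_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize_status(status):
    """Count nodes per state and compare the total against the ceiling."""
    by_state = {key: status.get(key, 0) for key in NODE_STATE_KEYS}
    # Assumption: states are mutually exclusive, so the sum is the node count.
    total = sum(by_state.values())
    # Assumption: the effective ceiling is the lower of the reported limits.
    limits = [
        status[key]
        for key in ("node_quota", "config_max_nodes")
        if status.get(key) is not None
    ]
    ceiling = min(limits) if limits else None
    return {
        "by_state": by_state,
        "total_nodes": total,
        "ceiling": ceiling,
        "at_ceiling": ceiling is not None and total >= ceiling,
    }
```

Calling @summarize_status(fetch_status("http://0.0.0.0:8989", token))@ on the example snapshot above would report 4 nodes in total against a ceiling of 8. Keeping the request construction separate from the network call makes the header handling easy to check without a running node manager.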