---
layout: default
navsection: admin
title: Health checks
...

{% comment %}
Copyright (C) The Arvados Authors. All rights reserved.

SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}

Arvados services support endpoints for monitoring the status of a cluster.

Health check endpoints are found at @/_health/ping@ for many Arvados services. Each service must have @ManagementToken@ configured; this token authorizes access to the health check endpoint. If @ManagementToken@ is not configured, health checks return the error @404 disabled@. The requester must provide the HTTP header @Authorization: Bearer (ManagementToken)@.

The endpoint returns a JSON object with the field @health@, whose value is either @OK@ or @ERROR@. On error, the object may also include a field @error@ with additional information.

h2. How to enable health checks on each service

h3. API server

Set @ManagementToken@ in @application.yml@:

  # Token to be included in all healthcheck requests. Disabled by default.
  # Server expects request header of the format "Authorization: Bearer xxx"
  ManagementToken: ...
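
Once @ManagementToken@ is set, the @/_health/ping@ endpoint can be queried by any HTTP client that sends the bearer header. The following is a minimal Python sketch, not an official Arvados client: the function names are hypothetical, and it assumes only what is stated above (the @Authorization: Bearer@ header and a JSON body with @health@ and an optional @error@ field).

```python
import json
import urllib.request


def build_ping_request(base_url: str, management_token: str) -> urllib.request.Request:
    """Build a GET request for /_health/ping carrying the bearer token."""
    url = base_url.rstrip("/") + "/_health/ping"
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + management_token}
    )


def parse_health(body: str) -> tuple:
    """Return (healthy, detail) from the JSON body of a health check response."""
    doc = json.loads(body)
    healthy = doc.get("health") == "OK"
    # The "error" field is optional, so fall back to an empty string.
    return healthy, doc.get("error", "")


def ping(base_url: str, management_token: str, timeout: float = 5.0) -> tuple:
    """Query one service; network, HTTP, and JSON failures count as unhealthy."""
    req = build_ping_request(base_url, management_token)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # urllib's URLError/HTTPError derive from OSError;
        # json.JSONDecodeError derives from ValueError.
        return False, str(exc)
```

A call such as @ping("http://localhost:8001", token)@ would return a pair like @(True, "")@ for a healthy service; the address here is only an illustration and depends on where the service actually listens.
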

h3. Node Manager

Set @port@ (the listen port) and @ManagementToken@ in the @Manage@ section of @node-manager.ini@:

[Manage]
port=8888
ManagementToken=...
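
A monitoring script can read the same @[Manage]@ section to find out where to send its health check. The sketch below uses Python's standard @configparser@; the helper names are hypothetical, and the use of plain HTTP on @localhost@ is an assumption made for illustration.

```python
import configparser


def manage_settings(ini_text: str) -> tuple:
    """Return (port, token) from the [Manage] section of node-manager.ini text."""
    # Interpolation is disabled so a token containing '%' is read verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["Manage"]
    return section.getint("port"), section.get("ManagementToken")


def node_manager_ping_url(ini_text: str, host: str = "localhost") -> str:
    """Build the health check URL (scheme and host are assumptions)."""
    port, _token = manage_settings(ini_text)
    return "http://{}:{}/_health/ping".format(host, port)
```

Option names are matched case-insensitively by @configparser@, so @ManagementToken@ is found regardless of how it is capitalized in the file.
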

* keepstore
* keep-web
* keepproxy
* arv-git-httpd
* websockets

h2. Healthcheck aggregator

The service @arvados-health@ performs health checks on all configured services and returns a single value of @OK@ or @ERROR@ for the entire cluster. It exposes the endpoint @/_health/all@.

The healthcheck aggregator uses the "NodeProfile" section of the cluster-wide configuration file. Here is an example:

Cluster:
  # The cluster uuid prefix
  zzzzz:
    NodeProfile:
      # For each node, the profile name corresponds to a
      # locally-resolvable hostname, and describes which Arvados
      # services are available on that machine.
      api:
        arvados-controller:
          Listen: 8000
        arvados-api-server:
          Listen: 8001
      manage:
        arvados-node-manager:
          Listen: 8002
      workbench:
        arvados-workbench:
          Listen: 8003
        arvados-ws:
          Listen: 8004
      keep:
        keep-web:
          Listen: 8005
        keepproxy:
          Listen: 8006
      keep0:
        keepstore:
          Listen: 25701
      keep1:
        keepstore:
          Listen: 25701
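
To illustrate how a profile like this maps to individual health checks, the following Python sketch walks a hostname-to-service mapping and collapses the per-service results into one cluster-wide value. It is not the @arvados-health@ implementation: the dictionary is a hand-copied subset of the example above, the function names are hypothetical, and the use of plain HTTP with one @/_health/ping@ URL per listed service is an assumption.

```python
# Hand-copied subset of the NodeProfile example: hostname -> service -> listen port.
NODE_PROFILE = {
    "api": {"arvados-controller": 8000, "arvados-api-server": 8001},
    "keep0": {"keepstore": 25701},
}


def ping_urls(node_profile: dict) -> list:
    """List one /_health/ping URL per configured service (scheme is assumed)."""
    urls = []
    for host, services in node_profile.items():
        for _service, port in services.items():
            urls.append("http://{}:{}/_health/ping".format(host, port))
    return urls


def aggregate(results: dict) -> str:
    """Collapse per-URL health values into a single cluster-wide value.

    An empty result set is treated as ERROR, since nothing was verified.
    """
    if results and all(value == "OK" for value in results.values()):
        return "OK"
    return "ERROR"
```

A single failing service is enough to make @aggregate@ report @ERROR@, which matches the all-or-nothing value described for @/_health/all@.
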