---
layout: default
navsection: userguide
title: Analyzing workflow cost (cloud only)
...
{% comment %}
Copyright (C) The Arvados Authors. All rights reserved.
SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}
{% include 'tutorial_expectations' %}
{% include 'notebox_begin' %}
Cost information is generally only available when Arvados runs in a cloud environment and @arvados-dispatch-cloud@ is used to dispatch containers. The per node-hour price for each defined InstanceType must be supplied in "config.yml":{{site.baseurl}}/admin/config.html.
{% include 'notebox_end' %}
The @arv-cluster-activity@ program can be used to analyze cluster usage and cost over a time period.
h2. Installation
The @arv-cluster-activity@ tool can be installed from a distribution package or PyPI.
h2. Option 1: Install from distribution packages
First, "add the appropriate package repository for your distribution":{{ site.baseurl }}/install/packages.html.
{% assign arvados_component = 'python3-arvados-cluster-activity' %}
{% include 'install_packages' %}
h2. Option 2: Install with pip
Run @pip install arvados-cluster-activity[prometheus]@ in an appropriate installation environment, such as a virtualenv.
Note:
Support for fetching Prometheus metrics depends on Pandas and NumPy. If these dependencies pose a problem, you can install the cluster activity tool without Prometheus support by omitting it from @pip install@.
The Cluster Activity report uses the Arvados Python SDK, which uses @pycurl@, which depends on the @libcurl@ C library. To build the module you may have to first install additional packages. On Debian-based distributions you can install them by running:
# apt install git build-essential python3-dev libcurl4-openssl-dev libssl-dev
~$ arv-cluster-activity --help
usage: arv-cluster-activity [-h] [--start START] [--end END] [--days DAYS] [--cost-report-file COST_REPORT_FILE] [--include-workflow-steps] [--columns COLUMNS] [--exclude EXCLUDE]
[--html-report-file HTML_REPORT_FILE] [--version] [--cluster CLUSTER] [--prometheus-auth PROMETHEUS_AUTH]
options:
-h, --help show this help message and exit
--start START Start date for the report in YYYY-MM-DD format (UTC) (or use --days)
--end END End date for the report in YYYY-MM-DD format (UTC), default "now"
--days DAYS Number of days before "end" to start the report (or use --start)
--cost-report-file COST_REPORT_FILE
Export cost report to specified CSV file
--include-workflow-steps
Include individual workflow steps (optional)
--columns COLUMNS Cost report columns (optional), must be comma separated with no spaces between column names. Available columns are:
Project, ProjectUUID, Workflow,
WorkflowUUID, Step, StepUUID, Sample, SampleUUID, User, UserUUID, Submitted, Started, Runtime, Cost
--exclude EXCLUDE Exclude workflows containing this substring (may be a regular expression)
--html-report-file HTML_REPORT_FILE
Export HTML report to specified file
--version Print version and exit.
--cluster CLUSTER Cluster to query for prometheus stats
--prometheus-auth PROMETHEUS_AUTH
Authorization file with prometheus info
PROMETHEUS_HOST=https://your.prometheus.server.example.com PROMETHEUS_USER=admin PROMETHEUS_PASSWORD=password@PROMETHEUS_USER@ and @PROMETHEUS_PASSWORD@ will be passed in an @Authorization@ header using HTTP Basic authentication. Alternately, instead of @PROMETHEUS_USER@ and @PROMETHEUS_PASSWORD@ you can provide @PROMETHEUS_APIKEY@. This will be passed in as a Bearer token (@Authorization: Bearer
~$ arv-cluster-activity \
--days 90
--include-workflow-steps \
--prometheus-auth prometheus.env \
--cost-report-file report.csv \
--html-report-file report.html
INFO:root:Exporting workflow runs 0 - 5
INFO:root:Getting workflow steps
INFO:root:Got workflow steps 0 - 2
INFO:root:Getting container hours time series
INFO:root:Getting data usage time series