|keep-balance |no |yes|no ^3^|InternalURLs only used to expose Prometheus metrics|
|keep-web |yes |yes|yes ^5^|InternalURLs used by reverse proxy and container log API|
|websocket |yes |yes|no ^2^|InternalURLs only used by reverse proxy (e.g. Nginx)|
-|workbench1 |yes |no|no ||
|workbench2 |yes |no|no ||
</div>
If a client connects to the @Keepproxy@ service, it will talk to Nginx which will reverse proxy the traffic to the @Keepproxy@ service.
-h3. Workbench
-
-Consider this section for the @Workbench@ service:
-
-{% codeblock as yaml %}
- Workbench1:
- ExternalURL: "https://workbench.ClusterID.example.com"
-{% endcodeblock %}
-
-The @ExternalURL@ advertised is @https://workbench.ClusterID.example.com@. There is no value for @InternalURLs@ because Workbench1 is a Rails application served by Passenger. The only client connecting to the Passenger process is the reverse proxy (e.g. Nginx), and the listening host/post is configured in its configuration:
-
-<notextile><pre><code>
-server {
- listen 443 ssl;
- server_name workbench.ClusterID.example.com;
-
- ssl_certificate /YOUR/PATH/TO/cert.pem;
- ssl_certificate_key /YOUR/PATH/TO/cert.key;
-
- root /var/www/arvados-workbench/current/public;
- index index.html;
-
- passenger_enabled on;
- # If you're using RVM, uncomment the line below.
- #passenger_ruby /usr/local/rvm/wrappers/default/ruby;
-
- # `client_max_body_size` should match the corresponding setting in
- # the API.MaxRequestSize and Controller's server's Nginx configuration.
- client_max_body_size 128m;
-}
-</code></pre></notextile>
-
h3. API server
Consider this section for the @RailsAPI@ service:
h2. Keepproxy Permissions
-Permitting @keeproxy@ makes it possible to use @arv-put@ and @arv-get@, and upload from Workbench 1. It works in terms of individual 64 MiB keep blocks. It prints a log line each time a user uploads or downloads an individual block. Those logs are usually stored by @journald@ or @syslog@.
+Permitting @keepproxy@ makes it possible to use @arv-put@ and @arv-get@. It works in terms of individual 64 MiB keep blocks. It prints a log line each time a user uploads or downloads an individual block. Those logs are usually stored by @journald@ or @syslog@.
The default policy allows anyone to upload or download.
h2. WebDAV and S3 API Permissions
-Permitting @WebDAV@ makes it possible to use WebDAV, S3 API, download from Workbench 1, and upload/download with Workbench 2. It works in terms of individual files. It prints a log each time a user uploads or downloads a file. When @WebDAVLogEvents@ (default true) is enabled, it also adds an entry into the API server @logs@ table.
+Permitting @WebDAV@ makes it possible to use WebDAV, S3 API, and upload/download with Workbench 2. It works in terms of individual files. It prints a log each time a user uploads or downloads a file. When @WebDAVLogEvents@ (default true) is enabled, it also adds an entry into the API server @logs@ table.
When a user attempts to upload or download from a service without permission, they will receive a @403 Forbidden@ response. This only applies to file content.
For uploads, use the @file_upload@ event type.
-Note that this only covers upload and download activity via WebDAV, S3, Workbench 1 (download only) and Workbench 2.
+Note that this only covers upload and download activity via WebDAV, S3, and Workbench 2.
-File upload in Workbench 1 and the @arv-get@ and @arv-put@ tools use @Keepproxy@, which does not log activity to the audit log because it operates at the block level, not the file level. @Keepproxy@ records the uuid of the user that owns the token used in the request in its system logs. Those logs are usually stored by @journald@ or @syslog@. A typical log line for such a block download looks like this:
+The @arv-get@ and @arv-put@ tools upload via @Keepproxy@, which does not log activity to the audit log because it operates at the block level, not the file level. @Keepproxy@ records the uuid of the user that owns the token used in the request in its system logs. Those logs are usually stored by @journald@ or @syslog@. A typical log line for such a block download looks like this:
<pre>
-Jul 20 15:03:38 workbench.xxxx1.arvadosapi.com keepproxy[63828]: {"level":"info","locator":"abcdefghijklmnopqrstuvwxyz012345+53251584","msg":"Block download","time":"2021-07-20T15:03:38.458792300Z","user_full_name":"Albert User","user_uuid":"ce8i5-tpzed-abcdefghijklmno"}
+Jul 20 15:03:38 keep.xxxx1.arvadosapi.com keepproxy[63828]: {"level":"info","locator":"abcdefghijklmnopqrstuvwxyz012345+53251584","msg":"Block download","time":"2021-07-20T15:03:38.458792300Z","user_full_name":"Albert User","user_uuid":"ce8i5-tpzed-abcdefghijklmno"}
</pre>
It is possible to do a reverse lookup from the locator to find all matching collections: the @manifest_text@ field of a collection lists all the block locators that are part of the collection. The @manifest_text@ field also provides the relevant filename in the collection. Because this lookup is rather involved and there is no automated tool to do it, we recommend disabling @KeepproxyPermission.User.Download@ and @KeepproxyPermission.User.Upload@ for sites where the audit log is important and @arv-get@ and @arv-put@ are not essential.
The Keepproxy server is a gateway into your Keep storage. Unlike the Keepstore servers, which are only accessible on the local LAN, Keepproxy is suitable for clients located elsewhere on the internet. Specifically, in contrast to Keepstore:
* A client writing through Keepproxy sends a single copy of a data block, and Keepproxy distributes copies to the appropriate Keepstore servers.
-* A client can write through Keepproxy without precomputing content hashes. Notably, the browser-based upload feature in Workbench requires Keepproxy.
+* A client can write through Keepproxy without precomputing content hashes.
* Keepproxy checks API token validity before processing requests. (Clients that can connect directly to Keepstore can use it as scratch space even without a valid API token.)
By convention, we use the following hostname for the Keepproxy server: