17863: Merge branch 'main' into 17863-bundler-update
authorWard Vandewege <ward@curii.com>
Thu, 22 Jul 2021 14:25:59 +0000 (10:25 -0400)
committerWard Vandewege <ward@curii.com>
Thu, 22 Jul 2021 14:25:59 +0000 (10:25 -0400)
Arvados-DCO-1.1-Signed-off-by: Ward Vandewege <ward@curii.com>

doc/admin/restricting-upload-download.html.textile.liquid
go.mod
go.sum
lib/controller/localdb/login_oidc.go
lib/controller/localdb/login_oidc_test.go
sdk/go/arvadostest/oidc_provider.go
tools/arvbox/lib/arvbox/docker/Dockerfile.demo
tools/arvbox/lib/arvbox/docker/service/crunch-dispatch-local/run
tools/arvbox/lib/arvbox/docker/service/crunch-dispatch-local/run-service [new file with mode: 0755]

index ea10752342d69beb264f5d5a096adc5dc935aa9e..44a0467cf472f66afc4e0bf759be89894606bf94 100644 (file)
@@ -18,7 +18,7 @@ There are two services involved in accessing data from outside the cluster.
 
 h2. Keepproxy Permissions
 
-Permitting @keeproxy@ makes it possible to use @arv-put@ and @arv-get@, and upload from Workbench 1.  It works in terms of individual 64 MiB keep blocks.  It prints a log each time a user uploads or downloads an individual block.
+Permitting @keeproxy@ makes it possible to use @arv-put@ and @arv-get@, and upload from Workbench 1.  It works in terms of individual 64 MiB keep blocks.  It prints a log line each time a user uploads or downloads an individual block. Those logs are usually stored by @journald@ or @syslog@.
 
 The default policy allows anyone to upload or download.
 
@@ -33,8 +33,6 @@ The default policy allows anyone to upload or download.
           Upload: true
 </pre>
 
-If you create a sharing link as an admin user, and then give someone the token from the sharing link to download a file using @arv-get@, because the downloader is anonymous, the download permission will be restricted based on the "User" role and not the "Admin" role.
-
 h2. WebDAV and S3 API Permissions
 
 Permitting @WebDAV@ makes it possible to use WebDAV, S3 API, download from Workbench 1, and upload/download with Workbench 2.  It works in terms of individual files.  It prints a log each time a user uploads or downloads a file.  When @WebDAVLogEvents@ (default true) is enabled, it also adds an entry into the API server @logs@ table.
@@ -49,7 +47,7 @@ The default policy allows anyone to upload or download.
 
 <pre>
     Collections:
-      WebDAVPermisison:
+      WebDAVPermission:
         User:
           Download: true
           Upload: true
@@ -57,9 +55,11 @@ The default policy allows anyone to upload or download.
           Download: true
           Upload: true
       WebDAVLogEvents: true
-</pre>
+      </pre>
 
-If you create a sharing link as an admin user, and then give someone the token from the sharing link to download a file over HTTP (WebDAV or S3 API), because the downloader is anonymous, the download permission will be restricted based on the "User" role and not the "Admin" role.
+When a user or admin creates a sharing link, a custom scoped token is embedded in that link. This effectively allows anonymous user access to the associated data via that link. These custom scoped tokens are always treated as user tokens for the purposes of restricting download access, even when created by an admin user. In other words, these custom scoped tokens, when used in a sharing link, are always subject to the value of the @WebDAVPermission/User/Download@ configuration setting.
+
+If that custom scoped token is used with @arv-get@, its use will be subject to the value of the @KeepproxyPermission/User/Download@ configuration setting.
 
 h2. Shell node and container permissions
 
@@ -67,7 +67,7 @@ Be aware that even when upload and download from outside the network is not allo
 
 h2. Choosing a policy
 
-This distinction between WebDAV and Keepproxy is important for auditing.  WebDAV records 'upload' and 'download' events on the API server that are included in the "User Activity Report":user-activity.html ,  whereas @keepproxy@ only logs upload and download of individual blocks, which require a reverse lookup to determine the collection(s) and file(s) a block is associated with.
+This distinction between WebDAV and Keepproxy is important for auditing.  WebDAV records 'upload' and 'download' events on the API server that are included in the "User Activity Report":user-activity.html,  whereas @keepproxy@ only logs upload and download of individual blocks, which require a reverse lookup to determine the collection(s) and file(s) a block is associated with.
 
 You set separate permissions for @WebDAV@ and @Keepproxy@, with separate policies for regular users and admin users.
 
@@ -81,7 +81,7 @@ For ease of access auditing, this policy prevents downloads using @arv-get@.  Do
 
 <pre>
     Collections:
-      WebDAVPermisison:
+      WebDAVPermission:
         User:
           Download: true
           Upload: true
@@ -105,7 +105,7 @@ This policy prevents regular users (non-admin) from downloading data.  Uploading
 
 <pre>
     Collections:
-      WebDAVPermisison:
+      WebDAVPermission:
         User:
           Download: false
           Upload: true
@@ -129,7 +129,7 @@ This policy is suitable for an installation where data is being shared with a gr
 
 <pre>
     Collections:
-      WebDAVPermisison:
+      WebDAVPermission:
         User:
           Download: true
           Upload: false
@@ -146,3 +146,24 @@ This policy is suitable for an installation where data is being shared with a gr
           Upload: true
       WebDAVLogEvents: true
 </pre>
+
+
+h2. Accessing the audit log
+
+When @WebDAVLogEvents@ is enabled, uploads and downloads of files are logged in the Arvados audit log. These events are included in the "User Activity Report":user-activity.html. The audit log can also be accessed via the API, SDKs or command line. For example, to show the 100 most recent file downloads:
+
+<pre>
+arv log list --filters '[["event_type","=","file_download"]]' -o 'created_at desc' -l 100
+</pre>
+
+For uploads, use the @file_upload@ event type.
+
+Note that this only covers upload and download activity via WebDAV, S3, Workbench 1 (download only) and Workbench 2.
+
+File upload in Workbench 1 and the @arv-get@ and @arv-put@ tools use @Keepproxy@, which does not log activity to the audit log because it operates at the block level, not the file level. @Keepproxy@ records the uuid of the user that owns the token used in the request in its system logs. Those logs are usually stored by @journald@ or @syslog@. A typical log line for such a block download looks like this:
+
+<pre>
+Jul 20 15:03:38 workbench.xxxx1.arvadosapi.com keepproxy[63828]: {"level":"info","locator":"abcdefghijklmnopqrstuvwxyz012345+53251584","msg":"Block download","time":"2021-07-20T15:03:38.458792300Z","user_full_name":"Albert User","user_uuid":"ce8i5-tpzed-abcdefghijklmno"}
+</pre>
+
+It is possible to do a reverse lookup from the locator to find all matching collections: the @manifest_text@ field of a collection lists all the block locators that are part of the collection. The @manifest_text@ field also provides the relevant filename in the collection. Because this lookup is rather involved and there is no automated tool to do it, we recommend disabling @KeepproxyPermission/User/Download@ and @KeepproxyPermission/User/Upload@ for sites where the audit log is important and @arv-get@ and @arv-put@ are not essential.
diff --git a/go.mod b/go.mod
index b70f6f3b476789f69f1845bffb6bfd4fcac80885..adca449b7143ff3b01d2e3a9e037eac2114320a8 100644 (file)
--- a/go.mod
+++ b/go.mod
@@ -44,7 +44,7 @@ require (
        github.com/johannesboyne/gofakes3 v0.0.0-20200716060623-6b2b4cb092cc
        github.com/julienschmidt/httprouter v1.2.0
        github.com/kevinburke/ssh_config v0.0.0-20171013211458-802051befeb5 // indirect
-       github.com/lib/pq v1.3.0
+       github.com/lib/pq v1.10.2
        github.com/msteinert/pam v0.0.0-20190215180659-f29b9f28d6f9
        github.com/opencontainers/go-digest v1.0.0-rc1 // indirect
        github.com/opencontainers/image-spec v1.0.1-0.20171125024018-577479e4dc27 // indirect
@@ -62,8 +62,6 @@ require (
        golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4
        golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45
        golang.org/x/sys v0.0.0-20210603125802-9665404d3644
-       golang.org/x/tools v0.1.0 // indirect
-       golang.org/x/sys v0.0.0-20210510120138-977fb7262007
        golang.org/x/tools v0.1.2 // indirect
        google.golang.org/api v0.13.0
        gopkg.in/asn1-ber.v1 v1.0.0-20181015200546-f715ec2f112d // indirect
diff --git a/go.sum b/go.sum
index 1fd37ed11ce411e625518eb3189c225e80b74616..2f575eae919cf129009ad887c5e91ada2448edcb 100644 (file)
--- a/go.sum
+++ b/go.sum
@@ -179,6 +179,8 @@ github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFB
 github.com/lib/pq v1.0.0/go.mod h1:5WUZQaWbwv1U+lTReE5YruASi9Al49XbQIvNi/34Woo=
 github.com/lib/pq v1.3.0 h1:/qkRGz8zljWiDcFvgpwUpwIAPu3r07TDvs3Rws+o/pU=
 github.com/lib/pq v1.3.0/go.mod h1:5WUZQaWbwv1U+lTReE5YruASi9Al49XbQIvNi/34Woo=
+github.com/lib/pq v1.10.2 h1:AqzbZs4ZoCBp+GtejcpCpcxM3zlSMx29dXbUSeVtJb8=
+github.com/lib/pq v1.10.2/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o=
 github.com/mattn/go-sqlite3 v1.9.0 h1:pDRiWfl+++eC2FEFRy6jXmQlvp4Yh3z1MJKg4UeYM/4=
 github.com/mattn/go-sqlite3 v1.9.0/go.mod h1:FPy6KqzDD04eiIsT53CuJW3U88zkxoIYsOqkbpncsNc=
 github.com/matttproud/golang_protobuf_extensions v1.0.1 h1:4hp9jkHxhMHkqkrB3Ix0jegS5sx/RkqARlsWZ6pIwiU=
@@ -316,11 +318,11 @@ golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7w
 golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
 golang.org/x/sys v0.0.0-20210119212857-b64e53b001e4 h1:myAQVi0cGEoqQVR5POX+8RR2mrocKqNN1hmeMqhX27k=
 golang.org/x/sys v0.0.0-20210119212857-b64e53b001e4/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
-golang.org/x/sys v0.0.0-20210603125802-9665404d3644 h1:CA1DEQ4NdKphKeL70tvsWNdT5oFh1lOjihRcEDROi0I=
-golang.org/x/sys v0.0.0-20210603125802-9665404d3644/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.0.0-20210330210617-4fbd30eecc44/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
 golang.org/x/sys v0.0.0-20210510120138-977fb7262007 h1:gG67DSER+11cZvqIMb8S8bt0vZtiN6xWYARwirrOSfE=
 golang.org/x/sys v0.0.0-20210510120138-977fb7262007/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
+golang.org/x/sys v0.0.0-20210603125802-9665404d3644 h1:CA1DEQ4NdKphKeL70tvsWNdT5oFh1lOjihRcEDROi0I=
+golang.org/x/sys v0.0.0-20210603125802-9665404d3644/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1 h1:v+OssWQX+hTHEmOBgwxdZxK4zHq3yOs8F9J7mk0PY8E=
 golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
 golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
index 61dc5c816b35661f39c4a800ab17f1bf55325f06..6182469ac378d58b1e1f864bf4d98a6b48a022fb 100644 (file)
@@ -177,12 +177,19 @@ func (ctrl *oidcLoginController) getAuthInfo(ctx context.Context, token *oauth2.
        } else if verified, _ := claims[ctrl.EmailVerifiedClaim].(bool); verified || ctrl.EmailVerifiedClaim == "" {
                // Fall back to this info if the People API call
                // (below) doesn't return a primary && verified email.
-               name, _ := claims["name"].(string)
-               if names := strings.Fields(strings.TrimSpace(name)); len(names) > 1 {
-                       ret.FirstName = strings.Join(names[0:len(names)-1], " ")
-                       ret.LastName = names[len(names)-1]
-               } else if len(names) > 0 {
-                       ret.FirstName = names[0]
+               givenName, _ := claims["given_name"].(string)
+               familyName, _ := claims["family_name"].(string)
+               if givenName != "" && familyName != "" {
+                       ret.FirstName = givenName
+                       ret.LastName = familyName
+               } else {
+                       name, _ := claims["name"].(string)
+                       if names := strings.Fields(strings.TrimSpace(name)); len(names) > 1 {
+                               ret.FirstName = strings.Join(names[0:len(names)-1], " ")
+                               ret.LastName = names[len(names)-1]
+                       } else if len(names) > 0 {
+                               ret.FirstName = names[0]
+                       }
                }
                ret.Email, _ = claims[ctrl.EmailClaim].(string)
        }
index 4be7d58f699c455ee67e31a206bcebfc889ed0ad..4778e45f5fe48f3a8edeb7e9afa295524d7af5e4 100644 (file)
@@ -56,6 +56,8 @@ func (s *OIDCLoginSuite) SetUpTest(c *check.C) {
        s.fakeProvider.AuthEmail = "active-user@arvados.local"
        s.fakeProvider.AuthEmailVerified = true
        s.fakeProvider.AuthName = "Fake User Name"
+       s.fakeProvider.AuthGivenName = "Fake"
+       s.fakeProvider.AuthFamilyName = "User Name"
        s.fakeProvider.ValidCode = fmt.Sprintf("abcdefgh-%d", time.Now().Unix())
        s.fakeProvider.PeopleAPIResponse = map[string]interface{}{}
 
@@ -421,8 +423,8 @@ func (s *OIDCLoginSuite) TestGoogleLogin_Success(c *check.C) {
        c.Check(token, check.Matches, `v2/zzzzz-gj3su-.{15}/.{32,50}`)
 
        authinfo := getCallbackAuthInfo(c, s.railsSpy)
-       c.Check(authinfo.FirstName, check.Equals, "Fake User")
-       c.Check(authinfo.LastName, check.Equals, "Name")
+       c.Check(authinfo.FirstName, check.Equals, "Fake")
+       c.Check(authinfo.LastName, check.Equals, "User Name")
        c.Check(authinfo.Email, check.Equals, "active-user@arvados.local")
        c.Check(authinfo.AlternateEmails, check.HasLen, 0)
 
@@ -446,6 +448,7 @@ func (s *OIDCLoginSuite) TestGoogleLogin_Success(c *check.C) {
 
 func (s *OIDCLoginSuite) TestGoogleLogin_RealName(c *check.C) {
        s.fakeProvider.AuthEmail = "joe.smith@primary.example.com"
+       s.fakeProvider.AuthEmailVerified = true
        s.fakeProvider.PeopleAPIResponse = map[string]interface{}{
                "names": []map[string]interface{}{
                        {
@@ -471,8 +474,10 @@ func (s *OIDCLoginSuite) TestGoogleLogin_RealName(c *check.C) {
        c.Check(authinfo.LastName, check.Equals, "Psmith")
 }
 
-func (s *OIDCLoginSuite) TestGoogleLogin_OIDCRealName(c *check.C) {
+func (s *OIDCLoginSuite) TestGoogleLogin_OIDCNameWithoutGivenAndFamilyNames(c *check.C) {
        s.fakeProvider.AuthName = "Joe P. Smith"
+       s.fakeProvider.AuthGivenName = ""
+       s.fakeProvider.AuthFamilyName = ""
        s.fakeProvider.AuthEmail = "joe.smith@primary.example.com"
        state := s.startLogin(c)
        s.localdb.Login(context.Background(), arvados.LoginOptions{
index de21302e5a048dfbca340abf24cb6c5359de7305..fa5e55c42e10af410d86d0e16fc23a637dbaeff2 100644 (file)
@@ -29,6 +29,8 @@ type OIDCProvider struct {
        AuthEmail          string
        AuthEmailVerified  bool
        AuthName           string
+       AuthGivenName      string
+       AuthFamilyName     string
        AccessTokenPayload map[string]interface{}
 
        PeopleAPIResponse map[string]interface{}
@@ -96,6 +98,8 @@ func (p *OIDCProvider) serveOIDC(w http.ResponseWriter, req *http.Request) {
                        "email":          p.AuthEmail,
                        "email_verified": p.AuthEmailVerified,
                        "name":           p.AuthName,
+                       "given_name":     p.AuthGivenName,
+                       "family_name":    p.AuthFamilyName,
                        "alt_verified":   true,                    // for custom claim tests
                        "alt_email":      "alt_email@example.com", // for custom claim tests
                        "alt_username":   "desired-username",      // for custom claim tests
@@ -131,8 +135,8 @@ func (p *OIDCProvider) serveOIDC(w http.ResponseWriter, req *http.Request) {
                json.NewEncoder(w).Encode(map[string]interface{}{
                        "sub":            "fake-user-id",
                        "name":           p.AuthName,
-                       "given_name":     p.AuthName,
-                       "family_name":    "",
+                       "given_name":     p.AuthGivenName,
+                       "family_name":    p.AuthFamilyName,
                        "alt_username":   "desired-username",
                        "email":          p.AuthEmail,
                        "email_verified": p.AuthEmailVerified,
index 92099614a81e7cdd881bcb7f97897dc2a1a5d76d..cb0dc2d652d9a8c54fb7db3bd39cb9255015faff 100644 (file)
@@ -47,7 +47,7 @@ RUN sudo -u arvbox /var/lib/arvbox/service/doc/run-service --only-deps
 RUN sudo -u arvbox /var/lib/arvbox/service/vm/run-service --only-deps
 RUN sudo -u arvbox /var/lib/arvbox/service/keepproxy/run-service --only-deps
 RUN sudo -u arvbox /var/lib/arvbox/service/arv-git-httpd/run-service --only-deps
-RUN sudo -u arvbox /var/lib/arvbox/service/crunch-dispatch-local/run-service --only-deps
+RUN /var/lib/arvbox/service/crunch-dispatch-local/run --only-deps
 RUN sudo -u arvbox /var/lib/arvbox/service/websockets/run --only-deps
 RUN sudo -u arvbox /usr/local/lib/arvbox/keep-setup.sh --only-deps
 RUN sudo -u arvbox /var/lib/arvbox/service/sdk/run-service
index 821afdce50b1fac246071dabc7e5fd507b201c84..3ce2220d0e26d5dc70705e8c8cafb1a7303225ae 100755 (executable)
@@ -6,25 +6,11 @@
 exec 2>&1
 set -ex -o pipefail
 
-. /usr/local/lib/arvbox/common.sh
-. /usr/local/lib/arvbox/go-setup.sh
+# singularity can use suid
+chown root /var/lib/arvados/bin/singularity \
+      /var/lib/arvados/etc/singularity/singularity.conf \
+      /var/lib/arvados/etc/singularity/capability.json \
+      /var/lib/arvados/etc/singularity/ecl.toml
+chmod u+s /var/lib/arvados/bin/singularity
 
-flock /var/lib/gopath/gopath.lock go install "git.arvados.org/arvados.git/services/crunch-dispatch-local"
-install $GOPATH/bin/crunch-dispatch-local /usr/local/bin
-ln -sf arvados-server /usr/local/bin/crunch-run
-
-if test "$1" = "--only-deps" ; then
-    exit
-fi
-
-cat > /usr/local/bin/crunch-run.sh <<EOF
-#!/bin/sh
-exec /usr/local/bin/crunch-run -container-enable-networking=default -container-network-mode=host \$@
-EOF
-chmod +x /usr/local/bin/crunch-run.sh
-
-export ARVADOS_API_HOST=$localip:${services[controller-ssl]}
-export ARVADOS_API_HOST_INSECURE=1
-export ARVADOS_API_TOKEN=$(cat $ARVADOS_CONTAINER_PATH/superuser_token)
-
-exec /usr/local/bin/crunch-dispatch-local -crunch-run-command=/usr/local/bin/crunch-run.sh -poll-interval=1
+exec /usr/local/lib/arvbox/runsu.sh $0-service $1
diff --git a/tools/arvbox/lib/arvbox/docker/service/crunch-dispatch-local/run-service b/tools/arvbox/lib/arvbox/docker/service/crunch-dispatch-local/run-service
new file mode 100755 (executable)
index 0000000..821afdc
--- /dev/null
@@ -0,0 +1,30 @@
+#!/bin/bash
+# Copyright (C) The Arvados Authors. All rights reserved.
+#
+# SPDX-License-Identifier: AGPL-3.0
+
+exec 2>&1
+set -ex -o pipefail
+
+. /usr/local/lib/arvbox/common.sh
+. /usr/local/lib/arvbox/go-setup.sh
+
+flock /var/lib/gopath/gopath.lock go install "git.arvados.org/arvados.git/services/crunch-dispatch-local"
+install $GOPATH/bin/crunch-dispatch-local /usr/local/bin
+ln -sf arvados-server /usr/local/bin/crunch-run
+
+if test "$1" = "--only-deps" ; then
+    exit
+fi
+
+cat > /usr/local/bin/crunch-run.sh <<EOF
+#!/bin/sh
+exec /usr/local/bin/crunch-run -container-enable-networking=default -container-network-mode=host \$@
+EOF
+chmod +x /usr/local/bin/crunch-run.sh
+
+export ARVADOS_API_HOST=$localip:${services[controller-ssl]}
+export ARVADOS_API_HOST_INSECURE=1
+export ARVADOS_API_TOKEN=$(cat $ARVADOS_CONTAINER_PATH/superuser_token)
+
+exec /usr/local/bin/crunch-dispatch-local -crunch-run-command=/usr/local/bin/crunch-run.sh -poll-interval=1