net-ssh-gateway (2.0.0)
net-ssh (>= 4.0.0)
nio4r (2.5.8)
- nokogiri (1.13.7)
+ nokogiri (1.13.9)
mini_portile2 (~> 2.8.0)
racc (~> 1.4)
npm-rails (0.2.1)
}
install_services/login-sync() {
+ install_gem arvados sdk/ruby
install_gem arvados-login-sync services/login-sync
}
- sdk/cli/index.html.textile.liquid
- sdk/cli/reference.html.textile.liquid
- sdk/cli/subcommands.html.textile.liquid
+ - sdk/cli/project-management.html.textile.liquid
- Go:
- sdk/go/index.html.textile.liquid
- sdk/go/example.html.textile.liquid
- Users and Groups:
- admin/user-management.html.textile.liquid
- admin/user-management-cli.html.textile.liquid
+ - admin/group-management.html.textile.liquid
- admin/reassign-ownership.html.textile.liquid
- admin/link-accounts.html.textile.liquid
- - admin/group-management.html.textile.liquid
- admin/federation.html.textile.liquid
- admin/merge-remote-account.html.textile.liquid
- admin/migrating-providers.html.textile.liquid
- user/topics/arvados-sync-external-sources.html.textile.liquid
- admin/scoped-tokens.html.textile.liquid
- admin/token-expiration-policy.html.textile.liquid
- - admin/user-activity.html.textile.liquid
- Monitoring:
- admin/logging.html.textile.liquid
- admin/metrics.html.textile.liquid
- admin/health-checks.html.textile.liquid
- admin/management-token.html.textile.liquid
+ - admin/user-activity.html.textile.liquid
- Data Management:
- admin/collection-versioning.html.textile.liquid
- admin/collection-managed-properties.html.textile.liquid
---
layout: default
navsection: admin
-title: Group management
+title: Role group management at the CLI
...
{% comment %}
This page describes how to manage groups at the command line. You should be familiar with the "permission system":{{site.baseurl}}/api/permission-model.html .
-h2. Create a group
+h2. Create a role group
User groups are entries in the "groups" table with @"group_class": "role"@.
arv group create --group '{"name": "My new group", "group_class": "role"}'
</pre>
-h2(#add). Add a user to a group
+h2(#add). Add a user to a role group
There are two separate permissions associated with group membership. The first link grants the user @can_manage@ permission on the group, which lets the user manage anything the group can manage. The second link grants other users of the group permission to see that this user is a member of the group.
A user can also be given read-only access to a group. In that case, the first link should be created with @can_read@ instead of @can_manage@.
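Spelled out as request bodies (a sketch only; the uuids are placeholders, and the link shapes should be confirmed against the permission model documentation), the two links look like this:

```python
# Placeholder uuids for illustration.
user_uuid = "zzzzz-tpzed-xxxxxxxxxxxxxxx"
group_uuid = "zzzzz-j7d0g-yyyyyyyyyyyyyyy"

# Link 1: grants the user power to act as the group.
# Use "can_read" instead of "can_manage" for read-only membership.
membership = {
    "link_class": "permission",
    "name": "can_manage",
    "tail_uuid": user_uuid,
    "head_uuid": group_uuid,
}

# Link 2: lets other users of the group see that this user is a member.
visibility = {
    "link_class": "permission",
    "name": "can_read",
    "tail_uuid": group_uuid,
    "head_uuid": user_uuid,
}

# Each body would be passed to "arv link create --link '...'".
for body in (membership, visibility):
    print(body["name"], body["tail_uuid"], "->", body["head_uuid"])
```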
-h2. List groups
+h2. List role groups
<pre>
arv group list --filters '[["group_class", "=", "role"]]'
</pre>
-h2. List members of a group
+h2. List members of a role group
Use the command "jq":https://stedolan.github.io/jq/ to extract the @tail_uuid@ (the member's user uuid) from each permission link.
["head_uuid", "=", "the_group_uuid"]]' | jq .items[].tail_uuid
</pre>
-h2. Share a project with a group
+h2(#share-project). Share a project with a role group
-This will give all members of the group @can_manage@ access.
+Members of the role group will have access to the project based on their level of access to the role group.
<pre>
arv link create --link '{
"head_uuid": "the_project_uuid"}'
</pre>
-A project can also be shared read-only. In that case, the first link should be created with @can_read@ instead of @can_manage@.
+A project can also be shared read-only. In that case, the link @name@ should be @can_read@ instead of @can_manage@.
h2. List things shared with the group
["tail_uuid", "=", "the_group_uuid"]]' | jq .items[].head_uuid
</pre>
-h2. Stop sharing a project with a group
+h2(#stop-sharing-project). Stop sharing a project with a group
This will remove access for members of the group.
arv link delete --uuid each_link_uuid
</pre>
-h2. Remove user from a group
+h2. Remove user from a role group
The first step is to find the permission link objects. The second step is to delete them.
</notextile>
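The two links to find are the ones created when the user was added: the membership link (user to group) and the visibility link (group to user). A sketch of the filters involved (uuids are placeholders):

```python
user_uuid = "zzzzz-tpzed-xxxxxxxxxxxxxxx"   # placeholder
group_uuid = "zzzzz-j7d0g-yyyyyyyyyyyyyyy"  # placeholder

# Step 1: filters that locate the two permission links, one in each
# direction (membership link user->group, visibility link group->user).
link_filters = [
    [["link_class", "=", "permission"],
     ["tail_uuid", "=", tail],
     ["head_uuid", "=", head]]
    for tail, head in ((user_uuid, group_uuid), (group_uuid, user_uuid))
]

for f in link_filters:
    # Step 2: feed each filter to "arv link list --filters ..." and
    # delete each matching link with "arv link delete --uuid each_link_uuid".
    print(f)
```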
-h2(#main). development main (as of 2022-09-21)
+h2(#main). development main (as of 2022-10-14)
"previous: Upgrading to 2.4.3":#v2_4_3
+h3. Old container logs are automatically deleted from PostgreSQL
+
+Cached copies of log entries from containers that finished more than 1 month ago are now deleted automatically (this only affects the "live" logs saved in the PostgreSQL database, not log collections saved in Keep). If you have an existing cron job that runs @rake db:delete_old_container_logs@, you can remove it. See configuration options @Containers.Logging.MaxAge@ and @Containers.Logging.SweepInterval@.
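In @config.yml@ these options live under @Containers.Logging@; a minimal sketch using the default values shown in the sample configuration:

```yaml
Clusters:
  zzzzz:
    Containers:
      Logging:
        # Delete cached container logs from PostgreSQL this long after
        # the container finishes (log collections in Keep are unaffected).
        MaxAge: 720h
        # How often to scan for deletable log entries.
        SweepInterval: 12h
```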
+
+h3. Fixed salt installer template file to support container shell access
+
+If you manage your cluster using the salt installer, you may want to update it to the latest version, use the appropriate @config_examples@ subdirectory, and re-deploy with your custom @local.params@ file so that @arvados-controller@'s @nginx@ configuration file gets fixed.
+
+h3. Login-sync script requires configuration update on LoginCluster federations
+
+If you have @arvados-login-sync@ running on a satellite cluster, please update the environment variable settings by removing the @LOGINCLUSTER_ARVADOS_API_*@ variables and setting @ARVADOS_API_TOKEN@ to a LoginCluster's admin token, as described on the "updated install page":{{site.baseurl}}/install/install-shell-server.html#arvados-login-sync.
+
h3. Renamed keep-web metrics and WebDAV configs
Metrics previously reported by keep-web (@arvados_keepweb_collectioncache_requests@, @..._hits@, @..._pdh_hits@, @..._api_calls@, @..._cached_manifests@, and @arvados_keepweb_sessions_cached_collection_bytes@) have been replaced with @arvados_keepweb_cached_session_bytes@.
</notextile>
-h3(#SbatchArguments). Containers.LSF.BsubArgumentsList
+h3(#BsubArgumentsList). Containers.LSF.BsubArgumentsList
When arvados-dispatch-lsf invokes @bsub@, you can add arguments to the command by specifying @BsubArgumentsList@. You can use this to send the jobs to specific cluster partitions or add resource requests. Set @BsubArgumentsList@ to an array of strings.
Note that the default value for @BsubArgumentsList@ uses the @-o@ and @-e@ arguments to write stdout/stderr data to files in @/tmp@ on the compute nodes, which is helpful for troubleshooting installation/configuration problems. Ensure you have something in place to delete old files from @/tmp@, or adjust these arguments accordingly.
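A sketch of this setting (the queue name and log file paths below are illustrative, not the shipped defaults; substitute your site's values):

```yaml
Clusters:
  zzzzz:
    Containers:
      LSF:
        # Illustrative only: send jobs to a specific queue and keep
        # stdout/stderr on the compute node for troubleshooting.
        BsubArgumentsList:
          - "-q"
          - "myqueue"
          - "-o"
          - "/tmp/crunch-run.%J.out"
          - "-e"
          - "/tmp/crunch-run.%J.err"
```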
-h3(#SbatchArguments). Containers.LSF.BsubCUDAArguments
+h3(#BsubCUDAArguments). Containers.LSF.BsubCUDAArguments
If the container requests access to GPUs (@runtime_constraints.cuda.device_count@ of the container request is greater than zero), the command line arguments in @BsubCUDAArguments@ will be added to the command line _after_ @BsubArgumentsList@. This should consist of the additional @bsub@ flags your site requires to schedule the job on a node with GPU support. Set @BsubCUDAArguments@ to an array of strings. For example:
</pre>
</notextile>
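A sketch of the kind of value this takes (the @-gpu@ flag and @num=@ syntax are assumptions to verify against your LSF version's @bsub@ documentation):

```yaml
Clusters:
  zzzzz:
    Containers:
      LSF:
        # Illustrative: request GPU resources from LSF for
        # CUDA-enabled containers.
        BsubCUDAArguments:
          - "-gpu"
          - "num=1"
```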
-h3(#PollPeriod). Containers.PollInterval
+h3(#PollInterval). Containers.PollInterval
arvados-dispatch-lsf polls the API server periodically for new containers to run. The @PollInterval@ option controls how often this poll happens. Set this to a string of numbers suffixed with one of the time units @s@, @m@, or @h@. For example:
</notextile>
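A sketch of this setting (the value below is illustrative; choose a period appropriate for your cluster's load):

```yaml
Clusters:
  zzzzz:
    Containers:
      # How often to poll the API server for new containers to run.
      PollInterval: 10s
```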
-h3(#CrunchRunCommand-network). Containers.CrunchRunArgumentList: Using host networking for containers
+h3(#CrunchRunArgumentList). Containers.CrunchRunArgumentList: Using host networking for containers
Older Linux kernels (prior to 3.18) have bugs in network namespace handling which can lead to compute node lockups. This is indicated by blocked kernel tasks in "Workqueue: netns cleanup_net". If you are experiencing this problem, as a workaround you can disable use of network namespaces by Docker across the cluster. Be aware this reduces container isolation, which may be a security risk.
</pre>
</notextile>
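A sketch of the kind of setting this section refers to (the exact @crunch-run@ flag names are assumptions; confirm them against your installed @crunch-run@ version):

```yaml
Clusters:
  zzzzz:
    Containers:
      # Extra arguments passed through to crunch-run on every container.
      CrunchRunArgumentList:
        - "--container-enable-networking=always"
        - "--container-network-mode=host"
```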
+
+h3(#InstanceTypes). InstanceTypes: Avoid submitting jobs with unsatisfiable resource constraints
+
+LSF does not provide feedback when a submitted job's RAM, CPU, or disk space constraints cannot be satisfied by any node: the job will wait in the queue indefinitely with "pending" status, reported by Arvados as "queued".
+
+As a workaround, you can configure @InstanceTypes@ with your LSF cluster's compute node sizes. Arvados will use these sizes to determine when a container is impossible to run, and cancel it instead of submitting an LSF job.
+
+Apart from detecting non-runnable containers, the configured instance types will not have any effect on scheduling.
+
+<notextile>
+<pre> InstanceTypes:
+ most-ram:
+ VCPUs: 8
+ RAM: 640GiB
+ IncludedScratch: 640GB
+ most-cpus:
+ VCPUs: 32
+ RAM: 256GiB
+ IncludedScratch: 640GB
+ gpu:
+ VCPUs: 8
+ RAM: 256GiB
+ IncludedScratch: 640GB
+ CUDA:
+ DriverVersion: "11.4"
+ HardwareCapability: "7.5"
+ DeviceCount: 1
+</pre>
+</notextile>
+
+
{% assign arvados_component = 'arvados-dispatch-lsf' %}
{% include 'install_packages' %}
A shell node runs the @arvados-login-sync@ service to manage user accounts, and typically has Arvados utilities and SDKs pre-installed. Users are allowed to log in and run arbitrary programs. For optimal performance, the Arvados shell server should be on the same LAN as the Arvados cluster.
-Because Arvados @config.yml@ _contains secrets_ it should not *not* be present on shell nodes.
+Because Arvados @config.yml@ _contains secrets_ it should *not* be present on shell nodes.
Shell nodes should be separate virtual machines from the VMs running other Arvados services. You may choose to grant root access to users so that they can customize the node, for example, installing new programs. This has security considerations depending on whether a shell node is single-user or multi-user.
A single-user shell node should be set up so that it only stores Arvados access tokens that belong to that user. In that case, that user can be safely granted root access without compromising other Arvados users.
-In the multi-user shell node case, a malicious user with @root@ access could access other user's Arvados tokens. Users should only be given @root@ access on a multi-user shell node if you would trust them them to be Arvados administrators. Be aware that with access to the @docker@ daemon, it is trival to gain *root* access to any file on the system, so giving users @docker@ access should be considered equivalent to @root@ access.
+In the multi-user shell node case, a malicious user with @root@ access could access other users' Arvados tokens. Users should only be given @root@ access on a multi-user shell node if you would trust them to be Arvados administrators. Be aware that with access to the @docker@ daemon, it is trivial to gain *root* access to any file on the system, so giving users @docker@ access should be considered equivalent to @root@ access.
h2(#dependencies). Install Dependencies and SDKs
</pre>
</notextile>
-h3. Part of a LoginCLuster federation
+h3. Part of a LoginCluster federation
-If this cluster is part of a "federation with centralized user management":../admin/federation.html#LoginCluster , the login sync script also needs to be given the host and user token for the login cluster.
+If the cluster is part of a "federation with centralized user management":../admin/federation.html#LoginCluster , the login sync script needs to be given an admin token from the login cluster.
<notextile>
<pre>
<code>shellserver:# <span class="userinput">umask 0700; tee /etc/cron.d/arvados-login-sync <<EOF
ARVADOS_API_HOST="<strong>ClusterID.example.com</strong>"
-ARVADOS_API_TOKEN="<strong>xxxxxxxxxxxxxxxxx</strong>"
-LOGINCLUSTER_ARVADOS_API_HOST="<strong>LoginClusterID.example.com</strong>"
-LOGINCLUSTER_ARVADOS_API_TOKEN="<strong>yyyyyyyyyyyyyyyyy</strong>"
+ARVADOS_API_TOKEN="<strong>yyyloginclusteradmintokenyyyy</strong>"
ARVADOS_VIRTUAL_MACHINE_UUID="<strong>zzzzz-2x53u-zzzzzzzzzzzzzzz</strong>"
*/2 * * * * root arvados-login-sync
EOF</span></code>
SPDX-License-Identifier: CC-BY-SA-3.0
{% endcomment %}
-h2. Cancel a container request
+# "Cancel a container request":#cancel-a-container-request
+# "Cancel all container requests":#cancel-all-container-requests
+# "List completed container requests":#list-completed-container-requests
+# "Get input of a CWL workflow":#get-input-of-a-cwl-workflow
+# "Get output of a CWL workflow":#get-output-of-a-cwl-workflow
+# "Get state of a CWL workflow":#get-state-of-a-cwl-workflow
+# "List input of child requests":#list-input-of-child-requests
+# "List output of child requests":#list-output-of-child-requests
+# "List failed child requests":#list-failed-child-requests
+# "Get log of a child request":#get-log-of-a-child-request
+# "Create a collection sharing link":#sharing-link
+# "Combine two or more collections":#combine-two-or-more-collections
+# "Upload a file into a new collection":#upload-a-file-into-a-new-collection
+# "Download a file from a collection":#download-a-file-from-a-collection
+# "Copy files from a collection to a new collection":#copy-files-from-a-collection-to-a-new-collection
+# "Copy files from a collection to another collection":#copy-files-from-a-collection-to-another-collection
+# "Delete a file from an existing collection":#delete-a-file-from-an-existing-collection
+# "Listing records with paging":#listing-records-with-paging
+# "Querying the vocabulary definition":#querying-the-vocabulary-definition
+# "Translating between vocabulary identifiers and labels":#translating-between-vocabulary-identifiers-and-labels
+# "Create a Project":#create-a-project
+
+h2(#cancel-a-container-request). Cancel a container request
{% codeblock as python %}
import arvados
arvados.api().container_requests().update(uuid=container_request_uuid, body={"priority": 0}).execute()
{% endcodeblock %}
-h2. Cancel all container requests
+h2(#cancel-all-container-requests). Cancel all container requests
{% codeblock as python %}
import arvados
api.container_requests().update(uuid=container_request["uuid"], body={"priority": 0}).execute()
{% endcodeblock %}
-h2. List completed container requests
+h2(#list-completed-container-requests). List completed container requests
{% codeblock as python %}
import arvados
print("%s, %s, %s" % (container_request["uuid"], container_request["name"], "Success" if container["exit_code"] == 0 else "Failed"))
{% endcodeblock %}
-h2. Get input of a CWL workflow
+h2(#get-input-of-a-cwl-workflow). Get input of a CWL workflow
{% codeblock as python %}
import arvados
print(container_request["mounts"]["/var/lib/cwl/cwl.input.json"])
{% endcodeblock %}
-h2. Get output of a CWL workflow
+h2(#get-output-of-a-cwl-workflow). Get output of a CWL workflow
{% codeblock as python %}
import arvados
print(collection.open("cwl.output.json").read())
{% endcodeblock %}
-h2. Get state of a CWL workflow
+h2(#get-state-of-a-cwl-workflow). Get state of a CWL workflow
{% codeblock as python %}
import arvados
print(get_cr_state(container_request_uuid))
{% endcodeblock %}
-h2. List input of child requests
+h2(#list-input-of-child-requests). List input of child requests
{% codeblock as python %}
import arvados
print(" %s" % m["portable_data_hash"])
{% endcodeblock %}
-h2. List output of child requests
+h2(#list-output-of-child-requests). List output of child requests
{% codeblock as python %}
import arvados
print("%s -> %s" % (c["name"], uuid_to_pdh[c["output_uuid"]]))
{% endcodeblock %}
-h2. List failed child requests
+h2(#list-failed-child-requests). List failed child requests
{% codeblock as python %}
import arvados
print("%s (%s)" % (child_containers[c["uuid"]]["name"], child_containers[c["uuid"]]["uuid"]))
{% endcodeblock %}
-h2. Get log of a child request
+h2(#get-log-of-a-child-request). Get log of a child request
{% codeblock as python %}
import arvados
print("%s/c=%s/t=%s/_/" % (download, collection_uuid, token["api_token"]))
{% endcodeblock %}
-h2. Combine two or more collections
+h2(#combine-two-or-more-collections). Combine two or more collections
Note: if two collections contain files with the same name, their contents will be concatenated in the resulting manifest.
import arvados
import arvados.collection
api = arvados.api()
-project_uuid = "zzzzz-tpzed-zzzzzzzzzzzzzzz"
+project_uuid = "zzzzz-j7d0g-zzzzzzzzzzzzzzz"
collection_uuids = ["zzzzz-4zz18-aaaaaaaaaaaaaaa", "zzzzz-4zz18-bbbbbbbbbbbbbbb"]
combined_manifest = ""
for u in collection_uuids:
newcol.save_new(name="My combined collection", owner_uuid=project_uuid)
{% endcodeblock %}
-h2. Upload a file into a new collection
+h2(#upload-a-file-into-a-new-collection). Upload a file into a new collection
{% codeblock as python %}
import arvados
print("Saved %s to %s" % (collection_name, c.manifest_locator()))
{% endcodeblock %}
-h2. Download a file from a collection
+h2(#download-a-file-from-a-collection). Download a file from a collection
{% codeblock as python %}
import arvados
print("Finished downloading %s" % filename)
{% endcodeblock %}
-h2. Copy files from a collection to a new collection
+h2(#copy-files-from-a-collection-to-a-new-collection). Copy files from a collection to a new collection
{% codeblock as python %}
import arvados.collection
print("Created collection %s" % target.manifest_locator())
{% endcodeblock %}
-h2. Copy files from a collection to another collection
+h2(#copy-files-from-a-collection-to-another-collection). Copy files from a collection to another collection
{% codeblock as python %}
import arvados.collection
target.save()
{% endcodeblock %}
-h2. Delete a file from an existing collection
+h2(#delete-a-file-from-an-existing-collection). Delete a file from an existing collection
{% codeblock as python %}
import arvados
c.save()
{% endcodeblock %}
-h2. Listing records with paging
+h2(#listing-records-with-paging). Listing records with paging
Use the @arvados.util.keyset_list_all@ helper method to iterate over all the records matching an optional filter. This method handles paging internally and returns results incrementally using a Python iterator. Its first parameter is the @list@ method of an Arvados resource (@collections@, @container_requests@, etc).
print("got collection " + c["uuid"])
{% endcodeblock %}
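The keyset strategy can be illustrated with a toy, server-free sketch: instead of using offsets, each page is requested by filtering on the last-seen value of the order key. This mimics the idea only; the SDK's actual implementation also breaks ties on uuid and supports descending order.

```python
# Toy records standing in for API results, already sorted by created_at.
records = [{"uuid": "u%03d" % i, "created_at": i} for i in range(10)]

def list_page(filters, limit):
    # Stand-in for resource.list(filters=..., limit=...).execute():
    # apply ">" filters, then truncate to the page size.
    items = [r for r in records
             if all(r[f] > v for f, op, v in filters if op == ">")]
    return items[:limit]

def keyset_list_all(limit=3):
    # Page through all records by filtering past the last-seen key,
    # so no record is skipped or repeated even as pages advance.
    last = None
    while True:
        filters = [] if last is None else [["created_at", ">", last]]
        page = list_page(filters, limit)
        if not page:
            return
        for r in page:
            yield r
        last = page[-1]["created_at"]

got = [r["uuid"] for r in keyset_list_all()]
print(len(got))  # 10
```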
-h2. Querying the vocabulary definition
+h2(#querying-the-vocabulary-definition). Querying the vocabulary definition
The Python SDK provides facilities to interact with the "active metadata vocabulary":{{ site.baseurl }}/admin/metadata-vocabulary.html in the system. The developer can do key and value lookups in a case-insensitive manner:
# Example output: 'IDVALSIZES2'
{% endcodeblock %}
-h2. Translating between vocabulary identifiers and labels
+h2(#translating-between-vocabulary-identifiers-and-labels). Translating between vocabulary identifiers and labels
Client software might need to present properties to the user in a human-readable form, or take input from the user without requiring them to remember identifiers. For these cases, there are a couple of conversion methods that take a dictionary as input, like this:
# Example output: {'Importance': 'Critical'}
voc.convert_to_identifiers({'creature': 'elephant'})
# Example output: {'IDTAGANIMALS': 'IDVALANIMALS3'}
-{% endcodeblock %}
\ No newline at end of file
+{% endcodeblock %}
+
+h2(#create-a-project). Create a Project
+
+{% codeblock as python %}
+import arvados
+
+parent_project_uuid = "zzzzz-j7d0g-zzzzzzzzzzzzzzz"
+project_name = "My project"
+
+g = arvados.api().groups().create(body={
+ "group": {
+ "group_class": "project",
+ "owner_uuid": parent_project_uuid,
+ "name": project_name,
+ }}).execute()
+
+print("New project uuid is", g["uuid"])
+{% endcodeblock %}
LocalKeepLogsToContainerLog: none
Logging:
- # When you run the db:delete_old_container_logs task, it will find
- # containers that have been finished for at least this many seconds,
+ # Periodically (see SweepInterval) Arvados will check for
+ # containers that have been finished for at least this long,
# and delete their stdout, stderr, arv-mount, crunch-run, and
# crunchstat logs from the logs table.
MaxAge: 720h
+ # How often to delete cached log entries for finished
+ # containers (see MaxAge).
+ SweepInterval: 12h
+
# These two settings control how frequently log events are flushed to the
# database. Log lines are buffered until either crunch_log_bytes_per_event
# has been reached or crunch_log_seconds_between_events has elapsed since
# This feature is disabled when set to zero.
IdleTimeout: 0s
+ # URL to a file that is a fragment of text or HTML which should
+ # be rendered in Workbench as a banner.
+ BannerURL: ""
+
# Workbench welcome screen, this is HTML text that will be
# incorporated directly onto the page.
WelcomePageHTML: |
"Workbench.UserProfileFormFields.*.*.*": true,
"Workbench.UserProfileFormMessage": true,
"Workbench.WelcomePageHTML": true,
+ "Workbench.BannerURL": true,
}
func redactUnsafe(m map[string]interface{}, mPrefix, lookupPrefix string) error {
)
var (
- TrashSweep = &DBLocker{key: 10001}
- retryDelay = 5 * time.Second
+ TrashSweep = &DBLocker{key: 10001}
+ ContainerLogSweep = &DBLocker{key: 10002}
+ retryDelay = 5 * time.Second
)
// DBLocker uses pg_advisory_lock to maintain a cluster-wide lock for
return conn.chooseBackend(options.UUID).LinkDelete(ctx, options)
}
+func (conn *Conn) LogCreate(ctx context.Context, options arvados.CreateOptions) (arvados.Log, error) {
+ return conn.chooseBackend(options.ClusterID).LogCreate(ctx, options)
+}
+
+func (conn *Conn) LogUpdate(ctx context.Context, options arvados.UpdateOptions) (arvados.Log, error) {
+ return conn.chooseBackend(options.UUID).LogUpdate(ctx, options)
+}
+
+func (conn *Conn) LogGet(ctx context.Context, options arvados.GetOptions) (arvados.Log, error) {
+ return conn.chooseBackend(options.UUID).LogGet(ctx, options)
+}
+
+func (conn *Conn) LogList(ctx context.Context, options arvados.ListOptions) (arvados.LogList, error) {
+ return conn.generated_LogList(ctx, options)
+}
+
+func (conn *Conn) LogDelete(ctx context.Context, options arvados.DeleteOptions) (arvados.Log, error) {
+ return conn.chooseBackend(options.UUID).LogDelete(ctx, options)
+}
+
func (conn *Conn) SpecimenList(ctx context.Context, options arvados.ListOptions) (arvados.SpecimenList, error) {
return conn.generated_SpecimenList(ctx, options)
}
defer out.Close()
out.Write(regexp.MustCompile(`(?ms)^.*package .*?import.*?\n\)\n`).Find(buf))
io.WriteString(out, "//\n// -- this file is auto-generated -- do not edit -- edit list.go and run \"go generate\" instead --\n//\n\n")
- for _, t := range []string{"Container", "ContainerRequest", "Group", "Specimen", "User", "Link", "APIClientAuthorization"} {
+ for _, t := range []string{"Container", "ContainerRequest", "Group", "Specimen", "User", "Link", "Log", "APIClientAuthorization"} {
_, err := out.Write(bytes.ReplaceAll(orig, []byte("Collection"), []byte(t)))
if err != nil {
panic(err)
return merged, err
}
+func (conn *Conn) generated_LogList(ctx context.Context, options arvados.ListOptions) (arvados.LogList, error) {
+ var mtx sync.Mutex
+ var merged arvados.LogList
+ var needSort atomic.Value
+ needSort.Store(false)
+ err := conn.splitListRequest(ctx, options, func(ctx context.Context, _ string, backend arvados.API, options arvados.ListOptions) ([]string, error) {
+ options.ForwardedFor = conn.cluster.ClusterID + "-" + options.ForwardedFor
+ cl, err := backend.LogList(ctx, options)
+ if err != nil {
+ return nil, err
+ }
+ mtx.Lock()
+ defer mtx.Unlock()
+ if len(merged.Items) == 0 {
+ merged = cl
+ } else if len(cl.Items) > 0 {
+ merged.Items = append(merged.Items, cl.Items...)
+ needSort.Store(true)
+ }
+ uuids := make([]string, 0, len(cl.Items))
+ for _, item := range cl.Items {
+ uuids = append(uuids, item.UUID)
+ }
+ return uuids, nil
+ })
+ if needSort.Load().(bool) {
+ // Apply the default/implied order, "modified_at desc"
+ sort.Slice(merged.Items, func(i, j int) bool {
+ mi, mj := merged.Items[i].ModifiedAt, merged.Items[j].ModifiedAt
+ return mj.Before(mi)
+ })
+ }
+ if merged.Items == nil {
+ // Return empty results as [], not null
+ // (https://github.com/golang/go/issues/27589 might be
+ // a better solution in the future)
+ merged.Items = []arvados.Log{}
+ }
+ return merged, err
+}
+
func (conn *Conn) generated_APIClientAuthorizationList(ctx context.Context, options arvados.ListOptions) (arvados.APIClientAuthorizationList, error) {
var mtx sync.Mutex
var merged arvados.APIClientAuthorizationList
}
go h.trashSweepWorker()
+ go h.containerLogSweepWorker()
}
var errDBConnection = errors.New("database connection error")
}
}
+func (s *HandlerSuite) TestContainerLogSweep(c *check.C) {
+ s.cluster.SystemRootToken = arvadostest.SystemRootToken
+ s.cluster.Containers.Logging.SweepInterval = arvados.Duration(time.Second / 10)
+ s.handler.CheckHealth()
+ ctx := auth.NewContext(s.ctx, &auth.Credentials{Tokens: []string{arvadostest.ActiveTokenV2}})
+ logentry, err := s.handler.federation.LogCreate(ctx, arvados.CreateOptions{Attrs: map[string]interface{}{
+ "object_uuid": arvadostest.CompletedContainerUUID,
+ "event_type": "stderr",
+ "properties": map[string]interface{}{
+ "text": "test trash sweep\n",
+ },
+ }})
+ c.Assert(err, check.IsNil)
+ defer s.handler.federation.LogDelete(ctx, arvados.DeleteOptions{UUID: logentry.UUID})
+ deadline := time.Now().Add(5 * time.Second)
+ for {
+ if time.Now().After(deadline) {
+ c.Log("timed out")
+ c.FailNow()
+ }
+ logentries, err := s.handler.federation.LogList(ctx, arvados.ListOptions{Filters: []arvados.Filter{{"uuid", "=", logentry.UUID}}, Limit: -1})
+ c.Assert(err, check.IsNil)
+ if len(logentries.Items) == 0 {
+ break
+ }
+ time.Sleep(time.Second / 10)
+ }
+}
+
func (s *HandlerSuite) TestLogActivity(c *check.C) {
s.cluster.SystemRootToken = arvadostest.SystemRootToken
s.cluster.Users.ActivityLoggingPeriod = arvados.Duration(24 * time.Hour)
return rtr.backend.LinkDelete(ctx, *opts.(*arvados.DeleteOptions))
},
},
+ {
+ arvados.EndpointLogCreate,
+ func() interface{} { return &arvados.CreateOptions{} },
+ func(ctx context.Context, opts interface{}) (interface{}, error) {
+ return rtr.backend.LogCreate(ctx, *opts.(*arvados.CreateOptions))
+ },
+ },
+ {
+ arvados.EndpointLogUpdate,
+ func() interface{} { return &arvados.UpdateOptions{} },
+ func(ctx context.Context, opts interface{}) (interface{}, error) {
+ return rtr.backend.LogUpdate(ctx, *opts.(*arvados.UpdateOptions))
+ },
+ },
+ {
+ arvados.EndpointLogList,
+ func() interface{} { return &arvados.ListOptions{Limit: -1} },
+ func(ctx context.Context, opts interface{}) (interface{}, error) {
+ return rtr.backend.LogList(ctx, *opts.(*arvados.ListOptions))
+ },
+ },
+ {
+ arvados.EndpointLogGet,
+ func() interface{} { return &arvados.GetOptions{} },
+ func(ctx context.Context, opts interface{}) (interface{}, error) {
+ return rtr.backend.LogGet(ctx, *opts.(*arvados.GetOptions))
+ },
+ },
+ {
+ arvados.EndpointLogDelete,
+ func() interface{} { return &arvados.DeleteOptions{} },
+ func(ctx context.Context, opts interface{}) (interface{}, error) {
+ return rtr.backend.LogDelete(ctx, *opts.(*arvados.DeleteOptions))
+ },
+ },
{
arvados.EndpointSpecimenCreate,
func() interface{} { return &arvados.CreateOptions{} },
return resp, err
}
+func (conn *Conn) LogCreate(ctx context.Context, options arvados.CreateOptions) (arvados.Log, error) {
+ ep := arvados.EndpointLogCreate
+ var resp arvados.Log
+ err := conn.requestAndDecode(ctx, &resp, ep, nil, options)
+ return resp, err
+}
+
+func (conn *Conn) LogUpdate(ctx context.Context, options arvados.UpdateOptions) (arvados.Log, error) {
+ ep := arvados.EndpointLogUpdate
+ var resp arvados.Log
+ err := conn.requestAndDecode(ctx, &resp, ep, nil, options)
+ return resp, err
+}
+
+func (conn *Conn) LogGet(ctx context.Context, options arvados.GetOptions) (arvados.Log, error) {
+ ep := arvados.EndpointLogGet
+ var resp arvados.Log
+ err := conn.requestAndDecode(ctx, &resp, ep, nil, options)
+ return resp, err
+}
+
+func (conn *Conn) LogList(ctx context.Context, options arvados.ListOptions) (arvados.LogList, error) {
+ ep := arvados.EndpointLogList
+ var resp arvados.LogList
+ err := conn.requestAndDecode(ctx, &resp, ep, nil, options)
+ return resp, err
+}
+
+func (conn *Conn) LogDelete(ctx context.Context, options arvados.DeleteOptions) (arvados.Log, error) {
+ ep := arvados.EndpointLogDelete
+ var resp arvados.Log
+ err := conn.requestAndDecode(ctx, &resp, ep, nil, options)
+ return resp, err
+}
+
func (conn *Conn) SpecimenCreate(ctx context.Context, options arvados.CreateOptions) (arvados.Specimen, error) {
ep := arvados.EndpointSpecimenCreate
var resp arvados.Specimen
package controller
import (
+ "context"
"time"
"git.arvados.org/arvados.git/lib/controller/dblock"
"git.arvados.org/arvados.git/sdk/go/ctxlog"
)
-func (h *Handler) trashSweepWorker() {
- sleep := h.Cluster.Collections.TrashSweepInterval.Duration()
- logger := ctxlog.FromContext(h.BackgroundContext).WithField("worker", "trash sweep")
+func (h *Handler) periodicWorker(workerName string, interval time.Duration, locker *dblock.DBLocker, run func(context.Context) error) {
+ logger := ctxlog.FromContext(h.BackgroundContext).WithField("worker", workerName)
ctx := ctxlog.Context(h.BackgroundContext, logger)
- if sleep <= 0 {
- logger.Debugf("Collections.TrashSweepInterval is %v, not running worker", sleep)
+ if interval <= 0 {
+ logger.Debugf("interval is %v, not running worker", interval)
return
}
- dblock.TrashSweep.Lock(ctx, h.db)
- defer dblock.TrashSweep.Unlock()
- for time.Sleep(sleep); ctx.Err() == nil; time.Sleep(sleep) {
- dblock.TrashSweep.Check()
- ctx := auth.NewContext(ctx, &auth.Credentials{Tokens: []string{h.Cluster.SystemRootToken}})
- _, err := h.federation.SysTrashSweep(ctx, struct{}{})
+ locker.Lock(ctx, h.db)
+ defer locker.Unlock()
+ for time.Sleep(interval); ctx.Err() == nil; time.Sleep(interval) {
+ locker.Check()
+ err := run(ctx)
if err != nil {
- logger.WithError(err).Info("trash sweep failed")
+ logger.WithError(err).Infof("%s failed", workerName)
}
}
}
+
+func (h *Handler) trashSweepWorker() {
+ h.periodicWorker("trash sweep", h.Cluster.Collections.TrashSweepInterval.Duration(), dblock.TrashSweep, func(ctx context.Context) error {
+ ctx = auth.NewContext(ctx, &auth.Credentials{Tokens: []string{h.Cluster.SystemRootToken}})
+ _, err := h.federation.SysTrashSweep(ctx, struct{}{})
+ return err
+ })
+}
+
+func (h *Handler) containerLogSweepWorker() {
+ h.periodicWorker("container log sweep", h.Cluster.Containers.Logging.SweepInterval.Duration(), dblock.ContainerLogSweep, func(ctx context.Context) error {
+ db, err := h.db(ctx)
+ if err != nil {
+ return err
+ }
+ res, err := db.ExecContext(ctx, `
+DELETE FROM logs
+ USING containers
+ WHERE logs.object_uuid=containers.uuid
+ AND logs.event_type in ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat', 'hoststat', 'node', 'container', 'keepstore')
+ AND containers.log IS NOT NULL
+ AND now() - containers.finished_at > $1::interval`,
+ h.Cluster.Containers.Logging.MaxAge.String())
+ if err != nil {
+ return err
+ }
+ logger := ctxlog.FromContext(ctx)
+ rows, err := res.RowsAffected()
+ if err != nil {
+ logger.WithError(err).Warn("unexpected error from RowsAffected()")
+ } else {
+ logger.WithField("rows", rows).Info("deleted rows from logs table")
+ }
+ return nil
+ })
+}
"net"
"net/http"
"net/url"
+ "os"
"strings"
"time"
"git.arvados.org/arvados.git/lib/cmd"
+ "git.arvados.org/arvados.git/lib/config"
"git.arvados.org/arvados.git/sdk/go/arvados"
"git.arvados.org/arvados.git/sdk/go/ctxlog"
+ "git.arvados.org/arvados.git/sdk/go/health"
"github.com/sirupsen/logrus"
)
return
}
+ diag.dotest(5, "running health check (same as `arvados-server check`)", func() error {
+ ldr := config.NewLoader(&bytes.Buffer{}, ctxlog.New(&bytes.Buffer{}, "text", "info"))
+ ldr.SetupFlags(flag.NewFlagSet("diagnostics", flag.ContinueOnError))
+ cfg, err := ldr.Load()
+ if err != nil {
+ diag.infof("skipping because config could not be loaded: %s", err)
+ return nil
+ }
+ cluster, err := cfg.GetCluster("")
+ if err != nil {
+ return err
+ }
+	if cluster.SystemRootToken != os.Getenv("ARVADOS_API_TOKEN") {
+		diag.infof("skipping because provided token is not SystemRootToken")
+		return nil
+	}
+ agg := &health.Aggregator{Cluster: cluster}
+ resp := agg.ClusterHealth()
+ for _, e := range resp.Errors {
+ diag.errorf("health check: %s", e)
+ }
+ diag.infof("health check: reported clock skew %v", resp.ClockSkew)
+ return nil
+ })
+
var dd arvados.DiscoveryDocument
ddpath := "discovery/v1/apis/arvados/v1/rest"
diag.dotest(10, fmt.Sprintf("getting discovery document from https://%s/%s", client.APIHost, ddpath), func() error {
if ctr.State != dispatch.Locked {
// already started by prior invocation
} else if _, ok := disp.lsfqueue.Lookup(ctr.UUID); !ok {
+ if _, err := dispatchcloud.ChooseInstanceType(disp.Cluster, &ctr); errors.As(err, &dispatchcloud.ConstraintsNotSatisfiableError{}) {
+ err := disp.arvDispatcher.Arv.Update("containers", ctr.UUID, arvadosclient.Dict{
+ "container": map[string]interface{}{
+ "runtime_status": map[string]string{
+ "error": err.Error(),
+ },
+ },
+ }, nil)
+ if err != nil {
+ return fmt.Errorf("error setting runtime_status on %s: %s", ctr.UUID, err)
+ }
+ return disp.arvDispatcher.UpdateState(ctr.UUID, dispatch.Cancelled)
+ }
disp.logger.Printf("Submitting container %s to LSF", ctr.UUID)
cmd := []string{disp.Cluster.Containers.CrunchRunCommand}
cmd = append(cmd, "--runtime-engine="+disp.Cluster.Containers.RuntimeEngine)
defer disp.logger.Printf("Done monitoring container %s", ctr.UUID)
go func(uuid string) {
- cancelled := false
for ctx.Err() == nil {
- qent, ok := disp.lsfqueue.Lookup(uuid)
+ _, ok := disp.lsfqueue.Lookup(uuid)
if !ok {
// If the container disappears from
// the lsf queue, there is no point in
cancel()
return
}
- if !cancelled && qent.Stat == "PEND" && strings.Contains(qent.PendReason, "There are no suitable hosts for the job") {
- disp.logger.Printf("container %s: %s", uuid, qent.PendReason)
- err := disp.arvDispatcher.Arv.Update("containers", uuid, arvadosclient.Dict{
- "container": map[string]interface{}{
- "runtime_status": map[string]string{
- "error": qent.PendReason,
- },
- },
- }, nil)
- if err != nil {
- disp.logger.Printf("error setting runtime_status on %s: %s", uuid, err)
- continue // retry
- }
- err = disp.arvDispatcher.UpdateState(uuid, dispatch.Cancelled)
- if err != nil {
- continue // retry (UpdateState() already logged the error)
- }
- cancelled = true
- }
}
}(ctr.UUID)
type suite struct {
disp *dispatcher
crTooBig arvados.ContainerRequest
+ crPending arvados.ContainerRequest
crCUDARequest arvados.ContainerRequest
}
c.Assert(err, check.IsNil)
cluster.Containers.CloudVMs.PollInterval = arvados.Duration(time.Second / 4)
cluster.Containers.MinRetryPeriod = arvados.Duration(time.Second / 4)
+ cluster.InstanceTypes = arvados.InstanceTypeMap{
+ "biggest_available_node": arvados.InstanceType{
+ RAM: 100 << 30, // 100 GiB
+ VCPUs: 4,
+ IncludedScratch: 100 << 30,
+ Scratch: 100 << 30,
+ }}
s.disp = newHandler(context.Background(), cluster, arvadostest.Dispatch1Token, prometheus.NewRegistry()).(*dispatcher)
s.disp.lsfcli.stubCommand = func(string, ...string) *exec.Cmd {
return exec.Command("bash", "-c", "echo >&2 unimplemented stub; false")
})
c.Assert(err, check.IsNil)
+ err = arvados.NewClientFromEnv().RequestAndDecode(&s.crPending, "POST", "arvados/v1/container_requests", nil, map[string]interface{}{
+ "container_request": map[string]interface{}{
+ "runtime_constraints": arvados.RuntimeConstraints{
+ RAM: 100000000,
+ VCPUs: 2,
+ },
+ "container_image": arvadostest.DockerImage112PDH,
+ "command": []string{"sleep", "1"},
+ "mounts": map[string]arvados.Mount{"/mnt/out": {Kind: "tmp", Capacity: 1000}},
+ "output_path": "/mnt/out",
+ "state": arvados.ContainerRequestStateCommitted,
+ "priority": 1,
+ "container_count_max": 1,
+ },
+ })
+ c.Assert(err, check.IsNil)
+
err = arvados.NewClientFromEnv().RequestAndDecode(&s.crCUDARequest, "POST", "arvados/v1/container_requests", nil, map[string]interface{}{
"container_request": map[string]interface{}{
"runtime_constraints": arvados.RuntimeConstraints{
fakejobq[nextjobid] = args[1]
nextjobid++
mtx.Unlock()
- case s.crTooBig.ContainerUUID:
+ case s.crPending.ContainerUUID:
c.Check(args, check.DeepEquals, []string{
- "-J", s.crTooBig.ContainerUUID,
- "-n", "1",
- "-D", "954187MB",
- "-R", "rusage[mem=954187MB:tmp=256MB] span[hosts=1]",
- "-R", "select[mem>=954187MB]",
+ "-J", s.crPending.ContainerUUID,
+ "-n", "2",
+ "-D", "608MB",
+ "-R", "rusage[mem=608MB:tmp=256MB] span[hosts=1]",
+ "-R", "select[mem>=608MB]",
"-R", "select[tmp>=256MB]",
- "-R", "select[ncpus>=1]"})
+ "-R", "select[ncpus>=2]"})
mtx.Lock()
fakejobq[nextjobid] = args[1]
nextjobid++
var records []map[string]interface{}
for jobid, uuid := range fakejobq {
stat, reason := "RUN", ""
- if uuid == s.crTooBig.ContainerUUID {
+ if uuid == s.crPending.ContainerUUID {
// The real bjobs output includes a trailing ';' here:
stat, reason = "PEND", "There are no suitable hosts for the job;"
}
c.Error("timed out")
break
}
+ // "crTooBig" should never be submitted to lsf because
+ // it is bigger than any configured instance type
+ if ent, ok := s.disp.lsfqueue.Lookup(s.crTooBig.ContainerUUID); ok {
+ c.Errorf("Lookup(crTooBig) == true, ent = %#v", ent)
+ break
+ }
// "queuedcontainer" should be running
if _, ok := s.disp.lsfqueue.Lookup(arvadostest.QueuedContainerUUID); !ok {
c.Log("Lookup(queuedcontainer) == false")
continue
}
+ // "crPending" should be pending
+ if _, ok := s.disp.lsfqueue.Lookup(s.crPending.ContainerUUID); !ok {
+ c.Log("Lookup(crPending) == false")
+ continue
+ }
// "lockedcontainer" should be cancelled because it
// has priority 0 (no matching container requests)
if ent, ok := s.disp.lsfqueue.Lookup(arvadostest.LockedContainerUUID); ok {
c.Logf("Lookup(lockedcontainer) == true, ent = %#v", ent)
continue
}
- // "crTooBig" should be cancelled because lsf stub
- // reports there is no suitable instance type
- if ent, ok := s.disp.lsfqueue.Lookup(s.crTooBig.ContainerUUID); ok {
- c.Logf("Lookup(crTooBig) == true, ent = %#v", ent)
- continue
- }
var ctr arvados.Container
if err := s.disp.arvDispatcher.Arv.Get("containers", arvadostest.LockedContainerUUID, nil, &ctr); err != nil {
c.Logf("error getting container state for %s: %s", arvadostest.LockedContainerUUID, err)
c.Logf("container %s is not in the LSF queue but its arvados record has not been updated to state==Cancelled (state is %q)", s.crTooBig.ContainerUUID, ctr.State)
continue
} else {
- c.Check(ctr.RuntimeStatus["error"], check.Equals, "There are no suitable hosts for the job;")
+ c.Check(ctr.RuntimeStatus["error"], check.Equals, "constraints not satisfiable by any configured instance type")
}
c.Log("reached desired state")
break
EndpointLinkGet = APIEndpoint{"GET", "arvados/v1/links/{uuid}", ""}
EndpointLinkList = APIEndpoint{"GET", "arvados/v1/links", ""}
EndpointLinkDelete = APIEndpoint{"DELETE", "arvados/v1/links/{uuid}", ""}
+ EndpointLogCreate = APIEndpoint{"POST", "arvados/v1/logs", "log"}
+ EndpointLogUpdate = APIEndpoint{"PATCH", "arvados/v1/logs/{uuid}", "log"}
+ EndpointLogGet = APIEndpoint{"GET", "arvados/v1/logs/{uuid}", ""}
+ EndpointLogList = APIEndpoint{"GET", "arvados/v1/logs", ""}
+ EndpointLogDelete = APIEndpoint{"DELETE", "arvados/v1/logs/{uuid}", ""}
EndpointSysTrashSweep = APIEndpoint{"POST", "sys/trash_sweep", ""}
EndpointUserActivate = APIEndpoint{"POST", "arvados/v1/users/{uuid}/activate", ""}
EndpointUserCreate = APIEndpoint{"POST", "arvados/v1/users", "user"}
LinkGet(ctx context.Context, options GetOptions) (Link, error)
LinkList(ctx context.Context, options ListOptions) (LinkList, error)
LinkDelete(ctx context.Context, options DeleteOptions) (Link, error)
+ LogCreate(ctx context.Context, options CreateOptions) (Log, error)
+ LogUpdate(ctx context.Context, options UpdateOptions) (Log, error)
+ LogGet(ctx context.Context, options GetOptions) (Log, error)
+ LogList(ctx context.Context, options ListOptions) (LogList, error)
+ LogDelete(ctx context.Context, options DeleteOptions) (Log, error)
SpecimenCreate(ctx context.Context, options CreateOptions) (Specimen, error)
SpecimenUpdate(ctx context.Context, options UpdateOptions) (Specimen, error)
SpecimenGet(ctx context.Context, options GetOptions) (Specimen, error)
// Space characters are trimmed when reading the settings file, so
// these are equivalent:
//
-// ARVADOS_API_HOST=localhost\n
-// ARVADOS_API_HOST=localhost\r\n
-// ARVADOS_API_HOST = localhost \n
-// \tARVADOS_API_HOST = localhost\n
+// ARVADOS_API_HOST=localhost\n
+// ARVADOS_API_HOST=localhost\r\n
+// ARVADOS_API_HOST = localhost \n
+// \tARVADOS_API_HOST = localhost\n
func NewClientFromEnv() *Client {
vars := map[string]string{}
home := os.Getenv("HOME")
// Convert an arbitrary struct to url.Values. For example,
//
-// Foo{Bar: []int{1,2,3}, Baz: "waz"}
+// Foo{Bar: []int{1,2,3}, Baz: "waz"}
//
// becomes
//
-// url.Values{`bar`:`{"a":[1,2,3]}`,`Baz`:`waz`}
+// url.Values{`bar`:`{"a":[1,2,3]}`,`Baz`:`waz`}
//
// params itself is returned if it is already an url.Values.
func anythingToValues(params interface{}) (url.Values, error) {
SSHHelpPageHTML string
SSHHelpHostSuffix string
IdleTimeout Duration
+ BannerURL string
}
}
}
Logging struct {
MaxAge Duration
+ SweepInterval Duration
LogBytesPerEvent int
LogSecondsBetweenEvents Duration
LogThrottlePeriod Duration
var errDuplicateInstanceTypeName = errors.New("duplicate instance type name")
// UnmarshalJSON does special handling of InstanceTypes:
-// * populate computed fields (Name and Scratch)
-// * error out if InstancesTypes are populated as an array, which was
-// deprecated in Arvados 1.2.0
+//
+// - populate computed fields (Name and Scratch)
+//
+// - error out if InstancesTypes are populated as an array, which was
+// deprecated in Arvados 1.2.0
func (it *InstanceTypeMap) UnmarshalJSON(data []byte) error {
fixup := func(t InstanceType) (InstanceType, error) {
if t.ProviderType == "" {
package arvados
import (
+ "bytes"
"encoding/json"
"fmt"
"strings"
// UnmarshalJSON implements json.Unmarshaler.
func (d *Duration) UnmarshalJSON(data []byte) error {
+ if bytes.Equal(data, []byte(`"0"`)) || bytes.Equal(data, []byte(`0`)) {
+ // Unitless 0 is not accepted by ParseDuration, but we
+ // accept it as a reasonable spelling of 0
+ // nanoseconds.
+ *d = 0
+ return nil
+ }
if data[0] == '"' {
return d.Set(string(data[1 : len(data)-1]))
}
err = json.Unmarshal([]byte(`{"D":"60s"}`), &d)
c.Check(err, check.IsNil)
c.Check(d.D.Duration(), check.Equals, time.Minute)
+
+ d.D = Duration(time.Second)
+ err = json.Unmarshal([]byte(`{"D":"0"}`), &d)
+ c.Check(err, check.IsNil)
+ c.Check(d.D.Duration(), check.Equals, time.Duration(0))
+
+ d.D = Duration(time.Second)
+ err = json.Unmarshal([]byte(`{"D":0}`), &d)
+ c.Check(err, check.IsNil)
+ c.Check(d.D.Duration(), check.Equals, time.Duration(0))
}
//
// After seeking:
//
-// ptr.segmentIdx == len(filenode.segments) // i.e., at EOF
-// ||
-// filenode.segments[ptr.segmentIdx].Len() > ptr.segmentOff
+// ptr.segmentIdx == len(filenode.segments) // i.e., at EOF
+// ||
+// filenode.segments[ptr.segmentIdx].Len() > ptr.segmentOff
func (fn *filenode) seek(startPtr filenodePtr) (ptr filenodePtr) {
ptr = startPtr
if ptr.off < 0 {
type Log struct {
ID uint64 `json:"id"`
UUID string `json:"uuid"`
+ OwnerUUID string `json:"owner_uuid"`
ObjectUUID string `json:"object_uuid"`
ObjectOwnerUUID string `json:"object_owner_uuid"`
EventType string `json:"event_type"`
- EventAt *time.Time `json:"event"`
+ EventAt time.Time `json:"event_at"`
+ Summary string `json:"summary"`
Properties map[string]interface{} `json:"properties"`
- CreatedAt *time.Time `json:"created_at"`
+ CreatedAt time.Time `json:"created_at"`
+ ModifiedAt time.Time `json:"modified_at"`
}
// LogList is an arvados#logList resource.
as.appendCall(ctx, as.LinkDelete, options)
return arvados.Link{}, as.Error
}
+func (as *APIStub) LogCreate(ctx context.Context, options arvados.CreateOptions) (arvados.Log, error) {
+ as.appendCall(ctx, as.LogCreate, options)
+ return arvados.Log{}, as.Error
+}
+func (as *APIStub) LogUpdate(ctx context.Context, options arvados.UpdateOptions) (arvados.Log, error) {
+ as.appendCall(ctx, as.LogUpdate, options)
+ return arvados.Log{}, as.Error
+}
+func (as *APIStub) LogGet(ctx context.Context, options arvados.GetOptions) (arvados.Log, error) {
+ as.appendCall(ctx, as.LogGet, options)
+ return arvados.Log{}, as.Error
+}
+func (as *APIStub) LogList(ctx context.Context, options arvados.ListOptions) (arvados.LogList, error) {
+ as.appendCall(ctx, as.LogList, options)
+ return arvados.LogList{}, as.Error
+}
+func (as *APIStub) LogDelete(ctx context.Context, options arvados.DeleteOptions) (arvados.Log, error) {
+ as.appendCall(ctx, as.LogDelete, options)
+ return arvados.Log{}, as.Error
+}
func (as *APIStub) SpecimenCreate(ctx context.Context, options arvados.CreateOptions) (arvados.Specimen, error) {
as.appendCall(ctx, as.SpecimenCreate, options)
return arvados.Specimen{}, as.Error
for svcName, sh := range resp.Services {
switch svcName {
case arvados.ServiceNameDispatchCloud,
- arvados.ServiceNameDispatchLSF:
+ arvados.ServiceNameDispatchLSF,
+ arvados.ServiceNameDispatchSLURM:
// ok to not run any given dispatcher
case arvados.ServiceNameHealth,
arvados.ServiceNameWorkbench1,
err := ccmd.run(ctx, prog, args, stdin, stdout, stderr)
if err != nil {
if err != errSilent {
- fmt.Fprintln(stdout, err.Error())
+ fmt.Fprintln(stderr, err.Error())
}
return 1
}
loader.SetupFlags(flags)
versionFlag := flags.Bool("version", false, "Write version information to stdout and exit 0")
timeout := flags.Duration("timeout", defaultTimeout.Duration(), "Maximum time to wait for health responses")
+ quiet := flags.Bool("quiet", false, "Silent on success (suppress 'health check OK' message on stderr)")
- outputYAML := flags.Bool("yaml", false, "Output full health report in YAML format (default mode shows errors as plain text, is silent on success)")
+ outputYAML := flags.Bool("yaml", false, "Output full health report in YAML format (default mode shows errors as plain text)")
if ok, _ := cmd.ParseFlags(flags, prog, args, "", stderr); !ok {
// cmd.ParseFlags already reported the error
}
if resp.Health != "OK" {
for _, msg := range resp.Errors {
- fmt.Fprintln(stdout, msg)
+ fmt.Fprintln(stderr, msg)
}
fmt.Fprintln(stderr, "health check failed")
return errSilent
}
+ if !*quiet {
+ fmt.Fprintln(stderr, "health check OK")
+ }
return nil
}
exitcode := CheckCommand.RunCommand("check", []string{"-config=" + tmpdir + "/config.yml"}, &bytes.Buffer{}, &stdout, &stderr)
c.Check(exitcode, check.Equals, 0)
+ c.Check(stderr.String(), check.Equals, "health check OK\n")
+ c.Check(stdout.String(), check.Equals, "")
+
+ stdout.Reset()
+ stderr.Reset()
+ exitcode = CheckCommand.RunCommand("check", []string{"-quiet", "-config=" + tmpdir + "/config.yml"}, &bytes.Buffer{}, &stdout, &stderr)
+ c.Check(exitcode, check.Equals, 0)
c.Check(stderr.String(), check.Equals, "")
c.Check(stdout.String(), check.Equals, "")
package keepclient
import (
+ "fmt"
"io"
"sort"
"strconv"
data = make([]byte, size, bufsize)
_, err = io.ReadFull(rdr, data)
err2 := rdr.Close()
- if err == nil {
- err = err2
+ if err == nil && err2 != nil {
+ err = fmt.Errorf("close(): %w", err2)
+ }
+ if err != nil {
+ err = fmt.Errorf("Get %s: %w", locator, err)
}
}
c.mtx.Lock()
"GitInternalDir": os.path.join(SERVICES_SRC_DIR, 'api', 'tmp', 'internal.git'),
},
"LocalKeepBlobBuffersPerVCPU": 0,
+ "Logging": {
+ "SweepInterval": 0, # disable, otherwise test cases can't acquire dblock
+ },
"SupportedDockerImageFormats": {"v1": {}},
"ShellAccess": {
"Admin": True,
# delete oid_login_perms for this user
#
- # note: these permission links are obsolete, they have no effect
- # on anything and they are not created for new users.
+ # note: these permission links are obsolete anyway: they have no
+ # effect on anything and they are not created for new users.
Link.where(tail_uuid: self.email,
link_class: 'permission',
name: 'can_login').destroy_all
- # delete repo_perms for this user
- Link.where(tail_uuid: self.uuid,
- link_class: 'permission',
- name: 'can_manage').destroy_all
-
- # delete vm_login_perms for this user
- Link.where(tail_uuid: self.uuid,
- link_class: 'permission',
- name: 'can_login').destroy_all
-
- # delete "All users" group read permissions for this user
+ # Delete all sharing permissions so (a) the user doesn't
+ # automatically regain access to anything if re-setup in future,
+ # (b) the user doesn't appear in "currently shared with" lists
+ # shown to other users.
+ #
+ # Notably this includes the can_read -> "all users" group
+ # permission.
Link.where(tail_uuid: self.uuid,
- head_uuid: all_users_group_uuid,
link_class: 'permission').destroy_all
# delete any signatures by this user
# from the logs table.
namespace :db do
- desc "Remove old container log entries from the logs table"
+ desc "deprecated / no-op"
task delete_old_container_logs: :environment do
- delete_sql = "DELETE FROM logs WHERE id in (SELECT logs.id FROM logs JOIN containers ON logs.object_uuid = containers.uuid WHERE event_type IN ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds')"
-
- ActiveRecord::Base.connection.execute(delete_sql)
+ Rails.logger.info "this db:delete_old_container_logs rake task is no longer used"
end
end
ApiClientAuthorization.create!(user: User.find_by_uuid(created['uuid']), api_client: ApiClient.all.first).api_token
end
+ # share project and collections with the new user
+ act_as_system_user do
+ Link.create!(tail_uuid: created['uuid'],
+ head_uuid: groups(:aproject).uuid,
+ link_class: 'permission',
+ name: 'can_manage')
+ Link.create!(tail_uuid: created['uuid'],
+ head_uuid: collections(:collection_owned_by_active).uuid,
+ link_class: 'permission',
+ name: 'can_read')
+ Link.create!(tail_uuid: created['uuid'],
+ head_uuid: collections(:collection_owned_by_active_with_file_stats).uuid,
+ link_class: 'permission',
+ name: 'can_write')
+ end
+
assert_equal 1, ApiClientAuthorization.where(user_id: User.find_by_uuid(created['uuid']).id).size, 'expected token not found'
post "/arvados/v1/users/#{created['uuid']}/unsetup", params: {}, headers: auth(:admin)
assert_not_nil created2['uuid'], 'expected uuid for the newly created user'
assert_equal created['uuid'], created2['uuid'], 'expected uuid not found'
assert_equal 0, ApiClientAuthorization.where(user_id: User.find_by_uuid(created['uuid']).id).size, 'token should have been deleted by user unsetup'
+ # check permissions are deleted
+ assert_empty Link.where(tail_uuid: created['uuid'])
verify_link_existence created['uuid'], created['email'], false, false, false, false, false
end
+++ /dev/null
-# Copyright (C) The Arvados Authors. All rights reserved.
-#
-# SPDX-License-Identifier: AGPL-3.0
-
-require 'test_helper'
-require 'rake'
-
-Rake.application.rake_require "tasks/delete_old_container_logs"
-Rake::Task.define_task(:environment)
-
-class DeleteOldContainerLogsTaskTest < ActiveSupport::TestCase
- TASK_NAME = "db:delete_old_container_logs"
-
- def log_uuids(*fixture_names)
- fixture_names.map { |name| logs(name).uuid }
- end
-
- def run_with_expiry(clean_after)
- Rails.configuration.Containers.Logging.MaxAge = clean_after
- Rake::Task[TASK_NAME].reenable
- Rake.application.invoke_task TASK_NAME
- end
-
- def check_log_existence(test_method, fixture_uuids)
- uuids_now = Log.where("object_uuid LIKE :pattern AND event_type in ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat')", pattern: "%-dz642-%").map(&:uuid)
- fixture_uuids.each do |expect_uuid|
- send(test_method, uuids_now, expect_uuid)
- end
- end
-
- test "delete all finished logs" do
- uuids_to_keep = log_uuids(:stderr_for_running_container,
- :crunchstat_for_running_container)
- uuids_to_clean = log_uuids(:stderr_for_previous_container,
- :crunchstat_for_previous_container,
- :stderr_for_ancient_container,
- :crunchstat_for_ancient_container)
- run_with_expiry(1)
- check_log_existence(:assert_includes, uuids_to_keep)
- check_log_existence(:refute_includes, uuids_to_clean)
- end
-
- test "delete old finished logs" do
- uuids_to_keep = log_uuids(:stderr_for_running_container,
- :crunchstat_for_running_container,
- :stderr_for_previous_container,
- :crunchstat_for_previous_container)
- uuids_to_clean = log_uuids(:stderr_for_ancient_container,
- :crunchstat_for_ancient_container)
- run_with_expiry(360.days)
- check_log_existence(:assert_includes, uuids_to_keep)
- check_log_existence(:refute_includes, uuids_to_clean)
- end
-end
s.files = ["bin/arvados-login-sync", "agpl-3.0.txt"]
s.executables << "arvados-login-sync"
s.required_ruby_version = '>= 2.1.0'
- s.add_runtime_dependency 'arvados', '>= 1.3.3.20190320201707'
+ # Note the letter 'a' at the end of the version dependency. This enables
+ # bundler's dependency resolver to include 'pre-release' versions, like the
+ # ones we build (but not publish) on every test pipeline job.
+ # See: https://github.com/rubygems/bundler/issues/4340
+ s.add_runtime_dependency 'arvados', '~> 2.4', '> 2.4.4a'
s.add_runtime_dependency 'launchy', '< 2.5'
# We need at least version 0.8.7.3, cf. https://dev.arvados.org/issues/15673
s.add_dependency('arvados-google-api-client', '>= 0.8.7.3', '< 0.8.9')
debug = true
end
arv = Arvados.new({ :suppress_ssl_warnings => false })
- logincluster_arv = Arvados.new({ :api_host => (ENV['LOGINCLUSTER_ARVADOS_API_HOST'] || ENV['ARVADOS_API_HOST']),
- :api_token => (ENV['LOGINCLUSTER_ARVADOS_API_TOKEN'] || ENV['ARVADOS_API_TOKEN']),
- :suppress_ssl_warnings => false })
+ logincluster_host = ENV['ARVADOS_API_HOST']
+ logincluster_name = arv.cluster_config['Login']['LoginCluster'] || ''
+
+ if logincluster_name != '' and logincluster_name != arv.cluster_config['ClusterID']
+ logincluster_host = arv.cluster_config['RemoteClusters'][logincluster_name]['Host']
+ end
+ logincluster_arv = Arvados.new({ :api_host => logincluster_host,
+ :suppress_ssl_warnings => false })
vm_uuid = ENV['ARVADOS_VIRTUAL_MACHINE_UUID']
userEnv = IO::read(tokenfile)
if (m = /^ARVADOS_API_TOKEN=(.*?\n)/m.match(userEnv))
begin
- tmp_arv = Arvados.new({ :api_host => (ENV['LOGINCLUSTER_ARVADOS_API_HOST'] || ENV['ARVADOS_API_HOST']),
- :api_token => (m[1]),
- :suppress_ssl_warnings => false })
+ tmp_arv = Arvados.new({ :api_host => logincluster_host,
+ :api_token => (m[1]),
+ :suppress_ssl_warnings => false })
tmp_arv.user.current
rescue Arvados::TransactionFailedError => e
if e.to_s =~ /401 Unauthorized/
- proxy_set_header: 'X-Real-IP $remote_addr'
- proxy_set_header: 'X-Forwarded-For $proxy_add_x_forwarded_for'
- proxy_set_header: 'X-External-Client $external_client'
+ - proxy_set_header: 'Upgrade $http_upgrade'
+ - proxy_set_header: 'Connection "upgrade"'
- proxy_max_temp_file_size: 0
- proxy_request_buffering: 'off'
- proxy_buffering: 'off'
- proxy_set_header: 'X-Real-IP $remote_addr'
- proxy_set_header: 'X-Forwarded-For $proxy_add_x_forwarded_for'
- proxy_set_header: 'X-External-Client $external_client'
+ - proxy_set_header: 'Upgrade $http_upgrade'
+ - proxy_set_header: 'Connection "upgrade"'
- proxy_max_temp_file_size: 0
- proxy_request_buffering: 'off'
- proxy_buffering: 'off'
- proxy_set_header: 'X-Real-IP $remote_addr'
- proxy_set_header: 'X-Forwarded-For $proxy_add_x_forwarded_for'
- proxy_set_header: 'X-External-Client $external_client'
+ - proxy_set_header: 'Upgrade $http_upgrade'
+ - proxy_set_header: 'Connection "upgrade"'
- proxy_max_temp_file_size: 0
- proxy_request_buffering: 'off'
- proxy_buffering: 'off'
collectionNameCache = {}
def getCollectionName(arv, uuid, pdh):
lookupField = uuid
- filters = [["uuid","=",uuid]]
+ filters = [["uuid", "=", uuid]]
cached = uuid in collectionNameCache
# look up by uuid if it is available, fall back to look up by pdh
- if len(uuid) != 27:
+ if uuid is None or len(uuid) != 27:
# Look up by pdh. Note that this can be misleading; the download could
# have happened from a collection with the same pdh but different name.
# We arbitrarily pick the oldest collection with the pdh to lookup the
# name, if the uuid for the request is not known.
lookupField = pdh
- filters = [["portable_data_hash","=",pdh]]
+ filters = [["portable_data_hash", "=", pdh]]
cached = pdh in collectionNameCache
if not cached:
- u = arv.collections().list(filters=filters,order="created_at",limit=1).execute().get("items")
+ u = arv.collections().list(filters=filters, order="created_at", limit=1).execute().get("items")
if len(u) < 1:
return "(deleted)"
collectionNameCache[lookupField] = u[0]["name"]
users[owner].append([loguuid, event_at, "Deleted collection %s" % (getname(e["properties"]["old_attributes"]))])
elif e["event_type"] == "file_download":
- users.setdefault(e["object_uuid"], [])
- users[e["object_uuid"]].append([loguuid, event_at, "Downloaded file \"%s\" from \"%s\" (%s) (%s)" % (
- e["properties"].get("collection_file_path") or e["properties"].get("reqPath"),
- getCollectionName(arv, e["properties"].get("collection_uuid"), e["properties"].get("portable_data_hash")),
- e["properties"].get("collection_uuid"),
- e["properties"].get("portable_data_hash"))])
-
+ users.setdefault(e["object_uuid"], [])
+ users[e["object_uuid"]].append([loguuid, event_at, "Downloaded file \"%s\" from \"%s\" (%s) (%s)" % (
+ e["properties"].get("collection_file_path") or e["properties"].get("reqPath"),
+ getCollectionName(arv, e["properties"].get("collection_uuid"), e["properties"].get("portable_data_hash")),
+ e["properties"].get("collection_uuid"),
+ e["properties"].get("portable_data_hash"))])
elif e["event_type"] == "file_upload":
- users.setdefault(e["object_uuid"], [])
- users[e["object_uuid"]].append([loguuid, event_at, "Uploaded file \"%s\" to \"%s\" (%s)" % (
- e["properties"].get("collection_file_path") or e["properties"].get("reqPath"),
- getCollectionName(arv, e["properties"].get("collection_uuid"), e["properties"].get("portable_data_hash")),
- e["properties"].get("collection_uuid"))])
+ users.setdefault(e["object_uuid"], [])
+ users[e["object_uuid"]].append([loguuid, event_at, "Uploaded file \"%s\" to \"%s\" (%s)" % (
+ e["properties"].get("collection_file_path") or e["properties"].get("reqPath"),
+ getCollectionName(arv, e["properties"].get("collection_uuid"), e["properties"].get("portable_data_hash")),
+ e["properties"].get("collection_uuid"))])
else:
users[owner].append([loguuid, event_at, "%s %s %s" % (e["event_type"], e["object_kind"], e["object_uuid"])])