-
-Clusters:
- zzzzz:
- Containers:
+ Containers:
SLURM:
SbatchArgumentsList:
- "--partition=PartitionName"
@@ -110,27 +105,24 @@ Clusters:
Note: If an argument is supplied multiple times, @slurm@ uses the value of the last occurrence of the argument on the command line. Arguments specified through Arvados are added after the arguments listed in SbatchArguments. This means, for example, that an Arvados container that specifies @partitions@ in @scheduling_parameters@ will override an occurrence of @--partition@ in SbatchArguments. As a result, for container parameters that can be specified through Arvados, SbatchArguments can be used to specify defaults but not to enforce specific policy.
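The "last occurrence wins" behavior described in the note can be modeled with a short sketch. This is only an illustration of the ordering rule, not how @slurm@ or crunch-dispatch-slurm parse arguments; the function name @effective_sbatch_options@ and the partition name @gpu@ are hypothetical.

```python
def effective_sbatch_options(configured_args, container_args):
    """Model which value wins for each --option=value argument.

    Assumption: arguments derived from the container are appended after
    the configured ones, and the last occurrence of an option wins.
    """
    effective = {}
    for arg in list(configured_args) + list(container_args):
        # Split "--partition=foo" into ("--partition", "=", "foo").
        name, _, value = arg.partition("=")
        effective[name] = value  # a later occurrence overwrites an earlier one
    return effective


# Configured default from SbatchArgumentsList.
configured = ["--partition=PartitionName"]
# Hypothetical argument produced from a container's scheduling parameters.
from_container = ["--partition=gpu"]

print(effective_sbatch_options(configured, from_container))
print(effective_sbatch_options(configured, []))
```

In this model the configured partition applies only when the container does not name one, which is why the configured list behaves as a set of defaults rather than as enforced policy.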
-h3(#CrunchRunCommand-cgroups). Containers.CrunchRunArgumentList: Dispatch to SLURM cgroups
+h3(#CrunchRunCommand-cgroups). Containers.CrunchRunArgumentList: Dispatch to Slurm cgroups
-If your SLURM cluster uses the @task/cgroup@ TaskPlugin, you can configure Crunch's Docker containers to be dispatched inside SLURM's cgroups. This provides consistent enforcement of resource constraints. To do this, use a crunch-dispatch-slurm configuration like the following:
+If your Slurm cluster uses the @task/cgroup@ TaskPlugin, you can configure Crunch's Docker containers to be dispatched inside Slurm's cgroups. This provides consistent enforcement of resource constraints. To do this, use a crunch-dispatch-slurm configuration like the following:
-
-Clusters:
- zzzzz:
- Containers:
+ Containers:
CrunchRunArgumentsList:
- "-cgroup-parent-subsystem=memory"
-The choice of subsystem ("memory" in this example) must correspond to one of the resource types enabled in SLURM's @cgroup.conf@. Limits for other resource types will also be respected. The specified subsystem is singled out only to let Crunch determine the name of the cgroup provided by SLURM. When doing this, you should also set "ReserveExtraRAM":#ReserveExtraRAM .
+The choice of subsystem ("memory" in this example) must correspond to one of the resource types enabled in Slurm's @cgroup.conf@. Limits for other resource types will also be respected. The specified subsystem is singled out only to let Crunch determine the name of the cgroup provided by Slurm. When doing this, you should also set "ReserveExtraRAM":#ReserveExtraRAM .
{% include 'notebox_begin' %}
-Some versions of Docker (at least 1.9), when run under systemd, require the cgroup parent to be specified as a systemd slice. This causes an error when specifying a cgroup parent created outside systemd, such as those created by SLURM.
+Some versions of Docker (at least 1.9), when run under systemd, require the cgroup parent to be specified as a systemd slice. This causes an error when specifying a cgroup parent created outside systemd, such as those created by Slurm.
-You can work around this issue by disabling the Docker daemon's systemd integration. This makes it more difficult to manage Docker services with systemd, but Crunch does not require that functionality, and it will be able to use SLURM's cgroups as container parents. To do this, "configure the Docker daemon on all compute nodes":install-compute-node.html#configure_docker_daemon to run with the option @--exec-opt native.cgroupdriver=cgroupfs@.
+You can work around this issue by disabling the Docker daemon's systemd integration. This makes it more difficult to manage Docker services with systemd, but Crunch does not require that functionality, and it will be able to use Slurm's cgroups as container parents. To do this, "configure the Docker daemon on all compute nodes":install-compute-node.html#configure_docker_daemon to run with the option @--exec-opt native.cgroupdriver=cgroupfs@.
{% include 'notebox_end' %}
@@ -139,53 +131,17 @@ h3(#CrunchRunCommand-network). Containers.CrunchRunArgumentList: Using host netw
Older Linux kernels (prior to 3.18) have bugs in network namespace handling which can lead to compute node lockups. This is indicated by blocked kernel tasks in "Workqueue: netns cleanup_net". If you are experiencing this problem, as a workaround you can disable use of network namespaces by Docker across the cluster. Be aware that this reduces container isolation, which may be a security risk.
-
-Clusters:
- zzzzz:
- Containers:
+ Containers:
CrunchRunArgumentsList:
- "-container-enable-networking=always"
- "-container-network-mode=host"
-h3(#MinRetryPeriod). Containers.MinRetryPeriod: Rate-limit repeated attempts to start containers
-
-If SLURM is unable to run a container, the dispatcher will submit it again after the next PollPeriod. If PollPeriod is very short, this can be excessive. If MinRetryPeriod is set, the dispatcher will avoid submitting the same container to SLURM more than once in the given time span.
-
-
-
-Clusters:
- zzzzz:
- Containers:
- MinRetryPeriod: 30s
-
-
-
-h3(#ReserveExtraRAM). Containers.ReserveExtraRAM: Extra RAM for jobs
-
-Extra RAM to reserve (in bytes) on each SLURM job submitted by Arvados, which is added to the amount specified in the container's @runtime_constraints@. If not provided, the default value is zero. Helpful when using @-cgroup-parent-subsystem@, where @crunch-run@ and @arv-mount@ share the control group memory limit with the user process. In this situation, at least 256MiB is recommended to accommodate each container's @crunch-run@ and @arv-mount@ processes.
-
-
-
-Clusters:
- zzzzz:
- Containers:
- ReserveExtraRAM: 256MiB
-
-
-
-h2. Restart the dispatcher
-
-{% include 'notebox_begin' %}
-
-The crunch-dispatch-slurm package includes configuration files for systemd. If you're using a different init system, you'll need to configure a service to start and stop a @crunch-dispatch-slurm@ process as desired. The process should run from a directory where the @crunch@ user has write permission on all compute nodes, such as its home directory or @/tmp@. You do not need to specify any additional switches or environment variables.
+{% assign arvados_component = 'crunch-dispatch-slurm' %}
-{% include 'notebox_end' %}
+{% include 'install_packages' %}
-Restart the dispatcher to run with your new configuration:
+{% include 'start_service' %}
-
-~$ sudo systemctl restart crunch-dispatch-slurm
-
-
+{% include 'restart_api' %}