13399: Don't spam logs about low priority jobs that are running.
author Tom Clegg <tclegg@veritasgenetics.com>
Mon, 6 Aug 2018 20:07:06 +0000 (16:07 -0400)
committer Tom Clegg <tclegg@veritasgenetics.com>
Mon, 6 Aug 2018 20:07:06 +0000 (16:07 -0400)
Once a job has started, its priority relative to other jobs doesn't
matter to us anyway, and we already skip low-priority jobs in the
"renice" procedure, so there's no reason to warn about this.

Also lower the "mysterious low priority" reporting threshold from 1Mi
(1<<20) to 20K (2*slurm15NiceLimit) to match the other "nice/hold race"
code.

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <tclegg@veritasgenetics.com>

services/crunch-dispatch-slurm/squeue.go

index ccbe44487c4394a10dbd8c8cb97d0b06af5995e5..20305ab90abe91b150ae71a7749fd39c8e529548 100644
@@ -193,7 +193,7 @@ func (sqc *SqueueChecker) check() {
                        // resolved the same way.
                        log.Printf("releasing held job %q (priority=%d, state=%q, reason=%q)", uuid, p, state, reason)
                        sqc.Slurm.Release(uuid)
-               } else if p < 1<<20 && replacing.wantPriority > 0 {
+               } else if state != "RUNNING" && p <= 2*slurm15NiceLimit && replacing.wantPriority > 0 {
                        log.Printf("warning: job %q has low priority %d, nice %d, state %q, reason %q", uuid, p, n, state, reason)
                }
        }
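
The revised condition can be exercised in isolation. What follows is a minimal sketch, not the code from squeue.go: the helper name `shouldWarnLowPriority` is hypothetical, the `int64` types are an assumption, and the value 10000 for `slurm15NiceLimit` is inferred from the "20K" threshold in the commit message rather than read from the source file.

```go
package main

import "fmt"

// Assumed value: the commit message describes 2*slurm15NiceLimit as "20K".
const slurm15NiceLimit int64 = 10000

// shouldWarnLowPriority is a hypothetical helper that mirrors the new
// else-if condition: warn only when the job is not running, its slurm
// priority is at or below the threshold, and a positive priority is wanted.
func shouldWarnLowPriority(state string, p, wantPriority int64) bool {
	return state != "RUNNING" && p <= 2*slurm15NiceLimit && wantPriority > 0
}

func main() {
	// A running job never triggers the warning, however low its priority.
	fmt.Println(shouldWarnLowPriority("RUNNING", 5, 1))
	// A pending job at or below the threshold does.
	fmt.Println(shouldWarnLowPriority("PENDING", 5, 1))
	fmt.Println(shouldWarnLowPriority("PENDING", 2*slurm15NiceLimit, 1))
	// Just above the threshold, or with no wanted priority, it does not.
	fmt.Println(shouldWarnLowPriority("PENDING", 2*slurm15NiceLimit+1, 1))
	fmt.Println(shouldWarnLowPriority("PENDING", 5, 0))
}
```

Note that the old check used a strict `<` against `1<<20`, while the new one uses `<=` against `2*slurm15NiceLimit`, so a priority exactly equal to the threshold now counts as low.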