Tasks are skipped, nodes not fully utilized #2

@rbberger

In a 12-node job with 28 cores per node, launching 35,000 tasks shows some odd behavior: after task 602, the task id jumps straight to 1334.

{"job_id": 5697, "event": "task_start","task_id": 602, "slot_id": 136}
{"job_id": 5697, "event": "task_start","task_id": 1334, "slot_id": 177}

The subsequent tasks start in sequence again, but the 730 +/- 1 tasks in between (ids 603 through 1333) remain missing.
In addition, not all slots are currently in use, even though thousands of tasks are still waiting to run.
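
To pin down exactly which ids were skipped, the event log can be scanned for gaps. This is only a minimal sketch, assuming the log holds one JSON object per line in the format shown above; `find_gaps` and `used_slots` are hypothetical helper names, not part of the tool.

```python
import json


def find_gaps(lines):
    """Return (first_missing, last_missing) id ranges between started tasks."""
    # Collect the ids of all tasks that actually emitted a task_start event.
    ids = sorted({
        rec["task_id"]
        for rec in map(json.loads, lines)
        if rec.get("event") == "task_start"
    })
    # Any jump larger than 1 between neighbouring ids is a skipped range.
    return [(a + 1, b - 1) for a, b in zip(ids, ids[1:]) if b - a > 1]


def used_slots(lines):
    """Return the set of slot ids that started at least one task."""
    return {
        rec["slot_id"]
        for rec in map(json.loads, lines)
        if rec.get("event") == "task_start"
    }


# The two log lines from this report, used here as sample input.
log = [
    '{"job_id": 5697, "event": "task_start","task_id": 602, "slot_id": 136}',
    '{"job_id": 5697, "event": "task_start","task_id": 1334, "slot_id": 177}',
]

gaps = find_gaps(log)
print(gaps)                               # [(603, 1333)]
print(sum(b - a + 1 for a, b in gaps))    # 731
print(sorted(used_slots(log)))            # [136, 177]
```

Run against the full log, the size of `used_slots` compared with 12 x 28 = 336 would show how many slots never received a task, assuming one slot per core.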

[Screenshot from 2017-10-19 14-14-13]
