SeaWulf Shared Queues

Shared queues allow multiple users to run jobs on the same node simultaneously. Proper resource management is crucial to avoid exceeding memory or CPU limits.
Note: All shared queues are accessed from the milan1/milan2 login nodes.

Shared Queues Overview

Shared queues are designed for users who do not need exclusive access to a node but still require compute resources. Because a node's memory is divided among the jobs running on it, you must explicitly request memory so your job does not exceed its share.

Available Shared Queues

CPU Nodes

Queue | CPU Architecture | Vector/Matrix Extension | CPU Cores per Node | Node Memory | Default Runtime | Max Runtime | Max Nodes
short-40core-shared | Intel Skylake | AVX512 | 40 | 192 GB | 1 hr | 4 hrs | 4
long-40core-shared | Intel Skylake | AVX512 | 40 | 192 GB | 8 hrs | 24 hrs | 3
extended-40core-shared | Intel Skylake | AVX512 | 40 | 192 GB | 8 hrs | 3.5 days | 1
short-96core-shared | AMD EPYC Milan | AVX2 | 96 | 256 GB | 1 hr | 4 hrs | 4
long-96core-shared | AMD EPYC Milan | AVX2 | 96 | 256 GB | 8 hrs | 24 hrs | 3
extended-96core-shared | AMD EPYC Milan | AVX2 | 96 | 256 GB | 8 hrs | 3.5 days | 1

GPU Nodes

Queue | CPU Architecture | Vector/Matrix Extension | CPU Cores per Node | GPUs per Node | Node Memory | Default Runtime | Max Runtime | Max Nodes | Max Simultaneous Jobs per User
a100 | Intel Ice Lake | AVX512, DL Boost | 64 | 4 | 256 GB | 1 hr | 8 hrs | 2 | 2
a100-long | Intel Ice Lake | AVX512, DL Boost | 64 | 4 | 256 GB | 8 hrs | 2 days | 1 | 2
a100-large | Intel Ice Lake | AVX512, DL Boost | 64 | 4 | 256 GB | 1 hr | 8 hrs | 4 | 1
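On the shared GPU queues, GPUs are requested with the standard Slurm `--gres` flag. The sketch below assumes a single-GPU job on the a100 queue; the Python script name is a placeholder, and the memory value is simply a rough one-quarter share of the 256 GB node to match the one GPU requested.

```shell
#!/usr/bin/env bash
#SBATCH -p a100                       # shared A100 GPU queue
#SBATCH --ntasks=16                   # a share of the node's 64 cores
#SBATCH --gres=gpu:1                  # request 1 of the node's 4 A100 GPUs
#SBATCH --mem=60G                     # roughly a one-GPU share of the 256 GB node
#SBATCH --time=08:00:00               # the a100 queue's maximum runtime

# placeholder GPU application
python my_training_script.py
```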

Instructions for Using Shared Queues

  • Request memory explicitly: Use #SBATCH --mem=[amount] (for example, --mem=20G) so your job does not exceed its share of a shared node's memory.
  • Check resource limits: Know the node's total memory and CPU cores (see the tables above) and size your request accordingly.
  • Monitor usage: Use squeue -u $USER to check your jobs, and top or htop to watch memory/CPU usage.
  • Benefit: Shared queues can have shorter wait times than exclusive queues, since the scheduler only needs to find a fraction of a node; request only the resources you actually need.
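One simple rule of thumb for sizing a memory request (a sketch, not an official policy) is to ask for a memory share proportional to your core share. For a 96-core, 256 GB Milan shared node:

```shell
#!/usr/bin/env bash
# Rule of thumb: request a memory share proportional to your core share.
# The values below describe a 96-core / 256 GB Milan shared node.
cores_per_node=96
node_mem_gb=256
ntasks=8   # cores your job will request

# Integer division gives a conservative (rounded-down) request.
mem_gb=$(( node_mem_gb * ntasks / cores_per_node ))
echo "--mem=${mem_gb}G"
```

Here 8 of 96 cores corresponds to about 21 GB, so `#SBATCH --mem=21G` would leave the rest of the node's memory available to other users' jobs.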