High Performance Computing at Stony Brook University

SeaWulf

SeaWulf, the main computational cluster at Stony Brook University, is available for use by all faculty and students. Its name is a portmanteau of "Seawolf" and "Beowulf," the name of one of the first high performance computing clusters using commodity equipment. This cluster includes 405 nodes and 23,372 cores, with a peak performance of ~1.8 PFLOPS available for research computation. Available to all nodes is a 4 Petabyte shared storage array running the GPFS file system. Recent upgrades include: 

2023: “SeaWulf II” with support from NSF Major Research Instrumentation award 2215987, matching funds from the New York State Department of Economic Development contract C210148, plus additional funding from SBU’s President, Provost, Vice President for Research, CIO, Professor Dilip Gersappe, and the Deans of CAS, CEAS, and SoMAS. This latest addition, a set of servers from HPE, is composed of 94 compute nodes with a total of 9024 cores, offering 606 TFLOPS. Each node uses dual Intel Xeon Max 9468 processors, with 96 cores across both sockets running at a base clock speed of 2.1 GHz, and 256 GB of DDR5-4800 RAM. What is notable about these nodes is the presence of 128 GB of HBM2e memory, offering 2 TB/s of aggregate memory bandwidth. These servers are interconnected via an NDR InfiniBand fabric at a cutting-edge 400 Gbps. The login node uses dual Intel Xeon Platinum 8468 processors, with a total of 96 cores running at 2.1 GHz and 512 GB of DDR5-4800 memory.
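As a quick sanity check of the figures quoted above, the per-node numbers follow directly from the aggregates. The short Python sketch below uses only the values stated in this paragraph; the resulting bytes-per-FLOP ratio is an illustrative estimate of the machine balance, not a benchmarked figure.

# Per-node figures for the 2023 "SeaWulf II" addition, derived from the
# aggregate numbers quoted above (illustrative estimate only).
nodes = 94
total_cores = 9024
peak_tflops = 606.0          # quoted aggregate peak for the addition
hbm_bw_tb_per_s = 2.0        # quoted per-node aggregate HBM2e bandwidth

cores_per_node = total_cores / nodes      # 96 cores (dual Xeon Max 9468)
tflops_per_node = peak_tflops / nodes     # ~6.4 TFLOPS per node
bytes_per_flop = (hbm_bw_tb_per_s * 1e12) / (tflops_per_node * 1e12)

print(f"{cores_per_node:.0f} cores/node, ~{tflops_per_node:.1f} TFLOPS/node, "
      f"~{bytes_per_flop:.2f} bytes/FLOP of HBM bandwidth")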

2022: RCI expanded the SeaWulf cluster thanks to generous funding by Professors Dilip Gersappe, Benjamin Levine, and Robert Rizzo. The first project in this expansion was 48 nodes, totaling 4608 cores with a peak performance of 339 TFLOPS. Each of these compute nodes has dual AMD EPYC 7643 CPUs with a total of 96 cores at a base clock speed of 2.3 GHz, 256 GB of DDR4-3200 RAM, and an HDR100 InfiniBand adapter providing 100 Gbps of connectivity. There are also 2 login nodes providing high-availability access, configured identically to the compute nodes but with 512 GB of RAM each instead of 256 GB. The second project in this expansion was a GPU-focused set of servers, with eleven nodes and an aggregate of 352 cores and 44 GPUs providing 449 TFLOPS of performance. These GPU-enabled nodes each include 4x Nvidia A100 GPUs with 80 GB of video memory alongside two Intel Xeon Gold 6338 processors (64 cores per node) running at a base speed of 2.3 GHz, with 256 GB of DDR4-3200 RAM and an HDR100 InfiniBand adapter. For memory-intensive tasks, the system offers an HPE server containing 3 TB of DDR4-2933 RAM and 4 Intel Xeon Platinum 8360H processors running at 3 GHz, for a total of 96 cores.
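The aggregate counts for this expansion can be reconstructed from the per-node specifications quoted above, as in the sketch below. The ~10 TFLOPS-per-GPU figure is simply the quoted aggregate divided by the GPU count, not a measured or per-device specification.

# Aggregate counts for the 2022 expansion, reconstructed from the
# per-node specifications quoted above.
cpu_nodes, cores_per_cpu_node = 48, 96     # dual 48-core AMD EPYC 7643 per node
gpu_nodes, gpus_per_node = 11, 4           # 4x Nvidia A100 80GB per GPU node
gpu_partition_tflops = 449.0               # quoted aggregate for the GPU nodes

print(cpu_nodes * cores_per_cpu_node)      # 4608 CPU cores in the first project
print(gpu_nodes * gpus_per_node)           # 44 A100 GPUs in the second project
print(round(gpu_partition_tflops / (gpu_nodes * gpus_per_node), 1))  # ~10.2 TFLOPS per GPU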

2019: SeaWulf was expanded by 64 compute nodes, each with dual Intel Xeon Gold 6148 CPUs (40 cores per node) operating at a base speed of 2.4 GHz and 256 GB of RAM. This expansion, in aggregate, offers up to 111 TFLOPS and 2,560 cores. In addition, there is a large-memory node with 3 TB of DDR4 RAM and Intel Xeon E7-8870 v3 processors operating at 2.1 GHz, for a total of 72 cores and 144 threads (via Hyper-Threading), as well as an Nvidia V100 16 GB GPU.

2016: SeaWulf was renewed with support from NSF Major Research Instrumentation award 1531492, matching funds from the New York State Department of Economic Development contract C210148, plus additional funding from SBU’s President, Provost, Vice President for Research, and CIO. It is composed of 164 compute nodes from Penguin Computing, each with two Intel Xeon E5-2683 v3 CPUs, connected by an FDR InfiniBand network. These CPUs, codenamed “Haswell,” offer 14 cores each and operate at a base speed of 2.0 GHz. Eight of these compute nodes contain GPUs, with a total of 28 Nvidia Tesla K80 24 GB accelerators offering 64 GK210 (K40) GPUs and 159,744 CUDA cores. One node with 2 Tesla P100 GPUs and one node with 2 Tesla V100 GPUs are also accessible.

Ookami

Ookami is one of the first computers outside of Japan to be powered by the HPE Apollo 80 system, originally developed by Cray and Fujitsu, which uses the Fujitsu A64FX processor, the same processor technology found in one of the fastest and most power-efficient supercomputers in the world, Fugaku, at the RIKEN Center for Computational Science in Japan. The ARM-based processor includes multiple innovations, integrated with very fast, low-latency memory, that together make it easier for science and engineering applications to reach both high performance and high power efficiency, making it a “greener” technology. Ookami is made up of 176 Fujitsu A64FX nodes, 2 Nvidia Grace Superchip nodes, 1 AmpereOne-X node, 2 ThunderX2 nodes, 1 Intel Skylake node, and 1 AMD Milan node, for a total of 8,964 cores, coupled with a petabyte of Lustre shared storage.
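Most of that core count comes from the A64FX partition. Assuming the standard 48 compute cores per A64FX processor (a figure from the processor's public specification, not stated on this page), the arithmetic in the sketch below shows how the A64FX nodes relate to the quoted total; the remainder is simply what is left over for the other nodes, not a per-node breakdown.

# Rough breakdown of Ookami's quoted 8,964 cores, assuming 48 compute cores
# per Fujitsu A64FX node (assumption from the processor's public spec).
a64fx_nodes = 176
a64fx_cores_per_node = 48      # assumption; not stated on this page
total_cores = 8964             # quoted total for the whole machine

a64fx_cores = a64fx_nodes * a64fx_cores_per_node   # 8448 cores in the A64FX partition
remaining_cores = total_cores - a64fx_cores        # 516 cores not accounted for by A64FX
print(a64fx_cores, remaining_cores)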