Longleaf is a new compute cluster designed for memory- and I/O-intensive, loosely coupled workloads, with an emphasis on aggregate job throughput over individual job performance. In particular, workloads consisting of a large number of jobs, each requiring a single compute host, are best suited to Longleaf.
Pine is the purpose-built filesystem for high-throughput, data-intensive computing and information processing. Pine is presented to Longleaf natively.
- 120 “General-Purpose” nodes (24 cores each; 256 GB RAM; 2x10 Gbps NIC)
- 24 “Big-Data” nodes (12 cores each; 256 GB RAM; 2x10 Gbps; 2x40 Gbps)
- 5 large-memory nodes (3 TB RAM each)
- 5 “GPU” nodes, each with GeForce GTX 1080 cards (102,400 CUDA cores in total)
The nodes include local SSDs used as a GPFS Local Read-Only Cache (“LRoC”), which serves the most frequent metadata and data/file requests from the node itself, eliminating traversals of the network fabric and disk subsystem. Both General-Purpose and Big-Data nodes have 68 GB/s of memory bandwidth. General-Purpose nodes have 10.67 GB of memory and 53.34 MB/s of network bandwidth per core; Big-Data nodes have 21.34 GB of memory and 213.34 MB/s of network bandwidth per core.
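As a sanity check, the per-core figures above follow directly from the node specs. The short Python sketch below rederives them; the only assumption added here is the unit conversion 1 Gbps = 128 MB/s (i.e. 1024/8), which is what reproduces the quoted numbers:

```python
# Rederive the per-core memory and network-bandwidth figures from the
# node specs listed above. Assumes 1 Gbps = 1024/8 = 128 MB/s.

def per_core(total, cores):
    """Evenly divide a per-node total across its cores, rounded to 2 dp."""
    return round(total / cores, 2)

# General-Purpose nodes: 24 cores, 256 GB RAM, one 10 Gbps link counted.
gp_mem = per_core(256, 24)        # GB of memory per core
gp_net = per_core(10 * 128, 24)   # MB/s of network bandwidth per core

# Big-Data nodes: 12 cores, 256 GB RAM, both 10 Gbps links counted.
bd_mem = per_core(256, 12)        # GB of memory per core
bd_net = per_core(20 * 128, 12)   # MB/s of network bandwidth per core

# The page quotes 53.34, 21.34, and 213.34; this arithmetic gives
# 53.33, 21.33, and 213.33 -- the difference is rounding.
print(gp_mem, gp_net, bd_mem, bd_net)
```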
- Connected to Longleaf compute nodes by zero-hop 40 Gbps connections
- 14 controllers (for throughput and fault tolerance)
- High-performance parallel filesystem (GPFS)
- Tiered: approx. 210 TB of SSD and approx. 2 PB of SAS disk
Longleaf uses the SLURM resource-management and batch-scheduling system. Longleaf’s total conventional compute core count is 6,496 cores (note: this count reflects that hyperthreading is enabled).
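Since Longleaf favors large numbers of single-host jobs, a typical submission is a one-node SLURM batch script. The sketch below is illustrative only: the partition name, resource amounts, and program name are assumptions, not values taken from this page.

```shell
#!/bin/bash
# Minimal single-host SLURM batch script (hypothetical values throughout).
#SBATCH --job-name=myjob
#SBATCH --nodes=1              # single compute host, Longleaf's target shape
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4      # assumed core count for the job
#SBATCH --mem=32g              # assumed memory request, per node
#SBATCH --time=02:00:00        # assumed wall-clock limit
#SBATCH --partition=general    # assumed partition name

./my_program input.dat         # placeholder workload
```

The script would be submitted with `sbatch myjob.sl`; many independent submissions of this shape are exactly the throughput-oriented workload Longleaf is built for.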
Research Computing’s Isilon scale-out NAS space (often referred to as the `/proj` filesystem) is presented to Longleaf as well.