Longleaf and Pine

Longleaf is a new compute cluster designed for memory- and I/O-intensive, loosely coupled workloads, with an emphasis on aggregate job throughput over individual job performance. Workloads consisting of a large number of jobs, each requiring a single compute host, are particularly well suited to Longleaf.

Pine is a purpose-built filesystem for high-throughput, data-intensive computing and information processing. It is presented natively to Longleaf.

Longleaf includes:

  • 120 “General-Purpose” nodes (24 cores each; 256 GB RAM; 2×10 Gbps NICs)
  • 24 “Big-Data” nodes (12 cores each; 256 GB RAM; 2×10 Gbps and 2×40 Gbps NICs)
  • 5 large-memory nodes (3 TB RAM each)
  • 5 “GPU” nodes, each with GeForce GTX 1080 cards (102,400 CUDA cores in total)

The nodes include local SSDs for a GPFS Local Read-Only Cache (“LRoC”) that serves the most frequent metadata and file requests from the node itself, eliminating traversals of the network fabric and disk subsystem.  Both General-Purpose and Big-Data nodes have 68 GB/s of memory bandwidth.  General-Purpose nodes have 10.67 GB of memory and 53.33 MB/s of network bandwidth per core; Big-Data nodes have 21.33 GB of memory and 213.33 MB/s of network bandwidth per core.
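The per-core figures follow from dividing each node's totals by its core count. A quick sanity check of the arithmetic (this assumes 1 GB = 1024 MB, counts a single 10 Gbps link for the General-Purpose figure, and an aggregated 2×10 Gbps for the Big-Data figure; exactly which links are counted is an inference, not documented):

```shell
# Sanity-check the per-core memory and network figures quoted above.
awk 'BEGIN {
  printf "GP memory/core:  %.2f GB\n",   256 / 24                  # 256 GB over 24 cores
  printf "BD memory/core:  %.2f GB\n",   256 / 12                  # 256 GB over 12 cores
  printf "GP network/core: %.2f MB/s\n", (10 * 1024 / 8) / 24      # one 10 Gbps link
  printf "BD network/core: %.2f MB/s\n", (2 * 10 * 1024 / 8) / 12  # 2x10 Gbps aggregate
}'
```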

Pine includes:

  • Connected to Longleaf compute nodes by zero-hop 40 Gbps links
  • 14 controllers (for throughput and fault tolerance)
  • High-performance parallel filesystem (GPFS)
  • Tiered storage: approx. 210 TB of SSD and approx. 2 PB of SAS disk

Longleaf uses the SLURM resource-management and batch-scheduling system.  Longleaf’s total conventional compute core count is 6,496 (a count that reflects hyperthreading being enabled).
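Because Longleaf favors large numbers of independent single-node jobs, a natural way to submit such a workload is a SLURM job array. A minimal sketch follows; the partition name, memory request, and the `process_data` program are hypothetical placeholders, not documented values, so consult the cluster's own SLURM documentation before use:

```shell
#!/bin/bash
# Minimal SLURM job-array sketch: 100 independent single-host jobs,
# matching Longleaf's throughput-oriented design.
# NOTE: partition name "general" and the input-file naming are assumptions.
#SBATCH --job-name=throughput-demo
#SBATCH --partition=general
#SBATCH --nodes=1              # each array task runs on a single compute host
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=8g               # per-job memory; General-Purpose nodes have 256 GB
#SBATCH --time=01:00:00
#SBATCH --array=1-100          # 100 loosely coupled jobs scheduled independently

# Each array task processes its own input, e.g. input_1.dat ... input_100.dat
./process_data "input_${SLURM_ARRAY_TASK_ID}.dat"
```

Submitted with `sbatch`, the scheduler places the hundred tasks independently across available nodes, maximizing aggregate throughput rather than any single job's speed.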

Research Computing’s Isilon scale-out NAS space (often referred to as the `/proj` filesystem) is also presented to Longleaf.