HiPerGator 3.0

HiPerGator 3.0 – CPUs

HiPerGator 3.0 has the following configuration:

  • 240 AMD EPYC Rome machines
    • 2 AMD EPYC 7702 64-core processors running at 2.0GHz
    • 1024GB of RAM
    • 30,720 total cores
  • 150 AMD EPYC Milan machines
    • 2 AMD EPYC 75F3 32-core processors running at 2.95GHz
    • 512GB of RAM
    • 19,200 total cores

The system also has a number of machines with 4TB of memory for large jobs.
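For planning job memory requests, the per-core memory of each node type is often more useful than the per-node totals. A quick back-of-the-envelope sketch, using only the figures listed above:

```python
# Per-core memory for each CPU node type, from the specs above.
rome_ram_gb, rome_cores = 1024, 2 * 64    # EPYC 7702 (Rome) nodes
milan_ram_gb, milan_cores = 512, 2 * 32   # EPYC 75F3 (Milan) nodes

print(rome_ram_gb / rome_cores)    # GB of RAM per core on Rome nodes -> 8.0
print(milan_ram_gb / milan_cores)  # GB of RAM per core on Milan nodes -> 8.0
```

Both node types work out to the same 8GB of RAM per core, so a job requesting more than that per core will leave some cores on its nodes idle.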

HiPerGator 3.0 – GPUs

As of the end of 2019, HiPerGator has 608 new GPUs in production. They are installed in nodes from Exxact, each with 32 Intel Xeon Gold 6242 CPU cores running at 2.80 GHz and 192GB of RAM.

  • 560 GPUs are NVIDIA GeForce RTX 2080ti cards.
  • 48 GPUs are NVIDIA Quadro RTX 6000 cards configured in pairs with NVLink adapters.
    • These paired GPUs can be used by applications that run on two GPU cards and rapidly exchange data directly between the cards over NVLink, without first staging the data in main system RAM.

See the GPU Access page for full details of using these new GPUs.

HiPerGator 3.0 – Blue Storage

The replacement for “blue” storage on HiPerGator, currently known as the /ufrc filesystem, arrived in January and is being configured and tested for stability, functionality, and performance. This storage system from DataDirect Networks (DDN) will form the new /blue filesystem for HiPerGator as part of the HiPerGator 3.0 upgrade.

  • Size: 4 PB usable
  • File system: Lustre 2.12
  • Hardware: DDN SFA18K storage system with two SFA200NV NVMe storage subsystems
    • 4 metadata servers (MDS) with flash/NVMe storage of 71.89 TB usable across 4 metadata targets (MDT)
    • 8 object storage servers (OSS)
    • 32 object storage targets (OST)
    • 400 spinning disks providing 3672.65 TB usable
    • 36 flash/NVMe drives providing 392.64 TB usable
    • InfiniBand EDR adapters at 100 Gbit/sec
  • Features:
    • Data on metadata: Lustre is a parallel filesystem designed to store and rapidly access very large files, but much of the work on HiPerGator involves very large numbers of small files. The MDS stores the catalog of files; with data on metadata enabled, it also stores the first 64 KiB of each file, which for many files is the entire file. This greatly improves performance when handling small files.
    • Progressive file layout: Files in Lustre can be distributed across multiple disks for maximal performance, but doing this manually, or in your code, requires extra effort and some expertise. This version of Lustre provides Progressive File Layout (PFL) to improve performance automatically. On the new Blue storage, PFL works as follows: a file is created on the MDT, and its first 64 KiB reside there. As the file grows, the next 1 MiB of data is written to flash OSTs, and any additional data is written to spinning-disk OSTs with striping automatically engaged. Smaller files are therefore quick to access and perform very well, since they reside on flash media, while very large files are striped across all available OSSs for the maximal performance the storage system can deliver.
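The tiering described above can be sketched as a small function mapping a byte offset within a file to the media it lands on, under the layout stated here (64 KiB on the MDT, the next 1 MiB on flash OSTs, the rest striped across spinning-disk OSTs). This is an illustration of the policy, not output from any Lustre tool:

```python
# Illustrative sketch of the Progressive File Layout policy described
# above: which storage tier holds a given byte offset of a file.
KIB, MIB = 1024, 1024 * 1024

MDT_END = 64 * KIB             # first 64 KiB live on the metadata target
FLASH_END = MDT_END + 1 * MIB  # next 1 MiB lives on flash OSTs

def tier_for_offset(offset: int) -> str:
    """Return the tier storing the byte at `offset` under this PFL."""
    if offset < MDT_END:
        return "MDT (flash/NVMe, data-on-metadata)"
    if offset < FLASH_END:
        return "flash OST"
    return "spinning-disk OST (striped)"

# A 50 KiB file fits entirely in the data-on-metadata region:
print(tier_for_offset(50 * KIB))        # MDT (flash/NVMe, data-on-metadata)
# A multi-gigabyte file spills onto striped spinning disk:
print(tier_for_offset(2 * 1024 * MIB))  # spinning-disk OST (striped)
```

The net effect is that small files never leave flash media, while only the tail of large files pays the cost, and gains the bandwidth, of wide striping on spinning disk.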
    • Distributed Namespace: This feature allows metadata targets to be scaled horizontally within a single namespace, and allows directories to be striped across them. The new Blue system has four MDS servers and four MDTs, where the existing /ufrc system has only one of each. Each server and target in the new system is also dramatically more powerful than its /ufrc counterpart: more processors and RAM in the servers, and NVMe flash storage instead of spinning disk.
    • Directory Quota: This feature allows assignment of per-directory quotas. The existing /ufrc system assigns filesystem-wide group quotas, which can cause data- and quota-management problems for sponsors whose users belong to more than one group. The new Blue system will use directory quotas instead.
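The difference between the two quota schemes can be shown with a toy accounting model (the paths, group names, and sizes below are hypothetical, not real quota-tool output): a filesystem-wide group quota charges usage by each file's owning group wherever the file lives, while a directory quota charges by where the file is stored.

```python
# Toy model contrasting the two quota schemes described above.
# Paths, group names, and sizes are hypothetical.
from collections import defaultdict

# (path, owning group, size in GB) for a user who belongs to two groups
files = [
    ("/blue/lab_a/data1", "lab_a", 300),
    ("/blue/lab_a/data2", "lab_b", 200),  # written with lab_b as active group
    ("/blue/lab_b/data3", "lab_b", 100),
]

group_usage = defaultdict(int)  # filesystem-wide group quota (old /ufrc)
dir_usage = defaultdict(int)    # per-directory quota (new /blue)
for path, group, size in files:
    group_usage[group] += size
    top_dir = "/".join(path.split("/")[:3])  # e.g. /blue/lab_a
    dir_usage[top_dir] += size

# Group accounting charges lab_b for a file sitting in lab_a's space:
print(dict(group_usage))  # {'lab_a': 300, 'lab_b': 300}
# Directory accounting charges each sponsor for what is in their own tree:
print(dict(dir_usage))    # {'/blue/lab_a': 500, '/blue/lab_b': 100}
```

Under group accounting, the misplaced 200GB file counts against lab_b even though it occupies lab_a's space; directory accounting attributes usage to the sponsor whose tree actually holds the data.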