festus
The cluster “festus” (btrzx24) went into operation in January 2025. It consists of two management nodes, one virtualization server, two login nodes, several storage servers, and 74 compute nodes, which are connected by a 100G InfiniBand interconnect for inter-process communication and a 25G Ethernet service network. “festus” uses Slurm (24.11) as resource manager. The ITS file server (e.g., the ITS home directory) is not mounted on the cluster for performance reasons; every user has a separate home directory (10GB) that resides on the cluster's own NFS server.
Acknowledging festus / Publications
As with other DFG-funded projects, results must be made available to the general public in an appropriate manner. Publications must contain a reference to the DFG funding (a so-called “Funding Acknowledgement”) in the language of the publication, stating the project number.
Whenever festus has been used to produce results that appear in a publication or poster, we kindly request citing the service in the acknowledgements:
Calculations were performed using the festus-cluster of the
Bayreuth Centre for High Performance Computing (https://www.bzhpc.uni-bayreuth.de),
funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 523317330.
Login
The login nodes of festus are accessible with ssh via festus.hpc.uni-bayreuth.de, and only from university networks. If you are outside the university, a VPN connection is required. If your login shell is (t)csh or ksh, you have to change it to bash or zsh in the ITS self-service portal.
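For interactive use, simply connect with ssh to festus.hpc.uni-bayreuth.de. If you need scripted access from Python, a minimal sketch using Paramiko could look as follows (assumptions: Paramiko is installed locally and you authenticate with your usual SSH key; the username bt123456 is a placeholder):

    # festus_login.py - open an SSH connection to a festus login node (sketch only)
    import paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()  # trust the host keys already known to your system
    client.connect("festus.hpc.uni-bayreuth.de", username="bt123456")

    # Run a harmless command on the login node, e.g. a short partition overview
    stdin, stdout, stderr = client.exec_command("sinfo --summarize")
    print(stdout.read().decode())
    client.close()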
Compute nodes
62 compute servers (“typA”)
- 2x AMD EPYC 9554 64c CPU (max. 3.75GHz, 128 cores total)
- 24x 16GB RAM (384GB total)
- 480 GB NVMe
5 compute servers (“typB”)
- 2x AMD EPYC 9684X 96c CPU (max. 3.42GHz, 192 cores total)
- 24x 64GB RAM (1536GB total)
- 480 GB NVMe
1 compute server (“typC”)
- 2x INTEL® XEON® Platinum 8480+ 56c CPU (max. 3.8GHz, 112 cores total)
- 16x 128GB RAM (2048GB total)
- 480 GB + 14TB NVMe
- 4x NVIDIA H100
1 compute server (“typD”)
- 2x INTEL® XEON® Platinum 8480+ 56c CPU (max. 3.8GHz, 112 cores total)
- 16x 128GB RAM (2048GB total)
- 480 GB + 14TB NVMe
- 4x AMD MI210
3 compute servers (“typE”)
- 2x AMD EPYC 9554 64c CPU (max. 3.75GHz, 128 cores total)
- 24x 16GB RAM (384GB total)
- ~3.84TB NVMe
- 2x NVIDIA L40
2 compute servers (“typF”)
- 2x AMD EPYC 9554 64c CPU (max. 3.75GHz, 128 cores total)
- 24x 16GB RAM (384GB total)
- ~3.84TB NVMe
- 2x AMD MI210
Partitions
Priorities are calculated with Slurm’s Multifactor Priority Plugin; the group’s/account’s financial share in the cluster and the resources it has already consumed carry the highest weights.
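Schematically (the concrete weights are site-specific and additional terms such as TRES weights exist; this is only a sketch of the documented Slurm formula), a job's priority is a weighted sum of factors that are each normalized to [0, 1]:

    priority = w_{age} f_{age} + w_{assoc} f_{assoc} + w_{fairshare} f_{fairshare}
             + w_{jobsize} f_{jobsize} + w_{partition} f_{partition} + w_{QOS} f_{QOS}

On festus the fair-share term, which combines the assigned share (here: the financial share) with the resources already consumed, carries the largest weight.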
The cluster provides three partitions; a wall time above the default must be requested explicitly at submission (see the sketch after this list).
- Wall time: 8 hours (default), 24 hours (max); nodes: typA, typB, typE, typF
- Wall time: 8 hours (default), 24 hours (max); nodes: typC, typD
- Wall time: 15 minutes (default), 90 minutes (max); restriction: max. 2 nodes per job
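As a minimal sketch of requesting the maximum wall time from Python by wrapping sbatch (job.sh is a placeholder batch script; no partition is selected, so the Slurm defaults apply):

    # submit_max_walltime.py - submit a placeholder batch script with the maximum wall time
    import subprocess

    # --time=24:00:00 requests the 24-hour maximum instead of the 8-hour default
    subprocess.run(["sbatch", "--time=24:00:00", "job.sh"], check=True)

The same can of course be done directly on the command line with sbatch --time=24:00:00 job.sh.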
Network
- Infiniband (100 Gbit/s)
- Ethernet (25 Gbit/s)
User file space
All data inside /workdir and /scratch has a limited lifetime and there are neither backups nor snapshots. Start with /workdir (NFS) and use /scratch (BeeGFS) only if you really need it.
- /groups/org-id: group directory (only for groups financially involved in the cluster)
- /home: 10GB per user
- /workdir (NFS): ~70TB; data lifetime: max. 60 days
- /scratch (BeeGFS): data lifetime: 10 days
Warning
Use /scratch only for (Intel) MPI-IO or parallel HDF5. Do not use it for jobs that perform only POSIX I/O. If you do not know whether you need /scratch, try /workdir first!
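For illustration, a minimal parallel-HDF5 write with mpi4py and h5py, i.e. the kind of I/O /scratch is intended for (a sketch only: it assumes an MPI-enabled h5py build is available, e.g. via a module or your own environment, and the path below /scratch is a placeholder):

    # parallel_write.py - every MPI rank writes its own slice of one shared HDF5 file
    from mpi4py import MPI
    import h5py
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    n_local = 1000  # number of values written by each rank

    # Collective, parallel open of a single file on the BeeGFS scratch file system
    # ("/scratch/your-user/output.h5" is a placeholder path)
    with h5py.File("/scratch/your-user/output.h5", "w", driver="mpio", comm=comm) as f:
        dset = f.create_dataset("data", shape=(size * n_local,), dtype="f8")
        # Each rank writes a disjoint, contiguous block of the dataset
        dset[rank * n_local:(rank + 1) * n_local] = np.full(n_local, rank, dtype="f8")

Launched with an MPI starter (e.g. mpirun -n 4 python parallel_write.py, or srun inside a Slurm job), all ranks write collectively into one shared file instead of producing many small independent POSIX writes.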
If you log in via ssh to a node on which one of your jobs is running, you will not see the same /tmp or /dev/shm as your job!
Node-local storage (/tmp), depending on the node type:
- typA/B: ~200GB
- typC/D: ~14TB
- typE/F: ~3.84TB
Commissioning & Extension
November 2024
Resource Manager & Scheduler
Slurm 24.11
Operating system
RHEL 9.5 / Rocky Linux 9.5