IEEE CLUSTER 2025 Program

Tuesday September 2nd

8:30-9:30

Registration
Foyer

9:30-11:00

[Tutorial] High-Performance and Smart Networking Technologies for HPC and AI
Pentland East

[Tutorial] Write highly parallel, vendor neutral applications using C++ and SYCL
Pentland West

[Tutorial] Accelerate HPC and AI workloads with the NVIDIA GH200 Superchip and HPE EX Supercomputing Platform
Prestonfield

[Tutorial] Identifying Software and Hardware Inefficiency at Scale
Holyrood

[Workshop] LLMxHPC: 2025 International Workshop on Large Language Models (LLMs) and HPC
Duddingston

[Workshop] REX-IO 2025: 5th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Salisbury

11:00-11:30

Break

11:30-13:00

[Tutorial] High-Performance and Smart Networking Technologies for HPC and AI
Pentland East

[Tutorial] Write highly parallel, vendor neutral applications using C++ and SYCL
Pentland West

[Tutorial] Accelerate HPC and AI workloads with the NVIDIA GH200 Superchip and HPE EX Supercomputing Platform
Prestonfield

[Tutorial] Identifying Software and Hardware Inefficiency at Scale
Holyrood

[Workshop] LLMxHPC: 2025 International Workshop on Large Language Models (LLMs) and HPC
Duddingston

[Workshop] REX-IO 2025: 5th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Salisbury

13:00-14:00

Lunch Break

14:00-15:30

[Tutorial] High-Performance and Smart Networking Technologies for HPC and AI
Pentland East

[Tutorial] Write highly parallel, vendor neutral applications using C++ and SYCL
Pentland West

[Tutorial] Accelerate HPC and AI workloads with the NVIDIA GH200 Superchip and HPE EX Supercomputing Platform
Prestonfield

[Tutorial] A practical introduction to programming the Tenstorrent Tensix architecture for HPC
Holyrood

[Workshop] REX-IO 2025: 5th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Salisbury

15:30-16:00

Break

16:00-17:30

[Tutorial] High-Performance and Smart Networking Technologies for HPC and AI
Pentland East

[Tutorial] Write highly parallel, vendor neutral applications using C++ and SYCL
Pentland West

[Tutorial] Accelerate HPC and AI workloads with the NVIDIA GH200 Superchip and HPE EX Supercomputing Platform
Prestonfield

[Tutorial] A practical introduction to programming the Tenstorrent Tensix architecture for HPC
Holyrood

[Workshop] REX-IO 2025: 5th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Salisbury

Wednesday September 3rd

8:30-9:30

Registration
Foyer

8:30-9:15

Student session: Career Compass
Salisbury

9:30-11:00

Opening Session
Pentland
Chair: Taisuke Boku (U. Tsukuba)
Welcome address (Michèle Weiland, General Co-Chair, EPCC)
TPC report (Toni Peña, TPC Co-Chair, BSC)
Keynote (1): Natalia Vassilieva
Pentland
Chair: Nick Brown (EPCC)

11:00-11:30

Break

11:30-13:00

Best Paper Finalists
Pentland
Chair: Adrian Jackson (EPCC)
Scaling Deep Learning Molecular Dynamics to 500M Atoms on 4096-Node ARMv8 Clusters
PRT: An Efficient Pipeline Reuse Technology for Large Models Training
Closing the HPC-Cloud Convergence Gap: Multi-Tenant Slingshot RDMA for Kubernetes

13:00-14:00

Lunch Break

14:00-15:30

Session (1) AI Models and Approaches
Pentland
Chair: Miwako Tsuji (RIKEN)
ROCK: Serving Multimodal Model in Cloud with Heterogeneous-Aware Resource Orchestration for Thousands of LoRA Adapters
SplitQuant: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and Adaptive Quantization
DaCe AD: Unifying High-Performance Automatic Differentiation for Machine Learning and Scientific Computing

Session (2) Job Scheduling and Orchestration
Prestonfield
Chair: Ewa Deelman (USC)
GreenK8s: Green-aware Scheduling for Sustainable Kubernetes Cluster Management
DDRM: An SLO-aware Deep Dynamic Resource Management Framework for Microservices
Are We There Yet? Predicting the Queue Wait Times for HPC Jobs

15:30-16:00

Break

16:00-17:00

Poster Presentations
Pentland

17:00-19:00

Poster Session and Conference Reception
JMCC

Thursday September 4th

8:30-9:30

Registration
Foyer

8:30-9:15

Student session: Skills to Thrive
Salisbury

9:30-10:30

Keynote (2): Rosa Badia
Pentland
Chair: Bronis de Supinski (LLNL)

10:30-11:00

Break

11:00-12:30

Session (3) Storage and I/O
Pentland
Chair: Steven Wright (York)
Proactive SSD Failure Prediction with A Gradient-Guided LSTM-xLSTM Hybrid Model
EquilibrIO: Taming the I/O Tides in High-Performance Computing
CFseq: A Framework for Constructing Compression-Friendly Field Sequences for Network Logs

Session (4) Networking and Communications
Prestonfield
Chair: Jay Lofstead (Sandia)
Towards dynamic message passing protocols for stencil-based communication patterns
PIAR: Path-Improved Adaptive Routing for Dragonfly Networks
Cascade: a Collaborative Algorithm for Scalable And Efficient Neighborhood Allgather

12:30-13:30

Lunch Break

13:30-15:00

Session (5) Optimising GPU Performance
Pentland
Chair: Michael Kruse (AMD)
Uniconn: A Uniform High-Level Communication Library for Portable Multi-GPU Programming
A Pattern-Aware Finite Element Matrix Assembly Method on GPUs
nsys2prv: detailed and quantitative analysis of large-scale GPU execution traces with Paraver

Session (6) Systemware and System Architectures
Prestonfield
Chair: James Richings (EPCC)
Cache Less to Save More: A Cost-Based Distributed Caching Strategy for ICN
SoCL: Scalable and Latency-Optimized Microservices in Serverless Edge Computing
Detecting Silent Data Corruption From Hardware Counters

15:00-15:30

Break

15:30-17:30

Session (7) Performance Modelling and Optimisation
Pentland
Chair: Chris Maynard (Met Office)
Lessons from Profiling and Optimizing Placement in AMR Codes
Fine-grain energy consumption modeling of HPC task-based programs
A Versatile Simulated Data Transport Layer for In Situ Workflows Performance Evaluation

Session (8) Storage and I/O
Prestonfield
Chair: Sarah Neuwirth (JGU)
Bridging Metadata Service and CXL: A Metadata-grained and Directory-aware Storage Engine for Distributed Storage Systems
Revisiting Fragmentation for Deduplication in Clustered Primary Storage Systems
FIFO-MEP: An Efficient Multi-Eviction-Point FIFO Cache with Stable Demotion for Burst-Oriented Access Mitigation
RAN: Accelerating Data Repair with Available Nodes in Erasure-Coded Storage

19:00-22:00

Conference Dinner

Friday September 5th

8:30-9:30

Registration
Foyer

9:30-10:30

Keynote (3): Garth Wells
Pentland
Chair: Michèle Weiland (EPCC)

10:30-11:00

CLUSTER 2026 Presentation
Pentland
Chair: Michèle Weiland (EPCC)

11:00-11:30

Break

11:30-13:00

Session (9) Networking and Communications
Pentland
Chair: Jay Lofstead (Sandia)
TRACE: A Targeted Recommender for VM Assignment in Cloud Environment
Scalable and Fast Inference Serving via Hybrid Communication Scheduling on Heterogeneous Networks
Communication Notification through User-Level Interrupts for the BXI Network

Session (10) Algorithms and Numerical Approaches
Prestonfield
Chair: Chris Maynard (Met Office)
Parallel Selected Inversion of Block-tridiagonal with Arrowhead Matrices
Parallel tall-and-skinny QR factorization based on LU-CholeskyQR algorithm
Towards High-Performance and Portable Molecular Docking on CPUs through Vectorization

13:00-14:00

Lunch Break

14:00-15:30

Session (11) Applications and Optimisation Approaches
Pentland
Chair: Nick Brown (EPCC)
MoE-Rckpt: Efficient In-Memory Checkpointing for MoE Model Training with Dynamicity Awareness
Efficient Multi-GPU Programming in Python: Reducing Synchronization and Access Overheads

Session (12) Scheduling and Applications
Prestonfield
Chair: Toni Peña (BSC)
BMPipe: Bubble-Memory Co-optimization Strategy Planner for Very-large DNN Training
Deadline-Aware Resource Allocation and Scheduling of Serverless Workloads on Heterogeneous Clusters
Accelerating Key-Value Data Structures Using AVX-512 SIMD Extensions

15:30-15:45

Break

15:45-16:15

Awards and Closing Session
Pentland
Chair: Michèle Weiland (EPCC)
Best Poster Awards
Best Paper Awards
Closing Remarks: Taisuke Boku (U. Tsukuba)

Conference Room Layout

Conference centre layout