IEEE Cluster 2024 Program

September 24

8:30-9:00

Registration

9:00-10:30

[Workshop] Sustainable HPC State of the Practice Workshop (Sustainable HPC SOP Workshop)
This session is scheduled to start at 9:30.
Room 402

[Tutorial] Embrace Arm and GPU in the datacenter using the NVIDIA Grace Hopper Superchip
Room 404

[Tutorial] Identifying Software Inefficiency with Fine-grained Value Profilers
Room 406

10:30-10:45

Break

10:45-12:15

[Workshop] LLMxHPC: 2024 International Workshop on Large Language Models (LLMs) and HPC
Room 401

[Workshop] REX-IO 2024: 4th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Room 403

[Workshop] Sustainable HPC State of the Practice Workshop (Sustainable HPC SOP Workshop)
Room 402

[Tutorial] Embrace Arm and GPU in the datacenter using the NVIDIA Grace Hopper Superchip
Room 404

[Tutorial] Best Practices for HPC in the Cloud with AWS and Graviton (Arm) instances
Room 405

[Tutorial] Identifying Software Inefficiency with Fine-grained Value Profilers
Room 406

12:15-13:15

Lunch Break
(A lunch box is provided for attendees)

13:15-14:45

[Workshop] LLMxHPC: 2024 International Workshop on Large Language Models (LLMs) and HPC
Room 401

[Workshop] REX-IO 2024: 4th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Room 403

[Workshop] Sustainable HPC State of the Practice Workshop (Sustainable HPC SOP Workshop)
Room 402

[Tutorial] Embrace Arm and GPU in the datacenter using the NVIDIA Grace Hopper Superchip
Room 404

[Tutorial] Best Practices for HPC in the Cloud with AWS and Graviton (Arm) instances
Room 405

[Canceled] [Workshop] Cluster Computing - A Sustainable Approach
Room 406

14:45-15:00

Break

15:00-16:30

[Workshop] Fifth Workshop on Heterogeneous Memory Systems (HMEM)
Room 401

[Workshop] REX-IO 2024: 4th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Room 403

[Workshop] Sustainable HPC State of the Practice Workshop (Sustainable HPC SOP Workshop)
Room 402

[Tutorial] Embrace Arm and GPU in the datacenter using the NVIDIA Grace Hopper Superchip
Room 404

[Tutorial] Best Practices for HPC in the Cloud with AWS and Graviton (Arm) instances
Room 405

[Tutorial] GPU accelerated applications with CUDA-Q: An integrated framework for hybrid classical-quantum computing workloads
Room 406

16:30-16:45

Break

16:45-18:15

[Workshop] Fifth Workshop on Heterogeneous Memory Systems (HMEM)
Room 401

[Workshop] REX-IO 2024: 4th Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads
Room 403

[Workshop] Sustainable HPC State of the Practice Workshop (Sustainable HPC SOP Workshop)
Room 402

[Tutorial] Best Practices for HPC in the Cloud with AWS and Graviton (Arm) instances
Room 405

[Tutorial] GPU accelerated applications with CUDA-Q: An integrated framework for hybrid classical-quantum computing workloads
Room 406

September 25

9:00-9:30

Registration

9:30-11:00

Opening Session
International Conference Room
Chair: Taisuke Boku (U. Tsukuba)
Welcome address (Satoshi Matsuoka, General Co-Chair, RIKEN R-CCS)
TPC report (Mohamed Wahib, TPC Co-Chair, RIKEN R-CCS)
Keynote (1): Estela Suarez
International Conference Room
Chair: Kengo Nakajima (U.Tokyo/RIKEN R-CCS)

11:00-11:30

Break

11:30-13:00

Best Paper Finalists
International Conference Room
Chair: Mohamed Wahib (RIKEN R-CCS)
GPU Reliability Assessment: Insights Across the Abstraction Layers
Siesta: Synthesizing Proxy Applications for MPI Programs
Distributed Order Recording Techniques for Efficient Record-and-Replay of Multi-threaded Programs

13:00-14:00

Student Networking Lunch
(A lunch box is provided for attendees)
Room 401/402

Lunch Break
(A lunch box is provided for attendees)
International Conference Room

14:00-15:30

Session (1) Graph Algorithms & GNNs
International Conference Room
Chair: Lishan Yang (GMU)
FTGraph: A Flexible Tree-based Graph Store on Persistent Memory for Large-Scale Dynamic Graphs
PGSampler: Accelerating GPU-based Graph Sampling in GNN Systems via Workload Fusion
MassiveGNN: Efficient Training via Prefetching for Massively Connected Distributed Graphs

Session (2) Performance Modeling
Room 401/402
Chair: Filippo Spiga (NVIDIA)
A Protocol to Assess the Accuracy of Process-Level Power Models
Holistic Performance Analysis for Asynchronous Many-Task Runtimes
Automated approach for accurate CPU power modelling

15:30-16:00

Break

16:00-16:30

Poster Indexing (30 min)
International Conference Room

16:30-18:30

Poster Session
Reception Hall

September 26

9:00-9:30

Registration

9:30-10:30

Keynote (2): Rio Yokota
International Conference Room
Chair: Miwako Tsuji (RIKEN R-CCS)

10:30-11:00

Break

11:00-12:30

Session (3) Networks & Communication
International Conference Room
Chair: Seydou Ba (RIKEN R-CCS)
MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns
Optimizing Neighbor Collectives with Topology Objects
A Topology- and Load-Aware Design for Neighborhood Allgather

Session (4) Numerical Libraries
Room 401/402
Chair: Emil Vatai (RIKEN R-CCS)
Uncut-GEMMs: Communication-aware matrix multiplication on multi-GPU nodes
Generating High-Performance FFT Code through MLIR Linalg Dialect and Micro-Kernel Optimization
Understanding Mixed Precision GEMM with MPGemmFI: Insights into Fault Resilience

12:30-13:30

Lunch Break
(A lunch box is provided for attendees)
International Conference Room

Student Networking Lunch
(A lunch box is provided for attendees)
Room 401/402

13:30-15:00

Session (5) IoT, Cloud, and Data Center (1 of 2)
International Conference Room
Chair: Matthew Dearing (UIC) Ryohei Kobayashi (University of Tsukuba)
Parallelism or Fairness? How to be friendly for SSDs in cloud environments
SLACKVM: Packing Virtual Machines in Oversubscribed Cloud Infrastructures
RL-Cache: An Efficient Reinforcement Learning based Cache Partitioning Approach for Multi-tenant CDN Services

Session (6) Runtime Optimizations
Room 401/402
Chair: Yuan He (RIKEN R-CCS)
FCUFS: Core-Level Frequency Tuning for Energy Optimization on Intel Processors
ML-based Dynamic Operator-Level Query Mapping for Stream Processing Systems in Heterogeneous Computing Environments
Enabling Practical Transparent Checkpointing for MPI: A Topological Sort Approach

15:00-15:30

Break

15:30-17:00

Session (7) IoT, Cloud, and Data Center (2 of 2)
International Conference Room
Chair: George Michelogiannakis (LBNL) Asif Ali Ahmed R (IEEE)
Enabling Workload-Driven Elasticity in MPI-based Ensembles
Geo-Distributed Analytical Streaming Architecture for IoT Platforms
Seastar: A Cache-Efficient and Load-Balanced Key-Value Store on Disaggregated Memory

Session (8) Job Scheduling & Orchestration
Room 401/402
Chair: L., Lavanye (HPE) Mohamed Wahib (RIKEN R-CCS)
HEFTLess: A Bi-Objective Serverless Workflow Batch Orchestration on the Computing Continuum
Job Scheduling in High Performance Computing Systems with Disaggregated Memory Resources
Fully Decentralized Data Distribution for Exascale-HPC: End of the Provider-Demander Matching Puzzle

17:00-17:15

Short Break w/o Coffee

17:15-18:45

Panel Discussion
International Conference Room
AI for Science: What should we do?
Panelists: Sunita Chandrasekaran (U.Delaware, USA), Estela Suarez (JSC/U.Bonn, Germany), Ikko Hamamura (NVIDIA, Japan), Mohamed Wahib (RIKEN R-CCS, Japan), Rio Yokota (Tokyo Tech, Japan).
Moderator: Kengo Nakajima (U.Tokyo/RIKEN R-CCS).

18:45-19:00

Short Break w/o Coffee

19:00-21:00

September 27

9:00-9:30

Registration

9:30-10:30

Keynote (3): Sunita Chandrasekaran
International Conference Room
Chair: Taisuke Boku (U.Tsukuba)

10:30-11:00

CLUSTER 2025 Presentation
International Conference Room
Chair: Taisuke Boku (U.Tsukuba)
Presenter: Michele Weiland (U. Edinburgh)

11:00-11:30

Break

11:30-13:00

Session (9) Accelerators & In-Network Computing
International Conference Room
Chair: Lingqi Zhang (RIKEN R-CCS)
FT K-Means: A High-Performance K-Means on GPU with Fault Tolerance
ScalFrag: Efficient Tiled-MTTKRP with Adaptive Launching on GPUs
Leveraging high-performance data transfer to offload data management tasks to SmartNICs

Session (10) Workflows
Room 401/402
Chair: Jay F Lofstead (Sandia National Lab)
DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics
Sizey: Memory-Efficient Execution of Scientific Workflow Tasks
Phase-based Data Placement Optimization in Heterogeneous Memory

13:00-14:00

Lunch Break
(A lunch box is provided for attendees)
International Conference Room

Student Job Fair Lunch (inc. Supporter Companies)
(A lunch box is provided for attendees)
Room 401/402

14:00-15:30

Session (11) Applications
International Conference Room
Chair: Jiajun Huang (UC Riverside)
Xphase3d: Memory-Distributed Phase Retrieval for Reconstructing Large-Scale 3D Density Maps of Biological Macromolecules
Accuracy-Efficiency Optimization for Multi-Stage Small Object Detection in Surveillance Video with Collaborative Frame Sampling
Modernizing an Operational Real-time Tsunami Simulator to Support Diverse Hardware Platforms

Session (12) Storage & I/O
Room 401/402
Chair: Reza HoseinyF (University of Sydney)
I/O Behind the Scenes: Bandwidth Requirements of HPC Applications With Asynchronous I/O
FINCHFS: Design of Ad-Hoc File System for I/O Heavy HPC Workloads
A High-Performance and Fast-Recovery Scheme for Secure Non-Volatile Memory Systems

15:30-15:35

Short Break w/o Coffee

15:35-16:05

Award and Closing Session
International Conference Room
Chair: Kengo Nakajima (U. Tokyo/RIKEN R-CCS)
Best Student Poster & Best Poster Awards (Takeshi Fukaya, Posters Chair, Hokkaido U.)
Best Student Paper & Best Paper Awards (Mohamed Wahib, TPC Co-Chair, RIKEN R-CCS)
Closing Remarks (James Lin, General Co-Chair, Shanghai Jiao Tong U.)

16:25-17:30