IEEE Cluster 2025

Keynotes

Keynote 1

Performance analysis of GPU architectures for solving partial differential equations
Garth Wells
University of Cambridge

Abstract

With the arrival of easy-to-use development tools and the maturing of compilers, it has become straightforward to write working programs for GPUs. However, it remains hard to design algorithms and implementations that reach a good fraction of peak performance, whether in floating-point operations, main memory bandwidth, or fast local memory.

We consider the design and performance of GPU algorithms and implementations for finite element operators, including parametrised algorithms that can be adjusted to suit the characteristics of the target hardware. The performance of the algorithms is investigated on a range of architectures, including the AMD MI300X and the NVIDIA GH200 processors. We show that in nearly all cases the performance limiter is the local fast memory, and this has informed the design of the algorithms. We also show, contrary to accepted wisdom, that the performance of lower-order methods on GPUs can be good and can be faster than what has been reported in the literature. This is promising for engineering applications, where low-order methods are more robust. Finally, we explore the use of tensor cores for solving differential equations and analyse performance for practical engineering problems on the LUMI supercomputer.
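
To make the idea of a performance limiter concrete, the short sketch below applies a simple roofline-style estimate to a hypothetical finite element kernel: the time implied by each resource is computed separately, and the largest one names the bottleneck. All hardware figures and per-element costs are illustrative assumptions chosen for this sketch, not numbers from the talk.

    # Roofline-style estimate (illustrative only): compare the times implied by
    # peak floating-point rate, main (HBM) memory bandwidth, and fast local
    # (shared/LDS) memory bandwidth for a hypothetical kernel.

    def bottleneck(flops, hbm_bytes, local_bytes,
                   peak_tflops=100.0, hbm_tb_s=3.0, local_tb_s=20.0):
        """Return the limiting resource and the time (s) it implies."""
        times = {
            "floating point": flops / (peak_tflops * 1e12),
            "main memory":    hbm_bytes / (hbm_tb_s * 1e12),
            "local memory":   local_bytes / (local_tb_s * 1e12),
        }
        limiter = max(times, key=times.get)
        return limiter, times[limiter]

    # Hypothetical per-element costs for a low-order operator, scaled to 10M elements.
    n_elements = 10_000_000
    limiter, t = bottleneck(flops=2_000 * n_elements,
                            hbm_bytes=400 * n_elements,
                            local_bytes=3_000 * n_elements)
    print(f"predicted limiter: {limiter}, estimated time: {t:.4f} s")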

Biography

Garth Wells is the Hibbitt Professor of Solid Mechanics at the University of Cambridge. He received his undergraduate degree in engineering from The University of Western Australia and his PhD from Delft University of Technology. Before joining the University of Cambridge in 2007, he held a faculty position at Delft University of Technology and post-doctoral positions at Stanford University and The University of Texas at Austin. His interests include numerical analysis, scientific computing and mathematical software, motivated by challenging engineering applications. He is a leader of the FEniCS Project on mathematical software and a strong advocate for open-source scientific software. He serves as an Associate Editor of the SIAM Journal on Scientific Computing.

Keynote 2

Wafer-Scale Computing: AI and HPC with Fewer, Stronger Machines
Natalia Vassilieva
Cerebras Systems

Abstract

Computer performance has advanced by many orders of magnitude since the earliest systems, yet the demands of AI and scientific workloads continue to outpace what current large-scale clusters can provide. Demand engenders supply, and new approaches to maintaining computational progress are emerging. One such breakthrough is the development of a wafer-scale compute platform by Cerebras. Why wafer-scale? For many workloads, the real achieved performance of supercomputers (as opposed to their peak speed) is limited by bandwidth and latency barriers, the memory and communication walls, which impose delays whenever data must be fetched from off the processor chip. By increasing the scale of the chip by two orders of magnitude, we can pack a small but powerful mini-supercomputer into a single piece of silicon, greatly reducing off-chip traffic and eliminating these bottlenecks.

Cerebras overcame technical challenges, including yield, packaging, cooling, and power delivery, to make wafer-scale computing viable. This talk will present the details of the Cerebras hardware and software stack and discuss diverse use cases, from large-scale deep learning model training to high-throughput inference and scientific computing.

We will delve into the architecture of the Wafer-Scale Engine (WSE), highlighting its wafer-scale integration, on-chip memory, and high-bandwidth communication fabric. We will cover the co-designed weight streaming execution strategy for training, which disaggregates parameter storage from compute, enabling independent scaling of model and cluster size. This approach allows for data-parallel distributed training of arbitrarily sized models on arbitrarily sized clusters with simple single-device model code, achieving linear scaling while avoiding the complexities of hybrid distribution techniques.
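
As a rough illustration of the weight-streaming idea (a toy NumPy sketch, not Cerebras code or APIs), the example below keeps the batch and its activations resident on the "compute device" while the layer weights live in an external parameter store and are streamed in one layer at a time; the model sizes, loss, and learning rate are arbitrary choices for the sketch.

    # Toy weight-streaming sketch: parameters live off-device in param_store and
    # are brought in one layer at a time during the forward and backward passes.
    import numpy as np

    rng = np.random.default_rng(0)
    layer_sizes = [(784, 512), (512, 512), (512, 10)]                     # arbitrary toy model
    param_store = [rng.standard_normal(s) * 0.01 for s in layer_sizes]    # external parameter storage

    def train_step(x, y, lr=1e-3):
        # Forward pass: stream one weight matrix at a time, keep activations local.
        acts = [x]
        for W in param_store:                         # "stream in" layer weights
            acts.append(np.maximum(acts[-1] @ W, 0.0))  # ReLU layer
        grad = (acts[-1] - y) / len(x)                # gradient of a simple squared-error loss

        # Backward pass: stream weights again, send weight gradients back to the store.
        for i in reversed(range(len(param_store))):
            W = param_store[i]                        # "stream in" weights for layer i
            grad = grad * (acts[i + 1] > 0)           # back through the ReLU of layer i
            dW = acts[i].T @ grad                     # weight gradient leaves the device
            grad = grad @ W.T                         # propagate to the previous activations
            # In data-parallel training, replicas would all-reduce dW before this update.
            param_store[i] = W - lr * dW              # update happens in the parameter store

    train_step(rng.standard_normal((32, 784)), rng.standard_normal((32, 10)))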

We will explore hardware-optimized LLM mapping to wafer-scale clusters for ultra-low-latency autoregressive inference, enabled by a large pool of on-chip memory. This unlocks interactivity for agentic and reasoning workflows, which require multiple sequential inference calls for planning and multi-step execution and reasoning.

Finally, we will highlight scientific computing applications that take full advantage of the unique architecture of the WSE. These include a stencil-based finite-difference solver for the 3D wave equation, which shifts from being memory-bound to compute-bound on the WSE, as well as pioneering work in multi-dimensional seismic processing and molecular dynamics. These applications achieved up to 750x speedups over the world’s leading supercomputers and have been recognized as Gordon Bell Award finalists.
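
For readers unfamiliar with such solvers, the sketch below shows one explicit finite-difference time step for the 3D wave equation in NumPy. It illustrates only the stencil pattern, not the WSE implementation; the comment on arithmetic intensity is a general observation about this class of kernels rather than a figure from the talk.

    # One explicit time step of u_tt = c^2 * (u_xx + u_yy + u_zz) with a
    # 7-point Laplacian stencil. Each grid point costs only about a dozen flops
    # while three full 3D arrays are touched, which is why such updates are
    # typically memory-bound on conventional machines.
    import numpy as np

    def wave_step(u_prev, u_curr, c=1.0, dt=1e-3, dx=1e-2):
        """Return u at the next time level, updating the interior of the grid."""
        lap = (u_curr[:-2, 1:-1, 1:-1] + u_curr[2:, 1:-1, 1:-1] +
               u_curr[1:-1, :-2, 1:-1] + u_curr[1:-1, 2:, 1:-1] +
               u_curr[1:-1, 1:-1, :-2] + u_curr[1:-1, 1:-1, 2:] -
               6.0 * u_curr[1:-1, 1:-1, 1:-1])
        u_next = u_curr.copy()
        u_next[1:-1, 1:-1, 1:-1] = (2.0 * u_curr[1:-1, 1:-1, 1:-1]
                                    - u_prev[1:-1, 1:-1, 1:-1]
                                    + (c * dt / dx) ** 2 * lap)
        return u_next

    n = 64
    u_prev = np.zeros((n, n, n)); u_curr = np.zeros((n, n, n))
    u_curr[n // 2, n // 2, n // 2] = 1.0   # point source in the middle of the grid
    for _ in range(10):
        u_prev, u_curr = u_curr, wave_step(u_prev, u_curr)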

Biography

Natalia Vassilieva is VP and Field CTO, ML at Cerebras Systems. She has decades of R&D experience in natural language processing, computer vision, machine learning, and information retrieval. Prior to Cerebras, Natalia was a Senior Research Manager at Hewlett Packard Labs, where she led the Software and AI group from 2015 to 2019 and served as the head of HP Labs Russia from 2011 to 2015. She led research teams developing algorithms and applications for text, image and time series analysis and modelling. From 2012 to 2015, Natalia was also a part-time Associate Professor at St. Petersburg State University and a part-time lecturer at the Computer Science Center, St. Petersburg, Russia. She holds a PhD in Computer Science from St. Petersburg State University.

Keynote 3

Unlocking Possibilities: Interoperability, Heterogeneous Architectures, and the Promise of AI
Johannes Doerfert
Lawrence Livermore National Laboratory

Abstract

In today’s computing landscape, heterogeneity has become the standard, offering vast opportunities but introducing significant complexity. The promise of harnessing diverse architectures is frequently hindered by fragmented toolchains, legacy constraints, personal preferences, and the ongoing trade-off between portability and performance. As software ecosystems grow beyond our capacity to manage them, the pursuit of a universal HPC language or the continual porting of code is no longer sustainable. We will discuss LLVM/Offload, an alternative solution aiming at seamless interoperability across languages and architectures, effectively facilitating efficient utilization of any accelerator.

As software complexity and portability challenges intensify, AI might be seen as the silver bullet. It is poised to permeate every layer of computation—including compilers and toolchains—and it promises unprecedented advantages. However, we will look at actual deployment, persistent challenges, and emerging directions in the development of AI-enabled compilers. We will also assess whether AI can truly address the long-standing performance-portability dilemma, or if conceptual barriers must first be overcome to fully realize its potential.

Biography

Johannes Doerfert is a computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory, interested in new and exciting uses for compiler technologies. His research goal is to help people exploit hardware to the fullest without requiring them to become experts in the hardware or the software stack, including programming languages. Code is a means, not the final goal. As such, Johannes believes that manual efforts to rewrite, tune, or adapt code are often signs of missing tools, compiler shortcomings, misinformation, or a combination thereof.

Johannes has been involved in the LLVM compiler framework since 2014 and the OpenMP language standard since 2018. He received his Ph.D. in Computer Science from Saarland University in Germany in 2018.