IEEE Cluster 2023

Keynotes ● Cluster 2023

Wednesday, November 1 – 9:30 - 10:30

AI, Cloud, and the Future of HPC.
Bill Magro
Google

Abstract

The slowing of Moore’s Law had a profound impact on the trajectory of HPC and spurred the worldwide race to exascale. This, in turn, fostered new directions in system architecture that have driven a renaissance in AI. Cloud computing has also grown steadily and is feeling the influence of HPC. We have now reached a point where AI and cloud are directly shaping the future of HPC. In this talk we will discuss how we got here, what the future of HPC looks like, and some important implications for HPC practitioners.

Biography

Bill Magro is Chief Technologist for HPC at Google, where he drives strategy and customer success for Google Cloud.

Magro joined Google in 2020, after 20 years at Intel, where he was Intel Fellow and Chief Technologist for HPC. There, he served as a key strategist and driver for Intel’s HPC business, with a focus on software, solutions, and emerging areas, including HPC/Cloud and Exascale.

A recognized leader in the InfiniBand industry, Magro helped found the OpenFabrics Alliance and served as InfiniBand Trade Association TWG co-chair from 2007 to 2020.

A prominent voice in the HPC community for over two decades, Magro regularly presents at HPC conferences, advisory boards, and panels.

Magro spent three years as a post-doctoral fellow and staff member at the Cornell Theory Center. He holds a B.Eng in applied and engineering physics from Cornell University and a Ph.D. in physics from the University of Illinois.

Thursday, November 2 – 9:30 - 10:30

Pushing RISC-V into HPC.
Jesús Labarta Mancho
Barcelona Supercomputing Center (BSC)

Abstract

The talk will present the philosophy and results of the activity within the European Processor Initiative (EPI) to design a RISC-V vector accelerator. I will briefly present the overall project structure and then focus on the vision of how long vector architectures address fundamental issues in HPC, such as expressing concurrency and dealing with latency. I will also discuss how the open-standard RISC-V ISA provides a foundation on which that vision can be deployed while leveraging the contributions of a growing community.

I will describe the architecture of the RISC-V processor designed in the project and its software environment. I will present performance analysis results obtained on an FPGA emulator implementing the same RTL as the taped-out test chip, which is now in the bring-up process. The FPGA emulator constitutes a Software Development Vehicle (SDV) where a standard Linux environment is available, as well as an LLVM compiler supporting both intrinsics and automatic vectorization. A powerful performance analysis framework is available to understand the behavior of real applications. This environment seamlessly covers a very wide range of levels of detail, from coarse-grained full-application behavior to fine-grained micro-architectural behavior.

Biography

Prof. Jesús Labarta received his Ph.D. in Telecommunications Engineering from UPC in 1983, where he has been a full professor of Computer Architecture since 1990. He has been the Director of the Computer Sciences Dept. at the Barcelona Supercomputing Center since 2005. His research interests include performance analysis and prediction tools, parallel programming models, malleability and resource management, and RISC-V vector architectures. He was awarded the 2017 Ken Kennedy Award.

Friday, November 3 – 9:30 - 10:30

Update on the Aurora Supercomputer.
Susan Coghlan
Argonne National Laboratory

Abstract

This presentation will provide an overview of the Aurora supercomputer, touching on the architecture, early science, and Aurora’s current status. The talk will also discuss some of the challenges seen along the way, how this deployment has differed from past deployments, and offer up a few lessons learned.

Biography

Susan Coghlan is an expert in designing and deploying the world’s largest extreme-scale parallel and distributed computing systems. For over 30 years she has pioneered the evaluation, selection, design, integration, and operation of first-of-their-kind computer systems, from one of the first commercially integrated Linux clusters deployed by a national laboratory, to the transition to large shared memory systems, to, more recently, massively parallel low-power cores with on-package memory. Many of the systems Susan has worked on were ranked among the top 10 fastest computers in the world.

She also has experience leading a commercial software code team. In 2000, Susan helped found a research laboratory for TurboLinux that developed the world’s first dynamic provisioning system for cloud computing and HPC clusters, where she was both a senior software developer and project manager.

In 2002 she joined Argonne National Laboratory as the HPC Manager, leading the deployment and operations of Argonne’s first production supercomputer for the newly created Argonne Laboratory Computing Resource Center. Later, she led the deployment of the lab’s first BlueGene/L system. When Argonne was selected as one of two Leadership Computing Facilities for the Department of Energy, Susan became the Argonne Leadership Computing Facility’s (ALCF) first Director of Operations. Later, as Deputy Director of the ALCF, she worked closely with industry partners and computer architects to field yet more powerful systems.

Her latest project is leading the acquisition and deployment of the Aurora supercomputer, which will be over 2 ExaFlops. Susan is currently the Future Systems Project Director for the Computing, Environment, and Life Sciences (CELS) directorate at Argonne, and the Deputy Director for Hardware Integration on the DOE Exascale Computing Project.