 
FORTH - Institute of Computer Science
 
 

IEEE Technical Committee on Scalable Computing

www.nsf.gov

Gold supporters:

Silver supporters:


SGI

PRACE

 

 
 
IEEE Cluster 2010 Program
   

Monday, September 20 2010

Workshop on Parallel Programming and Applications on Accelerator Clusters (PPAAC)

Session 1 / 09:00 – 10:30 / Ourania Hall

09:00–09:15

Opening Remarks

Alexandros Stamatakis, Technische Universität München

09:15–10:00

Keynote Speech: Highly Parallel Implementations of Bioinformatics Applications

Ioannis Papaefstathiou, Technical University of Crete

10:00–10:30

A Package for OpenCL Based Heterogeneous Computing on Clusters with Many GPU Devices

Amnon Barak, Tal Ben-Nun, Ely Levy, Amnon Shiloh, The Hebrew University of Jerusalem

10:30–11:00

Coffee Break

Session 2 / 11:00 – 12:30 / Ourania Hall

11:00–11:30

Accelerating Data Clustering on GPU-based Clusters under Shared Memory Abstraction

Konstantinos Karantasis, Eleftherios Polychronopoulos and George N. Dimitrakopoulos, University of Patras

11:30–12:00

A Multi-Platform Linear Algebra Toolbox for Finite Element Solvers on Heterogeneous Clusters

Vincent Heuveline, Chandramowli Subramanian, Dimitar Lukarski, Jan-Philipp Weiss, Karlsruhe Institute of Technology

12:00–12:30

Efficient Complex Matrix Multiplication on the Synergistic Processing Element of the Cell Processor

Quentin Bourgerie, Pierre Fortin, Jean-Luc Lamotte, Université Pierre et Marie Curie

12:30–14:00

Lunch (Main Restaurant)

Session 3 / 14:00 – 15:45 / Ourania Hall

14:00–14:45

Invited Presentation: Green Flash: Ultra-Efficient Supercomputing

David Donofrio, Lawrence Berkeley National Laboratory

14:45–15:15

High Performance Triangle versus Box Intersection Checks

Thomas V. Christensen and Sven Karlsson, Technical University of Denmark

15:15–15:45

Assessment of Barrier Implementations for Fine-Grain Parallel Regions on Current Multi-core Architectures

Simon A. Berger and Alexandros Stamatakis, Technische Universität München

 

Tutorial on Practical Approach to Performance Analysis and Modeling

09:00 – 17:00 / Kalia Hall

Adolfy Hoisie, Darren J. Kerbyson, Pacific Northwest National Laboratory


Abstract: This tutorial presents a practical approach to the performance modeling of large-scale scientific applications on high performance systems. Its defining characteristic is a proven approach to modeling full-blown scientific codes, developed at Los Alamos, that has been validated on systems containing tens of thousands of processors and beyond. We show how models are constructed and demonstrate how they are used to predict, explain, diagnose, and engineer application performance in existing or future codes and/or systems. Notably, our approach does not require the use of specific tools but rather is applicable across commonly used environments. Moreover, since our performance models are parametric in terms of machine and application characteristics, they give the user the ability to “experiment ahead” with different system configurations or algorithms/coding strategies. Both will be demonstrated in studies emphasizing the application of these modeling techniques, including: verifying system performance, comparison of large-scale systems, and examination of possible future systems.

 

Workshop on High Performance Computing on Complex Environments (HPCCE)

Session 1 / 08:30 – 10:30 / Clio Hall

08:30–09:00

Opening Remarks

Emmanuel Jeannot, INRIA

09:00–09:30

Parallel Sorting Algorithms for Optimizing Particle Simulations

Michael Hofmann, Gudula Rünger, Chemnitz University of Technology; Paul Gibbon, Robert Speck, Jülich Supercomputing Centre

09:30–10:00

Investigation of Selection Strategies in Parallel Branch and Bound Algorithm with Simplicial Partitions

Remigijus Paulavicius, Julius Žilinskas, Institute of Mathematics and Informatics–Akademijos; Andreas Grothey, University of Edinburgh

10:00–10:30

Investigation of Parallel Particle Swarm Optimization Algorithm With Reduction of the Search Area

Algirdas Lancinskas, Julius Žilinskas, Institute of Mathematics and Informatics–Akademijos; Pilar Martínez Ortigosa, University of Almeria

10:30–11:00

Coffee Break

Session 2 / 11:00–12:30 / Clio Hall

11:00–11:30

Optimization of Topology of Truss Structures using Grid Computing

Aleksandr Igumenov, Julius Žilinskas, Institute of Mathematics and Informatics–Akademijos; Krzysztof Kurowski, Mikolaj Mackowiak, Poznan Supercomputing and Networking Center

11:30–12:00

Identifying Cloud Computing Usage Patterns

Dana Petcu, West University of Timisoara

12:00–12:30

THOR: A Transparent Heterogeneous Open Resource framework

Jose Luis Vázquez-Poletti, Universidad Complutense de Madrid; Jan Perhac, John Ryan, Anne C. Elster, Norwegian University of Science and Technology

12:30–14:00

Lunch (Main Restaurant)

Session 3 / 14:00–15:00 / Clio Hall

14:00–14:30

Run-Time Optimization of Sends, Receives and File I/O

Thorvald Natvig, Anne C. Elster, Norwegian University of Science and Technology

14:30–15:00

Applicability of Dynamic Selection of Implementation Variants of Sequential Iterated Runge-Kutta Methods

Natalia Kalinnik, Matthias Korch, Thomas Rauber, University of Bayreuth

15:00–15:30

Coffee Break

Session 4 / 15:30–16:30 / Clio Hall

15:30–16:00

GPU-Based Segmentation of Cervical Vertebra in X-Ray Images

Sidi Ahmed Mahmoudi, Fabian Lecron, Pierre Manneback, Mohammed Benjelloun, Saïd Mahmoudi, University of Mons

16:00–16:30

GPU Implementation of the Pixel Purity Index Algorithm for Hyperspectral Image Analysis

Sergio Sánchez, Antonio Plaza, University of Extremadura

16:30–17:00

Coffee Break

Session 5 (Invited Presentations) / 17:00–18:20 / Clio Hall

17:00–17:20

Performance of Scheduling Strategies in Computational Grids and Clouds

Helen Karatza, Aristotle University of Thessaloniki

17:20–17:40

Component-based Methodology for High Development Productivity of Complex Applications

Vladimir Getov, University of Westminster

17:40–18:00

Research Activities at the University of Manchester related to Complex HPC

Rizos Sakellariou, University of Manchester

18:00–18:20

Selecting High Performance Computing and High Throughput Computing Capabilities for Hydro Meteo Research e-Infrastructures

Andrea Clematis, Daniele D’Agostino, Antonella Galizia, Alfonso Quarati, IMATI-CNR; Antonio Parodi, Nicola Rebora, CIMA Research Foundation; Dieter Kranzlmueller, Michael Schiffers, Ludwig Maximilian Universität and Leibniz Supercomputing Center

 

Tuesday, September 21 2010

Plenary Session

Opening Remarks & Keynote 1 / 09:00 – 10:30 / Hermes Hall

09:00–09:15

Opening Remarks

Dimitrios S. Nikolopoulos, Angelos Bilas, FORTH-ICS; Ricardo Bianchini, Rutgers University

09:15–10:30

Keynote 1

Title: No Power, No Cloud

Speaker: Christian Belady, Microsoft Research

10:30–11:00

Coffee Break

Session 1 (Chair: DK Panda) / 11:00 – 12:30 / Hermes Hall

11:00–11:30

Minimizing MPI Resource Contention in Multithreaded Multicore Environments

David Goodell, Pavan Balaji, Darius Buntinas, ANL; Gabor Dozsa, IBM; William Gropp, University of Illinois; Sameer Kumar, IBM; Bronis De Supinski, LLNL/CASC; Rajeev Thakur, ANL

11:30–12:00

TCCluster: A Cluster Architecture Utilizing the Processor Host Interface as a Network Interconnect

Heiner Litz, Maximilian Thuermer, Ulrich Bruening, University of Heidelberg

12:00–12:30

Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing

Canqun Yang, Feng Wang, NUDT, PRC; Yunfei Du, Juan Chen, Jie Liu, Huizhan Yi, Kai Lu, School of Computer Science, National University of Defense Technology

12:30-14:00

Lunch (Main Restaurant)


Session 2 (Chair: Brian Wylie) / 14:00 – 15:30 / Hermes Hall

Session 3 (Chair: Ron Brightwell) / 14:00 – 15:30 / Apollon Hall

14:00–14:30

 

 

 

 

14:30–15:00



 

 


15:00–15:30

How to scale Nested OpenMP Applications on the ScaleMP vSMP Architecture

Dirk Schmidl, Christian Terboven, Andreas Wolf, Dieter an Mey, Christian Bischof, RWTH Aachen University

 


Synchronizing Concurrent Events in Traces of Hybrid MPI/OpenMP Applications

Daniel Becker, German Research School for Simulation Sciences; Markus Geimer, Forschungszentrum Juelich GmbH; Rolf Rabenseifner; Felix Wolf, GRS

Getting Rid of Coherency Overhead for Memory-Hungry Applications

Hector Montaner, Federico Silla, Univ. Politècnica de València; Holger Froning, Universität Heidelberg; Jose Duato, Univ. Politècnica de València

Energy-aware Scheduling in Virtualized Datacenters

Íñigo Goiri, Ferran Julià, UPC; Ramón Nou, Josep Berral, Jordi Guitart, Jordi Torres, BSC


TRACER: A Trace Replay Tool to Evaluate Energy-Efficiency of Mass Storage Systems

Zhuo Liu, Fei Wu, Xiao Qin, Department of Computer Science and Software Engineering, Auburn University, Auburn; Chang Sheng Xie, Jian Zhou, Huazhong University of Science and Technology; Jianzong Wang

Designing OS for HPC applications: Scheduling

Roberto Gioiosa, BSC; Sally McKee, Chalmers University of Technology; Mateo Valero, BSC

15:30–16:00

Coffee Break


Session 4 (Chair: Darren Kerbyson) / 16:00 – 17:30 / Hermes Hall

Session 5 (Chair: Vijay Pai) / 16:00 – 17:30 / Apollon Hall

16:00–16:30

 

 

 



16:30–17:00

 




17:00–17:30

Exploiting Data Deduplication to Accelerate Live Virtual Machine Migration

Xiang Zhang, Zhigang Huo, Dan Meng, Chinese Academy of Sciences


SHelp: Automatic Self-healing for Multiple Application Instances in Virtual Machine Environment

Gang Chen, Hai Jin, Deqing Zou, Huazhong Univ. of Sci. & Tech.; Bingbing Zhou, University of Sydney; Weizhong Qiang, Huazhong Univ. of Sci. & Tech.

Virtualizing Modern OS-bypass Networks with Performance and Scalability

Bo Li, Institute of Computing Technology; Zhigang Huo, Panyong Zhang, Dan Meng, Chinese Academy of Sciences

RDMA-Based Job Migration Framework for MPI over InfiniBand

Xiangyong Ouyang, Sonya Marcarelli, Raghunath Rajachandrasekar, Dhabaleswar Panda, The Ohio State University

Host Side Dynamic Reconfiguration in Infiniband

Wei Lin Guay, Sven-Arne Reinemo, Olav Lysne, Tor Skeie, Simula Research Laboratory

 


Multiplexing Endpoints of HCA for Scaling MPI applications: Design and Performance Evaluation with uDAPL

Jasjit Singh, Yogeshwar Sonawane, C-DAC

 

Poster Session

19:00-21:00


Design and Evaluation of Remote Memory Disk Cache

Changgyoo Park, Shin-gyu Kim, Hyuck Han, Hyeonsang Eom, Heon Y. Yeom, Seoul National University


Power-aware, Dependable, and High-Performance Communication Link using PCI Express: PEARL

Toshihiro Hanawa, Taisuke Boku, Shin’ichi Miura, Mitsuhisa Sato, Kazutami Arimoto, University of Tsukuba

Cloud-based Synchronization of Distributed File System Hierarchies

Sandesh Uppoor, Michail D. Flouris, Angelos Bilas, FORTH-ICS

Low-latency Explicit Communication and Synchronization in Scalable Multi-core Clusters

Christoforos Kachris, George Nikiforos, Vassilis Papaefstathiou, Stamatis Kavadias, Manolis Katevenis, FORTH-ICS


Non-blocking Adaptive Cycles: Deadlock Avoidance for Fault-tolerant Interconnection Networks

Gonzalo Zarza, Diego Lugones, Daniel Franco, Emilio Luque, Universitat Autonoma Barcelona


A Multi-Pronged Approach to Benchmark Characterization

Nikola Puzovic, University of Siena; Sally McKee, Chalmers University; Revital Eres, Ayal Zaks, IBM Haifa; Paolo Gai, Evidence S.r.l.; Stephan Wong, Delft University of Technology; Roberto Giorgi, University of Siena


Early Experience of Building a Cloud Platform for Service Oriented Software Development

Hailong Sun, Xu Wang, Chao Zhou, Zicheng Huang, Xudong Liu, Beihang University


Adaptable Scheduling Schemes for Scientific Applications on Science Cloud

Seoyoung Kim, Yoonhee Kim, Sookmyung Women's University; Naeyoung Song, Chongam Kim, Seoul National University


Fault-Tolerance Mechanisms for Exascale Systems

Maria Ruiz Varela, University of Delaware; Kurt B. Ferreira, Rolf E. Riesen, Sandia National Laboratories


(Drinks and snacks will be served at the adjoining area)

 

Wednesday, September 22 2010

Plenary Session

Keynote 2 / 09:00 – 10:30 / Hermes Hall

09:00–10:30

Title: Scaling Storage into the Exascale Era

Speaker: Garth Gibson, Carnegie Mellon University and Panasas Inc.

10:30–11:00

Coffee Break

Session 6 (Chair: Daniel Katz) / 11:00–12:30 / Hermes Hall

11:00–11:30

The Impact of System Design Parameters on Application Noise Sensitivity

Kurt Ferreira, Sandia National Labs; Patrick Bridges, Univ. of New Mexico; Ron Brightwell, Kevin Pedretti, Sandia National Labs

11:30–12:00

Computing Contingency Statistics in Parallel: Design Trade-Offs and Limiting Cases

Philippe Pébay, Janine Bennett, David Thompson, Sandia National Labs

12:00–12:30

Integration Experiences and Performance Studies of A COTS Parallel Archive System

Hsing-bung (HB) Chen, Los Alamos National Lab

12:30–14:00

Lunch (Main Restaurant)


Session 7 (Chair: Roberto Gioiosa) / 14:00 – 15:30 / Hermes Hall

Session 8 (Chair: Rob Latham) / 14:00 – 15:30 / Apollon Hall

14:00–14:30



 


14:30–15:00





15:00–15:30

Enforcing SLAs in Scientific Clouds

Oliver Niehörster, André Brinkmann, Gregor Fels, Paderborn Center for Parallel Computing; Jens Krüger, Univ. of Paderborn; Jens Simon, Paderborn Center for Parallel Computing

DRM: A Dynamic Replication Management Scheme for Cloud Storage Cluster

Qingsong Wei, Data Storage Institute; Bharadwaj Veeravalli, National University of Singapore

An Efficient Process Live Migration Mechanism for Load Balanced Distributed Virtual Environments

Balazs Gerofi, Hajime Fujita, Yutaka Ishikawa, University of Tokyo

Acceleration of Streamed Tensor Contraction Expressions on GPGPU-based Clusters

Wenjing Ma, Sriram Krishnamoorthy, Oreste Villa, Karol Kowalski, Pacific Northwest National Laboratory


Efficient Parallel Subgraph Counting using G-Tries

Pedro Ribeiro, Fernando Silva, Luís Lopes, Universidade do Porto

Cluster versus GPU Implementation of an Orthogonal Target Detection Algorithm for Remotely Sensed Hyperspectral Images

Abel Paz, Antonio Plaza, University of Extremadura

15:30–16:00

Coffee Break

Conference Panel / 16:00–17:30 / Hermes Hall

16:00–17:30

Title: Implications of Exascale Computing for Storage Systems Research

Moderator: Andre Brinkmann, Univ. of Paderborn, Germany

Panelists:

19:00–22:00

Conference Beach Dinner

 

Thursday, September 23 2010

Plenary Session

Keynote 3 / 09:00 – 10:30 / Hermes Hall

09:00–10:30

Title: Image-Based Biomedical Modeling, Simulation and Visualization

Speaker: Chris Johnson, University of Utah

10:30–11:00

Coffee Break

Session 9 (Chair: Rob Ross) / 11:00–12:30 / Hermes Hall

11:00–11:30

Breaking the MapReduce stage barrier

Abhishek Verma, Nicolas Zea, Brian Cho, Indranil Gupta, Roy Campbell, University of Illinois at Urbana-Champaign

11:30–12:00

Asynchronous Algorithms in MapReduce

Karthik Shashank Kambatla, Naresh Rapolu, Suresh Jagannathan, Ananth Grama, Purdue University

12:00–12:30

Reducing Communication Overhead in Large Eddy Simulation of Jet Engine Noise

Yingchong Situ, Lixia Liu, Chandra Martha, Matthew Louis, Zhiyuan Li, Gregory Blaisdell, Anastasios Lyrintzis, Purdue University

12:30–14:00

Lunch (Main Restaurant)


Session 10 (Chair: Adolfy Hoisie) / 14:00 – 15:30 / Hermes Hall

Session 11 (Chair: Toni Cortes) / 14:00 – 15:30 / Apollon Hall

14:00–14:30




14:30–15:00






15:00–15:30

Performance Analysis of Multi-level Time Sharing Task Assignment Policies on Cluster-based Systems

Malith Jayasinghe, Zahir Tari, Panlop Zeephongsekul, RMIT Univ., Australia

A Simulation Framework to Automatically Analyze the Communication-Computation Overlap in Scientific Applications

Vladimir Subotic, Jose Carlos Sancho, Jesus Labarta, Mateo Valero, BSC

Analysis of Tasks Reallocation in a Dedicated Grid Environment

Ghislain Charrier, INRIA - LIP/ENS Lyon; Frédéric Desprez, Yves Caniou, UCBL - LIP/ENS Lyon

Replication-based Highly Available Metadata Management for Cluster File Systems

Zhuan Chen, ICT; Jin Xiong, Dan Meng, Chinese Academy of Sciences

Improving Parallel I/O Performance with Data Layout Awareness

Yong Chen, Xian-He Sun, Illinois Institute of Tech; Rajeev Thakur, ANL; Huaiming Song, Hui Jin, Illinois Institute of Technology


Optimization Techniques at I/O Forwarding Layer

Kazuki Ohta, Univ. of Tokyo; Dries Kimpe, Univ. of Chicago; Jason Cope, Kamil Iskra, Robert Ross, ANL; Yutaka Ishikawa, Univ. of Tokyo

15:30–16:00

Coffee Break

Session 12 (Industry Session) / 16:00–17:00 / Hermes Hall

16:00–16:30

Paving The Road to Exascale Computing

Gilad Shainer, Mellanox Technologies

16:30–17:00

HPC and Cluster Systems – Made in Saxony

Jörg Heydemüller, Megware

 

Friday, September 24 2010

Workshop on Interfaces and Abstractions for Scientific Data Storage (IASDS)

Session 1 / 08:30 – 10:00 / Ourania Hall

08:30–08:45

Opening Remarks

Rob Latham, Argonne National Laboratory

08:45–09:30

Invited Presentation: Block-level Virtualization aka Doing Things Below the Filesystem: Examples, Observations, and Challenges

Angelos Bilas, FORTH-ICS

09:30–10:00

Object Storage Semantics for Replicated Concurrent-Writer File Systems

Philip Carns, Robert Ross, Samuel Lang, Argonne National Laboratory

10:30–11:00

Coffee Break

Session 2 / 11:00 – 12:30 / Ourania Hall

11:00–11:30

Supporting High-Performance I/O at the Petascale: The Event Data Store for ATLAS at the LHC

Peter van Gemmeren, David Malon, Argonne National Laboratory

11:30–12:00

Comprehensive Data Infrastructure for Plant Bioinformatics

Chris Jordan, Dan Stanzione, Texas Advanced Computing Center; Doreen Ware, Christos Noutsos, Jerry Lu, Cold Spring Harbor Laboratory

12:00–12:30

H5hut: A High-Performance I/O Library for Particle-based Simulation

Mark Howison, Lawrence Berkeley National Laboratory; Andreas Adelmann, Paul Scherrer Institut; E. Wes Bethel, Lawrence Berkeley National Laboratory; Achim Gsell, Benedikt Oswald, Paul Scherrer Institut; Prabhat, Lawrence Berkeley National Laboratory

12:30–14:00

Lunch (Main Restaurant)

Session 3 / 14:00 – 15:30 / Ourania Hall

14:00–14:30

pWalrus: Towards Better Integration of Parallel File Systems into Cloud Storage

Yoshihisa Abe, Garth Gibson, Carnegie Mellon University

14:30–15:15

Invited Presentation: Title TBA

Robert Ross, Argonne National Laboratory

15:15–15:30

Closing Remarks


Tutorial on Designing High-End Computing Systems with IB and 10GigEth

08:30 – 12:30 / Kalia Hall

Dhabaleswar K. Panda, Ohio State University; Pavan Balaji, Argonne National Laboratory



Abstract: InfiniBand (IB) and 10-Gigabit Ethernet (10GE) interconnects are generating a lot of excitement towards building next generation High Performance Computing (HPC) systems and enterprise datacenters. This tutorial will provide an overview of these emerging interconnects, their offered features, their current market standing, and their suitability for prime-time HPC. It will start with a brief overview of IB, 10GE and their architectural features. An overview of the emerging OpenFabrics stack which encapsulates both IB and 10GE in a unified manner will be presented. IB and 10GE hardware/software solutions and the market trends will be highlighted. Finally, sample performance numbers highlighting the performance these technologies can achieve in different environments such as MPI, Sockets, Parallel File Systems, Multi-tier Datacenters, and Virtual Machines, will be shown.


Workshop on Application/Architecture Co-design for Extreme-scale Computing (AACEC)

Session 1 / 08:45 – 10:30 / Clio Hall

08:45–09:00

Welcome and Introductory Remarks

09:00–09:30

Invited Presentation: Bringing up Anton: Taking Co-Design into Production

Joseph Bank, D. E. Shaw Research

09:30–10:00

Invited Presentation: Green Flash: Three Problems, One Solution

David Donofrio, Lawrence Berkeley National Laboratory

10:00–10:30

Mobile-Subjective Programming for Massively Multithreaded Shared Memory Applications

Megan Vance, Peter Kogge, University of Notre Dame

10:30–11:00

Coffee Break

Session 2 / 11:00–12:30 / Clio Hall

11:00–11:30

Invited Presentation: Designing Applications, HW and SW together: adventures with 80 and 48 cores

Tim Mattson, Intel

11:30–12:00

Facilitating Co-Design for Extreme-Scale Systems Through Lightweight Simulation

Christian Engelmann, Frank Lauer, Oak Ridge National Laboratory

12:00–12:30

Invited Presentation: An Evolutionary Approach to Exascale System Software by Leveraging Co-Design Principles

Robert Wisniewski, IBM T. J. Watson Research Center

12:30–14:00

Lunch (Main Restaurant)

Session 3 / 14:00–15:30 / Clio Hall

14:00–14:30

Invited Presentation: Co-Designing MPI Library and Applications for InfiniBand Clusters

Dhabaleswar K. Panda, Ohio State University

14:30–15:00

Efficient Sparse Matrix-Matrix Multiplication on Heterogeneous High Performance Systems

Jakob Siegel, University of Delaware; Oreste Villa, Sriram Krishnamoorthy, Antonio Tumeo, Pacific Northwest National Laboratory; Xiaoming Li, University of Delaware

15:00–15:30

Confidence: Analyzing Performance With Empirical Probabilities

Bradley W. Settlemyer, Stephen W. Hodson, Jeffery A. Kuehn, Stephen W. Poole, Oak Ridge National Laboratory

15:30–16:00

Coffee Break

Session 4 / 16:00–16:45 / Clio Hall

16:00–16:30

Invited Presentation: Opportunities and Approaches for System Software in Supporting Application/Architecture Co-Design

Ron Brightwell, Sandia National Laboratories

16:30–16:45

Concluding Remarks


Tutorial on Practical Parallel Application Performance Engineering Using Innovative Tools

08:30 – 17:00 / Thalia Hall

Brian J. N. Wylie, Jülich Supercomputing Centre; Michael Gerndt, Technical University of Munich; Wolfgang Nagel, Technical University of Dresden



Abstract: This tutorial presents state-of-the-art tools for engineering performant parallel applications on computer clusters with MPI and/or OpenMP. The suite of tools developed by the Virtual Institute for High Productivity Supercomputing (VI-HPS) is introduced, including Scalasca, Vampir and Periscope. The tools support automated and manually-customizable measurement and analyses with hardware counter metrics as well as communication and synchronization overheads. A series of hands-on exercises is included, which participants are encouraged to follow on their notebook computers using a provided Live-DVD with a bootable Linux environment typical of an HPC cluster. This will offer practical experience using the tools and help prepare participants to apply modern methods for locating and diagnosing performance bottlenecks in real-world parallel applications up to the largest scales.