IEEE Cluster 2020 Program

A

Abe, Makito · more

Toward OpenACC-enabled GPU-FPGA Accelerated Computing · view

Ahmad, Zafar · more

Efficient Execution of Dynamic Programming Algorithms on Apache Spark · view

Ali, Ghazanfar · more

MonSTer: An Out-of-the-Box Monitoring Tool for High Performance Computing Systems · view

Antoniu, Gabriel · more

E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments · view

Arslan, Engin · more

Streaming File Transfer Optimization for Distributed Science Workflows · view

Ayguadé, Eduard · more

Evaluating Worksharing Tasks on Distributed Environments · view

B

Bai, Yang · more

SSP: Speeding up Small Flows for Proactive Transport in Datacenters · view

Bateman, Keith · more

HCL: Distributing Parallel Data Structures in Extreme Scales · view

Beltran, Vicenç · more

Evaluating Worksharing Tasks on Distributed Environments · view
Towards Data-Flow Parallelization for Adaptive Mesh Refinement Applications · view

Benoit, Anne · more

Resilient Scheduling of Moldable Jobs on Failure-Prone Platforms · view

Bhatele, Abhinav · more

Predicting MPI Collective Communication Performance Using Machine Learning · view

Binyahib, Roba · more

Parallel Particle Advection Bake-Off For Scientific Visualization Workloads · view

Bosilca, George · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view
Flexible Data Redistribution in a Task-Based Runtime System · view
Predicting MPI Collective Communication Performance Using Machine Learning · view

Bouteiller, Aurelien · more

Flexible Data Redistribution in a Task-Based Runtime System · view

Brinkmann, André · more

DelveFS - An event-driven semantic file system for object stores · view

Bull, Mark · more

Evaluating Worksharing Tasks on Distributed Environments · view

C

Cao, Qinglei · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view
Flexible Data Redistribution in a Task-Based Runtime System · view

Cappello, Franck · more

DeepClone: Scalable Live Migration of Deep Learning Models for Data Parallel Training · view
Towards End-to-end SDC Detection for HPC Applications Equipped with Lossy Compression · view

Casanova, Henri · more

Modeling the Performance of Scientific Workflow Executions on HPC Platforms with Burst Buffers · view

Chang, Chan-Jung · more

ECS2: A Fast Erasure Coding Library for GPU-Accelerated Storage Systems With Parallel & Direct IO · view

Chen, Jin-Kun · more

NeoMPX: Characterizing and Improving Estimation of Multiplexing Hardware Counters for PAPI · view

Chen, Wei · more

Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning · view

Chen, Yong · more

MonSTer: An Out-of-the-Box Monitoring Tool for High Performance Computing Systems · view

Chen, Zizhong · more

Towards End-to-end SDC Detection for HPC Applications Equipped with Lossy Compression · view

Cheng, Dazhao · more

Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning · view

Chien, Steven W. D. · more

tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads · view

Childs, Hank · more

Parallel Particle Advection Bake-Off For Scientific Visualization Workloads · view

Chou, Jerry · more

ECS2: A Fast Erasure Coding Library for GPU-Accelerated Storage Systems With Parallel & Direct IO · view

Chou, Yu-Ching · more

ECS2: A Fast Erasure Coding Library for GPU-Accelerated Storage Systems With Parallel & Direct IO · view

Choudhary, Alok N. · more

AI for Science (Alok N. Choudhary) · view
The Price Performance of Performance Models (Felix Wolf) · view
Fugaku: the First 'Exascale' Supercomputer (Satoshi Matsuoka) · view

Chowdhury, Rezaul · more

Efficient Execution of Dynamic Programming Algorithms on Apache Spark · view

Chu, Ching-Hsiang · more

Dynamic Kernel Fusion for Bulk Non-contiguous Data Transfer on GPU Clusters · view

Chung, I-Hsin · more

ECS2: A Fast Erasure Coding Library for GPU-Accelerated Storage Systems With Parallel & Direct IO · view

Cook, Brandon · more

Quantifying the impact of network congestion on application performance and network metrics · view

Coskun, Ayse K. · more

Quantifying the impact of network congestion on application performance and network metrics · view

Costan, Alexandru · more

E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments · view

D

Da Silva, Rafael Ferreira · more

Modeling the Performance of Scientific Workflow Executions on HPC Platforms with Burst Buffers · view

Dang, Tommy · more

MonSTer: An Out-of-the-Box Monitoring Tool for High Performance Computing Systems · view

Davis, Philip E. · more

A Staging Based Task Execution Framework for Data-driven Scientific Workflows · view

Deelman, Ewa · more

Modeling the Performance of Scientific Workflow Executions on HPC Platforms with Burst Buffers · view

Devarajan, Hariharan · more

HCL: Distributing Parallel Data Structures in Extreme Scales · view

Di, Sheng · more

Towards End-to-end SDC Detection for HPC Applications Equipped with Lossy Compression · view

Dong, Dezun · more

SSP: Speeding up Small Flows for Proactive Transport in Datacenters · view

Dongarra, Jack · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view
Flexible Data Redistribution in a Task-Based Runtime System · view

Dorier, Matthieu · more

DeepClone: Scalable Live Migration of Deep Learning Models for Data Parallel Training · view
A Staging Based Task Execution Framework for Data-driven Scientific Workflows · view

E

Enes, Jonatan · more

Power Budgeting of Big Data Applications in Container-based Clusters · view

Expósito, Roberto Rey · more

Power Budgeting of Big Data Applications in Container-based Clusters · view

F

Fieni, Guillaume · more

Power Budgeting of Big Data Applications in Container-based Clusters · view

Fujisawa, Katsuki · more

Performance Evaluation of Supercomputer Fugaku using Breadth-First Search Benchmark in Graph500 · view

Fujita, Norihisa · more

Toward OpenACC-enabled GPU-FPGA Accelerated Computing · view

G

Gong, Lei · more

OctCNN: An Energy-Efficient FPGA Accelerator for CNNs using Octave Convolution Algorithm · view

Groves, Taylor · more

Quantifying the impact of network congestion on application performance and network metrics · view

H

Hanawa, Toshihiro · more

Analysis of Cooling Water Temperature Impact on Computing Performance and Energy Consumption · view

Harrison, Robert · more

Efficient Execution of Dynamic Programming Algorithms on Apache Spark · view

Hass, Jon · more

MonSTer: An Out-of-the-Box Monitoring Tool for High Performance Computing Systems · view

Hatta, Kazuma · more

ChOWDER: A New Approach for Viewing 3D Web GIS on Ultra-High-Resolution Scalable Display · view

Hegeman, Tim · more

Grade10: A Framework for Performance Characterization of Distributed Graph Processing · view

Huan, Shan · more

SSP: Speeding up Small Flows for Proactive Transport in Datacenters · view

Hunold, Sascha · more

Efficient Process-to-Node Mapping Algorithms for Stencil Computations · view
Predicting MPI Collective Communication Performance Using Machine Learning · view
Decomposing MPI Collectives for Exploiting Multi-lane Communication · view

Huthmann, Jens · more

Extending High-Level Synthesis with High-Performance Computing Performance Visualization · view

I

Imamura, Toshiyuki · more

Prompt report on Exa-scale HPL-AI benchmark · view
An FPGA-based Sound Field Rendering System · view

Ina, Takuya · more

Prompt report on Exa-scale HPL-AI benchmark · view

Iosup, Alexandru · more

Grade10: A Framework for Performance Characterization of Distributed Graph Processing · view

J

Jansson, Niclas · more

A Hybrid MPI+PGAS Approach to Improve Strong Scalability Limits of Finite Element Solvers · view

Javanmard, Mohammad Mahdi · more

Efficient Execution of Dynamic Programming Algorithms on Apache Spark · view

K

K G, Renga Bashyam · more

Fast Scalable Approximate Nearest Neighbor Search for High-dimensional Data · view

Kang, Ji-Hoon · more

An HPC-based Prediction on the Practicality of Long-distance Quantum Key Distributions · view

Kawanabe, Tomohiro · more

ChOWDER: A New Approach for Viewing 3D Web GIS on Ultra-High-Resolution Scalable Display · view

Kenny, Joseph · more

Opportunities and limitations of Quality-of-Service in Message Passing applications on adaptively routed Dragonfly and Fat Tree networks · view

Kim, Sejin · more

Co-scheML: Interference-aware Container Co-scheduling Scheme using Machine Learning Application Profiles for GPU Clusters · view

Kim, Yoonhee · more

Co-scheML: Interference-aware Container Co-scheduling Scheme using Machine Learning Application Profiles for GPU Clusters · view

Knees, Peter · more

Predicting MPI Collective Communication Performance Using Machine Learning · view

Kobayashi, Ryohei · more

Toward OpenACC-enabled GPU-FPGA Accelerated Computing · view

Koch, Andreas · more

Extending High-Level Synthesis with High-Performance Computing Performance Visualization · view

Kodama, Yuetsu · more

Performance Evaluation of Supercomputer Fugaku using Breadth-First Search Benchmark in Graph500 · view

Kougkas, Anthony · more

HCL: Distributing Parallel Data Structures in Extreme Scales · view

Kremer-Herman, Nathaniel · more

Autoscaling High-Throughput Workloads on Container Orchestrators · view

Kudo, Shuhei · more

Prompt report on Exa-scale HPL-AI benchmark · view

Kwon, Minseok · more

CuVPP: Filter-based Longest Prefix Matching in Software Data Planes · view

L

Le Fèvre, Valentin · more

Resilient Scheduling of Moldable Jobs on Failure-Prone Platforms · view

Lehr, Markus · more

Efficient Process-to-Node Mapping Algorithms for Stencil Computations · view

Li, Dong · more

Exploring Non-Volatility of Non-Volatile Memory for High Performance Computing Under Failures · view

Li, Jie · more

MonSTer: An Out-of-the-Box Monitoring Tool for High Performance Computing Systems · view

Li, Sihuan · more

Towards End-to-end SDC Detection for HPC Applications Equipped with Lossy Compression · view

Li, Yun · more

Estimating Power Consumption of Containers and Virtual Machines in Data Centers · view

Liang, Xin · more

Towards End-to-end SDC Detection for HPC Applications Equipped with Lossy Compression · view

Liao, Qing · more

Exploring the Potential of Fast Delta Encoding: Marching to a Higher Compression Ratio · view

Liao, Xiangke · more

SSP: Speeding up Small Flows for Proactive Transport in Datacenters · view

Lin, James · more

NeoMPX: Characterizing and Improving Estimation of Multiplexing Hardware Counters for PAPI · view

Liu, Zheng · more

Estimating Power Consumption of Containers and Virtual Machines in Data Centers · view

Lou, Wenqi · more

OctCNN: An Energy-Efficient FPGA Accelerator for CNNs using Octave Convolution Algorithm · view

Lu, Gangzhao · more

Optimizing GPU Memory Transactions for Convolution Operations · view

Luo, Xi · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view

M

Magoutis, Kostas · more

The Case for Better Integrating Scalable Data Stores and Stream-Processing Systems · view

Markidis, Stefano · more

tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads · view

Maroñas, Marcos · more

Evaluating Worksharing Tasks on Distributed Environments · view

Marshall, John · more

CuVPP: Filter-based Longest Prefix Matching in Software Data Planes · view

N

Nakao, Masahiro · more

Performance Evaluation of Supercomputer Fugaku using Breadth-First Search Benchmark in Graph500 · view

Neupane, Krishna Prasad · more

CuVPP: Filter-based Longest Prefix Matching in Software Data Planes · view

Nguyen, Ngan · more

MonSTer: An Out-of-the-Box Monitoring Tool for High Performance Computing Systems · view

Nicolae, Bogdan · more

DeepClone: Scalable Live Migration of Deep Learning Models for Data Parallel Training · view

Nitadori, Keigo · more

Prompt report on Exa-scale HPL-AI benchmark · view

Nonaka, Jorji · more

Analysis of Cooling Water Temperature Impact on Computing Performance and Energy Consumption · view

O

Ohmura, Itta · more

Implementing a Comprehensive Networks-on-Chip Generator with Optimal Configurations · view

Ono, Kenji · more

ChOWDER: A New Approach for Viewing 3D Web GIS on Ultra-High-Resolution Scalable Display · view

P

Panda, Dhabaleswar K. · more

Dynamic Kernel Fusion for Bulk Non-contiguous Data Transfer on GPU Clusters · view

Papaioannou, Antonis · more

The Case for Better Integrating Scalable Data Stores and Stream-Processing Systems · view

Parashar, Manish · more

A Staging Based Task Execution Framework for Data-driven Scientific Workflows · view

Patinyasakdikul, Thananon · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view

Pei, Yu · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view

Peng, Ivy B. · more

tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads · view

Perotin, Lucas · more

Resilient Scheduling of Moldable Jobs on Failure-Prone Platforms · view

Podobas, Artur · more

tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads · view

Posner, Jonas · more

System-Level vs. Application-Level Checkpointing · view

Pottier, Loic · more

Modeling the Performance of Scientific Workflow Executions on HPC Platforms with Burst Buffers · view

Pouchet, Louis-Noël · more

Efficient Execution of Dynamic Programming Algorithms on Apache Spark · view

Pugmire, David · more

Parallel Particle Advection Bake-Off For Scientific Visualization Workloads · view

R

Rafique, M. Mustafa · more

CuVPP: Filter-based Longest Prefix Matching in Software Data Planes · view

Raghavan, Padma · more

Resilient Scheduling of Moldable Jobs on Failure-Prone Platforms · view

Rang, Wei · more

Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning · view

Ren, Jie · more

Exploring Non-Volatility of Non-Volatile Memory for High Performance Computing Under Failures · view

Rico, Alejandro · more

Towards Data-Flow Parallelization for Adaptive Mesh Refinement Applications · view

Robert, Yves · more

Resilient Scheduling of Moldable Jobs on Failure-Prone Platforms · view

Rosendo, Daniel · more

E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments · view

Rouvoy, Romain · more

Power Budgeting of Big Data Applications in Container-based Clusters · view

Ryu, Hoon · more

An HPC-based Prediction on the Practicality of Long-distance Quantum Key Distributions · view

S

Sala, Kevin · more

Towards Data-Flow Parallelization for Adaptive Mesh Refinement Applications · view

Salkhordeh, Reza · more

DelveFS - An event-driven semantic file system for object stores · view

Sano, Kentaro · more

Profiling and Visualizing Performance of FPGAs in High-Performance Computing Environments · view

Sato, Mitsuhisa · more

Performance Evaluation of Supercomputer Fugaku using Breadth-First Search Benchmark in Graph500 · view

Schulz, Christian · more

Efficient Process-to-Node Mapping Algorithms for Stencil Computations · view

Shaffer, Tim · more

Autoscaling High-Throughput Workloads on Container Orchestrators · view

Shafie Khorassani, Kawthar · more

Dynamic Kernel Fusion for Bulk Non-contiguous Data Transfer on GPU Clusters · view

Shen, Ziyu · more

Estimating Power Consumption of Containers and Virtual Machines in Data Centers · view

Shoji, Fumiyoshi · more

Analysis of Cooling Water Temperature Impact on Computing Performance and Energy Consumption · view

Sill, Alan · more

MonSTer: An Out-of-the-Box Monitoring Tool for High Performance Computing Systems · view

Silva, Pedro · more

E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments · view

Simonin, Matthieu · more

E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments · view

Smigielski, Jean-François · more

DelveFS - An event-driven semantic file system for object stores · view

Sommer, Lukas · more

Profiling and Visualizing Performance of FPGAs in High-Performance Computing Environments · view

Steiner, Rebecca · more

DelveFS - An event-driven semantic file system for object stores · view

Steinkamp, Jörg · more

DelveFS - An event-driven semantic file system for object stores · view

Su, Xiao-Ming · more

NeoMPX: Characterizing and Improving Estimation of Multiplexing Hardware Counters for PAPI · view

Subedi, Pradeep · more

A Staging Based Task Execution Framework for Data-driven Scientific Workflows · view

Subramoni, Hari · more

Dynamic Kernel Fusion for Bulk Non-contiguous Data Transfer on GPU Clusters · view

Sun, Hongyang · more

Resilient Scheduling of Moldable Jobs on Failure-Prone Platforms · view

Sun, Xian-He · more

HCL: Distributing Parallel Data Structures in Extreme Scales · view

Suo, Kun · more

Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning · view

T

Taiji, Makoto · more

Implementing a Comprehensive Networks-on-Chip Generator with Optimal Configurations · view

Tan, Haoliang · more

Exploring the Potential of Fast Delta Encoding: Marching to a Higher Compression Ratio · view

Tan, Yiyu · more

An FPGA-based Sound Field Rendering System · view

Teruel, Xavier · more

Evaluating Worksharing Tasks on Distributed Environments · view

Thain, Douglas · more

Autoscaling High-Throughput Workloads on Container Orchestrators · view

Touriño, Juan · more

Power Budgeting of Big Data Applications in Container-based Clusters · view

Träff, Jesper Larsson · more

Efficient Process-to-Node Mapping Algorithms for Stencil Computations · view
Decomposing MPI Collectives for Exploiting Multi-lane Communication · view

Trivedi, Animesh · more

Grade10: A Framework for Performance Characterization of Distributed Graph Processing · view

U

Ucar, Davut · more

Streaming File Transfer Optimization for Distributed Science Workflows · view

Ueno, Koji · more

Performance Evaluation of Supercomputer Fugaku using Breadth-First Search Benchmark in Graph500 · view

Umemura, Masayuki · more

Toward OpenACC-enabled GPU-FPGA Accelerated Computing · view

V

Vadhiyar, Sathish · more

Fast Scalable Approximate Nearest Neighbor Search for High-dimensional Data · view

Vef, Marc-André · more

DelveFS - An event-driven semantic file system for object stores · view

Vennetier, Florent · more

DelveFS - An event-driven semantic file system for object stores · view

von Kirchbach, Konrad · more

Efficient Process-to-Node Mapping Algorithms for Stencil Computations · view

W

Wang, Chao · more

OctCNN: An Energy-Efficient FPGA Accelerator for CNNs using Octave Convolution Algorithm · view

Wang, Jie · more

NeoMPX: Characterizing and Improving Estimation of Multiplexing Hardware Counters for PAPI · view

Wang, Yi-Chao · more

NeoMPX: Characterizing and Improving Estimation of Multiplexing Hardware Counters for PAPI · view

Wang, Zhe · more

A Staging Based Task Execution Framework for Data-driven Scientific Workflows · view

Wang, Zheng · more

Optimizing GPU Memory Transactions for Convolution Operations · view

Wilke, Jeremiah · more

Opportunities and limitations of Quality-of-Service in Message Passing applications on adaptively routed Dragonfly and Fat Tree networks · view

Wozniak, Justin · more

DeepClone: Scalable Live Migration of Deep Learning Models for Data Parallel Training · view

Wright, Nicholas · more

Quantifying the impact of network congestion on application performance and network metrics · view

Wu, Kai · more

Exploring Non-Volatility of Non-Volatile Memory for High Performance Computing Under Failures · view

Wu, Wei · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view
Flexible Data Redistribution in a Task-Based Runtime System · view

X

Xia, Bin · more

Estimating Power Consumption of Containers and Virtual Machines in Data Centers · view

Xia, Wen · more

Exploring the Potential of Fast Delta Encoding: Marching to a Higher Compression Ratio · view

Y

Yamaguchi, Yoshiki · more

Toward OpenACC-enabled GPU-FPGA Accelerated Computing · view

Yang, Donglin · more

Data Life Aware Model Updating Strategy for Stream-based Online Deep Learning · view

Yenpure, Abhishek · more

Parallel Particle Advection Bake-Off For Scientific Visualization Workloads · view

Yoshikawa, Kohji · more

Toward OpenACC-enabled GPU-FPGA Accelerated Computing · view

Z

Zeginis, Chrysostomos · more

The Case for Better Integrating Scalable Data Stores and Stream-Processing Systems · view

Zhang, Hao · more

Implementing a Comprehensive Networks-on-Chip Generator with Optimal Configurations · view

Zhang, Weizhe · more

Optimizing GPU Memory Transactions for Convolution Operations · view

Zhang, Xusheng · more

Estimating Power Consumption of Containers and Virtual Machines in Data Centers · view

Zhang, Yijia · more

Quantifying the impact of network congestion on application performance and network metrics · view

Zhang, Zhiyuan · more

Exploring the Potential of Fast Delta Encoding: Marching to a Higher Compression Ratio · view

Zhao, Kai · more

Towards End-to-end SDC Detection for HPC Applications Equipped with Lossy Compression · view

Zheng, Chao · more

Autoscaling High-Throughput Workloads on Container Orchestrators · view

Zhong, Dong · more

HAN: a Hierarchical AutotuNed Collective Communication Framework · view
Flexible Data Redistribution in a Task-Based Runtime System · view

Zhou, Qinghua · more

Dynamic Kernel Fusion for Bulk Non-contiguous Data Transfer on GPU Clusters · view

Zhou, Xuehai · more

OctCNN: An Energy-Efficient FPGA Accelerator for CNNs using Octave Convolution Algorithm · view

Zhou, Zejia · more

SSP: Speeding up Small Flows for Proactive Transport in Datacenters · view

Zola, Jaroslaw · more

Efficient Execution of Dynamic Programming Algorithms on Apache Spark · view

Zou, Xiangyu · more

Exploring the Potential of Fast Delta Encoding: Marching to a Higher Compression Ratio · view

Zuo, Si-Cheng · more

NeoMPX: Characterizing and Improving Estimation of Multiplexing Hardware Counters for PAPI · view
