Abdennadher, Nabil · more Nabil Abdennadher (HES-SO) | On the Benefits of Anticipating Load Imbalance for Performance Optimization of Parallel Applications · view |
Afzal, Ayesha · more Ayesha Afzal (Friedrich-Alexander University Erlangen-Nürnberg) | Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study · view |
Ahmad, Yousuf · more Yousuf Ahmad (Carnegie Mellon University in Qatar) | Efficient Distributed Graph Analytics using Triply Compressed Sparse Format · view |
Albahar, Hadeel · more Hadeel Albahar (Virginia Tech) | Large-Scale Analysis of the Docker Hub Dataset · view |
Alonazi, Amani · more Amani Alonazi (KAUST) | Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry · view |
Amvrosiadis, George · more George Amvrosiadis (Carnegie Mellon University) | Compact Filter Structures for Fast Data Partitioning · view |
Anirudh, Rushil · more Rushil Anirudh (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Anthony, Quentin · more Quentin Anthony (The Ohio State University) | Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters · view |
Anwar, Ali · more Ali Anwar (IBM Research—Almaden) | Large-Scale Analysis of the Docker Hub Dataset · view |
Atchley, Scott · more Scott Atchley (ORNL) | Evaluating Burst Buffer Placement in HPC Systems · view |
Awan, Ammar Ahmad · more Ammar Ahmad Awan (The Ohio State University) | Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters · view |
Banerjee, Kunal · more Kunal Banerjee (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Beckman, Pete · more Pete Beckman (Argonne National Laboratory) Pete Beckman is the co-director of the Northwestern University/Argonne Institute for Science and Engineering. During the past 25 years, his research has been focused on software and architectures for large-scale parallel and distributed computing systems. For the DOE’s Exascale Computing Project, Beckman leads the Argo project focused on low-level resource management for the operating system and runtime. He is the founder and leader of the Waggle project for smart sensors and edge computing that is used by the Array of Things project. Beckman also coordinates the collaborative technical research activities in extreme-scale computing between the US Department of Energy and Japan’s ministry of education, science, and technology and helps lead the BDEC (Big Data and Extreme Computing) series of international workshops. Beckman received his PhD in computer science from Indiana University. | AI@Edge (Pete Beckman) · view |
Benson, Tom · more Tom Benson (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Bosilca, George · more George Bosilca (University of Tennessee) | Give MPI Threading a Fair Chance: A Study of Multithreaded MPI Designs · view |
Boulmier, Anthony · more Anthony Boulmier (University of Geneva) | On the Benefits of Anticipating Load Imbalance for Performance Optimization of Parallel Applications · view |
Bremer, Peer-Timo · more Peer-Timo Bremer (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Briggs, Ian · more Ian Briggs (University of Utah) | DiffTrace: Efficient Whole-Program Trace Analysis and Diffing for Debugging · view |
Burtscher, Martin · more Martin Burtscher (Texas State University) | DiffTrace: Efficient Whole-Program Trace Analysis and Diffing for Debugging · view |
Butt, Ali R. · more Ali R. Butt (Virginia Tech) | FSMonitor: Scalable File System Monitoring for Arbitrary Storage Systems · view A Quantitative Study of Deep Learning Training on Heterogeneous Supercomputers · view Large-Scale Analysis of the Docker Hub Dataset · view |
Calhoun, Jon C. · more Jon C. Calhoun (Clemson University) | Analyzing the Impact of Lossy Compressor Variability on Checkpointing Scientific Simulations (short paper) · view |
Canon, Louis-Claude · more Louis-Claude Canon (Univ. Franche Comté) | Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms · view |
Cappello, Franck · more Franck Cappello (Argonne National Laboratory) | Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation · view |
Castain, Ralph · more Ralph Castain (Intel) | MPI Sessions: Evaluation of an Implementation in Open MPI · view |
Cavelan, Aurélien · more Aurélien Cavelan (University of Basel, Switzerland) | Algorithm-Based Fault Tolerance for Parallel Stencil Computations · view |
Challa, Jagat Sesh · more Jagat Sesh Challa (Birla Institute of Technology & Science, Pilani) | MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality · view |
Chang, Huaixin · more Huaixin Chang (Alibaba) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Chard, Kyle · more Kyle Chard (University of Chicago) | FSMonitor: Scalable File System Monitoring for Arbitrary Storage Systems · view |
Chard, Ryan · more Ryan Chard (Argonne National Laboratory) | FSMonitor: Scalable File System Monitoring for Arbitrary Storage Systems · view |
Chen, Hsing-Bung · more Hsing-Bung Chen (Los Alamos National Laboratory) | Building Reliable High-Performance Storage Systems: An Empirical and Analytical Study · view |
Chen, Kang · more Kang Chen (Tsinghua University) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Chen, Yong · more Yong Chen (Texas Tech University) | RE-Store: Reliable and Efficient KV-Store with Erasure Coding and Replication · view |
Chen, Zizhong · more Zizhong Chen (UC, Riverside) | Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation · view |
Chopard, Bastien · more Bastien Chopard (University of Geneva) | On the Benefits of Anticipating Load Imbalance for Performance Optimization of Parallel Applications · view |
Chowdhury, Fahim · more Fahim Chowdhury (Florida State University) | Efficient User-Level Storage Disaggregation for Deep Learning · view |
Ciorba, Florina M. · more Florina M. Ciorba (University of Basel, Switzerland) | Algorithm-Based Fault Tolerance for Parallel Stencil Computations · view |
Cornebize, Tom · more Tom Cornebize (Université Grenoble Alpes, French Institute for Research in Computer Science and Automation (INRIA)) | Fast and Faithful Performance Prediction of MPI Applications: the HPL Case Study · view |
Cranor, Charles · more Charles Cranor (Carnegie Mellon University) | Compact Filter Structures for Fast Data Partitioning · view |
Das, Chita · more Chita Das (Penn State) | Kube-Knots: Resource Harvesting through Dynamic Container Orchestration in GPU-based Datacenters · view |
Davis, Philip E. · more Philip E. Davis (Rutgers University) | Leveraging Machine Learning for Anticipatory Data Delivery in Extreme Scale In-situ Workflows · view |
de Supinski, Bronis · more Bronis de Supinski (Lawrence Livermore National Laboratory) | Mitigating Inter-Job Interference via Process-Level Quality-of-Service (short paper) · view |
Di, Sheng · more Sheng Di (Argonne National Laboratory) | Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation · view |
Draeger, Erik W. · more Erik W. Draeger (Lawrence Livermore National Laboratory) | Multi-physics simulations of particle tracking in arterial geometries with a scalable moving window algorithm · view |
Eaton, Joe · more Joe Eaton (Nvidia) Joe Eaton is the Principal System Engineer for Data and Graph Analytics at NVIDIA. He has spent the last 6 years at NVIDIA working on applications of sparse linear algebra: CUDA Libraries, cuSOLVER, cuSPARSE, nvGRAPH and AmgX. He now works full-time on RAPIDS, developing Python APIs, cuML and cuGRAPH. RAPIDS is an end-to-end platform for data science, including IO, ETL, model training, inference and visualization. Previously he spent 18 years in Oil & Gas reservoir simulation. He is a frequent speaker at SC and GTC, and works directly with engineers and mathematicians across industries. | RAPIDS: Open Source Python Data Science with GPU Acceleration and Dask (Joe Eaton) · view |
Eberius, David · more David Eberius (University of Tennessee) | Give MPI Threading a Fair Chance: A Study of Multithreaded MPI Designs · view |
Fan, Xiaopeng · more Xiaopeng Fan (Shenzhen Institutes of Advanced Technology) | DP_Greedy: A Two-Phase Caching Algorithm for Mobile Cloud Services · view |
Faverge, Mathieu · more Mathieu Faverge (INRIA) | Leveraging Task-Based Polar Decomposition Using PARSEC on Massively Parallel Systems · view |
Foster, Ian · more Ian Foster (Argonne National Laboratory, University of Chicago) | FSMonitor: Scalable File System Monitoring for Arbitrary Storage Systems · view |
Fu, Song · more Song Fu (University of North Texas) | Building Reliable High-Performance Storage Systems: An Empirical and Analytical Study · view |
Fuerlinger, Karl · more Karl Fuerlinger (LMU Munich) | Engineering a Distributed Histogram Sort · view |
Gaffney, Jim · more Jim Gaffney (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Ganger, Gregory · more Gregory Ganger (Carnegie Mellon University) | Compact Filter Structures for Fast Data Partitioning · view |
Gao, Yiqin · more Yiqin Gao (ENS Lyon) | Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms · view |
Gavahi, Mohsen · more Mohsen Gavahi (Florida State University) | An Empirical Study of Cryptographic Libraries for MPI Communications · view |
Georganas, Evangelos · more Evangelos Georganas (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Gibson, Garth · more Garth Gibson (Carnegie Mellon University) | Compact Filter Structures for Fast Data Partitioning · view |
Gopalakrishnan, Ganesh · more Ganesh Gopalakrishnan (University of Utah) | DiffTrace: Efficient Whole-Program Trace Analysis and Diffing for Debugging · view |
Gounley, John · more John Gounley (Oak Ridge National Laboratory) | Multi-physics simulations of particle tracking in arterial geometries with a scalable moving window algorithm · view |
Goyal, Navneet · more Navneet Goyal (Birla Institute of Technology & Science, Pilani) | MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality · view |
Goyal, Poonam · more Poonam Goyal (Birla Institute of Technology & Science, Pilani) | MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality · view |
Grider, Gary · more Gary Grider (Los Alamos National Lab) | Compact Filter Structures for Fast Data Partitioning · view |
Grindeanu, Iulian · more Iulian Grindeanu (Argonne National Laboratory) | Scalable, High-Order Continuity Across Block Boundaries of Functional Approximations Computed in Parallel · view |
Gunasekaran, Jashwant Raj · more Jashwant Raj Gunasekaran (Penn State) | Kube-Knots: Resource Harvesting through Dynamic Container Orchestration in GPU-based Datacenters · view |
Gutiérrez, Samuel K. · more Samuel K. Gutiérrez (Los Alamos National Laboratory) | MPI Sessions: Evaluation of an Implementation in Open MPI · view |
Hager, Georg · more Georg Hager (Friedrich-Alexander University Erlangen-Nürnberg) | Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study · view |
Halappanavar, Mahantesh · more Mahantesh Halappanavar (Pacific Northwest National Laboratory) | Fast and Scalable Implementations of Influence Maximization Algorithms · view |
Hammoud, Mohammad · more Mohammad Hammoud (Carnegie Mellon University in Qatar) | Efficient Distributed Graph Analytics using Triply Compressed Sparse Format · view |
Han, Jingoo · more Jingoo Han (Virginia Tech) | A Quantitative Study of Deep Learning Training on Heterogeneous Supercomputers · view |
Hasanzadeh Mofrad, Mohammad · more Mohammad Hasanzadeh Mofrad (University of Pittsburgh) | Efficient Distributed Graph Analytics using Triply Compressed Sparse Format · view |
He, Shuibing · more Shuibing He (Zhejiang University) | DP_Greedy: A Two-Phase Caching Algorithm for Mobile Cloud Services · view |
Heinecke, Alexander · more Alexander Heinecke (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Heinrich, Franz Christian · more Franz Christian Heinrich (French Institute for Research in Computer Science and Automation (INRIA)) | Fast and Faithful Performance Prediction of MPI Applications: the HPL Case Study · view |
Herschlag, Gregory J. · more Gregory J. Herschlag (Duke University) | Multi-physics simulations of particle tracking in arterial geometries with a scalable moving window algorithm · view |
Hjelm, Nathan · more Nathan Hjelm (University of New Mexico) | Give MPI Threading a Fair Chance: A Study of Multithreaded MPI Designs · view |
Hoang, Viet Tung · more Viet Tung Hoang (Florida State University) | An Empirical Study of Cryptographic Libraries for MPI Communications · view |
Holmes, Daniel J. · more Daniel J. Holmes (EPCC, The University of Edinburgh) | MPI Sessions: Evaluation of an Implementation in Open MPI · view |
Huang, Dong · more Dong Huang (Shenzhen Institutes of Advanced Technology) | DP_Greedy: A Two-Phase Caching Algorithm for Mobile Cloud Services · view |
Huang, Lei · more Lei Huang (TACC) | Quantifying the Impact of Memory Errors in Deep Learning · view |
Huang, Ruizhu · more Ruizhu Huang (TACC) | Quantifying the Impact of Memory Errors in Deep Learning · view |
Hysom, David · more David Hysom (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Islam, Saiyedul · more Saiyedul Islam (Birla Institute of Technology & Science, Pilani) | MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality · view |
Jackson, Adrian · more Adrian Jackson (EPCC, The University of Edinburgh) | NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging · view |
Jacobs, Sam Ade · more Sam Ade Jacobs (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Jain, Ankush · more Ankush Jain (Carnegie Mellon University) | Compact Filter Structures for Fast Data Partitioning · view |
Jain, Arpan · more Arpan Jain (The Ohio State University) | Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters · view |
Jain, Nikhil · more Nikhil Jain (Nvidia) | Mitigating Inter-Job Interference via Process-Level Quality-of-Service (short paper) · view |
Jiang, Hai · more Hai Jiang (Arkansas State University) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Jiao, Bing · more Bing Jiao (Florida State University) | Efficient User-Level Storage Disaggregation for Deep Learning · view |
Jin, Jiahui · more Jiahui Jin (Southeast University) | MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center · view |
Jungblut, Pascal · more Pascal Jungblut (LMU Munich) | Engineering a Distributed Histogram Sort · view |
Kalamkar, Dhiraj · more Dhiraj Kalamkar (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Kalyanaraman, Ananth · more Ananth Kalyanaraman (Washington State University) | Fast and Scalable Implementations of Influence Maximization Algorithms · view |
Kandemir, Mahmut · more Mahmut Kandemir (Penn State) | Kube-Knots: Resource Harvesting through Dynamic Container Orchestration in GPU-based Datacenters · view |
Kang, Kexi · more Kexi Kang (Southeast University) | MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center · view |
Katz, Daniel S. · more Daniel S. Katz (University of Illinois) | Quantifying the Impact of Memory Errors in Deep Learning · view |
Keyes, David · more David Keyes (KAUST) | Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry · view Leveraging Task-Based Polar Decomposition Using PARSEC on Massively Parallel Systems · view |
Khandelwal, Paahuni · more Paahuni Khandelwal (Colorado State University) | STASH: Fast Hierarchical Aggregation Queries for Effective Visual Spatiotemporal Explorations · view |
Khetawat, Harsh · more Harsh Khetawat (NCSU) | Evaluating Burst Buffer Placement in HPC Systems · view |
Kowalewski, Roger · more Roger Kowalewski (LMU Munich) | Engineering a Distributed Histogram Sort · view |
Kumari, Sonal · more Sonal Kumari (Birla Institute of Technology & Science, Pilani) | MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality · view |
Legrand, Arnaud · more Arnaud Legrand (National Center for Scientific Research (CNRS), French Institute for Research in Computer Science and Automation (INRIA)) | Fast and Faithful Performance Prediction of MPI Applications: the HPL Case Study · view |
Li, Jingxuan · more Jingxuan Li (Alibaba) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Li, Sihuan · more Sihuan Li (UC, Riverside) | Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation · view |
Li, Wenxin · more Wenxin Li (Hong Kong University of Science and Technology) | MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center · view |
Li, Yuzhe · more Yuzhe Li (Institute of Information Engineering, Chinese Academy of Sciences) | RE-Store: Reliable and Efficient KV-Store with Erasure Coding and Replication · view |
Liang, Xin · more Xin Liang (UC, Riverside) | Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation · view |
Lim, Seung-Hwan · more Seung-Hwan Lim (Oak Ridge National Laboratory) | A Quantitative Study of Deep Learning Training on Heterogeneous Supercomputers · view |
Liu, Shusen · more Shusen Liu (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Liu, Yi · more Yi Liu (Beihang University) | SMQoS: Improving Utilization and Power Efficiency with QoS Awareness on GPUs (short paper) · view |
Lowenthal, David · more David Lowenthal (University of Arizona) | Mitigating Inter-Job Interference via Process-Level Quality-of-Service (short paper) · view |
Ltaief, Hatem · more Hatem Ltaief (KAUST) | Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry · view Leveraging Task-Based Polar Decomposition Using PARSEC on Massively Parallel Systems · view |
Luan, Zhongzhi · more Zhongzhi Luan (Beihang University) | SMQoS: Improving Utilization and Power Efficiency with QoS Awareness on GPUs (short paper) · view |
Luo, Junzhou · more Junzhou Luo (Southeast University) | MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center · view |
Ma, Tao · more Tao Ma (Alibaba) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Ma, Teng · more Teng Ma (Tsinghua University, Alibaba) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Mahadevan, Vijay · more Vijay Mahadevan (Argonne National Laboratory) | Scalable, High-Order Continuity Across Block Boundaries of Functional Approximations Computed in Parallel · view |
Mcclure, Ryan · more Ryan Mcclure (Pacific Northwest National Laboratory) | Fast and Scalable Implementations of Influence Maximization Algorithms · view |
McDermott, Jason · more Jason McDermott (Pacific Northwest National Laboratory) | Fast and Scalable Implementations of Influence Maximization Algorithms · view |
Melhem, Rami · more Rami Melhem (University of Pittsburgh) | Efficient Distributed Graph Analytics using Triply Compressed Sparse Format · view |
Minutoli, Marco · more Marco Minutoli (Pacific Northwest National Laboratory, Washington State University) | Fast and Scalable Implementations of Influence Maximization Algorithms · view |
Miranda, Alberto · more Alberto Miranda (Barcelona Supercomputing Center) | NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging · view |
Mitra, Saptashwa · more Saptashwa Mitra (Colorado State University) | STASH: Fast Hierarchical Aggregation Queries for Effective Visual Spatiotemporal Explorations · view |
Mohamed, Mohamed · more Mohamed Mohamed (Apple) | Large-Scale Analysis of the Docker Hub Dataset · view |
Mohror, Kathryn · more Kathryn Mohror (Lawrence Livermore National Laboratory) | Mitigating Inter-Job Interference via Process-Level Quality-of-Service (short paper) · view Efficient User-Level Storage Disaggregation for Deep Learning · view |
Moody, Adam · more Adam Moody (Lawrence Livermore National Laboratory) | Efficient User-Level Storage Disaggregation for Deep Learning · view |
Moon, Tim · more Tim Moon (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Mubarak, Misbah · more Misbah Mubarak (ANL) | Evaluating Burst Buffer Placement in HPC Systems · view |
Mueller, Frank · more Frank Mueller (NCSU) | Evaluating Burst Buffer Placement in HPC Systems · view |
Naser, Abu · more Abu Naser (Florida State University) | An Empirical Study of Cryptographic Libraries for MPI Communications · view |
Nashed, Youssef S. · more Youssef S. Nashed (Argonne National Laboratory) | Scalable, High-Order Continuity Across Block Boundaries of Functional Approximations Computed in Parallel · view |
Nicolae, Bogdan · more Bogdan Nicolae (Argonne National Laboratory) | Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation · view |
Nou, Ramon · more Ramon Nou (Barcelona Supercomputing Center) | NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging · view |
Pallickara, Sangmi · more Sangmi Lee Pallickara (Colorado State University) | Big Data Spatiotemporal Analytics - Trends, Characteristics and Applications (Sangmi Lee Pallickara) · view STASH: Fast Hierarchical Aggregation Queries for Effective Visual Spatiotemporal Explorations · view |
Pallickara, Shrideep · more Shrideep Pallickara (Colorado State University) | STASH: Fast Hierarchical Aggregation Queries for Effective Visual Spatiotemporal Explorations · view |
Panda, Dhabaleswar K. · more Dhabaleswar K. Panda (The Ohio State University) | Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters · view |
Panourgias, Iakovos · more Iakovos Panourgias (EPCC, The University of Edinburgh) | NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging · view |
Parashar, Manish · more Manish Parashar (Rutgers University) | Leveraging Machine Learning for Anticipatory Data Delivery in Extreme Scale In-situ Workflows · view |
Patinyasakdikul, Thananon · more Thananon Patinyasakdikul (University of Tennessee) | Give MPI Threading a Fair Chance: A Study of Multithreaded MPI Designs · view |
Paul, Arnab K. · more Arnab K. Paul (Virginia Tech) | FSMonitor: Scalable File System Monitoring for Arbitrary Storage Systems · view |
Peng, Bo · more Bo Peng (Indiana University) | HarpGBDT: Optimizing Gradient Boosting Decision Tree for Parallel Efficiency · view |
Peterka, Tom · more Tom Peterka (Argonne National Laboratory) | Scalable, High-Order Continuity Across Block Boundaries of Functional Approximations Computed in Parallel · view |
Peterson, Luc · more Luc Peterson (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Pritchard, Howard · more Howard Pritchard (Los Alamos National Laboratory) | MPI Sessions: Evaluation of an Implementation in Open MPI · view |
Qian, Depei · more Depei Qian (Beihang University) | SMQoS: Improving Utilization and Power Efficiency with QoS Awareness on GPUs (short paper) · view |
Qiao, Zhi · more Zhi Qiao (University of North Texas; USRC, LANL) | Building Reliable High-Performance Storage Systems: An Empirical and Analytical Study · view |
Qiu, Judy · more Judy Qiu (Indiana University) | HarpGBDT: Optimizing Gradient Boosting Decision Tree for Parallel Efficiency · view |
Rafique, M. Mustafa · more M. Mustafa Rafique (Rochester Institute of Technology) | A Quantitative Study of Deep Learning Training on Heterogeneous Supercomputers · view |
Ramapantulu, Lavanya · more Lavanya Ramapantulu (International Institute of Information Technology) | Harmony: An Approach for Geo-distributed Processing of Big-Data Applications · view |
Randles, Amanda · more Amanda Randles (Duke University) | Multi-physics simulations of particle tracking in arterial geometries with a scalable moving window algorithm · view |
Raynaud, Franck · more Franck Raynaud (University of Geneva) | On the Benefits of Anticipating Load Imbalance for Performance Optimization of Parallel Applications · view |
Reza, Tasmia · more Tasmia Reza (Clemson University) | Analyzing the Impact of Lossy Compressor Variability on Checkpointing Scientific Simulations (short paper) · view |
Robert, Yves · more Yves Robert (ENS Lyon, Univ. Tenn. Knoxville) | Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms · view |
Robinson, Peter · more Peter Robinson (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Roychowdhury, Sayan · more Sayan Roychowdhury (Duke University) | Multi-physics simulations of particle tracking in arterial geometries with a scalable moving window algorithm · view |
Rupprecht, Lukas · more Lukas Rupprecht (IBM Research—Almaden) | Large-Scale Analysis of the Docker Hub Dataset · view |
Said, Issam · more Issam Said (NVIDIA) | Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry · view |
Sarma, Aditya · more Aditya Sarma (Birla Institute of Technology & Science, Pilani) | MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality · view |
Sathanur, Arun · more Arun Sathanur (Pacific Northwest National Laboratory) | Fast and Scalable Implementations of Influence Maximization Algorithms · view |
Savoie, Lee · more Lee Savoie (University of Arizona) | Mitigating Inter-Job Interference via Process-Level Quality-of-Service (short paper) · view |
Settlemyer, Bradley · more Bradley Settlemyer (Los Alamos National Laboratory) | Building Reliable High-Performance Storage Systems: An Empirical and Analytical Study · view Compact Filter Structures for Fast Data Partitioning · view |
Sharma, Bikash · more Bikash Sharma (Facebook) | Kube-Knots: Resource Harvesting through Dynamic Container Orchestration in GPU-based Datacenters · view |
Shen, Dian · more Dian Shen (Southeast University) | MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center · view |
Skjellum, Anthony · more Anthony Skjellum (University of Tennessee at Chattanooga) | MPI Sessions: Evaluation of an Implementation in Open MPI · view |
Skourtis, Dimitrios · more Dimitrios Skourtis (IBM Research—Almaden) | Large-Scale Analysis of the Docker Hub Dataset · view |
Smorkalov, Mikhail · more Mikhail Smorkalov (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Song, Zhuo · more Zhuo Song (Alibaba) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Spears, Brian · more Brian Spears (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Sridharan, Srinivas · more Srinivas Sridharan (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Srinivasan, Sudarshan · more Sudarshan Srinivasan (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Subedi, Pradeep · more Pradeep Subedi (Rutgers University) | Leveraging Machine Learning for Anticipatory Data Delivery in Extreme Scale In-situ Workflows · view |
Subramoni, Hari · more Hari Subramoni (The Ohio State University) | Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters · view |
Sukkari, Dalal · more Dalal Sukkari (KAUST) | Leveraging Task-Based Polar Decomposition Using PARSEC on Massively Parallel Systems · view |
Sun, Qingxiao · more Qingxiao Sun (Beihang University) | SMQoS: Improving Utilization and Power Efficiency with QoS Awareness on GPUs (short paper) · view |
Taheri, Saeed · more Saeed Taheri (University of Utah) | DiffTrace: Efficient Whole-Program Trace Analysis and Diffing for Debugging · view |
Tao, Dingwen · more Dingwen Tao (The University of Alabama) | Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation · view |
Tarasov, Vasily · more Vasily Tarasov (IBM Research—Almaden) | Large-Scale Analysis of the Docker Hub Dataset · view |
Teo, Yong Meng · more Yong Meng Teo (National University of Singapore) | Harmony: An Approach for Geo-distributed Processing of Big-Data Applications · view |
Thiagaranjan, Jayaraman · more Jayaraman Thiagaranjan (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Thibault, Samuel · more Samuel Thibault (University of Bordeaux, LaBRI – INRIA Bordeaux Sud-Ouest) | Asynchronous Task-Based Execution of the Reverse Time Migration for the Oil and Gas Industry · view |
Thinakaran, Prashanth · more Prashanth Thinakaran (Penn State) | Kube-Knots: Resource Harvesting through Dynamic Container Orchestration in GPU-based Datacenters · view |
Tocci, Tommaso · more Tommaso Tocci (Barcelona Supercomputing Center) | NORNS: Extending Slurm to Support Data-Driven Workflows through Asynchronous Data Staging · view |
Triantafyllides, Pavlo D. · more Pavlo D. Triantafyllides (Clemson University) | Analyzing the Impact of Lossy Compressor Variability on Checkpointing Scientific Simulations (short paper) · view |
Tuecke, Steven · more Steven Tuecke (University of Chicago) | FSMonitor: Scalable File System Monitoring for Arbitrary Storage Systems · view |
Van Essen, Brian · more Brian Van Essen (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Vazhkudai, Sudharshan · more Sudharshan Vazhkudai (ORNL) | Evaluating Burst Buffer Placement in HPC Systems · view |
Vivien, Frédéric · more Frédéric Vivien (Inria) | Scheduling Independent Stochastic Tasks on Heterogeneous Cloud Platforms · view |
Wang, Cho-Li · more Cho-Li Wang (HKU) | FluentPS: A Parameter Server Design with Low-frequency Synchronization for Distributed Deep Learning · view |
Wang, Weiping · more Weiping Wang (Institute of Information Engineering, Chinese Academy of Sciences) | RE-Store: Reliable and Efficient KV-Store with Erasure Coding and Replication · view |
Wang, Yang · more Yang Wang (Shenzhen Institutes of Advanced Technology) | DP_Greedy: A Two-Phase Caching Algorithm for Mobile Cloud Services · view |
Wang, Zhi · more Zhi Wang (Florida State University) | An Empirical Study of Cryptographic Libraries for MPI Communications · view |
Wani, Anand · more Anand Wani (Birla Institute of Technology & Science, Pilani) | MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality · view |
Warke, Amit S. · more Amit S. Warke (IBM Research—Almaden) | Large-Scale Analysis of the Docker Hub Dataset · view |
Wellein, Gerhard · more Gerhard Wellein (Friedrich-Alexander University Erlangen-Nürnberg) | Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study · view |
Wu, Cong · more Cong Wu (Florida State University) | An Empirical Study of Cryptographic Libraries for MPI Communications · view |
Wu, Xueyu · more Xueyu Wu (HKU) | FluentPS: A Parameter Server Design with Low-frequency Synchronization for Distributed Deep Learning · view |
Wu, Yongwei · more Yongwei Wu (Tsinghua University) | X-RDMA: Effective RDMA Middleware in Large-scale Production Environments · view |
Wu, Zhiang · more Zhiang Wu (Nanjing University of Finance and Economics) | MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center · view |
Xu, Chengzhong · more Chengzhong Xu (University of Macau) | DP_Greedy: A Two-Phase Caching Algorithm for Mobile Cloud Services · view |
Xu, Cong · more Cong Xu (Intel) | Training Google Neural Machine Translation on an Intel CPU Cluster · view |
Xu, Luna · more Luna Xu (IBM Research) | A Quantitative Study of Deep Learning Training on Heterogeneous Supercomputers · view |
Xu, Weijia · more Weijia Xu (TACC) | Quantifying the Impact of Memory Errors in Deep Learning · view |
Yang, Hailong · more Hailong Yang (Beihang University) | SMQoS: Improving Utilization and Power Efficiency with QoS Awareness on GPUs (short paper) · view |
Yao, Xin · more Xin Yao (HKU) | FluentPS: A Parameter Server Design with Low-frequency Synchronization for Distributed Deep Learning · view |
Yeom, Jae Seung · more Jae Seung Yeom (Lawrence Livermore National Lab) | Parallelizing Training of Deep Generative Models on Massive Scientific Datasets · view |
Yu, Weikuan · more Weikuan Yu (Florida State University) | Efficient User-Level Storage Disaggregation for Deep Learning · view |
Yuan, Xin · more Xin Yuan (Florida State University) | An Empirical Study of Cryptographic Libraries for MPI Communications · view |
Zhang, Han · more Han Zhang (National University of Singapore) | Harmony: An Approach for Geo-distributed Processing of Big-Data Applications · view |
Zhang, Jinghui · more Jinghui Zhang (Southeast University) | MBECN: Enabling ECN with Micro-burst Traffic in Multi-queue Data Center · view |
Zhang, Zhao · more Zhao Zhang (TACC) | Quantifying the Impact of Memory Errors in Deep Learning · view |
Zhao, Nannan · more Nannan Zhao (Virginia Tech) | Large-Scale Analysis of the Docker Hub Dataset · view |
Zheng, Qing · more Qing Zheng (Carnegie Mellon University) | Compact Filter Structures for Fast Data Partitioning · view |
Zhou, Jiang · more Jiang Zhou (Institute of Information Engineering, Chinese Academy of Sciences) | RE-Store: Reliable and Efficient KV-Store with Erasure Coding and Replication · view |
Zhu, Yue · more Yue Zhu (Florida State University) | Efficient User-Level Storage Disaggregation for Deep Learning · view |
Zimmer, Christopher · more Christopher Zimmer (ORNL) | Evaluating Burst Buffer Placement in HPC Systems · view |