Cluster 2002 - Tutorials


September 23, 2002
Two Morning Tutorials
8:00 am - 12:00

"Cluster Computing: The OSCAR Approach" by Stephen Scott (ORNL)

"Clustered Data Acquisition" by Johannes Gutleber (CERN)

Lunch (on your own) 12 noon - 1:30 pm

Two Afternoon Tutorials
1:30 pm - 5:30 pm

"InfiniBand Architecture: Where Is It Headed and What Will Be the Impact on Cluster Computing?" by D. K. Panda (Ohio State)

"Supporting IPv6 on Linux Clusters" by Ibrahim Haddad (Open Systems Lab)

Short Descriptions

"Cluster Computing: The OSCAR Approach" by Stephen Scott

With over 40,000 downloads since June 2001, OSCAR (Open Source Cluster Application Resources) has become a standard in Beowulf cluster computing. The OSCAR project was initiated in early 2000 as an industry / academic / research institution partnership with the goal of streamlining the software construction phase of building Beowulf-style clusters using "current best practices." Since that time, OSCAR has grown to encompass more than the initial load and configuration of Beowulf cluster software. The OSCAR suite now contains numerous options, including cluster maintenance operations such as adding/deleting nodes, adding/deleting software packages, and cluster administration.

This will be the most comprehensive OSCAR tutorial to date, covering OSCAR from its design philosophy through its installation, administration, application use, and how to configure your cluster software to be an OSCAR "contrib" package. Each of the major components presently included with OSCAR will be covered from both the cluster administrator and the cluster user viewpoint.


"Clustered Data Acquisition" by Johannes Gutleber

Data acquisition (DAQ) systems read values from detectors and sensors that digitize the world around us and process these data on-line. Such systems can be found in a variety of industrial and scientific areas, such as air traffic control or nuclear fusion testbeds. CERN's quest for fundamental laws of physics requires completely novel approaches to DAQ in order to process the amounts of data that future experiments will deliver.

This tutorial presents a next generation DAQ architecture that has been developed in response to the requirements of the upcoming Large Hadron Collider experiments. These include the concurrent use of multiple networks, direct device manipulation, interfaces to third party SCADA systems, cluster partitioning and, last but not least, the demand for efficiency close to that of the underlying communication and processing hardware.

Copy of the tutorial:
PDF file (24 MB)
PowerPoint file (22 MB)

Links to the project:
The Compact Muon Solenoid Experiment
The XDAQ DAQ software infrastructure


"InfiniBand Architecture: Where Is It Headed and What Will Be the Impact on Cluster Computing?" by D. K. Panda (Ohio State)

The emerging InfiniBand Architecture (IBA) standard is generating a lot of excitement about building next generation high performance computing systems in a radically different manner. This raises the following common questions among the scientists, engineers, managers, developers, and users associated with cluster computing:

  • What is InfiniBand Architecture?
  • How is it different from other on-going developments and standardization efforts such as Virtual Interface Architecture (VIA), PCI-X, Gigabit Ethernet, RapidIO, HyperTransport, 3GIO, and TCP off-load engines?
  • How does it perform compared to other contemporary interconnects (Myrinet, Gigabit Ethernet, and GigaNet)?
  • What unique features and benefits does IBA bring to designing next generation cluster computing systems?

This tutorial is designed to provide answers to the above questions. We will start with the background behind the origin of the IBA standard. Then we will familiarize attendees with the novel features of IBA (such as provision for multiple transport services and mechanisms to support QoS and protection in the network; uniform treatment of interprocessor communication and I/O; hardware support for remote DMA, atomic, and multicast operations; support for virtual lanes and service levels; and support for low latency communication with Virtual Interface). We will compare and contrast the IBA standard with other on-going developments/standards. We will show how the IBA standard enables next generation computing systems to be designed to deliver not only high performance but also RAS (Reliability, Availability, and Serviceability).

Open research challenges in designing communication and I/O subsystems of next generation HPC systems with IBA will be outlined. Challenges in developing efficient programming model layers (Message Passing Interface (MPI), Distributed Shared Memory (DSM), and Get/Put) on top of IBA-based communication subsystems will be discussed. Performance numbers obtained on clusters with first generation InfiniBand products and their comparisons with other contemporary interconnects (Myrinet, Gigabit Ethernet, and GigaNet) will be presented. The tutorial will conclude with an overview of on-going IBA related research projects, IBA products, and the market time frame for the IBA products.


"Supporting IPv6 on Linux Clusters" by Ibrahim Haddad

The interest in clustering from the telecom world comes from the fact that we can address availability and scalable performance using cost-effective hardware and software while maintaining near telecom-grade characteristics. These characteristics include linear scalability, continuous service availability, high reliability, superior performance, and ease and completeness of management. However, in addition to these requirements, telecom-grade clustered systems must now support the new Internet protocol, IPv6, which has become a mandatory requirement. The current version of the IP protocol, IPv4, has proved to be robust, easily implemented, and interoperable, and, with mechanisms such as network address translation (NAT), has stood the test of scaling to the size of today's Internet. However, it is beginning to have problems, since its initial design did not take into consideration several issues that are very important today, such as a large address space, mobility, security, auto-configuration, and quality of service. To address these concerns, the IETF has developed a suite of protocols and standards known as IPv6, which incorporates many of the concepts and proposed methods for updating IPv4. IPv6 not only fixes a number of IPv4 problems but also adds many improvements. Some of the IPv6 features include a new header format, a larger address space, an efficient and hierarchical addressing and routing infrastructure, built-in security, better support for mobility, and a new protocol for neighboring node interaction.

This tutorial addresses the challenges in supporting IPv6 on near telecom-grade Linux clusters. It provides an excellent opportunity for designers and engineers working on clustering technologies to get acquainted with IPv6 and follow a tutorial that addresses in detail the design and implementation issues faced when building Linux clusters that support IPv6 at the operating system level, network level, and applications level.
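
As a rough illustration of the "larger address space" mentioned in the description above, the following minimal Python sketch compares the sizes of the IPv4 and IPv6 address spaces and shows how application code might tell the two address families apart. It is not taken from the tutorial material; the helper name `address_family` and the two node addresses are hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# The whole address space of each protocol, written as a prefix-length-0 network.
ipv4_space = ipaddress.ip_network("0.0.0.0/0")
ipv6_space = ipaddress.ip_network("::/0")

ipv4_total = ipv4_space.num_addresses  # 32-bit addresses
ipv6_total = ipv6_space.num_addresses  # 128-bit addresses


def address_family(text: str) -> int:
    """Return 4 or 6 for a textual IP address (hypothetical helper).

    Raises ValueError if the text is not a valid IPv4 or IPv6 address.
    """
    return ipaddress.ip_address(text).version


# Hypothetical cluster node addresses from documentation-only ranges.
node_v4 = "192.0.2.10"
node_v6 = "2001:db8::10"

print("IPv4 addresses:", ipv4_total)
print("IPv6 addresses:", ipv6_total)
print(node_v4, "is IPv" + str(address_family(node_v4)))
print(node_v6, "is IPv" + str(address_family(node_v6)))
```

Code that accepts either family in this way is one small part of application-level IPv6 support; the operating system and network levels named above involve separate configuration that this sketch does not cover.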

