This is an archived page of the 2003 conference

Abstracts

Applications Track, Systems Track, Bioinformatics, Automotive & Aerospace Engineering, Digital Content Creation/Scientific Visualization/Simulation, Cluster Solutions, and Petroleum/Geophysical Exploration Abstracts


Last updated: 21 July 2003




Applications

Title

 

Large Scale Parallel Reservoir Simulations on a Linux PC Cluster

Author(s)

 

Walid A. Habiballah and M. Ehtesham Hayder

Author Inst

 

Petroleum Engineering Application Services Department, Saudi Aramco

Presenter

 

M. Ehtesham Hayder

Abstract


Numerical simulation is an important tool used by engineers to develop production strategies and enhance hydrocarbon recovery from reservoirs. Demand for large-scale reservoir simulations is increasing as engineers want to study larger and more complex models. In this study, we evaluate a state-of-the-art PC cluster and available software tools for production simulations of large reservoir models. We discuss some of our findings and issues related to large-scale parallel reservoir simulations and present performance comparisons between a Pentium 4 Linux PC cluster and an IBM SP Nighthawk supercomputer.

 

 

 

Title

 

Scalable Performance of FLUENT on NCSA IA-32 Linux Cluster

Author(s)

 

Wai Yip Kwok

Author Inst

 

National Center for Supercomputing Applications (NCSA)

Presenter

 

Wai Yip Kwok

Abstract


FLUENT, a leading industrial computational fluid dynamics (CFD) package, has been ported to the NCSA IA-32 Linux cluster. For this study, the scalable performance of FLUENT is benchmarked with two engineering problems from Caterpillar, Inc. and Fluent, Inc., using a maximum of 64 processors to accommodate up to 10 million cells. This session will outline the impact of different interconnects on simulation performance. Using Myrinet interconnects, the Linux cluster computes more than 2.5 times faster than an SGI Origin2000 supercomputer at NCSA. A performance increase of seven times is observed when 32 processors are used instead of two.
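
(A back-of-the-envelope reading of the figures quoted above, not a result reported by the author:) going from 2 to 32 processors is a 16-fold increase in resources, so a seven-fold speedup corresponds to a relative parallel efficiency of

    E = \frac{S}{p_2 / p_1} = \frac{7}{32 / 2} = \frac{7}{16} \approx 0.44,

i.e. roughly 44% of ideal scaling over that range.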

 

 

 

Title

 

Moore's Law and Cluster Computing: When Moore Is Not Enough

Author(s)

 

Greg Lindahl

Author Inst

 

Key Research, Inc.

Presenter

 

Greg Lindahl

Abstract


Linux cluster builders have become accustomed to continuous improvement of cluster building blocks: each year, CPUs get faster, disks get bigger, memory bandwidth rises and networks get cheaper and faster. These improvements are often seen as the inevitable march of progress, driven by the commodity market and Moore’s Law. This session will revisit Moore’s famous law in detail to determine if it adequately predicts an environment ripe for commodity cluster computing.
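
For reference in that discussion, Moore's observation is usually written as exponential growth in transistor count,

    N(t) = N_0 \cdot 2^{t/T},

where the doubling period T is commonly quoted at 18 to 24 months (Moore's own 1975 revision was two years). With T = 2 years, a decade multiplies transistor counts by 2^{10/2} = 32; whether the other cluster building blocks named above track anything like this curve is precisely the question the session takes up.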

 

 

 

Title

 

Cooperative Caching in Linux Clusters

Author(s)

 

Ying Xu and Brett Fleisch

Author Inst

 

University of California Riverside

Presenter

 

Ying Xu

Abstract


Operating systems used in most Linux clusters manage memory only locally, without cooperating with other nodes in the system. This can create states in which one node is short of memory while idle memory on other nodes goes to waste. This session addresses the problem of improving the cluster operating system to support cluster-wide memory as a globally distributed resource. Presented will be a description of a cooperative caching scheme for caching files in cluster-wide memory, along with the corresponding changes to Linux kernel memory management needed to support it.
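
A minimal user-level sketch of the idea (the actual work modifies Linux kernel memory management; all names here are hypothetical, illustrative stand-ins): a read first checks the local cache, then asks a global directory whether a peer node holds the block in idle memory, and only then falls back to disk.

    #include <cstdio>
    #include <map>
    #include <string>
    #include <utility>

    using BlockId = std::pair<std::string, long>;   // (file, block number)

    struct Node {
        int id = 0;
        std::map<BlockId, std::string> cache;   // blocks held in this node's RAM
    };

    std::map<BlockId, int> directory;           // block -> node that caches it

    std::string read_from_disk(const BlockId& b) {
        std::printf("  disk read: %s block %ld\n", b.first.c_str(), b.second);
        return "data";
    }

    std::string read_block(Node& self, std::map<int, Node>& nodes, BlockId b) {
        auto hit = self.cache.find(b);
        if (hit != self.cache.end()) return hit->second;      // local cache hit
        auto dir = directory.find(b);
        if (dir != directory.end()) {                         // remote memory hit
            std::printf("  fetched block from node %d's memory\n", dir->second);
            return nodes[dir->second].cache[b];               // network transfer
        }
        std::string data = read_from_disk(b);                 // global miss
        self.cache[b] = data;
        directory[b] = self.id;                               // publish the copy
        return data;
    }

    int main() {
        std::map<int, Node> nodes;
        nodes[0].id = 0;
        nodes[1].id = 1;
        read_block(nodes[0], nodes, {"genome.db", 7});  // miss: goes to disk
        read_block(nodes[1], nodes, {"genome.db", 7});  // hit in node 0's RAM
    }

The point of the scheme is visible in the second call: the block is served from a peer's idle memory over the network instead of from disk.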

 

 

 

Title

 

Object Storage: Scalable Bandwidth for HPC Clusters

Author(s)

 

G. Gibson, B. Welch, and D. Nagle

Author Inst

 

Panasas Inc.

Presenter

 

Garth Gibson

Abstract


This session describes the Object Storage Architecture solution for cost-effective, high bandwidth storage in HPC environments. It addresses the unique problems of storage intensive computations in very large clusters, suggesting that a shared file system with out-of-band metadata management is needed to achieve the required bandwidth. The session further argues that for excellent data reliability, storage protection needs to be supported on the data path and it recommends the higher-level semantics of object-based, rather than block-based, storage for scalable performance, data reliability and efficient sharing.

 

 

 

Title

 

Analyzing Cluster Log Files Using Logsurfer

Author(s)

 

James Prewett

Author Inst

 

University of New Mexico

Presenter

 

James Prewett

Abstract


Logsurfer is a log analysis tool that simplifies maintaining a cluster by aiding the identification and resolution of system issues. This session will outline several examples of using Logsurfer in a cluster environment, ranging from finding the traces of a complex exploitation of a service to determining which of a set of nodes have problems rebooting. Attendees will learn to configure Logsurfer to meet the particular needs of their environment.

 

 

 

Title

 

Performance Evaluation of Load Sharing Policies with PANTS on a Beowulf Cluster

Author(s)

 

James Nichols and Mark Claypool

Author Inst

 

Worcester Polytechnic Institute

Presenter

 

James Nichols

Abstract


Powerful, low-cost clusters of personal computers, such as Beowulf clusters, have fueled the potential for widespread distributed computation. While these Beowulf clusters typically have software that facilitates development of distributed applications, there is still a need for effective distributed computation that is transparent to the application programmer.

 

 

 

Title

 

On the Numeric Efficiency of C++ Packages in Scientific Computing

Author(s)

 

Ulisses Mello and Ildar Khabibrakhmanov

Author Inst

 

T.J. Watson Research Center

Presenter

 

Ulisses Mello

Abstract


Object-Oriented Programming (OOP) has proven to be a useful paradigm for programming complex models. In spite of recent interest in expressing OOP paradigms in languages such as FORTRAN90, C++ is the dominant OO language in scientific computing, despite its complexity. Barton & Nackman advocated C++ as a replacement for FORTRAN in engineering and scientific computing due to its availability, portability, efficiency, correctness and generality. These authors used OOP to reorganize LAPACK (Linear Algebra PACKage), grouping and wrapping over 250 FORTRAN routines into a much smaller set of classes that expressed the common structure of LAPACK.
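
A minimal sketch of that wrapping style (an illustration in the Barton & Nackman spirit, not code taken from their book; the class names are hypothetical): a small class hides the raw FORTRAN calling convention of the standard LAPACK solver dgesv behind a C++ interface.

    // dgesv solves A x = b; dgesv_ is the standard LAPACK Fortran symbol.
    #include <stdexcept>
    #include <vector>

    extern "C" void dgesv_(const int* n, const int* nrhs, double* a,
                           const int* lda, int* ipiv, double* b,
                           const int* ldb, int* info);

    class SquareMatrix {
    public:
        explicit SquareMatrix(int n) : n_(n), a_(n * n, 0.0) {}

        // Column-major storage, as FORTRAN expects.
        double& operator()(int i, int j) { return a_[j * n_ + i]; }

        // Solve A x = b in place; b holds x on return, and A is
        // overwritten by its LU factors, as the routine specifies.
        void solve(std::vector<double>& b) {
            int nrhs = 1, info = 0;
            std::vector<int> ipiv(n_);
            dgesv_(&n_, &nrhs, a_.data(), &n_, ipiv.data(),
                   b.data(), &n_, &info);
            if (info != 0) throw std::runtime_error("dgesv failed");
        }

    private:
        int n_;
        std::vector<double> a_;
    };

Linking against any standard LAPACK library (e.g. -llapack) is assumed; the wrapper adds type safety and error handling without touching the underlying FORTRAN.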

 

 

 

Title

 

Benchmarking I/O Solutions for Clusters

Author(s)

 

Stefano Cozzini and Moshe Bar

Author Inst

 

Democritos INFM National Simulation Center

Presenter

 

Stefano Cozzini

Abstract


Clustered systems offer many advantages for demanding scientific applications: they can handle massive CPU-bound requirements and allow RAM to be distributed among many nodes. However, many scientific applications process massive amounts of data and therefore require high-performance distributed storage alongside parallel I/O. This session will discuss present-day cluster I/O solutions based on Bonnie performance benchmarks of a variety of popular systems.

 

 

 

Title

 

The Design, Implementation, and Evaluation of mpiBLAST

Author(s)

 

Aaron E. Darling, Lucas Carey, and Wu-chun Feng

Author Inst

 

University of Wisconsin -- Madison

Presenter

 

Aaron E. Darling

Abstract


mpiBLAST is an Open Source parallelization of BLAST that achieves superlinear speed-up by segmenting a BLAST database and then having each node in a computational cluster search a unique portion of the database. Database segmentation permits each node to search a smaller portion of the database, eliminating disk I/O and vastly improving BLAST performance. Because database segmentation does not create heavy communication demands, BLAST users can take advantage of low-cost and efficient Linux cluster architectures such as the bladed Beowulf. In addition to this presentation of the software architecture of mpiBLAST, there will be a detailed performance analysis of mpiBLAST to demonstrate its scalability.
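
A minimal MPI skeleton of the database-segmentation idea (an illustration, not mpiBLAST's actual source; search_fragment is a hypothetical stand-in for a BLAST search over one pre-formatted segment):

    #include <mpi.h>
    #include <cstdio>

    // Hypothetical stand-in: search this rank's database fragment and
    // return the number of hits found for the query.
    static int search_fragment(int fragment_id) {
        return fragment_id % 3;   // placeholder result
    }

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, nprocs = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        // Each node searches a distinct fragment small enough to stay
        // resident in memory -- the source of the superlinear speedup.
        int local_hits = search_fragment(rank);

        int total_hits = 0;
        MPI_Reduce(&local_hits, &total_hits, 1, MPI_INT, MPI_SUM, 0,
                   MPI_COMM_WORLD);
        if (rank == 0)
            std::printf("%d hits across %d fragments\n", total_hits, nprocs);
        MPI_Finalize();
        return 0;
    }

Communication is limited to distributing the query and gathering results, which is why the approach tolerates commodity interconnects.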

 

 

 

Systems

Title

 

SLURM: Simple Linux Utility for Resource Management

Author(s)

 

Morris Jette and Mark Grondona

Author Inst

 

Lawrence Livermore National Laboratory

Presenter

 

Morris Jette

Abstract


SLURM is an open source, fault-tolerant and highly scalable cluster management and job scheduling system for Linux clusters of thousands of nodes. Components include machine status, partition management, scheduling and stream copy modules. This session presents an overview of the SLURM architecture and functionality.

 

 

 

Title

 

A Simple Installation and Administration Tool for the Large-Scale PC Cluster System: DCAST

Author(s)

 

Tomoyuki Hiroyasu, Mitsunori Miki, Kenzo Kodama, Junichi Uekawa, and Jack Dongarra

Author Inst

 

Doshisha University

Presenter

 

Tomoyuki Hiroyasu

Abstract


The installation and configuration of clusters with many nodes is difficult due to the large amount of time and knowledge required to complete the task. To solve this problem, a simple installation and administration tool, “Doshisha Cluster Auto Setup Tool: DCAST,” has been developed. Targeted at Linux, it supports both diskless and diskfull clusters, requires no interaction during installation, boots slave nodes over the network, and propagates configuration changes to all nodes.

 

 

 

Title

 

The Space Simulator

Author(s)

 

Michael S. Warren, Chris Fryer, and M. Patrick Goda

Author Inst

 

Los Alamos National Laboratory

Presenter

 

Michael S. Warren

Abstract


The Space Simulator is a 294-processor Beowulf cluster with a peak performance near 1.5 Teraflops. It achieved Linpack performance of 665.1 Gflops on 288 processors, making it the 85th fastest computer in the world. The Space Simulator cluster is dedicated to performing computational astrophysics simulations in the Theoretical Astrophysics group (T6) at Los Alamos National Laboratory. This case study will outline the design drivers, software and applications applied to the Space Simulator.

 

 

 

Title

 

A Middleware-Level Parallel Transfer Technique Over Multiple Network Interfaces

Author(s)

 

Nader Mohamed, Jameela Al-Jaroodi, Hong Jiang, and David Swanson

Author Inst

 

University of Nebraska--Lincoln

Presenter

 

Nader Mohamed

Abstract


Network middleware is a software layer that provides abstract network APIs to hide low-level technical details from users. Existing network middleware supports message transfers over a single network interface and link. In this session, we describe a middleware-level parallel transfer technique that utilizes multiple network interface units, which may be connected through multiple networks. It operates on any reliable transport protocol, such as TCP, and transparently provides an expandable high-bandwidth solution that reduces message transfer time, provides fault tolerance and facilitates dynamic load balancing across the underlying networks. The experimental evaluation showed a peak performance of 187 Mbps on two Fast Ethernet networks.
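
A minimal sketch of the striping idea (illustrative only, not the authors' middleware; in the real technique each channel would be a TCP connection on a separate network interface, simulated here as a byte buffer):

    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <vector>

    void striped_send(const std::string& msg, size_t chunk,
                      std::vector<std::string>& channels) {
        // Deal fixed-size chunks round-robin across the available links.
        for (size_t off = 0, i = 0; off < msg.size(); off += chunk, ++i)
            channels[i % channels.size()]
                .append(msg, off, std::min(chunk, msg.size() - off));
    }

    std::string striped_recv(size_t total, size_t chunk,
                             const std::vector<std::string>& channels) {
        // Reassemble in the same round-robin order the sender used.
        std::string out;
        std::vector<size_t> pos(channels.size(), 0);
        for (size_t off = 0, i = 0; off < total; off += chunk, ++i) {
            size_t k = i % channels.size();
            size_t n = std::min(chunk, total - off);
            out += channels[k].substr(pos[k], n);
            pos[k] += n;
        }
        return out;
    }

    int main() {
        std::vector<std::string> links(2);     // e.g. two Fast Ethernet paths
        std::string msg(10000, 'x');
        striped_send(msg, 1460, links);        // roughly one TCP segment per chunk
        std::printf("reassembled intact: %s\n",
                    striped_recv(msg.size(), 1460, links) == msg ? "yes" : "no");
    }

Because the chunks travel independently, the aggregate bandwidth approaches the sum of the links, and a failed link can be dropped from the rotation.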

 

 

 

Title

 

The Cluster Integration Toolkit (CIT)

Author(s)

 

James H. Laros III, Lee Ward, Nathan W. Dauchy, Ruth Klundt, Glen Laguna, James Vasak, Marcus Epperson, and Jon R. Stearley

Author Inst

 

Sandia National Labs

Presenter

 

James H. Laros III

Abstract


The Cluster Integration Toolkit is an extensible, portable, scalable cluster management software architecture for a variety of systems. It has been successfully used to integrate and support a number of clusters at Sandia National Labs and several other sites, the largest of which is 1861 nodes. This session will discuss the goals of the project and how they were achieved. The installation process will be described and common tasks for cluster implementation and support will be demonstrated.

 

 

 

Title

 

Scalable C3 Power Tools

Author(s)

 

Stephen Scott and Brian Luethke

Author Inst

 

Oak Ridge National Laboratory

Presenter

 

John Mugler

Abstract


With typical clusters now reaching 512 or more compute nodes, it is apparent that cluster tools must scale toward thousands of nodes. Version 3.2 of the C3 tools has started stretching the Single System Illusion concept into the realm of thousands of compute nodes while actually improving performance on larger clusters. This session discusses how this was implemented and how to use the new version of C3, and also presents results comparing the latest release with prior versions.

 

 

 

Title

 

Full Circle: Simulating Linux Clusters on a Linux Cluster

Author(s)

 

Jose Moreira, Luis Ceze, Karin Strauss, George Almasi, Patrick J. Bohrer, Jose R. Brunheroto, Calin Cascaval, Jose G. Castanos, and Derek Lieber

Author Inst

 

IBM T.J. Watson Research Center

Presenter

 

Jose Moreira

Abstract


BGLsim is a complete system simulator for parallel machines allowing users to develop, test and run the same code that will be used in a real system. It is currently being used in hardware validation and software development for the BlueGene/L cellular architecture machine. BGLsim is capable of functionally simulating multiple nodes of this machine operating in parallel. It simulates instruction execution in each node and the communication that happens between nodes. To illustrate the capabilities of BGLsim, experiments running the NAS Parallel Benchmark IS on a simulated BlueGene/L machine are described.

 

 

 

Title

 

Memory Performance of Dual-Processor Nodes: Comparison of Intel Xeon and AMD Opteron Memory Subsystem Architectures

Author(s)

 

Avijit Purkayastha, Chona S. Guiang, Kent F. Milfeld, and John R. Boisseau

Author Inst

 

University of Texas--Austin

Presenter

 

Avijit Purkayastha

Abstract


There are several important features in the AMD x86-64 microarchitecture (Opteron) and the HyperTransport technology that are beneficial to the HPC community. The Opteron processor has an integrated memory controller, and hence a direct connection to memory through two 64-bit wide interfaces. More importantly, this means that each processor in an SMP system has a "separate" interface and memory modules. In addition, HyperTransport technology has been built directly into the processors and also into the chipsets, creating processor-to-processor and processor-to-chipset interconnects that are high-speed and have low latencies. Systems that implement processors with on-chip memory controllers and HyperTransport point-to-point links for inter-processor communication can support parallel applications that have large communication and data sharing needs. Such systems provide an ideal environment for both shared-memory (OpenMP) and distributed-memory (MPI) paradigms.

The Opteron can also achieve excellent single-processor performance. It is unencumbered by the latencies and bottleneck of a north bridge, so memory-intensive applications have the opportunity to deliver full-bandwidth streams from memory to each processor. The large L2 caches provide more room for improving the performance of compute-intensive applications. Also, the native 64-bit Opteron microarchitecture supports large-memory applications, as well as legacy 32-bit applications, concurrently.

In this paper we will explore the benefits of the new x86-64 architecture through the performance of some standard HPC code kernels and applications that use multi-threading (OpenMP) and multi-processing (MPI). Our analysis will focus on characteristics of the memory subsystem and will examine two key issues. We will conduct scaling studies of compute- and memory-intensive applications on dual-processor AMD Opteron and Intel Xeon nodes to assess how well the memory subsystem copes with the increased memory demands of the second processor. We will also investigate how OS memory affinity and process binding affect memory bandwidths.
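
A sketch of the kind of memory-bandwidth microkernel such a study relies on (in the spirit of the STREAM triad; this is not the authors' benchmark code). Run with OMP_NUM_THREADS=1 and then 2: on a node where each processor owns its own memory controller the measured rate should nearly double, while a shared front-side bus saturates.

    // Compile with OpenMP enabled, e.g. g++ -O2 -fopenmp triad.cpp
    #include <omp.h>
    #include <cstdio>
    #include <vector>

    int main() {
        const long n = 20000000;                 // three arrays, ~480 MB total
        std::vector<double> a(n), b(n, 1.0), c(n, 2.0);
        const double scalar = 3.0;

        double t0 = omp_get_wtime();
        #pragma omp parallel for
        for (long i = 0; i < n; ++i)             // triad: a = b + s * c
            a[i] = b[i] + scalar * c[i];
        double t1 = omp_get_wtime();

        // Two reads plus one write of 8-byte doubles per iteration.
        double gbytes = 3.0 * 8.0 * n / 1e9;
        std::printf("%d threads: %.2f GB/s\n",
                    omp_get_max_threads(), gbytes / (t1 - t0));
    }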

 

 

 

 

Title

 

Scheduling for Improved Write Performance in a Cost-Effective, Fault-Tolerant Parallel Virtual File System (CEFT-PVFS)

Author(s)

 

Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng, and David R. Swanson

Author Inst

 

University of Nebraska -- Lincoln

Presenter

 

Yifeng Zhu

Abstract


This session will demonstrate that all the disks on the nodes of a cluster can be connected together through CEFT-PVFS, a RAID-10-style parallel file system for Linux, to provide GBytes/sec parallel I/O performance without any additional cost. To improve overall I/O performance, I/O requests can be scheduled on the less loaded node in each mirroring pair, making for more informed scheduling decisions. Based on heuristic rules found in our experimental results, a scheduling algorithm for dynamic load balancing has been developed that significantly improves overall performance.
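
A minimal sketch of the core scheduling decision (illustrative, not the CEFT-PVFS implementation; the load fields are hypothetical stand-ins for the utilization heuristics measured in the work): each striped write goes to whichever replica in the mirrored pair currently reports the lighter load.

    #include <cstdio>

    struct Load {
        double cpu, net, disk;                   // utilizations in [0, 1]
        double score() const { return cpu + net + disk; }
    };

    struct MirrorPair { int primary, backup; };

    // Choose the target node for one striped write.
    int pick_replica(const MirrorPair& p, const Load loads[]) {
        return loads[p.primary].score() <= loads[p.backup].score()
                   ? p.primary : p.backup;
    }

    int main() {
        Load loads[4] = {{0.9, 0.2, 0.7}, {0.1, 0.3, 0.2},
                         {0.4, 0.4, 0.1}, {0.5, 0.9, 0.8}};
        MirrorPair pairs[2] = {{0, 1}, {2, 3}};
        for (const auto& p : pairs)
            std::printf("stripe -> node %d\n", pick_replica(p, loads));
    }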

 

 

 

Title

 

Achieving Order through CHAOS: The LLNL HPC Cluster Experience

Author(s)

 

Robin Goldstone, Ryan Braby, and Jim Garlick

Author Inst

 

Lawrence Livermore National Laboratory

Presenter

 

Robin Goldstone

Abstract


For the past several years, Lawrence Livermore National Laboratory (LLNL) has invested significant effort in the deployment of large High Performance Computing (HPC) Linux clusters. After deploying two modest-sized clusters (88 nodes and 128 nodes) in early 2002, efforts progressed to the deployment of the Multiprogrammatic Capability Resource (MCR, 1154 nodes) in fall 2002 and the ASCI Linux Cluster (ALC, 962 nodes) in early 2003. Through these efforts, LLNL has developed expertise in a number of areas related to the design, deployment and management of large Linux clusters. In this session LLNL will present their experiences, including challenges encountered and lessons learned.

 

 

 

Title

 

Supercomputing Center Management Using AIRS

Author(s)

 

Robert Ballance, Jared Galbraith, and Roy Heimbach

Author Inst

 

University of New Mexico

Presenter

 

Robert A. Ballance

Abstract


Running a large university supercomputing center teaches many lessons, including the need to centralize data collection and analysis, automate system administration functions, and enable users to manage their own projects. The Albuquerque Integrated Reporting System (AIRS), a centralized, web-enabled application capable of user and project administration across multiple clusters and of reporting against both active and historical data, evolved in response to these pressures.

 

 

 

Bioinformatics

Title

 

Running BLAST on a Linux Cluster

Presenter(s)

 

Ray Hookway

Presenter Inst

 

Hewlett-Packard

Abstract


Everyone knows that BLAST is an example of an embarrassingly parallel application, i.e., an application that will run well on a cluster. Conceptually, one breaks up a query against a database into several queries against subsets of the database and distributes the resulting jobs across the nodes of the cluster. However, it is not obvious how to go about doing this. The talk will begin with a brief review of how BLAST works and then explore the factors that affect the performance of BLAST running on a single system. The final focus will be on answering the question: how does one run BLAST on a cluster?
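
A toy sketch of the database-splitting step the talk describes (illustrative only; production splitters balance by residue count rather than record count): scan a FASTA file and deal complete sequence records round-robin into N sub-databases, one per node.

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    int main(int argc, char** argv) {
        if (argc != 3) { std::cerr << "usage: split db.fasta N\n"; return 1; }
        std::ifstream in(argv[1]);
        int n = std::stoi(argv[2]);

        // One output sub-database per cluster node.
        std::vector<std::ofstream> out;
        for (int i = 0; i < n; ++i)
            out.emplace_back(std::string(argv[1]) + "." + std::to_string(i));

        std::string line;
        int rec = -1;
        while (std::getline(in, line)) {
            if (!line.empty() && line[0] == '>') ++rec;   // new record header
            if (rec >= 0) out[rec % n] << line << '\n';   // whole records only
        }
    }

Each node then runs an unmodified BLAST against its own sub-database, and the per-node hit lists are merged afterward.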

 

 

 

Title

 

Biobrew Linux: A Linux Cluster Distribution for Bioinformatics

Presenter(s)

 

Glen Otero

Presenter Inst

 

Callident

Abstract


BioBrew Linux is the first known attempt at creating and freely distributing an easy-to-use clustering software package designed for bioinformaticists. With support for both IA32 and IA64 platforms, BioBrew is a Linux distribution that combines the NPACI Rocks cluster software with several popular Open Source bioinformatics software tools like BLAST, HMMER, ClustalW and BioPerl. The result is a Linux distribution that can be used to install a workstation or a Beowulf cluster for bioinformatics analyses.

 

 

 

Title

 

Terascale Linux Clusters: Supercomputing Solutions for the Life Sciences

Presenter(s)

 

Bruce Ling and Padmanabhan Iyer

Presenter Inst

 

Tularik, Inc. and Linux NetworkX (respectively)

Abstract


At Tularik, a biotechnology company specializing in drug discovery and development using gene regulation, informatics has become essential to the process of genomics-based drug discovery. With the explosion of genomic data and lead-discovery screening data points, a powerful computing environment has become a must in order to boost R&D productivity. By deploying a 150-processor cluster, Tularik has successfully managed millions of data points, coming from assay development, high-throughput screening (HTS), structure-activity relationships (SAR), lead optimization and microarrays, to speed its R&D productivity and decision-making processes.

 

 

 

Title

 

Blade Servers for Genomic Research

Presenter(s)

 

Ron Neyland

Presenter Inst

 

RLX Technologies

Abstract


Clusters based on industry standard hardware and software have become the most widely used tools for performing genomic processing and analysis. While providing many benefits such as outstanding price/performance, they also introduce a new set of problems. This session will address how blade servers provide a compute cluster platform that delivers the compute power required for genomic research, while minimizing many of the problems. Real world examples of clusters running many of the widely used genomic applications will be presented, along with tips and tools for managing the cluster environment.

 

 

 

Title

 

High Performance Mathematical Libraries for Itanium 2 Clusters

Presenter(s)

 

Hsin-Ying Lin

Presenter Inst

 

Hewlett-Packard

Abstract


HP’s Mathematical LIBrary (HP MLIB) provides a user-friendly interface using standard definitions of public domain software and enables users to access the power of high performance computing. HP MLIB fully exploits the architecture of the processor and achieves optimal performance on Itanium 2. HP MLIB has been used by high performance computing customers for over 15 years. This session will provide a brief overview of relevant architectural features and describe how these features have been used to design high-level algorithms. The performance of some of the key components of HP MLIB on Itanium 2 clusters will be discussed, i.e., matrix multiplication, ScaLAPACK and SuperLU_DIST.

 

 

 

Title

 

Parallel Computational Biology Tools and Applications for Windows Clusters

Presenter(s)

 

Jaroslaw Pillardy

Presenter Inst

 

Cornell Theory Center

Abstract


Using massively parallel programs for data analysis is the most popular way of dealing with the enormous amounts of data produced in molecular biology research. Several computational biology tools of different levels of complexity for Microsoft Windows clusters, available at the Computational Biology Service Unit at the Cornell Theory Center, will be discussed. All of the tools follow a master-worker approach using MPI communications. The simplest tools, which are nevertheless very important to biologists, are standard sequence-based data mining tools such as BLAST and HMMER. More sophisticated is the structure-based (threading) protein annotation algorithm LOOPP.

 

 

 

Title

 

Building Software for High Performance Informatics and Chemistry

Presenter(s)

 

Joseph Landman

Presenter Inst

 

Scalable Informatics LLC

Abstract


Given the growth rate of life science data sets, analysis applications designed for single machines with shared memory and one or more CPUs quickly run into a performance bottleneck. Clusters and Grids represent a potential solution to this bottleneck, but only when applications are properly designed to make full use of the resources available. In this session we will look at the hard realities of building software for the informatics industry, including the problems of running legacy software on clusters, how to make efficient use of clusters for both the cluster and the user, and how to make life science informatics and chemistry applications scale well on clustered systems.

 

 

 

Title

 

To Cluster or Not to Cluster

Presenter

 

Tom Scanlan

Presenter Inst

 

NEC Solutions America

Abstract


(Unavailable)

 

 

 

Automotive & Aerospace Engineering

Title

 

Cluster Computing in Space Applications

Presenter(s)

 

Eric George

Presenter Inst

 

The Aerospace Corporation

Abstract


This case study will examine how The Aerospace Corporation utilizes cluster computing for a variety of applications in support of high-priority national defense programs, including the Global Positioning System (GPS) and future missile warning programs. Applications to date have focused on astrodynamics, satellite constellation design, communications network modeling, thermal analysis, and complex scheduling/tasking algorithms. Processing techniques range from Monte Carlo analysis and brute-force search operations to genetic algorithms. Research is progressing on implementation of a diverse grid-computing environment at Aerospace.

 

 

 

Title

 

Full Vehicle Dynamic Analysis Using Automated Component Modal Synthesis

Presenter(s)

 

Peter Schartz

Presenter Inst

 

MSC Software

Abstract


Today it is commonplace to analyze the fully trimmed body of an automobile for its vibration characteristics, over increasing frequency ranges, on inexpensive computer hardware. The cost-effectiveness of RISC-based cache processors, combined with upward pressure in the form of large, detailed models, has allowed new software methods to utilize domain decomposition to enable high-level parallelism. A domain decomposition, followed by a component modal synthesis solution, is the basis for Automated Component Modal Synthesis (ACMS) in MSC.Nastran. The solution is described in theory, and its effectiveness is demonstrated by an example taken from today’s automotive industry.
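
For readers unfamiliar with the method: component mode synthesis reduces each subdomain independently before assembly. In the classic fixed-interface (Craig-Bampton) form, one standard variant (the abstract does not specify MSC.Nastran's exact formulation), a component's interior degrees of freedom u_i are expressed through a few fixed-interface normal modes \Phi with modal coordinates q, plus static constraint modes \Psi driven by the boundary degrees of freedom u_b:

    \begin{pmatrix} u_i \\ u_b \end{pmatrix}
      = \begin{bmatrix} \Phi & \Psi \\ 0 & I \end{bmatrix}
        \begin{pmatrix} q \\ u_b \end{pmatrix}

Because each component's \Phi and \Psi are computed independently, the reduction step parallelizes naturally across the domains produced by the decomposition.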

 

 

 

Title

 

Using Clusters to Deliver Turn-Key CFD Solutions

Presenter(s)

 

Greg Stuckert

Presenter Inst

 

Fluent

Abstract


While low-cost, high-performance clusters have been in use since the early 1990s, the application of commercial off-the-shelf CFD software, such as Fluent, to harness these shared-nothing architectures only became viable near the end of that decade. Early implementations required a persistent IT department willing to commit the time and resources necessary to overcome the challenges involved. Now, however, organizations are able to access a full-featured implementation of Fluent via the Internet in a pay-as-you-go scenario. This session will discuss the problems solved and gains realized by a distributed implementation of Fluent 6.1.

 

 

 

Title

 

LS-DYNA: CAE Simulation Software on Linux Clusters

Presenter(s)

 

Guangye Li

Presenter Inst

 

IBM

Abstract


LS-DYNA is used in a wide variety of simulation applications: automotive crashworthiness and occupant safety, sheet metal forming, military and defense applications, aerospace industry applications, and electronic component design. Several years ago, one simulation of a very simplified finite element model needed days to complete on a Symmetric Multiprocessing (SMP) vector computer. With the introduction of distributed multiprocessing technology, the MPP (Massively Parallel Processors) version of LS-DYNA can dramatically reduce the turnaround time of the simulation and therefore reduce the time needed for the automotive design process. We will present a comparison of the scalability of the SMP and MPP versions of LS-DYNA, as well as a comparison of communication networks (Myrinet, Fast Ethernet, Gigabit Ethernet) on Linux clusters.

 

 

 

Title

 

Linux Clusters in the German Automotive Industry

Presenter(s)

 

Karsten Galer

Presenter Inst

 

science + computing AG

Abstract


After the first German CAE-Linux computer cluster (LCC) was installed in 1999 at DaimlerChrysler for electromagnetic compatibility (EMC) calculations, there has been great success in the adoption of LCCs, including 512-CPU clusters used for crash calculations at a major automotive manufacturer. This talk will provide an overview of the ways in which Linux clusters are changing the course of CAE in Germany. It will also look at a number of different configurations currently being implemented at some of the world’s largest automotive manufacturers.

 

 

 

Title

 

Improving Multi-site/Multi-departmental Cluster Systems through Data Grids in the Automotive and Aerospace Industries

Presenter(s)

 

Andrew Grimshaw

Presenter Inst

 

Avaki

Abstract


As the pressure to optimize product design and manufacturing processes increases, it is critical for the automotive and aerospace industries to give professionals secure access to product and manufacturing information, even though data is often located at multiple R&D sites and suppliers. Additionally, product developers require more and more processing power, delivered via clusters that are not effective unless they can provide access to the data needed. This session will examine the most significant data challenges facing today’s automotive and aerospace companies and how Grid technology impacts the engineering and manufacturing process.

 

 

 

Title

 

Scrutinizing CFD Performance on Multiple Linux Cluster Architectures

Presenter(s)

 

Thomas Hauser

Presenter Inst

 

Utah State University

Abstract


Linux cluster supercomputers are a cost-effective platform for simulating fluid flow in engineering applications. However, obtaining high performance on these clusters is a non-trivial problem, requiring tuning and design modifications to Computational Fluid Dynamics (CFD) codes. Investigations into optimizing CFD codes on Linux cluster platforms will be presented. Detailed performance results of two CFD codes on a wide range of cluster architectures, including the Intel Pentium, AMD Athlon, Intel Itanium and AMD Opteron, will be analyzed. The single- and multi-processor performance of these codes on different cluster architectures will be compared and means of improving performance discussed.

 

 

 

Title

 

Managing CAE Simulation Workload in Cluster Environments

Presenter(s)

 

Michael M. Humphrey

Presenter Inst

 

Altair

Abstract


Automotive manufacturers are beginning to capitalize on workload management software to get the most out of their numerically intense computing environments. Workload management software is middleware technology that sits between compute-intensive applications, such as ABAQUS, ANSYS, FLUENT, LS-DYNA, NASTRAN and OPTISTRUCT, and the underlying network, hardware and operating systems. The software schedules and distributes all types of application runs (serial, parallel, distributed-memory, parameter studies, big-memory, long-running, etc.) on all types of hardware (desktops, clusters, supercomputers and even across sites). This presentation will describe the current capabilities of PBS Pro workload management software as a middleware enabler for robust system design.

 

 

 

Digital Content Creation / Scientific Visualization / Simulation

Title

 

The Current State of Numerical Weather Prediction on Cluster Technology -- What is Needed to Break the 25% Efficiency Barrier?

Presenter(s)

 

Dan Weber

Presenter Inst

 

Center for the Analysis and Prediction of Storms

Abstract


This session will look in depth at the current state of weather prediction and the many challenges it faces. The talk will examine the computational needs (teraflops) of a robust numerical weather prediction (NWP) system at thunderstorm scale and review NWP performance on current computer technology. Current models will be reviewed, as well as the roadblocks associated with clusters. Finally, a proposal for a complete shift in the way systems of equations are solved on scalar technology, in order to break the 25% efficiency ceiling, will be examined.

 

 

 

Title

 

Building and Using Tiled Display Walls

Presenter(s)

 

Paul Rajlich

Presenter Inst

 

National Center for Supercomputing Applications (NCSA)

Abstract


Tiled display walls provide a large-format environment for presenting very high-resolution visualizations by tiling together the output from a collection of projectors. The projectors are driven by a Linux cluster augmented with high-performance graphics accelerator cards, and costs are controlled by using commodity projectors and low-cost PCs. Tiled walls must face a number of challenges, such as aligning the projectors so that the output of adjacent tiles forms a seamless image. This session will discuss the Alliance Display Wall-in-a-Box effort: a distribution of related Open Source software packages that reduces the setup and maintenance burden of complex high-end display systems.

 

 

 

Title

 

Discovery and Analysis of Communication Patterns in Complex Network-based Systems Using Virtual Environments

Presenter(s)

 

Tom Caudell

Presenter Inst

 

University of New Mexico

Abstract


The real-time visualization of cluster networks provides a number of benefits to administrators and developers in search of performance bottlenecks. Real-time visuals provide early warning of actual problems in network traffic as well as clear indication of potential problems before they occur. However, real-time network visualization is a remarkably difficult problem. This session will discuss a number of the technical hurdles involved in building a visualization system that scales with increased performance. Using network visualization, organizations can design applications that make better use of the network and avoid bottlenecks, and administrators can make informed scheduling decisions that lead a cluster toward optimal performance.

 

 

 

Title

 

HPC and HA Clustering for Online Gaming

Presenter(s)

 

Jesper Jensen

Presenter Inst

 

SCI

Abstract


SCI, the company that developed and supports the backend for the Department of Defense's America's Army game, will deliver a case study on deploying gaming clusters for the DoD and other game titles, and give an overview of where large-scale game technology is and where it is going. With technology capable of pushing an average of 1.35 teraflops per cabinet and leveraging multiple transit carriers, SCI clusters deliver both the HPC and HA required to support a massive gaming audience. This discussion will touch on solutions for 32-bit and next-generation 64-bit architectures, both in place and under development.

 

 

 

Title

 

Large Scale Scientific Visualization on PC Clusters

Presenter(s)

 

Brian Wylie

Presenter Inst

 

Sandia National Labs

Abstract


This session covers the use of PC clusters with commodity graphics cards as high-performance scientific visualization platforms. A cluster of PC nodes, in which many or all of the nodes have 3D hardware accelerators, is an attractive approach to building a scalable graphics system. The main challenge in using cluster-based graphics systems is the difficulty of realizing the full aggregate performance of all the individual graphics accelerators. Topics covered will include parallel geometric rendering, parallel volume rendering, data distribution approaches and novel techniques for utilizing graphics processors.

 

 

 

Title

 

The Use of Clusters for Engineering Simulation

Presenter(s)

 

Lynn Lewis

Presenter Inst

 

Hewlett-Packard

Abstract


Clusters allow the use of advanced mathematical techniques for optimization, changing the way engineers arrive at cost-effective, safe designs. Without inexpensive clusters, engineers at automotive manufacturers could not run thousands of crash-test simulations integrated with the initial design stage, nor test for structural integrity, much less manufacturability, within weeks. This session will examine in detail how, over the previous decade, Unix and lately Linux clusters have found use in commercial crash and fluid dynamics simulations, changing the way cars and aircraft are designed and built.

 

 

 

Title

 

NEESgrid: Virtual Collaboratory for Earthquake Engineering and Simulation

Presenter(s)

 

Tom Prudhomme

Presenter Inst

 

National Center for Supercomputing Applications (NCSA)

Abstract


NEESgrid will link earthquake engineering researchers across the U.S. with leading-edge computing resources and research and testing facilities, allowing teams to plan, perform, and publish their research. Via both Telepresence and other collaboration technologies, research teams are able to work remotely on experimental trials and simulations. This session will examine how NEESgrid, through the shared resources of Grid technology, will bring together information technologists and engineers in a way that will revolutionize earthquake engineering, research and simulation.

 

 

 

Title

 

On the Architecture of an Audio Identification Cluster

Presenter(s)

 

Daniel Culbert

Presenter Inst

 

Shazam Entertainment, Inc.

Abstract


(Unavailable)

 

 

 

Cluster Solutions

Title

 

Building the TeraGrid: The World's Largest Grid, Fastest Linux Cluster, and Fastest Optical Network Dedicated to Open Science

Presenter(s)

 

Pete Beckman

Presenter Inst

 

Argonne National Laboratory

Abstract


The TeraGrid is one of the most ambitious collaborative grid projects ever undertaken. The building blocks for the $88 million National Science Foundation-funded project include mammoth computational resources, ultra-fast fiber-optic networks linking NCSA, SDSC, Caltech, Argonne and PSC, and a software “grid hosting environment.” Together, they will form an environment that makes developing cluster-based, grid-enabled scientific applications easy. This presentation will provide an overview of the project, the bleeding-edge technologies used to bring clusters and grids to the scientific community, and an update on current status and results.

 

 

 

Title

 

Building Blocks for 64-bit AMD Opteron Clusters

Presenter(s)

 

Richard Brunner

Presenter Inst

 

AMD

Abstract


This presentation describes the hardware and software building blocks that are in place to construct 64-bit AMD Opteron(TM) based Clusters. We begin with an overview of the newly released AMD Opteron(TM) processor and its system architecture that allows affordable 64-bit clustered computing while maintaining 32-bit performance and compatibility. Special attention will be given to the "glueless" multiprocessing capability provided by fast HyperTransport(tm) Technology interconnects and per-processor integrated memory controllers. We will next describe how 64-bit SuSE Linux Enterprise Server for AMD x86-64 exploits this hardware topology and discuss the accompanying thread and explicit parallelism tools and compilers that are available. We will end the presentation with a survey of the available third-party cluster adapters and interconnects that are supported on AMD Opteron(TM) platforms.

 

 

 

Title

 

Tools for Optimizing HPC Applications on Intel Clusters

Presenter(s)

 

Don Gunning

Presenter Inst

 

Intel

Abstract


The Intel software research lab is involved in several projects related to the development and deployment of HPC software on Intel-based clusters. This discussion will focus on the work Intel is doing in parallel/concurrent computing within a single job or task; developing, debugging and tuning multithreaded applications; deploying MPI (and mixed MPI/threaded) applications; and extending OpenMP to execute across clusters. This discussion will also touch on ideas for maximizing messaging performance on the interconnect while maximizing application performance on the node.

 

 

 

Title

 

The Ultra Scalable HPTC Lustre Filesystem

Presenter(s)

 

Kent Koeninger

Presenter Inst

 

Hewlett-Packard

Abstract


The Lustre filesystem is designed to provide a coherent, scalable shared filesystem that can serve thousands of Linux client nodes, delivering extremely high-bandwidth parallel-filesystem access to many terabytes of storage. This talk will describe how the Lustre filesystem will be used in scalable HPTC Linux systems to combine the flexibility, scalability and manageability of NAS systems with the performance of SAN systems. The Lustre development effort is an open source project with an initial release targeted for 2003.

 

 

 

Title

 

Building the World's Most Powerful Cluster: 11.2 Tflops at Lawrence Livermore National Laboratory

Presenter(s)

 

Kim Clark

Presenter Inst

 

Linux NetworX

Abstract


In 2002, Linux Networx built the MCR cluster housed at Lawrence Livermore National Laboratory. It is currently the largest cluster in the world, with a theoretical peak of 11.2 Tflops, and, with more than 1,000 nodes to manage and monitor, ranks as the fifth largest supercomputer in the world. The unique challenges involved in building and configuring such a massive system, and what was learned from the experience, will be discussed. Attendees will learn how to apply aspects of the LLNL system to their own smaller systems to enhance cluster performance and reliability.

 

 

 

Title

 

Driving Cluster/Grid Technologies in HPC

Presenter(s)

 

David Barkai

Presenter Inst

 

Intel

Abstract


High performance computing has undergone a metamorphosis in the last 15-20 years. The changes, and what they mean to the industry and the user community, will be reviewed. The cluster approach to HPC is driving the evolution of a new ecosystem. In this talk we will describe the building blocks as a set of components built upon enabling technologies. The application characteristics determine the choices made for system software, middleware, interconnect, cluster topology, nodes, and processor. The resulting architecture and the nature of the workload and computing environment dictate the management tools that are needed. We will summarize the considerations behind the choices that need to be made, while highlighting the gaps and challenges as cluster computing ramps up and grid computing continues to develop.

 

 

 

Title

 

Emerging Trends in Data Center Powering and Cooling

Presenter(s)

 

Wahid Nawabi

Presenter Inst

 

APC

Abstract


Traditional data center architecture approaches force enterprises to build out to full capacity from day one, yet one hundred percent utilization of the designed capacity is seldom reached. This results in long deployment schedules, millions of dollars of unrecoverable up-front capital investment, and the maintenance of expensive service contracts on under-utilized infrastructure. APC’s PowerStruXure offers an on-demand solution that accelerates deployment and allows you to invest in a data center solution sufficient to meet today’s demands, rather than an uncertain estimate of future capacity.

 

 

 

Title

 

The Virtual Environment and Its Impact on IT Infrastructure

Presenter(s)

 

Daniel Kusnetzky

Presenter Inst

 

IDC

Abstract


IDC has been examining the evolution of the virtual environment for quite a number of years. This session will examine IDC’s definition of the virtual environment, its roots in techniques developed in the late 1970s, and how Windows, Unix and Linux can be deployed as platforms in the virtual environment. Dan Kusnetzky, IDC’s Vice President of System Software, will present the drivers for virtual environment software adoption and project how the virtual environment will impact the overall IT infrastructure in the coming years.

 

 

 

Petroleum / Geophysical Exploration

Title

 

Exploring the Earth's Subsurface with Itanium 2 Linux Clusters

Presenter(s)

 

Keith Gray

Presenter Inst

 

British Petroleum

Abstract


This case study describes how adopting Itanium 2 processor architecture and Linux cluster technology for a seismic imaging and migration project allowed British Petroleum to cut the cost of this high-end infrastructure in half while increasing performance by 3X, in some cases by 5X. The environment includes 1,024 processors (256 four-way HP rx5670 servers) with 8.2 terabytes of memory (32 GB per rx5670 server) and operates at over 4 teraflops peak performance.

 

 

 

Title

 

Scalability Considerations for Compute-Intensive Applications on Clusters

Presenter(s)

 

Christian Tanasescu

Presenter Inst

 

SGI

Abstract


This session investigates the scalability, architectural requirements and performance characteristics of some of the most widely used compute-intensive applications in the scientific and engineering communities. Seismic Processing and Reservoir Simulation (SPR) applications generally consume data read from memory and must continually load new data. As a result, to keep the floating-point (FP) units busy, these applications require computer architectures with high memory bandwidth, mainly due to their data addressing patterns and heavy I/O activity. We will also introduce BandeLa, a tool for studying the influence of communication bandwidth and latency on MPI applications.

 

 

 

Title

 

Parallel Reservoir Simulation on Intel Xeon HPC Clusters

Author(s)

 

Baris Guler, Tau Leng, Victor Mashayekhi, and Kamy Sepehrnoori

Author Inst

 

Dell and University of Texas--Austin

Presenter

 

Kamy Sepehrnoori and Reza Rooholamini

Abstract


Numerical simulation of reservoirs is an integral part of geo-scientific studies, with the goal of optimizing petroleum recovery. In this session, we conduct a series of benchmarks by running a parallel reservoir simulation code on an Intel Xeon Linux cluster and study the scalability while using different interconnects for the cluster. Our results show that the simulator’s performance scales linearly from one to 64 single-processor nodes, when using a low-latency, high-bandwidth interconnect. In addition to benchmarking, we describe a process-to-processor mapping approach for dual-processor clusters to improve communication performance as well as overall performance of the simulator.
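
A small sketch contrasting the two obvious rank-to-processor mappings on a cluster of dual-processor nodes (illustrative; the session's actual mapping scheme may differ). Packing neighboring ranks onto the same node turns their messages into fast intra-node transfers, which is the kind of effect such a mapping study measures.

    #include <cstdio>

    struct Placement { int node, cpu; };

    // Block mapping: ranks 0,1 -> node 0; ranks 2,3 -> node 1; ...
    Placement block_map(int rank) { return {rank / 2, rank % 2}; }

    // Cyclic mapping: spread ranks across nodes first; the second
    // CPU of each node fills only after every node has one rank.
    Placement cyclic_map(int rank, int nnodes) {
        return {rank % nnodes, rank / nnodes};
    }

    int main() {
        const int nnodes = 4, ranks = 8;
        for (int r = 0; r < ranks; ++r) {
            Placement b = block_map(r), c = cyclic_map(r, nnodes);
            std::printf("rank %d: block->(node %d, cpu %d)  "
                        "cyclic->(node %d, cpu %d)\n",
                        r, b.node, b.cpu, c.node, c.cpu);
        }
    }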

 

 

 

Title

 

Geoscience Visualization and Seismic Processing Clusters: Collaboration and Integration Issues

Presenter(s)

 

Phil Neri

Presenter Inst

 

Paradigm

Abstract


The active development of Linux visualization clusters has led to the notion of closely associating compute-intensive seismic processing with geoscience visualization, notably for the purpose of building and verifying velocity and solid models. The options are to implement cross-system data integration or to share a common hardware resource. Practical implementations of the integration model will be presented, based on Paradigm’s experience with existing production systems. The use of a CORBA-based distributed data architecture will also be discussed. The common-hardware concept, still in the design phase, will be analyzed for its expected benefits, economics and potential problems.

 

 

 

Title

 

Cluster Computing at CGG

Presenter(s)

 

L. Clerc

Presenter Inst

 

CGG

Abstract


(Unavailable)

 

 

 

Title

 

Grid Computing in the Energy Industry

Presenter(s)

 

Jamie Bernardin

Presenter Inst

 

DataSynapse

Abstract


Grid computing has attracted significant attention in the current IT environment. What are the business and technical factors driving companies to adopt Grid? In this presentation on Grid computing in Oil & Gas, we will examine frequently encountered obstacles to deploying a grid computing solution, compare the vision of Grid to the realities of today, identify target deployments for distributed computing solutions in the Oil & Gas sector, and describe the value impact of grid computing. DataSynapse will share case studies from its existing engagements as well as identify specific technical requirements unique to the energy market.

 

 

 

Title

 

Drilling in the Digital Oil Field: High Pay-offs from Linux Clusters

Presenter(s)

 

Shawn Fuller

Presenter Inst

 

Hewlett-Packard

Abstract


The Oil & Gas industry must manage mammoth volumes of complex data for both engineering and scientific requirements in its search for new reservoirs and more cost-efficient production methods. Globally deployable high-performance computing systems coupled with best-in-class applications are the keys to success for Oil & Gas companies. This session will cover the areas of technology receiving the most focus: mobility, desktop visualization, scalable and immersive visualization, global collaboration, scalable clustered systems, network storage systems, and imaging and printing, covering the full gamut of Oil & Gas IT requirements.