 |
Applications |
|
Title |
|
Large Scale
Parallel Reservoir Simulations on a Linux PC Cluster |
Author(s) |
|
Walid A. Habiballah and M. Ehtesham Hayder |
Author Inst |
|
Petroleum Engineering Application Services Department,
Saudi Aramco |
Presenter |
|
M.
Ehtesham Hayder |
| Abstract |
|
Numerical simulation is an important tool used
by engineers to develop production strategies and enhance hydrocarbon
recovery from reservoirs. Demand for large scale reservoir simulations
is increasing as engineers want to study larger and more complex
models. In this study, we evaluate a state-of-the-art PC cluster
and available software tools for production simulations of large
reservoir models. We discuss some of our findings and issues
related to large scale parallel reservoir simulations and present
performance comparisons between a Pentium IV Linux PC cluster
and an IBM SP Nighthawk supercomputer.
|
|
|
Title |
|
Scalable Performance
of FLUENT on NCSA IA-32 Linux Cluster |
Author(s) |
|
Wai Yip Kwok |
Author Inst |
|
National Center for Supercomputing Applications
(NCSA) |
Presenter |
|
Wai Yip Kwok |
| Abstract |
|
FLUENT, a leading industrial computational fluid
dynamics (CFD) software, has been ported to the NCSA IA-32 Linux
cluster. For this study, the scalable performance of FLUENT is
benchmarked with two engineering problems from Caterpillar, Inc.
and Fluent, Inc., with a maximum of 64 processors to accommodate
up to 10 million cells. This session will outline the impacts
of different interconnects on simulation performance. Using Myrinet
interconnects, the Linux cluster computes more than 2.5 times
faster than an SGI Origin2000 supercomputer at NCSA. A performance
increase of seven times is observed when 32 processors are used
instead of two.
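
The scaling figure quoted above can be turned into a parallel efficiency with a short calculation. The sketch below is only an illustration of that arithmetic; the function name is hypothetical and not part of FLUENT or the study. It assumes ideal speedup is proportional to the processor count, so going from 2 to 32 processors would ideally be 16 times faster, and the observed factor of seven corresponds to roughly 44 percent relative efficiency.

```python
def relative_efficiency(base_procs, scaled_procs, observed_speedup):
    """Parallel efficiency of the scaled run relative to the base run.

    Assumes ideal speedup equals the ratio of processor counts.
    """
    ideal_speedup = scaled_procs / base_procs
    return observed_speedup / ideal_speedup


# Figures taken from the abstract: 7x faster on 32 processors than on 2.
efficiency = relative_efficiency(base_procs=2, scaled_procs=32, observed_speedup=7.0)
print(f"relative parallel efficiency: {efficiency:.4f}")
```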
|
|
|
Title |
|
Moore's Law and Cluster
Computing When Moore Is Not Enough |
Author(s) |
|
Greg Lindahl |
Author Inst |
|
Key Research, Inc. |
Presenter |
|
Greg Lindahl |
| Abstract |
|
Linux cluster builders have become accustomed
to continuous improvement of cluster building blocks: each year,
CPUs get faster, disks get bigger, memory bandwidth rises and
networks get cheaper and faster. These improvements are often
seen as the inevitable march of progress, driven by the commodity
market and Moore’s Law. This session will revisit Moore’s
famous law in detail to determine if it adequately predicts an
environment ripe for commodity cluster computing.
|
|
|
Title |
|
Cooperative Caching
in Linux Clusters |
Author(s) |
|
Ying Xu and Brett Fleisch |
Author Inst |
|
University of California Riverside |
Presenter |
|
Ying
Xu |
| Abstract |
|
Operating systems used in most Linux clusters
only manage memory locally without cooperating with other nodes
in the system. This can create states where a node within the
cluster may be short of memory while idle memory in other nodes
is wasted. This session attempts to solve the problem of how
to improve the cluster operating system to support the use of
cluster-wide memory as a global distributed resource. Presented
will be a description of a cooperative caching scheme for caching
files in the cluster-wide memory and corresponding changes in
Linux kernel memory management to support it.
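
As a rough illustration of the idea, and not the authors' kernel implementation, a cooperative cache can be sketched as a three-level lookup: local memory first, then the memory of peer nodes, then disk. The class names, the capacity model, and the placement and eviction rules below are all assumptions made for the example.

```python
class Node:
    """A cluster node with a small in-memory block cache (hypothetical model)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.cache = {}  # block id -> data

    def has_room(self):
        return len(self.cache) < self.capacity


class CooperativeCache:
    """Looks up blocks locally, then in peer memory, then on disk."""

    def __init__(self, nodes, disk):
        self.nodes = nodes
        self.disk = disk  # dict standing in for the file system

    def read(self, node, block_id):
        if block_id in node.cache:
            return node.cache[block_id], "local"
        for peer in self.nodes:
            if peer is not node and block_id in peer.cache:
                return peer.cache[block_id], "remote"
        data = self.disk[block_id]
        self._place(node, block_id, data)
        return data, "disk"

    def _place(self, node, block_id, data):
        # Prefer local memory; otherwise borrow idle memory on a peer.
        if node.has_room():
            node.cache[block_id] = data
            return
        for peer in self.nodes:
            if peer is not node and peer.has_room():
                peer.cache[block_id] = data
                return
        # No idle memory anywhere: evict the oldest local block (assumed policy).
        oldest = next(iter(node.cache))
        del node.cache[oldest]
        node.cache[block_id] = data


disk = {"a": b"A", "b": b"B"}
busy = Node("busy", capacity=1)   # short of memory
idle = Node("idle", capacity=4)   # has idle memory
cluster_cache = CooperativeCache([busy, idle], disk)
sources = [cluster_cache.read(busy, block)[1] for block in ("a", "b", "b", "a")]
print(sources)
```

In this toy run the second block does not fit on the busy node, so it is parked in the idle node's memory and the next read of it is served remotely rather than from disk.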
|
|
|
Title |
|
Object Storage: Scalable
Bandwidth for HPC Clusters |
Author(s) |
|
G. Gibson, B. Welch, and D. Nagle |
Author Inst |
|
Panasas Inc. |
Presenter |
|
Garth
Gibson |
| Abstract |
|
This session describes the Object Storage Architecture
solution for cost-effective, high bandwidth storage in HPC environments.
It addresses the unique problems of storage intensive computations
in very large clusters, suggesting that a shared file system
with out-of-band metadata management is needed to achieve the
required bandwidth. The session further argues that for excellent
data reliability, storage protection needs to be supported on
the data path and it recommends the higher-level semantics of
object-based, rather than block-based, storage for scalable performance,
data reliability and efficient sharing.
|
|
|
Title |
|
Analyzing Cluster Log
Files Using Logsurfer |
Author(s) |
|
James Prewett |
Author Inst |
|
University of New Mexico |
Presenter |
|
James Prewett |
| Abstract |
|
Logsurfer is a log analysis tool that simplifies
maintaining a cluster by aiding identification and resolution
of system issues. This session will outline several examples
of using Logsurfer in a cluster environment. Examples range from
finding the traces of a complex exploitation of a service to
determining which of a set of nodes have problems rebooting.
Attendees will learn to configure Logsurfer to meet the particular
needs of their environment.
|
|
|
Title |
|
Performance Evaluation
of Load Sharing Policies with PANTS on Beowulf Cluster |
Author(s) |
|
James Nichols and Mark Claypool |
Author Inst |
|
Worcester Polytechnic Institute |
Presenter |
|
James
Nichols |
| Abstract |
|
Powerful, low-cost clusters of personal computers,
such as Beowulf clusters, have fueled the potential for widespread
distributed computation. While these Beowulf clusters typically
have software that facilitates development of distributed applications,
there is still a need for effective distributed computation that
is transparent to the application programmer.
|
|
|
Title |
|
On the Numeric Efficiency
of C++ Packages in Scientific Computing |
Author(s) |
|
Ulisses Mello and Ildar Khabibrakhmanov |
Author Inst |
|
T.J. Watson Research Center |
Presenter |
|
Ulisses
Mello |
| Abstract |
|
Object-Oriented Programming (OOP) has proven
to be a useful paradigm for programming complex models. In spite
of recent interest in expressing OOP paradigms in languages such
as FORTRAN90, C++ is the dominant OO language in scientific computing,
despite its complexity. Barton & Nackman advocated C++ as
a replacement for FORTRAN in engineering and scientific computing
due to its availability, portability, efficiency, correctness
and generality. These authors used OOP for code reorganization
of LAPACK (Linear Algebra PACKage), and they were able to group
and wrap over 250 FORTRAN routines into a much smaller set of classes,
which expressed the common structure of LAPACK.
|
|
|
Title |
|
Benchmarking I/O Solutions
for Clusters |
Author(s) |
|
Stefano Cozzini and Moshe Bar |
Author Inst |
|
Democritos INFM National Simulation Center |
Presenter |
|
Stefano
Cozzini |
| Abstract |
|
Clustered Systems offer many advantages for demanding
scientific applications: they can deal with massive CPU-bound
requirements and allow the distribution of RAM among many nodes.
However, many scientific applications process massive amounts
of data and therefore require high performance, distributed storage
next to parallel I/O. This session will discuss present-day I/O
cluster solutions based on Bonnie performance benchmarking for
a variety of popular systems.
|
|
|
Title |
|
The Design, Implementation,
and Evaluation of mpiBLAST |
Author(s) |
|
Aaron E. Darling, Lucas Carey, and Wu-chun Feng |
Author Inst |
|
University of Wisconsin -- Madison |
Presenter |
|
Aaron
E. Darling |
| Abstract |
|
mpiBLAST is an Open Source parallelization of
BLAST that achieves superlinear speed-up by segmenting a BLAST
database and then having each node in a computational cluster
search a unique portion of the database. Database segmentation
permits each node to search a smaller portion of the database,
eliminating disk I/O and vastly improving BLAST performance.
Because database segmentation does not create heavy communication
demands, BLAST users can take advantage of low-cost and efficient
Linux cluster architectures such as the bladed Beowulf. In addition
to this presentation of the software architecture of mpiBLAST,
there will be a detailed performance analysis of mpiBLAST to
demonstrate its scalability.
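
The database segmentation idea can be sketched in a few lines. This is a toy model and not mpiBLAST itself: threads stand in for cluster nodes, a shared k-mer count stands in for a BLAST alignment score, and all function names are hypothetical. What it shows is the structure described above: split the database into fragments, search each fragment independently, then merge the hits.

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(sequence, k=3):
    """Set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def segment_database(database, num_fragments):
    """Split the database into fragments, one per worker."""
    return [database[i::num_fragments] for i in range(num_fragments)]


def search_fragment(fragment, query):
    """Toy search: score each sequence by the k-mers it shares with the query."""
    query_kmers = kmers(query)
    hits = []
    for name, sequence in fragment:
        score = len(query_kmers & kmers(sequence))
        if score > 0:
            hits.append((score, name))
    return hits


def parallel_search(database, query, num_workers):
    """Search every fragment concurrently and merge the ranked hits."""
    fragments = segment_database(database, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial = list(pool.map(lambda frag: search_fragment(frag, query), fragments))
    merged = [hit for hits in partial for hit in hits]
    return sorted(merged, key=lambda hit: (-hit[0], hit[1]))


database = [
    ("s1", "ACGTACGT"),
    ("s2", "TTTTTTTT"),
    ("s3", "ACGTTTTT"),
    ("s4", "GGGGGGGG"),
]
hits = parallel_search(database, "ACGTAC", num_workers=2)
print(hits)
```

Because each worker only touches its own fragment, the workers exchange nothing until the final merge, which is why the approach places light demands on the interconnect.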
|
|
|
 |
Systems |
|
Title |
|
SLURM: Simple Linux Utility
for Resource Management |
Author(s) |
|
Morris Jette and Mark Grondona |
Author Inst |
|
Lawrence Livermore National Laboratory |
Presenter |
|
Morris Jette |
| Abstract |
|
SLURM is an open source, fault-tolerant and highly
scalable cluster management and job scheduling system for Linux
clusters of thousands of nodes. Components include machine status,
partition management, scheduling and stream copy modules. This
session presents an overview of the SLURM architecture and functionality.
|
|
|
Title |
|
A Simple Installation and
Administration Tool for the Large-Scaled PC Cluster System:
DCAST |
Author(s) |
|
Tomoyuki Hiroyasu, Mitsunori Miki, Kenzo Kodama,
Junichi Uekawa, and Jack Dongarra |
Author Inst |
|
Doshisha University |
Presenter |
|
Tomoyuki Hiroyasu |
| Abstract |
|
The installation and configuration of clusters
with many nodes is difficult due to the large amount of time
and knowledge required to fully complete the task. To solve this
problem a simple installation and administration tool, “Doshisha
Cluster Auto Setup Tool: DCAST,” has been developed. Targeted
at Linux, it supports both diskless and diskful clusters, requires
no interaction during installation, boots slave nodes over the network,
and propagates configuration changes to all nodes.
|
|
|
Title |
|
The Space Simulator |
Author(s) |
|
Michael S. Warren, Chris Fryer, and M. Patrick
Goda |
Author Inst |
|
Los Alamos National Laboratory |
Presenter |
|
Michael S.
Warren |
| Abstract |
|
The Space Simulator is a 294 processor Beowulf
cluster with a peak performance near 1.5 Teraflops. It achieved
Linpack performance of 665.1 Gflops on 288 processors, making
it the 85th fastest computer in the world. The Space Simulator
Cluster is dedicated to performing computational astrophysics
simulations in the Theoretical Astrophysics group (T6) at Los
Alamos National Laboratory. This case study will outline the
design drivers, software and applications applied to |
|
|
Title |
|
A Middleware-Level Parallel
Transfer Technique Over Multiple Network Interfaces |
Author(s) |
|
Nader Mohamed, Jameela Al-Jaroodi, Hong Jiang,
and David Swanson |
Author Inst |
|
University of Nebraska--Lincoln |
Presenter |
|
Nader Mohamed |
| Abstract |
|
Network middleware is a software layer that provides
abstract network APIs to hide the low-level technical details
from users. Existing network middleware supports message transfers
over only a single network interface and link. In this session, we describe
a middleware-level parallel transfer technique that utilizes
multiple network interface units that may be connected through
multiple networks. It operates on any reliable transport protocol
such as TCP and transparently provides an expandable high bandwidth
solution that reduces message transfer time, provides fault tolerance
and facilitates dynamic load balancing between the underlying
multiple networks. The experimental evaluation displayed a peak
performance of 187 Mbps on two Fast Ethernet networks.
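
One way to picture a parallel transfer over several interfaces is message striping: the sender cuts a message into numbered fragments, spreads them across the available channels, and the receiver reorders them by sequence number. The sketch below is an in-memory illustration under that assumption; it is not the authors' middleware, the function names are hypothetical, and plain lists stand in for the TCP connections on each interface.

```python
def stripe(message, num_channels, chunk_size):
    """Split a message into numbered fragments, round-robin across channels."""
    channels = [[] for _ in range(num_channels)]
    for seq, offset in enumerate(range(0, len(message), chunk_size)):
        fragment = message[offset:offset + chunk_size]
        channels[seq % num_channels].append((seq, fragment))
    return channels


def reassemble(channels):
    """Merge fragments from all channels back into the original byte order."""
    fragments = [fragment for channel in channels for fragment in channel]
    fragments.sort(key=lambda item: item[0])
    return b"".join(data for _, data in fragments)


message = b"cluster-interconnect-payload" * 10
channels = stripe(message, num_channels=2, chunk_size=16)
restored = reassemble(channels)
print(len(channels[0]), len(channels[1]), restored == message)
```

A real implementation would also weight the round-robin by link speed and resend fragments from a failed link, which is where the load balancing and fault tolerance mentioned above would come in.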
|
|
|
Title |
|
The Cluster Integration
Toolkit (CIT) |
Author(s) |
|
James H. Laros III, Lee Ward, Nathan W. Dauchy,
Ruth Klundt, Glen Laguna, James Vasak, Marcus Epperson, and Jon
R. Stearley |
Author Inst |
|
Sandia National Labs |
Presenter |
|
James
H. Laros III |
| Abstract |
|
The Cluster Integration Toolkit is an extensible,
portable, scalable cluster management software architecture for
a variety of systems. It has been successfully used to integrate
and support a number of clusters at Sandia National Labs and
several other sites, the largest of which is 1861 nodes. This
session will discuss the goals of the project and how they were
achieved. The installation process will be described and common
tasks for cluster implementation and support will be demonstrated.
|
|
|
Title |
|
Scalable C3 Power Tools |
Author(s) |
|
Stephen Scott and Brian Luethke |
Author Inst |
|
Oak Ridge National Laboratory |
Presenter |
|
John
Mugler |
| Abstract |
|
With the growth of the typical cluster reaching
512 and more compute nodes, it is apparent that cluster tools
must begin to reach toward thousands of nodes in scalability.
Version 3.2 of the C3 tools has started stretching the Single
System Illusion concept into the realm of thousands of compute
nodes by actually improving performance on larger clusters. This
session is a discussion of how this was implemented and how to
use this new version of C3 and also presents some results comparing
the latest release with prior versions of C3.
|
|
|
Title |
|
Full Circle: Simulating
Linux Clusters on Linux Clusters |
Author(s) |
|
Jose Moreira, Luis Ceze, Karin Strauss, George
Almasi, Patrick J. Bohrer, Jose R. Brunheroto, Calin Cascaval,
Jose G. Castanos, and Derek Lieber |
Author Inst |
|
IBM T.J. Watson Research Center |
Presenter |
|
Jose Moreira |
| Abstract |
|
BGLsim is a complete system simulator for parallel
machines allowing users to develop, test and run the same code
that will be used in a real system. It is currently being used
in hardware validation and software development for the BlueGene/L
cellular architecture machine. BGLsim is capable of functionally
simulating multiple nodes of this machine operating in parallel.
It simulates instruction execution in each node and the communication
that happens between nodes. To illustrate the capabilities of
BGLsim, experiments running the NAS Parallel Benchmark IS on
a simulated BlueGene/L machine are described.
|
|
|
Title |
|
Memory Performance of Dual-Processor
Nodes: Comparison of Intel Xeon and AMD Opteron Memory
Subsystem Architectures |
Author(s) |
|
Avijit Purkayastha, Chona S. Guiang, Kent F.
Milfeld, and John R. Boisseau |
Author Inst |
|
University of Texas--Austin |
Presenter |
|
Avijit
Purkayastha |
| Abstract |
|
There are several important features in the AMD
x86-64 microarchitecture (Opteron) and the HyperTransport
technology that are beneficial to the HPC community. The Opteron
processor has an integrated memory controller, and hence a direct
connection to memory through two 64-bit wide interfaces.
More importantly, this means that each processor in an SMP system
has a "separate" interface and memory modules. In addition,
HyperTransport technology has been built directly into the processors
and also into the chipsets, creating processor-to-processor
and processor-to-chipset interconnects that are high-speed
and have low latencies. Systems that implement processors with
on-chip memory controllers and HyperTransport point-to-point
links for interprocessor communication can support parallel
applications that have large communication and data sharing needs.
Such systems provide an ideal environment for both shared-memory
(OpenMP) and distributed-memory (MPI) paradigms.
The Opteron can also achieve excellent single-processor
performance. It is unencumbered by the latencies and bottleneck
of a north bridge, so memory-intensive applications have
the opportunity to deliver full-bandwidth streams from memory
to each processor. The large L2 caches provide more room for
improving the performance of compute-intensive applications.
Also, the native 64-bit Opteron microarchitecture supports
large-memory applications, as well as legacy 32-bit applications,
concurrently.
In this paper we will explore the benefits of the new x86-64
architecture through the performance of some standard HPC code
kernels and applications that use multithreading (OpenMP)
and multiprocessing (MPI). Our analysis will focus on characteristics
of the memory subsystem and will examine two key issues. We will
conduct scaling studies of compute- and memory-intensive applications
on dual-processor AMD Opteron and Intel Xeon nodes to assess
how well the memory subsystem copes with the increased memory
demands of the second processor. We will also investigate how
OS memory affinity and process binding affect memory bandwidths.
|
|
|
Title |
|
Scheduling for Improved
Write Performance in a Cost-Effective, Fault-Tolerant Parallel
Virtual File System (CEFT-PVFS) |
Author(s) |
|
Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng, and
David R. Swanson |
Author Inst |
|
University of Nebraska -- Lincoln |
Presenter |
|
Yifeng
Zhu |
| Abstract |
|
This session will demonstrate that all the disks
on the nodes of a cluster can be connected together through CEFT-PVFS,
a RAID-10-style parallel file system for Linux systems, to provide
GBytes/sec parallel I/O performance without any additional
cost. To improve the overall I/O performance, I/O requests can
be scheduled on a less loaded node in each mirroring pair, thus
making more informed scheduling decisions. Based on the heuristic
rules we found from the experimental results, a scheduling algorithm
for dynamic load-balancing has been developed that significantly
improves the overall performance.
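
The core scheduling step, choosing the less loaded node in each mirroring pair, can be sketched as follows. The abstract does not give the heuristic rules or the load metric, so the single load number per node used here is an assumption, and the function and node names are hypothetical.

```python
def choose_targets(mirror_pairs, load):
    """For each mirroring pair, pick the node with the lower reported load.

    mirror_pairs: list of (primary, mirror) node names.
    load: dict mapping node name to a load estimate (assumed metric).
    Ties go to the first node of the pair.
    """
    return [min(pair, key=lambda node: load[node]) for pair in mirror_pairs]


mirror_pairs = [("primary0", "mirror0"), ("primary1", "mirror1")]
load = {"primary0": 0.9, "mirror0": 0.2, "primary1": 0.1, "mirror1": 0.1}
targets = choose_targets(mirror_pairs, load)
print(targets)
```

Since both nodes of a pair hold the same data in a RAID-10-style layout, either can serve a request, so steering each request to the lighter one balances load without moving any data.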
|
|
|
Title |
|
Achieving Order through
CHAOS: The LLNL HPC Cluster Experience |
Author(s) |
|
Robin Goldstone, Ryan Braby, and Jim Garlick |
Author Inst |
|
Lawrence Livermore National Laboratory |
Presenter |
|
Robin
Goldstone |
| Abstract |
|
For the past several years, Lawrence Livermore
National Laboratory (LLNL) has invested significant effort in
the deployment of large High Performance Computing (HPC) Linux
clusters. After deploying two modest sized clusters (88 nodes
and 128 nodes) in early 2002, efforts progressed to the deployment
of the Multiprogrammatic Capability Resource (MCR, 1154 nodes)
in fall 2002 and ASCI Linux Cluster (ALC, 962 nodes) in early
2003. Through these efforts, LLNL has developed expertise in
a number of areas related to the design, deployment and management
of large Linux clusters. In this session LLNL will present their
experiences, including challenges encountered and lessons learned.
|
|
|
Title |
|
Supercomputing Center
Management Using AIRS |
Author(s) |
|
Robert
Ballance, Jared Galbraith, and
Roy Heimbach |
Author Inst |
|
University of New Mexico |
Presenter |
|
Robert A. Ballance |
| Abstract |
|
Running a large university supercomputing center
teaches many lessons, including the need to centralize data collection
and analysis, automate system administration functions, and enable
users to manage their own projects. Albuquerque Integrated Reporting
System (AIRS), a centralized, web-enabled application capable
of user and project administration across multiple clusters and
reporting against both active and historical data, evolved in
response to these pressures.
|
|
|
 |
Bioinformatics |
|
Title |
|
Running BLAST on
a Linux Cluster |
Presenter(s) |
|
Ray Hookway |
Presenter Inst |
|
Hewlett-Packard |
| Abstract |
|
Everyone knows that BLAST is an example of an
embarrassingly parallel application, i.e., an application that
will run well on a cluster. Conceptually, one breaks up a query
against a database into several queries against subsets of the
database and distributes the resulting jobs across the nodes
of the cluster. However, it is not obvious how to go about doing
this. The talk will begin with a brief review of how BLAST works
and then will explore factors that affect the performance of
BLAST running on a single system. The final focus will be on the
answer to the question “How to run BLAST on a cluster?”
|
|
|
Title |
|
BioBrew Linux: A Linux
Cluster Distribution for Bioinformatics |
Presenter(s) |
|
Glen Otero |
Presenter Inst |
|
Callident |
| Abstract |
|
BioBrew Linux is the first known attempt at creating
and freely distributing an easy-to-use clustering software package
designed for bioinformaticists. With support for both IA32 and
IA64 platforms, BioBrew is a Linux distribution that combines
the NPACI Rocks cluster software with several popular Open Source
bioinformatics software tools like BLAST, HMMER, ClustalW and
BioPerl. The result is a Linux distribution that can be used
to install a workstation or a Beowulf cluster for bioinformatics
analyses.
|
|
|
Title |
|
Terascale Linux Clusters:
Supercomputing Solutions for the Life Sciences |
Presenter(s) |
|
Bruce Ling
and Padmanabhan Iyer |
Presenter Inst |
|
Tularik, Inc. and Linux Networx (respectively) |
| Abstract |
|
At Tularik, a biotechnology company specializing
in drug discovery and development using gene regulation, informatics
has become essential for the process of genomics-based drug discovery.
With the explosion of the genomic data and lead discovery screening
data points, a powerful computing environment becomes a must
in order to boost R&D productivity. By deploying a 150-processor
cluster, Tularik has successfully managed millions of data points,
coming from Assay-Development, High-Throughput-Screening (HTS),
Structure-Activity-Relationship (SAR), Lead-Optimization and
Micro-Array to speed its R&D productivity and decision-making
processes. |
|
|
Title |
|
Blade Servers for Genomic
Research |
Presenter(s) |
|
Ron Neyland |
Presenter Inst |
|
RLX Technologies |
| Abstract |
|
Clusters based on industry standard hardware
and software have become the most widely used tools for performing
genomic processing and analysis. While providing many benefits
such as outstanding price/performance, they also introduce a
new set of problems. This session will address how blade servers
provide a compute cluster platform that delivers the compute
power required for genomic research, while minimizing many of
the problems. Real world examples of clusters running many of
the widely used genomic applications will be presented, along
with tips and tools for managing the cluster environment.
|
|
|
Title |
|
High Performance Mathematical
Libraries for Itanium 2 Clusters |
Presenter(s) |
|
Hsin-Ying
Lin |
Presenter Inst |
|
Hewlett-Packard |
| Abstract |
|
HP’s Mathematical LIBrary (HP MLIB) provides
a user-friendly interface using standard definitions of public
domain software and enables users to access the power of high
performance computing. HP MLIB fully exploits the architecture
of the processor and achieves optimal performance on Itanium
2. HP MLIB has been used by high performance computing customers
for over 15 years. This session will provide a brief overview
of relevant architectural features and depict how these features
have been used to design high-level algorithms. The performance
of some of the key components in HP MLIB on Itanium 2 clusters
will be discussed: i.e. matrix multiplication, ScaLAPACK and
SuperLU_DIST.
|
|
|
Title |
|
Parallel Computational
Biology Tools and Applications for Windows Clusters |
Presenter(s) |
|
Jaroslaw
Pillardy |
Presenter Inst |
|
Cornell Theory Center |
| Abstract |
|
Using massively parallel programs for data analysis
is the most popular way of dealing with the enormous amounts
of data produced in molecular biology research. Several computational
biology tools for Microsoft Windows clusters of different levels
of complexity, available at the Computational Biology Service
Unit at the Cornell Theory Center, will be discussed. All of
the tools follow a master-worker approach using MPI communications.
The simplest tools - tools that are very important to biologists
- are standard sequence-based data mining tools such as BLAST
and HMMER. More sophisticated is the structure-based (threading)
protein annotation algorithm LOOPP.
|
|
|
Title |
|
Building Software for
High Performance Informatics and Chemistry |
Presenter(s) |
|
Joseph Landman |
Presenter Inst |
|
Scalable Informatics LLC |
| Abstract |
|
Given the growth rate of life science data sets,
analysis applications designed for single machines with shared
memory and one or more CPUs quickly lead to a performance bottleneck.
Clusters and Grids represent a potential solution to this bottleneck
but only when applications are properly designed to make full
use of the resources available. In this session we will look
at the hard realities of building software for the informatics
industry, including: problems with running legacy software on
clusters, how to make efficient use of clusters, for both the
cluster and the user, and making life science informatics and
chemistry applications scale well on clustered systems.
|
|
|
Title |
|
To Cluster or Not to
Cluster |
Presenter |
|
Tom Scanlan |
Presenter Inst |
|
NEC Solutions America |
| Abstract |
|
(Unavailable)
|
|
|
 |
Automotive
& Aerospace Engineering |
|
Title |
|
Cluster Computing in
Space Applications |
Presenter(s) |
|
Eric George |
Presenter Inst |
|
The Aerospace Corporation |
| Abstract |
|
This case study will examine how The Aerospace
Corporation utilizes cluster computing for a variety of applications
in support of high priority national defense programs including
the Global Positioning System (GPS) and future missile warning
programs. Applications to date have focused on astrodynamics,
satellite constellation design, communications network modeling,
thermal analysis, and complex scheduling/tasking algorithms.
Processing techniques range from Monte Carlo analysis & brute
force search operations to genetic algorithms. Research is progressing
on implementation of a diverse grid-computing environment at
Aerospace. |
|
|
Title |
|
Full Vehicle Dynamic
Analysis Using Automated Component Modal Synthesis |
Presenter(s) |
|
Peter Schartz |
Presenter Inst |
|
MSC Software |
| Abstract |
|
Today it is commonplace to attempt to analyze
the fully trimmed body of an automobile for its vibration characteristics,
over increasing frequency ranges, and on inexpensive computer
hardware. The cost effectiveness of RISC based cache processors,
combined with upward pressure in the form of large, detailed
models, has allowed new software methods to utilize domain decomposition
to enable high-level parallelism. A domain decomposition, followed
by a component modal synthesis solution, is the basis for Automated
Component Modal Synthesis (ACMS) in MSC.Nastran. The solution
is described in theory, and its effectiveness is demonstrated
by an example taken from today’s automotive industry.
|
|
|
Title |
|
Using Clusters to Deliver
Turnkey CFD Solutions |
Presenter(s) |
|
Greg Stuckert |
Presenter Inst |
|
Fluent |
| Abstract |
|
While low cost, high performance clusters have
been in use since the early 1990’s, the application of
commercial off-the-shelf CFD software, such as Fluent, to harness
these shared-nothing architectures has only been viable near
the end of that decade. Early implementations required a persistent
IT department willing to commit the time and resources necessary
to overcome these challenges. Now, however, organizations are
able to access a full-featured implementation of Fluent via the
Internet in a pay-as-you-go scenario. This session will discuss
the problems solved and gains realized by a distributed implementation
of Fluent 6.1. |
|
|
Title |
|
LS-DYNA: CAE Simulation
Software on Linux Clusters |
Presenter(s) |
|
Guangye Li |
Presenter Inst |
|
IBM |
| Abstract |
|
LS-DYNA is used in a wide variety of simulation
applications: automotive crashworthiness and occupant safety,
sheet metal forming, military and defense applications, aerospace
industry applications, and electronic component design. Several years
ago, one simulation of a very simplified finite element model
needed days to complete on a Symmetric Multiprocessing (SMP)
vector computer. With the introduction of Distributed Multiprocessing
technology, the MPP (Massively Parallel Processors) version of
LS-DYNA can dramatically reduce the turnaround time for the simulation
and therefore reduce the time for the automotive design process.
We will present the comparison of the scalability of the SMP
and MPP versions of LS-DYNA, as well as the comparison of communication
networks (Myrinet, Fast Ethernet, Gigabit Ethernet) on Linux
clusters.
|
|
|
Title |
|
Linux Clusters in the
German Automotive Industry |
Presenter(s) |
|
Karsten Galer |
Presenter Inst |
|
science + computing AG |
| Abstract |
|
After the first German CAE-Linux computer cluster
(LCC) was installed in 1999 at DaimlerChrysler for electromagnetic
compatibility calculations (EMC), there has been great success
in the adoption of LCC. This includes clusters based on 512 CPUs
used for crash-calculations running at a major automotive manufacturer.
This talk will provide an overview of ways in which Linux clusters
are changing the course of CAE in Germany. It will also look
at a number of different configurations currently being implemented
in some of the world’s largest automotive manufacturers.
|
|
|
Title |
|
Improving Multi-site/Multi-departmental
Cluster Systems through Data Grids in the Automotive and Aerospace
Industries |
Presenter(s) |
|
Andrew Grimshaw |
Presenter Inst |
|
Avaki |
| Abstract |
|
As the pressure increases to optimize the product
design and manufacturing processes it is critical for the automotive
and aerospace industries to give professionals secure access
to product and manufacturing information. Data is often located
at multiple R&D sites and suppliers, regardless of location.
Additionally, product developers require more and more processing
power, delivered via clusters that are not effective unless they
can provide access to the data they need. This session will examine
the most significant data challenges facing today’s automotive
and aerospace companies and how Grid technology impacts the engineering
and manufacturing process.
|
|
|
Title |
|
Scrutinizing CFD
Performance on Multiple Linux Cluster Architectures |
Presenter(s) |
|
Thomas Hauser |
Presenter Inst |
|
Utah State University |
| Abstract |
|
Linux cluster supercomputers are a cost-effective
platform for simulating fluid flow in engineering applications.
However, obtaining high performance on these clusters is a non-trivial
problem, requiring tuning and design modifications to the Computational
Fluid Dynamics (CFD) codes. Investigations in optimizing CFD
codes on Linux cluster platforms will be presented. Detailed
performance results of two CFD codes on a wide range of cluster
architectures, including Pentium and Athlon, Intel Itanium and
the AMD Opteron, will be analyzed. The single and multi-processor
performance of these codes on different cluster architectures
will be compared and means of improving performance discussed.
|
|
|
Title |
|
Managing CAE Simulation
Workload in Cluster Environments |
Presenter(s) |
|
Michael M. Humphrey |
Presenter Inst |
|
Altair |
| Abstract |
|
Automotive manufacturers are beginning to capitalize
on workload management software to get the most out of their numerically
intense computing environments. Workload management software
is middleware technology that sits between your compute-intensive
applications - such as ABAQUS, ANSYS, FLUENT, LS-DYNA, NASTRAN
and OPTISTRUCT - and your network hardware operating systems.
The software schedules and distributes all types of application
runs (serial, parallel, distributed memory, parameter studies,
big memory, long running, etc.), on all types of hardware (desktops,
clusters, supercomputers and even across sites). This presentation
will describe the current capabilities of PBS Pro workload management
software as a middleware enabler for robust system design.
|
|
|
 |
Digital Content Creation / Scientific Visualization / Simulation |
|
Title |
|
The Current State of
Numerical Weather Prediction on Cluster Technology -- What
is Needed to Break the 25% Efficiency Barrier? |
Presenter(s) |
|
Dan Weber |
Presenter Inst |
|
Center for the Analysis and Prediction of Storms |
| Abstract |
|
This session will look in depth at the current
state of weather prediction and the many challenges it faces.
The talk will examine the computational needs (teraflops) of
a robust numerical weather prediction (NWP) system at thunderstorm
scale and review NWP performance on current computer technology.
Current models will be reviewed, as well as the
roadblocks associated with clusters. Finally, a proposal for
a complete shift in the way systems of equations are solved on
scalar technology in order to break the 25% efficiency ceiling
will be examined.
|
|
|
Title |
|
Building and Using Tiled
Display Walls |
Presenter(s) |
|
Paul Rajlich |
Presenter Inst |
|
National Center for Supercomputing Applications
(NCSA) |
| Abstract |
|
Tiled display walls provide a large-format environment
for presenting very high-resolution visualizations by tiling
together the output from a collection of projectors. Projectors
are driven by a Linux cluster augmented with high-performance
graphics accelerator cards and costs are controlled by using
commodity projectors and low-cost PCs. Tiled walls must face
a number of challenges, such as aligning the projectors so that
the output of adjacent tiles aligns to create a seamless image.
This session will discuss the Alliance Display Wall-in-a-Box
effort, a distribution of related Open Source software packages
that simplify the setup and maintenance of complex high-end display
systems.
|
|
|
Title |
|
Discovery and Analysis
of Communication Patterns in Complex Network-based Systems
Using Virtual Environments |
Presenter(s) |
|
Tom Caudell |
Presenter Inst |
|
University of New Mexico |
| Abstract |
|
The real-time visualization of cluster networks
provides a number of benefits to administrators and developers
in search of performance bottlenecks. Real-time visuals provide
early warning of real problems in network traffic as well as
provide clear indication of potential problems before they occur.
However, real-time network visualization is a remarkably difficult
project. This session will discuss a number of the technical
hurdles involved in building a visualization system that will
scale with increased performance. Using network visualization,
organizations can design applications that take better advantage
of network traffic, avoiding bottlenecks, and administrators
can make informed decisions on scheduling that lead a cluster
toward optimal performance.
|
|
|
Title |
|
HPC and HA Clustering
for Online Gaming |
Presenter(s) |
|
Jesper Jensen |
Presenter Inst |
|
SCI |
| Abstract |
|
SCI, the company that developed and supports the
backend for the Department of Defense's America's Army game,
will deliver a case study on deploying gaming clusters for the
DoD — and other game titles — and give an overview
of where large-scale game technology is and where it is going.
With technology capable of pushing an average of 1.35 teraflops
per cabinet space and leveraging multiple transit carriers, SCI
clusters deliver both the HPC and HA required to support a massive
gaming audience. This discussion will touch on solutions for
32-bit and next-generation 64-bit architectures both in place
and under development.
|
|
|
Title |
|
Large Scale Scientific
Visualization on PC Clusters |
Presenter(s) |
|
Brian Wylie |
Presenter Inst |
|
Sandia National Labs |
| Abstract |
|
This session covers the use of PC clusters with
commodity graphics cards as high-performance scientific visualization
platforms. A cluster of PC nodes, in which many or all of the
nodes have 3D hardware accelerators, is an attractive approach
to building a scalable graphics system. The main challenge in
using cluster-based graphics systems is the difficulty of realizing
the full aggregate performance of all the individual graphics
accelerators. Topics covered will include parallel geometric
rendering, parallel volume rendering, data distribution approaches
and novel techniques for utilizing graphics processors.
|
|
|
Title |
|
The Use of Clusters for
Engineering Simulation |
Presenter(s) |
|
Lynn Lewis |
Presenter Inst |
|
Hewlett-Packard |
| Abstract |
|
Clusters allow the use of advanced mathematical
techniques for optimization, changing the way engineers arrive
at cost effective, safe designs. Without inexpensive clusters,
engineers at automotive manufacturers could not run thousands of
crash test simulations integrated with the initial design stage,
nor test for structural integrity, much less manufacturability,
within weeks. This session will examine in detail how, over the
previous decade, Unix and lately Linux clusters have found use
in commercial crash and fluid dynamics simulations, changing the
way cars and aircraft are designed and built.
|
|
|
Title |
|
NEESgrid: Virtual Collaboratory
for Earthquake Engineering and Simulation |
Presenter(s) |
|
Tom Prudhomme |
Presenter Inst |
|
National Center for Supercomputing Applications
(NCSA) |
| Abstract |
|
NEESgrid will link earthquake engineering researchers
across the U.S. with leading-edge computing resources and research
and testing facilities, allowing teams to plan, perform, and
publish their research. Via both Telepresence and other collaboration
technologies, research teams are able to work remotely on experimental
trials and simulations. This session will examine how NEESgrid,
through the shared resources of Grid technology, will bring together
information technologists and engineers in a way that will revolutionize
earthquake engineering, research and simulation.
|
|
|
Title |
|
In the Architecture of
an Audio Identification Cluster |
Presenter(s) |
|
Daniel Culbert |
Presenter Inst |
|
Shazam Entertainment, Inc. |
| Abstract |
|
(Unavailable)
|
|
|
 |
Cluster Solutions |
|
Title |
|
Building the TeraGrid:
The World's Largest Grid, Fastest Linux Cluster, and Fastest
Optical Network Dedicated to Open Science |
Presenter(s) |
|
Pete Beckman |
Presenter Inst |
|
Argonne National Laboratory |
| Abstract |
|
The TeraGrid is one of the most ambitious collaborative
grid projects ever undertaken. The building blocks for the $88
million National Science Foundation funded project include mammoth
computational resources, ultra-fast fiber-optic networks linking
NCSA, SDSC, CalTech, Argonne and PSC, and a software “grid
hosting environment.” Together, they will form an environment
that makes developing cluster-based, grid-enabled scientific
applications easy. This presentation will provide an overview
of the project, the bleeding edge technologies used to bring
clusters and grids to the scientific community and an update
on current status and results.
|
|
|
Title |
|
Building Blocks for 64-bit
AMD Opteron Clusters |
Presenter(s) |
|
Richard Brunner |
Presenter Inst |
|
AMD |
| Abstract |
|
This presentation describes the hardware and
software building blocks that are in place to construct 64-bit
AMD Opteron(TM)-based clusters. We begin with an overview of
the newly released AMD Opteron(TM) processor and its system architecture
that allows affordable 64-bit clustered computing while maintaining
32-bit performance and compatibility. Special attention will
be given to the "glueless" multiprocessing capability
provided by fast HyperTransport(TM) Technology interconnects
and per-processor integrated memory controllers. We will next
describe how 64-bit SuSE Linux Enterprise Server for AMD x86-64
exploits this hardware topology and discuss the accompanying
thread and explicit parallelism tools and compilers that are
available. We will end the presentation with a survey of the
available third-party cluster adapters and interconnects that
are supported on AMD Opteron(TM) platforms.
|
|
|
Title |
|
Tools for Optimizing
HPC Applications on Intel Clusters |
Presenter(s) |
|
Don Gunning |
Presenter Inst |
|
Intel |
| Abstract |
|
The Intel software research lab is involved in
several projects related to the development and deployment of
HPC software on Intel-based clusters. This discussion will focus
on the work Intel is doing in parallel/concurrent computing within
a single job or task; the development, debugging and tuning of
multithreaded applications; deploying MPI (and mixed MPI/threaded)
applications; and extending OpenMP to execute across clusters.
This discussion will also touch on ideas for maximum messaging
performance on the interconnect while maximizing application
performance on the node.
|
|
|
Title |
|
The Ultra Scalable HPTC
Lustre Filesystem |
Presenter(s) |
|
Kent Koeninger |
Presenter Inst |
|
Hewlett-Packard |
| Abstract |
|
The Lustre filesystem is designed to provide
a coherent, scalable shared filesystem that can serve thousands
of Linux client nodes, delivering extremely high-bandwidth parallel-filesystem
access to many terabytes of storage. This talk will describe
how the Lustre filesystem will be used in scalable HPTC Linux
systems to combine the flexibility, scalability and manageability
of NAS systems with the performance of SAN systems. The Lustre
development effort is an open source project with an initial release
targeted for 2003.
|
|
|
Title |
|
Building the World's
Most Powerful Cluster: 11.2 Tflops at Lawrence Livermore
National Laboratory |
Presenter(s) |
|
Kim Clark |
Presenter Inst |
|
Linux NetworX |
| Abstract |
|
In 2002, Linux Networx built the MCR cluster
housed at Lawrence Livermore National Laboratory. It is currently
the largest cluster in the world with a theoretical peak of 11.2
Tflops and, with more than 1,000 nodes to manage and monitor,
ranks as the fifth largest supercomputer in the world. The unique
challenges involved in building and configuring such a massive
system and what was learned from this experience will be discussed.
Attendees will learn how to apply aspects of the LLNL system
to their own smaller system to enhance cluster performance and
reliability.
|
|
|
Title |
|
Driving Cluster/Grid
Technologies in HPC |
Presenter(s) |
|
David Barkai |
Presenter Inst |
|
Intel |
| Abstract |
|
High performance computing has undergone a metamorphosis
in the last 15-20 years. The changes, and what they mean to the
industry and the user community, will be reviewed. The cluster
approach to HPC drives the evolution of a new ecosystem. In this
talk we will describe the building blocks as a set of components
built upon enabling technologies. The application characteristics
determine the choices made for system software, middleware, interconnect,
cluster topology, the nodes, and the processor. The resulting
architecture and the nature of the workload and computing environment
dictate the management tools that are needed. We will summarize
the considerations for the choices that need to be made while
highlighting the gaps and the challenges, as cluster computing
ramps up and grid computing continues to develop.
|
|
|
Title |
|
Emerging Trends in Data
Center Powering and Cooling |
Presenter(s) |
|
Wahid Nawabi |
Presenter Inst |
|
APC |
| Abstract |
|
Traditional data center architecture approaches
force enterprises to build out to full capacity from day one,
yet one hundred percent utilization of the designed capacity
is seldom reached. This results in long deployment schedules,
millions of dollars of unrecoverable up-front capital investments
and the maintenance of expensive service contracts on under-utilized
infrastructure. APC’s PowerStruXure offers an on-demand
solution that accelerates speed of deployment and allows you
to invest in a data center solution that is sufficient to meet
today’s demands, rather than an uncertain estimate of future
capacity.
|
|
|
Title |
|
The Virtual Environment and
Its Impact on IT Infrastructure |
Presenter(s) |
|
Daniel Kusnetzky |
Presenter Inst |
|
IDC |
| Abstract |
|
IDC has been examining the evolution of the virtual
environment for quite a number of years. This session will examine
IDC’s definition of the virtual environment, its roots
in techniques developed in the late 1970s, and how Windows, Unix
and Linux can be deployed as platforms in the virtual environment.
Dan Kusnetzky, IDC’s Vice President of System Software,
will present the drivers for virtual environment software adoption
and project how the virtual environment will impact the overall
IT infrastructure in the coming years. |
|
|
 |
Petroleum / Geophysical Exploration |
|
Title |
|
Exploring the Earth's
Subsurface with Itanium 2 Linux Clusters |
Presenter(s) |
|
Keith Gray |
Presenter Inst |
|
British Petroleum |
| Abstract |
|
This case study describes how Itanium 2 processor architecture
and Linux cluster technology, applied to an imperative seismic imaging
and migration project, allowed British Petroleum to reduce its
cost for this high-end infrastructure by one-half while increasing
performance by 3X, and in some cases exceeding this expectation
by 5X. The environment includes 1024 processors (256 4-way HP rx5670
servers) with 8.2 Terabytes of memory (32 GB per rx5670 server)
and operates at over 4 Teraflops peak performance.
|
|
|
Title |
|
Scalability Considerations
for Compute Intensive Applications on Clusters |
Presenter(s) |
|
Christian Tanasescu |
Presenter Inst |
|
SGI |
| Abstract |
|
This session investigates the scalability, architectural
requirements and performance characteristics of some of the most
widely used compute intensive applications in the scientific
and engineering communities. Seismic Processing and Reservoir
Simulation (SPR) applications generally consume data read from
memory and have to load new data continuously. As a result, to
keep the floating point (FP) units busy, these applications require
computer architectures with high memory bandwidth, mainly due
to the data addressing patterns and heavy I/O activities. We
will also introduce BandeLa, used to study the influence of communication
bandwidth and latency on MPI applications.
|
|
|
Title |
|
Parallel Reservoir Simulation
on Intel Xeon HPC Clusters |
Author(s) |
|
Baris Guler, Tau Leng, Victor Mashayekhi, and
Kamy Sepehrnoori |
Author Inst |
|
Dell and University of Texas - Austin |
Presenter |
|
Kamy Sepehrnoori and Reza Rooholamini |
| Abstract |
|
Numerical simulation of reservoirs is an integral
part of geo-scientific studies, with the goal of optimizing petroleum
recovery. In this session, we conduct a series of benchmarks
by running a parallel reservoir simulation code on an Intel Xeon
Linux cluster and study the scalability while using different
interconnects for the cluster. Our results show that the simulator’s
performance scales linearly from one to 64 single-processor nodes,
when using a low-latency, high-bandwidth interconnect. In addition
to benchmarking, we describe a process-to-processor mapping approach
for dual-processor clusters to improve communication performance
as well as overall performance of the simulator. |
|
|
Title |
|
Geoscience Visualization
and Seismic Processing Clusters: Collaboration and Integration
Issues |
Presenter(s) |
|
Phil Neri |
Presenter Inst |
|
Paradigm |
| Abstract |
|
The active development of Linux visualization
clusters has led to the notion of closely associating compute-intensive
seismic processing and geosciences visualization, notably for
the purpose of building and verifying velocity and solid models.
The options are to implement cross-system data integration, or
to share a common hardware resource. Practical implementations
of the integration model will be presented, based on Paradigm’s
experience with existing production systems. The use of a CORBA-based
distributed data architecture will also be discussed. The common
hardware concept, still in the design phase, will be analyzed
for its expected benefits, economics and potential problems.
|
|
|
Title |
|
Cluster
Computing at CGG |
Presenter(s) |
|
L. Clerc |
Presenter Inst |
|
CGG |
| Abstract |
|
(Unavailable) |
|
|
Title |
|
Grid Computing in the
Energy Industry |
Presenter(s) |
|
Jamie Bernardin |
Presenter Inst |
|
DataSynapse |
| Abstract |
|
Grid computing has attracted significant attention
in the current IT environment. What are the business and technical
factors driving companies to adopt Grid? In this presentation
on Grid computing in Oil & Gas, we will examine frequently
encountered obstacles to deploying a grid computing solution,
compare the vision of Grid to the realities of today, identify
target deployments for distributed computing solutions in the
Oil & Gas sector, and describe the value impact of grid computing.
DataSynapse will share case studies from its existing engagements
as well as identify specific technical requirements unique to
the energy market.
|
|
|
Title |
|
Drilling in the Digital
Oil Field: High Pay-offs from Linux Clusters |
Presenter(s) |
|
Shawn Fuller |
Presenter Inst |
|
Hewlett-Packard |
| Abstract |
|
The Oil & Gas industry is required to manage
mammoth volumes of complex data for both engineering and scientific
requirements in its search for new reservoirs and
more cost-efficient production methods. Globally deployable high-performance
computing systems coupled with best-in-class applications are
the keys to success for Oil & Gas companies to excel in their
business. This session will cover the areas of technology receiving
the most focus: mobility, desktop visualization, scalable and
immersive visualization, global collaboration, scalable clustered
systems, network storage systems, imaging and printing - covering
the full gamut of Oil & Gas IT requirements.
|
|
|