This is an archived page of the 2007 conference


Last updated: April 19, 2007.



title Parallelism and Power in the Age of Petascale Computing
author(s) Horst Simon, Lawrence Berkeley National Laboratory, USA
presenter Horst Simon
abstract We are about to enter the age of Petascale computing. By about 2009 the first Petaflop/s system will have entered the TOP500, and over the following decade Petaflop/s computing will become commonplace. Just as Terascale computing on commodity clusters is widespread today, about ten years from now we will see widespread adoption of Petascale computers built from commodity technology. By June 2016 one Petaflop/s of performance will be required to enter the TOP500 list, and the age of Exaflops computing will be upon us soon after.

In this talk we will survey two interrelated challenges for the age of commodity Petaflops computing: dealing with increasing parallelism and reducing the power consumption of future systems. We first point out that these challenges are interrelated, and that one way to lower power consumption is by increasing parallelism. We will also explain the seeming contradiction that low power solutions are not necessarily the most energy efficient solutions. We then claim, contrary to current hype about multicore processors, that we will be able to deal with the parallelism challenge based on the experience of the HPC community over the last 15 years. Finally, we will show that the problem of power consumption should be approached with a multi-tier strategy, attacking the problem at the component, system, computer room, and building-environment level. We have started such a multi-tier strategy in Berkeley, and will show some early results.
title Advances in Regional Climate System Modeling for a Better Understanding of Land Use and Climate Change Impacts in California
author(s) Norman L. Miller, Lawrence Berkeley National Laboratory, USA
presenter Norman L. Miller
abstract Regional Climate System Modeling (RCSM) is an outgrowth of Global Climate System Modeling (GCSM) and Numerical Weather Prediction (NWP). The Berkeley RCSM framework consists of pre- and post-processors, numerical and statistical models, and a database management interface. Our RCSM includes high-resolution atmospheric dynamics, physics, and sub-grid physically based land surface processes with detailed snow and hydrology, and coupled surface-deep groundwater, with modules for water quality and agriculture. It has been applied to climate change research in the western U.S., East Asia, and Australia, and its results contributed to the 2nd, 3rd, and 4th Intergovernmental Panel on Climate Change Assessment Reports. Some of our most recent advances with RCSM include long-term drought studies of the California Central Valley and water supply and demand optimization. New enhancements include ensemble simulations of soil moisture and plant functional types to generate initial conditions without the normal multi-year spin-up for equilibrium conditions, a nested fine-scale (100 m) resolution urban runoff model and estuary model, and links with economic analysis. We have implemented the latest version of the Weather Research and Forecasting (WRF) code into our RCSM and have completed a series of large multi-processor ensemble simulations evaluating pre- and post-industrial land use change and the shifts in daily minimum and maximum temperature, latent and sensible heat, and soil moisture. This presentation will provide a brief overview of climate modeling, numerical and parallel computing advances, and California land use and climate change impacts on water resources, energy, agriculture, and other sectors.
title From Beowulf to Cray-o-wulf: Extending Linux Clustering Paradigm to Supercomputing Scale
author(s) Peter J. Ungaro, Cray Inc., USA
presenter Peter J. Ungaro
abstract Commodity technology trends are leading to an ever-increasing reliance on scalability for high performance computing. This talk will examine some of these trends and issues and discuss the implications for large-scale Linux cluster design and their use for scalable applications.


Parallel I/O, File Systems, & Storage
title pNFS and Linux: Working Towards a Heterogeneous Future
author(s) Dean Hildebrand, Peter Honeyman, and William Adamson, University of Michigan, USA
presenter Dean Hildebrand
abstract Heterogeneous and scalable remote data access is a critical enabling feature of widely distributed collaborations. Parallel file systems feature impressive throughput, but sacrifice heterogeneous access, seamless integration, security, and cross-site performance. Remote data access tools such as NFS and GridFTP provide secure access to parallel file systems, but either lack scalability (NFS) or seamless integration and file system semantics (GridFTP).

Anticipating terascale and petascale HPC demands, NFSv4 architects are designing pNFS, a standard extension that provides direct storage access to parallel file systems while preserving operating system and hardware platform independence. pNFS distributes I/O across the bisectional bandwidth of the storage network between clients and storage devices, removing the single server bottleneck so vexing to client/server-based systems.

Researchers at the University of Michigan are collaborating with industry to develop pNFS for the Linux operating system. Linux pNFS features a pluggable client architecture that harnesses the potential of pNFS as a universal and scalable metadata protocol by enabling dynamic support for layout format, storage protocol, and file system policies. This paper evaluates the scalability and performance of the Linux pNFS architecture with the PVFS2 and GPFS parallel file systems.
title Efficient Methods for Parallel I/O
author(s) Jeff Larkin, Cray, USA and Mark Fahey, Oak Ridge National Laboratory, USA
presenter Jeff Larkin
abstract Available soon.
title SAN Lessons Learned
author(s) Andy Loftus and Chad Kerner, NCSA, USA
presenter Andy Loftus
abstract Available soon.
title Practical Experiences of Setting Up, Managing, and Diagnosing a Large Parallel Filesystem
author(s) Jim Laros, SNL, USA
presenter Jim Laros
abstract Available soon.
Tools & Programming Environments
title Detecting and Solving Memory Problems in Linux Clusters
author(s) Chris Gottbrath, TotalView Technologies, LLC, USA
presenter Chris Gottbrath
abstract Available soon.
title Automated MPI Correctness Checking: What If There Were a Magic Option?
author(s) Patrick Ohly and Werner Krotz-Vogel, Intel Corporation, DE
presenter Patrick Ohly
abstract Available soon.
title A Framework for Scalable Parameter Estimation on Clusters
author(s) Tom Bulatewicz, Daniel Andresen, Stephen Welch, Wei Jin, Sanjoy Das, and Matthew Miller, Kansas State University, USA
presenter Daniel Andresen
abstract Available soon.
title PerfTrack: Scalable Application Performance Diagnosis for Linux Clusters
author(s) Rashawn Knapp, Kathryn Mohror, Aaron Amauba, and Karen Karavanic, Portland State University, USA; Thomas Conerly, Catlin Gabel School, USA; Abraham Neben, Wilson High School, USA; John May, Lawrence Livermore National Laboratory, USA
presenter Rashawn Knapp
abstract Available soon.
Resource Management, Networks, & Power
title Anatomy of Ethernet Resiliency and Scalability for Cluster Computing
author(s) Debbie Montano, Force 10, USA
presenter Debbie Montano
abstract Available soon.
title Grids for the Real World: Addressing Sovereignty and Ease of Use
author(s) David Jackson, Cluster Resources, USA
presenter David Jackson
abstract Available soon.
title How Long Can You Go?
author(s) Wade Vinson, HP, USA
presenter Wade Vinson
abstract Available soon.
title Stateless Booting
author(s) Egan Ford, IBM, USA
presenter Egan Ford
abstract Available soon.
title Benefits of Centralized Service Processor Management in Clustered Environments
author(s) Ivan Passos, Avocent, USA
presenter Ivan Passos
abstract Available soon.
title Best Practices in Cluster Management
author(s) Richard Friedman, Scali, USA
presenter Richard Friedman
abstract This session will discuss the impact that MPI technology can have on overall system performance, with a particular focus on how MPI can help optimize performance in multi-core based systems. Additionally, experience with Scali MPI Connect across various customer examples will be used to illustrate the impact an MPI can have on overall system efficiency and effectiveness.
title OCS and LSF HPC: An Integrated Solution for System and Workload Management
author(s) Mehdi Bozzo-Rey, Platform Computing, USA
presenter Mehdi Bozzo-Rey
abstract Available soon.
Resource Management & Networks
title An Architecture for Dynamic Allocation of Compute Cluster Bandwidth
author(s) John Bresnahan and Ian Foster, University of Chicago, USA
presenter John Bresnahan
abstract Available soon.
title Experiences Deploying a 10-Gigabit Ethernet Computing Environment to Support Regional Computational Science
author(s) Jason Cope, Theron Voran and Matthew Woitaszek, University of Colorado at Boulder, USA; Adam Boggs, Sean McCreary, and Michael Oberg, National Center for Atmospheric Research, USA
presenter Jason Cope
abstract Available soon.
title The Application Level Placement Scheduler (ALPS)
author(s) Michael Karo, Richard Lagerstrom and Carl Albing, Cray Inc., USA
presenter Michael Karo
abstract Available soon.
Performance Analysis & Applications
title A Case Study in Using Local I/O and GPFS to Improve Simulation Scalability
author(s) Vincent Bannister, Microsoft, USA; Gary Howell and Eric Sills, HPC/ITD NCSU, USA; Tim Kelley and Qianyi Zhang, NCSU, USA
presenter Vincent Bannister
abstract Many optimization algorithms exploit parallelism by calling multiple independent instances of the function to be minimized, and these functions in turn may call off-the-shelf simulators. The I/O load from the simulators can cause problems for an NFS file system. In this paper we explore efficient parallelization in a parallel program for which each processor makes serial calls to a MODFLOW simulator. Each MODFLOW simulation reads input files and produces output files. The application is "embarrassingly" parallel except for disk I/O. Substituting local scratch for global file storage ameliorates synchronization and contention issues. An easier solution was to use the high-performance global file system GPFS instead of NFS. Compared to using local I/O, using a high-performance shared file system such as GPFS requires less user effort.
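The local-scratch staging pattern this abstract describes can be sketched as below; the `run_simulation` stand-in and the directory layout are hypothetical illustrations, not details taken from the MODFLOW study:

```python
import os
import shutil
import tempfile

def run_simulation(workdir):
    """Stand-in for a serial simulator run: reads inputs from and writes
    outputs into its working directory."""
    with open(os.path.join(workdir, "head.out"), "w") as f:
        f.write("simulated heads\n")

def simulate_with_local_scratch(shared_dir, job_id):
    """Run the simulator in node-local scratch, then copy results back to
    shared storage. The simulator's heavy I/O stays on local disk, so the
    shared (NFS) file system sees only one bulk copy per job."""
    scratch = tempfile.mkdtemp(prefix="job%d_" % job_id)
    try:
        run_simulation(scratch)                    # all heavy I/O is local
        dest = os.path.join(shared_dir, "job%d" % job_id)
        os.makedirs(dest, exist_ok=True)
        for name in os.listdir(scratch):           # single bulk copy at the end
            shutil.copy(os.path.join(scratch, name), dest)
        return dest
    finally:
        shutil.rmtree(scratch)                     # free the local scratch space
```

With a parallel file system such as GPFS, the paper's point is that the copy-back step disappears entirely: each rank can write `dest` directly.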
title Visualizing I/O Performance during the BGL Deployment
author(s) Andrew Uselton and Brian Behlendorf, Lawrence Livermore National Laboratory, USA
presenter Brian Behlendorf
abstract Available soon.
title Load Balancing in Pre-Processing of Large-Scale Distributed Sparse Computing
author(s) Olfa Hamdi-Larbi and Zaher Mahjoub, Faculty of Sciences of Tunis, TN; Nahid Emad, University of Versailles, FR
presenter Olfa Hamdi-Larbi
abstract Available soon.
OS Alternatives
title Linux Kernel Improvement: Toward Dynamic Power Management of Beowulf Clusters
author(s) Fengping Hu and Jeffrey Evans, Purdue University, USA
presenter Fengping Hu
abstract Available soon.
title HPC System Call Usage Trends
author(s) Terry Jones, Lawrence Livermore National Laboratory, USA; Andrew Tauferner and Todd Inglett, IBM, USA
presenter Terry Jones
abstract Available soon.
title Compute Node Linux (CNL): From Capability to Capacity
author(s) Kevin Peterson, Cray Inc., USA
presenter Kevin Peterson
abstract Available soon.
title Starting with Linux: A System Design Case Study
author(s) John Goodhue and Win Treese, SiCortex Inc. USA
presenter Available soon.
abstract Available soon.
Emerging Technology
title Intel Woodcrest: An Evaluation for Scientific Computing
author(s) Philip Roth and Jeffery Vetter, Oak Ridge National Laboratory, USA
presenter Philip Roth and Jeffery Vetter
abstract Available soon.
title The PeakStream Platform: High-Productivity Software Development for GPUs
author(s) Matthew Papakipos, PeakStream, Inc. USA
presenter Matthew Papakipos
abstract The emerging world of multi-core processors and massively parallel systems requires a programming model that scales to the new generation of computing architectures. Existing codes written for single-core CPUs are not likely to take full advantage of multi-core technology without modification. In this talk, we will show you the PeakStream Virtual Machine and how it provides automatic parallelization of programs written in C/C++ so that developers can focus on their application logic, and not the intricate details of parallelizing the application. During this session, learn how we are improving development time on such computationally intense applications as synthetic aperture imaging, computed tomography scans, Monte Carlo simulation, and Black-Scholes option pricing.
title Effective Use of Commodity Multi-Core Systems in HPC
author(s) Kent Milfeld, Kazushige Goto, Avi Purkayastha, Chona Guiang, and Karl Schulz, University of Texas at Austin, USA
presenter Available soon.
abstract Available soon.
title The Future of Storage: Commodity Clusters and Parallel I/O
author(s) Dave Fellinger and Alex Sayyah, DDN, USA
presenter Dave Fellinger
abstract Available soon.
Technical Briefs: I/O
title Considerations for Scalable Environmental Sciences Applications on Conventional HPC Linux Platforms
author(s) Stan Posey, Panasas Inc., USA
presenter Stan Posey
abstract Research organizations continue to increase their investments in computational environmental sciences and related applications, as they face growing demands of computational scientists who continue to expect more from scalable computer system platforms. Typical demands of the application workflow often include rapid single simulation job turnaround and multi-job throughput capability for users with diverse application requirements in a high-performance computing (HPC) infrastructure.

For today’s economics of HPC, the required resources of CPU cycles, large memory, system bandwidth and scalability, storage and I/O, and file and data management must all deliver the highest levels of user productivity and reliability possible from conventional systems based on commodity HPC Linux clusters. As the popularity of Linux clusters and distributed memory computing has grown, a new class of storage cluster technology has been developed, designed to extend conventional cluster capability. These storage systems offer a large single resource of shared addressable storage to provide an improved balance between capability and capacity for effective scalability of HPC application software.

HPC and Scalable Environmental Modeling

Rapid progress in computational environmental application performance has been driven by advances in application software algorithms, balanced graph partitioning for domain-parallel schemes, and HPC cluster systems. By far the most important HPC advancement in recent years for such application software is the parallel scalability made possible by geometry domain decomposition and distributed-memory parallelism through explicit message passing. Most environmental modeling software employs this technique today, owing to its potential for scalability on the complete range of HPC systems currently available.

From an additional HPC perspective, environmental modeling codes differ in their discretization schemes and algorithms. That is, some application software uses a structured mesh discretization rather than an unstructured one, or an explicit algorithm rather than an implicit one; these characteristics influence HPC performance behaviour on current microprocessor and system architectures and I/O systems. Performance and scalability can be achieved when all these factors are considered in an overall application and HPC evaluation.

Parallel scalability of fluid flow algorithms is, in theory, independent of the discretization choice, although current methods in graph partitioning (e.g. METIS) for distributed-memory parallelism favour unstructured meshes over structured meshes. This partitioning, or domain decomposition, seeks to balance the computational “work” among the partitions and minimize the amount of information that must “pass” between partition boundaries. This communication requirement between partitions is often what limits parallel scalability on different system architectures.
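The two objectives just described (balance the per-partition work, minimize the information crossing partition boundaries) can be made concrete with a toy scoring function; this is an illustration only, not how METIS itself works:

```python
def partition_quality(edges, weights, parts):
    """Score a graph partition on the two axes a partitioner must balance:
    per-partition load (sum of vertex weights, i.e. computational work) and
    edge cut (edges whose endpoints fall in different partitions, i.e.
    future communication across domain boundaries)."""
    load = {}
    for v, w in weights.items():
        load[parts[v]] = load.get(parts[v], 0) + w
    cut = sum(1 for u, v in edges if parts[u] != parts[v])
    return load, cut

# A 4-vertex path graph 0-1-2-3 with unit weights, split down the middle:
edges = [(0, 1), (1, 2), (2, 3)]
weights = {0: 1, 1: 1, 2: 1, 3: 1}
parts = {0: 0, 1: 0, 2: 1, 3: 1}
load, cut = partition_quality(edges, weights, parts)
# load is perfectly balanced, and only edge (1, 2) crosses the boundary
```

A real partitioner searches over assignments to minimize the cut subject to near-equal loads; this sketch only evaluates one candidate assignment.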

The choice of an explicit algorithm vs. implicit can influence these communication requirements that affect scalability. Explicit algorithms typically scale better since they rely on a local computational stencil for time integration, which can minimize the exchange of information at domain interfaces. Implicit schemes involve solution of a linear system whose neighbour-dependency is larger, meaning parallel algorithms experience a “delay” of communications that must propagate among several domain interfaces.
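The locality argument above can be seen in a serial toy model: with an explicit, Jacobi-style averaging stencil (assumed here purely for illustration, not any particular production scheme), each subdomain needs just one ghost value from each neighbour per time step:

```python
def halo_exchange(subdomains):
    """Fill each subdomain's ghost cells from its neighbours' edge values.
    Each subdomain is laid out as [ghost_left, interior..., ghost_right];
    the physical boundary is held at 0.0."""
    for i, sub in enumerate(subdomains):
        sub[0] = subdomains[i - 1][-2] if i > 0 else 0.0
        sub[-1] = subdomains[i + 1][1] if i + 1 < len(subdomains) else 0.0

def explicit_step(subdomains):
    """One explicit update: each interior point becomes the average of its
    two neighbours. The stencil is purely local, so the only communication
    needed per step is the single-value halo exchange above."""
    halo_exchange(subdomains)
    for sub in subdomains:
        new = sub[:]
        for j in range(1, len(sub) - 1):
            new[j] = 0.5 * (sub[j - 1] + sub[j + 1])
        sub[1:-1] = new[1:-1]
```

An implicit step would instead couple all interior points through a linear solve, whose global dependency is exactly the extra communication the paragraph describes.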

An additional parallel consideration is the simulation of transient fluid flow behaviour, where the amount of I/O required to save data at each time step can severely limit a system architecture’s ability to scale the run. Recent HPC storage cluster technologies have achieved breakthroughs in high bandwidth over commodity interconnects, with software and I/O performance features that remove this bottleneck by decoupling the I/O requirement from the solver, improving overall run turnaround on conventional Linux clusters.

This presentation examines application-driven HPC workflow efficiencies for relevant applications in computational environmental sciences. Modeling parameters such as model size, solution schemes, range of scales (and coupled scales), and a variety of simulation conditions can produce a wide range of computational behavior and large-scale data management requirements, such that careful consideration should be given to how HPC resources are configured and balanced to satisfy increasing user requirements.
title Performance, Reliability, and Operational Issues for High Performance NAS Storage
author(s) Matthew O'Keefe, Cray Inc., USA
presenter Matthew O'Keefe
abstract Available soon.
title Experiences with Parallel Commodity Storage
author(s) David Chaffin, Texas Tech HPCC, USA
presenter David Chaffin
abstract Available soon.
title Architecting High Performance, Scalable & Highly Available Cluster Storage with Best-of-Breed Storage Software and DDN S2A Technology
author(s) Jeff Denworth and Bob Woolery, DataDirect, USA
presenter Jeff Denworth and Bob Woolery
abstract Available soon.


title HPC I/O Roadmap: The Next Five Years
author(s) Gary Grider, Los Alamos National Laboratory, USA
presenter Gary Grider
abstract The talk will include a brief history of developments in scalable I/O, file systems, and storage networks for high-performance computing systems; a survey of current work in the area; and an overview of work being started to address future needs. Additionally, how parallel applications use I/O in HPC systems will be summarized. The presentation will also survey many of the more difficult current and emerging issues in the HPC I/O and file systems area.
title Operations at Scale - Lessons to Be Remembered
author(s) Robert Ballance, Sandia National Laboratories, USA
presenter Robert Ballance
abstract Available soon.
title Tales of Tahoe
author(s) Don Lane, U.S. Forest Service, USA
presenter Don Lane
Don Lane says about his presentation, "You've seen its [Tahoe's] beauty, now learn about the region's rich history from the earliest period through today. Hear short tales of the colorful characters who inhabited this magnificent area from the discovery of gold through today."


Network & Cluster Optimization
title Data-Intensive Cluster Optimization
author(s) Benoit Marchand, eXludus, USA
presenter Benoit Marchand
abstract The rapid adoption of clusters as the predominant architecture for HPC yields clear benefits for cost-effective scaling across a broad range of application disciplines. While commodity computational nodes are very cost-effective and both throughput serial workloads and many parallelized applications scale well on clusters, there are many other application workloads which scale less effectively, and can benefit from additional optimization. As the SMP architecture was “blown up” into clusters, slower network connections replaced shared memory crossbar connections and file servers became more remote. This creates challenges for optimizing workload performance, especially for data-intensive applications.

In this case the input data must originate from the remote file server and traverse the network; as requests increase beyond a certain level, queuing conflicts and bandwidth limitations can become severe bottlenecks. Large volumes of output data must be restored to the central file server when simulations complete; this can also present a challenge when many large files are written simultaneously. We present a toolkit of automated cluster optimization capabilities that can accelerate data input to and output from clusters, and more effectively optimize compute node performance as well. The software toolkit provides four modules: Parallel File Serving, Asynchronous Results Transfer, a Schedule Optimizer, and a Meta Language Processor. The Parallel File Serving module allows us to provide shared data to all nodes in a cluster simultaneously. This data replication provides much higher effective aggregate bandwidth across commodity (e.g. Gigabit Ethernet) network switches, and scales in a highly linear fashion with the number of nodes. Asynchronous Results Transfer (ART) allows output data to be restored from a large number of compute nodes to the central file server in the background; this establishes a more efficient processing pipeline. The compute nodes are freed up to get back to their compute-intensive simulation work. The output data in question can be final results, intermediate results, or even checkpoint-restart files. The same technique can be used to prestage data on the input side and alert workload managers to send jobs to nodes that are already "data-hot".
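The ART idea, as described, amounts to moving shared-file-system writes off the compute nodes' critical path. A minimal sketch of that pattern using a background thread (our own illustration, not eXludus's implementation):

```python
import queue
import shutil
import threading

class AsyncResultsTransfer:
    """Background copier: compute code enqueues finished output files and
    returns to work immediately; a worker thread drains the queue to the
    shared file server. A conceptual sketch of the ART pattern only."""

    def __init__(self, dest_dir):
        self.dest_dir = dest_dir
        self.q = queue.Queue()
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def submit(self, path):
        # Returns immediately; the slow copy happens off the critical path.
        self.q.put(path)

    def _drain(self):
        while True:
            path = self.q.get()
            if path is None:          # sentinel: no more files coming
                break
            shutil.copy(path, self.dest_dir)   # slow shared-FS write, in background

    def close(self):
        # Queue the sentinel after all submitted files, then wait for the drain.
        self.q.put(None)
        self.worker.join()
```

The compute loop calls `submit()` after each output file and keeps simulating; `close()` at job end guarantees everything has reached the shared store.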

The Schedule Optimizer allows us to enhance the performance of existing workload managers (PBS, SGE, LSF, Torque, etc.) by dynamically optimizing the number of processes per core. The software is fault-tolerant and transparent to install and operate in conjunction with existing file systems (e.g. NFS, Lustre, Panasas) and existing workload managers, without modification to existing application source or binaries. The modules may be used selectively on an application-by-application basis, as appropriate. Modifications are typically limited to minor changes in existing job scripts, and our Meta Language Processor provides a user interface to implement modified scripts. Installation on a large number of nodes can be completed in a matter of minutes, and the software is designed to be "set and forget" from a system administration standpoint. Performance results on customer clusters with real-world applications using these cluster optimization modules will be presented. Performance enhancements of a factor of 3 or more are achieved with parallel file serving and the schedule optimizer module, and a factor of 1.5 or more with ART.
title Myri-10G: The Technically Superior HPC Interconnect
author(s) Tom Leinberger, Myricom, USA
presenter Tom Leinberger
abstract Myri-10G is 10-Gigabit Ethernet from Myricom, and more. In HPC clusters using the kernel-bypass MX-10G software, Myri-10G exhibits 2.3-microsecond MPI PingPong latency, a 1.2 GByte/s PingPong data rate, a 2.4 GByte/s SendRecv data rate, very high application availability (very low host-CPU load) with MPI and sockets, a small and constant memory footprint in the host, and wire-speed interoperability between 10-Gigabit Ethernet and 10-Gigabit Myrinet. These benchmarks are for clusters of any size; they are not just marketing benchmarks on small clusters. MX-10G and Myri-10G NICs operate with either Myrinet or Ethernet switch networks, and carry IP traffic efficiently along with MPI traffic. The Lustre and PVFS2 cluster file systems have native MX support, and are blazingly fast with MX-10G. Myri-10G switches are available with a mix of Myrinet-protocol and Ethernet-protocol ports. In June, Myricom will start shipping a new series of Myri-10G switches with up to 512 host ports from a single switch enclosure.
title Open Fabrics Enterprise Edition (OFED) Update
author(s) Jamie Riotto, Cisco, USA
presenter Jamie Riotto
abstract An overview of the OpenFabrics Alliance (OFA) OpenFabrics Enterprise Edition (OFED) software. OFED is an open-source, fabric-agnostic set of host driver software supporting RDMA technologies over InfiniBand and Ethernet. This talk will give an update on the new OFED 1.2 release, covering what it contains, who the contributors were, and where it is being used in both enterprise and HPC. A roadmap will also be presented for future OFED releases and the new technologies they will contain.
title Available soon.
Software for Clustered Systems
title Optimizing Application Performance on x64 Processor-based Systems with PGI Compilers and Tools
author(s) Douglas Miles, PGI, USA
presenter Douglas Miles
abstract PGI Fortran, C and C++ compilers and tools are available on many Intel and AMD x64 processor-based Linux clusters. Optimizing performance of x64 processors often depends on maximizing SSE vectorization, ensuring alignment of vectors, and minimizing the number of cycles the processors are stalled waiting on data from main memory. The PGI compilers support a number of directives and options that allow the programmer to control and guide optimizations including vectorization, parallelization, function inlining, memory prefetching, interprocedural optimization, and others.

In this talk we provide detailed examples of the use of several of these features as a means for extracting maximum single-node performance from Linux clusters using PGI compilers and tools.
title Debugging and Optimizing Applications for Multicore MPP Architectures
author(s) Michael Rudgyard, Allinea, USA
presenter Michael Rudgyard
As two-, four-, and potentially eight-core processors become the norm, the de facto HPC architecture is tending towards large clusters of modest 8-16 core shared-memory servers, potentially with co-processing devices (e.g. GPGPUs, FPGAs, ClearSpeed). Programming these machines optimally presents a number of challenges, and applications that use mixed programming models are now becoming commonplace.

In this presentation we will discuss the challenges facing today's HPC application developers, and the need for simple tools that can address mixed programming models. We will present new multicore features of Allinea's Distributed Debugging Tool (DDT) and Optimisation and Profiling Tool (OPT), and discuss our aims to provide a consolidated, scalable, yet intuitive framework for HPC developers.
title Improving System Performance with Scali MPI Connect
author(s) Rick Friedman, Scali, USA
presenter Rick Friedman
abstract Available soon.
title Adaptive Computing in HPC Today
author(s) David Jackson, Cluster Resources, USA
presenter David Jackson
abstract Clusters that heal themselves? Clusters that learn and improve scheduling over time? Clusters that actively coordinate compute, storage, and network resources to optimize total performance? Clusters that dynamically grow and shrink and even customize themselves based on workload? Think this is science fiction? Think again!

In this presentation, we will discuss how Moab Utility Computing Suite is enabling customers to accomplish these objectives today on HPC clusters, grids, and data centers. We will discuss the benefits of systems that can dynamically adapt their resources, jobs, and policies to meet changing objectives and environmental conditions. Furthermore, we will cover how advanced high-level policies can harness these capabilities to improve both utilization and response time, as well as deliver on QOS/SLA agreements in a way never before possible.

The magic to this solution is a virtualized batch layer that enables the use of technology to accomplish all these objectives. On the outside, users only see the familiar interfaces for submitting and managing workload on a cluster that grows, shrinks and adapts on command.

This presentation will discuss capabilities and a number of case studies on real-world sites that have been successfully utilizing this technology for years. We will also discuss industry trends that are moving adaptive computing into the mainstream.
Upcoming Hardware Technology
title HPC Technologies from Intel
author(s) David Barkai and David Lombard, Intel, USA
presenter David Barkai and David Lombard
abstract Available soon.
title Cool, Tight, Fast, Reliable HPC Clustering with Blades and InfiniBand
author(s) Kent Koeninger, HP, USA
presenter Kent Koeninger
abstract Processor speed alone is not the primary driver for HPC clusters, nor are HPC clusters just for academic science and industrial engineering. In addition to delivering TFLOPS, they need to run with minimum power and cooling, fit in minimum space, optimize the total cost of ownership, and scale to large tightly-connected clusters at minimum cost. Enter blades with InfiniBand. This combination is opening new enterprise-oriented HPC markets, including financial services and online gaming. In addition to the requirements above, these markets demand enterprise reliability, availability, and serviceability (RAS), including redundant configurations for resiliency. This talk will highlight the roles of HP BladeSystem c-Class clusters and InfiniBand in meeting these advancing HPC requirements.
title Trends in High Performance Computing Commodity Clusters
author(s) Jay Urbanski, IBM, USA
presenter Jay Urbanski
abstract This talk will examine trends in HPC clusters from both a hardware and software perspective. Areas of discussion will include processor technologies and the implications of multi-core, heterogeneous computing, interconnect directions, systems management and scalability, and power, cooling, and packaging optimization.
title High Performance for Big Science
author(s) Kevin Noreen, Dell, USA
presenter Kevin Noreen
abstract Available soon.
title The Transition to Multi-core: Is Your Software Ready?
author(s) Matthew Papakipos, PeakStream, USA
presenter Matthew Papakipos
abstract Available soon.