1.2: Acceleration

Accelerating the Future of Computing

Other Information:

Historically, the U.S. has aggressively led advances in high-performance computing (called high-end computing [HEC] in the NITRD Program) because HEC provides a competitive national advantage, supports science and technology leadership, and plays a critical role in advancing Federal missions and other national priorities. The HEC technical area encompasses all of the challenges of leading this advancement, including not just better (e.g., massively scaled-up, more efficient, and more resilient) system hardware and software, but also more efficient provisioning models (e.g., surge and cloud computing), improved mathematical and computer science underpinnings for analysis and modeling of multi-scale and ultra-scale data, and new programming environments and tools for easier development and usability of advanced scientific applications.
Where we are now: The Top500 supercomputers list, which has chronicled a decades-long exponential advance in HEC technology, shows China and others now nipping at our heels. The Top500 analysis also extrapolates future HEC performance, indicating that we will continue to see 1,000x increases in capability every decade or so. The most powerful Federal leadership systems have achieved petascale speeds (1,000 trillion, or 10^15, floating-point operations per second [flops]) and are expected to reach the exascale (a quintillion, or 10^18, flops) within a decade. At the same time, however, our current means of reaching these ever-higher computational speeds, packing multiple processors into every computer chip, is creating a crisis in our ability to program system and application software to efficiently exploit the emerging many-core and heterogeneous computing architectures. Moreover, as the number of cores and components rises, the likelihood of correctly completing data-intensive computations of increasing scale and complexity diminishes. In addition, the cost of powering large HEC facilities is growing at an unsustainable rate.
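As a rough illustration of the figures above, the following Python sketch works through two back-of-the-envelope calculations: the annual growth rate implied by a 1,000x gain per decade, and how the chance of a fault-free run shrinks as component counts rise. The function names and the assumption that components fail independently with a fixed per-run probability are illustrative only and do not come from the Top500 analysis or the NITRD Program.

```python
import math


def annual_growth_factor(factor_per_decade: float = 1000.0) -> float:
    """Constant yearly multiplier that compounds to factor_per_decade over 10 years."""
    return factor_per_decade ** (1 / 10)


def years_to_reach(current_flops: float, target_flops: float,
                   factor_per_decade: float = 1000.0) -> float:
    """Years needed to grow from current_flops to target_flops at a steady rate."""
    return 10 * math.log(target_flops / current_flops) / math.log(factor_per_decade)


def job_success_probability(n_components: int, p_fail: float) -> float:
    """Chance that no component fails during a run.

    Assumption (hypothetical model): each component fails independently
    with probability p_fail during the run.
    """
    return (1 - p_fail) ** n_components


if __name__ == "__main__":
    # 1,000x per decade is close to a doubling every year.
    print(f"annual growth factor: {annual_growth_factor():.3f}")
    # Petascale (10^15 flops) to exascale (10^18 flops) is one 1,000x step.
    print(f"years from petascale to exascale: {years_to_reach(1e15, 1e18):.1f}")
    # The same per-component failure rate hurts far more at larger scale.
    for n in (1_000, 100_000, 1_000_000):
        print(f"{n:>9} components: P(success) = {job_success_probability(n, 1e-6):.3f}")
```

Under these assumptions, a failure probability of one in a million per component is negligible for a thousand components but leaves only about a one-in-three chance of a clean run at a million components, which is the scaling concern the paragraph describes.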
Research needs: The massive parallelism necessary to enable software to fully exploit the speeds and computational capabilities of supercomputing systems with heterogeneous components and up to millions of microprocessors presents enormous challenges for both system and application designers. Developing robust code that can be partitioned into many parallel subroutines for efficient multicore/many-core processing and then reintegrating the results as final output require breakthrough advances in the underlying mathematics, design, and engineering of HEC software. The goal of making HEC environments easier to use and more productive, reducing time to solution, remains elusive; even at today’s levels of system complexity, downtime due to software issues is rising as a proportion of total operational costs. One promising R&D avenue is resilience: concepts and technologies enabling a HEC system to continue to function amid software faults, anomalies, and errors, and to alert operators about problems without necessitating a shutdown. Also critical to the long-term future of U.S. supercomputing leadership is research in technological approaches to reduce the steadily rising energy demands of large-scale HEC systems and facilities, which typically draw many megawatts of power for operations and cooling. Because advances to change this unsustainable energy-use trajectory may arise from multiple research fields, the search for scientific breakthroughs must be pursued across the board: in power management and heat dissipation technologies, new materials such as nanoscale composites, novel power-saving platform and system architectures, computing technologies (e.g., nonvolatile computing), and computational methods (e.g., spintronics, analog computing), as well as in next-generation computing concepts such as quantum information science.
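The partition-and-reintegrate pattern mentioned above can be sketched at toy scale in Python. This is a minimal illustration, not an HEC implementation: the function names and the sum-of-squares workload are hypothetical, and a thread pool from the standard library stands in for the process- or node-level parallelism a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Split data into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Independent subroutine: runs on one chunk with no shared state."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Partition the input, process the chunks concurrently, then reintegrate."""
    chunks = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    # Reintegration step: combine the partial results into the final output.
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), n_workers=3))
```

Even in this small sketch the difficulties the paragraph names are visible: the work must be divisible into independent pieces, the chunks must be balanced so that no worker sits idle, and the combining step must give the same answer regardless of how the data was split.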
Advances in nano, biological, and quantum sciences also may lead the way to radically different system architectures and computational methods, providing the basis for next-generation leadership in computing at all scales.

Indicator(s):