Advances in GPU Research and Practice

About This Book

Advances in GPU Research and Practice focuses on research and practice in GPU-based systems. The topics treated range from hardware and architectural issues to high-level concerns such as application systems, parallel programming, middleware, and power and energy.

Divided into six parts, this edited volume provides the latest research on GPU computing. Part I: Architectural Solutions focuses on architectural techniques that improve GPU performance. Part II: System Software discusses operating systems, compilers, libraries, programming environments, languages, and paradigms proposed and analyzed to support GPU programmers. Part III: Power and Reliability Issues covers energy, power, and reliability concerns in GPUs. Part IV: Performance Analysis illustrates mathematical and analytical techniques for predicting GPU performance metrics. Part V: Algorithms presents how to design efficient algorithms for GPUs and analyze their complexity. Part VI: Applications and Related Topics provides use cases and examples of how GPUs are used across many sectors.

  • Discusses how to manage power and energy and achieve high reliability when designing, building, and using GPUs
  • Covers system software (OS, compilers), programming environments, languages, and paradigms proposed to support GPU programmers
  • Explains how to use mathematical and analytical techniques to predict different performance metrics in GPUs
  • Illustrates the design of efficient GPU algorithms in areas such as bioinformatics, complex systems, social networks, and cryptography
  • Provides applications and use case scenarios in several different verticals, including medicine, social sciences, image processing, and telecommunications


Information

Editor: Hamid Sarbazi-Azad
Year: 2016
ISBN: 9780128037881
Pages: 774
Language: English
Part 1: Programming and Tools
Chapter 1

Formal analysis techniques for reliable GPU programming

current solutions and call to action

A.F. Donaldson1; G. Gopalakrishnan2; N. Chong1; J. Ketema1; G. Li2; P. Li2; A. Lokhmotov3; S. Qadeer4
1 Imperial College London, London, United Kingdom
2 University of Utah, Salt Lake City, UT, United States
3 dividiti, Cambridge, United Kingdom
4 Microsoft Research, Redmond, WA, United States

Abstract

Graphics processing unit (GPU)-accelerated computing is being adopted increasingly in a number of areas, ranging from high-end scientific computing to mobile and embedded computing. While GPU programs routinely provide high computational throughput, they also prove notoriously difficult to write and optimize correctly, largely because of the subtleties of GPU concurrency. This chapter discusses several issues that make GPU programming difficult and examines recent progress on rigorous methods for the formal analysis of GPU software. Our key observation is that, given the fast-paced advances in GPU programming, the use of rigorous specification and verification methods must be an integral part of the culture of programming and training, and not an afterthought.

Keywords

GPUs; Concurrency; Many-core programming; Formal verification; CUDA; OpenCL

Acknowledgments

We are grateful to Geof Sawaya, Tyler Sorensen, and John Wickerson for feedback on a draft of this chapter, and to Daniel Poetzl for guidance regarding the placement of memory fences in Fig. 4.
We thank Geof Sawaya, Tyler Sorensen, Ian Briggs, and Mark Baranowski for testing and improving the robustness of GKLEE.

1 GPUs in Support of Parallel Computing

When the stakes are high, money is no object in a nation’s quest for computing power. For instance, the SAGE air-defense network [1] was built using a computing system with 60,000 vacuum tubes, weighing 250 tons, consuming 3 MW of power, and costing over $8 billion in 1964. Now (five decades later), a door-lock computer often exceeds SAGE in computing power, and more than ever we are critically dependent on computers becoming faster as well as more energy efficient. This is true not only for defense but also for every walk of science and engineering and the social sector.
Graphics processing units (GPUs) are a natural outgrowth of the computational demands placed by today’s applications on such stringently power-capped and bounded wire-delay hardware [2]. GPUs achieve higher computational efficiency than CPUs by employing simpler cores and hiding memory latency by switching away from stalled threads. Overall, the GPU throughput-oriented execution model is a good match for many data-parallel applications.
The rate of evolution of GPUs is spectacular: from the 3M transistor Nvidia NV3 in 1997, their trajectory is marked by the 7B transistor Nvidia GK110 (Kepler) in 2012 [3]. Nvidia’s CUDA programming model, introduced in 2007, provided a dramatic step up from the baroque notation associated with writing pixel and vertex shaders, and the recent CUDA 7.5 [4] offers versatile concurrency primitives. OpenCL, an industry standard programming model [5] supported by all major device vendors, including Nvidia, AMD, ARM, and Imagination Technologies, provides straightforward, portable mapping of computational intent to tens of thousands of cores.
GPUs now go far beyond graphics applications, forming an essential component in our quest for parallelism, in diverse areas such as gaming, web search, genome sequencing, and high-performance supercomputing.

Bugs in parallel and GPU code

Correctness of programs has been a fundamental challenge of computer science even from the time of Turing [6]. Parallel programs written using CPU threads or message passing (e.g., MPI) are more error prone than sequential programs, as the programmer has to encode the logic of synchronization and communication among the threads and arrange for resources (primarily the memory) to be shared. This situation leads to bugs that are very difficult to locate and rectify.
In contrast to general concurrent programs, GPU programs are “embarrassingly parallel,” with threads being subject to structured rules for control and synchronization. Nevertheless, GPU programs pose certain unique debugging challenges that have not received sufficient attention, as will be clear in Sections 2 and 3 where we discuss GPU bugs in detail. Left unchecked, GPU bugs can become show-stoppers, rendering simulation results produced by multimillion-dollar scientific projects utterly useless. Sudden inexplicable crashes, irreproducible executions, as well as irreproducible scientific results are all the dirty laundry that often goes unnoticed amid the din of exascale computing.
Yet these bugs do occur, and they seriously worry experts who often struggle to bring up newly commissioned machines or are pulled away from doing useful science. The purpose of this chapter is to describe some of these challenges, provide an understanding of the solutions being developed, and characterize what remains to be accomplished.
After introducing GPUs at a high level (Section 2), we survey some key GPU correctness issues in a manner that the program analysis and verification community can benefit from (Section 3). The key question addressed is: What are the correctness challenges, and how can we benefit from joint efforts to address scalability and reliability? We then address the question: How can we build rigorous correctness checking tools that handle today’s as well as upcoming forms of heterogeneous concurrency? To this end we discuss existing tools that help establish correctness, providing a high-level description of how they operate and summarizing their limitations (Section 4). We conclude with our perspective on the imperatives for further work in this field through a call to action for (a) research-driven advances in correctness checking methods for GPU-accelerated software, motivated by open problems, and (b) dissemination and technology transfer activities to increase the uptake of analysis tools by industry (Section 5).
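To make the class of bug discussed above concrete, the following is a minimal, hypothetical CUDA sketch (the kernel name and launch configuration are assumptions, and the example is not taken from this chapter). A block of threads stages data in shared memory, and a missing barrier lets thread i read an element that its neighbour may not yet have written, that is, an intra-block data race.

// Hypothetical sketch of a classic GPU concurrency bug: an intra-block
// data race caused by a missing barrier. Assumed launch: one block of N
// threads, e.g. shift_left<<<1, N>>>(d_out, d_in).
#define N 256

__global__ void shift_left(int *out, const int *in) {
    __shared__ int buf[N];
    int i = threadIdx.x;

    buf[i] = in[i];   // each thread writes its own shared-memory slot

    // BUG: a __syncthreads() barrier is needed here. Without it, the read
    // of buf[i + 1] below races with the write performed by thread i + 1
    // above, so the result depends on scheduling.
    // __syncthreads();   // the fix: all writes complete before any read

    out[i] = (i + 1 < N) ? buf[i + 1] : buf[0];
}

Races of this kind are exactly the sort of scheduling-dependent defect that can produce the sudden crashes and irreproducible results mentioned above, and they are a primary target of the formal analysis tools discussed in Section 4.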

2 A quick introduction to GPUs

GPUs are commonly used as parallel co-processors under the control of a host CPU in a heterogeneous system. In this setting, a task with abundant parallelism can be offloaded to the GPU as a kernel: a template specifying the behavior of an arbitrary thread. Fig. 1 presents a CUDA kernel for performing ...
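As a hedged illustration of the kernel-as-template idea (an assumed vector-addition kernel, not the chapter's Fig. 1), each thread derives a unique index from its block and thread coordinates and processes one element:

// Minimal illustrative CUDA example (assumed names; not the chapter's Fig. 1).
#include <cuda_runtime.h>

// The kernel is a template for an arbitrary thread: every thread runs the
// same code, distinguished only by its block and thread indices.
__global__ void vec_add(float *c, const float *a, const float *b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
    if (i < n) {                                    // guard: grid may exceed n
        c[i] = a[i] + b[i];
    }
}

// Host side: the CPU instantiates the template for enough threads to cover
// all n elements, then offloads the work to the GPU.
void launch_vec_add(float *d_c, const float *d_a, const float *d_b, int n) {
    int threadsPerBlock = 256;
    int numBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vec_add<<<numBlocks, threadsPerBlock>>>(d_c, d_a, d_b, n);
}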

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. List of Contributors
  7. Preface
  8. Acknowledgments
  9. Part 1: Programming and Tools
  10. Part 2: Algorithms and Applications
  11. Part 3: Architecture and Performance
  12. Part 4: Power and Reliability
  13. Author Index
  14. Subject Index