Parallel Processing Algorithms For GIS

Edited by Richard Healey, Steve Dowers, Bruce Gittings and Mike J. Mineter

About This Book

Over the last fifteen years GIS has become a fully-fledged technology, deployed across a range of application areas. However, although advances in computer performance appear to continue unhindered, data volumes and the growing sophistication of analysis procedures mean that performance will increasingly become a serious concern in GIS. Parallel computing offers a potential solution, but traditional algorithms may not run effectively in a parallel environment, so utilization of parallel technology is not entirely straightforward. This groundbreaking book examines some of the current strategies open to scientists and engineers at this crucial interface of parallel computing and GIS.

The book begins with an introduction to the concepts, terminology and techniques of parallel processing, with particular reference to GIS. High-level programming paradigms and the software engineering issues underlying parallel software development are considered, and emphasis is given to designing modular, reusable software libraries. The book continues with the problems of designing parallel software for GIS applications, examines potential vector and raster data structures, and details the algorithmic design of some major GIS operations. An implementation case study, based around a raster generalization problem, illustrates some of the principles involved. Subsequent chapters review progress in parallel database technology in a GIS environment and the use of parallel techniques in various application areas, dealing with both algorithmic and implementation issues.

"Parallel Processing Algorithms for GIS" should be a useful text for a new generation of GIS professionals whose principal concern is the challenge of embracing major computer performance enhancements via parallel computing. Similarly, it should be an important volume for parallel computing professionals who are increasingly aware that GIS offers a major application domain for their technology.


Information

Publisher: CRC Press
Year: 2020
ISBN: 9781000162745
Edition: 1
Subtopic: Geography

1 Introduction

R.G. Healey, B.M. Gittings, S. Dowers and M.J. Mineter

1.1 GIS Challenges

Over the last fifteen years the field of GIS has been transformed. Once a Cinderella subject, it was untouched by all but the most quantitatively avant-garde of departments of Geography, and a small group of far-sighted public sector bodies. From these small beginnings it has now become a multi-billion dollar industry and a major player within the broader field of information technology.
In the course of its evolution it has negotiated the restrictions of limited processing power and mass storage, the tedium of manual digitising and the dearth of reliable software, to become a fully fledged technology, widely deployed across a range of application fields. Indeed, its success and that of the sister technology of satellite remote sensing now threaten to engulf us in digital data, if not information, which we can scarcely store, let alone analyse, in any potentially useful time frame. Although the makers of computer chips continue to surprise us with advances in processor speed, cache performance and graphics capability, these advances are no match for the growth of available GIS and remote sensing data. Further, the spectre of real-time GIS applications (Xiong & Marble, 1996), which would demand sub-second response times for analytical operations on large and highly dynamic datasets, is also starting to haunt us.
Faced with these immense data resources, many of them freely available over the Internet, and with yesterday's workstation that capital budget restrictions prevent us from replacing, it is difficult to stretch beyond moderate-sized datasets and limited analytical operations if our processing capabilities are not to be overwhelmed. The more exploratory, combinatorial or interactive kinds of analysis are therefore not accessible to the extent we would wish.
It is very important that the field should not lose impetus, now that rich data resources are available. The expectations of potential users are high, yet the throughput of useful results from GIS analysis is often limited. Processor, memory, disk and network constraints still temper the enthusiasm of even the most energetic and insomniac of postgraduate students. Similarly, contract deadlines tick past before image datasets can be classified, generalised and vectorised, while dozens of workstations on the network stand idle overnight. Power on the desktop has brought excellent interactivity to the user interface, but it has yet to be harnessed enterprise-wide for cost-effective GIS processing.
One further emerging area of application is likely to hold new and significant challenges. The increasing use of real-time mapping and analysis in conjunction with the World-Wide Web (WWW), with millions of potential users, will result in enormous demands for multi-streamed performance. Many web servers already receive tens of thousands of accesses every day, and as the sophistication of these accesses develops, with demands for complex database queries and the mapping of results, systems must service the requests rapidly to satisfy a user community at present frustrated by lack of network bandwidth.
Already Web sites such as Digital’s AltaVista search engine use powerful multiprocessor servers to satisfy current levels of use, yet with only simple and unsophisticated queries. It cannot be long until the demands made of GIS software, customised into vertical markets such as tourist information systems, reach and rapidly exceed these levels.
Given these demands for both high performance computation and data input/output (I/O), it is now a matter of concern that the GIS community is still largely restricted to algorithmic approaches originally developed when only serial processors were available, with performance orders of magnitude lower than contemporary hardware.

1.2 Parallel Opportunities and Problems

The literature has for some years been drawing attention to the ‘promise of general-purpose parallel computing’ (Hack, 1989). This has been based on a number of arguments. Firstly, and most obviously, the ability to distribute components of a large computational task across a number of processors ought to result in more rapid throughput than if only a single processor is utilised (Nicol & Willard, 1988). Secondly, irrespective of how long it takes before fundamental limits to the speedup of serial processors are approached, any performance gains available can be multiplied dramatically by running the processors in parallel. Thirdly, the trend towards the use of commodity processors, even if their interconnects are proprietary, has made parallel machines more maintainable, lower in cost and better able to run versions of standard systems software such as UNIX.
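To put rough numbers on the first two arguments, a standard formulation of attainable speedup (Amdahl's argument, which the chapter itself does not cite but which is consistent with its reasoning) bounds the gain from $p$ processors by the fraction $s$ of the work that must remain serial:

$$ S(p) = \frac{1}{s + (1 - s)/p} $$

For example, a task with a 10% serial component ($s = 0.1$) run on 16 processors can achieve a speedup of at most $1/(0.1 + 0.9/16) = 6.4$, however fast each individual processor becomes.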
However, algorithms developed for serial machines may not run effectively in a parallel environment and vice versa, so utilisation of parallel technology may be less straightforward than at first thought. More specifically, it requires consideration of the trade-offs between:
  • the architecture of individual processing nodes
  • the granularity of the machine (no. of nodes)
  • communication bandwidth and the latency (the time between the sending and receipt of messages between processors)
  • the topology of the interconnection between processors
  • the extent of overlap (or not) of communication and computation (Fox, 1991); a minimal sketch of such overlap follows this list.
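The last two trade-offs are easiest to see in code. The following sketch is illustrative only and is not drawn from the book: it assumes a ring of processes, fixed array sizes and MPI as the message-passing layer, and uses non-blocking sends and receives so that each process keeps computing on its own cells while halo messages are still in flight.

```c
/* Minimal sketch (not from the book) of overlapping communication and
 * computation with non-blocking MPI calls.  Each process exchanges one
 * strip of cells with its neighbours in a ring while computing on the
 * data it already holds.  Compile with an MPI C compiler, e.g. mpicc.  */
#include <mpi.h>
#include <stdio.h>

#define N 1024                         /* cells per process (assumed)    */

int main(int argc, char **argv)
{
    int rank, size;
    double local[N], halo[N];          /* local strip and received halo  */
    MPI_Request sreq, rreq;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (int i = 0; i < N; i++) local[i] = (double)rank;

    int right = (rank + 1) % size;            /* ring topology            */
    int left  = (rank - 1 + size) % size;

    /* Post the communication first ...                                   */
    MPI_Irecv(halo,  N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &rreq);
    MPI_Isend(local, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &sreq);

    /* ... then compute on data that does not depend on the halo, so      */
    /* useful work proceeds while the messages are in transit.            */
    double sum = 0.0;
    for (int i = 0; i < N; i++) sum += local[i] * local[i];

    /* Only now wait for the messages and use the received values.        */
    MPI_Wait(&rreq, MPI_STATUS_IGNORE);
    MPI_Wait(&sreq, MPI_STATUS_IGNORE);
    sum += halo[0];

    printf("rank %d of %d: result %f\n", rank, size, sum);
    MPI_Finalize();
    return 0;
}
```

If the computation between the posts and the waits takes longer than the message transfer (roughly latency plus message size divided by bandwidth), the communication cost is effectively hidden.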
Unfortunately, the issues involved in moving to a parallel computing environment are not limited to the purely algorithmic. In a critical assessment of the challenges of scalable parallel computing, Bell (1994) has referred to the four ‘flat tyres’ of the parallel bandwagon, caused by lack of: systems software, skilled programmers, good heuristics for the design and use of parallel algorithms, and good parallelisable applications. To this, regrettably, we might add the spare tyre of commercial failures among parallel hardware vendors, which have not enhanced confidence in the potential user community. However, it should be noted that failure has been more common among vendors of massively parallel machines, where each component processor has local memory, than among the suppliers of shared memory multi-processors, where common memory is accessed by all processors across a bus (see Chapter 2).
It could be argued that the field of parallel processing, in both its hardware and software components, is simply relearning at an accelerated pace the earlier lessons of computing as a whole: namely, that viability is a function of generality (Bell, op. cit.). More specifically, without general-purpose hardware components, and standardised, re-usable software component libraries, soaring development and maintenance costs will rapidly render even the best-funded of projects uneconomic.
Yet, parallel computing technology and GIS need to be brought together. The increasing complexity of the applications, the volumes of data and the escalating costs of designing higher and higher performance serial processors mean that the adoption of parallelism is inevitable. Until recently, the costs of redesigning software to take advantage of this technology (due to the lack of tools, standards and generic operating system support) has deterred the GIS community from embracing parallelism, but this is now changing.

1.3 GIS and Parallel Processing

With GIS applications clamouring for enhanced throughput, and the potential offered by parallel processing, bringing the technologies together would seem to offer substantial benefits, if successful. However, several earlier points counsel caution in relation to such an ambition. At a very general level, while GIS operations are both compute and I/O intensive, until very recently parallel processing has concentrated much more on performance enhancements of the former than of the latter. This situation is now changing with machines such as the CRAY T3E. Secondly, GIS data structures are complex, with varying numbers of variable-length data records spread across linked files which may be very large (see Chapter 6). Extensive pre-processing may therefore be necessary within the parallel environment, if subsequent computation is to make optimal use of available processors. Thirdly, GIS operations may be multi-stage, requiring the application of several algorithms in turn. These algorithms may contain sequential components. Fourthly, computation and I/O may be interleaved during the same operation. This may be a function of the operation itself, or of the very large size of the datasets, such that it cannot be assumed they will fit entirely into the memory (either shared or distributed among processors) that may be available at any given time. Fifthly, it may be appropriate to link the code for individual operations to a proprietary database manager, for storage or manipulation of co-ordinate or attribute data.
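To illustrate the fourth point, the sketch below (a hypothetical file name and raster size, not an example from the book) interleaves I/O and computation by streaming a raster through memory one block of rows at a time, so that only a small window of the grid is ever resident.

```c
/* Minimal sketch (hypothetical file layout, not from the book) of
 * interleaving I/O and computation when a raster is too large to hold
 * in memory: the grid is read and processed one block of rows at a time. */
#include <stdio.h>
#include <stdlib.h>

#define NCOLS      20000               /* assumed raster width            */
#define BLOCK_ROWS 256                 /* rows held in memory at once     */

int main(void)
{
    FILE *fp = fopen("elevation.bin", "rb");   /* hypothetical file name  */
    if (!fp) { perror("elevation.bin"); return 1; }

    float *block = malloc(sizeof(float) * NCOLS * BLOCK_ROWS);
    if (!block) { fclose(fp); return 1; }

    double sum = 0.0;
    long   count = 0;
    size_t rows;

    /* Each fread is I/O; the inner loop is computation on the block just */
    /* read, so I/O and computation alternate and memory use stays fixed. */
    while ((rows = fread(block, sizeof(float) * NCOLS, BLOCK_ROWS, fp)) > 0) {
        for (size_t i = 0; i < rows * (size_t)NCOLS; i++) {
            sum += block[i];                   /* e.g. a running mean      */
            count++;
        }
    }
    if (count > 0)
        printf("mean cell value: %f over %ld cells\n", sum / count, count);

    free(block);
    fclose(fp);
    return 0;
}
```

In a parallel setting the same pattern would apply per processor, with blocks distributed among the nodes and read concurrently to the extent the I/O subsystem allows.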
The range of issues involved, from database management to applied computational geometry, indicates that the task of making GIS operations, as opposed to individual algorithms, fully functional in a parallel environment is a daunting one. A large number of design decisions have to be made, where the trade-offs may be difficult to calculate with any precision, before any approach to implementation can even be considered. However, what is essential at this stage in the development of the field is that these issues and problems are clearly explained and documented, so that future work can build on them in a structured manner. This is the approach adopted in this book. The main focus is on documentation of algorithm and software design issues, to a much greater level of detail than is common in GIS texts, with implementation examples to illustrate how the research could be taken forward in future.
A final point under this heading is that, if there is a shortage of skilled staff in parallel processing, there are even fewer individuals with expertise in the complex and sometimes arcane world of GIS algorithms. To find those with both types of skill is therefore even more difficult, and makes interdisciplinary collaboration essential. Even with this, the time requirements and costs of work of this kind are considerable and relatively few groups will be able to entertain it. This is all the more reason for proper documentation of work that is done, so no available effort is wasted.

1.4 The Context of this Book

The subsequent sections of this introduction outline the background to the project that funded much of the work reported in this volume, followed by an overview of the structure of the ensuing chapters.

1.4.1 Project Background

This book has its origins in a large research project on parallel GIS funded at the University of Edinburgh by the Department of Trade and Industry (DTI) and SERC, as part of their Parallel Applications Programme, with substantial contributions from a number of industrial partners, who are acknowledged individually at the beginning of this volume. Owing to the size of the interdisciplinary research team, comprising both parallel processing and GIS specialists, the appropriate format for wider dissemination of the results of the project was an edited collection of papers, rather than a book written by just one or two individuals. However, unlike most edited collections, a sustained argument is developed through the first fourteen chapters, before broadening out the coverage in the later chapters to consider a range of related topics.

1.4.2 Project Aims

The DTI/SERC Parallel Applications Programme started in 1991 from the premise that parallel processing was now a sufficiently mature field for the funding emphasis to move from fundamental research towards industrial applications. While this was perhaps true in certain areas, it was difficult to argue this view in relation to GIS, where, at the time, the published output in relation to the application of parallel processing methods was extremely limited. Indeed, with hindsight, a view much closer to that put forward by Kuck (1994) would have been more appropriate. He identified the need to distinguish in parallel processing between computer architectures (i.e. hardware), compilers, applications and problem solving environments. In each of these areas there are three stages, namely research and development, commercialisation and commodity availability. While architecture and compiler developments were well down the road to commercialisation by the early 1990s, the same could not be said of applications and problem solving environments. These were barely at the beginning of a process of commercialisation that could be expected to extend until 2020 and beyond. The results of the GIS project reported here fit more comfortably into Kuck’s realistic assessments of timescales than an over-enthusiastic push for ‘near-market’ activity that is not based on sound research foundations.
Given the very limited amount of previous work on parallel processing in GIS, the main focus of the project was necessarily on algorithm design rather than implementation and performance testing. Restriction in the range of algorithms considered was also essential and the target GIS operations were chosen in consultation with the industrial partners. Initial...

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication Page
  6. Contents
  7. Editors’ Biographical Details
  8. Acknowledgements
  9. Contributors
  10. 1 Introduction
  11. Part One: Parallel Processing Technology in Context
  12. Part Two: Design Issues
  13. Part Three: Parallelising Fundamental GIS Operations
  14. Part Four: Application of Parallel Processing
  15. Part Five: Conclusions
  16. Glossary
  17. Index