Beep to Boom
eBook - ePub

Beep to Boom

The Development of Advanced Runtime Sound Systems for Games and Extended Reality

  1. 288 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android


About This Book

Drawing on decades of experience, Beep to Boom: The Development of Advanced Runtime Sound Systems for Games and Extended Reality is a rigorous, comprehensive guide to interactive audio runtime systems.

Packed with practical examples and insights, the book explains each component of these complex geometries of sound. Using practical, lowest-common-denominator techniques, Goodwin covers soundfield creation across a range of platforms from phones to VR gaming consoles.

Whether creating an audio system from scratch or building on existing frameworks, the book also explains costs, benefits and priorities. In the dynamic simulated world of games and extended reality, interactive audio can now consider every intricacy of real-world sound. This book explains how and why to tame it enjoyably.

Beep to Boom by Simon Goodwin is available in PDF and/or ePUB format and is catalogued under Technology & Engineering and Acoustical Engineering.

Information

Publisher
Routledge
Year
2019
ISBN
9781351005524

Chapter 1

The Essence of Interactive Audio

Let’s pretend you are strapped in on the starting grid of a Formula 1 motor race. Twenty-four cars of a dozen designs roar as the lights turn green. Ninety-six tyres of various types and infinitely variable loads are poised to emit any or all of eight subtly inflected sounds, including peel, skid, scrub, bumps and roll transitions, not to mention brakes, suspension and other parts. Seconds later, they’ll be jostling for position on the first corner.
In F1 and similar games, all this is orchestrated in absentia by the audio team, performed by the audio runtime system, conducted by the player. It might play on a single phone speaker, stereo headphones, surround-sound speakers or more than one of those at a time. Interactive audio might be rendered more than once, in a local split-screen view or remotely on the screens of competing players or in headphones for a virtual reality (VR) player and simultaneously on speakers for a watching audience.
Or imagine you are in a shell-hole—for practice, as penance or for kicks. Between the bomb-blasts, small-arms fire and cries of friends and foes, buffeted mud squelches round your calves as you squirm. Hundreds of explosions large and small echo above and around you. Sounds shift realistically in your speakers or headphones as you warily scan the parapet. The ground around you thumps with each impact. Will you know which way and when to jump? How soon and how surely can you know when an incoming shell has your number on it? Without accurate audio—not just cinematic immersion—you won’t last long.
More prosaically, imagine you’re trying to find a public toilet. Normal headphones, in conjunction with binaural sound synthesis, GPS, gyros and an augmented reality system in your phone, can tell you which way to go, whatever you’re looking at, whatever you’re trying to find.
A rally car splashes across a ditch. In Codemasters’ DiRT2 game, this “event” triggers 24 voices of fresh sound effects to play. Milliseconds later, as the positions, velocities, amplitudes and echoes of all those sounds are finely adjusted, another dozen voices chirp up. That’s what happens when you give a sound designer a budget of hundreds of voices and need to sell millions of copies on all major console and PC platforms.
That’s how modern games work. Whether the player is racing, exploring, shooting or playing a sport, convincing, immersive and informative sound is essential. The same techniques apply to virtual, augmented and extended realities, in education or training as well as entertainment. Each soundfield is tailored for a single listener who controls the camera, the view-point (first or third person, close or distant) and chooses the listening environment. They can change all these parameters on a whim.
Propelled by commercial and aesthetic competitive pressures, gamers represent a diverse and wealthy global community of early adopters, while extended reality introduces new markets in education and simulation. Interactive audio is a superset of prior sound technologies. This book explains what it needs and what it lacks.

Size of the Challenge

The author was Central Technology Group Lead Programmer on the DiRT2 driving game, which has grossed around $200M since 2009, primarily on PlayStation and Xbox consoles; it’s still selling on PC and Mac. More than 50 programmers worked on that game for more than a year, plus a similar number of full-time testers, dozens of designers and more than 100 graphic artists. This book directly addresses the roles of the eight sound designers and five audio programmers.
The concepts are relevant to anyone interested in creating interactive media and the many differences between that and the old passive media of TV, cinema and recorded music. This book draws on decades of game development experience, encompassing music, sport and shooting genres and even VR space exploration. It includes war stories, deep geek detail, analysis and predictions.
DiRT and Formula 1 (F1) games are barely the tip of the iceberg. Top-selling “triple-A” games, like Rockstar’s Grand Theft Auto 5, cost hundreds of millions of dollars to develop but generate operating profits of billions of dollars.[1] In its first five years GTA5 sold more than 90 million copies worldwide, at prices higher than any movie. Such success is only possible by designing a game to suit all platforms, current and future, rather than picking one console or PC configuration.
GTA5 audio benefits from audio adaptations demonstrated by the author at the 2009 Audio Engineering Society (AES) Audio for Games conference in London. Such scalability is a prerequisite of sustained global sales, and it’s been achieved by constant technical innovation, including the adoption and refinement of advanced techniques described here.
This is a book about managing complexity in a way that suits the customers, the designers, the platforms and current and future genres of entertainment and training. It’s a book about curves, synergies and neat tricks. It’s also about having fun making interesting noises and understanding and playing with psychoacoustic principles.

It’s Different for Games

A misperception, slowly abating, concerns the superficial resemblance between games and passive media like TV and movies. Those are made for a mass audience, compromised to suit a generic consumer and set in stone before release. But each game is live, one time only.
The experience is never the same twice. The more it varies, the greater its lasting appeal and the more it involves and teaches the player. Whether it’s limited to a few dozen sounds or the hundreds of concurrent samples modern computers can mix, it is a designed experience, dependent upon categorisations and decisions made long before but mediated by the player. There is no ‘final cut.’
The runtime system necessarily embeds the experience of sound designers, engineers and live mixers so it can “finish” the product on the fly. This draws on the asset-selection and balancing skills of designers, just as movies might, but demands more variety, flexibility and dynamic configuration, because the players call the shots. Audio is more demanding than graphics because all directions are equally important in a game, the ear never blinks and there’s no “persistence of hearing” akin to the persistence of vision which smooths out the flickering of video.
Perception is multi-modal. Georgia Tech research, published by MIT in 2001, established that high-quality video is more highly rated when coupled with high-quality sound.[2] Good graphics also make poor audio seem worse! Even if audio is not explicitly mentioned, it has a profound influence on perception.

Psychoacoustics

Sounds have multiple dimensions: pitch, intensity, spread, timbre, distance, direction, motion, environment. Our brains interpret these according to psychoacoustic curves learned over lifetimes in the real world. Each aspect, and the changes in each dimension, must be plausible, sympathetic to the whole and perceived to be smooth, progressive and repeatable, to maximise the informational content and forestall confusion.
Whether they can see it or not, audio tells the player the distance and direction of each sound, how it is moving, and what sort of environment surrounds it and the listener. More deeply, it identifies threats, opportunities, choices and risks which will influence future events, in whatever way the player chooses to interpret them. Learning by finding out is more personal and persuasive than following some other’s story. It relies on coherence, consistency and flexibility, because any glitch might break the spell and destroy the suspension of disbelief which turns a simulation into a lived experience.
Like any artistic representation, the results can be symbolic, realistic or hyper-real. Games and VR often target a remembered dream state more than strict reality, in which auditory salience works as a filter rather like depth of field in a film. But whereas cinema auteurs pick a subset of the sounds to fit their pre-ordained story and spend arbitrary time compiling each scene, interactive media must continuously identify and include cues that help the listener create a new, unique story of their own, not once but many times over, without inducing monotony or revealing gaps that might burst the bubble of immersion and agency.
Strict realism is just a start. The most realistic Richard Burns Rally would only be playable by Richard Burns. In a real F1 race you’d rarely hear anything but your own engine, but in a game that’s not good enough. Interactive audio has the capability to be symbolic, extracting just the essence or key cues in a scene, or hyper-real—reproducing the remembered synaesthetic experience, augmented by imagination, not the prosaic reality. If you’re ever unlucky enough to be in a car accident you may find the crash disappointingly dull by game standards—but games and VR are meant to be fun, thus all the more memorable.

What’s New?

This book explains how modern interactive audio systems create and maintain consistent and informative soundfields around game players and consumers of extended reality products. It complements books for sound designers and game programmers by filling the gaps between their accustomed models of reality.
It’s not a book about sound design, though it contains many tips for sound designers, especially related to interactivity. It’s not a book for mathematicians, though it builds on and refers to their work. It’s not even a book about psychoacoustics, though of all the related fields, that one informs the content most of all. There are many excellent books about all those subjects, as the references reveal.
This book pulls together concepts from those fields and decades of practical systems design and programming experience to explain how they fit together in modern games and extended realities. It presents a layered approach to implementing advanced audio runtime systems which has been successfully applied to platforms ranging from obsolescent telephones to the latest game consoles, arcade cabinets and VR rigs.
This is a practical book for designers, programmers and engineers, boffins, inventors and technophiles. The focus is on runtimes—the active parts that ship to the customer—rather than pre-production tools, though Chapter 24 surveys free and commercial tools for content-creation. It tells how to do things, with tested examples, but more significantly it explains why those things are useful and how they fit together.
Starting from the first computer-generated beeps and the basic challenges of volume and pitch control, it traces the development of audio output hardware from tone generators to sample replay, the layering power of multi-channel mixing and the spatial potential of HDMI output, building up to the use of Ambisonic soundfields to recreate the sensation of hundreds of independently positioned and moving sound sources around the listener.
The author has pioneered the interactive use of 3D soundfields but is well aware from research and direct experience that few listeners enjoy a perfect listening environment. One of the greatest changes taking place in media consumption in the 21st century is the realisation that there’s no correct configuration. Figure 1.1 shows the preferred listening configurations of more than 700 console and PC gamers.[3] Every listener benefits from a custom mix, and interactive audio systems deliver that as a matter of course.
Figure 1.1: Listening preferences in percentage by output format
Since then phones and tablets have shifted the goalposts—although Apple TV and many Androids support HDMI and 7.1 output, most listeners use mono speakers. Live analytics provide specific usage information. Over 40 million sessions, barely 3% of mobile players of F1 Race Stars used headphones; 17% had the audio muted! The story was similar in Colin McRae Rally for iOS, with 8% on headphones and 12% unable to hear the co-driver. It’s nice to know that 88% were listening, even if mostly in mono.

Single-Player Optimisation

Traditional movies deliver a generic mix for a mass audience in a cinema, aiming to create a sense of immersion so that those in the cheap seats still feel part of the party. Immersion is easy—it involves little more than spreading ambience around the listener. But interactive media is much more demanding. It focusses on each individual listener without requiring them to be locked in a halo-brace centred in an anechoic irregular pentagonal room with a wire-mesh floor—even if that would deliver the most technically perfect reconstruction available with commodity components to the golden ears of an ideal listener.
Some expensive aspects of VR—like head tracking, which adjusts the experience every few milliseconds to account for movement of the client’s ears and eyes—are not needed for conventional gaming. Here the player’s thumb directs the senses of their avatar; looking away from the screen is a recipe for disaster. However control is achieved, audio challenges remain. The “sweet spot” in which sounds are perfectly balanced may be small, especially if space is tight and speaker positions are compromised, but a lone player is strongly incentivised to move into it.
Playing and exploring together is still fun even if the shared environment compromises individualised perception. So this book reveals techniques to deliver multiple soundfields in a single listening space for split-screen multi-player games and ways to mix adaptively for multiple simultaneous listening environments, like a single-player VR experience shared by remote passive observers or others waiting their turn in the same room.
This concept of “multiple endpoints” marks out new media platforms and modern ways of listening. Custom mixing and spatialisation techniques can create additional coherent soundfields at low marginal cost. We’ll explore the pros and cons of headphone and multi-loudspeaker listening while showing how even a single mono loudspeaker can inform active listeners about the movement and relative position of 3D sounds in their vicinity.
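
To make the "multiple endpoints" idea concrete, here is a minimal sketch, not taken from the book, of one way a runtime might derive gains for several outputs from a single source. Everything in it is an illustrative assumption: the 2D coordinates, the inverse-distance rolloff, the constant-power pan law and all function and endpoint names. The point it tries to show is the low marginal cost: the per-source geometry is computed once and each extra endpoint only adds a few multiplications. In this simplified form the mono endpoint conveys distance through level alone.

```python
import math

def distance_gain(distance, reference=1.0):
    # Assumed inverse-distance rolloff, clamped so that sources closer
    # than the reference distance never exceed unity gain.
    return reference / max(distance, reference)

def source_state(source_pos, listener_pos):
    # Shared per-source work, done once however many endpoints exist.
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    distance = math.hypot(dx, dy)
    azimuth = math.atan2(dx, dy)  # 0 = straight ahead (+y), positive = right (+x)
    return distance_gain(distance), azimuth

def mono_gains(gain, azimuth):
    # A single speaker cannot pan, so only the level is used.
    return (gain,)

def stereo_gains(gain, azimuth):
    # Constant-power pan: left^2 + right^2 always equals gain^2.
    pan = math.sin(azimuth)              # -1 = hard left, +1 = hard right
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0..pi/2
    return (gain * math.cos(angle), gain * math.sin(angle))

# Hypothetical endpoint names, purely for illustration.
ENDPOINTS = {"phone_speaker": mono_gains, "headphones": stereo_gains}

def mix_for_endpoints(source_pos, listener_pos, endpoints=ENDPOINTS):
    gain, azimuth = source_state(source_pos, listener_pos)
    return {name: render(gain, azimuth) for name, render in endpoints.items()}

if __name__ == "__main__":
    # A source 5 units away, ahead and to the right of the listener.
    print(mix_for_endpoints((3.0, 4.0), (0.0, 0.0)))
```

Because `math.sin` folds rear azimuths back towards the centre, this sketch cannot distinguish front from back; a real system would need extra cues, which are outside the scope of this excerpt.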

Dynamic Adjustment

One of the greatest strengths of interactive audio—as opposed to TV, radio, cinema and other media designed to place a ready-made mix before the ears of many passive consumers—is the continuous conversation that takes place between active listeners and a dynamic audio r...

Table of contents

  1. Cover
  2. Half Title
  3. Series
  4. Title
  5. Copyright
  6. Contents
  7. Chapter 1 The Essence of Interactive Audio
  8. Chapter 2 Early Digital Audio Hardware
  9. Chapter 3 Sample Replay
  10. Chapter 4 Interactive Audio Development Roles
  11. Chapter 5 Audio Resource Management
  12. Chapter 6 Loading and Streaming Concepts
  13. Chapter 7 Streaming Case Studies
  14. Chapter 8 The Architecture of Audio Runtimes
  15. Chapter 9 Quick Curves—Transcendental Optimisations
  16. Chapter 10 Objects, Voices, Sources and Handles
  17. Chapter 11 Modelling Distance
  18. Chapter 12 Implementing Voice Groups
  19. Chapter 13 Implementing Listeners
  20. Chapter 14 Split-Screen Multi-Player Audio
  21. Chapter 15 Runtime System Physical Layers
  22. Chapter 16 Mixing and Resampling Systems
  23. Chapter 17 Interactive Audio Codecs
  24. Chapter 18 Panning Sounds for Speakers and Headphones
  25. Chapter 19 Ambisonic Surround-Sound Principles and Practice
  26. Chapter 20 Design and Selection of Digital Filters
  27. Chapter 21 Interactive Reverberation
  28. Chapter 22 Geometrical Interactions, Occlusion and Reflections
  29. Chapter 23 Audio Outputs and Endpoints
  30. Chapter 24 Glossary and Resources
  31. Acknowledgements
  32. Index