eBook - ePub

Intelligent Speech Signal Processing

Name: Intelligent Speech Signal Processing
ISBN: 9780128181317

Nilanjan Dey

209 pages
English

About this book

Intelligent Speech Signal Processing investigates the utilization of speech analytics across several systems and real-world activities, including sharing data analytics, creating collaboration networks between several participants, and implementing video-conferencing in different application areas. Chapters focus on the latest applications of speech data analysis and management tools across different recording systems. The book emphasizes the multidisciplinary nature of the field, presenting different applications and challenges with extensive studies on the design, development and management of intelligent systems, neural networks and related machine learning techniques for speech signal processing.

- Highlights different data analytics techniques in speech signal processing, including machine learning and data mining
- Illustrates different applications and challenges across the design, implementation and management of intelligent systems and neural network techniques for speech signal processing
- Includes coverage of bimodal speech recognition, voice activity detection, spoken language and speech disorder identification, automatic speech-to-speech summarization, and convolutional neural networks

Information

Topic: Technology & Engineering
Subtopic: Business Intelligence

Chapter 1

Speech Processing in Healthcare: Can We Integrate?

K.C. Santosh, Department of Computer Science, The University of South Dakota, Vermillion, SD, United States

Abstract

This chapter focuses on the way speech recognition, processing, and synthesis help in healthcare. It begins with the basic idea of speech recognition in the domain and then focuses on a complete healthcare project so as to give a clear understanding of the value of speech processing. The chapter also provides detailed information about how speech synthesis affects healthcare and its business model.

Keywords

Speech recognition; Text-to-speech; Healthcare; Signal/pattern analysis; Machine learning

Speech recognition, also known as voice recognition, refers to the translation of speech into words in a machine-readable format [1–3].

Speech processing has been considered for various purposes in the domain, for example, signal processing, pattern recognition, and machine learning [3]. Driven by uses that range from improving customer service to the role of hospital care in combating crime, among other purposes, the global speech recognition market has grown from $104.4 billion in 2016 to an estimated $184.9 billion in 2021 (source: https://www.news-medical.net/whitepaper/20170821/Speech-Recognition-in-Healthcare-a-Significant-Improvement-or-Severe-Headache.aspx). This is not a new trend; speech-to-text conversion, for example, has been widely used in cases in which different languages are needed. The opposite direction (text-to-speech) holds true as well [4, 5]. For example, can we process or reuse speech data from a telephone conversation that took place a few years earlier, in which a client claimed that fraud had occurred on his or her credit card? Yes, this is possible. Besides other sources of data, speech can be taken as an authentic component to describe an event or scene wherein emotions can be analyzed [6–8].

Examples exist showing how speech analysis can be integrated into healthcare. In Fig. 1.1, a complete healthcare automated scenario has been created, which can be summarized as follows:

A patient visits a clinical center (hospital), where he/she gets X-rayed, provides sensor-based data (external and internal), and receives (handwritten and machine-printed) prescription(s) and report(s) from the specialist. During these events, the patient and staff (including the specialists) engage in different levels of conversation; if recorded, these conversations can be integrated with the other data through signal processing, pattern recognition, image processing, and machine learning.

In the aforementioned healthcare project, for instance, it would be convenient to combine speech and signal processing tools and techniques with image analysis-based tools and techniques [9–12]. More specifically, doctors can form an initial guess about the presence of tuberculosis, for instance, based on verbal communication (answers to questions such as “do you sleep well?” and “how are your eating habits?”) before they start the X-ray screening procedure. In that case, speech processing can help in the complete project outlined earlier: it would let us assemble the complete information so that further processing can be performed.

Speech and voice cues (before and after the doctor’s visit) can help one understand the patient’s willingness to continue with treatment. Speech and voice can convey emotions over time, and pain may be reflected in the speech/voice level. We can also automate and check the trends related to how doctors and other staff members behave toward patients. Can speech be a component in helping to find consistency that is evident in other sources of data, as shown in Fig. 1.1? The (proposed) convolutional neural network in Fig. 1.1 illustrates that a machine, unlike a human expert, can make a decision without the bias possible in human choices. Visualization can also show how different sources of data are connected. We consider artificial intelligence (AI) and machine learning (ML) tools for automation since data have to be collected over time. Automated analysis of big data is essential because manual analysis is impractical at this scale: humans are more error-prone, and human analysis is costlier.

As mentioned earlier, local languages other than English can be considered in healthcare. In one work [13], authors reported the use of the Tamil language, in addition to English, to estimate heartbeat from speech/voice. A few more can be cited, showing how local, regional languages, such as German [14], Malay [15], and Slovenian [16], have helped speech technology progress. In another context [17], emergency medical care often depends on how quickly and accurately field medical personnel can access a patient’s background information and document their assessment and treatment of the patient. What if we could automate speech/voice recognition tools in the field? It is clear that research scientists should come up with precise tools that people can trust. Analyzing speech/voice concurrently with background music or other noise is important. Other works can be referenced for more detailed information [18–21].

As data change and increase, machine learning could help to automate the data retrieval and recording system. Extreme learning-based voice activity detection is prominent in the field [22]. As we go further, for example toward real-time speech/voice recognition and classification, active learning should be considered, since scientists have found that learning over time is vital [23].
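To make the idea of voice activity detection concrete, the following is a minimal sketch of a plain short-time energy threshold detector. It is not the extreme-learning method cited in [22]; it is a much simpler baseline substituted here for illustration only. The function names (`frame_energies`, `detect_voice`), the frame length, and the threshold are all assumptions, and the input is a synthetic tone rather than real speech.

```python
import math


def frame_energies(samples, frame_len):
    """Return the mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_voice(samples, frame_len=160, threshold=0.01):
    """Flag each frame as active (True) when its energy exceeds the threshold.

    frame_len and threshold are illustrative values, not tuned settings.
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]


# Synthetic test signal at an assumed 8 kHz sampling rate:
# 0.1 s of silence, 0.1 s of a 200 Hz tone, then 0.1 s of silence.
rate = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
signal = silence + tone + silence

flags = detect_voice(signal)
print(flags)
```

With 160-sample frames, the signal splits into 15 frames, and only the five frames covering the tone are flagged as active. A fixed energy threshold of this kind degrades quickly with background noise, which is one reason learned detectors are of interest.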

In general, Fig. 1.1 shows how important speech is as a component alongside other sources of data, such as sensor-based data and image data (X-rays and/or reports that are handwritten or machine printed).

Cover image
Title page
Table of Contents
Copyright
Contributors
About the Editor
Preface
Chapter 1: Speech Processing in Healthcare: Can We Integrate?
Chapter 2: End-to-End Acoustic Modeling Using Convolutional Neural Networks
Chapter 3: A Real-Time DSP-Based System for Voice Activity Detection and Background Noise Reduction
Chapter 4: Disambiguating Conflicting Classification Results in AVSR
Chapter 5: A Deep Dive Into Deep Learning Techniques for Solving Spoken Language Identification Problems
Chapter 6: Voice Activity Detection-Based Home Automation System for People With Special Needs
Chapter 7: Speech Summarization for Tamil Language
Chapter 8: Classifying Recurrent Dynamics on Emotional Speech Signals
Chapter 9: Intelligent Speech Processing in the Time-Frequency Domain
Chapter 10: A Framework for Artificially Intelligent Customized Voice Response System Design using Speech Synthesis Markup Language
Index
