This book aims to explore and discuss theories and technologies for the development of socially competent and culture-aware embodied conversational agents for elderly care. To tackle the challenges of ageing societies, this book was written by experts with backgrounds in assistive technologies for elderly care, culture-aware computing, multimodal dialogue, social robotics and synthetic agents. Chapter 1 presents a vision of an intelligent agent to illustrate the current challenges for the design and development of adaptive systems. Chapter 2 examines how notions of trust and empathy may be applied to human-robot interaction and how they can be used to create the next generation of empathic agents, which address some of the pressing issues in multicultural ageing societies. Chapter 3 discusses multimodal machine learning as an approach to enable more effective and robust modelling technologies and to develop socially competent and culture-aware embodied conversational agents for elderly care. Chapter 4 explores the challenges associated with real-world field tests and deployments. Chapter 5 gives a short introduction to socio-cognitive language processing, which describes the idea of coping with everyday language, irony, sarcasm, humor, paralinguistic information such as the physical and mental state and traits of the dialogue partner, and social aspects. This book grew out of the Shonan Meeting seminar entitled "Multimodal Agents for Ageing and Multicultural Societies" held in 2018 in Japan. It will help researchers and practitioners understand this emerging field and identify promising approaches from a variety of disciplines such as human-computer interaction, artificial intelligence, modelling, and learning.
Current speech recognition systems are based on speaker-independent speech models and suffer from inter-speaker variations in speech signal characteristics. This work develops an integrated approach to speech and speaker recognition in order to open up self-learning opportunities for the system. It introduces reliable speaker identification, which enables the speech recognizer to create robust speaker-dependent models. In addition, this book presents a new approach to the reverse problem: how to improve speech recognition when speakers can be recognized. Speaker identification allows the system to adapt to different speakers, resulting in optimal long-term adaptation.
In this work, the authors present a fully statistical approach to modelling non-native speakers' pronunciation. Second-language speakers pronounce words in many ways that differ from native speakers. These deviations, be they phoneme substitutions, deletions or insertions, can be modelled automatically with the new method presented here. The method is based on a discrete hidden Markov model serving as a word pronunciation model, initialized from a standard pronunciation dictionary. The implementation and functionality of the methodology have been proven and verified with a test set of non-native English in the accent under consideration. The book is written for researchers with a professional interest in phonetics and automatic speech and speaker recognition.
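The core idea can be illustrated in miniature. The sketch below is not the book's implementation; the phoneme inventory, probabilities and scoring are all hypothetical. It builds a discrete left-to-right HMM from a canonical dictionary entry, with each state emitting its canonical phoneme at high probability while reserving mass for substitutions; self-loops model insertions and state skips model deletions, so a non-native variant scores lower than the canonical form but is still representable.

```python
import math

# Illustrative sketch (hypothetical parameters, not the book's system):
# a discrete left-to-right HMM over phoneme symbols, initialized from a
# canonical pronunciation dictionary entry.

def init_pronunciation_hmm(canonical, phoneme_set, sub_prob=0.1):
    """One state per canonical phoneme; emission mass spread over substitutions."""
    n_other = len(phoneme_set) - 1
    states = []
    for ph in canonical:
        emis = {p: sub_prob / n_other for p in phoneme_set if p != ph}
        emis[ph] = 1.0 - sub_prob
        states.append(emis)
    return states

def log_likelihood(states, observed, skip_prob=0.05, loop_prob=0.05):
    """Viterbi score of an observed phoneme sequence against the model."""
    neg_inf = float("-inf")
    n, m = len(states), len(observed)
    v = [[neg_inf] * (n + 1) for _ in range(m + 1)]
    v[0][0] = 0.0
    step = 1.0 - skip_prob - loop_prob
    for t in range(m + 1):
        for s in range(n + 1):
            if v[t][s] == neg_inf:
                continue
            if t < m and s < n:
                e = states[s].get(observed[t], 1e-6)
                # emit observed[t] and advance to the next state
                v[t + 1][s + 1] = max(v[t + 1][s + 1], v[t][s] + math.log(step * e))
                # insertion: emit but stay in the same state (self-loop)
                v[t + 1][s] = max(v[t + 1][s], v[t][s] + math.log(loop_prob * e))
            if s < n:
                # deletion: skip a canonical phoneme without emitting
                v[t][s + 1] = max(v[t][s + 1], v[t][s] + math.log(skip_prob))
    return v[m][n]

phones = ["dh", "d", "ax", "ah", "s", "z", "t"]
model = init_pronunciation_hmm(["dh", "ax"], phones)   # canonical "the"
canonical_score = log_likelihood(model, ["dh", "ax"])
variant_score = log_likelihood(model, ["d", "ax"])     # a common L2 substitution
assert canonical_score > variant_score
```

In a full system the emission and transition probabilities would be re-estimated from non-native speech data rather than fixed by hand as they are here.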
Proactive Spoken Dialogue Interaction in Multi-Party Environments describes spoken dialogue systems that act as independent dialogue partners in the conversation with and between users. The resulting novel characteristics, such as proactiveness and multi-party capabilities, pose new challenges for the dialogue management component of such a system and require the use and administration of an extensive dialogue history. To support the development of proactive spoken dialogue systems, a comprehensive data collection is essential and may be performed in a Wizard-of-Oz environment. Such an environment also provides an appropriate basis for an extensive usability and acceptance evaluation. Proactive Spoken Dialogue Interaction in Multi-Party Environments is a useful reference for students and researchers in speech processing.
Bandwidth Extension of Speech Signals describes the theory and methods for quality enhancement of clean speech signals and distorted speech signals such as those that have undergone a band limitation, for instance, in a telephone network. Problems and the respective solutions are discussed for the different approaches. The different approaches are evaluated and a real-time implementation of the most promising approach is presented. The book includes topics related to speech coding, pattern- / speech recognition, speech enhancement, statistics and digital signal processing in general.
Over the last decade a number of research areas have contributed to the concept of advanced intelligent environments; these include ubiquitous computing, pervasive computing, embedded intelligence, intelligent user interfaces, human factors, intelligent buildings, mobile communications, domestic robots, intelligent sensors, artistic and architectural design and ambient intelligence. Undeniably, multimodal spoken language dialogue interaction is a key factor in ensuring natural interaction and is therefore of particular interest for advanced intelligent environments; it represents one focus of this book. The book covers all key topics in the field of intelligent environments from a variety of leading researchers and brings together several perspectives on research and development in the area.
This book covers key topics in the field of intelligent ambient adaptive systems. It focuses on the results worked out within the framework of the ATRACO (Adaptive and TRusted Ambient eCOlogies) project. The theoretical background, the developed prototypes, and the evaluated results form a fertile ground useful for the broad intelligent environments scientific community as well as for industrial interest groups. The new edition provides: comments by the chapter authors on their ATRACO work, with final remarks in retrospect; updates to each chapter with follow-up work emerging from ATRACO; an extensive introduction to state-of-the-art statistical dialog management for intelligent environments; and approaches showing how trust is reflected during the dialog with the system.
Spoken Dialogue Systems Technology and Design covers key topics in the field of spoken language dialogue interaction from a variety of leading researchers. It brings together several perspectives in the areas of corpus annotation and analysis, dialogue system construction, as well as theoretical perspectives on communicative intention, context-based generation, and modelling of discourse structure. These topics are all part of the general research and development within the area of discourse and dialogue with an emphasis on dialogue systems; corpora and corpus tools and semantic and pragmatic modelling of discourse and dialogue.
The ongoing migration of computing and information access from stationary environments to mobile computing devices for eventual use in mobile environments, such as Personal Digital Assistants (PDAs), tablet PCs, next generation mobile phones, and in-car driver assistance systems, poses critical challenges for natural human-computer interaction. Spoken dialogue is a key factor in ensuring natural and user-friendly interaction with such devices, which are meant not only for computer specialists but also for everyday users. Speech supports hands-free and eyes-free operation, and becomes a key alternative interaction mode in mobile environments, e.g. in cars where driver distraction by manually operated devices may be a significant problem. On the other hand, the use of mobile devices in public places may make alternative modalities, possibly in combination with speech, such as graphics output and gesture input, preferable due to e.g. privacy issues. Researchers' interest is progressively turning to the integration of speech with other modalities such as gesture input and graphics output, partly to accommodate more efficient interaction and partly to accommodate different user preferences. Audience: computer scientists, engineers, and others who work in the area of spoken multimodal dialogue systems in academia and in industry.
Stochastically-Based Semantic Analysis investigates the problem of automatic natural language understanding in a spoken language dialog system. The focus is on the design of a stochastic parser and its evaluation with respect to a conventional rule-based method. Stochastically-Based Semantic Analysis will be of most interest to researchers in artificial intelligence, especially those in natural language processing, computational linguistics, and speech recognition. It will also appeal to practicing engineers who work in the area of interactive speech systems.
Reasoning for Information: Seeking and Planning Dialogues provides a logic-based reasoning component for spoken language dialogue systems. This component, called the Problem Assistant, is responsible for processing constraints on a possible solution obtained from various sources, namely the user and the system's domain-specific information. The authors also present findings on the implementation of a dialogue management interface to the Problem Assistant. The dialogue system supports simple mixed-initiative planning interactions in the TRAINS domain, which is still a relatively complex domain involving a number of logical constraints and relations forming the basis for the collaborative problem-solving behavior that drives the dialogue.
In this book, hierarchical structures based on neural networks are investigated for automatic speech recognition. These structures are mainly evaluated within the phoneme recognition task under the hybrid Hidden Markov Model/Artificial Neural Network (HMM/ANN) paradigm. The baseline hierarchical scheme consists of two levels, each of which is based on a Multilayer Perceptron (MLP), where the output of the first level is used as an input for the second level. This system can be substantially sped up by removing the redundant information contained in the output of the first level.
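The two-level scheme can be sketched in a few lines. The dimensions, random weights and network sizes below are purely hypothetical stand-ins, not the book's configuration: a first MLP maps an acoustic feature frame to phoneme posteriors, and those posteriors then serve as the input features of a second MLP.

```python
import numpy as np

# Illustrative sketch (hypothetical sizes and untrained random weights,
# not the book's system): a two-level MLP hierarchy in which the phoneme
# posteriors of the first level feed the second level.

rng = np.random.default_rng(0)

def mlp_forward(x, w1, b1, w2, b2):
    """Single-hidden-layer MLP: sigmoid hidden units, softmax output."""
    h = 1.0 / (1.0 + np.exp(-(x @ w1 + b1)))
    z = h @ w2 + b2
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

n_feat, n_hidden, n_phones = 39, 64, 40   # e.g. one MFCC frame -> 40 phoneme classes

# First level: acoustic features -> phoneme posteriors
w1a, b1a = rng.normal(size=(n_feat, n_hidden)), np.zeros(n_hidden)
w2a, b2a = rng.normal(size=(n_hidden, n_phones)), np.zeros(n_phones)

# Second level: first-level posteriors -> refined posteriors
w1b, b1b = rng.normal(size=(n_phones, n_hidden)), np.zeros(n_hidden)
w2b, b2b = rng.normal(size=(n_hidden, n_phones)), np.zeros(n_phones)

frame = rng.normal(size=(1, n_feat))
level1_post = mlp_forward(frame, w1a, b1a, w2a, b2a)
level2_post = mlp_forward(level1_post, w1b, b1b, w2b, b2b)

assert level1_post.shape == (1, n_phones)
assert np.isclose(level2_post.sum(), 1.0)
```

The speed-up mentioned in the blurb would correspond to reducing the dimensionality of `level1_post` before it enters the second level, since much of it is redundant.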
In this book, a novel approach that combines speech-based emotion recognition with adaptive human-computer dialogue modeling is described. With the robust recognition of emotions from speech signals as their goal, the authors analyze the effectiveness of using a plain emotion recognizer, a speech-emotion recognizer combining speech and emotion recognition, and multiple speech-emotion recognizers at the same time. The semi-stochastic dialogue model employed relates user emotion management to the corresponding dialogue interaction history and allows the device to adapt itself to the context, including altering the stylistic realization of its speech. This comprehensive volume begins by introducing spoken language dialogue systems and providing an overview of human emotions, theories, categorization and emotional speech. It moves on to cover the adaptive semi-stochastic dialogue model and the basic concepts of speech-emotion recognition. Finally, the authors show how speech-emotion recognizers can be optimized, and how an adaptive dialogue manager can be implemented. The book, with its novel methods to perform robust speech-based emotion recognition at low complexity, will be of interest to a variety of readers involved in human-computer interaction.
Speech and Human-Machine Dialog focuses on the dialog management component of a spoken language dialog system. Spoken language dialog systems provide a natural interface between humans and computers. These systems are of special interest for interactive applications, and they integrate several technologies including speech recognition, natural language understanding, dialog management and speech synthesis. Due to the conjunction of several factors throughout the past few years, humans are significantly changing their behavior vis-a-vis machines. In particular, the use of speech technologies will become normal in the professional domain, and in everyday life. The performance of speech recognition components has also significantly improved. This book includes various examples that illustrate the different functionalities of the dialog model in a representative application for train travel information retrieval (train time tables, prices and ticket reservation). Speech and Human-Machine Dialog is designed for a professional audience, composed of researchers and practitioners in industry. This book is also suitable as a secondary text for graduate-level students in computer science and engineering.
This book is a collection of eleven chapters which together represent an original contribution to the field of (multimodal) spoken dialogue systems. The chapters include highly relevant topics, such as dialogue modeling in research systems versus industrial systems, evaluation, miscommunication and error handling, grounding, statistical and corpus-based approaches to discourse and dialogue modeling, data analysis, and corpus annotation and annotation tools. The book contains several detailed application studies, including, e.g., speech-controlled MP3 players in a car environment, negotiation training with a virtual human in a military context, application of spoken dialogue to question-answering systems, and cognitive aspects in tutoring systems. The chapters vary considerably with respect to the level of expertise required in advance to benefit from them. However, most chapters start with a state-of-the-art description from which all readers from the spoken dialogue community may benefit. Overview chapters and state-of-the-art descriptions may also be of interest to people from the human-computer interaction community.
In its nine chapters, this book provides an overview of the state-of-the-art and best practice in several sub-fields of evaluation of text and speech systems and components. The evaluation aspects covered include speech and speaker recognition, speech synthesis, animated talking agents, part-of-speech tagging, parsing, and natural language software like machine translation, information retrieval, question answering, spoken dialogue systems, data resources, and annotation schemes. With its broad coverage and original contributions this book is unique in the field of evaluation of speech and language technology. This book is of particular relevance to advanced undergraduate students, PhD students, academic and industrial researchers, and practitioners.
Adaptive Multimodal Interactive Systems introduces a general framework for adapting multimodal interactive systems and comprises a detailed discussion of each of the steps required for adaptation. This book also investigates how interactive systems may be improved in terms of usability and user friendliness while describing the exhaustive user tests employed to evaluate the presented approaches. After introducing general theory, a generic approach for user modeling in interactive systems is presented, ranging from an observation of basic events to a description of higher-level user behavior. Adaptations are presented as a set of patterns similar to those known from software or usability engineering. These patterns describe recurring problems and present proven solutions. The authors include a discussion on when and how to employ patterns and provide guidance to the system designer who wants to add adaptivity to interactive systems. In addition to these patterns, the book introduces an adaptation framework, which exhibits an abstraction layer using Semantic Web technology. Adaptations are implemented on top of this abstraction layer by creating a semantic representation of the adaptation patterns. The patterns cover both graphical interfaces as well as speech-based and multimodal interactive systems.
Novel Techniques for Dialectal Arabic Speech describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, while assuming that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA ranks first among Arabic dialects in terms of number of speakers, and a high-quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to cross-lingually use MSA in dialectal Arabic speech recognition, the authors have normalized the phoneme sets for MSA and ECA. After this normalization, they have applied state-of-the-art acoustic model adaptation techniques like Maximum Likelihood Linear Regression (MLLR) and Maximum A-Posteriori (MAP) to adapt existing phonemic MSA acoustic models with a small amount of dialectal ECA speech data. Speech recognition results indicate a significant increase in recognition accuracy compared to a baseline model trained with only ECA data.
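The MAP adaptation step mentioned above has a compact closed form for a Gaussian mean. The sketch below is illustrative only; the prior mean, adaptation frames, occupancies and the relevance factor tau are hypothetical toy values, not the book's models or data. The adapted mean interpolates between the prior (MSA-trained) mean and the sample mean of the adaptation (ECA) data, weighted by how much data the component has seen.

```python
import numpy as np

# Illustrative sketch of MAP mean adaptation for a single Gaussian
# component (real systems adapt full HMM/GMM acoustic models; all
# values here are hypothetical toy data).

def map_adapt_mean(prior_mean, frames, gammas, tau=10.0):
    """mu_map = (tau * mu_prior + sum_t gamma_t * x_t) / (tau + sum_t gamma_t)"""
    gammas = np.asarray(gammas, dtype=float)[:, None]
    occ = gammas.sum()
    return (tau * prior_mean + (gammas * frames).sum(axis=0)) / (tau + occ)

prior = np.zeros(3)                  # MSA-trained component mean (toy)
frames = np.full((50, 3), 2.0)       # dialectal adaptation frames (toy)
gammas = np.ones(50)                 # per-frame component occupancy (toy)

adapted = map_adapt_mean(prior, frames, gammas, tau=10.0)
# With 50 observed frames the mean moves most of the way toward the
# data mean: (10*0 + 50*2) / (10 + 50) = 5/3 per dimension.
assert np.allclose(adapted, 5.0 / 3.0)
```

With little adaptation data the component stays near its MSA prior; with more ECA data it converges to the dialectal statistics, which is exactly why MAP suits the sparse-resource setting the blurb describes.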
In Monitoring Adaptive Spoken Dialog Systems, authors Alexander Schmitt and Wolfgang Minker investigate statistical approaches for recognizing negative dialog patterns in Spoken Dialog Systems (SDS). The stochastic methods presented allow flexible, portable and accurate use. Beginning with the foundations of machine learning and pattern recognition, this monograph examines how frequently users show negative emotions in spoken dialog systems and develops novel approaches to speech-based emotion recognition using a hybrid approach to model emotions. The authors make use of statistical methods based on acoustic, linguistic and contextual features to examine the relationship between the interaction flow and the occurrence of emotions, using non-acted recordings of several thousand real users from commercial and non-commercial SDS. Additionally, the authors present novel statistical methods that spot problems within a dialog based on interaction patterns. These approaches enable future SDS to offer more natural and robust interactions. This work provides insights, lessons and inspiration for future research and development, not only for spoken dialog systems, but for data-driven approaches to human-machine interaction in general.
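To make the idea of spotting problems from interaction patterns concrete, here is a purely illustrative sketch. The features (reprompt count, barge-ins, mean ASR confidence) and the hand-set logistic weights are hypothetical, not the authors' trained models, which learn such weights from thousands of real dialogs.

```python
import math

# Purely illustrative sketch (hypothetical features and hand-set weights,
# not the book's trained models): scoring how likely a dialog is running
# into trouble from simple interaction-pattern features.

def problem_score(n_reprompts, n_barge_ins, mean_asr_confidence):
    """Logistic score in (0, 1); higher = dialog more likely problematic."""
    z = 1.2 * n_reprompts + 0.8 * n_barge_ins - 4.0 * mean_asr_confidence
    return 1.0 / (1.0 + math.exp(-z))

smooth = problem_score(n_reprompts=0, n_barge_ins=0, mean_asr_confidence=0.9)
troubled = problem_score(n_reprompts=3, n_barge_ins=2, mean_asr_confidence=0.4)
assert troubled > 0.5 > smooth
```

An adaptive SDS could use such a score at runtime to switch to a more conservative confirmation strategy or escalate to a human operator.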
This book addresses the problem of separating spontaneous multi-party speech by way of microphone arrays (beamformers) and adaptive signal processing techniques. It is written in a concise manner, and an effort has been made to ensure that all presented algorithms can be straightforwardly implemented by the reader. All experimental results have been obtained with real in-car microphone recordings involving simultaneous speech of the driver and the co-driver.
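The simplest member of the beamformer family the blurb refers to is delay-and-sum. The sketch below uses synthetic sinusoids and a hypothetical two-microphone geometry, not the book's in-car recordings or its adaptive algorithms: steering toward the target's known inter-microphone delay keeps the target coherent across channels while partially cancelling the interferer.

```python
import numpy as np

# Illustrative sketch (synthetic signals and hypothetical geometry, not
# the book's in-car data): a two-microphone delay-and-sum beamformer.

fs = 16000
t = np.arange(fs) / fs
driver = np.sin(2 * np.pi * 200 * t)      # target talker (toy signal)
codriver = np.sin(2 * np.pi * 330 * t)    # interfering talker (toy signal)
delay = 8                                  # target's delay at mic 2, in samples

# Each mic hears both talkers with different relative delays.
mic1 = driver + codriver
mic2 = np.roll(driver, delay) + np.roll(codriver, -5)

def delay_and_sum(signals, delays):
    """Advance each channel by its steering delay, then average."""
    aligned = [np.roll(s, -d) for s, d in zip(signals, delays)]
    return np.mean(aligned, axis=0)

out = delay_and_sum([mic1, mic2], [0, delay])   # steer toward the driver

def snr_db(sig, target):
    err = sig - target
    return 10 * np.log10(np.sum(target**2) / np.sum(err**2))

# The driver adds coherently while the co-driver is misaligned between
# channels, so the beamformer output beats a single microphone's SNR.
assert snr_db(out, driver) > snr_db(mic1, driver)
```

The adaptive techniques in the book go further by updating the channel filters online, but the steering-and-combining principle is the same.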
Introducing Spoken Dialogue Systems into Intelligent Environments outlines the formalisms of a novel knowledge-driven framework for spoken dialogue management and presents the implementation of a model-based Adaptive Spoken Dialogue Manager (ASDM) called OwlSpeak. The authors have identified three stakeholders that potentially influence the behavior of the ASDM: the user, the SDS, and a complex Intelligent Environment (IE) consisting of various devices, services, and task descriptions. The theoretical foundation of a working ontology-based spoken dialogue description framework, the prototype implementation of the ASDM, and the evaluation activities that are presented as part of this book contribute to the ongoing spoken dialogue research by establishing the fertile ground of model-based adaptive spoken dialogue management. This monograph is ideal for advanced undergraduate students, PhD students, and postdocs as well as academic and industrial researchers and developers in speech and multimodal interactive systems.