This agenda-setting book presents state-of-the-art research in Music and Human-Computer Interaction (also known as 'Music Interaction'). Music Interaction research is at an exciting and formative stage. Topics discussed include interactive music systems, digital and virtual musical instruments, and theories, methodologies and technologies for Music Interaction. Musical activities covered include composition, performance, improvisation, analysis, live coding, and collaborative music making. Innovative approaches to existing musical activities are explored, as well as tools that make new kinds of musical activity possible. Music and Human-Computer Interaction is stimulating reading for professionals and enthusiasts alike: researchers, musicians, interactive music system designers, music software developers, educators, and those seeking deeper involvement in music interaction. It presents the very latest research, discusses fundamental ideas, and identifies key issues and directions for future work.
Download chronicles the making of the new record industry, from the boom years of the CD revolution of the late 1980s to the crisis of the present day, with particular stress on the last decade. It follows the actions and reactions of the major international record companies, five at the beginning of the story, now four, as they ploughed their way through the digital slough of despond, bewildered by fleet-of-foot digital innovators far more responsive to the changing market conditions through which (recorded) music was consumed and valued. These all have their significant place in Download, but the real story is the structural change that has, almost surreptitiously, taken place within the music business. This change, for reasons author Phil Hardy explains in detail, has left the captains of the record industry as unable to act as they were unwilling to act. In effect, they became little but very well-paid observers of the shrinking of their domains.
Proactive Spoken Dialogue Interaction in Multi-Party Environments describes spoken dialogue systems that act as independent dialogue partners in the conversation with and between users. The resulting novel characteristics, such as proactiveness and multi-party capabilities, pose new challenges for the dialogue management component of such a system and require the use and administration of an extensive dialogue history. In order to assist the development of proactive spoken dialogue systems, comprehensive data collection is essential and may be performed in a Wizard-of-Oz environment. Such an environment also provides the appropriate basis for an extensive usability and acceptance evaluation. Proactive Spoken Dialogue Interaction in Multi-Party Environments is a useful reference for students and researchers in speech processing.
This book discusses all aspects of computing for expressive performance, from the history of computer systems for expressive music performance (CSEMPs) to the very latest research, in addition to discussing the fundamental ideas and key issues and directions for future research. Topics and features: includes review questions at the end of each chapter; presents a survey of systems for real-time interactive control of automatic expressive music performance, including simulated conducting systems; examines two systems in detail, YQX and IMAP, each providing an example of a very different approach; introduces techniques for synthesizing expressive non-piano performances; addresses the challenges found in polyphonic music expression from a statistical modelling point of view; discusses the automated analysis of musical structure and the evaluation of CSEMPs; describes the emerging field of embodied expressive musical performance, devoted to building robots that can expressively perform music with traditional instruments.
Spoken dialog systems have the potential to offer highly intuitive user interfaces, as they allow systems to be controlled using natural language. However, the complexity inherent in natural language dialogs means that careful testing of the system must be carried out from the very beginning of the design process. This book examines how user models can be used to support such early evaluations in two ways: by running simulations of dialogs, and by estimating the quality judgments of users. First, a design environment supporting the creation of dialog flows, the simulation of dialogs, and the analysis of the simulated data is proposed. How the quality of user simulations may be quantified with respect to their suitability for both formative and summative evaluation is then discussed. The remainder of the book is dedicated to the problem of predicting quality judgments of users based on interaction data. New modeling approaches are presented, which process the dialogs as sequences, and which allow knowledge about the judgment behavior of users to be incorporated into predictions. All proposed methods are validated with example evaluation studies.
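The user-simulation idea at the heart of this book can be pictured as sampling walks through a probabilistic dialog flow. A minimal sketch in Python, with a toy flow and made-up transition probabilities (the states and numbers are illustrative, not taken from the book):

    import random

    # Toy dialog flow: each state maps to (next_state, probability) pairs.
    flow = {
        'greet':    [('ask_slot', 1.0)],
        'ask_slot': [('confirm', 0.7), ('ask_slot', 0.2), ('fail', 0.1)],
        'confirm':  [('done', 0.9), ('ask_slot', 0.1)],
    }

    def simulate(max_turns=20):
        """Sample one simulated dialog as a list of visited states."""
        state, turns = 'greet', []
        while state not in ('done', 'fail') and len(turns) < max_turns:
            turns.append(state)
            r, acc = random.random(), 0.0
            for nxt, p in flow[state]:
                acc += p
                if r <= acc:
                    state = nxt
                    break
        return turns + [state]

    print(simulate())   # e.g. ['greet', 'ask_slot', 'confirm', 'done']

Running many such simulations yields the kind of interaction data on which quality predictions can then be trained.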
Anyone wanting to set up a low-cost web radio station will benefit from the advice and information provided by this book. Not only will you gain the technical and practical know-how to enable your station to go live, but also an appreciation of the legal and copyright implications of making radio, potentially for international audiences and in the rapidly evolving environment of the web. To succeed, your radio content will need to be carefully planned and your station properly promoted. Advice is given on taking advantage of the scalability web radio introduces for building audiences in line with your resources, for scheduled live output and for making programmes available on demand, including music, news, speech radio and audience participation. Case studies from around the world are provided to demonstrate how different radio organisations are applying the new flexibility web radio has to offer in a wide range of situations. Together with its associated website www.web-radio-book.com, the book also acts as a starting point for locating a range of sources for further advice and lines of research. Learn how to:
- go live with your own low-cost web radio station (either managing the server yourself or using a host service)
- assess the right server set-up to handle the number of simultaneous listeners expected
- get the best sound quality to your listeners
- take account of the range of devices available for receiving web radio
- plan your station, programming and associated website
- identify and reach your audience
- build audience feedback and data into your station's strategy
- tackle the additional legal and ethical dimensions of radio on the web
- source more detailed information
In Monitoring Adaptive Spoken Dialog Systems, authors Alexander Schmitt and Wolfgang Minker investigate statistical approaches that allow for the recognition of negative dialog patterns in Spoken Dialog Systems (SDS). The presented stochastic methods allow flexible, portable and accurate use. Beginning with the foundations of machine learning and pattern recognition, this monograph examines how frequently users show negative emotions in spoken dialog systems and develops novel approaches to speech-based emotion recognition, using a hybrid approach to model emotions. The authors make use of statistical methods based on acoustic, linguistic and contextual features to examine the relationship between the interaction flow and the occurrence of emotions, using non-acted recordings of several thousand real users from commercial and non-commercial SDS. Additionally, the authors present novel statistical methods that spot problems within a dialog based on interaction patterns. The approaches enable future SDS to offer more natural and robust interactions. This work provides insights, lessons and inspiration for future research and development, not only for spoken dialog systems, but for data-driven approaches to human-machine interaction in general.
An Introduction to Audio Content Analysis enables readers to understand the algorithmic analysis of musical audio signals with AI-driven approaches. It serves as a comprehensive guide to audio content analysis, explaining how signal processing and machine learning approaches can be utilized for the extraction of musical content from audio. It gives readers the algorithmic understanding to teach a computer to interpret music signals, and thus allows for the design of tools for interacting with music. The work ties together topics from audio signal processing and machine learning, showing how to use audio content analysis to pick up musical characteristics automatically. A multitude of audio content analysis tasks related to the extraction of tonal, temporal, timbral, and intensity-related characteristics of the music signal are presented. Each task is introduced from both a musical and a technical perspective, detailing the algorithmic approach as well as providing practical guidance on implementation details and evaluation. To aid in reader comprehension, each task description begins with a short introduction to the most important musical and perceptual characteristics of the covered topic, continues with a detailed algorithmic model and its evaluation, and concludes with questions and exercises. For the interested reader, updated supplemental materials are provided via an accompanying website. Written by a well-known expert in the music industry, An Introduction to Audio Content Analysis covers sample topics including:
- digital audio signals and their representation, common time-frequency transforms, and audio features
- pitch and fundamental frequency detection, key and chord
- representation of dynamics in music and intensity-related features
- onset and tempo detection, beat histograms, detection of structure in music, and sequence alignment
- audio fingerprinting, and musical genre, mood, and instrument classification
An invaluable guide for newcomers to audio signal processing and industry experts alike, An Introduction to Audio Content Analysis covers a wide range of introductory topics pertaining to music information retrieval and machine listening, allowing students and researchers to quickly gain core holistic knowledge in audio analysis and dig deeper into specific aspects of the field with the help of a large number of references.
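As a flavor of the feature-extraction tasks such a book covers, the following is a minimal Python sketch of one classic timbre descriptor, the spectral centroid; function and parameter names are illustrative, not the book's own:

    import numpy as np

    def spectral_centroid(x, sr, frame=2048, hop=1024):
        """Per-frame spectral centroid in Hz (the 'center of mass' of the spectrum)."""
        window = np.hanning(frame)
        freqs = np.fft.rfftfreq(frame, d=1.0/sr)
        centroids = []
        for start in range(0, len(x) - frame, hop):
            mag = np.abs(np.fft.rfft(x[start:start+frame] * window))
            centroids.append(np.sum(freqs * mag) / (np.sum(mag) + 1e-12))
        return np.array(centroids)

Brighter, noisier signals push the centroid up; darker, mellower ones pull it down, which is why it is a common input to instrument and genre classifiers.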
In an era when performing live is more essential than ever, this is the go-to guidebook for getting your show on the road and making a living from music. Previously published as The Tour Book, this new edition has been extensively revised, reorganized, and updated to reflect today's music industry, and is written by a touring professional with over 25 years of experience.
Musical robotics is a multi- and trans-disciplinary research area involving a wide range of different domains that contribute to its development, including: computer science, multimodal interfaces and processing, artificial intelligence, electronics, robotics, mechatronics and more. A musical robot requires many different complex systems to work together, integrating musical representation, techniques, expressions, detailed analysis and controls, for both playing and listening. The development of interactive multimodal systems provides advancements which enable enhanced human-machine interaction and novel possibilities for embodied robotic platforms. This volume is focused on this highly exciting interdisciplinary field. The book consists of 14 chapters highlighting different aspects of musical activities and interactions, discussing cutting-edge research related to interactive multimodal systems and their integration with robots to further enhance musical understanding, interpretation, performance, education and enjoyment. It is divided into two sections: Section I focuses on understanding elements of musical performance and expression, while Section II concentrates on musical robots and automated instruments. Musical Robots and Interactive Multimodal Systems provides an introduction and foundation for researchers, students and practitioners to key achievements and current research trends on interactive multimodal systems and musical robotics.
Digital sound synthesis has long been approached using standard digital filtering techniques. Newer synthesis strategies, however, make use of physical descriptions of musical instruments and allow for much more realistic and complex sound production, whereby synthesis becomes a problem of simulation. This book has a special focus on time-domain finite difference methods presented within an audio framework. It covers time series and difference operators, and basic tools for the construction and analysis of finite difference schemes, including frequency-domain and energy-based methods, with special attention paid to problems inherent to sound synthesis. Various basic lumped systems and excitation mechanisms are covered, followed by a look at the 1D wave equation, linear bar and string vibration, acoustic tube modelling, and linear membrane and plate vibration. Various advanced topics, such as the nonlinear vibration of strings and plates, are given an elaborate treatment. Key features:
- Includes a historical overview of digital sound synthesis techniques, highlighting the links between the various physical modelling methodologies.
- A pedagogical presentation containing over 150 problems and programming exercises, numerous figures and diagrams, and code fragments in the MATLAB programming language helps the reader with limited experience of numerical methods reach an understanding of this subject.
- Offers a complete treatment of all of the major families of musical instruments, including certain audio effects.
Numerical Sound Synthesis is suitable for audio and software engineers, and researchers in digital audio, sound synthesis and more general musical acoustics. Graduate students in electrical engineering, mechanical engineering or computer science, working on the more technical side of digital audio and sound synthesis, will also find this book of interest.
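To make the approach concrete, here is a minimal Python sketch of the classic explicit finite-difference scheme for the 1D wave equation (an ideal string with fixed ends); the scheme is the standard one the book analyzes, but every parameter value below is an illustrative choice:

    import numpy as np

    sr = 44100                       # audio sample rate (Hz)
    f0 = 440.0                       # target fundamental (Hz)
    k = 1.0 / sr                     # time step
    gamma = 2 * f0                   # scaled wave speed on a unit-length string
    N = int(1.0 / (gamma * k))       # grid intervals, chosen near the stability limit
    h = 1.0 / N                      # grid spacing
    lam = gamma * k / h              # Courant number; must satisfy lam <= 1

    u  = np.zeros(N + 1)             # displacement at time n+1
    u1 = np.zeros(N + 1)             # displacement at time n
    u2 = np.zeros(N + 1)             # displacement at time n-1
    u1[N // 3] = 0.5                 # crude "pluck": point displacement
    u2[:] = u1                       # zero initial velocity

    out = np.zeros(sr)               # one second of output
    for n in range(sr):
        # u^{n+1} = 2u^n - u^{n-1} + lam^2 * (discrete spatial Laplacian)
        u[1:N] = 2*u1[1:N] - u2[1:N] + lam**2 * (u1[2:] - 2*u1[1:N] + u1[:N-1])
        out[n] = u[4]                # "pickup" near one end
        u2[:] = u1
        u1[:] = u

The endpoints u[0] and u[N] stay at zero, implementing the fixed (Dirichlet) boundary conditions; pushing lam above 1 makes the scheme blow up, which is exactly the kind of stability issue the book's energy-based analysis addresses.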
The accurate determination of the speech spectrum, particularly for short frames, is commonly pursued in diverse areas including speech processing, recognition, and acoustic phonetics. With this book the author makes the subject of spectrum analysis understandable to a wide audience, including those with a solid background in general signal processing and those without such a background. In keeping with these goals, this is not a book that replaces or attempts to cover the material found in a general signal processing textbook. Some essential signal processing concepts are presented in the first chapter, but even there the concepts are presented in a generally understandable fashion as far as is possible. Throughout the book, the focus is on applications to speech analysis; mathematical theory is provided for completeness, but these developments are set off in boxes for the benefit of those readers with sufficient background. Other readers may proceed through the main text, where the key results and applications are presented in general heuristic terms and illustrated with software routines and practical "show-and-tell" discussions of the results. At some points, the book refers to and uses the implementations in the Praat speech analysis software package, which has the advantages that it is used by many scientists around the world and that it is free and open-source software. At other points, special software routines have been developed and made available to complement the book, and these are provided in the MATLAB programming language. Readers who have the basic MATLAB package will be able to implement the programs immediately on that platform: no extra "toolboxes" are required.
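A small Python illustration of the book's central concern, estimating the spectrum of a short frame, and of why windowing matters; the two-tone test frame is a synthetic stand-in for speech:

    import numpy as np

    sr = 16000
    t = np.arange(400) / sr                          # one 25 ms frame
    frame = np.sin(2*np.pi*200*t) + 0.3*np.sin(2*np.pi*1200*t)

    # Periodogram with a rectangular window vs. a Hamming window (in dB)
    spec_rect = 20*np.log10(np.abs(np.fft.rfft(frame)) + 1e-12)
    spec_hamm = 20*np.log10(np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) + 1e-12)

The Hamming-windowed estimate trades a wider main lobe for far lower sidelobe leakage, the basic compromise behind every short-frame spectrum analysis choice.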
In this work, the authors present a fully statistical approach to modelling non-native speakers' pronunciation. Second-language speakers pronounce words in many ways that differ from native speakers. These deviations, be they phoneme substitutions, deletions or insertions, can be modelled automatically with the new method presented here. The method is based on a discrete hidden Markov model as a word pronunciation model, initialized on a standard pronunciation dictionary. The implementation and functionality of the methodology has been proven and verified with a test set of accented non-native English. The book is written for researchers with a professional interest in phonetics and automatic speech and speaker recognition.
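The core idea, a discrete HMM over phoneme symbols whose probabilities absorb substitutions, insertions and deletions, can be sketched in a few lines of Python; the toy transition and emission values below are made up for illustration, not the authors':

    import numpy as np

    A = np.array([[0.7, 0.3, 0.0],       # left-to-right transition probabilities
                  [0.0, 0.6, 0.4],
                  [0.0, 0.0, 1.0]])
    B = np.array([[0.8, 0.1, 0.1],       # P(observed phoneme | state);
                  [0.2, 0.7, 0.1],       # off-diagonal mass models substitutions
                  [0.1, 0.2, 0.7]])
    pi = np.array([1.0, 0.0, 0.0])       # always start in the first state

    def forward_prob(obs):
        """P(observed phoneme sequence | model) via the forward algorithm."""
        alpha = pi * B[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
        return alpha.sum()

    print(forward_prob([0, 1, 2]))       # canonical pronunciation: higher score
    print(forward_prob([0, 0, 2]))       # substituted middle phoneme: lower score

Training such a model on accented speech, starting from a standard pronunciation dictionary, shifts these probabilities toward the deviations the speaker population actually produces.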
With Computational Thinking in Sound, veteran educators Gena R. Greher and Jesse M. Heines provide the first book ever written for music fundamentals educators that is devoted specifically to music, sound, and technology. The authors demonstrate how the range of mental tools in computer science - for example, analytical thought, system design, and problem design and solution - can be fruitfully applied to music education, including examples of successful student work. While technology instruction in music education has traditionally focused on teaching how computers and software work to produce music, Greher and Heines offer context: a clear understanding of how music technology can be structured around a set of learning challenges and tasks of the type common in computer science classrooms. Using a learner-centered approach that emphasizes project-based experiences, the book provides music educators with multiple strategies to explore, create, and solve problems with music and technology in equal parts. It also provides examples of hands-on activities that encourage students, alone and in interdisciplinary groups, to explore the basic principles that underlie today's music technology and that expose them to current multimedia development tools.
"Emotion Recognition Using Speech Features" provides coverage of emotion-specific features present in speech. The author also discusses suitable models for capturing emotion-specific information for distinguishing different emotions. The content of this book is important for designing and developing natural and sophisticated speech systems. In this Brief, Drs. Rao and Koolagudi lead a discussion of how emotion-specific information is embedded in speech and how to acquire emotion-specific knowledge using appropriate statistical models. Additionally, the authors provide information about exploiting multiple evidences derived from various features and models. The acquired emotion-specific knowledge is useful for synthesizing emotions. Features includes discussion of: * Global and local prosodic features at syllable, word and phrase levels, helpful for capturing emotion-discriminative information; * Exploiting complementary evidences obtained from excitation sources, vocal tract systems and prosodic features in order to enhance the emotion recognition performance; * Proposed multi-stage and hybrid models for improving the emotion recognition performance. This brief is for researchers working in areas related to speech-based products such as mobile phone manufacturing companies, automobile companies, and entertainment products as well as researchers involved in basic and applied speech processing research.
This volume constitutes the refereed proceedings of the Spanish conference IberSPEECH 2012: the joint VII "Jornadas en Tecnologia del Habla" and III Iberian SLTech Workshop, held in Madrid, Spain, on November 21-23, 2012. The 29 revised papers were carefully reviewed and selected from 80 submissions. The papers are organized in topical sections on speaker characterization and recognition; audio and speech segmentation; pathology detection and speech characterization; dialogue and multimodal systems; robustness in automatic speech recognition; and applications of speech and language technologies.
Refining Sound is a practical roadmap to the complexities of creating sounds on modern synthesizers. Author and veteran synthesizer instructor Brian K. Shepard draws on his years of experience in synthesizer pedagogy to peel back the often-mysterious layers of sound synthesis one by one. The result is a book that allows readers to familiarize themselves with each individual step in the synthesis process, in turn empowering them in their own creative or experimental work. The book follows the stages of synthesis in chronological progression, starting readers at the raw materials of sound creation and ultimately bringing them to the final "polishing" stage. Each chapter focuses on a particular aspect of the synthesis process, culminating in a last chapter that brings everything together as the reader creates his/her own complex sounds. Throughout the text, the material is supported by copious examples and illustrations as well as by audio files and synthesis demonstrations on a related companion website. Each chapter contains easily digestible guided projects (entitled "Your Turn" sections) that focus on the topics of the corresponding chapter. In addition to this, one complete project will be carried through each chapter of the book cumulatively, allowing the reader to follow - and build - a sound from start to finish. The final chapter includes several sound creation projects in which readers are given types of sound to create as well as some suggestions and tips, with final outcomes left to readers' own creativity. Perhaps the most difficult aspect of learning to create sounds on a synthesizer is to understand exactly what each synthesizer component does independent of the synthesizer's numerous other components. Not only does this book thoroughly illustrate and explain these individual components, but it also offers numerous practical demonstrations and exercises that allow the reader to experiment with and understand these elements without the distraction of the other controls and modifiers. Refining Sound is essential for all electronic musicians from amateur to professional levels of accomplishment, students, teachers, libraries, and anyone interested in creating sounds on a synthesizer.
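As a taste of the step-by-step approach, here is a minimal Python sketch of two early synthesis stages, a raw oscillator shaped by an ADSR amplitude envelope (all values illustrative, not from the book):

    import numpy as np

    sr = 44100

    def adsr(n, a=0.01, d=0.1, s=0.7, r=0.2):
        """Attack/decay/sustain/release envelope, n samples long."""
        a_n, d_n, r_n = int(a*sr), int(d*sr), int(r*sr)
        s_n = max(n - a_n - d_n - r_n, 0)
        return np.concatenate([
            np.linspace(0, 1, a_n),      # attack: rise to full level
            np.linspace(1, s, d_n),      # decay: fall to sustain level
            np.full(s_n, s),             # sustain: hold
            np.linspace(s, 0, r_n),      # release: fade out
        ])[:n]

    n = sr                               # one second
    tone = np.sin(2*np.pi*220*np.arange(n)/sr) * adsr(n)

Each later stage of a synthesizer (filters, modulators, effects) then reshapes this signal in the same incremental spirit the book follows.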
"Automatic Speech Signal Analysis for Clinical Diagnosis and
Assessment of Speech Disorders "provides a survey of methods
designed to aid clinicians in the diagnosis and monitoring of
speech disorders such as dysarthria and dyspraxia, with an emphasis
on the signal processing techniques, statistical validity of the
results presented in the literature, and the appropriateness of
methodsthat do not requirespecialized equipment, rigorously
controlled recording procedures or highly skilled personnel to
interpret results.
Modern communication devices, such as mobile phones, teleconferencing systems, VoIP, etc., are often used in noisy and reverberant environments. Therefore, signals picked up by the microphones of telecommunication devices contain not only the desired near-end speech signal, but also interferences such as background noise, far-end echoes produced by the loudspeaker, and reverberations of the desired source. These interferences degrade the fidelity and intelligibility of the near-end speech in human-to-human telecommunications and decrease the performance of human-to-machine interfaces (i.e., automatic speech recognition systems). This book deals with the fundamental challenges of speech processing in modern communication, including speech enhancement, interference suppression, acoustic echo cancellation, relative transfer function identification, source localization, dereverberation, and beamforming in reverberant environments. Enhancement of speech signals is necessary whenever the source signal is corrupted by noise. In highly non-stationary noise environments, noise transients and interferences may be extremely annoying. Acoustic echo cancellation is used to eliminate the acoustic coupling between the loudspeaker and the microphone of a communication device. Identification of the relative transfer function between sensors in response to a desired speech signal makes it possible to derive a reference noise signal for suppressing directional or coherent noise sources. Source localization, dereverberation, and beamforming in reverberant environments further increase the intelligibility of the near-end speech signal.
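Acoustic echo cancellation, one of the techniques listed above, is classically done with an adaptive filter that learns the loudspeaker-to-microphone echo path. A minimal Python sketch using the NLMS algorithm, with a synthetic echo path standing in for a real room:

    import numpy as np

    rng = np.random.default_rng(0)
    far_end = rng.standard_normal(20000)            # loudspeaker (far-end) signal
    echo_path = np.array([0.0, 0.5, 0.3, -0.2])     # unknown room response (toy)
    mic = np.convolve(far_end, echo_path)[:len(far_end)]   # echo picked up at the mic

    L, mu, eps = 8, 0.5, 1e-6
    w = np.zeros(L)                                 # adaptive filter taps
    x_buf = np.zeros(L)                             # recent far-end samples
    out = np.zeros(len(far_end))
    for n in range(len(far_end)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        e = mic[n] - w @ x_buf                      # residual after echo estimate
        w += mu * e * x_buf / (x_buf @ x_buf + eps) # NLMS tap update
        out[n] = e                                  # echo-cancelled output

After convergence, w approximates the echo path and the residual `out` is close to zero wherever only echo was present, leaving near-end speech untouched.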
This book constitutes the refereed proceedings of the 16th International Conference on Text, Speech and Dialogue, TSD 2013, held in Pilsen, Czech Republic, in September 2013. The 65 papers presented together with 5 invited talks were carefully reviewed and selected from 148 submissions. The main topics of this year's conference were corpora, texts and transcription; speech analysis, recognition and synthesis; and their intertwining within NL dialogue systems. The topics also included speech recognition, corpora and language resources, speech and spoken language generation, tagging, classification and parsing of text and speech, semantic processing of text and speech, integrating applications of text and speech processing, as well as automatic dialogue systems, and multimodal techniques and modelling.
Current speech recognition systems are based on speaker-independent speech models and suffer from inter-speaker variations in speech signal characteristics. This work develops an integrated approach to speech and speaker recognition in order to open up opportunities for self-learning in the system. It introduces reliable speaker identification, which enables the speech recognizer to create robust speaker-dependent models. In addition, the book gives a new approach to the reverse problem: how to improve speech recognition if speakers can be recognized. Speaker identification enables the system to adapt to different speakers, resulting in optimal long-term adaptation.
Database of Piano Chords: An Engineering View of Harmony includes a unique database of piano chords developed exclusively for music research purposes, and outlines the key advantages of using this dataset to further one's research. The book also describes the physical bases of occidental music chords and their influence on the detection and transcription of music, enabling researchers to intimately understand the construction of each occidental chord. The online database contains more than 275,000 chords with different degrees of polyphony and different playing styles. Together, the database and the book are an invaluable tool for researchers in this field.
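One standard engineering use of such chord data is template matching: correlating a 12-bin chroma vector against idealized chord profiles. A minimal Python sketch (the templates and the binary chroma input are illustrative, not drawn from the database itself):

    import numpy as np

    names = ['C','C#','D','D#','E','F','F#','G','G#','A','A#','B']
    templates = {}
    for root in range(12):
        for quality, intervals in (('maj', (0, 4, 7)), ('min', (0, 3, 7))):
            t = np.zeros(12)
            t[[(root + i) % 12 for i in intervals]] = 1.0   # triad pitch classes
            templates[names[root] + quality] = t / np.linalg.norm(t)

    def detect(chroma):
        """Return the chord template best correlated with a chroma vector."""
        chroma = chroma / (np.linalg.norm(chroma) + 1e-12)
        return max(templates, key=lambda name: templates[name] @ chroma)

    chroma = np.zeros(12)
    chroma[[0, 4, 7]] = 1.0        # energy at C, E, G
    print(detect(chroma))          # -> 'Cmaj'

A labeled chord database such as this one is what lets researchers evaluate, and learn refinements of, exactly this kind of detector.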
Audio Signal Processing for Next-Generation Multimedia Communication Systems presents cutting-edge digital signal processing theory and implementation techniques for problems including speech acquisition and enhancement using microphone arrays, new adaptive filtering algorithms, multichannel acoustic echo cancellation, sound source tracking and separation, audio coding, and realistic sound stage reproduction. The book's focus is almost exclusively on the processing, transmission, and presentation of audio and acoustic signals in multimedia communications for telecollaboration, where immersive acoustics will play a great role in the near future.
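Among the array techniques the book treats, the simplest is delay-and-sum beamforming: advance each microphone signal by its known steering delay and average, so the look direction adds coherently. A minimal Python sketch with a synthetic source and a toy linear array (all geometry illustrative):

    import numpy as np

    sr, c, d = 16000, 343.0, 0.05       # sample rate (Hz), speed of sound (m/s), mic spacing (m)
    mics, theta = 4, np.deg2rad(30)     # 4-element linear array; source at 30 degrees
    delays = np.arange(mics) * d * np.sin(theta) / c    # per-mic arrival delays (s)

    t = np.arange(sr) / sr
    src = np.sin(2*np.pi*500*t)         # toy source signal
    # Simulate what each mic records (fractional delays via linear interpolation)
    x = np.stack([np.interp(t - tau, t, src, left=0.0) for tau in delays])

    # Delay-and-sum: advance each channel by its steering delay, then average
    y = np.mean(np.stack([np.interp(t + tau, t, xi, right=0.0)
                          for tau, xi in zip(delays, x)]), axis=0)

Signals arriving from other directions are misaligned by the same steering delays and partially cancel, which is the spatial filtering effect the book builds on.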
Auralization is the creation of audible acoustic sceneries from computer-generated data. The term "auralization" is to be understood as analogous to the well-known technique of "visualization". In the visual illustration of scenes, data or any other meaningful information, in movie animation and in computer graphics, we describe the process of "making visible" as visualization. In acoustics, auralization takes place when acoustic effects, primary sound signals, or means of sound reinforcement or sound transmission are processed to be presented by using electro-acoustic equipment. This book is organized as a comprehensive collection of basics, methodology and strategies of acoustic simulation and auralization. Readers with the mathematical background of advanced students will be able to follow the main strategies of auralization easily and work on their own implementations in various fields of application in acoustic engineering, sound design and virtual reality. For readers interested in basic research, the technique of auralization may be useful for creating sound stimuli for specific investigations in linguistic, medical, neurological and psychological research and in the field of human-machine interaction.
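In practice, the final step of many auralization chains is convolution of an anechoic ("dry") signal with a room impulse response, measured or simulated. A minimal Python sketch with a synthetic exponentially decaying impulse response standing in for a real room:

    import numpy as np

    sr = 44100
    rng = np.random.default_rng(1)
    t = np.arange(int(0.5 * sr)) / sr
    rir = rng.standard_normal(len(t)) * np.exp(-6.9 * t / 0.5)  # ~60 dB decay over 0.5 s
    rir[0] = 1.0                                                # direct sound

    dry = np.sin(2*np.pi*440*np.arange(sr)/sr)   # any anechoic input signal
    wet = np.convolve(dry, rir)                  # the audible "room" rendering
    wet /= np.max(np.abs(wet))                   # normalize to avoid clipping

Replacing the synthetic response with one computed by an acoustic simulation is precisely what turns room-acoustics data into something audible.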
Computers are at the center of almost everything related to audio. Whether for synthesis in music production, recording in the studio, or mixing in live sound, the computer plays an essential part. Audio effects plug-ins and virtual instruments are implemented as software computer code. Music apps are computer programs run on a mobile device. All these tools are created by programming a computer. Hack Audio: An Introduction to Computer Programming and Digital Signal Processing in MATLAB provides an introduction for musicians and audio engineers interested in computer programming. It is intended for a range of readers including those with years of programming experience and those ready to write their first line of code. In the book, computer programming is used to create audio effects using digital signal processing. By the end of the book, readers implement the following effects: signal gain change, digital summing, tremolo, auto-pan, mid/side processing, stereo widening, distortion, echo, filtering, equalization, multi-band processing, vibrato, chorus, flanger, phaser, pitch shifter, auto-wah, convolution and algorithmic reverb, vocoder, transient designer, compressor, expander, and de-esser. Throughout the book, several types of test signals are synthesized, including: sine wave, square wave, sawtooth wave, triangle wave, impulse train, white noise, and pink noise. Common visualizations for signals and audio effects are created including: waveform, characteristic curve, goniometer, impulse response, step response, frequency spectrum, and spectrogram. In total, over 200 examples are provided with completed code demonstrations.
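The book's examples are written in MATLAB; as an illustrative equivalent of one of the listed effects, here is a tremolo (amplitude modulation by a low-frequency oscillator) sketched in Python:

    import numpy as np

    sr = 48000
    t = np.arange(2 * sr) / sr
    x = np.sin(2*np.pi*330*t)              # placeholder input signal

    rate, depth = 5.0, 0.5                 # LFO rate (Hz) and modulation depth (0..1)
    lfo = 1 - depth/2 + (depth/2) * np.sin(2*np.pi*rate*t)
    y = x * lfo                            # amplitude modulation = tremolo

Many of the other listed effects (auto-pan, vibrato, chorus, flanger) follow the same pattern, with an LFO modulating a gain or a delay time instead.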