This book offers an overview of models, measurements, calculations
and examples connecting musical acoustics and music psychology.
Indeed, many mathematical formulations that explain musical
acoustics can also be used to help predict human auditory
perception.
This volume describes the generation, propagation, radiation, and
measurement of structure-borne sound, important topics for noise
reduction in machines or buildings, and also for the measurement of
mechanical material data. The third edition was revised with the
aim of preserving the spirit of the original work (Lothar
Cremer/Manfred Heckl) while bringing it up to date with current
knowledge. The first chapter now introduces structure-borne sound
and the underlying physical principles, and a separate section is
devoted to measurement technology.
This book highlights the advantages of the vector-phase method in
underwater acoustic measurements and presents results of
theoretical and experimental studies of the deep open ocean and
shallow sea based on vector-phase representations. Based on the
physical phenomena discovered and compensation of counter streams
of energy and vortices of the acoustic intensity vector, processes
of transmitting acoustic energy of a tonal signal in the real ocean
are described. The book also discusses the development of advanced
detection tools based on vector-phase sonar. This book provides
useful content for professionals and researchers working in various
fields of applied underwater acoustics.
"Resonant Alterities" bridges the gap between sound studies and
literary criticism. A queer ghost story by Vernon Lee, an occultist
novel of psychic adventure by Algernon Blackwood, a dystopian
science fiction tale by J.G. Ballard and a post-traumatic short
novel by Don DeLillo are its primary objects of analysis. Each is
explored within the context of its contemporary cultural debates on
sound. Meanwhile, all four theory-enriched readings focus on
intersecting and desire-laden processes of meaning making,
knowledge production and subject formation. Focal points are
aurally/audio-visually structured phenomena expressive of both
collective and individual anxieties.
This book evaluates the impact of relevant factors affecting the
results of speech quality assessment studies carried out in
crowdsourcing. The author describes how these factors relate to the
test structure, the effect of environmental background noise, and
the influence of language differences. He details multiple
user-centered studies that have been conducted to derive guidelines
for reliable collection of speech quality scores in crowdsourcing.
Specifically, different questions are addressed such as the optimal
number of speech samples to include in a listening task, the
influence of the environmental background noise in the speech
quality ratings, as well as methods for classifying background
noise from web audio recordings, or the impact of language
proficiency in the user perception of speech quality. Ultimately,
the results of these studies contributed to the definition of the
ITU-T Recommendation P.808 that defines the guidelines to conduct
speech quality studies in crowdsourcing.
Service procedures for yard and garden tractors manufactured
through 1990.
Digital measurement of the analog acoustical parameters of a music
performance hall is difficult. The aim of such work is to create a
digital acoustical derivation that is an accurate numerical
representation of the complex analog characteristics of the hall.
The present study describes the exponential sine sweep (ESS)
measurement process in the derivation of an acoustical impulse
response function (AIRF) of three music performance halls in
Canada. It examines specific difficulties of the process, such as
preventing the external effects of the measurement transducers from
corrupting the derivation, and provides solutions, such as the use
of filtering techniques in order to remove such unwanted effects.
In addition, the book presents a novel method of numerical
verification through mean-squared error (MSE) analysis in order to
determine how accurately the derived AIRF represents the acoustical
behavior of the actual hall.
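As a rough illustration of the two ingredients named in this description, the sketch below generates an exponential sine sweep and computes a mean-squared error between two signals. It uses the commonly cited closed-form sweep expression and is not taken from the book; the function names, the 20 Hz to 20 kHz range, and the 48 kHz sample rate are assumptions chosen for the example.

```python
import math

def exponential_sine_sweep(f1, f2, duration, sample_rate):
    """Return samples of an exponential sine sweep from f1 to f2 Hz.

    Assumed form: x(t) = sin(K * (exp(t * R / T) - 1)),
    with R = ln(f2 / f1) and K = 2 * pi * f1 * T / R.
    """
    n_samples = int(round(duration * sample_rate))
    rate = math.log(f2 / f1)  # natural log of the frequency ratio
    k = 2.0 * math.pi * f1 * duration / rate
    return [
        math.sin(k * (math.exp((n / sample_rate) * rate / duration) - 1.0))
        for n in range(n_samples)
    ]

def mean_squared_error(reference, estimate):
    """Average squared sample difference between two equal-length signals."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

# Hypothetical parameters: a one-second sweep over the audio band.
sweep = exponential_sine_sweep(f1=20.0, f2=20000.0, duration=1.0, sample_rate=48000)
```

In a verification step of the kind described, `mean_squared_error` would compare a recorded hall response against the response predicted by the derived impulse response; how the book sets up that comparison is not specified here.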
As speech processing devices like mobile phones, voice controlled
devices, and hearing aids have increased in popularity, people
expect them to work anywhere and at any time without user
intervention. However, the presence of acoustical disturbances
limits the use of these applications, degrades their performance,
or causes the user difficulties in understanding the conversation
or appreciating the device. A common way to reduce the effects of
such disturbances is through the use of single-microphone noise
reduction algorithms for speech enhancement. The field of
single-microphone noise reduction for speech enhancement comprises
a history of more than 30 years of research. In this survey, we
wish to demonstrate the significant advances that have been made
during the last decade in the field of discrete Fourier transform
domain-based single-channel noise reduction for speech
enhancement. Furthermore, our goal is to provide a concise
description of a state-of-the-art speech enhancement system, and
demonstrate the relative importance of the various building blocks
of such a system. This allows the non-expert DSP practitioner to
judge the relevance of each building block and to implement a
close-to-optimal enhancement system for the particular application
at hand. Table of Contents: Introduction / Single Channel Speech
Enhancement: General Principles / DFT-Based Speech Enhancement
Methods: Signal Model and Notation / Speech DFT Estimators / Speech
Presence Probability Estimation / Noise PSD Estimation / Speech PSD
Estimation / Performance Evaluation Methods / Simulation
Experiments with Single-Channel Enhancement Systems / Future
Directions
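To make the building blocks listed above more concrete, here is a minimal sketch of one textbook combination: a Wiener gain per DFT bin, driven by a maximum-likelihood estimate of the a priori SNR. This is an assumption made for illustration, not necessarily the estimator the survey recommends; the function name and the `gain_floor` parameter are hypothetical.

```python
def wiener_gains(noisy_psd, noise_psd, gain_floor=0.1):
    """Per-bin Wiener gains from noisy-speech and noise PSD estimates.

    noisy_psd and noise_psd hold one non-negative value per DFT bin;
    noise_psd values must be strictly positive.
    """
    gains = []
    for y, n in zip(noisy_psd, noise_psd):
        # Maximum-likelihood a priori SNR estimate, clipped at zero.
        snr_prior = max(y / n - 1.0, 0.0)
        # Wiener gain: fraction of the bin attributed to speech.
        gain = snr_prior / (1.0 + snr_prior)
        # A floor limits over-suppression in noise-dominated bins.
        gains.append(max(gain, gain_floor))
    return gains

# Three bins with unit noise power and decreasing noisy-signal power.
example_gains = wiener_gains([4.0, 1.0, 0.5], [1.0, 1.0, 1.0])
```

Each gain would multiply the corresponding noisy DFT coefficient before the inverse transform. The noise PSD itself has to come from a separate estimator, which this sketch takes as given.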
Senior level/graduate level text/reference presenting
state-of-the-art numerical techniques to solve the wave equation in
heterogeneous fluid-solid media. Numerical models have become
standard research tools in acoustic laboratories, and thus
computational acoustics is becoming an increasingly important
branch of ocean acoustic science. The first edition of this
successful book, written by the recognized leaders of the field,
was the first to present a comprehensive and modern introduction to
computational ocean acoustics accessible to students. This
revision, with 100 additional pages, completely updates the
material in the first edition and includes new models based on
current research. It includes problems and solutions in every
chapter, making the book more useful in teaching (the first edition
had a separate solutions manual). The book is intended for graduate
and advanced undergraduate students of acoustics, geology and
geophysics, applied mathematics, ocean engineering or as a
reference in computational methods courses, as well as
professionals in these fields, particularly those working in
government (especially Navy) and industry labs engaged in the
development or use of propagating models.
Noise is a widely recognized problem and health concern in the
modern world. Given the importance of managing noise levels and
developing suitable 'soundscapes' in contexts such as industry,
schools, or public spaces, this is an area of active research for
acousticians. But noise, in the sense of dissonance, can also be
used positively; composers have employed it from Baroque music to
Rock feedback; medicine harnesses it to shatter kidney stones and
treat cancer; and even the military uses it in (real and rumoured)
weapons. Mike Goldsmith looks back at the long history of the
battle between people and noise - a battle that has changed our
lives and moulded our societies. He investigates how increasing
noise levels relate to human progress, from the clatter of wheels
on cobbles to the sound of heavy machinery; he explains how our
scientific understanding of sound and hearing has developed; and he
looks at noise in nature, including the remarkable ways in which
some animals, such as shrimps, use noise as a weapon or to catch
prey. He concludes by turning to the future, discussing the noise
sources which are likely to dominate it and the ways in which new
science and new ideas may change the way our future will sound.
This book is about music, the instruments and players who produce
it, and the technologies that support it. Although much modern
music is produced by electronic means, its underlying basis is
still traditional acoustical sound production, and that broad topic
provides the basis for this book. There are many fine books
available that treat musical acoustics largely from the physical
point of view. The approach taken here is to present only the
fundamentals of musical physics, while giving special emphasis to
the relation between instrument and player and stressing the
characteristics of instruments that are of special concern to
engineers and technicians involved in the fields of recording,
sound reinforcement, and broadcasting. In order to understand
musical instruments in their normal performance environments, the
student must have a basic working knowledge of physical and
architectural acoustics. The book begins with a review of the
elements of acoustics, stressing the nature of sound fields and
phenomena that are wavelength-dependent. The book then moves on to
a discussion of those aspects of psychological acoustics that are
of special concern to music technicians, most notably concepts of
stereophonic imaging, loudness-related phenomena, and critical band
theory.
This classic reference on musical acoustics and performance
practice begins with a brief introduction to the fundamentals of
acoustics and the generation of musical sounds. It then discusses
the particulars of the sounds made by all the standard instruments
in a modern orchestra as well as the human voice, the way in which
the sounds made by these instruments are dispersed and how the room
into which they are projected affects the sounds.
Immediately following the Second World War, between 1947 and 1955,
several classic papers quantified the fundamentals of human speech
information processing and recognition. In 1947 French and
Steinberg published their classic study on the articulation index.
In 1948 Claude Shannon published his famous work on the theory of
information. In 1950 Fletcher and Galt published their theory of
the articulation index, a theory that Fletcher had worked on for 30
years, which integrated his classic works on loudness and speech
perception with models of speech intelligibility. In 1951 George
Miller then wrote the first book, Language and Communication,
analyzing human speech communication with Claude Shannon's
just-published theory of information. Finally, in 1955 George Miller
published the first extensive analysis of phone decoding, in the
form of confusion matrices, as a function of the speech-to-noise
ratio. This work extended the Bell Labs' speech articulation
studies with ideas from Shannon's Information theory. Both Miller
and Fletcher showed that speech, as a code, is incredibly robust to
mangling distortions of filtering and noise. Regrettably much of
this early work was forgotten. While the key science of information
theory blossomed, other than the work of George Miller, it was
rarely applied to aural speech research. The robustness of speech,
which is the most amazing thing about the speech code, has rarely
been studied. It is my belief (i.e., assumption) that we can
analyze speech intelligibility with the scientific method. The
quantitative analysis of speech intelligibility requires both
science and art. The scientific component requires an error
analysis of spoken communication, which depends critically on the
use of statistics, information theory, and psychophysical methods.
The artistic component depends on knowing how to restrict the
problem in such a way that progress may be made. It is critical to
tease out the relevant from the irrelevant and dig for the key
issues. This will focus us on the decoding of nonsense phonemes
with no visual component, which have been mangled by filtering and
noise. This monograph is a summary and theory of human speech
recognition. It builds on and integrates the work of Fletcher,
Miller, and Shannon. The long-term goal is to develop a
quantitative theory for predicting the recognition of speech
sounds. In Chapter 2 the theory is developed for maximum entropy
(MaxEnt) speech sounds, also called nonsense speech. In Chapter 3,
context is factored in. The book is largely reflective, and
quantitative, with a secondary goal of providing an historical
context, along with the many deep insights found in these early
works.
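The link between confusion matrices and information theory mentioned above can be illustrated with a short sketch that computes the information transmitted (mutual information, in bits) from a matrix of stimulus/response counts. This is a generic calculation offered under the assumption that rows are spoken sounds and columns are reported sounds; the function name is hypothetical and the code is not taken from the monograph.

```python
import math

def transmitted_information_bits(confusions):
    """Mutual information (bits) between stimulus and response.

    confusions[i][j] is the number of times stimulus i was heard as
    response j.
    """
    total = sum(sum(row) for row in confusions)
    row_sums = [sum(row) for row in confusions]
    col_sums = [sum(col) for col in zip(*confusions)]
    bits = 0.0
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                p_joint = count / total
                # Ratio of joint probability to product of marginals.
                ratio = count * total / (row_sums[i] * col_sums[j])
                bits += p_joint * math.log2(ratio)
    return bits

perfect = transmitted_information_bits([[50, 0], [0, 50]])     # no confusions
guessing = transmitted_information_bits([[25, 25], [25, 25]])  # pure chance
```

With two equally likely sounds, error-free identification transmits one bit and chance-level responding transmits none, so the measure summarizes how much of the speech code survives a given distortion.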
Speech dynamics refer to the temporal characteristics in all stages
of the human speech communication process. This speech "chain"
starts with the formation of a linguistic message in a speaker's
brain and ends with the arrival of the message in a listener's
brain. Given the intricacy of the dynamic speech process and its
fundamental importance in human communication, this monograph is
intended to provide comprehensive material on mathematical models
of speech dynamics and to address the following issues: How do we
make sense of the complex speech process in terms of its functional
role of speech communication? How do we quantify the special role
of speech timing? How do the dynamics relate to the variability of
speech that has often been said to seriously hamper automatic
speech recognition? How do we put the dynamic process of speech
into a quantitative form to enable detailed analyses? And finally,
how can we incorporate the knowledge of speech dynamics into
computerized speech analysis and recognition algorithms? The
answers to all these questions require building and applying
computational models for the dynamic speech process. What are the
compelling reasons for carrying out dynamic speech modeling? We
provide the answer in two related aspects. First, scientific
inquiry into the human speech code has been relentlessly pursued
for several decades. As an essential carrier of human intelligence
and knowledge, speech is the most natural form of human
communication. Embedded in the speech code are linguistic (as well
as para-linguistic) messages, which are conveyed through four
levels of the speech chain. Underlying the robust encoding and
transmission of the linguistic messages are the speech dynamics at
all the four levels. Mathematical modeling of speech dynamics
provides an effective tool in the scientific methods of studying
the speech chain. Such scientific studies help understand why
humans speak as they do and how humans exploit redundancy and
variability by way of multitiered dynamic processes to enhance the
efficiency and effectiveness of human speech communication. Second,
advancement of human language technology, especially that in
automatic recognition of natural-style human speech, is also
expected to benefit from comprehensive computational modeling of
speech dynamics. The limitations of current speech recognition
technology are serious and are well known. A commonly acknowledged
and frequently discussed weakness of the statistical model
underlying current speech recognition technology is the lack of
adequate dynamic modeling schemes to provide correlation structure
across the temporal speech observation sequence. Unfortunately, due
to a variety of reasons, the majority of current research
activities in this area favor only incremental modifications and
improvements to the existing HMM-based state-of-the-art. For
example, while the dynamic and correlation modeling is known to be
an important topic, most of the systems nevertheless employ only an
ultra-weak form of speech dynamics; e.g., differential or delta
parameters. Strong-form dynamic speech modeling, which is the focus
of this monograph, may serve as an ultimate solution to this
problem. After the introduction chapter, the main body of this
monograph consists of four chapters. They cover various aspects of
theory, algorithms, and applications of dynamic speech models, and
provide a comprehensive survey of the research work in this area
spanning the past 20 years. This monograph is intended as advanced
material on speech and signal processing for graduate-level
teaching, for professionals and engineering practitioners, as well
as for seasoned researchers and engineers specializing in speech
processing.
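For readers unfamiliar with the "differential or delta parameters" that the description calls an ultra-weak form of speech dynamics, the following sketch shows the regression formula commonly used to compute them from a feature track. The edge handling (repeating the first and last frame) and the default window size are assumptions made for this example, and the function name is hypothetical.

```python
def delta_features(track, half_window=2):
    """Regression-based delta of a one-dimensional feature track.

    delta[t] = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2),
    for n = 1..half_window, with edge frames repeated at the ends.
    """
    last = len(track) - 1
    denom = 2.0 * sum(n * n for n in range(1, half_window + 1))
    deltas = []
    for t in range(len(track)):
        num = 0.0
        for n in range(1, half_window + 1):
            ahead = track[min(t + n, last)]  # clamp at the final frame
            behind = track[max(t - n, 0)]    # clamp at the first frame
            num += n * (ahead - behind)
        deltas.append(num / denom)
    return deltas

ramp_deltas = delta_features([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```

The result is only a local slope estimate appended to each frame, which is why it captures so little of the temporal structure that stronger dynamic models aim to represent.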
Latent semantic mapping (LSM) is a generalization of latent
semantic analysis (LSA), a paradigm originally developed to capture
hidden word patterns in a text document corpus. In information
retrieval, LSA enables retrieval on the basis of conceptual
content, instead of merely matching words between queries and
documents. It operates under the assumption that there is some
latent semantic structure in the data, which is partially obscured
by the randomness of word choice with respect to retrieval.
Algebraic and/or statistical techniques are brought to bear to
estimate this structure and get rid of the obscuring "noise."
This results in a parsimonious continuous parameter description of
words and documents, which then replaces the original
parameterization in indexing and retrieval. This approach exhibits
three main characteristics: (1) discrete entities (words and
documents) are mapped onto a continuous vector space; (2) this
mapping is determined by global correlation patterns; and (3)
dimensionality reduction is an integral part of the process. Such
fairly generic
properties are advantageous in a variety of different contexts,
which motivates a broader interpretation of the underlying
paradigm. The outcome (LSM) is a data-driven framework for modeling
meaningful global relationships implicit in large volumes of (not
necessarily textual) data. This monograph gives a general overview
of the framework, and underscores the multifaceted benefits it can
bring to a number of problems in natural language understanding and
spoken language processing. It concludes with a discussion of the
inherent tradeoffs associated with the approach, and some
perspectives on its general applicability to data-driven
information extraction. Contents: I. Principles / Introduction /
Latent Semantic Mapping / LSM Feature Space / Computational Effort
/ Probabilistic Extensions / II. Applications / Junk E-mail
Filtering / Semantic Classification / Language Modeling /
Pronunciation Modeling / Speaker Verification / TTS Unit Selection
/ III. Perspectives / Discussion / Conclusion / Bibliography
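The mapping of documents onto a continuous space driven by global co-occurrence patterns can be sketched in a few lines. The example below finds only the single dominant latent direction of a small term-document count matrix by power iteration, as a stand-in for the truncated singular value decomposition typically used in LSA; the function name and the toy matrix are assumptions, and the code is not taken from the monograph.

```python
import math

def dominant_latent_direction(term_doc, iterations=100):
    """Unit vector over documents for the largest singular value.

    term_doc[i][j] is the count of term i in document j. Power
    iteration on A^T A converges to the top right singular vector.
    """
    n_terms = len(term_doc)
    n_docs = len(term_doc[0])
    v = [1.0 / math.sqrt(n_docs)] * n_docs
    for _ in range(iterations):
        # u = A v maps the document weights into term space.
        u = [sum(a * x for a, x in zip(row, v)) for row in term_doc]
        # w = A^T u maps back into document space.
        w = [sum(term_doc[i][j] * u[i] for i in range(n_terms))
             for j in range(n_docs)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy corpus: documents 0 and 1 share both of their terms; document 2
# uses a term of its own.
counts = [[1, 1, 0],
          [1, 1, 0],
          [0, 0, 1]]
direction = dominant_latent_direction(counts)
```

Documents 0 and 1 receive equal weight along the dominant direction while document 2 receives essentially none, a one-dimensional picture of how shared vocabulary places items close together in the latent space.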