Computational linguistics
This is the first volume of a unique collection that brings together the best English-language problems created for students competing in the Computational Linguistics Olympiad. These problems are representative of the diverse areas presented in the competition and designed with four principles in mind:
* To challenge the student analytically, without requiring any explicit knowledge of or experience in linguistics or computer science;
* To expose the student to the different kinds of reasoning required when encountering a new phenomenon in a language, both as a theoretical topic and as an applied problem;
* To foster the natural curiosity students have about the workings of their own language, as well as to introduce them to the beauty and structure of other languages;
* To teach the student about the models and techniques used by computers to understand human language.
Aside from being a fun intellectual challenge, the Olympiad mimics the skills used by researchers and scholars in the field of computational linguistics. In an increasingly global economy where businesses operate across borders and languages, a strong pool of computational linguists is a competitive advantage, and an important component of both security and growth in the 21st century. This collection of problems is a wonderful general introduction to the field of linguistics through the analytic problem-solving technique. "A fantastic collection of problems for anyone who is curious about how human language works! These books take serious scientific questions and present them in a fun, accessible way. Readers exercise their logical thinking capabilities while learning about a wide range of human languages, linguistic phenomena, and computational models." - Kevin Knight, USC Information Sciences Institute
This book explores the various categories of speech variation and works to draw a line between the linguistic and paralinguistic phenomena of speech. Paralinguistic contrast is crucial to human speech but has proven to be one of the most difficult things for speech systems to handle. In the quest for solutions in speech technology and the speech sciences, this book narrows the gap between speech technologists and phoneticians, and emphasizes both the effort required to achieve paralinguistic control in speech technology applications and the acute need for a multidisciplinary categorization system. This interdisciplinary work on paralanguage will serve not only as a source of information but also as a theoretical model for linguists, sociologists, psychologists, phoneticians and speech researchers.
This collection of papers takes linguists to the leading edge of techniques in generative lexicon theory, the linguistic composition methodology that arose from the imperative to provide a compositional semantics for the contextual modifications in meaning that emerge in real linguistic usage. Today's growing shift towards distributed compositional analyses evinces the applicability of GL theory, and the contributions to this volume, presented at three international workshops (GL-2003, GL-2005 and GL-2007) address the relationship between compositionality in language and the mechanisms of selection in grammar that are necessary to maintain this property. The core unresolved issues in compositionality, relating to the interpretation of context and the mechanisms of selection, are treated from varying perspectives within GL theory, including its basic theoretical mechanisms and its analytical viewpoint on linguistic phenomena.
"Mobile Speech and Advanced Natural Language Solutions" presents the discussion of the most recent advances in intelligent human-computer interaction, including fascinating new study findings on talk-in-interaction, which is the province of conversation analysis, a subfield in sociology/sociolinguistics, a new and emerging area in natural language understanding. Editors Amy Neustein and Judith A. Markowitz have recruited a talented group of contributors to introduce the next generation natural language technologies for practical speech processing applications that serve the consumer's need for well-functioning natural language-driven personal assistants and other mobile devices, while also addressing business' need for better functioning IVR-driven call centers that yield a more satisfying experience for the caller. This anthology is aimed at two distinct audiences: one consisting of speech engineers and system developers; the other comprised of linguists and cognitive scientists. The text builds on the experience and knowledge of each of these audiences by exposing them to the work of the other.
Karen Sparck Jones is one of the major figures of 20th- and early 21st-century computing and information processing. Her ideas have had an important influence on the development of Internet search engines. Her contribution has been recognized by awards from the natural language processing, information retrieval and artificial intelligence communities, including being asked to present the prestigious Grace Hopper lecture. She continues to be an active and influential researcher. Her contribution to the scientific evaluation of the effectiveness of such computer systems has been outstanding. This book celebrates the life and work of Karen Sparck Jones in her seventieth year. It consists of fifteen new and original chapters written by leading international authorities, reviewing the state of the art and her influence in the areas in which she has been active. Although her publication record goes back over forty years, it is clear that even the very early work reviewed in the book can be read with profit by those working on recent developments in information processing such as bioinformatics and the semantic web.
Automatic speech recognition suffers from a lack of robustness with respect to noise, reverberation and interfering speech. The growing field of speech recognition in the presence of missing or uncertain input data seeks to ameliorate those problems by using not only a preprocessed speech signal but also an estimate of its reliability, to selectively focus on those segments and features that are most reliable for recognition. This book presents the state of the art in recognition in the presence of uncertainty, offering examples that utilize uncertainty information for noise robustness, reverberation robustness, simultaneous recognition of multiple speech signals, and audiovisual speech recognition. The book is appropriate for scientists and researchers in the field of speech recognition, who will find an overview of the state of the art in robust speech recognition; professionals working in speech recognition, who will find strategies for improving recognition results under various conditions of mismatch; and lecturers of advanced courses on speech processing or speech recognition, who will find a reference and a comprehensive introduction to the field. The book assumes an understanding of the fundamentals of speech recognition using Hidden Markov Models.
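To make the missing-data idea concrete, here is a minimal sketch (not from the book) of the standard marginalisation trick for a diagonal-covariance Gaussian observation model: feature dimensions flagged as unreliable are simply left out of the likelihood. The mask, means and variances below are illustrative placeholders.

```python
import numpy as np

def marginal_log_likelihood(x, mask, mean, var):
    """Log-likelihood of observation x under a diagonal Gaussian,
    marginalising out the feature dimensions flagged as unreliable.

    mask[d] == True means dimension d is considered reliable.
    All names and values here are illustrative, not from the book.
    """
    x, mean, var = np.asarray(x), np.asarray(mean), np.asarray(var)
    d = np.asarray(mask, dtype=bool)  # keep reliable dimensions only
    return -0.5 * np.sum(
        np.log(2.0 * np.pi * var[d]) + (x[d] - mean[d]) ** 2 / var[d]
    )

# Toy usage: dimension 1 is flagged unreliable and simply ignored.
print(marginal_log_likelihood(
    x=[0.1, 9.9, -0.2],
    mask=[True, False, True],
    mean=np.zeros(3),
    var=np.ones(3),
))
```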
Data-driven methods have long been used in Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) synthesis, and have more recently been introduced for dialogue management, spoken language understanding, and Natural Language Generation. Machine learning is now present "end-to-end" in Spoken Dialogue Systems (SDS). However, these techniques require either data collection and annotation campaigns, which can be time-consuming and expensive, or dataset expansion by simulation. In this book, we provide an overview of the current state of the field and of recent advances, with a specific focus on adaptivity.
Semantic fields are lexically coherent - the words they contain co-occur in texts. In this book the authors introduce and define semantic domains, a computational model for lexical semantics inspired by the theory of semantic fields. Semantic domains allow us to exploit domain features for texts, terms and concepts, and they can significantly boost the performance of natural-language processing systems. Semantic domains can be derived from existing lexical resources or can be acquired from corpora in an unsupervised manner. They also have the property of interlinguality, and they can be used to relate terms in different languages in multilingual application scenarios. The authors give a comprehensive explanation of the computational model, with detailed chapters on semantic domains, domain models, and applications of the technique in text categorization, word sense disambiguation, and cross-language text categorization. This book is suitable for researchers and graduate students in computational linguistics.
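As a rough illustration of acquiring domain-like structure from a corpus in an unsupervised manner, the sketch below factorises a TF-IDF term-document matrix with non-negative matrix factorisation. The book's actual Domain Model construction may differ; the four-sentence corpus and the component count are assumptions made purely for illustration.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny illustrative corpus mixing two topical areas (sport vs. law).
corpus = [
    "the striker scored a goal in the final match",
    "the court ruled that the contract was void",
    "the team won the league after a penalty shootout",
    "the judge dismissed the appeal against the verdict",
]
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(corpus)

nmf = NMF(n_components=2, random_state=0)
doc_domain = nmf.fit_transform(X)   # document-by-domain weights
terms = tfidf.get_feature_names_out()

# Print the terms most strongly associated with each induced domain.
for k, weights in enumerate(nmf.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"domain {k}: {top}")
```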
Complex systems in nature and society make use of information for the development of their internal organization and the control of their functional mechanisms. Alongside technical aspects of storing, transmitting and processing information, the various semantic aspects of information, such as meaning, sense, reference and function, play a decisive part in the analysis of such systems. With the aim of fostering a better understanding of semantic systems from an evolutionary and multidisciplinary perspective, this volume collects contributions by philosophers and natural scientists, linguists, information and computer scientists. They do not follow a single research paradigm; rather they shed, in a complementary way, new light upon some of the most important aspects of the evolution of semantic systems. Evolution of Semantic Systems is intended for researchers in philosophy, computer science, and the natural sciences who work on the analysis or development of semantic systems, ontologies, or similar complex information structures. In the eleven chapters, they will find a broad discussion of topics ranging from underlying universal principles to representation and processing aspects to paradigmatic examples.
This volume brings together a number of corpus-based studies dealing with language varieties. These contributions focus on contemporary research interests, including language teaching and learning, translation, domain-specific grammatical and textual phenomena, and linguistic variation and gender. Corpora used in these studies range from highly specialized texts, including earlier scientific texts, to regional varieties. Under the umbrella of corpus linguistics, scholars also apply other distinct methodological approaches to their data in order to offer new insights into old and new topics in linguistics and applied linguistics. Another important contribution of this book lies in the obvious didactic implications of the results obtained in the individual chapters for domain-based language teaching.
This book constitutes the refereed proceedings of the 5th International Conference of the CLEF Initiative, CLEF 2014, held in Sheffield, UK, in September 2014. The 11 full papers and 5 short papers presented were carefully reviewed and selected from 30 submissions. They cover a broad range of issues in the fields of multilingual and multimodal information access evaluation. Also included is a set of labs and workshops designed to test different aspects of mono- and cross-language information retrieval systems.
A Journey Through Cultures addresses one of the hottest topics in contemporary HCI: cultural diversity amongst users. For a number of years the HCI community has been investigating alternatives to enhance the design of cross-cultural systems. Most contributions to date have followed either a 'design for each' or a 'design for all' strategy. A Journey Through Cultures takes a very different approach. As proponents of CVM, the Cultural Viewpoint Metaphors perspective, the authors invite HCI practitioners to think about how to expose and communicate the idea of cultural diversity. A detailed case study is included which assesses the metaphors' potential in cross-cultural design and evaluation. The results show that cultural viewpoint metaphors have strong epistemic power, leveraged by a combination of theoretical foundations from Anthropology and Semiotics and the authors' own work in HCI and Semiotic Engineering. Luciana Salgado, Carla Leitao and Clarisse de Souza are members of SERG, the Semiotic Engineering Research Group at the Departamento de Informatica of Rio de Janeiro's Pontifical Catholic University (PUC-Rio).
Information extraction (IE) and text summarization (TS) are powerful technologies for finding relevant pieces of information in text and presenting them to the user in condensed form. The ongoing information explosion makes IE and TS critical for successful functioning within the information society. These technologies face particular challenges due to the inherent multi-source nature of the information explosion. The technologies must now handle not isolated texts or individual narratives, but rather large-scale repositories and streams (in general, in multiple languages) containing a multiplicity of perspectives, opinions, or commentaries on particular topics, entities or events. There is thus a need to adapt existing techniques and develop new ones to deal with these challenges. This volume contains a selection of papers that present a variety of methodologies for content identification and extraction, as well as for content fusion and regeneration. The chapters cover various aspects of the challenges, depending on the nature of the information sought (names vs. events) and the nature of the sources (news streams vs. image captions vs. scientific research papers, etc.). This volume aims to offer a broad and representative sample of studies from this very active research field.
The practical task of building a talking robot requires a theory of how natural language communication works. Conversely, the best way to computationally verify a theory of natural language communication is to demonstrate its functioning concretely in the form of a talking robot, the epitome of human-machine communication. Building an actual robot requires hardware that provides appropriate recognition and action interfaces, and because such hardware is hard to develop, the approach in this book is theoretical: the author presents an artificial cognitive agent with language as a software system called Database Semantics (DBS). Because a theoretical approach does not have to deal with the technical difficulties of hardware engineering, there is no reason to simplify the system; instead, the software components of DBS aim at completeness of function and of data coverage in word form recognition, syntactic-semantic interpretation and inferencing, leaving the procedural implementation of elementary concepts for later. The author first examines the universals of natural language and explains the Database Semantics approach. In Part I he examines the following natural language communication issues: using external surfaces; the cycle of natural language communication; memory structure; autonomous control; and learning. In Part II he analyzes the coding of content with respect to: semantic relations of structure; simultaneous amalgamation of content; graph-theoretical considerations; computing perspective in dialogue; and computing perspective in text. The book ends with a concluding chapter, a bibliography and an index. It will be of value to researchers, graduate students and engineers in the areas of artificial intelligence and robotics, in particular those who deal with natural language processing.
The book focuses on the parts of an audio conversation not related to language, such as speaking rate (in terms of the number of syllables per unit time) and emotion-centric features. It examines the use of such non-linguistic features to infer information from phone calls to call centers. Through audio signal processing and analysis, the author analyzes "how" the conversation happens rather than "what" the conversation is about.
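As a toy illustration of the kind of non-linguistic feature discussed here, the sketch below estimates speaking rate in syllables per second from a transcript and a call-segment duration, using a naive vowel-group syllable counter. The heuristic and the example values are assumptions for illustration, not the author's method.

```python
import re

def naive_syllable_count(word: str) -> int:
    """Count vowel groups as a rough proxy for syllables (English-ish)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def speaking_rate(transcript: str, duration_s: float) -> float:
    """Speaking rate in syllables per second over a call segment."""
    syllables = sum(naive_syllable_count(w) for w in transcript.split())
    return syllables / duration_s

# Hypothetical 2.5-second utterance from a call-center recording.
print(speaking_rate("thank you for calling how can I help", 2.5))
```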
This book is written for linguists and computer scientists working in the field of artificial intelligence, as well as for anyone interested in intelligent text processing. A lexical function is a concept that formalizes semantic and syntactic relations between lexical units. A collocational relation is a type of institutionalized lexical relation which holds between the base and its partner in a collocation. Knowledge of collocations is important for natural language processing because collocations embody the restrictions on how words can be used together. The book shows how collocations can be annotated with lexical functions in a computer-readable dictionary, allowing their precise semantic analysis in texts and their effective use in natural language applications including parsers, high-quality machine translation, periphrasis systems and computer-aided learning of lexica. The book also shows how to extract collocations from corpora and annotate them with lexical functions automatically. To train the algorithms, the authors created a dictionary of lexical functions containing more than 900 disambiguated and annotated Spanish examples, which is included in this book. The results obtained show that machine learning is a feasible approach to the automatic detection of lexical functions.
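A minimal sketch of the corpus-based collocation extraction step, using pointwise mutual information via NLTK's collocation utilities. The book pairs such extraction with lexical-function annotation, which is not attempted here; the tiny corpus is an illustrative assumption.

```python
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

# Tiny illustrative corpus containing repeated collocations.
text = ("she paid close attention he paid close attention to detail "
        "they paid a heavy price we paid a heavy price for the delay")

measures = BigramAssocMeasures()
finder = BigramCollocationFinder.from_words(text.split())
finder.apply_freq_filter(2)  # keep bigrams seen at least twice

# Rank surviving bigrams by pointwise mutual information.
for pair in finder.nbest(measures.pmi, 5):
    print(pair)
```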
This book presents state-of-the-art research in speech emotion recognition. Readers are first presented with basic research and applications; gradually, more advanced information is provided, giving readers comprehensive guidance for classifying emotions through speech. Simulated databases are used and results extensively compared, with the features and the algorithms implemented using MATLAB. Various emotion recognition models, such as Linear Discriminant Analysis (LDA), Regularized Discriminant Analysis (RDA), Support Vector Machines (SVM) and K-Nearest Neighbor (KNN), are explored in detail using prosody and spectral features, and feature fusion techniques.
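A minimal sketch of two of the classifier families the book covers (SVM and KNN) applied to prosodic feature vectors. The feature values and labels below are fabricated placeholders; the book's experiments use real prosody and spectral features and are implemented in MATLAB rather than Python.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Each row: [mean pitch (Hz), pitch range (Hz), energy, speaking rate].
# Values are fabricated for illustration only.
X = np.array([
    [210.0, 80.0, 0.70, 4.1],   # angry
    [195.0, 75.0, 0.65, 4.3],   # angry
    [120.0, 20.0, 0.30, 2.6],   # sad
    [115.0, 18.0, 0.28, 2.4],   # sad
])
y = np.array(["angry", "angry", "sad", "sad"])

test = np.array([[200.0, 70.0, 0.60, 4.0]])
print(SVC(kernel="linear").fit(X, y).predict(test))
print(KNeighborsClassifier(n_neighbors=1).fit(X, y).predict(test))
```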
The book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It not only presents the state of the art of HLT for Dutch in the areas covered but, even more importantly, describes the resources (data and tools) for Dutch that have been created and are now available to both academia and industry worldwide. The contributions cover many areas of human language technology (for Dutch): corpus collection (including IPR issues) and corpus building (in particular one corpus aiming at a collection of 500M word tokens), lexicology, anaphora resolution, a semantic network, parsing technology, speech recognition, machine translation, text (summary) generation, web mining, information extraction, and text-to-speech, to name the most important ones. The book also shows how a medium-sized language community (spanning two territories) can create a digital language infrastructure (resources, tools, etc.) as a basis for subsequent R&D. At the same time, it bundles contributions of almost all the HLT research groups in Flanders and the Netherlands, and hence offers a view of their recent research activities. The targeted readers are mainly researchers in human language technology, in particular those focusing on Dutch: researchers active in larger networks such as CLARIN, META-NET and FLaReNet, and those participating in conferences such as ACL, EACL, NAACL, COLING, RANLP, CICLing, LREC, CLIN and DIR (both in the Low Countries), InterSpeech, ASRU, ICASSP, ISCA, EUSIPCO, CLEF, TREC, etc. In addition, some chapters are interesting for human language technology policy makers and even for science policy makers in general.
It is becoming crucial to accurately estimate and monitor speech quality in various ambient environments in order to guarantee high-quality speech communication. This practical, hands-on book presents speech intelligibility measurement methods so that readers can start measuring or estimating the speech intelligibility of their own systems. The book also introduces subjective and objective speech quality measures, and describes speech intelligibility measurement methods in detail. It introduces a diagnostic rhyme test which uses rhyming word pairs, and includes: an investigation into the effect of word familiarity on speech intelligibility; speech intelligibility measurement of localized speech in virtual 3-D acoustic space using the rhyme test; and estimation of speech intelligibility using objective measures, including the ITU-standard PESQ measures, and automatic speech recognizers.
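A minimal sketch of scoring one run of a rhyme test: intelligibility as the proportion of rhyming word-pair trials in which the listener identified the word actually presented. The word pairs and listener responses below are illustrative assumptions, not data from the book.

```python
def rhyme_test_score(trials):
    """Fraction of trials where the listener picked the presented word.

    trials: iterable of (presented_word, listener_choice) pairs.
    """
    trials = list(trials)
    correct = sum(1 for presented, chosen in trials if presented == chosen)
    return correct / len(trials)

# Hypothetical responses to four rhyming-pair trials.
trials = [("veal", "veal"), ("feel", "veal"),
          ("dense", "dense"), ("tense", "tense")]
print(f"intelligibility: {rhyme_test_score(trials):.0%}")  # 75%
```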
Novel Techniques for Dialectal Arabic Speech describes approaches to improving automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, on the assumption that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect: ECA ranks first among Arabic dialects in number of speakers, and a high-quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to use MSA cross-lingually in dialectal Arabic speech recognition, the authors normalized the phoneme sets for MSA and ECA. After this normalization, they applied state-of-the-art acoustic model adaptation techniques, such as Maximum Likelihood Linear Regression (MLLR) and Maximum A-Posteriori (MAP) adaptation, to adapt existing phonemic MSA acoustic models with a small amount of dialectal ECA speech data. Speech recognition results indicate a significant increase in recognition accuracy compared to a baseline model trained with only ECA data.
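A minimal sketch of MAP mean adaptation, the standard form of the MAP technique mentioned above: an MSA-trained Gaussian mean is nudged toward a small amount of dialectal (ECA) data, with a relevance factor tau controlling how far. All numbers here are illustrative, not from the book.

```python
import numpy as np

def map_adapt_mean(prior_mean, adaptation_frames, tau=16.0):
    """MAP-adapted mean: (n * mean(data) + tau * mu_prior) / (n + tau)."""
    data = np.asarray(adaptation_frames, dtype=float)
    n = data.shape[0]
    return (n * data.mean(axis=0) + tau * np.asarray(prior_mean)) / (n + tau)

prior = np.array([0.0, 0.0])  # stand-in for an MSA-trained mean
# A few hypothetical ECA adaptation frames centred near 1.0.
eca = np.random.default_rng(0).normal(1.0, 0.1, size=(8, 2))
print(map_adapt_mean(prior, eca))  # pulled only partway toward the ECA data
```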
This two-volume set, consisting of LNCS 8403 and LNCS 8404, constitutes the thoroughly refereed proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2014, held in Kathmandu, Nepal, in April 2014. The 85 revised papers presented together with 4 invited papers were carefully reviewed and selected from 300 submissions. The papers are organized in the following topical sections: lexical resources; document representation; morphology, POS-tagging, and named entity recognition; syntax and parsing; anaphora resolution; recognizing textual entailment; semantics and discourse; natural language generation; sentiment analysis and emotion recognition; opinion mining and social networks; machine translation and multilingualism; information retrieval; text classification and clustering; text summarization; plagiarism detection; style and spelling checking; speech processing; and applications.
For humans, understanding a natural language sentence or discourse is so effortless that we hardly ever think about it. For machines, however, the task of interpreting natural language, especially grasping meaning beyond the literal content, has proven extremely difficult and requires a large amount of background knowledge. This book focuses on the interpretation of natural language with respect to specific domain knowledge captured in ontologies. The main contribution is an approach that puts ontologies at the center of the interpretation process. This means that ontologies not only provide a formalization of the domain knowledge necessary for interpretation but also support and guide the construction of meaning representations. We start with an introduction to ontologies and demonstrate how linguistic information can be attached to them by means of the ontology lexicon model lemon. These lexica then serve as the basis for the automatic generation of grammars, which we use to compositionally construct meaning representations that conform to the vocabulary of an underlying ontology. As a result, the level of representational granularity is driven not by language but by the semantic distinctions made in the underlying ontology, and thus by distinctions that are relevant in the context of a particular domain. We highlight some of the challenges involved in the construction of ontology-based meaning representations, and show how ontologies can be exploited for ambiguity resolution and the interpretation of temporal expressions. Finally, we present a question answering system that combines all the tools and techniques introduced throughout the book in a real-world application, and sketch how the presented approach can scale to larger, multi-domain scenarios in the context of the Semantic Web. Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Ontologies / Linguistic Formalisms / Ontology Lexica / Grammar Generation / Putting Everything Together / Ontological Reasoning for Ambiguity Resolution / Temporal Interpretation / Ontology-Based Interpretation for Question Answering / Conclusion / Bibliography / Authors' Biographies
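A minimal sketch, using rdflib, of the book's central move: attaching a word to an ontology concept via a lemon-style lexical entry whose sense references the concept. The example lexicon and ontology IRIs are assumptions made for illustration.

```python
from rdflib import BNode, Graph, Literal, Namespace
from rdflib.namespace import RDF

LEMON = Namespace("http://lemon-model.net/lemon#")
ONTO = Namespace("http://example.org/ontology#")   # assumed domain ontology
LEX = Namespace("http://example.org/lexicon#")     # assumed lexicon

g = Graph()
entry, form, sense = LEX.river, BNode(), LEX.river_sense
g.add((entry, RDF.type, LEMON.LexicalEntry))
g.add((entry, LEMON.canonicalForm, form))
g.add((form, LEMON.writtenRep, Literal("river", lang="en")))
g.add((entry, LEMON.sense, sense))
g.add((sense, LEMON.reference, ONTO.River))  # the word-to-concept link

print(g.serialize(format="turtle"))
```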
It has been estimated that over a billion people are using or learning English as a second or foreign language, and the numbers are growing not only for English but for other languages as well. These language learners provide a burgeoning market for tools that help identify and correct learners' writing errors. Unfortunately, the errors targeted by typical commercial proofreading tools do not include those aspects of a second language that are hardest to learn. This volume describes the types of constructions English language learners find most difficult: constructions containing prepositions, articles, and collocations. It provides an overview of the automated approaches that have been developed to identify and correct these and other classes of learner errors in a number of languages. Error annotation and system evaluation are particularly important topics in grammatical error detection because there are no commonly accepted standards. Chapters in the book describe the options available to researchers, recommend best practices for reporting results, and present annotation and evaluation schemes. The final chapters explore recent innovative work that opens new directions for research. It is the authors' hope that this volume will continue to contribute to the growing interest in grammatical error detection by encouraging researchers to take a closer look at the field and its many challenging problems.
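As a toy illustration of one classic line of work the volume surveys, the sketch below treats article selection as a choice driven by corpus statistics. The count table is fabricated for illustration and ignores the zero-article option; real systems use large corpora and richer context.

```python
from collections import Counter

# Fabricated (article, noun) co-occurrence counts standing in for corpus data.
article_noun_counts = Counter({
    ("the", "information"): 950, ("an", "information"): 3,
    ("a", "information"): 2,
    ("a", "suggestion"): 700, ("the", "suggestion"): 400,
    ("an", "suggestion"): 1,
})

def suggest_article(noun: str) -> str:
    """Pick the article most often seen with this noun in the counts."""
    candidates = ["a", "an", "the"]
    return max(candidates, key=lambda art: article_noun_counts[(art, noun)])

print(suggest_article("information"))  # "the"
print(suggest_article("suggestion"))   # "a"
```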
The explosion of information technology has led to substantial growth of web-accessible linguistic data in terms of quantity, diversity and complexity. These resources become even more useful when interlinked with each other to generate network effects. The general trend of providing data online is thus accompanied by newly developing methodologies to interconnect linguistic data and metadata. Such resources include linguistic data collections, general-purpose knowledge bases (e.g., DBpedia, a machine-readable edition of Wikipedia), and repositories with specific information about languages, linguistic categories and phenomena. The Linked Data paradigm provides a framework for interoperability and access management, and thereby allows information from such a diverse set of resources to be integrated. The contributions assembled in this volume illustrate the breadth of applications of the Linked Data paradigm for representative types of language resources. They cover lexical-semantic resources, annotated corpora, and typological databases, as well as terminology and metadata repositories. The book includes representative applications from diverse fields, ranging from academic linguistics (e.g., typology and corpus linguistics) through applied linguistics (e.g., lexicography and translation studies) to technical applications (in computational linguistics, Natural Language Processing and information technology). This volume accompanies the Workshop on Linked Data in Linguistics 2012 (LDL-2012) in Frankfurt/M., Germany, organized by the Open Linguistics Working Group (OWLG) of the Open Knowledge Foundation (OKFN). It assembles contributions of the workshop participants and, beyond this, summarizes initial steps in the formation of a Linked Open Data cloud of linguistic resources, the Linguistic Linked Open Data cloud (LLOD).