This book provides an overview of various techniques for the alignment of bitexts. It describes general concepts and strategies that can be applied to map corresponding parts in parallel documents at various levels of granularity. Bitexts are valuable linguistic resources for many different research fields and practical applications. The predominant application is machine translation, in particular, statistical machine translation. However, many other lines of research can benefit from the rich linguistic knowledge implicitly stored in parallel resources. Bitexts have been explored in lexicography, word sense disambiguation, terminology extraction, computer-aided language learning and translation studies, to name just a few. The book covers the essential tasks that have to be carried out when building parallel corpora, starting from the collection of translated documents up to sub-sentential alignments. In particular, it describes various approaches to document alignment, sentence alignment, word alignment and tree structure alignment. It also includes a list of resources and a comprehensive review of the literature on alignment techniques. Table of Contents: Introduction / Basic Concepts and Terminology / Building Parallel Corpora / Sentence Alignment / Word Alignment / Phrase and Tree Alignment / Concluding Remarks
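The sentence alignment task described above is often bootstrapped with a length-based method in the spirit of Gale and Church (1993): translated sentences tend to have proportional lengths, so a dynamic program over character lengths can recover the alignment. The sketch below is illustrative only, with a made-up cost function and toy bitext; it supports 1-1 beads plus unaligned sentences (1-0, 0-1).

```python
# Minimal length-based sentence alignment sketch (Gale & Church style).
# Cost function and skip penalty are illustrative assumptions, not the
# published model; real aligners use a probabilistic length model.

def align(src, tgt, skip_cost=10.0):
    """Align two lists of sentences using 1-1, 1-0, and 0-1 beads."""
    n, m = len(src), len(tgt)
    INF = float("inf")
    # cost[i][j] = best cost of aligning src[:i] with tgt[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == INF:
                continue
            if i < n and j < m:  # 1-1 bead: penalize relative length mismatch
                c = cost[i][j] + abs(len(src[i]) - len(tgt[j])) / max(len(src[i]), len(tgt[j]))
                if c < cost[i + 1][j + 1]:
                    cost[i + 1][j + 1], back[i + 1][j + 1] = c, (i, j, "1-1")
            if i < n:            # 1-0 bead: source sentence left unaligned
                if cost[i][j] + skip_cost < cost[i + 1][j]:
                    cost[i + 1][j], back[i + 1][j] = cost[i][j] + skip_cost, (i, j, "1-0")
            if j < m:            # 0-1 bead: target sentence left unaligned
                if cost[i][j] + skip_cost < cost[i][j + 1]:
                    cost[i][j + 1], back[i][j + 1] = cost[i][j] + skip_cost, (i, j, "0-1")
    # Trace back the lowest-cost bead sequence
    beads, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj, bead = back[i][j]
        beads.append(bead)
        i, j = pi, pj
    return list(reversed(beads))

src = ["The cat sat on the mat.", "It purred."]
tgt = ["Le chat s'est assis sur le tapis.", "Il ronronnait."]
print(align(src, tgt))  # -> ['1-1', '1-1']
```

Length alone is a surprisingly strong signal for sentence alignment; word-level alignment, treated later in the book, requires translation models rather than length statistics.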
A major part of natural language processing now depends on the use of text data to build linguistic analyzers. We consider statistical, computational approaches to modeling linguistic structure. We seek to unify across many approaches and many kinds of linguistic structures. Assuming a basic understanding of natural language processing and/or machine learning, we seek to bridge the gap between the two fields. Approaches to decoding (i.e., carrying out linguistic structure prediction) and supervised and unsupervised learning of models that predict discrete structures as outputs are the focus. We also survey natural language processing problems to which these methods are being applied, and we address related topics in probabilistic inference, optimization, and experimental methodology. Table of Contents: Representations and Linguistic Data / Decoding: Making Predictions / Learning Structure from Annotated Data / Learning Structure from Incomplete Data / Beyond Decoding: Inference
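The "decoding" step mentioned above, predicting a discrete structure for an input, can be illustrated with Viterbi search over a toy hidden Markov model for part-of-speech tagging. All states and probabilities below are invented for illustration; the point is the dynamic program, not the model.

```python
# Viterbi decoding sketch over a toy HMM tagger. Transition and emission
# probabilities are made-up illustrative values; unseen events get a
# small floor probability.
import math

states = ["DT", "NN", "VB"]
trans = {("<s>", "DT"): 0.6, ("<s>", "NN"): 0.3, ("<s>", "VB"): 0.1,
         ("DT", "NN"): 0.9, ("DT", "DT"): 0.05, ("DT", "VB"): 0.05,
         ("NN", "VB"): 0.6, ("NN", "NN"): 0.3, ("NN", "DT"): 0.1,
         ("VB", "DT"): 0.5, ("VB", "NN"): 0.4, ("VB", "VB"): 0.1}
emit = {("DT", "the"): 0.9, ("NN", "dog"): 0.5,
        ("NN", "barks"): 0.1, ("VB", "barks"): 0.7}

def viterbi(words):
    """Return the highest-scoring tag sequence under the toy HMM."""
    # V[t][s] = (best log-prob of a path ending in state s at time t, backpointer)
    V = [{s: (math.log(trans.get(("<s>", s), 1e-6) * emit.get((s, words[0]), 1e-6)), None)
          for s in states}]
    for t in range(1, len(words)):
        col = {}
        for s in states:
            score, prev = max(
                (V[t - 1][r][0] + math.log(trans.get((r, s), 1e-6))
                 + math.log(emit.get((s, words[t]), 1e-6)), r)
                for r in states)
            col[s] = (score, prev)
        V.append(col)
    # Follow backpointers from the best final state
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(words) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

print(viterbi(["the", "dog", "barks"]))  # -> ['DT', 'NN', 'VB']
```

The same max-plus dynamic program generalizes to the richer structured models the book covers, where scores come from learned feature weights rather than a fixed probability table.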
Routledge Introductions to Applied Linguistics consists of introductory-level textbooks covering the core topics in Applied Linguistics, designed for those entering postgraduate studies and for language professionals returning to academic study. The books take an innovative "practice to theory" approach, with a back-to-front structure that starts from real-life problems and issues in the field, then moves to a discussion of intervention and how to engage with these concerns. The final section concludes by tying the practical issues to their theoretical foundations. Additional features include tasks with commentaries, a glossary of key terms, and an annotated further reading section. Corpus linguistics is a key area of applied linguistics and one of the most rapidly developing. Winnie Cheng's practical approach guides readers in acquiring the relevant knowledge and theories to enable the analysis, explanation and interpretation of language using corpus methods. Throughout the book, practical classroom examples, concordance-based analyses and tasks such as designing and conducting mini-projects are used to connect and explain the conceptual and practical aspects of corpus linguistics. Exploring Corpus Linguistics is an essential textbook for postgraduate/graduate students new to the field and for advanced undergraduates studying English Language and Applied Linguistics.
Human language acquisition has been studied for centuries, but using computational modeling for such studies is a relatively recent trend. However, computational approaches to language learning have become increasingly popular, mainly due to advances in machine learning techniques and the availability of vast collections of experimental data on child language learning and child-adult interaction. Many of the existing computational models attempt to study the complex task of learning a language under cognitive plausibility criteria (such as the memory and processing limitations that humans face), and to explain the developmental stages observed in children. By simulating the process of child language learning, computational models can show us which linguistic representations are learnable from the input that children have access to, and which mechanisms yield the same patterns of behaviour that children exhibit during this process. In doing so, computational modeling provides insight into the plausible mechanisms involved in human language acquisition, and inspires the development of better language models and techniques. This book provides an overview of the main research questions in the field of human language acquisition. It reviews the most commonly used computational frameworks, methodologies and resources for modeling child language learning, and the evaluation techniques used for assessing these computational models. The book is aimed at cognitive scientists who want to become familiar with the available computational methods for investigating problems related to human language acquisition, as well as computational linguists who are interested in applying their skills to the study of child language acquisition. Different aspects of language learning are discussed in separate chapters, including the acquisition of individual words, the general regularities which govern word and sentence form, and the associations between form and meaning.
For each of these aspects, the challenges of the task are discussed and the relevant empirical findings on children are summarized. Furthermore, the existing computational models that attempt to simulate the task under study are reviewed, and a number of case studies are presented. Table of Contents: Overview / Computational Models of Language Learning / Learning Words / Putting Words Together / Form--Meaning Associations / Final Thoughts
This book presents a critical overview of current work on linguistic features and establishes new bases for their use in the study and understanding of language.
This book provides system developers and researchers in natural language processing and computational linguistics with the necessary background information for working with the Arabic language. The goal is to introduce Arabic linguistic phenomena and review the state-of-the-art in Arabic processing. The book discusses Arabic script, phonology, orthography, morphology, syntax and semantics, with a final chapter on machine translation issues. The chapter sizes correspond more or less to what is linguistically distinctive about Arabic, with morphology getting the lion's share, followed by Arabic script. No previous knowledge of Arabic is needed. This book is designed for computer scientists and linguists alike. The focus of the book is on Modern Standard Arabic; however, notes on practical issues related to Arabic dialects and languages written in the Arabic script are presented in different chapters. Table of Contents: What is "Arabic"? / Arabic Script / Arabic Phonology and Orthography / Arabic Morphology / Computational Morphology Tasks / Arabic Syntax / A Note on Arabic Semantics / A Note on Arabic and Machine Translation
Originally published in 1997, this book is concerned with human language technology. This technology provides computers with the capability to handle spoken and written language. One major goal is to improve communication between humans and machines. If people can use their own language to access information, work with software applications and control machinery, the greatest obstacle to the acceptance of new information technology is overcome. Another important goal is to facilitate communication among people. Machines can help to translate texts or spoken input from one human language to another. Programs that assist people in writing by checking orthography, grammar and style are constantly improving. This book was sponsored by the Directorate General XIII of the European Union and the Information Science and Engineering Directorate of the National Science Foundation, USA.
The search for information is no longer exclusively limited to the native language of the user, but is more and more extended to other languages. This gives rise to the problem of cross-language information retrieval (CLIR), whose goal is to find relevant information written in a language different from that of the query. In addition to the problems of monolingual information retrieval (IR), translation is the key problem in CLIR: one should translate either the query or the documents from one language into another. However, this translation problem is not identical to full-text machine translation (MT): the goal is not to produce a human-readable translation, but a translation suitable for finding relevant documents. Specific translation methods are thus required. The goal of this book is to provide a comprehensive description of the specific problems arising in CLIR, the solutions proposed in this area, as well as the remaining problems. The book starts with a general description of the monolingual IR and CLIR problems. Different classes of approaches to translation are then presented: approaches using an MT system, dictionary-based translation and approaches based on parallel and comparable corpora. In addition, the typical retrieval effectiveness of the different approaches is compared. It will be shown that translation approaches specifically designed for CLIR can rival and outperform high-quality MT systems. Finally, the book offers a look into the future that draws a strong parallel between query expansion in monolingual IR and query translation in CLIR, suggesting that many approaches developed in monolingual IR can be adapted to CLIR. The book can be used as an introduction to CLIR. Advanced readers can also find more technical details and discussions of the remaining research challenges. It is suitable for new researchers who intend to carry out research on CLIR.
Table of Contents: Preface / Introduction / Using Manually Constructed Translation Systems and Resources for CLIR / Translation Based on Parallel and Comparable Corpora / Other Methods to Improve CLIR / A Look into the Future: Toward a Unified View of Monolingual IR and CLIR? / References / Author Biography
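The dictionary-based query translation discussed above can be sketched in a few lines: each query term is expanded into weighted candidate translations, and documents are ranked by weighted overlap with the translated query. The bilingual dictionary, translation weights, and documents below are all invented for illustration; real CLIR systems estimate translation probabilities from corpora and use a proper retrieval model.

```python
# Toy dictionary-based CLIR sketch: English query, French documents.
# Dictionary entries and weights are illustrative assumptions.

bilingual = {  # English term -> weighted French translation candidates
    "drug": [("medicament", 0.7), ("drogue", 0.3)],
    "trafficking": [("trafic", 1.0)],
}

docs = {
    "d1": "le trafic de drogue augmente",
    "d2": "un nouveau medicament contre la douleur",
    "d3": "la meteo est belle",
}

def translate_query(terms):
    """Expand an English query into a weighted bag of French terms."""
    weighted = {}
    for t in terms:
        for tr, w in bilingual.get(t, []):
            weighted[tr] = weighted.get(tr, 0.0) + w
    return weighted

def rank(terms):
    """Rank documents by weighted overlap with the translated query."""
    q = translate_query(terms)
    scores = {d: sum(w for t, w in q.items() if t in text.split())
              for d, text in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(rank(["drug", "trafficking"]))  # d1 ranks first (trafic + drogue)
```

Keeping several weighted translations per term, rather than picking a single "best" translation, is one reason CLIR-specific translation can outperform running a full MT system on the query.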
Our world is being revolutionized by data-driven methods: access to large amounts of data has generated new insights and opened exciting new opportunities in commerce, science, and computing applications. Processing the enormous quantities of data necessary for these advances requires large clusters, making distributed computing paradigms more crucial than ever. MapReduce is a programming model for expressing distributed computations on massive datasets and an execution framework for large-scale data processing on clusters of commodity servers. The programming model provides an easy-to-understand abstraction for designing scalable algorithms, while the execution framework transparently handles many system-level details, ranging from scheduling to synchronization to fault tolerance. This book focuses on MapReduce algorithm design, with an emphasis on text processing algorithms common in natural language processing, information retrieval, and machine learning. We introduce the notion of MapReduce design patterns, which represent general reusable solutions to commonly occurring problems across a variety of problem domains. This book not only intends to help the reader "think in MapReduce", but also discusses limitations of the programming model as well. Table of Contents: Introduction / MapReduce Basics / MapReduce Algorithm Design / Inverted Indexing for Text Retrieval / Graph Algorithms / EM Algorithms for Text Processing / Closing Remarks
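The MapReduce model described above can be simulated in plain Python on a single machine: a mapper emits key-value pairs, the framework groups them by key (the "shuffle"), and a reducer aggregates each group. This sketch runs the classic word-count example; on a real cluster the framework, not user code, performs the grouping and distribution.

```python
# Single-machine simulation of the MapReduce programming model,
# illustrated with word count.
from collections import defaultdict

def mapper(doc):
    for word in doc.split():
        yield (word, 1)            # emit one count per occurrence

def reducer(word, counts):
    yield (word, sum(counts))      # aggregate all counts for one key

def map_reduce(docs, mapper, reducer):
    groups = defaultdict(list)     # "shuffle": group intermediate pairs by key
    for doc in docs:
        for k, v in mapper(doc):
            groups[k].append(v)
    out = {}
    for k, vs in groups.items():
        for key, val in reducer(k, vs):
            out[key] = val
    return out

docs = ["to be or not to be", "to do"]
print(map_reduce(docs, mapper, reducer))
# -> {'to': 3, 'be': 2, 'or': 1, 'not': 1, 'do': 1}
```

The value of the abstraction is that only `mapper` and `reducer` are problem-specific; scheduling, synchronization, and fault tolerance are handled by the execution framework.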
This book introduces the most important problems of reference and considers the solutions that have been proposed to explain them. Reference is at the centre of debate among linguists and philosophers and, as Barbara Abbott shows, this has been the case for centuries. She begins by examining the basic issue of how far reference is a two-place (words-world) or a three-place (speakers-words-world) relation. She then discusses the main aspects of the field and the issues associated with them, including those concerning proper names; direct reference and individual concepts; the difference between referential and quantificational descriptions; pronouns and indexicality; concepts like definiteness and strength; and noun phrases in discourse.
This book contains selected papers from the Colloquium in Honor of Alain Lecomte, held in Pauillac, France, in November 2007. The event was part of the ANR project "Prelude" (Towards Theoretical Pragmatics Based on Ludics and Continuation Theory), the proceedings of which were published in another FoLLI-LNAI volume (LNAI 6505) edited by Alain Lecomte and Samuel Troncon. The selected papers of this Festschrift volume focus on the scientific areas in which Alain Lecomte has worked and to which he has contributed: formal linguistics, computational linguistics, logic, and cognition.
This book is aimed at providing an overview of several aspects of semantic role labeling. Chapter 1 begins with linguistic background on the definition of semantic roles and the controversies surrounding them. Chapter 2 describes how the theories have led to structured lexicons such as FrameNet, VerbNet and the PropBank Frame Files that in turn provide the basis for large scale semantic annotation of corpora. This data has facilitated the development of automatic semantic role labeling systems based on supervised machine learning techniques. Chapter 3 presents the general principles of applying both supervised and unsupervised machine learning to this task, with a description of the standard stages and feature choices, as well as giving details of several specific systems. Recent advances include the use of joint inference to take advantage of context sensitivities, and attempts to improve performance by closer integration of the syntactic parsing task with semantic role labeling. Chapter 3 also discusses the impact the granularity of the semantic roles has on system performance. Having outlined the basic approach with respect to English, Chapter 4 goes on to discuss applying the same techniques to other languages, using Chinese as the primary example. Although substantial training data is available for Chinese, this is not the case for many other languages, and techniques for projecting English role labels onto parallel corpora are also presented. Table of Contents: Preface / Semantic Roles / Available Lexical Resources / Machine Learning for Semantic Role Labeling / A Cross-Lingual Perspective / Summary
Considerable progress has been made in recent years in the development of dialogue systems that support robust and efficient human-machine interaction using spoken language. Spoken dialogue technology allows various interactive applications to be built and used for practical purposes, and research focuses on issues that aim to increase the system's communicative competence by including aspects of error correction, cooperation, multimodality, and adaptation in context. This book gives a comprehensive view of state-of-the-art techniques that are used to build spoken dialogue systems. It provides an overview of the basic issues such as system architectures, various dialogue management methods, system evaluation, and also surveys advanced topics concerning extensions of the basic model to more conversational setups. The goal of the book is to provide an introduction to the methods, problems, and solutions that are used in dialogue system development and evaluation. It presents dialogue modelling and system development issues relevant in both academic and industrial environments and also discusses requirements and challenges for advanced interaction management and future research. Table of Contents: Preface / Introduction to Spoken Dialogue Systems / Dialogue Management / Error Handling / Case Studies: Advanced Approaches to Dialogue Management / Advanced Issues / Methodologies and Practices of Evaluation / Future Directions / References / Author Biographies
This book introduces Chinese language-processing issues and techniques to readers who already have a basic background in natural language processing (NLP). Since the major difference between Chinese and Western languages is at the word level, the book primarily focuses on Chinese morphological analysis and introduces the concept, structure, and interword semantics of Chinese words. The following topics are covered: a general introduction to Chinese NLP; Chinese characters, morphemes, and words and the characteristics of Chinese words that have to be considered in NLP applications; Chinese word segmentation; unknown word detection; word meaning and Chinese linguistic resources; interword semantics based on word collocation and NLP techniques for collocation extraction. Table of Contents: Introduction / Words in Chinese / Challenges in Chinese Morphological Processing / Chinese Word Segmentation / Unknown Word Identification / Word Meaning / Chinese Collocations / Automatic Chinese Collocation Extraction / Appendix / References / Author Biographies
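Since Chinese text is written without spaces, the word segmentation task covered above is often bootstrapped with a simple dictionary method before statistical models are applied. This is a sketch of forward maximum matching: at each position, take the longest lexicon entry that matches. The tiny lexicon is an illustrative assumption.

```python
# Forward maximum matching (FMM) segmentation sketch for Chinese.
# The lexicon below is a toy; real systems use large dictionaries plus
# statistical disambiguation.

lexicon = {"北京", "大学", "北京大学", "生", "学生"}

def fmm_segment(text, lexicon, max_len=4):
    """Greedy left-to-right longest-match segmentation."""
    out, i = [], 0
    while i < len(text):
        for L in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + L]
            if piece in lexicon or L == 1:  # unknown single chars fall through
                out.append(piece)
                i += L
                break
    return out

print(fmm_segment("北京大学生", lexicon))  # -> ['北京大学', '生']
```

The example also shows why pure dictionary matching is insufficient: the string could equally be segmented as 北京 / 学生-related readings, which is exactly the kind of ambiguity the book's statistical segmentation methods are designed to resolve.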
Cross-Word Modeling for Arabic Speech Recognition utilizes phonological rules to model the cross-word problem, the merging of adjacent words that occurs in continuous speech, in order to enhance the performance of continuous speech recognition systems. The author aims to provide an understanding of the cross-word problem and how it can be avoided, focusing specifically on Arabic phonology using an HMM-based classifier.
Linguistic annotation and text analytics are active areas of research and development, with academic conferences and industry events such as the Linguistic Annotation Workshops and the annual Text Analytics Summits. This book provides a basic introduction to both fields, and aims to show that good linguistic annotations are the essential foundation for good text analytics. After briefly reviewing the basics of XML, with practical exercises illustrating in-line and stand-off annotations, a chapter is devoted to explaining the different levels of linguistic annotations. The reader is encouraged to create example annotations using the WordFreak linguistic annotation tool. The next chapter shows how annotations can be created automatically using statistical NLP tools, and compares two sets of tools, the OpenNLP and Stanford NLP tools. The second half of the book describes different annotation formats and gives practical examples of how to interchange annotations between different formats using XSLT transformations. The two main text analytics architectures, GATE and UIMA, are then described and compared, with practical exercises showing how to configure and customize them. The final chapter is an introduction to text analytics, describing the main applications and functions including named entity recognition, coreference resolution and information extraction, with practical examples using both open source and commercial tools. Copies of the example files, scripts, and stylesheets used in the book are available from the book's companion website. Table of Contents: Working with XML / Linguistic Annotation / Using Statistical NLP Tools / Annotation Interchange / Annotation Architectures / Text Analytics
This book provides a precise and thorough description of the meaning and use of spatial expressions, using both a linguistics and an artificial intelligence perspective, and also an enlightening discussion of computer models of comprehension and production in the spatial domain. The author proposes a theoretical framework that explains many previously overlooked or misunderstood irregularities. The use of prepositions reveals underlying schematisations and idealisations of the spatial world, which, for the most part, echo representational structures necessary for human action (movement and manipulation). Because spatial cognition seems to provide a key to understanding much of the cognitive system, including language, the book addresses one of the most basic questions confronting cognitive science and artificial intelligence, and brings fresh and original insights to it.
In this book, Peter Culicover introduces the analysis of natural language within the broader question of how language works - of how people use languages to configure words and morphemes in order to express meanings. He focuses both on the syntactic and morphosyntactic devices that languages use, and on the conceptual structures that correspond to particular aspects of linguistic form. He seeks to explain linguistic forms and in the process to show how these correspond with meanings.
This book presents computational mechanisms for solving common language interpretation problems including many cases of reference resolution, word sense disambiguation, and the interpretation of relationships implicit in modifiers. The proposed memory and context mechanisms provide the means for representing and applying information about the semantic relationships between entities imposed by the cultural context. The effects of different 'context factors', derived from multiple sources, are combined for disambiguation and for limiting memory search; the factors having been created and manipulated gradually during discourse processing.
Dependency-based methods for syntactic parsing have become increasingly popular in natural language processing in recent years. This book gives a thorough introduction to the methods that are most widely used today. After an introduction to dependency grammar and dependency parsing, followed by a formal characterization of the dependency parsing problem, the book surveys the three major classes of parsing models that are in current use: transition-based, graph-based, and grammar-based models. It continues with a chapter on evaluation and one on the comparison of different methods, and it closes with a few words on current trends and future prospects of dependency parsing. The book presupposes a knowledge of basic concepts in linguistics and computer science, as well as some knowledge of parsing methods for constituency-based representations. Table of Contents: Introduction / Dependency Parsing / Transition-Based Parsing / Graph-Based Parsing / Grammar-Based Parsing / Evaluation / Comparison / Final Thoughts
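The transition-based models surveyed in the book build a dependency tree incrementally through a sequence of parser actions over a stack and a buffer. The sketch below implements the arc-standard transition system (SHIFT, LEFT-ARC, RIGHT-ARC); the action sequence is supplied by hand here, whereas a real parser would predict each action with a trained classifier.

```python
# Arc-standard transition system sketch. Actions are hand-supplied for
# illustration; a real transition-based parser scores them with a model.

def parse(words, actions):
    """Apply arc-standard transitions; return arcs as (head, dependent) indices."""
    stack, buffer, arcs = [], list(range(len(words))), []
    for act in actions:
        if act == "SHIFT":           # move the next buffer word onto the stack
            stack.append(buffer.pop(0))
        elif act == "LEFT-ARC":      # top of stack governs the word beneath it
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif act == "RIGHT-ARC":     # word beneath the top governs the top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

words = ["the", "dog", "barks"]
# Gold tree: dog -> the, barks -> dog
actions = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "LEFT-ARC"]
arcs = parse(words, actions)
print([(words[h], words[d]) for h, d in arcs])
# -> [('dog', 'the'), ('barks', 'dog')]
```

Graph-based models, treated in a separate chapter, instead score entire trees and search for the highest-scoring one, trading the linear-time greedy derivation above for global optimization.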
This is a book about semantic theories of modality. Its main goal is to explain and evaluate important contemporary theories within linguistics and to discuss a wide range of linguistic phenomena from the perspective of these theories. The introduction describes the variety of grammatical phenomena associated with modality, explaining why modal verbs, adjectives, and adverbs represent the core phenomena. Chapters are then devoted to the possible worlds semantics for modality developed in modal logic; current theories of modal semantics within linguistics; and the most important empirical areas of research. The author concludes by discussing the relation between modality and other topics, especially tense, aspect, mood, and discourse meaning.
As online information grows dramatically, search engines such as Google are playing a more and more important role in our lives. Critical to all search engines is the problem of designing an effective retrieval model that can rank documents accurately for a given query. This has been a central research problem in information retrieval for several decades. In the past ten years, a new generation of retrieval models, often referred to as statistical language models, has been successfully applied to solve many different information retrieval problems. Compared with the traditional models such as the vector space model, these new models have a more sound statistical foundation and can leverage statistical estimation to optimize retrieval parameters. They can also be more easily adapted to model non-traditional and complex retrieval problems. Empirically, they tend to achieve comparable or better performance than a traditional model with less effort on parameter tuning. This book systematically reviews the large body of literature on applying statistical language models to information retrieval with an emphasis on the underlying principles, empirically effective language models, and language models developed for non-traditional retrieval tasks. All the relevant literature has been synthesized to make it easy for a reader to digest the research progress achieved so far and see the frontier of research in this area. The book also offers practitioners an informative introduction to a set of practically useful language models that can effectively solve a variety of retrieval problems. No prior knowledge about information retrieval is required, but some basic knowledge about probability and statistics would be useful for fully digesting all the details. 
Table of Contents: Introduction / Overview of Information Retrieval Models / Simple Query Likelihood Retrieval Model / Complex Query Likelihood Model / Probabilistic Distance Retrieval Model / Language Models for Special Retrieval Tasks / Language Models for Latent Topic Analysis / Conclusions
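The simple query likelihood model covered in the book scores a document by the probability its language model assigns to the query. The sketch below uses a unigram model with Jelinek-Mercer smoothing, interpolating each document model with the collection model; the documents and smoothing weight are illustrative, and the sketch assumes every query term occurs somewhere in the collection.

```python
# Query likelihood retrieval sketch with Jelinek-Mercer smoothing.
# Documents and the smoothing weight lam are illustrative assumptions.
import math

docs = {
    "d1": "information retrieval with language models",
    "d2": "language models for speech recognition",
    "d3": "cooking with fresh herbs",
}

def score(query, doc_text, collection, lam=0.5):
    """log P(query | document LM), smoothed with the collection LM."""
    d = doc_text.split()
    c = collection.split()
    s = 0.0
    for q in query.split():
        p_doc = d.count(q) / len(d)   # maximum-likelihood document model
        p_col = c.count(q) / len(c)   # collection (background) model
        s += math.log(lam * p_doc + (1 - lam) * p_col)
    return s

collection = " ".join(docs.values())
ranking = sorted(docs,
                 key=lambda d: score("language retrieval", docs[d], collection),
                 reverse=True)
print(ranking)  # d1 first: it contains both query terms
```

Smoothing here does double duty, as the book explains: it avoids zero probabilities for unseen terms and implicitly plays the role that inverse document frequency plays in traditional models.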
This volume is a collection of original contributions from outstanding scholars in linguistics, philosophy and computational linguistics exploring the relation between word meaning and human linguistic creativity. The papers address different aspects of the question of what word meaning is, a problem that has been at the centre of heated debate in all those disciplines that are directly or indirectly concerned with the study of language and of human cognition. The discussions are centred around a view of the mental lexicon, as outlined in the Generative Lexicon theory (Pustejovsky, 1995), which proposes a unified model for defining word meaning. The individual contributors present their evidence for a generative approach as well as critical perspectives, making for a volume in which word meaning is viewed not from a single angle or concern but through a wide variety of topics, each introduced and explained by the editors.
Placing contemporary spoken English at the centre of phonological research, this book tackles the issue of language variation and change through a range of methodological and theoretical approaches. In doing so the book bridges traditionally separate fields such as experimental phonetics, theoretical phonology, language acquisition and sociolinguistics. Made up of 12 chapters, it explores a substantial range of linguistic phenomena. It covers auditory, acoustic and articulatory phonetics, second language pronunciation and perception, sociophonetics, cross-linguistic comparison of vowel reduction and methodological issues in the construction of phonological corpora. The book presents new data and analyses which demonstrate what phonologists, phoneticians and sociolinguists do with their corpora and show how various theoretical and experimental questions can be explored in light of authentic spoken data.
This book provides a comprehensive introduction to Conversational AI. While the idea of interacting with a computer using voice or text goes back a long way, it is only in recent years that this idea has become a reality with the emergence of digital personal assistants, smart speakers, and chatbots. Advances in AI, particularly in deep learning, along with the availability of massive computing power and vast amounts of data, have led to a new generation of dialogue systems and conversational interfaces. Current research in Conversational AI focuses mainly on the application of machine learning and statistical data-driven approaches to the development of dialogue systems. However, it is important to be aware of previous achievements in dialogue technology and to consider to what extent they might be relevant to current research and development. Three main approaches to the development of dialogue systems are reviewed: rule-based systems that are handcrafted using best practice guidelines; statistical data-driven systems based on machine learning; and neural dialogue systems based on end-to-end learning. Evaluating the performance and usability of dialogue systems has become an important topic in its own right, and a variety of evaluation metrics and frameworks are described. Finally, a number of challenges for future research are considered, including: multimodality in dialogue systems, visual dialogue; data efficient dialogue model learning; using knowledge graphs; discourse and dialogue phenomena; hybrid approaches to dialogue systems development; dialogue with social robots and in the Internet of Things; and social and ethical issues.