This book provides an in-depth view of current issues, problems and approaches in the computation of meaning as expressed in language. Aimed at linguists, computer scientists, and logicians with an interest in the computation of meaning, it focuses on two main topics in recent research in computational semantics. The first is the definition and use of underspecified semantic representations: formal structures that represent part of the meaning of a linguistic object while leaving other parts unspecified. The second is semantic annotation. Annotated corpora have become an indispensable resource both for linguists and for developers of language and speech technology, especially when used in combination with machine learning methods. To date, however, corpus annotation has addressed semantic information only marginally, since semantic annotation methodologies are still in their infancy. This book discusses the development and application of such methodologies.
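To make the first topic concrete, here is a minimal, hypothetical sketch (not taken from the book) of how a single underspecified representation can stand for several fully specified readings at once: each quantifier of "Every student read a book" leaves an open scope hole, and each way of ordering the quantifiers yields one resolved reading.

```python
# A hypothetical sketch of an underspecified representation: both
# quantifiers of "Every student read a book" leave a scope hole, and
# each ordering yields one fully specified reading.

from itertools import permutations

fragments = {
    "q1": "every(x, student(x), {})",  # {} marks the open scope hole
    "q2": "a(y, book(y), {})",
    "core": "read(x, y)",
}

def resolutions():
    """Enumerate fully specified readings: each quantifier ordering
    plugs the remaining material into the open hole."""
    readings = []
    for outer, inner in permutations(["q1", "q2"]):
        inner_filled = fragments[inner].format(fragments["core"])
        readings.append(fragments[outer].format(inner_filled))
    return readings

for r in resolutions():
    print(r)
# every(x, student(x), a(y, book(y), read(x, y)))
# a(y, book(y), every(x, student(x), read(x, y)))
```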
Natural language is not only the most important means of communication between human beings; it has also been used over historical periods for the preservation of cultural achievements and their transmission from one generation to the other. During the last few decades, the flood of digitalized information has been growing tremendously. This tendency will continue with the globalization of information societies and with the growing importance of national and international computer networks. This is one reason why the theoretical understanding and the automated treatment of communication processes based on natural language have such a decisive social and economic impact. In this context, the semantic representation of knowledge originally formulated in natural language plays a central part, because it connects all components of natural language processing systems, be they the automatic understanding of natural language (analysis), rational reasoning over knowledge bases, or the generation of natural language expressions from formal representations. This book presents a method for the semantic representation of natural language expressions (texts, sentences, phrases, etc.) which can be used as a universal knowledge representation paradigm in the human sciences, like linguistics, cognitive psychology, or philosophy of language, as well as in computational linguistics and in artificial intelligence. It is also an attempt to close the gap between these disciplines, which to a large extent are still working separately.
Research writing and its teaching pose a great challenge for novice scholars, especially L2 writers. This book presents a compelling and much-needed automated writing evaluation (AWE) reinforcement to L2 research writing pedagogy.
Vagueness is central to the flexibility and robustness of natural language descriptions. Vague concepts are robust to the imprecision of our perceptions, while still allowing us to convey useful, and sometimes vital, information. The study of vagueness in Artificial Intelligence (AI) is therefore motivated by the desire to incorporate this robustness and flexibility into intelligent computer systems. Such a goal, however, requires a formal model of vague concepts that will allow us to quantify and manipulate the uncertainty resulting from their use as a means of passing information between autonomous agents. This volume outlines a formal representation framework for modelling and reasoning with vague concepts in Artificial Intelligence. The new calculus has many applications, especially in automated reasoning, learning, data analysis and information fusion. This book gives a rigorous introduction to label semantics theory, illustrated with many examples, and suggests clear operational interpretations of the proposed measures. It also provides a detailed description of how the theory can be applied in data analysis and information fusion based on a range of benchmark problems.
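As a rough illustration of the kind of machinery involved, the sketch below implements one standard construction from label semantics theory: appropriateness degrees for a set of linguistic labels, and the consonant mass assignment they induce over sets of labels. The membership functions and the temperature example are invented for illustration, not taken from the book.

```python
# A sketch of one construction from label semantics: appropriateness
# degrees for labels and the consonant mass assignment they induce.
# Membership functions and the temperature example are invented.

def trapezoid(x, a, b, c, d):
    """Appropriateness rising over a..b, flat over b..c, falling over c..d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

labels = {
    "cold": lambda t: trapezoid(t, -40, -40, 5, 12),
    "warm": lambda t: trapezoid(t, 8, 15, 22, 28),
    "hot":  lambda t: trapezoid(t, 24, 30, 50, 50),
}

def mass_assignment(x):
    """Consonant mass assignment over nested sets of appropriate labels."""
    mu = sorted(((lab, f(x)) for lab, f in labels.items()), key=lambda p: -p[1])
    degrees = [d for _, d in mu] + [0.0]
    masses = {frozenset(): 1.0 - degrees[0]}  # mass on "no label applies"
    for k in range(len(mu)):
        m = degrees[k] - degrees[k + 1]
        if m > 0:
            masses[frozenset(lab for lab, _ in mu[:k + 1])] = m
    return masses

print(mass_assignment(25.0))
# ~ {frozenset(): 0.5, {'warm'}: 0.33, {'warm', 'hot'}: 0.17}
```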
The second volume of the two-volume set The Fruits of Empirical Linguistics focuses on the linguistic outcomes of empirical linguistics. The contributions present some of the insights that linguists can gain by applying the new methods: the new evidence accelerates progress in language study because it captures language systems more precisely. Readers will enjoy the fresh perspective on linguistic questions made possible by the evidence-based approach.
'Natural Language Processing in the Real World' is a practical guide to applying data science and machine learning to build Natural Language Processing (NLP) solutions. Where traditional, academically taught NLP is often accompanied by a data source or dataset to aid solution building, this book is situated in the real world, where there may not be an existing rich dataset. The book covers the basic concepts behind NLP and text processing and discusses the applications across 15 industry verticals. From data sources and extraction to transformation and modelling, and from classic Machine Learning to Deep Learning and Transformers, several popular applications of NLP are discussed and implemented. This book provides a hands-on and holistic guide for anyone looking to build NLP solutions, from students of Computer Science to those involved in large-scale industrial projects.
Applying the same approaches to the analysis of spoken and written formulaic language is problematic: doing so masks the fact that the contextual meaning of spoken formulaic language is encoded, to a large extent, in its prosody. In The Prosody of Formulaic Sequences, Phoebe Lin offers a new perspective on formulaic language, arguing that while past research often treats formulaic language as a lexical phenomenon, its phonological aspect is a more fundamental facet. This book draws its conclusions from three original, empirical studies of spoken formulaic language, assessing intonation unit boundaries as well as features such as tempo and stress placement. Across all studies, Lin considers questions of methodology and conceptual framework. The corpus-based descriptions of prosody outlined in this book not only deepen our understanding of the nature of formulaic language but have important implications for English Language Teaching and automatic speech synthesis.
This book proposes the use of multimodal corpora in order to examine spoken discourse more effectively and with greater accuracy. Current corpora are invaluable resources for generating accurate and objective analyses of patterns of language use. However, spoken corpora are effectively mono-modal, presenting data in the same physical medium: text. The reality of a discourse situation is lost in its representation as text. Using multimodal data sets when conducting corpus-based pragmatic analyses is one solution. This book looks at multimodal corpora in some depth, using backchanneling as the conversational feature to be analyzed. It provides a bottom-up study of multimodal corpora: their physical construction and a methodology for the analysis of specific linguistic phenomena across their multiple streams of data. Dawn Knight also looks at possible future directions for the construction and use of multimodal corpora. Furthermore, the collaborative and cooperative nature of backchannels is highlighted, and the book presents an adapted pragmatic-functional linguistic coding matrix for the characterization of backchanneling phenomena; a toy sketch of the multi-stream idea follows below.

Corpus linguistics provides the methodology to extract meaning from discourse. Taking as its starting point the fact that language is not a mirror of reality but lets us share what we know, believe and think about reality, it focuses on language as a social phenomenon, and makes visible the attitudes and beliefs expressed by the members of a discourse community. Consisting of both spoken and written language, discourse always has historical, social, functional, and regional dimensions. Discourse can be monolingual or multilingual, interconnected by translations. Discourse is where language and social studies meet. "The Corpus and Discourse" series consists of two strands. The first, Research in Corpus and Discourse, features innovative contributions to various aspects of corpus linguistics and a wide range of applications, from language technology via the teaching of a second language to a history of mentalities. The second strand, Studies in Corpus and Discourse, comprises key texts bridging the gap between social studies and linguistics. Although equally academically rigorous, this strand is aimed at a wider audience of academics and postgraduate students working in both disciplines.
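To give a flavour of what "multiple streams of data" means in practice, here is a toy, entirely invented sketch of time-aligned speech and gesture streams queried for spoken backchannels that co-occur with a listener's head nod; real multimodal corpora use far richer annotation schemes than this.

```python
# Toy time-aligned annotation streams from a hypothetical multimodal
# corpus, queried for backchannels that co-occur with a head nod.

speech = [  # (start_s, end_s, speaker, utterance)
    (0.0, 1.2, "A", "so i went to the bank"),
    (1.1, 1.3, "B", "mm"),
    (1.3, 2.6, "A", "and they said no"),
    (2.5, 2.8, "B", "yeah"),
]
gesture = [  # (start_s, end_s, participant, event)
    (1.0, 1.4, "B", "head_nod"),
    (2.0, 2.2, "A", "shrug"),
]

def overlaps(x, y):
    """True if two (start, end, ...) intervals overlap in time."""
    return x[0] < y[1] and y[0] < x[1]

backchannels = {"mm", "yeah", "uh-huh", "right"}
for s in speech:
    if s[3] in backchannels:
        nods = [g for g in gesture
                if g[2] == s[2] and g[3] == "head_nod" and overlaps(s, g)]
        print(s[3], "at", s[0], "with nod" if nods else "vocal only")
# mm at 1.1 with nod
# yeah at 2.5 vocal only
```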
Automatic Text Categorization and Clustering are becoming more and more important as the amount of text in electronic format grows and access to it becomes more necessary and widespread. Well-known applications are spam filtering and web search, but a large number of everyday uses exist (intelligent web search, data mining, law enforcement, etc.). Currently, researchers are employing many intelligent techniques for text categorization and clustering, ranging from support vector machines and neural networks to Bayesian inference and algebraic methods, such as Latent Semantic Indexing. This volume offers a wide spectrum of research work developed for intelligent text categorization and clustering. In the following, we give a brief introduction to the chapters included in this book.
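For readers new to the area, the sketch below shows, with toy data and the scikit-learn library (an assumption of this example, not necessarily what the volume's authors use), how two of the techniques just mentioned combine in a standard pipeline: TF-IDF features reduced by Latent Semantic Indexing and classified with a linear support vector machine.

```python
# A brief sketch of two surveyed techniques in one pipeline: an SVM
# text classifier over TF-IDF features, compressed with Latent
# Semantic Indexing (truncated SVD). The data is toy, for shape only.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = ["cheap pills buy now", "meeting agenda attached",
        "win money fast", "quarterly report draft"]
labels = ["spam", "ham", "spam", "ham"]

clf = make_pipeline(
    TfidfVectorizer(),             # bag-of-words with TF-IDF weighting
    TruncatedSVD(n_components=2),  # LSI: project into a low-rank "topic" space
    LinearSVC(),                   # linear support vector classifier
)
clf.fit(docs, labels)
print(clf.predict(["free money pills"]))  # likely ['spam'] on this toy data
```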
Editors Amy Neustein and Judith A. Markowitz have recruited a talented group of contributors to introduce the next generation of natural language technologies to resolve some of the most vexing natural-language problems that compromise the performance of speech systems today. This fourteen-chapter anthology consists of contributions from industry scientists and from academicians working at major universities in North America and Europe. They include researchers who have played a central role in DARPA-funded programs and developers who craft real-world solutions for corporations. This anthology is aimed at speech engineers, system developers, computer scientists, AI researchers, and others interested in utilizing natural-language technology in both spoken and text-based applications.
This is the first comprehensive overview of computational approaches to Arabic morphology. The subtitle aims to reflect that widely different computational approaches to the Arabic morphological system have been proposed. The book provides a showcase of the most advanced language technologies applied to one of the most vexing problems in linguistics. It covers knowledge-based and empirical-based approaches.
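Much of the computational difficulty alluded to stems from Arabic's non-concatenative root-and-pattern morphology. The following toy sketch (transliterated and heavily simplified, not an example from the book) interdigitates a triliteral root into CV templates.

```python
# A tiny sketch of root-and-pattern morphology: interdigitate a
# triliteral root into a CV template. Transliterated and simplified.

def interdigitate(root, pattern):
    """Replace each C slot in the pattern with the next root consonant."""
    out, it = [], iter(root)
    for ch in pattern:
        out.append(next(it) if ch == "C" else ch)
    return "".join(out)

root = ("k", "t", "b")  # the root associated with 'writing'
for pattern in ["CaCaC", "CaaCiC", "maCCuuC"]:
    print(pattern, "->", interdigitate(root, pattern))
# CaCaC -> katab ('wrote'), CaaCiC -> kaatib ('writer'),
# maCCuuC -> maktuub ('written')
```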
The content of this textbook is organized as a theory of language for the construction of talking robots. The main topic is the mechanism of natural language communication in both the speaker and the hearer. In the third edition the author has modernized the text, leaving the overview of traditional, theoretical, and computational linguistics, analytic philosophy of language, and mathematical complexity theory with their historical backgrounds intact. The format of the empirical analyses of English and German syntax and semantics has been adapted to current practice; and Chaps. 22-24 have been rewritten to focus more sharply on the construction of a talking robot.
This book provides a clear and comprehensive description of the Ocotepec/Tapalapa variant of Chiapas Zoque. Zoque is one of the two major branches of the Mixe-Zoquean language family, spoken in the southern part of Mexico. Until the Spanish conquest in the sixteenth century the Mixe-Zoquean languages covered a large area from Veracruz on the Gulf coast to the border of Guatemala and the Pacific coast. Inscriptions in Zoque from the first half of the first millennium AD are the oldest known linguistic documents in Mesoamerica. The Zoquean area once included the entire heartland of the Olmecs, who almost certainly spoke a proto-Zoquean or proto-Mixe-Zoquean language. The Zoques are thus the most likely direct descendants of the oldest known civilization of Mexico. As a result of a long history of close contact, Zoque and Mayan share areal features, and there are lexical borrowings in both directions, but genetically and typologically they are clearly distinct. The Zoque-speaking area has shrunk considerably since pre-colonial times. In 1982 an eruption of the volcano Chichonal destroyed a central part of the Zoque core area and caused a mass migration of Zoque speakers to parts of Mexico where Spanish is the dominant language. This record of an unusual and critically endangered language will be a vital resource for linguists of all theoretical persuasions.
The contributions to The Fruits of Empirical Linguistics. Volume 1: Process reveal why the data-driven approach makes for a research environment which is fast-moving and democratic: technological change has made the sources of linguistic data readily accessible. These contributions show the methods both professional and student linguists are using to gather more evidence more easily than before.
The book specifies a corpus architecture, including annotation and querying techniques, and its implementation. The corpus architecture is developed for empirical studies of translations, and beyond those for the study of texts which are inter-lingually comparable, particularly texts of similar registers. The compiled corpus, CroCo, is a resource for research and is, with some copyright restrictions, accessible to other research projects. Most of the research was undertaken as part of a DFG project investigating the linguistic properties of translations. Fundamentally, this research project was a corpus-based investigation of the language pair English-German. The long-term goal is a contribution to the study of translation as a contact variety, and beyond this to language comparison and language contact more generally, with English-German as the object language pair. This goal implies a thorough interest in possible specific properties of translations, and beyond this in an empirical theory of translation. The methodology developed is not restricted to the traditional, exclusively system-based comparison of earlier days, in which real-text excerpts or constructed examples served as mere illustrations of assumptions and claims. Instead it implements an empirical research strategy involving structured data (the sub-corpora and their relationships to each other, annotated and aligned on various theoretically motivated levels of representation), the formation of hypotheses and their operationalization, statistics on the data, critical examination of their significance, and interpretation against the background of system-based comparisons and other independent sources of explanation for the observed phenomena. Further applications of the resource in computational linguistics are outlined and evaluated.
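As an invented illustration of what such an aligned, multi-layer architecture supports, the sketch below stores an English-German sentence pair with token-alignment and part-of-speech layers, then queries it for one simple operationalization of a translation shift; CroCo's actual formats and query facilities differ.

```python
# A hedged sketch of an aligned, multi-layer corpus record: an
# English-German sentence pair with token alignment and POS layers,
# queried for alignment links whose parts of speech differ.

pairs = [{
    "en": ["She", "likes", "swimming"],
    "de": ["Sie", "schwimmt", "gern"],
    "align": [(0, 0), (1, 2), (2, 1)],  # (en_index, de_index) links
    "en_pos": ["PRON", "VERB", "NOUN"],
    "de_pos": ["PRON", "VERB", "ADV"],
}]

# Query: links with differing POS across languages, one crude
# operationalization of a "translation shift".
for p in pairs:
    for e, d in p["align"]:
        if p["en_pos"][e] != p["de_pos"][d]:
            print(p["en"][e], "->", p["de"][d],
                  f'({p["en_pos"][e]} -> {p["de_pos"][d]})')
# likes -> gern (VERB -> ADV)
# swimming -> schwimmt (NOUN -> VERB)
```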
The 1990s saw a paradigm shift towards corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: far more text is produced daily by native speakers of any given language than is ever translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction had not yet produced a single authoritative source suitable for researchers and students coming to the field. This volume provides such a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.
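As a minimal, invented illustration of working with comparable corpora (toy texts and a hypothetical seed dictionary, not material from the volume), the sketch below pairs documents across languages by lexical overlap after mapping source words through a bilingual seed lexicon.

```python
# Pairing documents across languages in comparable corpora: map source
# words through a small seed dictionary, then compare by cosine
# similarity over TF-IDF vectors. Dictionary and texts are invented.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

seed_dict = {"wirtschaft": "economy", "wahl": "election",
             "regierung": "government"}

de_docs = ["wirtschaft wahl regierung", "wahl regierung"]
en_docs = ["economy and government policy", "election results and government"]

# Map German words into English via the seed dictionary.
mapped = [" ".join(seed_dict.get(w, w) for w in d.split()) for d in de_docs]

vec = TfidfVectorizer().fit(mapped + en_docs)
sim = cosine_similarity(vec.transform(mapped), vec.transform(en_docs))
print(sim)  # row i, column j: similarity of German doc i to English doc j
```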
The practical task of building a talking robot requires a theory of how natural language communication works. Conversely, the best way to computationally verify a theory of natural language communication is to demonstrate its functioning concretely in the form of a talking robot, the epitome of human-machine communication. Building an actual robot requires hardware that provides appropriate recognition and action interfaces, and because such hardware is hard to develop, the approach in this book is theoretical: the author presents an artificial cognitive agent with language as a software system called Database Semantics (DBS). Because a theoretical approach does not have to deal with the technical difficulties of hardware engineering, there is no reason to simplify the system; instead the software components of DBS aim at completeness of function and of data coverage in word-form recognition, syntactic-semantic interpretation and inferencing, leaving the procedural implementation of elementary concepts for later. The author first examines the universals of natural language and explains the Database Semantics approach. In Part I he examines the following natural language communication issues: using external surfaces; the cycle of natural language communication; memory structure; autonomous control; and learning. In Part II he analyzes the coding of content with respect to: semantic relations of structure; simultaneous amalgamation of content; graph-theoretical considerations; computing perspective in dialogue; and computing perspective in text. The book ends with a concluding chapter, a bibliography and an index. It will be of value to researchers, graduate students and engineers in the areas of artificial intelligence and robotics, in particular those who deal with natural language processing.
This book shows ways of augmenting the capabilities of Natural Language Processing (NLP) systems by means of cognitive-mode language processing. The authors employ eye-tracking technology to record and analyze shallow cognitive information in the form of gaze patterns of readers/annotators who perform language processing tasks. The insights gained from such measures are subsequently translated into systems that help us (1) assess the actual cognitive load in text annotation, with resulting increase in human text-annotation efficiency, and (2) extract cognitive features that, when added to traditional features, can improve the accuracy of text classifiers. In sum, the authors' work successfully demonstrates that cognitive information gleaned from human eye-movement data can benefit modern NLP. Currently available Natural Language Processing (NLP) systems are weak AI systems: they seek to capture the functionality of human language processing, without worrying about how this processing is realized in human beings' hardware. In other words, these systems are oblivious to the actual cognitive processes involved in human language processing. This ignorance, however, is NOT bliss! The accuracy figures of all non-toy NLP systems saturate beyond a certain point, making it abundantly clear that "something different should be done."
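The general recipe behind point (2) can be sketched as follows, with invented texts, labels and gaze measurements standing in for real eye-tracking data: cognitive features derived from gaze are simply concatenated with traditional textual features before training a classifier.

```python
# A hedged sketch of augmenting textual features with reader gaze
# features before classification. Feature names and values are
# invented placeholders for real eye-tracking measurements.

import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["the plot was gripping", "a dull, plodding film",
         "wonderfully acted", "barely watchable"]
y = [1, 0, 1, 0]  # invented sentiment labels

# Hypothetical per-text gaze measures: mean fixation duration (ms)
# and number of regressions, as an eye-tracker might report them.
gaze = np.array([[210, 1], [305, 4], [198, 0], [330, 5]], dtype=float)

X_text = CountVectorizer().fit_transform(texts)  # traditional features
X = hstack([X_text, csr_matrix(gaze / gaze.max(axis=0))]).tocsr()

clf = LogisticRegression().fit(X, y)
print(clf.predict(X))  # recovers the toy labels; real gains need real data
```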
Text classification is becoming a crucial task to analysts in different areas. In the last few decades, the production of textual documents in digital form has increased exponentially. Their applications range from web pages to scientific documents, including emails, news and books. Despite the widespread use of digital texts, handling them is inherently difficult - the large amount of data necessary to represent them and the subjectivity of classification complicate matters. This book gives a concise view on how to use kernel approaches for inductive inference in large scale text classification; it presents a series of new techniques to enhance, scale and distribute text classification tasks. It is not intended to be a comprehensive survey of the state-of-the-art of the whole field of text classification. Its purpose is less ambitious and more practical: to explain and illustrate some of the important methods used in this field, in particular kernel approaches and techniques.
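To illustrate the kernel idea in its simplest form (a generic example, not one of the book's own techniques), the sketch below classifies strings with an SVM over a precomputed similarity kernel, here the number of shared character 3-grams, which is a valid kernel because it is the inner product of n-gram indicator vectors.

```python
# An SVM over a precomputed string-similarity kernel: the kernel value
# is the count of character 3-grams two strings share. Illustrative only.

import numpy as np
from sklearn.svm import SVC

def ngrams(s, n=3):
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def kernel(a, b):
    """Shared 3-gram count = inner product of indicator vectors, hence PSD."""
    return float(len(ngrams(a) & ngrams(b)))

train = ["cheap pills now", "cheap meds now",
         "see agenda attached", "agenda for monday"]
y = [0, 0, 1, 1]

G = np.array([[kernel(a, b) for b in train] for a in train])  # Gram matrix
clf = SVC(kernel="precomputed").fit(G, y)

test = ["buy cheap pills"]
G_test = np.array([[kernel(t, b) for b in train] for t in test])
print(clf.predict(G_test))  # likely [0] on this toy data
```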
This book presents a theoretical study on aspect in Chinese, including both situation and viewpoint aspects. Unlike previous studies, which have largely classified linguistic units into different situation types, this study defines a set of ontological event types that are conceptually universal and on the basis of which different languages employ various linguistic devices to describe such events. To do so, it focuses on a particular component of events, namely the viewpoint aspect. It includes and discusses a wealth of examples to show how such ontological events are realized in Chinese. In addition, the study discusses how Chinese modal verbs and adverbs affect the distribution of viewpoint aspects associated with certain situation types. In turn, the book demonstrates how the proposed linguistic theory can be used in a computational context. Simply identifying events in terms of the verbs and their arguments is insufficient for real situations such as understanding the factivity and the logical/temporal relations between events. The proposed framework offers the possibility of analyzing events in Chinese text, yielding deep semantic information.
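As a crude, invented illustration of the viewpoint-aspect component: Chinese marks viewpoint with overt particles, so a first approximation can be read off the surface string. The marker inventory below is the standard textbook one; the mapping is a deliberate simplification (ignoring, among much else, that 在 also serves as a preposition).

```python
# A toy sketch of the viewpoint-aspect distinction: classify a Chinese
# clause's viewpoint from overt aspect markers. Deliberately simplified.

VIEWPOINT_MARKERS = {
    "了": "perfective",
    "过": "experiential (perfective)",
    "在": "progressive (imperfective)",
    "正在": "progressive (imperfective)",
    "着": "durative (imperfective)",
}

def viewpoint(tokens):
    """Return the viewpoint signalled by the first aspect marker found."""
    for t in tokens:
        if t in VIEWPOINT_MARKERS:
            return VIEWPOINT_MARKERS[t]
    return "unmarked"

print(viewpoint(["他", "吃", "了", "饭"]))  # perfective ('he ate')
print(viewpoint(["他", "在", "吃", "饭"]))  # progressive ('he is eating')
```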
This book constitutes the thoroughly refereed proceedings of the Eleventh International Symposium on Natural Language Processing (SNLP-2016), held in Phranakhon Si Ayutthaya, Thailand on February 10-12, 2016. The SNLP promotes research in natural language processing and related fields, and provides a unique opportunity for researchers, professionals and practitioners to discuss various current and advanced issues of interest in NLP. The 2016 symposium was expanded to include the First Workshop in Intelligent Informatics and Smart Technology. Of the 66 high-quality papers accepted, this book presents twelve from the Symposium on Natural Language Processing track and ten from the Workshop in Intelligent Informatics and Smart Technology track (SSAI: Special Session on Artificial Intelligence).
This book offers a timely report on key theories and applications of soft-computing. Written in honour of Professor Gaspar Mayor on his 70th birthday, it primarily focuses on areas related to his research, including fuzzy binary operators, aggregation functions, multi-distances, and fuzzy consensus/decision models. It also discusses a number of interesting applications such as the implementation of fuzzy mathematical morphology based on Mayor-Torrens t-norms. Importantly, the different chapters, authored by leading experts, present novel results and offer new perspectives on different aspects of Mayor's research. The book also includes an overview of evolutionary fuzzy systems, a topic that is not one of Mayor's main areas of interest, and a final chapter written by the Spanish pioneer in fuzzy logic, Professor E. Trillas. Computer and decision scientists, knowledge engineers and mathematicians alike will find here an authoritative overview of key soft-computing concepts and techniques.
The general markup language XML has played an outstanding role in the multiple ways of processing electronic documents, XML being used either in the design of interface structures or as a formal framework for the representation of structure- or content-related properties of documents. This book in its 13 chapters discusses aspects of XML-based linguistic information modeling combining: methodological issues, especially with respect to text-related information modeling, application-oriented research and issues of formal foundations. The contributions in this book are based on current research in Text Technology, Computational Linguistics and in the international domain of evolving standards for language resources. Recurrent themes in this book are markup languages, explored from different points of view, and topics of text-related information modeling. These topics have been core areas of the research unit "Text-technological Information Modeling" (www.text-technology.de), funded from 2002 to 2009 by the German Research Foundation (DFG). Positions developed in this book could also benefit from the presentations and discussion at the conference "Modelling Linguistic Information Resources" at the Center for Interdisciplinary Research (Zentrum für interdisziplinäre Forschung, ZiF) at Bielefeld, a center for advanced studies known for its international and interdisciplinary meetings and research. The editors would like to thank the DFG and ZiF for their financial support, the publisher, the series editors, the reviewers and those people that helped to prepare the manuscript, especially Carolin Kram, Nils Diewald, Jens Stegmann and Peter M. Fischer, and last but not least, all of the authors.