The two-volume set, LNCS 13249 and 13250, constitutes the thoroughly refereed post-workshop proceedings of the 22nd Chinese Lexical Semantics Workshop, CLSW 2021, held in Nanjing, China, in May 2021. The 68 full papers and 4 short papers were carefully reviewed and selected from 261 submissions. They are organized in the following topical sections: Lexical Semantics and General Linguistics; Natural Language Processing and Language Computing; Cognitive Science and Experimental Studies; Lexical Resources and Corpus Linguistics.
Recent decades have seen a fundamental transformation in the commercialisation and popularisation of sports and sporting events. Corpus Approaches to the Language of Sports uses corpus resources to offer new perspectives on the language and discourse of this increasingly popular and culturally significant area of research. Bringing together a range of empirical studies from leading scholars, this book bridges the gap between quantitative corpus approaches and more qualitative, multimodal discourse methods. Covering a wide range of sports, including football, cycling and basketball, the linguistic aspects of sports language are analysed across different genres and contexts. Highlighting the importance of studying the language of sports alongside its accompanying audio-visual modes of communication, chapters draw on new digitised collections of language to fully describe and understand the complexities of communication through various channels. In doing so, Corpus Approaches to the Language of Sports not only offers exciting new insights into the language of sports but also extends the scope of corpus linguistics beyond traditional monomodal approaches to put multimodality firmly on the agenda.
This book presents a theoretical study on aspect in Chinese, including both situation and viewpoint aspects. Unlike previous studies, which have largely classified linguistic units into different situation types, this study defines a set of ontological event types that are conceptually universal and on the basis of which different languages employ various linguistic devices to describe such events. To do so, it focuses on a particular component of events, namely the viewpoint aspect. It includes and discusses a wealth of examples to show how such ontological events are realized in Chinese. In addition, the study discusses how Chinese modal verbs and adverbs affect the distribution of viewpoint aspects associated with certain situation types. In turn, the book demonstrates how the proposed linguistic theory can be used in a computational context. Simply identifying events in terms of the verbs and their arguments is insufficient for real situations such as understanding the factivity and the logical/temporal relations between events. The proposed framework offers the possibility of analyzing events in Chinese text, yielding deep semantic information.
Automating Linguistics offers an in-depth study of the history of the mathematisation and automation of the sciences of language. In the wake of the first mathematisation of the 1930s, two waves followed: machine translation in the 1950s and the development of computational linguistics and natural language processing in the 1960s. These waves proved pivotal for the later work on large computerised corpora in the 1990s and the unprecedented technological development of computers and software. Early machine translation was devised as a war technology originating in the sciences of war, amidst the amalgam of mathematics, physics, logic, neurosciences, acoustics, and emerging sciences such as cybernetics and information theory. Machine translation was intended to provide mass translations for strategic purposes during the Cold War. Linguistics, in turn, did not belong to the sciences of war, and played a minor role in the pioneering projects of machine translation. Comparing the two trends, the present book reveals how the sciences of language gradually integrated the technologies of computing and software, resulting in the second-wave mathematisation of the study of language, which may be called mathematisation-automation. The integration took on various shapes contingent upon cultural and linguistic traditions (USA, ex-USSR, Great Britain and France). By contrast, working with large corpora in the 1990s, though enabled by unprecedented development of computing and software, was primarily a continuation of traditional approaches in the sciences of language, such as the study of spoken and written texts, lexicography, and statistical studies of vocabulary.
This book covers theoretical work, applications, approaches, and techniques for computational models of information and its presentation by language (artificial, human, or otherwise natural). Computational and technological developments that incorporate natural language are proliferating. Adequate coverage encounters difficult problems related to ambiguities and to dependency on context and agents (human or computational). The goal is to promote computational systems of intelligent natural language processing and related models of computation, language, thought, mental states, reasoning, and other cognitive processes.
This SpringerBrief presents the data-information-and-time (DIT) model, which precisely clarifies the semantics behind the terms data and information and their relation to the passage of real time. According to the DIT model, a data item is a symbol that appears as a pattern (e.g., a visual, sound, gesture, or any bit pattern) in physical space. It is generated by a human or a machine in the current contextual situation and is linked to a concept in the human mind or to a set of operations of a machine. An information item delivers the sense or idea that a human mind extracts from a given natural language proposition that contains meaningful data items. Since the given tangible, intangible, and temporal context is part of the explanation of a data item, a change of context can affect the meaning of data and the sense of a proposition. The DIT model provides a framework to show how the flow of time can change the truth-value of a proposition. The book compares our notions of data, information, and time in differing contexts: in human communication, in the operation of a computer system, and in a biological system. In the final section, a few simple examples demonstrate how the lessons learned from the DIT model can help improve the design of a computer system.
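As a toy illustration of that claim (invented here, not the book's formal model), a proposition whose meaning depends on temporal context can flip its truth-value as real time passes:

```python
# Toy illustration: the truth-value of a time-dependent proposition
# changes as the evaluation time moves, even though its wording does not.
from datetime import datetime, timedelta

departure = datetime(2024, 5, 1, 12, 0)  # hypothetical train departure

def train_departs_within_ten_minutes(now):
    """Proposition: 'the train departs within ten minutes', evaluated at `now`."""
    return timedelta(0) <= departure - now <= timedelta(minutes=10)

print(train_departs_within_ten_minutes(datetime(2024, 5, 1, 11, 55)))  # True
print(train_departs_within_ten_minutes(datetime(2024, 5, 1, 11, 30)))  # False
```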
This handbook is a comprehensive practical resource on corpus linguistics. It features a range of basic and advanced approaches, methods and techniques in corpus linguistics, from corpus compilation principles to quantitative data analyses. The Handbook is organized in six parts. Parts I to III feature chapters that discuss key issues and know-how related to various topics around corpus design, methods and corpus types. Parts IV and V offer a user-friendly introduction to the quantitative analysis of corpus data: for each statistical technique discussed, chapters provide a practical guide with R and come with supplementary online material. Part VI focuses on how to write a corpus linguistic paper and how to meta-analyze corpus linguistic research. The volume can serve as a course book as well as a resource for individual study, and will be essential reading for students of corpus linguistics as well as experienced researchers who want to expand their knowledge of the field.
Memory-based language processing - a machine-learning and problem-solving method for language technology - is based on the idea that the direct reuse of examples, using analogical reasoning, is more suited to solving language processing problems than the application of rules extracted from those examples. This book discusses the theory and practice of memory-based language processing, showing its comparative strengths over alternative methods of language modelling. Language is complex, with few generalizations and many sub-regularities and exceptions, and the advantage of memory-based language processing is that it does not abstract away from this valuable low-frequency information. By applying the model to a range of benchmark problems, the authors show that for linguistic areas ranging from phonology to semantics it produces excellent results. They also describe TiMBL, a software package for memory-based language processing. The first comprehensive overview of the approach, this book will be invaluable for computational linguists, psycholinguists and language engineers.
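To make the idea concrete, here is a minimal sketch, in Python, of the nearest-neighbour reuse of stored examples that memory-based processing builds on; the toy task, features, and data are invented for illustration, and TiMBL itself provides much richer distance metrics and feature weighting.

```python
# Minimal sketch of memory-based classification: keep all training
# examples in memory and label new items by analogy to the nearest
# stored neighbours (no abstraction, so low-frequency exceptions survive).
from collections import Counter

def overlap_distance(a, b):
    """Count mismatching feature values between two instances."""
    return sum(x != y for x, y in zip(a, b))

def classify(memory, instance, k=1):
    """Majority class among the k nearest stored examples."""
    nearest = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy task (invented): predict a German plural suffix from the last two
# letters of the singular noun.
memory = [
    (("e", "r"), ""),     # Lehrer -> Lehrer
    (("n", "g"), "en"),   # Zeitung -> Zeitungen
    (("e", "l"), ""),     # Loeffel -> Loeffel
    (("i", "n"), "nen"),  # Lehrerin -> Lehrerinnen
]
print(classify(memory, ("n", "g")))  # -> "en"
```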
Labelling data is one of the most fundamental activities in science, and has underpinned practice, particularly in medicine, for decades, as well as research in corpus linguistics since at least the development of the Brown corpus. With the shift towards Machine Learning in Artificial Intelligence (AI), the creation of datasets for training and evaluating AI systems, also known in AI as corpora, has become a central activity in the field as well. Early AI datasets were created on an ad-hoc basis to tackle specific problems. As larger and more reusable datasets were created, requiring greater investment, the need for a more systematic approach to dataset creation arose to ensure increased quality. A range of statistical methods was adopted, often but not exclusively from the medical sciences, to ensure that the labels used were not subjective, or to choose among different labels provided by the coders. A wide variety of such methods is now in regular use. This book provides a survey of the statistical methods most widely used to support annotation practice. As far as the authors know, it is the first book that attempts to cover both families of methods in wide use. The first family is concerned with the development of labelling schemes and, in particular, with ensuring that such schemes allow sufficient agreement to be observed among the coders. The second family includes methods developed to analyze the output of coders once the scheme has been agreed upon, particularly although not exclusively to identify the most likely label for an item among those provided by the coders. The focus of this book is primarily on Natural Language Processing, the area of AI devoted to the development of models of language interpretation and production, but many if not most of the methods discussed here are also applicable to other areas of AI, or indeed to other areas of Data Science.
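To ground the first family, here is a minimal sketch, in Python, of one widely used agreement coefficient, Cohen's kappa, for two coders labelling the same items; the labels below are invented for illustration and are not from the book.

```python
# Cohen's kappa for two coders: observed agreement corrected for the
# agreement expected if both coders labelled independently by chance.
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: probability both coders pick the same label,
    # estimated from each coder's own label distribution.
    dist_a, dist_b = Counter(labels_a), Counter(labels_b)
    expected = sum((dist_a[l] / n) * (dist_b[l] / n) for l in dist_a)
    return (observed - expected) / (1 - expected)

coder1 = ["pos", "neg", "pos", "pos", "neg", "neutral"]
coder2 = ["pos", "neg", "neg", "pos", "neg", "neutral"]
print(round(cohen_kappa(coder1, coder2), 3))  # -> 0.739
```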
This book sheds new light on corpus-assisted translation pedagogy, an intersection of three distinct but cognate disciplines: corpus linguistics, translation and pedagogy. Taking an innovative and empirical approach to translation teaching, the study utilizes mixed methods, including translation experiments, surveys and in-depth focus groups. The results demonstrate the unique advantages of using corpora for translation teaching while also calling attention to possible pitfalls. The book enriches our understanding of corpus application in the setting of translation between Chinese and English, two languages which are distinctly different from one another, and opens new horizons in this burgeoning and interdisciplinary field of research. It appeals to a broad readership: from scholars and researchers interested in translation technology as a way to widen the scope of translation studies, to translation trainers in search of effective teaching approaches, to a growing number of cross-disciplinary postgraduate students longing to improve their translation skills and competence.
This book discusses the state of the art of automated essay scoring, its challenges and its potential. One of the earliest applications of artificial intelligence to language data (along with machine translation and speech recognition), automated essay scoring has evolved to become both a revenue-generating industry and a vast field of research, with many subfields and connections to other NLP tasks. In this book, we review the developments in this field against the backdrop of Ellis Page's seminal 1966 paper titled "The Imminence of Grading Essays by Computer." Part 1 establishes what automated essay scoring is about, why it exists, where the technology stands, and what some of the main issues are. In Part 2, the book presents guided exercises to illustrate how one would go about building and evaluating a simple automated scoring system, while Part 3 offers readers a survey of the literature on different types of scoring models, the aspects of essay quality studied in prior research, and the implementation and evaluation of a scoring engine. Part 4 offers a broader view of the field, inclusive of some neighboring areas, and Part 5 closes with a summary and discussion. This book grew out of a week-long course on automated evaluation of language production at the North American Summer School for Logic, Language, and Information (NASSLLI), attended by advanced undergraduates and early-stage graduate students from a variety of disciplines. Teachers of natural language processing, in particular, will find that the book offers a useful foundation for a supplemental module on automated scoring. Professionals and students in linguistics, applied linguistics, educational technology, and other related disciplines will also find the material here useful.
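In the spirit of the guided exercises in Part 2, a minimal sketch of a feature-based scorer follows; the essays, features, and tiny gradient-descent fit are invented stand-ins, not the book's own exercises or data.

```python
# Minimal sketch of a feature-based essay scorer: extract surface
# features, fit a linear model to human scores, then score a new essay.
def features(essay):
    words = essay.split()
    avg_word_len = sum(map(len, words)) / len(words)
    return [1.0, float(len(words)), avg_word_len]  # bias, length, word length

def fit_least_squares(X, y, lr=1e-4, epochs=20000):
    """Tiny stochastic gradient descent, to stay dependency-free."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w

train = [
    ("Short essay .", 1.0),
    ("A somewhat longer essay with more detail .", 3.0),
    ("A considerably longer essay developing an argument in several steps .", 5.0),
]
w = fit_least_squares([features(e) for e, _ in train], [s for _, s in train])
test = "Another essay of moderate length with some detail ."
print(sum(wj * xj for wj, xj in zip(w, features(test))))  # predicted score
```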
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing (NLP) applications. This book provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example. The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in NLP, information retrieval (IR), and beyond. The book offers a synthesis of existing work as a single point of entry for practitioners who wish to gain a better understanding of how to apply transformers to text ranking problems and for researchers who wish to pursue work in this area. It covers a wide range of modern techniques, grouped into two high-level categories: transformer models that perform reranking in multi-stage architectures, and dense retrieval techniques that perform ranking directly. Two themes pervade the book: techniques for handling long documents, beyond the typical sentence-by-sentence processing in NLP, and techniques for addressing the tradeoff between effectiveness (i.e., result quality) and efficiency (e.g., query latency, model and index size). Although transformer architectures and pretraining techniques are recent innovations, many aspects of how they are applied to text ranking are relatively well understood and represent mature techniques. However, many open research questions remain, and thus in addition to laying out the foundations of pretrained transformers for text ranking, this book also attempts to prognosticate where the field is heading.
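As a concrete illustration of the two high-level categories, the following hedged sketch uses the sentence-transformers library (assumed installed; the checkpoint names are public examples, not models prescribed by the book): dense retrieval ranks directly by embedding similarity, and a cross-encoder then rescores the shortlist, as in multi-stage reranking architectures.

```python
# Sketch of dense retrieval followed by cross-encoder reranking.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

query = "how do transformers rank text?"
docs = [
    "BERT can rerank candidate passages.",
    "Dense retrieval encodes queries and passages into one vector space.",
    "Cats sleep most of the day.",
]

# Stage 1, dense retrieval: rank by cosine similarity of learned embeddings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
sims = util.cos_sim(encoder.encode(query, convert_to_tensor=True),
                    encoder.encode(docs, convert_to_tensor=True))[0]
candidates = sorted(zip(docs, sims.tolist()), key=lambda p: -p[1])

# Stage 2, reranking: an expensive cross-encoder rescores query-doc pairs.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, d) for d, _ in candidates])
for (d, _), s in sorted(zip(candidates, scores), key=lambda p: -p[1]):
    print(f"{s:6.2f}  {d}")
```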
This book explores some of the ethical, legal, and social implications of chatbots, or conversational artificial agents. It reviews the possibility of establishing meaningful social relationships with chatbots and investigates the consequences of those relationships for contemporary debates in the philosophy of Artificial Intelligence. The author introduces current technological challenges of AI and discusses how technological progress and social change influence our understanding of social relationships. He then argues that chatbots introduce epistemic uncertainty into human social discourse, but that this can be ameliorated by introducing a new ontological classification or 'status' for chatbots. This step forward would allow humans to reap the benefits of this technological development, without the attendant losses. Finally, the author considers the consequences of chatbots on human-human relationships, providing analysis on robot rights, human-centered design, and the social tension between robophobes and robophiles.
This book describes effective methods for automatically analyzing a sentence, based on the syntactic and semantic characteristics of its elements. To tackle ambiguities, the authors use selectional preferences (SP), which measure how well two words fit together semantically in a sentence. Today, many disciplines require automatic text analysis based on the syntactic and semantic characteristics of language, and several techniques for parsing sentences have been proposed. Which is better? In this book the authors begin with simple heuristics before moving on to more complex methods that identify nouns and verbs and then aggregate modifiers, and lastly discuss methods that can handle complex subordinate and relative clauses. During this process, several ambiguities arise. SP are commonly determined on the basis of the association between a pair of words; in many cases, however, SP depend on more than two words. For example, something (such as grass) may be edible depending on who is eating it (a cow?). Moreover, things such as popcorn are usually eaten at the movies, not in a restaurant. The authors deal with these phenomena from different points of view.
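A minimal sketch of a pairwise selectional-preference score follows: pointwise mutual information (PMI) between a verb and its object noun, estimated from co-occurrence counts. The counts are toy stand-ins, and the book's point is precisely that such pairwise scores miss the wider context discussed above.

```python
# PMI as a pairwise selectional-preference score: how much more often a
# (verb, noun) pair co-occurs than expected if verb and noun were independent.
import math
from collections import Counter

pairs = ([("eat", "grass")] * 2 + [("eat", "popcorn")] * 5 +
         [("mow", "grass")] * 6 + [("watch", "movie")] * 7)

pair_c = Counter(pairs)
verb_c = Counter(v for v, _ in pairs)
noun_c = Counter(n for _, n in pairs)
total = len(pairs)

def pmi(verb, noun):
    p_vn = pair_c[(verb, noun)] / total
    return math.log2(p_vn / ((verb_c[verb] / total) * (noun_c[noun] / total)))

print(round(pmi("eat", "popcorn"), 2))  # positive: fits well in this toy corpus
print(round(pmi("eat", "grass"), 2))    # negative: fits less well here
```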
This book presents a taxonomy framework and survey of methods relevant to explaining the decisions and analyzing the inner workings of Natural Language Processing (NLP) models. It is intended to provide a snapshot of Explainable NLP, though the field continues to grow rapidly, and to be both readable by first-year M.Sc. students and interesting to an expert audience. The book opens by motivating a focus on providing a consistent taxonomy, pointing out inconsistencies and redundancies in previous taxonomies. It goes on to present (i) a taxonomy or framework for thinking about how approaches to explainable NLP relate to one another; (ii) brief surveys of each of the classes in the taxonomy, with a focus on methods that are relevant for NLP; and (iii) a discussion of the inherent limitations of some classes of methods, as well as how best to evaluate them. Finally, the book closes by providing a list of resources for further research on explainability.
The book presents current research and developments in multilingual speech recognition. The author presents a Multilingual Phone Recognition System (Multi-PRS), developed using a common multilingual phone set derived from the International Phonetic Alphabet (IPA) based transcriptions of six Indian languages: Kannada, Telugu, Bengali, Odia, Urdu, and Assamese. The author shows how the performance of Multi-PRS can be improved using tandem features, and compares monolingual phone recognition systems (Mono-PRS) with Multi-PRS, as well as baseline with tandem systems. Methods are proposed to predict Articulatory Features (AFs) from spectral features using Deep Neural Networks (DNN), and multitask learning is explored to improve the prediction accuracy of AFs. The AFs are then used to improve the performance of Multi-PRS, both through lattice rescoring and through tandem combination. The author goes on to develop and evaluate the Language Identification followed by Monolingual phone recognition (LID-Mono) system and the common multilingual phone-set based multilingual phone recognition system.
This book constitutes the refereed proceedings of the 14th International Conference on Formal Grammar 2009, held in Bordeaux, France, in July 2009.
This book deals with two fundamental issues in the semiotics of the image. The first is the relationship between image and observer: how does one look at an image? To answer this question, the book sets out to transpose the theory of enunciation formulated in linguistics to the visual field. It also aims to clarify the gains made in contemporary visual semiotics relative to the semiology of Roland Barthes and Emile Benveniste. The second issue is the relation between the forces, forms and materiality of images. How do different physical media (pictorial, photographic and digital) influence visual forms? How does materiality affect the generativity of forms? On the forces within images, the book draws on the philosophical thought of Gilles Deleuze and Rene Thom as well as Aby Warburg's Atlas Mnemosyne experiment. The theories discussed are tested on a variety of corpora, including both paintings and photographs, taken from traditional as well as contemporary sources in a variety of social sectors (arts and sciences). Finally, semiotic methodology is contrasted with the computational analysis of large collections of images (Big Data), such as the "Media Visualization" analyses proposed by Lev Manovich and Cultural Analytics in the field of Computer Science, in order to evaluate the impact of automatic analysis of visual forms on Digital Art History and, more generally, on the image sciences.
This book presents the concept of the double hierarchy linguistic term set and its extensions, which can deal with dynamic and complex decision-making problems. With the rapid development of science and technology and the accelerating pace of information updating, the complexity of decision-making problems has become increasingly obvious. The book provides a comprehensive and systematic introduction to the latest research in the field, including measurement methods, consistency methods, group consensus and large-scale group consensus decision-making methods, as well as their practical applications. Intended for engineers, technicians, and researchers in the fields of computational linguistics, operations research, information science, and management science and engineering, it also serves as a textbook for postgraduate and senior undergraduate students.
This book explores novel aspects of social robotics, spoken dialogue systems, human-robot interaction, spoken language understanding, multimodal communication, and system evaluation. It offers a variety of perspectives on and solutions to the most important questions about advanced techniques for social robots and chat systems. Chapters by leading researchers address key research and development topics in the field of spoken dialogue systems, focusing in particular on three special themes: dialogue state tracking, evaluation of human-robot dialogue in social robotics, and socio-cognitive language processing. The book offers a valuable resource for researchers and practitioners in both academia and industry whose work involves advanced interaction technology and who are seeking an up-to-date overview of the key topics. It also provides supplementary educational material for courses on state-of-the-art dialogue system technologies, social robotics, and related research fields.
This book explains speech enhancement in the Fractional Fourier Transform (FRFT) domain, which has proven to be an ideal time-frequency analysis tool in many speech signal processing applications, and investigates the use of different FRFT algorithms in both single-channel and multi-channel enhancement systems. The authors discuss the complexities involved in processing highly non-stationary signals and the concepts of FRFT for speech enhancement applications. The book explains the fundamentals of FRFT as well as its implementation in speech enhancement, and discusses the theories behind different FRFT methods. It introduces readers to the new fractional domains, preparing them to develop new algorithms, and includes a comprehensive literature survey of the topic.
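For orientation, the continuous fractional Fourier transform of order \alpha is commonly defined as follows; this is the standard textbook definition, not a formula quoted from this book.

```latex
X_\alpha(u) = \int_{-\infty}^{\infty} x(t)\, K_\alpha(t,u)\, dt, \qquad
K_\alpha(t,u) = \sqrt{\frac{1 - i\cot\alpha}{2\pi}}\,
\exp\!\Big( i\,\tfrac{t^2+u^2}{2}\cot\alpha - i\,t u \csc\alpha \Big)
```

valid for \alpha not an integer multiple of \pi; \alpha = \pi/2 recovers the ordinary Fourier transform, and intermediate orders rotate the signal representation in the time-frequency plane, which is what makes the FRFT attractive for non-stationary signals such as speech.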
This book constitutes the proceedings of the 14th International Conference on Computational Processing of the Portuguese Language, PROPOR 2020, held in Évora, Portugal, in March 2020. The 36 full papers presented together with 5 short papers were carefully reviewed and selected from 70 submissions. They are grouped in topical sections on speech processing; resources and evaluation; natural language processing applications; semantics; natural language processing tasks; and multilinguality.
Weighted finite-state transducers (WFSTs) are commonly used by engineers and computational linguists for processing and generating speech and text. This book first provides a detailed introduction to this formalism. It then introduces Pynini, a Python library for compiling finite-state grammars and for combining, optimizing, applying, and searching finite-state transducers. This book illustrates this library's conventions and use with a series of case studies. These include the compilation and application of context-dependent rewrite rules, the construction of morphological analyzers and generators, and text generation and processing applications.
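To illustrate the library's conventions, here is a minimal sketch of a context-dependent rewrite rule in Pynini; the toy assimilation rule (n becomes m before p) is invented for illustration, not taken from the book's case studies.

```python
# Compile and apply a context-dependent rewrite rule as a WFST.
import pynini
from pynini.lib import rewrite

# The closed alphabet over which the rule applies.
sigma_star = pynini.union(*"abcdefghijklmnopqrstuvwxyz ").closure()

rule = pynini.cdrewrite(
    pynini.cross("n", "m"),  # rewrite n as m ...
    "", "p",                 # ... with anything on the left and p on the right
    sigma_star)              # ... over this alphabet

print(rewrite.top_rewrite("input", rule))  # -> "imput"
```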
"Opportunity and Curiosity find similar rocks on Mars." One can generally understand this statement if one knows that Opportunity and Curiosity are instances of the class of Mars rovers, and recognizes that, as signalled by the word "on", rocks are located on Mars. Two mental operations contribute to understanding: recognizing how entities/concepts mentioned in a text interact, and recalling already known facts (which often themselves consist of relations between entities/concepts). Concept interactions one identifies in the text can be added to the repository of known facts, and aid the processing of future texts. The amassed knowledge can assist many advanced language-processing tasks, including summarization, question answering and machine translation. Semantic relations are the connections we perceive between things which interact. The book explores two now intertwined threads in semantic relations: how they are expressed in texts and what role they play in knowledge repositories. A historical perspective takes us back more than 2000 years to their beginnings, and then to developments much closer to our time: various attempts at producing lists of semantic relations, necessary and sufficient to express the interaction between entities/concepts. A look at relations outside context, then in general texts, and then in texts in specialized domains, has gradually brought new insights, and led to essential adjustments in how the relations are seen. At the same time, datasets which encompass these phenomena have become available. They started small, then grew somewhat, then became truly large. The large resources are inevitably noisy because they are constructed automatically. The available corpora, whether analyzed directly or used to gather relational evidence, have also grown, and some systems now operate at Web scale. The learning of semantic relations has proceeded in parallel, in adherence to supervised, unsupervised or distantly supervised paradigms. Detailed analyses of annotated datasets in supervised learning have granted insights useful in developing unsupervised and distantly supervised methods. These in turn have contributed to the understanding of what relations are and how to find them, and that has led to methods scalable to Web-sized textual data. The size and redundancy of information in very large corpora, which at first seemed problematic, have been harnessed to improve the process of relation extraction/learning. The newest technology, deep learning, supplies innovative and surprising solutions to a variety of problems in relation learning. This book aims to paint a big picture and to offer interesting details.
What is the lexicon, what does it contain, and how is it structured? What principles determine the functioning of the lexicon as a component of natural language grammar? What role does lexical information play in linguistic theory? This accessible introduction aims to answer these questions, and explores the relation of the lexicon to grammar as a whole. It includes a critical overview of major theoretical frameworks, and puts forward a unified treatment of lexical structure and design. The text can be used for introductory and advanced courses, and for courses that touch upon different aspects of the lexicon, such as lexical semantics, lexicography, syntax, general linguistics, computational lexicology and ontology design. The book provides students with a set of tools which will enable them to work with lexical data for all kinds of purposes, including an abundance of exercises and in-class activities designed to ensure that students are actively engaged with the content and effectively acquire the knowledge and skills they need.
You may like...
- Bioinformatics: Principles and Analysis, Gretchen Kenney (Hardcover)
- Cybercrime and Jurisdiction - A global…, Bert-Jaap Koops, Susan W. Brenner (Hardcover, R1,558)
- Introduction To Legal Pluralism In South…, C. Rautenbach (Paperback)