The development of smaller and more powerful computers and the introduction of new communication channels through the interlinking of computers via the Internet and the World Wide Web have brought great changes to linguistics. They affect the methods of the various disciplines of pure linguistics as well as the tools and practices of applied linguistics, such as translation and interpretation, language teaching, learning, and testing. This volume presents general reflections and overview articles on these new developments by noted experts, followed by reports on the concrete uses of information technologies for linguistic purposes in different European countries and at the European Parliament. A discussion of another important linguistic issue is added: the various uses of the highly symbolic term 'national language'.
This volume explores multiple dimensions of openness in ICT-enhanced education. The chapters, contributed by researchers and academic teachers, present a number of exemplary solutions in the area. They cover the use of open-source software, innovative technologies, and teaching/learning methods and techniques, and examine the potential benefits for both teachers' and students' cognitive, behavioural and metacognitive development.
Users of natural languages have many word orders with which to encode the same truth-conditional meaning. They choose contextually appropriate strings from these many options with little conscious effort and with effective communicative results. Previous computational models of when English speakers produce non-canonical word orders, like topicalization, left-dislocation, and clefts, fail either by overgenerating these statistically rare forms or by undergenerating them. The primary goal of this book is to present a better model of when speakers choose to produce certain non-canonical word orders by incorporating the effects of discourse context and speaker goals on syntactic choice. The theoretical model is then used as a basis for building a probabilistic classifier that can select the most human-like word order based on the surrounding discourse context. The model of discourse context used is a methodological advance from both a theoretical and an engineering perspective. It is built up from individual linguistic features, ones more easily and reliably annotated than a full discourse or rhetorical structure for a text. This book makes extensive use of previously unexamined naturally occurring corpus data of non-canonical word order in English, both to illustrate the points of the theoretical model and to train the statistical model.
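To make the idea concrete, a minimal sketch of such a discourse-conditioned classifier is given below, using scikit-learn's logistic regression; the feature names and training examples are hypothetical placeholders, not the book's actual annotation scheme or data.

```python
# Sketch of a discourse-conditioned word-order classifier (hypothetical
# features and data; not the book's actual model or annotation scheme).
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training example: discourse-context features -> observed word order.
train = [
    ({"referent_given": True,  "contrast_set": True,  "heavy_subject": False}, "topicalization"),
    ({"referent_given": False, "contrast_set": False, "heavy_subject": False}, "canonical"),
    ({"referent_given": True,  "contrast_set": False, "heavy_subject": True},  "left-dislocation"),
    ({"referent_given": False, "contrast_set": False, "heavy_subject": True},  "canonical"),
]
features, orders = zip(*train)

model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(list(features), list(orders))

# Predict the most human-like order for a new discourse context.
new_context = {"referent_given": True, "contrast_set": True, "heavy_subject": False}
print(model.predict([new_context])[0])
```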
This book is an introduction to the rudiments of Perl programming. It gives the general reader with an interest in language the most usable and relevant aspects of Perl for writing programs that deal with language. Through a series of simple examples and exercises, the reader is gradually introduced to the essentials of good programming. The examples are carefully constructed to make the introduction of new concepts as simple as possible, while at the same time using sample programs that make sense to someone who works with language as data. Many of these programs can be used immediately with minimal or no modification. The text is accompanied by exercises at the end of each chapter, and all the code is available from the companion website: http://www.u.arizona.edu/~hammond.
This comprehensive reference work provides an overview of the concepts, methodologies, and applications in computational linguistics and natural language processing (NLP).
* Features contributions by the top researchers in the field, reflecting the work that is driving the discipline forward
* Includes an introduction to the major theoretical issues in these fields, as well as the central engineering applications that the work has produced
* Presents the major developments in an accessible way, explaining the close connection between scientific understanding of the computational properties of natural language and the creation of effective language technologies
* Serves as an invaluable state-of-the-art reference source for computational linguists and software engineers developing NLP applications in industrial research and development labs of software companies
Computers offer new perspectives in the study of language, allowing us to see phenomena that previously remained obscure because of the limitations of our vantage points. It is not uncommon for computers to be likened to the telescope, or microscope, in this respect. In this pioneering computer-assisted study of translation, Dorothy Kenny suggests another image, that of the kaleidoscope: playful changes of perspective using corpus-processing software allow textual patterns to come into focus and then recede again as others take their place. And against the background of repeated patterns in a corpus, creative uses of language gain a particular prominence. In Lexis and Creativity in Translation, Kenny monitors the translation of creative source-text word forms and collocations uncovered in a specially constructed German-English parallel corpus of literary texts. Using an abundance of examples, she reveals evidence of both normalization and ingenious creativity in translation. Her discussion of lexical creativity draws on insights from traditional morphology, structural semantics and, most notably, neo-Firthian corpus linguistics, suggesting that rumours of the demise of linguistics in translation studies are greatly exaggerated. Lexis and Creativity in Translation is essential reading for anyone interested in corpus linguistics and its impact so far on translation studies. The book also offers theoretical and practical guidance for researchers who wish to conduct their own corpus-based investigations of translation. No previous knowledge of German, corpus linguistics or computing is assumed.
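By way of illustration, the following is a minimal sketch of the kind of collocation extraction corpus-processing software performs, using NLTK's bigram association measures; the toy sentence and the choice of PMI as the scoring measure are assumptions, not the specific tools used in the book.

```python
# Sketch of PMI-based collocation extraction with NLTK (toy text; not the
# corpus-processing toolset used in the book).
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

words = ("the translator rendered the compound noun freely while the "
         "compound noun elsewhere was rendered rather literally").split()

finder = BigramCollocationFinder.from_words(words)
measures = BigramAssocMeasures()

# Rank adjacent word pairs by pointwise mutual information (PMI).
for bigram, score in finder.score_ngrams(measures.pmi)[:5]:
    print(bigram, round(score, 2))
```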
Contemporary corpus linguists use a wide variety of methods to study discourse patterns. This volume provides a systematic comparison of various methodological approaches in corpus linguistics through a series of parallel empirical studies that use a single corpus dataset to answer the same overarching research question. Ten contributing experts each use a different method to address the same broadly framed research question: In what ways does language use in online Q+A forum responses differ across four world English varieties (India, Philippines, United Kingdom, and United States)? Contributions are based on analysis of the same 400,000-word corpus from online Q+A forums, and contributors employ methodologies including corpus-based discourse analysis, audience perceptions, Multi-Dimensional analysis, pragmatic analysis, and keyword analysis. In their introductory and concluding chapters, the volume editors compare and contrast the findings from each method and assess the degree to which 'triangulating' multiple approaches may provide a more nuanced understanding of a research question, with the aim of identifying a set of complementary approaches which could arguably take into account analytical blind spots. Baker and Egbert also consider the importance of issues such as researcher subjectivity, type of annotation, the limitations and affordances of different corpus tools, the relative strengths of qualitative and quantitative approaches, and the value of considering data or information beyond the corpus. Rather than attempting to find the 'best' approach, the volume focuses on how different corpus linguistic methodologies may complement one another, and it offers suggestions for further methodological studies which use triangulation to enrich corpus-related research.
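As an illustration of one of these methods, keyword analysis typically ranks words by a keyness statistic such as log-likelihood (G2); the sketch below implements that statistic with toy counts, not the volume's actual corpus figures.

```python
# Sketch of keyword analysis with the log-likelihood (G2) statistic
# (toy frequencies; not the volume's corpus data).
import math

def log_likelihood(freq_a, total_a, freq_b, total_b):
    """G2 keyness for a word with freq_a hits in corpus A and freq_b in corpus B."""
    expected_a = total_a * (freq_a + freq_b) / (total_a + total_b)
    expected_b = total_b * (freq_a + freq_b) / (total_a + total_b)
    g2 = 0.0
    if freq_a > 0:
        g2 += freq_a * math.log(freq_a / expected_a)
    if freq_b > 0:
        g2 += freq_b * math.log(freq_b / expected_b)
    return 2 * g2

# Hypothetical counts for one word in two 100,000-word subcorpora.
print(round(log_likelihood(85, 100_000, 20, 100_000), 2))
```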
Polysemy is a term used in semantic and lexical analysis to describe a word with multiple meanings. The problem is to establish whether it is the same word with related meanings or different words that happen to look or sound the same. In 'Plainly planes plane plains plainly', how many distinct lexical items are there? Such words present few difficulties in everyday language, but pose near-intractable problems for linguists and lexicographers. The contributors, including Anna Wierzbicka, Charles Fillmore, and James Pustejovsky, consider the implications of these problems for grammatical theory and how they may be addressed in computational linguistics.
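As a small illustration of how such ambiguity is handled computationally, the sketch below applies a simplified Lesk-style overlap heuristic to 'plane'; the miniature sense inventory is hypothetical and does not reflect the contributors' own approaches.

```python
# Sketch of simplified Lesk-style sense selection for "plane"
# (hypothetical mini sense inventory; not from the volume).
SENSES = {
    "plane (aircraft)":     {"fly", "wing", "pilot", "airport", "flight"},
    "plane (tool)":         {"wood", "smooth", "carpenter", "shave", "blade"},
    "plane (flat surface)": {"flat", "geometry", "point", "line", "euclidean"},
}

def choose_sense(context_words):
    """Pick the sense whose signature overlaps most with the context words."""
    context = {w.lower().strip(".,") for w in context_words}
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context))

print(choose_sense("the carpenter used a plane to smooth the wood".split()))
```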
This collection of papers and abstracts stems from the third meeting in the series of Sperlonga workshops on Cognitive Models of Speech Processing. It presents current research on the structure and organization of the mental lexicon, and on the processes that access that lexicon. The volume starts with discussion of issues in acquisition and consideration of questions such as, 'What is the relationship between vocabulary growth and the acquisition of syntax?', and, 'How does prosodic information, concerning the melodies and rhythms of the language, influence the processes of lexical and syntactic acquisition?'. From acquisition, the papers move on to consider the manner in which contemporary models of spoken word recognition and production can map onto neural models of the recognition and production processes. The issue of exactly what is recognised, and when, is dealt with next - the empirical findings suggest that the function of something to which a word refers is accessed with a different time-course to the form of that something. This has considerable implications for the nature, and content, of lexical representations. Equally important are the findings from the studies of disordered lexical processing, and two papers in this volume address the implications of these disorders for models of lexical representation and process (borrowing from both empirical data and computational modelling). The final paper explores whether neural networks can successfully model certain lexical phenomena that have elsewhere been assumed to require rule-based processes.
The techniques of natural language processing (NLP) have been widely applied in machine translation and automated message understanding, but have only recently been utilized in second language teaching. This book offers both an argument for and a critical examination of this new application, examining how systems may be designed to exploit the power of NLP, accommodate its limitations, and minimize its risks. This volume marks the first collection of work in the U.S. and Canada that incorporates advanced human language technologies into language tutoring systems, covering languages as diverse as Arabic, Spanish, Japanese, and English.
Multi-Dimensional Analysis: Research Methods and Current Issues provides a comprehensive guide both to the statistical methods used in Multi-Dimensional Analysis (MDA) and to its key elements, such as corpus building, tagging, and tools. The major goal is to explain the steps involved in the method so that readers may better understand this complex research framework and conduct MD research on their own. Multi-Dimensional Analysis is a method that allows the researcher to describe different registers (textual varieties defined by their social use) such as academic settings, regional discourse, social media, movies, and pop songs. Through multivariate statistical techniques, MDA identifies complementary correlation groupings of dozens of variables, including variables that belong to both the grammatical and semantic domains. Such groupings are then associated with situational variables of texts, like information density, orality, and narrativity, to determine linguistic constructs known as dimensions of variation, which provide a scale for the comparison of a large number of texts and registers.
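As a rough illustration of the multivariate step, the sketch below runs a factor analysis over per-text counts of a few linguistic features using scikit-learn; the feature names and counts are toy values, not output from a real tagged corpus or an actual MDA feature set.

```python
# Sketch of the multivariate step in MD analysis: factor analysis over
# per-text feature counts (toy features and values; not a real tagged corpus).
import numpy as np
from sklearn.decomposition import FactorAnalysis

features = ["first_person_pronouns", "private_verbs", "nouns", "prepositions"]
# Rows = texts, columns = normalised counts per 1,000 words.
counts = np.array([
    [42.0, 18.0, 180.0,  90.0],   # conversation-like text
    [40.0, 20.0, 175.0,  95.0],
    [ 5.0,  2.0, 310.0, 140.0],   # academic-prose-like text
    [ 4.0,  3.0, 300.0, 150.0],
])

fa = FactorAnalysis(n_components=1, random_state=0)
text_scores = fa.fit_transform(counts)           # each text's score on the dimension
loadings = dict(zip(features, fa.components_[0].round(2)))
print("loadings:", loadings)
print("text scores:", text_scores.ravel().round(2))
```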
The Language of ICT:
* explores the nature of the electronic word and presents the new types of text in which it is found
* examines the impact of the rapid technological change we are living through
* analyses different texts, including email and answerphone messages, webpages, faxes, computer games and articles about IT
* provides detailed guidance on downloading material from the web, gives URLs to visit, and includes a dedicated webpage
* includes a comprehensive glossary of terms.
The book provides an overview of more than a decade of joint R&D efforts in the Low Countries on HLT for Dutch. It not only presents the state of the art of HLT for Dutch in the areas covered but, even more importantly, describes the resources (data and tools) for Dutch that have been created and are now available to both academia and industry worldwide. The contributions cover many areas of human language technology (for Dutch): corpus collection (including IPR issues) and building (in particular one corpus aiming at a collection of 500M word tokens), lexicology, anaphora resolution, a semantic network, parsing technology, speech recognition, machine translation, text (summary) generation, web mining, information extraction, and text-to-speech, to name the most important ones. The book also shows how a medium-sized language community (spanning two territories) can create a digital language infrastructure (resources, tools, etc.) as a basis for subsequent R&D. At the same time, it bundles contributions from almost all the HLT research groups in Flanders and the Netherlands, and hence offers a view of their recent research activities. The targeted readers are mainly researchers in human language technology, in particular those focusing on Dutch: researchers active in larger networks such as CLARIN, META-NET and FLaReNet, and those participating in conferences such as ACL, EACL, NAACL, COLING, RANLP, CICling, LREC, CLIN and DIR (both in the Low Countries), InterSpeech, ASRU, ICASSP, ISCA, EUSIPCO, CLEF, TREC, etc. In addition, some chapters will interest human language technology policy makers and even science policy makers in general.
The research described in this book shows that conversation analysis can effectively model dialogue. Specifically, this work shows that the multidisciplinary field of communicative ICALL may greatly benefit from including conversation analysis. As a consequence, this research makes several contributions to related research disciplines, such as conversation analysis, second-language acquisition, computer-mediated communication, artificial intelligence, and dialogue systems. The book will be of value to researchers and engineers in the areas of computational linguistics, intelligent assistants, and conversational interfaces.
Semantic fields are lexically coherent - the words they contain co-occur in texts. In this book the authors introduce and define semantic domains, a computational model for lexical semantics inspired by the theory of semantic fields. Semantic domains allow us to exploit domain features for texts, terms and concepts, and they can significantly boost the performance of natural-language processing systems. Semantic domains can be derived from existing lexical resources or can be acquired from corpora in an unsupervised manner. They also have the property of interlinguality, and they can be used to relate terms in different languages in multilingual application scenarios. The authors give a comprehensive explanation of the computational model, with detailed chapters on semantic domains, domain models, and applications of the technique in text categorization, word sense disambiguation, and cross-language text categorization. This book is suitable for researchers and graduate students in computational linguistics.
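A minimal sketch of domain-driven text categorization is given below; the tiny hand-built domain model is purely illustrative and stands in for the domain models derived from lexical resources or acquired from corpora that the book discusses.

```python
# Sketch of domain-based text categorization: each term carries domain
# relevance scores, and a text gets the domain with the highest total.
# The hand-built domain model below is a toy stand-in, not a model
# derived from lexical resources or corpora as in the book.
DOMAIN_MODEL = {
    "bank":     {"economy": 0.8, "geography": 0.2},
    "interest": {"economy": 0.9},
    "loan":     {"economy": 1.0},
    "river":    {"geography": 1.0},
    "valley":   {"geography": 0.9},
}

def categorize(text):
    scores = {}
    for word in text.lower().split():
        for domain, weight in DOMAIN_MODEL.get(word, {}).items():
            scores[domain] = scores.get(domain, 0.0) + weight
    return max(scores, key=scores.get) if scores else None

print(categorize("the bank raised the interest rate on the loan"))  # economy
print(categorize("the river cut a deep valley"))                    # geography
```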
Solving linguistic problems not infrequently reduces to carrying out tasks that are computationally complex and therefore requires automation. In such situations, the difference between having and not having computational tools to handle the tasks is not a matter of economy of time and effort, but may amount to the difference between finding and not finding a solution at all. The book is an introduction to machine-aided linguistic discovery, a novel research area, arguing for the fruitfulness of the computational approach by presenting a basic conceptual apparatus and several intelligent discovery programmes. One of the systems models the fundamental Saussurian notion of system; thus, almost a century after the introduction of this concept and of structuralism in general, linguists are for the first time able to handle this recurring, computationally complex task adequately. Another system models the problem of searching for Greenbergian language universals and is capable of stating its discoveries in an intelligible form, viz. a comprehensive English-language text, thus constituting the first computer program to generate a whole scientific article. Yet another system detects potential inconsistencies in genetic language classifications. The programmes are applied, with noteworthy results, to substantial problems from diverse linguistic disciplines such as structural semantics, phonology, typology and historical linguistics.
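To illustrate the universals-search task in its simplest form, the sketch below checks a candidate implicational universal of the Greenbergian kind ('dominant VSO order implies prepositions') against a small typological table; the language names and feature values are invented placeholders, and the sketch is not the book's discovery system.

```python
# Sketch of testing a candidate implicational universal ("dominant VSO order
# implies prepositions") against a small typological table. Language names
# and feature values are invented placeholders, not real survey data.
LANGUAGES = {
    "Lang1": {"basic_order": "VSO", "adposition": "preposition"},
    "Lang2": {"basic_order": "SOV", "adposition": "postposition"},
    "Lang3": {"basic_order": "VSO", "adposition": "preposition"},
    "Lang4": {"basic_order": "SVO", "adposition": "preposition"},
}

def test_universal(antecedent, consequent):
    """Return the languages the antecedent covers and any counterexamples."""
    covered = [name for name, feats in LANGUAGES.items() if antecedent(feats)]
    counterexamples = [name for name in covered if not consequent(LANGUAGES[name])]
    return covered, counterexamples

covered, counterexamples = test_universal(
    lambda f: f["basic_order"] == "VSO",
    lambda f: f["adposition"] == "preposition",
)
print("covered:", covered, "counterexamples:", counterexamples)
```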
Understanding any communication depends on the listener or reader recognizing that some words refer to what has already been said or written (his, its, he, there, etc.). This mode of reference, anaphora, involves complicated cognitive and syntactic processes, which people usually perform unerringly, but which present formidable problems for the linguist and cognitive scientist trying to explain precisely how comprehension is achieved. Anaphora is thus a central research focus in syntactic and semantic theory, while understanding and modelling its operation in discourse are important targets in computational linguistics and cognitive science. Yan Huang provides an extensive and accessible overview of the major contemporary issues surrounding anaphora and gives a critical survey of the many and diverse contemporary approaches to it. He provides by far the fullest cross-linguistic account yet published: Dr Huang's survey and analysis are based on a rich collection of data drawn from around 450 of the world's languages.
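For illustration only, the sketch below resolves pronouns with a naive most-recent-compatible-antecedent heuristic; it is a toy baseline, not the cross-linguistic account developed in the book.

```python
# A deliberately naive pronoun-resolution heuristic (most recent antecedent
# that matches in gender and number); a toy baseline, not the account
# developed in the book.
MENTIONS = [                       # in order of appearance in the discourse
    {"word": "Mary",  "gender": "f", "number": "sg"},
    {"word": "John",  "gender": "m", "number": "sg"},
    {"word": "books", "gender": "n", "number": "pl"},
]
PRONOUNS = {"he": ("m", "sg"), "she": ("f", "sg"), "it": ("n", "sg"), "they": (None, "pl")}

def resolve(pronoun):
    gender, number = PRONOUNS[pronoun]
    # Scan candidate antecedents from most recent to least recent.
    for candidate in reversed(MENTIONS):
        if candidate["number"] == number and gender in (None, candidate["gender"]):
            return candidate["word"]
    return None

print(resolve("she"))   # Mary
print(resolve("they"))  # books
```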
An increasing number of contributions have appeared in recent years on the subject of Audiovisual Translation (AVT), particularly in relation to dubbing and subtitling. The broad scope of this branch of Translation Studies is challenging because it brings together diverse disciplines, including film studies, translatology, semiotics, linguistics, applied linguistics, cognitive psychology, technology and ICT. This volume addresses issues relating to AVT research and didactics. The first section is dedicated to theoretical aspects in order to stimulate further debate and encourage progress in research-informed teaching. The second section focuses on a less developed area of research in the field of AVT: its potential use in foreign language pedagogy. This collection of articles is intended to create a discourse on new directions in AVT and foreign language learning. The book begins with reflections on wider methodological issues, advances to a proposed model of analysis for colloquial speech, touches on more 'niche' aspects of AVT (e.g. surtitling), progresses to didactic applications in foreign language pedagogy and learning at both linguistic and cultural levels, and concludes with a practical proposal for the use of AVT in foreign language classes. An interview with a professional subtitler draws the volume to a close.
This volume, composed mainly of papers given at the 1999 conferences of the Forum for German Language Studies (FGLS) at Kent and the Conference of University Teachers of German (CUTG) at Keele, is devoted to differential yet synergetic treatments of the German language. It includes corpus-lexicographical, computational, rigorously phonological, historical/dialectal, comparative, semiotic, acquisitional and pedagogical contributions. In all, it presents a variety of approaches, from the rigorously 'pure' and formal to the applied, often feeding off each other to focus on various aspects of the German language.
This book presents a theoretical study on aspect in Chinese, including both situation and viewpoint aspects. Unlike previous studies, which have largely classified linguistic units into different situation types, this study defines a set of ontological event types that are conceptually universal and on the basis of which different languages employ various linguistic devices to describe such events. To do so, it focuses on a particular component of events, namely the viewpoint aspect. It includes and discusses a wealth of examples to show how such ontological events are realized in Chinese. In addition, the study discusses how Chinese modal verbs and adverbs affect the distribution of viewpoint aspects associated with certain situation types. In turn, the book demonstrates how the proposed linguistic theory can be used in a computational context. Simply identifying events in terms of the verbs and their arguments is insufficient for real situations such as understanding the factivity and the logical/temporal relations between events. The proposed framework offers the possibility of analyzing events in Chinese text, yielding deep semantic information.
This book addresses the research, analysis, and description of the methods and processes that are used in the annotation and processing of language corpora in advanced, semi-advanced, and non-advanced languages. It provides the background information and empirical data needed to understand the nature and depth of problems related to corpus annotation and text processing and shows readers how the linguistic elements found in texts are analyzed and applied to develop language technology systems and devices. As such, it offers valuable insights for researchers, educators, and students of linguistics and language technology.
There is hardly any aspect of verbal communication that has not been investigated using the analytical tools developed by corpus linguists. This is especially true in the case of English, which commands a vast international research community, and corpora are becoming increasingly specialised, as they account for areas of language use shaped by specific sociolectal (register, genre, variety) and speaker (gender, profession, status) variables. Corpus analysis is driven by a common interest in 'linguistic evidence', viewed as a source of insights into language phenomena or of lexical, semantic and contrastive data for subsequent applications. Among the latter, pedagogical settings are highly prominent, as corpora can be used to monitor classroom output, raise learner awareness and inform teaching materials. The eighteen chapters in this volume focus on contexts where English is employed by specialists in the professions or academia and debate some of the challenges arising from the complex relationship between linguistic theory, data-mining tools and statistical methods.
This book introduces formal semantics techniques for a natural language processing audience. Methods discussed involve: (i) the denotational techniques used in model-theoretic semantics, which make it possible to determine whether a linguistic expression is true or false with respect to some model of the way things happen to be; and (ii) stages of interpretation, i.e., ways to arrive at meanings by evaluating and converting source linguistic expressions, possibly with respect to contexts, into output (logical) forms that could be used with (i). The book demonstrates that the methods allow wide coverage without compromising the quality of semantic analysis. Access to unrestricted, robust and accurate semantic analysis is widely regarded as an essential component for improving natural language processing tasks, such as: recognizing textual entailment, information extraction, summarization, automatic reply, and machine translation.
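A minimal sketch of point (i) is given below: a tiny evaluator that judges simple logical forms true or false with respect to a toy model; the model, predicates and logical-form encoding are illustrative assumptions, not the book's formalism.

```python
# Sketch of denotational evaluation: a tiny logical form is judged true or
# false with respect to a toy model (illustrative; not the book's formalism).
MODEL = {
    "entities": {"fido", "felix"},
    "dog":   {"fido"},
    "cat":   {"felix"},
    "barks": {"fido"},
}

def evaluate(form):
    """Evaluate ('and', f, g), ('exists', predicate), or (predicate, entity)."""
    if form[0] == "and":
        return evaluate(form[1]) and evaluate(form[2])
    if form[0] == "exists":
        return any(entity in MODEL[form[1]] for entity in MODEL["entities"])
    predicate, entity = form
    return entity in MODEL[predicate]

print(evaluate(("dog", "fido")))                              # True
print(evaluate(("and", ("dog", "fido"), ("barks", "fido"))))  # True
print(evaluate(("exists", "cat")))                            # True
print(evaluate(("barks", "felix")))                           # False
```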
Current language technology is dominated by approaches that either enumerate a large set of rules or rely on a large amount of manually labelled data. The creation of both is time-consuming and expensive, which is commonly thought to be the reason why automated natural language understanding has not yet made its way into "real-life" applications. This book sets an ambitious goal: to shift the development of language processing systems to a much more automated setting than in previous work. A new approach is defined: what if computers analysed large samples of language data on their own, identifying structural regularities that perform the necessary abstractions and generalisations in order to better understand language in the process? The target audience comprises academics at all levels (undergraduate and graduate students, lecturers and professors) working in the fields of natural language processing and computational linguistics, as well as natural language engineers who are seeking to improve their systems.
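As an illustration of this idea, the sketch below clusters words by the neighbouring-word contexts they occur in, so that rough word classes emerge from raw text without labelled data; the toy corpus and the choice of k-means over co-occurrence counts are assumptions, not the book's actual algorithms.

```python
# Sketch of unsupervised structure discovery: words are clustered by their
# neighbouring-word contexts so that rough word classes emerge without
# labelled data (toy corpus and k-means choice are illustrative only).
from collections import Counter, defaultdict
import numpy as np
from sklearn.cluster import KMeans

tokens = ("the cat sat on the mat . the dog sat on the rug . "
          "a cat ran . a dog ran .").split()

# Count left and right neighbours for every word type.
contexts = defaultdict(Counter)
for i, word in enumerate(tokens):
    if i > 0:
        contexts[word]["L:" + tokens[i - 1]] += 1
    if i < len(tokens) - 1:
        contexts[word]["R:" + tokens[i + 1]] += 1

vocab = sorted(contexts)
context_types = sorted({c for counts in contexts.values() for c in counts})
matrix = np.array([[contexts[w][c] for c in context_types] for w in vocab], dtype=float)

# Cluster word types by their context profiles.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(matrix)
for k in range(3):
    print(k, [w for w, label in zip(vocab, labels) if label == k])
```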