This book constitutes the proceedings of the 21st EPIA Conference on Artificial Intelligence, EPIA 2022, which took place in Lisbon, Portugal, in August/September 2022. The 64 papers presented in this volume were carefully reviewed and selected from 85 submissions. They were organized in topical sections as follows: AI4IS - Artificial Intelligence for Industry and Societies; AIL - Artificial Intelligence and Law; AIM - Artificial Intelligence in Medicine; AIPES - Artificial Intelligence in Power and Energy Systems; AITS - Artificial Intelligence in Transportation Systems; AmIA - Ambient Intelligence and Affective Environments; GAI - General AI; IROBOT - Intelligent Robotics; KDBI - Knowledge Discovery and Business Intelligence; KRR - Knowledge Representation and Reasoning; MASTA - Multi-Agent Systems: Theory and Applications; TeMA - Text Mining and Applications.
This book constitutes the proceedings of the 26th International Conference on Theory and Practice of Digital Libraries, TPDL 2022, which took place in Padua, Italy, in September 2022. The 18 full papers, 27 short papers and 15 accelerating innovation papers included in these proceedings were carefully reviewed and selected from 107 submissions. They focus on digital libraries and associated technical, practical, and social issues.
The two-volume proceedings, LNCS 13249 and 13250, constitute the thoroughly refereed post-workshop proceedings of the 22nd Chinese Lexical Semantics Workshop, CLSW 2021, held in Nanjing, China, in May 2021. The 68 full papers and 4 short papers were carefully reviewed and selected from 261 submissions. They are organized in the following topical sections: Lexical Semantics and General Linguistics; Natural Language Processing and Language Computing; Cognitive Science and Experimental Studies; Lexical Resources and Corpus Linguistics.
This book provides a new multi-method, process-oriented approach towards speech quality assessment, which allows readers to examine the influence of speech transmission quality on a variety of perceptual and cognitive processes in human listeners. Fundamental concepts and methodologies surrounding the topic of process-oriented quality assessment are introduced and discussed. The book further describes a functional process model of human quality perception, which theoretically integrates results obtained in three experimental studies. This book's conceptual ideas, empirical findings, and theoretical interpretations should be of particular interest to researchers working in the fields of Quality and Usability Engineering, Audio Engineering, Psychoacoustics, Audiology, and Psychophysiology.
This book constitutes the proceedings of the 26th International Conference on Implementation and Application of Automata, CIAA 2022, held in Rouen, France, in June/July 2022. The 16 regular papers presented together with 3 invited lectures in this book were carefully reviewed and selected from 26 submissions. The papers cover various fields in the application, implementation, and theory of automata and related structures.
This book constitutes the refereed proceedings of the 20th International Conference on Formal Modeling and Analysis of Timed Systems, FORMATS 2022, held in Warsaw, Poland, in September 2022. The 12 full papers and 2 short papers were carefully reviewed and selected from 30 submissions, and are presented in this volume together with 3 full-length papers associated with invited/anniversary talks. The papers focus on topics such as modelling, design and analysis of timed computational systems. The conference addresses real-time issues in hardware design, performance analysis, real-time software, scheduling, and the semantics and verification of real-time, hybrid and probabilistic systems.
This work presents a discourse-aware Text Simplification approach that splits and rephrases complex English sentences within the semantic context in which they occur. Based on a linguistically grounded transformation stage, complex sentences are transformed into shorter utterances with a simple canonical structure that can be easily analyzed by downstream applications. To avoid breaking down the input into a disjointed sequence of statements that is difficult to interpret, the author incorporates the semantic context between the split propositions in the form of hierarchical structures and semantic relationships, thus generating a novel representation of complex assertions that puts a semantic layer on top of the simplified sentences. In a second step, she leverages the semantic hierarchy of minimal propositions to improve the performance of Open IE frameworks. She shows that such systems benefit in two dimensions. First, the canonical structure of the simplified sentences facilitates the extraction of relational tuples, leading to an improved precision and recall of the extracted relations. Second, the semantic hierarchy can be leveraged to enrich the output of existing Open IE approaches with additional meta-information, resulting in a novel lightweight semantic representation for complex text data in the form of normalized and context-preserving relational tuples.
This book constitutes the proceedings of the 26th International Conference on Developments in Language Theory, DLT 2022, which was held in Tampa, FL, USA, in May 2022. The conference took place in a hybrid format with both in-person and online participation. The 21 full papers included in these proceedings were carefully reviewed and selected from 32 submissions. The DLT conference series provides a forum for presenting current developments in formal languages and automata.
This book gathers high-quality papers presented at Academia-Industry Consortium for Data Science (AICDS 2020), held in Wenzhou, China during 19 - 20 December 2020. The book presents views of academicians and also how companies are approaching these challenges organizationally. The topics covered in the book are data science and analytics, natural language processing, predictive analytics, artificial intelligence, machine learning, deep learning, big data computing, cognitive computing, data visualization, image processing, and optimization techniques.
This book constitutes the proceedings of the 5th International Workshop on Chatbot Research and Design, CONVERSATIONS 2021, which was held during November 2021. Due to the COVID-19 pandemic the conference was held online. The 12 papers included in this volume were carefully reviewed and selected from a total of 25 submissions. The papers in the proceedings are structured in four topical groups: Chatbot User Insight, Chatbots Supporting Collaboration and Social Interaction, and Chatbot UX and Design.
Labelling data is one of the most fundamental activities in science, and has underpinned practice, particularly in medicine, for decades, as well as research in corpus linguistics since at least the development of the Brown corpus. With the shift towards Machine Learning in Artificial Intelligence (AI), the creation of datasets to be used for training and evaluating AI systems, also known in AI as corpora, has become a central activity in the field as well. Early AI datasets were created on an ad-hoc basis to tackle specific problems. As larger and more reusable datasets were created, requiring greater investment, the need for a more systematic approach to dataset creation arose to ensure increased quality. A range of statistical methods were adopted, often but not exclusively from the medical sciences, to ensure that the labels used were not subjective, or to choose among different labels provided by the coders. A wide variety of such methods is now in regular use. This book is meant to provide a survey of the most widely used among these statistical methods supporting annotation practice. As far as the authors know, this is the first book attempting to cover the two families of methods in wider use. The first family of methods is concerned with the development of labelling schemes and, in particular, with ensuring that sufficient agreement can be observed among the coders who apply a scheme. The second family includes methods developed to analyze the output of coders once the scheme has been agreed upon, particularly, though not exclusively, to identify the most likely label for an item among those provided by the coders. The focus of this book is primarily on Natural Language Processing, the area of AI devoted to the development of models of language interpretation and production, but many if not most of the methods discussed here are also applicable to other areas of AI, or indeed, to other areas of Data Science.
This book constitutes selected revised papers of the 15th International Conference, NooJ 2021, held in Besançon, France, in June 2021. Due to the COVID-19 pandemic the conference was held online. NooJ is a linguistic development environment that allows linguists to formalize several levels of linguistic phenomena. NooJ provides linguists with tools to develop dictionaries, regular grammars, context-free grammars, context-sensitive grammars and unrestricted grammars, as well as their graphical equivalents, to formalize each linguistic phenomenon. The 20 full papers presented were carefully reviewed and selected from 62 submissions. The papers are organized in the following topics: linguistic formalization and analysis, digital humanities and teaching, natural language processing applications.
This book constitutes the refereed proceedings of the 10th International Conference on Computational Data and Social Networks, CSoNet 2021, which was held online during November 15-17, 2021. The conference was initially planned to take place in Montreal, Quebec, Canada, but changed to an online event due to the COVID-19 pandemic. The 24 full and 8 short papers included in this book were carefully reviewed and selected from 57 submissions. They were organized in topical sections as follows: Combinatorial optimization and learning; deep learning and applications to complex and social systems; measurements of insight from data; complex networks analytics; special track on fact-checking, fake news and malware detection in online social networks; and special track on information spread in social and data networks.
This book constitutes the refereed proceedings of the 17th China Conference on Machine Translation, CCMT 2021, held in Xining, China, in October 2021. The 10 papers presented in this volume were carefully reviewed and selected from 25 submissions and focus on all aspects of machine translation, including preprocessing, neural machine translation models, hybrid models, evaluation methods, and post-editing.
Empirical methods are a means of answering methodological questions of the empirical sciences using statistical techniques. The methodological questions addressed in this book include the problems of validity, reliability, and significance. In the case of machine learning, these correspond to the questions of whether a model predicts what it purports to predict, whether a model's performance is consistent across replications, and whether a performance difference between two models is due to chance, respectively. The goal of this book is to answer these questions by concrete statistical tests that can be applied to assess validity, reliability, and significance of data annotation and machine learning prediction in the fields of NLP and data science. Our focus is on model-based empirical methods where data annotations and model predictions are treated as training data for interpretable probabilistic models from the well-understood families of generalized additive models (GAMs) and linear mixed effects models (LMEMs). Based on the interpretable parameters of the trained GAMs or LMEMs, the book presents model-based statistical tests such as a validity test that allows detecting circular features that circumvent learning. Furthermore, the book discusses a reliability coefficient using variance decomposition based on random effect parameters of LMEMs. Last, a significance test based on the likelihood ratio of nested LMEMs trained on the performance scores of two machine learning models is shown to naturally allow the inclusion of variations in meta-parameter settings into hypothesis testing, and further facilitates a refined system comparison conditional on properties of input data. This book can be used as an introduction to empirical methods for machine learning in general, with a special focus on applications in NLP and data science.
The book is self-contained, with an appendix on the mathematical background on GAMs and LMEMs, and with an accompanying webpage including R code to replicate experiments presented in the book.
Leverage machine learning and deep learning techniques to build fully-fledged natural language processing (NLP) projects. Projects throughout this book grow in complexity and showcase methodologies, optimizing tips, and tricks to solve various business problems. You will use modern Python libraries and algorithms to build end-to-end NLP projects. The book starts with an overview of natural language processing (NLP) and artificial intelligence to provide a quick refresher on algorithms. Next, it covers end-to-end NLP projects beginning with traditional algorithms and projects such as customer review sentiment and emotion detection, topic modeling, and document clustering. From there, it delves into e-commerce related projects such as product categorization using the description of the product, a search engine to retrieve the relevant content, and a content-based recommendation system to enhance user experience. Moving forward, it explains how to build systems to find similar sentences using contextual embedding, summarizing huge documents using recurrent neural networks (RNN), automatic word suggestion using long short-term memory networks (LSTM), and how to build a chatbot using transfer learning. It concludes with an exploration of next-generation AI and algorithms in the research space. By the end of this book, you will have the knowledge needed to solve various business problems using NLP techniques. 
What You Will Learn:
Implement full-fledged intelligent NLP applications with Python.
Translate real-world business problems on text data with NLP techniques.
Leverage machine learning and deep learning techniques to perform smart language processing.
Gain hands-on experience implementing end-to-end search engine information retrieval, text summarization, chatbots, text generation, document clustering, product classification, and more.
Who This Book Is For:
Data scientists, machine learning engineers, and deep learning professionals looking to build natural language applications using Python.
This two-volume set of LNAI 13028 and LNAI 13029 constitutes the refereed proceedings of the 10th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2021, held in Qingdao, China, in October 2021. The 66 full papers, 23 poster papers, and 27 workshop papers presented were carefully reviewed and selected from 446 submissions. They are organized in the following areas: Fundamentals of NLP; Machine Translation and Multilinguality; Machine Learning for NLP; Information Extraction and Knowledge Graph; Summarization and Generation; Question Answering; Dialogue Systems; Social Media and Sentiment Analysis; NLP Applications and Text Mining; and Multimodality and Explainability.
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing (NLP) applications. This book provides an overview of text ranking with neural network architectures known as transformers, of which BERT (Bidirectional Encoder Representations from Transformers) is the best-known example. The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in NLP, information retrieval (IR), and beyond. This book provides a synthesis of existing work as a single point of entry for practitioners who wish to gain a better understanding of how to apply transformers to text ranking problems and researchers who wish to pursue work in this area. It covers a wide range of modern techniques, grouped into two high-level categories: transformer models that perform reranking in multi-stage architectures and dense retrieval techniques that perform ranking directly. Two themes pervade the book: techniques for handling long documents, beyond typical sentence-by-sentence processing in NLP, and techniques for addressing the tradeoff between effectiveness (i.e., result quality) and efficiency (e.g., query latency, model and index size). Although transformer architectures and pretraining techniques are recent innovations, many aspects of how they are applied to text ranking are relatively well understood and represent mature techniques. However, there remain many open research questions, and thus in addition to laying out the foundations of pretrained transformers for text ranking, this book also attempts to prognosticate where the field is heading.
Automating Linguistics offers an in-depth study of the history of the mathematisation and automation of the sciences of language. In the wake of the first mathematisation of the 1930s, two waves followed: machine translation in the 1950s and the development of computational linguistics and natural language processing in the 1960s. These waves were pivotal for the work on large computerised corpora in the 1990s and the unprecedented technological development of computers and software. Early machine translation was devised as a war technology originating in the sciences of war, amidst the amalgam of mathematics, physics, logic, neurosciences, acoustics, and emerging sciences such as cybernetics and information theory. Machine translation was intended to provide mass translations for strategic purposes during the Cold War. Linguistics, in turn, did not belong to the sciences of war, and played a minor role in the pioneering projects of machine translation. Comparing the two trends, the present book reveals how the sciences of language gradually integrated the technologies of computing and software, resulting in the second-wave mathematisation of the study of language, which may be called mathematisation-automation. The integration took on various shapes contingent upon cultural and linguistic traditions (USA, ex-USSR, Great Britain and France). By contrast, working with large corpora in the 1990s, though enabled by unprecedented development of computing and software, was primarily a continuation of traditional approaches in the sciences of language, such as the study of spoken and written texts, lexicography, and statistical studies of vocabulary.
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports. The papers are organized into the following topical sections: historical document analysis, document analysis systems, handwriting recognition, scene text detection and recognition, document image processing, natural language processing (NLP) for document understanding, and graphics, diagram and math recognition.
This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports. The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
This book constitutes the proceedings of the international workshops co-located with the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland, in September 2021. The total of 59 full and 12 short papers presented in this book were carefully selected from 96 submissions and divided into two volumes. Part II contains 30 full and 8 short papers that stem from the following meetings: Workshop on Machine Learning (WML); Workshop on Open Services and Tools for Document Analysis (OST); Workshop on Industrial Applications of Document Analysis and Recognition (WIADAR); Workshop on Computational Paleography (IWCP); Workshop on Document Images and Language (DIL); Workshop on Graph Representation Learning for Scanned Document Analysis (GLESDO).
This book constitutes the refereed proceedings of the 12th International Conference of the CLEF Association, CLEF 2021, held virtually in September 2021. The conference has a clear focus on experimental information retrieval with special attention to the challenges of multimodality, multilinguality, and interactive search, ranging from unstructured to semi-structured and structured data. The 11 full papers presented in this volume were carefully reviewed and selected from 21 submissions. This year, the contributions addressed the following challenges: application of neural methods for entity recognition as well as misinformation detection in the health area, skills extraction in job-match databases, stock market prediction using financial news, and extraction of audio features for podcast retrieval. In addition, the volume presents 5 "best of the labs" papers, which were reviewed as full paper submissions with the same review criteria. 12 lab overview papers were accepted and represent scientific challenges based on new data sets and real-world problems in multimodal and multilingual information access.
The book presents current research and developments in multilingual speech recognition. The author presents a Multilingual Phone Recognition System (Multi-PRS), developed using a common multilingual phone-set derived from the International Phonetic Alphabet (IPA)-based transcription of six Indian languages: Kannada, Telugu, Bengali, Odia, Urdu, and Assamese. The author shows how the performance of Multi-PRS can be improved using tandem features. The book compares Monolingual Phone Recognition Systems (Mono-PRS) with Multi-PRS, and baseline with tandem systems. Methods are proposed to predict Articulatory Features (AFs) from spectral features using Deep Neural Networks (DNN). Multitask learning is explored to improve the prediction accuracy of AFs. Then, the AFs are explored to improve the performance of Multi-PRS using the lattice rescoring method of combination and the tandem method of combination. The author goes on to develop and evaluate the Language Identification followed by Monolingual phone recognition (LID-Mono) and common multilingual phone-set based multilingual phone recognition systems.