This book introduces basic supervised learning algorithms
applicable to natural language processing (NLP) and shows how the
performance of these algorithms can often be improved by exploiting
the marginal distribution of large amounts of unlabeled data. One reason unlabeled data helps is data sparsity, i.e., the limited amount of labeled data available in NLP. However, in most real-world NLP applications our labeled data is also heavily biased. The book therefore introduces extensions of supervised learning algorithms that cope with data sparsity and different kinds of sampling bias. It is intended to be both readable by first-year students and interesting to an expert audience. My intention was to introduce
what is necessary to appreciate the major challenges we face in
contemporary NLP related to data sparsity and sampling bias,
without wasting too much time on details about supervised learning
algorithms or particular NLP applications. I use text
classification, part-of-speech tagging, and dependency parsing as
running examples, and limit myself to a small set of cardinal
learning algorithms. I have worried less about theoretical
guarantees ("this algorithm never does too badly") than about
useful rules of thumb ("in this case this algorithm may perform
really well"). In NLP, data is so noisy, biased, and non-stationary
that few theoretical guarantees can be established, and we are
typically left with our gut feelings and a catalogue of crazy
ideas. I hope this book will provide its readers with both.
Throughout the book we include snippets of Python code and empirical evaluations where relevant.
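To give a flavor of the kind of algorithm involved, here is a minimal sketch of self-training, one classic way of exploiting unlabeled data in text classification. The sentences, labels, and confidence threshold below are toy values invented for illustration, not material from the book.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy labeled and unlabeled sentiment data (invented for illustration).
labeled = ["great movie", "wonderful acting", "terrible plot", "awful film"]
y = np.array([1, 1, 0, 0])            # 1 = positive, 0 = negative
unlabeled = ["really wonderful movie", "the plot was awful",
             "great acting", "terrible film"]

vec = TfidfVectorizer().fit(labeled + unlabeled)
X_lab = vec.transform(labeled).toarray()
X_unl = vec.transform(unlabeled).toarray()

clf = LogisticRegression().fit(X_lab, y)
for _ in range(3):                    # a few self-training rounds
    if len(X_unl) == 0:
        break
    proba = clf.predict_proba(X_unl)
    confident = proba.max(axis=1) >= 0.6   # assumed confidence threshold
    if not confident.any():
        break
    # Move confidently pseudo-labeled examples into the training set.
    X_lab = np.vstack([X_lab, X_unl[confident]])
    y = np.concatenate([y, clf.classes_[proba[confident].argmax(axis=1)]])
    X_unl = X_unl[~confident]
    clf = LogisticRegression().fit(X_lab, y)

The key design choice is the confidence threshold: set too low, the model reinforces its own mistakes; set too high, no unlabeled data is ever used.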
This book presents a taxonomy framework and survey of methods
relevant to explaining the decisions and analyzing the inner
workings of Natural Language Processing (NLP) models. The book is
intended to provide a snapshot of Explainable NLP, though the field continues to grow rapidly. It aims to be both readable by first-year M.Sc. students and interesting to an expert audience.
The book opens by motivating the need for a consistent taxonomy, pointing out inconsistencies and redundancies in previous taxonomies. It goes on to present (i) a taxonomy or framework for
thinking about how approaches to explainable NLP relate to one
another; (ii) brief surveys of each of the classes in the taxonomy,
with a focus on methods that are relevant for NLP; and (iii) a
discussion of the inherent limitations of some classes of methods,
as well as how best to evaluate them. The book closes by
providing a list of resources for further research on
explainability.
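As an illustration of one common class of post-hoc methods such a taxonomy covers, here is a minimal occlusion-based explanation sketch: a token's importance is the drop in the predicted class probability when that token is removed. The classifier and sentences are toy stand-ins invented for illustration.

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy sentiment classifier (training data invented for illustration).
train = ["good film", "great plot", "bad film", "awful plot"]
y = [1, 1, 0, 0]
vec = CountVectorizer().fit(train)
clf = LogisticRegression().fit(vec.transform(train), y)

def occlusion_scores(sentence):
    # Importance of a token = drop in the probability of the predicted
    # class when that token is occluded (removed) from the input.
    tokens = sentence.split()
    full = clf.predict_proba(vec.transform([sentence]))[0]
    pred = full.argmax()
    scores = []
    for i in range(len(tokens)):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        p = clf.predict_proba(vec.transform([reduced]))[0][pred]
        scores.append(full[pred] - p)
    return list(zip(tokens, scores))

print(occlusion_scores("great film"))  # per-token importance scores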
The majority of natural language processing (NLP) is English
language processing, and while there is good language technology
support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano (and most other languages) remains limited. Bridging this digital divide is important for scientific and democratic reasons, but it also represents enormous growth potential. A key challenge in doing so is learning to align the basic meaning-bearing units of different languages. In this
book, the authors survey and discuss recent and historical work on
supervised and unsupervised learning of such alignments.
Specifically, the book focuses on so-called cross-lingual word
embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods in comparable form, making it easy to compare wildly different approaches. In so
doing, the authors establish previously unreported relations
between these methods and are able to present a fast-growing
literature in a very compact way. Furthermore, the authors discuss
how best to evaluate cross-lingual word embedding methods and
survey the resources available for students and researchers
interested in this topic.
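As an example of the mapping-based methods such a survey covers, here is a minimal sketch of supervised alignment via the orthogonal Procrustes solution, which learns a rotation of the source embedding space onto the target space from a seed dictionary. The random vectors below stand in for real pretrained embeddings.

import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 200                  # embedding dimension, seed dictionary size
X = rng.normal(size=(n, d))     # source-language vectors for seed pairs
Y = rng.normal(size=(n, d))     # target-language vectors for seed pairs

# Orthogonal Procrustes: W = argmin ||XW - Y||_F with W orthogonal.
# Given the SVD X^T Y = U S V^T, the closed-form solution is W = U V^T.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# Map source vectors into the target space; word translations can then
# be retrieved as nearest neighbours under cosine similarity.
mapped = X @ W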