Labelling data is one of the most fundamental activities in
science, and has underpinned practice, particularly in medicine,
for decades, as well as research in corpus linguistics since at
least the development of the Brown corpus. With the shift towards
Machine Learning in Artificial Intelligence (AI), the creation of
datasets, also known in AI as corpora, to be used for training and
evaluating AI systems has become a central activity in the field
as well. Early AI datasets were created on an ad-hoc basis to
tackle specific problems. As larger and more reusable datasets were
created, requiring greater investment, the need for a more
systematic approach to dataset creation arose to ensure increased
quality. A range of statistical methods were adopted, often but not
exclusively from the medical sciences, to ensure that the labels
used were not subjective, or to choose among different labels
provided by the coders. A wide variety of such methods is now in
regular use. This book is meant to provide a survey of the most
widely used among these statistical methods supporting annotation
practice. As far as the authors know, this is the first book
attempting to cover the two families of methods in wider use. The
first family of methods is concerned with the development of
labelling schemes and, in particular, with ensuring that such schemes
allow sufficient agreement to be observed among the
coders. The second family includes methods developed to analyze the
output of coders once the scheme has been agreed upon, particularly,
although not exclusively, to identify the most likely label for an
item among those provided by the coders. The focus of this book is
primarily on Natural Language Processing, the area of AI devoted to
the development of models of language interpretation and
production, but many if not most of the methods discussed here are
also applicable to other areas of AI, or indeed, to other areas of
Data Science.