Entity Resolution (ER) lies at the core of data integration and
cleaning and, thus, a large body of research examines ways of
improving its effectiveness and time efficiency. The initial ER
methods primarily target Veracity in the context of structured
(relational) data that are described by a schema of well-known
quality and meaning. To achieve high effectiveness, they leverage
schema, expert, and/or external knowledge. Some of these methods
have been extended to address Volume, processing large datasets through
multi-core or massive parallelization approaches, such as the
MapReduce paradigm. However, these early schema-based approaches
are inapplicable to Web Data, which abound in voluminous, noisy,
semi-structured, and highly heterogeneous information. To address
the additional challenge of Variety, recent works on ER adopt a
novel, loosely schema-aware functionality that emphasizes
scalability and robustness to noise. Another line of present
research focuses on the additional challenge of Velocity, aiming to
process data collections of a continuously increasing volume. The
latest works, though, take advantage of the significant
breakthroughs in Deep Learning and Crowdsourcing, incorporating
external knowledge to enhance the existing works to a significant
extent. This synthesis lecture organizes ER methods into four
generations based on the challenges posed by these four Vs. For
each generation, we outline the corresponding ER workflow, discuss
the state-of-the-art methods per workflow step, and present current
research directions. The discussion of these methods takes a
historical perspective, explaining how the methods evolved
over time along with their similarities and differences.
The lecture also discusses the available ER tools and benchmark
datasets that allow expert as well as novice users to make use of
the available solutions.