Recent years have seen a resurgence of interest in Datalog from both industry and the research community. Datalog is a declarative query language that extends relational algebra with recursion. It is used to express a wide spectrum of modern data management tasks, such as data integration, declarative networking, graph analysis, business analytics, and program analysis. The result of this long line of research is a plethora of Datalog engines that support different variants of Datalog and have different technical specifications and capabilities. In this monograph, the authors provide an overview of the architecture and technical characteristics of the various Datalog engines. They identify common architectural decisions and evaluation methods, as well as the data structures and layouts used to speed up query execution. They also discuss the ways in which Datalog engines differ when they specialize in workloads with different characteristics. A particular focus of this monograph is how modern Datalog engines scale to massively parallel environments, which is necessary to support the processing of very large datasets. The authors conclude with opportunities for future research and possible new applications for Datalog engines.
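To make "relational algebra with recursion" concrete, here is a minimal sketch of how a recursive Datalog program can be evaluated. It computes graph reachability with semi-naive evaluation, a standard bottom-up technique. This is an illustration only: the blurb does not say which evaluation methods the monograph covers, and the function name `transitive_closure` and the sample edges are made up for this example.

```python
def transitive_closure(edges):
    """Evaluate the Datalog program below with semi-naive iteration.

        path(x, y) :- edge(x, y).
        path(x, z) :- path(x, y), edge(y, z).
    """
    edges = set(edges)

    # Index edge(y, z) by y so the join on y is a dictionary lookup.
    successors = {}
    for y, z in edges:
        successors.setdefault(y, set()).add(z)

    path = set(edges)   # all facts derived so far (base rule)
    delta = set(edges)  # facts that were new in the previous round

    # Only join the newly derived facts with edge; older facts have
    # already produced everything they can. Stop at the fixpoint.
    while delta:
        new_facts = set()
        for x, y in delta:
            for z in successors.get(y, ()):
                if (x, z) not in path:
                    new_facts.add((x, z))
        path |= new_facts
        delta = new_facts
    return path


# Hypothetical sample data: a chain a -> b -> c -> d.
closure = transitive_closure([("a", "b"), ("b", "c"), ("c", "d")])
```

The loop is the part that plain relational algebra cannot express: the join is repeated until no new `path` facts appear, and restricting each round to `delta` avoids rederiving facts that are already known.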
The last decade has seen a huge and growing interest in processing large data sets on large distributed clusters. This trend began with the MapReduce framework and has since been taken up by several other systems, including PigLatin, Hive, Scope, Dremel, Spark, and Myria. While the applications of such systems are diverse (for example, machine learning and data analytics), most involve relatively standard data processing tasks such as identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results. This has generated great interest in the study of algorithms for data processing on large distributed clusters. Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing. It uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, a simplification of the BSP model in which the only costs are the amount of communication and the number of communication rounds. The survey studies several algorithms for multi-join queries, sorting, and matrix multiplication. It discusses their relationships and the common techniques applied across the different data processing tasks.
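As a rough illustration of the MPC cost model, the sketch below simulates a one-round hash-partitioned join of two binary relations across p servers and reports the communication load, taken here as the largest number of tuples any single server receives. This is a simplified example under stated assumptions, not one of the survey's algorithms: the name `mpc_hash_join`, the definition of load, and the sample relations are all introduced for this sketch.

```python
from collections import defaultdict


def mpc_hash_join(R, S, p):
    """Simulate a one-round join of R(a, b) and S(b, c) on p servers.

    Returns the join result and the maximum number of tuples
    received by any one server in the single communication round.
    """
    inbox_r = [[] for _ in range(p)]
    inbox_s = [[] for _ in range(p)]

    # Communication round: route every tuple by the hash of its join key b,
    # so tuples that can join always meet on the same server.
    for a, b in R:
        inbox_r[hash(b) % p].append((a, b))
    for b, c in S:
        inbox_s[hash(b) % p].append((b, c))

    # The cost counted by the model: the heaviest server's input size.
    load = max(len(r) + len(s) for r, s in zip(inbox_r, inbox_s))

    # Local computation is free in the model; each server joins its share.
    output = set()
    for r_part, s_part in zip(inbox_r, inbox_s):
        by_key = defaultdict(list)
        for b, c in s_part:
            by_key[b].append(c)
        for a, b in r_part:
            for c in by_key.get(b, ()):
                output.add((a, b, c))
    return output, load


# Hypothetical sample relations with integer join keys.
R = [(1, 10), (2, 21), (3, 10)]
S = [(10, "x"), (21, "y"), (30, "z")]
result, load = mpc_hash_join(R, S, p=2)
```

Only the routing step contributes to the cost, which is why skewed join keys matter in this model: if many tuples share one value of b, a single server receives most of the input and the load approaches the total input size.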