Showing 1 - 8 of 8 matches in All Departments
The Web is causing a revolution in how we represent, retrieve,
and process information. Its growth has given us a universally
accessible database but in the form of a largely unorganized
collection of documents. This is changing, thanks to the
simultaneous emergence of new ways of representing data: from
within the Web community, XML; and from within the database
community, semistructured data. The convergence of these two
approaches has rendered them nearly identical. Now, there is a
concerted effort to develop effective techniques for retrieving and
processing both kinds of data. "Data on the Web" is the only comprehensive, up-to-date
examination of these rapidly evolving retrieval and processing
strategies, which are of critical importance for almost all Web-
and data-intensive enterprises. This book offers detailed solutions
to a wide range of practical problems while equipping you with a
keen understanding of the fundamental issues (including data models,
query languages, and schemas) involved in their design,
implementation, and optimization. You'll find it to be compelling
reading, whether your interest is that of a practitioner involved
in a database-driven Web enterprise or a researcher in computer
science or a related field.
Probabilistic databases are databases where the value of some attributes or the presence of some records is uncertain and known only with some probability. Applications in many areas such as information extraction, RFID and scientific data management, data cleaning, data integration, and financial risk assessment produce large volumes of uncertain data, which are best modeled and processed by a probabilistic database. This book presents the state of the art in representation formalisms and query processing techniques for probabilistic data. It starts by discussing the basic principles for representing large probabilistic databases, by decomposing them into tuple-independent tables, block-independent-disjoint tables, or U-databases. Then it discusses two classes of techniques for query evaluation on probabilistic databases. In extensional query evaluation, the entire probabilistic inference can be pushed into the database engine and, therefore, processed as effectively as the evaluation of standard SQL queries. The relational queries that can be evaluated this way are called safe queries. In intensional query evaluation, the probabilistic inference is performed over a propositional formula called a lineage expression: every relational query can be evaluated this way, but the data complexity depends dramatically on the query being evaluated, and can be #P-hard. The book also discusses some advanced topics in probabilistic data management such as top-k query processing, sequential probabilistic databases, indexing and materialized views, and Monte Carlo databases. Table of Contents: Overview / Data and Query Model / The Query Evaluation Problem / Extensional Query Evaluation / Intensional Query Evaluation / Advanced Techniques
This book constitutes the refereed proceedings of the 11th International Conference on Database Theory, ICDT 2007, held in Spain in January 2007. The papers are organized in topical sections on information integration and peer-to-peer; axiomatizations for XML; expressive power of query languages; incompleteness, inconsistency, and uncertainty; XML schemas and typechecking; stream processing and sequential query processing; ranking; XML update and query; as well as query containment.
Modern database systems enhance the capabilities of traditional database systems by their ability to handle any kind of data, including text, image, audio, and video. Today, database systems are particularly relevant to the Web, as they can provide input to content generators for Web pages, and can handle queries issued over the Internet. The eXtensible Markup Language (XML) is used in applications running the gamut from content management through publishing to Web services and e-commerce. It is used as the universal communication language for exchanging music and graphics as well as purchase orders and technical documentation. As database systems increasingly talk to each other over the Web, there is a fast-growing desire to use XML as the standard exchange format. As a result, many relational database systems can export data as XML documents and import data from XML documents and provide query and update capabilities for XML data. In addition, so-called native XML database and integration systems are appearing on the database market, whose claim is to be especially tailored to storing, maintaining, and easily accessing XML documents. After the huge success of the first XML Database Symposium (XSym 2003) last year in Berlin (already then in conjunction with VLDB) it was decided to establish this symposium as an annual event that is supposed to take place as an integral part of VLDB. The goal of this symposium is to provide a high-quality platform for the presentation and discussion of new research results and system developments. It is targeted at scientists, practitioners, vendors and users of XML and database technologies.
The papers in this volume represent the technical program of the 9th Biennial Workshop on Data Bases and Programming Languages (DBPL 2003), which was held on September 6-8, 2003, in Potsdam, Germany. The workshop meets every two years, and is a well-established forum for ideas that lie at the intersection of database and programming language research. DBPL 2003 continued the tradition of excellence initiated by its predecessors in Roscoff, Finistère (1987), Salishan, Oregon (1989), Nafplion, Argolida (1991), Manhattan, New York (1993), Gubbio, Umbria (1995), Estes Park, Colorado (1997), Kinloch Rannoch, Scotland (1999), and Frascati, Rome (2001). The program committee selected 14 papers out of 22 submissions, and invited two contributions. The 16 talks were presented over three days, in seven sessions. In the invited talk Jennifer Widom presented the paper "CQL: a Language for Continuous Queries over Streams and Relations", coauthored by Arvind Arasu and Shivnath Babu. While a lot of research has been done recently on query processing over data streams, CQL is virtually the first proposal of a query language on streams that is a strict extension of SQL. The language is structured around a simple yet powerful idea: it has two distinct data types, relations and streams, with well-defined operators for mapping between them. Window specification expressions, such as sliding windows, map streams to relations, while operators such as insert stream, delete stream, and relation stream map relations to streams by returning, at each moment in time, the newly inserted tuples, the deleted tuples, or a snapshot of the entire relation. The numerous examples in this paper make a convincing case for the power and usefulness of CQL.
With the development of the World-Wide Web, data management problems have branched out from the traditional framework in which tabular data is processed under the strict control of an application, and address today the rich variety of information that is found on the Web, considering a variety of flexible environments under which such data can be searched, classified, and processed. Database systems are coming forward today in a new role as the primary backend for the information provided on the Web. Most of today's Web accesses trigger some form of content generation from a database, while electronic commerce often triggers intensive DBMS-based applications. The research community has begun to revise data models, query languages, data integration techniques, indexes, query processing algorithms, and transaction concepts in order to cope with the characteristics and scale of the data on the Web. New problems have been identified, among them goal-oriented information gathering, management of semi-structured data, or database-style query languages for Web data, to name just a few. The International Workshop on the Web and Databases (WebDB) is a series of workshops intended to bring together researchers interested in the interaction between databases and the Web. This year's WebDB 2000 was the third in the series, and was held in Dallas, Texas, in conjunction with the ACM SIGMOD International Conference on Management of Data.
The last decade has seen a huge and growing interest in processing large data sets on large distributed clusters. This trend began with the MapReduce framework, and has been widely adopted by several other systems, including PigLatin, Hive, Scope, Dremel, Spark and Myria, to name a few. While the applications of such systems are diverse (for example, machine learning, data analytics), most involve relatively standard data processing tasks like identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results. This has generated great interest in the study of algorithms for data processing on large distributed clusters. Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing. It uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, which is a simplification of the BSP model where the only cost is given by the amount of communication and the number of communication rounds. The survey studies several algorithms for multi-join queries, sorting, and matrix multiplication. It discusses their relationships and common techniques applied across the different data processing tasks.
Probabilistic data is motivated by the need to model uncertainty in large databases. Over the last twenty years or so, both the Database community and the AI community have studied various aspects of probabilistic relational data. Query Processing on Probabilistic Data: A Survey presents the main approaches developed in the literature, reconciling concepts developed in parallel by the two research communities. It starts with an extensive discussion of the main probabilistic data models and their relationships, followed by a brief overview of model counting and its relationship to probabilistic data. The monograph proceeds to discuss lifted probabilistic inference, a suite of techniques developed in parallel by the Database and AI communities for probabilistic query evaluation. It then provides a summary of query compilation, presenting some theoretical results highlighting limitations of various query evaluation techniques on probabilistic data. It ends with a brief discussion of some popular probabilistic data sets, systems, and applications that build on this technology.