Soft computing, as an engineering science, and statistics, as a
classical branch of mathematics, emphasize different aspects of
data analysis.
Recent achievements in hardware and software development, such as multi-core CPUs and DRAM capacities of multiple terabytes per server, enabled the introduction of a revolutionary technology: in-memory data management. This technology supports the flexible and extremely fast analysis of massive amounts of enterprise data. Professor Hasso Plattner and his research group at the Hasso Plattner Institute in Potsdam, Germany, have been investigating and teaching the corresponding concepts and their adoption in the software industry for years. This book is based on an online course that was first launched in autumn 2012 with more than 13,000 enrolled students and marked the successful starting point of the openHPI e-learning platform. The course is mainly designed for students of computer science, software engineering, and IT-related subjects, but addresses business experts, software developers, technology experts, and IT analysts alike. Plattner and his group focus on exploring the inner mechanics of a column-oriented dictionary-encoded in-memory database. Covered topics include, amongst others, physical data storage and access, basic database operators, compression mechanisms, and parallel join algorithms. Beyond that, implications for future enterprise applications and their development are discussed. Step by step, readers will understand the radical differences and advantages of the new technology over traditional row-oriented, disk-based databases. In this completely revised 2nd edition, we incorporate the feedback of thousands of course participants on openHPI and take into account the latest advancements in hardware and software. Improved figures, explanations, and examples further ease the understanding of the concepts presented. We introduce advanced data management techniques such as transparent aggregate caches and provide new showcases that demonstrate the potential of in-memory databases for two diverse industries: retail and life sciences.
Statistical Decision Problems presents a quick and concise introduction to the theory of risk, deviation and error measures that play a key role in statistical decision problems. It introduces state-of-the-art practical decision making through twenty-one case studies from real-life applications. The case studies cover a broad area of topics and the authors include links with source code and data, a very helpful tool for the reader. At its core, the text demonstrates how to use different factors to formulate statistical decision problems arising in various risk management applications, such as optimal hedging, portfolio optimization, cash flow matching, classification, and more. The presentation is organized into three parts: selected concepts of statistical decision theory, statistical decision problems, and case studies with Portfolio Safeguard. The text is primarily aimed at practitioners in the areas of risk management, decision making, and statistics. However, the inclusion of a fair bit of mathematical rigor renders this monograph an excellent introduction to the theory of general error, deviation, and risk measures for graduate students. It can be used as supplementary reading for graduate courses including statistical analysis, data mining, stochastic programming, and financial engineering, to name a few. The high level of detail may prove useful to applied mathematicians, engineers, and statisticians interested in modeling and managing risk in various applications.
The papers in this volume comprise the refereed proceedings of the conference Artificial Intelligence in Theory and Practice (IFIP AI 2010), which formed part of the 21st World Computer Congress of IFIP, the International Federation for Information Processing (WCC-2010), in Brisbane, Australia in September 2010. The conference was organized by the IFIP Technical Committee on Artificial Intelligence (Technical Committee 12) and its Working Group 12.5 (Artificial Intelligence Applications). All papers were reviewed by at least two members of our Program Committee. Final decisions were made by the Executive Program Committee, which comprised John Debenham (University of Technology, Sydney, Australia), Ilias Maglogiannis (University of Central Greece, Lamia, Greece), Eunika Mercier-Laurent (KIM, France) and myself. The best papers were selected for the conference, either as long papers (maximum 10 pages) or as short papers (maximum 5 pages) and are included in this volume. The international nature of IFIP is amply reflected in the large number of countries represented here. I should like to thank the Conference Chair, Tharam Dillon, for all his efforts and the members of our Program Committee for reviewing papers under a very tight deadline.
This book describes the application of modern information technology to reservoir modeling and well management in shale. While covering Shale Analytics, it focuses on reservoir modeling and production management of shale plays, since conventional reservoir and production modeling techniques do not perform well in this environment. Topics covered include tools for analysis, predictive modeling and optimization of production from shale in the presence of massive multi-cluster, multi-stage hydraulic fractures. Because the physics of storage and fluid flow in shale is neither well understood nor well defined, Shale Analytics avoids making simplifying assumptions and concentrates on facts (Hard Data - Field Measurements) to reach conclusions. Also discussed are important insights into understanding completion practices and re-frac candidate selection and design. The flexibility and power of the technique are demonstrated in numerous real-world situations.
This book brings all of the elements of data mining together in a
single volume, saving the reader the time and expense of making
multiple purchases. It consolidates both introductory and advanced
topics, thereby covering the gamut of data mining and machine
learning tactics, from data integration and pre-processing, to
fundamental algorithms, to optimization techniques and web mining
methodology.
This book features both cutting-edge contributions on managing knowledge in transformational contexts and a selection of real-world case studies. It analyzes how the disruptive power of digitization is becoming a major challenge for knowledge-based value creation worldwide, and subsequently examines the changes in how we manage information and knowledge, communicate, collaborate, learn and decide within and across organizations. The book highlights the opportunities provided by disruptive renewal, while also stressing the need for knowledge workers and organizations to transform governance, leadership and work organization. Emerging new business models and digitally enabled co-creation are presented as drivers that can help establish new ways of managing knowledge. In turn, a number of carefully selected and interpreted case studies provide a link to practice in organizations.
This book addresses the topic of exploiting enterprise-linked data with a particular focus on knowledge construction and accessibility within enterprises. It identifies the gaps between the requirements of enterprise knowledge consumption and "standard" data consuming technologies by analysing real-world use cases, and proposes the enterprise knowledge graph to fill such gaps. It provides concrete guidelines for effectively deploying linked-data graphs within and across business organizations. It is divided into three parts, focusing on the key technologies for constructing, understanding and employing knowledge graphs. Part 1 introduces basic background information and technologies, and presents a simple architecture to elucidate the main phases and tasks required during the lifecycle of knowledge graphs. Part 2 focuses on technical aspects; it starts with state-of-the-art knowledge-graph construction approaches, and then discusses exploration and exploitation techniques as well as advanced question-answering topics concerning knowledge graphs. Lastly, Part 3 demonstrates examples of successful knowledge graph applications in the media industry, healthcare and cultural heritage, and offers conclusions and future visions.
This book describes analytical techniques for optimizing knowledge acquisition, processing, and propagation, especially in the contexts of cyber-infrastructure and big data. Further, it presents easy-to-use analytical models of knowledge-related processes and their applications. The need for such methods stems from the fact that, when we have to decide where to place sensors or which algorithm to use for processing the data, we mostly rely on experts' opinions. As a result, the selected knowledge-related methods are often far from ideal. To make better selections, it is necessary to first create easy-to-use models of knowledge-related processes. This is especially important for big data, where traditional numerical methods are unsuitable. The book offers a valuable guide for everyone interested in big data applications: students looking for an overview of related analytical techniques, practitioners interested in applying optimization techniques, and researchers seeking to improve and expand on these techniques.
This book provides two general granular computing approaches to mining relational data, the first of which uses abstract descriptions of relational objects to build their granular representation, while the second extends existing granular data mining solutions to a relational case. Both approaches make it possible to perform and improve popular data mining tasks such as classification, clustering, and association discovery. How can different relational data mining tasks best be unified? How can the construction process of relational patterns be simplified? How can richer knowledge from relational data be discovered? All these questions can be answered in the same way: by mining relational data in the paradigm of granular computing! This book will allow readers with previous experience in the field of relational data mining to discover the many benefits of its granular perspective. In turn, those readers familiar with the paradigm of granular computing will find valuable insights on its application to mining relational data. Lastly, the book offers all readers interested in computational intelligence in the broader sense the opportunity to deepen their understanding of the newly emerging field of granular-relational data mining.
Mohamed Medhat Gaber "It is not my aim to surprise or shock you - but the simplest way I can summarise is to say that there are now in the world machines that think, that learn and that create. Moreover, their ability to do these things is going to increase rapidly until - in a visible future - the range of problems they can handle will be coextensive with the range to which the human mind has been applied" by Herbert A. Simon (1916-2001) 1 Overview This book suits both graduate students and researchers with a focus on discovering knowledge from scientific data. The use of computational power for data analysis and knowledge discovery in scientific disciplines has found its roots with the revolution of high-performance computing systems. Computational science in physics, chemistry, and biology represents the first step towards automation of data analysis tasks. The rationale behind the development of computational science in different areas was automating mathematical operations performed in those areas. There was no attention paid to the scientific discovery process. Automated Scientific Discovery (ASD) [1-3] represents the second natural step. ASD attempted to automate the process of theory discovery supported by studies in philosophy of science and cognitive sciences. Although early research articles have shown great successes, the area has not evolved due to many reasons. The most important reason was the lack of interaction between scientists and the automating systems.
Hyperspectral Image Fusion is the first text dedicated to the fusion techniques for such a huge volume of data consisting of a very large number of images. This monograph brings out recent advances in the research in the area of visualization of hyperspectral data. It provides a set of pixel-based fusion techniques, each of which is based on a different framework and has its own advantages and disadvantages. The techniques are presented with complete details so that practitioners can easily implement them. It is also demonstrated how one can select only a few specific bands to speed up the process of fusion by exploiting spatial correlation within successive bands of the hyperspectral data. While the techniques for fusion of hyperspectral images are being developed, it is also important to establish a framework for objective assessment of such techniques. This monograph has a dedicated chapter describing various fusion performance measures that are applicable to hyperspectral image fusion. This monograph also presents a notion of consistency of a fusion technique which can be used to verify the suitability and applicability of a technique for fusion of a very large number of images. This book will be a highly useful resource to the students, researchers, academicians and practitioners in the specific area of hyperspectral image fusion, as well as generic image fusion.
This book aims to identify promising future developmental opportunities and applications for Tech Mining. Specifically, the enclosed contributions will pursue three converging themes: (1) the increasing availability of electronic text data resources relating to Science, Technology and Innovation (ST&I); (2) the multiple methods that are able to treat this data effectively and incorporate means to tap into human expertise and interests; (3) translating those analyses to provide useful intelligence on likely future developments of particular emerging S&T targets. Tech Mining can be defined as text analyses of ST&I information resources to generate Competitive Technical Intelligence (CTI). It combines bibliometrics and advanced text analytics, drawing on specialized knowledge pertaining to ST&I. Tech Mining may also be viewed as a special form of "Big Data" analytics because it searches on a target emerging technology (or key organization) of interest in global databases. One then downloads, typically, thousands of field-structured text records (usually abstracts), and analyses those for useful CTI. Forecasting Innovation Pathways (FIP) is a methodology drawing on Tech Mining plus additional steps to elicit stakeholder and expert knowledge to link recent ST&I activity to likely future development. A decade ago, we demeaned Management of Technology (MOT) as somewhat self-satisfied and ignorant. Most technology managers relied overwhelmingly on casual human judgment, largely oblivious of the potential of empirical analyses to inform R&D management and science policy. CTI, Tech Mining, and FIP are changing that. The accumulation of Tech Mining research over the past decade offers a rich resource of means to get at emerging technology developments and organizational networks to date. Efforts to bridge from those recent histories of development to project likely FIP, however, prove considerably harder.
One focus of this volume is to extend the repertoire of information resources; that will enrich FIP. Featuring cases of novel approaches and applications of Tech Mining and FIP, this volume will present frontier advances in ST&I text analytics that will be of interest to students, researchers, practitioners, scholars and policy makers in the fields of R&D planning, technology management, science policy and innovation strategy.
This book presents statistical processes for health care delivery and covers new ideas, methods and technologies used to improve health care organizations. It gathers the proceedings of the Third International Conference on Health Care Systems Engineering (HCSE 2017), which took place in Florence, Italy from May 29 to 31, 2017. The Conference provided a timely opportunity to address operations research and operations management issues in health care delivery systems. Scientists and practitioners discussed new ideas, methods and technologies for improving the operations of health care systems, developed in close collaborations with clinicians. The topics cover a broad spectrum of concrete problems that pose challenges for researchers and practitioners alike: hospital drug logistics, operating theatre management, home care services, modeling, simulation, process mining and data mining in patient care and health care organizations.
This book offers a coherent and comprehensive approach to feature subset selection in the scope of classification problems, explaining the foundations, real application problems and the challenges of feature selection for high-dimensional data. The authors first focus on the analysis and synthesis of feature selection algorithms, presenting a comprehensive review of basic concepts and experimental results of the most well-known algorithms. They then address different real scenarios with high-dimensional data, showing the use of feature selection algorithms in different contexts with different requirements and information: microarray data, intrusion detection, tear film lipid layer classification and cost-based features. The book then delves into the scenario of big dimension, paying attention to important problems under high-dimensional spaces, such as scalability, distributed processing and real-time processing, scenarios that open up new and interesting challenges for researchers. The book is useful for practitioners, researchers and graduate students in the areas of machine learning and data mining.
The rate at which geospatial data is being generated exceeds our computational capabilities to extract patterns for the understanding of a dynamically changing world. Geoinformatics and data mining focus on the development and implementation of computational algorithms to solve these problems. This unique volume contains a collection of chapters on state-of-the-art data mining techniques applied to geoinformatic problems of high complexity and important societal value. Data Mining for Geoinformatics addresses current concerns and developments relating to spatio-temporal data mining issues in remotely-sensed data, problems in meteorological data such as tornado formation, estimation of radiation from the Fukushima nuclear power plant, simulations of traffic data using OpenStreetMap, real time traffic applications of data stream mining, visual analytics of traffic and weather data and the exploratory visualization of collective, mobile objects such as the flocking behavior of wild chickens. This book is designed for researchers and advanced-level students focused on computer science, earth science and geography as a reference or secondary textbook. Practitioners working in the areas of data mining and geoscience will also find this book to be a valuable reference.
Advances in hardware technology have increased the capability to store and record personal data about consumers and individuals. This has caused concerns that personal data may be used for a variety of intrusive or malicious purposes. Privacy Preserving Data Mining: Models and Algorithms proposes a number of techniques to perform the data mining tasks in a privacy-preserving way. These techniques generally fall into the following categories: data modification techniques, cryptographic methods and protocols for data sharing, statistical techniques for disclosure and inference control, query auditing methods, randomization and perturbation-based techniques. This edited volume contains surveys by distinguished researchers in the privacy field. Each survey includes the key research content as well as future research directions of a particular topic in privacy. Privacy Preserving Data Mining: Models and Algorithms is designed for researchers, professors, and advanced-level students in computer science. This book is also suitable for practitioners in industry.
This book presents the Recommender System for Improving Customer Loyalty. New and innovative products have begun appearing from a wide variety of countries, which has increased the need to improve the customer experience. When a customer spends hundreds of thousands of dollars on a piece of equipment, keeping it running efficiently is critical to achieving the desired return on investment. Moreover, managers have discovered that delivering a better customer experience pays off in a number of ways. A study of publicly traded companies conducted by Watermark Consulting found that from 2007 to 2013, companies with a better customer service generated a total return to shareholders that was 26 points higher than the S&P 500. This is only one of many studies that illustrate the measurable value of providing a better service experience. The Recommender System presented here addresses several important issues. (1) It provides a decision framework to help managers determine which actions are likely to have the greatest impact on the Net Promoter Score. (2) The results are based on multiple clients. The data mining techniques employed in the Recommender System allow users to "learn" from the experiences of others, without sharing proprietary information. This dramatically enhances the power of the system. (3) It supplements traditional text mining options. Text mining can be used to identify the frequency with which topics are mentioned, and the sentiment associated with a given topic. The Recommender System allows users to view specific, anonymous comments associated with actual customers. Studying these comments can provide highly accurate insights into the steps that can be taken to improve the customer experience. (4) Lastly, the system provides a sensitivity analysis feature. In some cases, certain actions can be more easily implemented than others. The Recommender System allows managers to "weigh" these actions and determine which ones would have a greater impact.
This book addresses the challenges of social network and social media analysis in terms of prediction and inference. The chapters collected here tackle these issues by proposing new analysis methods and by examining mining methods for the vast amount of social content produced. Social Networks (SNs) have become an integral part of our lives; they are used for leisure, business, government, medical, educational purposes and have attracted billions of users. The challenges that stem from this wide adoption of SNs are vast. These include generating realistic social network topologies, awareness of user activities, topic and trend generation, estimation of user attributes from their social content, and behavior detection. This text has applications to widely used platforms such as Twitter and Facebook and appeals to students, researchers, and professionals in the field.
Provides readers with the methods, algorithms, and means to perform text mining tasks. This book is devoted to the fundamentals of text mining using Perl, an open-source programming tool that is freely available via the Internet (www.perl.org). It covers mining ideas from several perspectives (statistics, data mining, linguistics, and information retrieval) and provides readers with the means to successfully complete text mining tasks on their own. The book begins with an introduction to regular expressions, a text pattern methodology, and quantitative text summaries, all of which are fundamental tools of analyzing text. Then, it builds upon this foundation to explore: probability and texts, including the bag-of-words model; information retrieval techniques such as the TF-IDF similarity measure; concordance lines and corpus linguistics; multivariate techniques such as correlation, principal components analysis, and clustering; and Perl modules, German, and permutation tests. Each chapter is devoted to a single key topic, and the author carefully and thoughtfully introduces mathematical concepts as they arise, allowing readers to learn as they go without having to refer to additional books. The inclusion of numerous exercises and worked-out examples further complements the book's student-friendly format. Practical Text Mining with Perl is ideal as a textbook for undergraduate and graduate courses in text mining and as a reference for a variety of professionals who are interested in extracting information from text documents.
This book presents the proceedings of Workshops and Posters at the 13th International Conference on Spatial Information Theory (COSIT 2017), which is concerned with all aspects of space and spatial environments as experienced, represented and elaborated by humans, other animals and artificial agents. Complementing the main conference proceedings, workshop papers and posters investigate specialized research questions or challenges in spatial information theory and closely related topics, including advances in the conceptualization of specific spatio-temporal domains and diverse applications of spatial and temporal information.
This book provides a general and comprehensible overview of supervised descriptive pattern mining, considering classic algorithms and those based on heuristics. It provides some formal definitions and a general idea about patterns, pattern mining, the usefulness of patterns in the knowledge discovery process, as well as a brief summary of the tasks related to supervised descriptive pattern mining. It also includes a detailed description of the tasks usually grouped under the term supervised descriptive pattern mining: subgroup discovery, contrast sets and emerging patterns. Additionally, this book includes two tasks, class association rules and exceptional models, that are also considered within this field. A major feature of this book is that it provides a general overview (formal definitions and algorithms) of all the tasks included under the term supervised descriptive pattern mining. It considers the analysis of different algorithms either based on heuristics or based on exhaustive search methodologies for any of these tasks. This book also illustrates how important these techniques are in different fields by describing a set of real-world applications. Last but not least, some related tasks are also considered and analyzed. The final aim of this book is to provide a general review of the supervised descriptive pattern mining field, describing its tasks, its algorithms, its applications, and related tasks (those that share some common features). This book targets developers, engineers and computer scientists aiming to apply classic and heuristic-based algorithms to solve different kinds of pattern mining problems and apply them to real issues. Students and researchers working in this field can use this comprehensive book (which includes its methods and tools) as a secondary textbook.
The IEEE ICDM 2004 workshop on the Foundation of Data Mining and the IEEE ICDM 2005 workshop on the Foundation of Semantic Oriented Data and Web Mining focused on topics ranging from the foundations of data mining to new data mining paradigms. The workshops brought together both data mining researchers and practitioners to discuss these two topics while seeking solutions to long standing data mining problems and stimulating new data mining research directions. We feel that the papers presented at these workshops may encourage the study of data mining as a scientific field and spark new communications and collaborations between researchers and practitioners. To express the visions forged in the workshops to a wide range of data mining researchers and practitioners and foster active participation in the study of foundations of data mining, we edited this volume by involving extended and updated versions of selected papers presented at those workshops as well as some other relevant contributions. The content of this book includes studies of foundations of data mining from theoretical, practical, algorithmical, and managerial perspectives. The following is a brief summary of the papers contained in this book.
Recently, there has been a rapid increase in interest regarding social network analysis in the data mining community. Cognitive radios are expected to play a major role in meeting this exploding traffic demand on social networks due to their ability to sense the environment, analyze outdoor parameters, and then make decisions for dynamic time, frequency, space, resource allocation, and management to improve the utilization of mining the social data. Cognitive Social Mining Applications in Data Analytics and Forensics is an essential reference source that reviews cognitive radio concepts and examines their applications to social mining using a machine learning approach so that an adaptive and intelligent mining is achieved. Featuring research on topics such as data mining, real-time ubiquitous social mining services, and cognitive computing, this book is ideally designed for social network analysts, researchers, academicians, and industry professionals.