Data mining practice often yields a large number of association rules, making it difficult for users to identify those of particular interest to them. It is therefore important to remove insignificant rules, prune redundancy, and summarize, visualize, and post-mine the discovered rules. Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction provides a systematic collection on post-mining, summarization and presentation of association rules, and new forms of association rules. This book presents researchers, practitioners, and academicians with tools to extract useful and actionable knowledge after discovering a large number of association rules.
This book addresses the current status, challenges and future directions of data-driven materials discovery and design. It presents the analysis and learning from data as a key theme in many science and cyber related applications. The challenging open questions as well as future directions in the application of data science to materials problems are sketched. Computational and experimental facilities today generate vast amounts of data at an unprecedented rate. The book gives guidance on discovering new knowledge that enables materials innovation to address grand challenges in energy, environment and security, and on establishing the clearer link needed between the data from these facilities and the theory and underlying science. The role of inference and optimization methods in distilling the data and constraining predictions using insights and results from theory is key to achieving the desired goals of real-time analysis and feedback. Thus, the importance of this book lies in emphasizing that the full value of knowledge-driven discovery using data can only be realized by integrating statistical and information sciences with materials science, which is increasingly dependent on high-throughput and large-scale computational and experimental data gathering efforts. This is especially the case as we enter a new era of big data in materials science with the planning of future experimental facilities such as the Linac Coherent Light Source at Stanford (LCLS-II), the European X-ray Free Electron Laser (EXFEL) and MaRIE (Matter-Radiation Interactions in Extremes), the signature concept facility from Los Alamos National Laboratory. These facilities are expected to generate hundreds of terabytes to several petabytes of in situ spatially and temporally resolved data per sample.
The questions that then arise include how we can learn from the data to accelerate the processing and analysis of reconstructed microstructure, rapidly map spatially resolved properties from high throughput data, devise diagnostics for pattern detection, and guide experiments towards desired targeted properties. The authors are an interdisciplinary group of leading experts who bring the excitement of the nascent and rapidly emerging field of materials informatics to the reader.
This book addresses the problems that are encountered, and solutions that have been proposed, when we aim to identify people and to reconstruct populations under conditions where information is scarce, ambiguous, fuzzy and sometimes erroneous. The process from handwritten registers to a reconstructed digitized population consists of three major phases, reflected in the three main sections of this book. The first phase involves transcribing and digitizing the data while structuring the information in a meaningful and efficient way. In the second phase, records that refer to the same person or group of persons are identified by a process of linkage. In the third and final phase, the information on an individual is combined into a reconstruction of their life course. The studies and examples in this book originate from a range of countries, each with its own cultural and administrative characteristics, and from medieval charters through historical censuses and vital registration, to the modern issue of privacy preservation. Despite the diverse places and times addressed, they all share the study of fundamental issues when it comes to model reasoning for population reconstruction and the possibilities and limitations of information technology to support this process. It is thus not a single discipline that is involved in such an endeavor. Historians, social scientists, and linguists represent the humanities through their knowledge of the complexity of the past, the limitations of sources, and the possible interpretations of information. The availability of big data from digitized archives and the need for complex analyses to identify individuals calls for the involvement of computer scientists. With contributions from all these fields, often in direct cooperation, this book is at the heart of the digital humanities, and will hopefully offer a source of inspiration for future investigations.
This book demonstrates how quantitative methods for text analysis can successfully combine with qualitative methods in the study of different disciplines of the Humanities and Social Sciences (HSS). The book focuses on learning about the evolution of ideas of HSS disciplines through a distant reading of the contents conveyed by scientific literature, in order to retrieve the most relevant topics being debated over time. Quantitative methods, statistical techniques and software packages are used to identify and study the main subject matters of a discipline from raw textual data, both in the past and today. The book also deals with the concept of quality of life of words and aims to foster a discussion about the life cycle of scientific ideas. Textual data retrieved from large corpora pose interesting challenges for any data analysis method and today represent a growing area of research in many fields. New problems emerge from the growing availability of large databases and new methods are needed to retrieve significant information from those large information sources. This book can be used to explain how quantitative methods can be part of the research instrumentation and the "toolbox" of scholars of Humanities and Social Sciences. The book contains numerous examples and a description of the main methods in use, with references to literature and available software. Most of the chapters of the book have been written in a non-technical language for HSS researchers without mathematical, computer or statistical backgrounds.
Pattern Recognition on Oriented Matroids covers a range of innovative problems in combinatorics, poset and graph theories, optimization, and number theory that constitute a far-reaching extension of the arsenal of committee methods in pattern recognition. The groundwork for the modern committee theory was laid in the mid-1960s, when it was shown that the familiar notion of solution to a feasible system of linear inequalities has ingenious analogues which can serve as collective solutions to infeasible systems. A hierarchy of dialects in the language of mathematics, for instance, open cones in the context of linear inequality systems, regions of hyperplane arrangements, and maximal covectors (or topes) of oriented matroids, provides an excellent opportunity to take a fresh look at the infeasible system of homogeneous strict linear inequalities - the standard working model for the contradictory two-class pattern recognition problem in its geometric setting. The universal language of oriented matroid theory considerably simplifies a structural and enumerative analysis of applied aspects of the infeasibility phenomenon. 
The present book is devoted to several selected topics in the emerging theory of pattern recognition on oriented matroids: the questions of existence and applicability of matroidal generalizations of committee decision rules and related graph-theoretic constructions to oriented matroids with very weak restrictions on their structural properties; a study (in which, in particular, interesting subsequences of the Farey sequence appear naturally) of the hierarchy of the corresponding tope committees; a description of the three-tope committees that are the most attractive approximation to the notion of solution to an infeasible system of linear constraints; an application of convexity in oriented matroids as well as blocker constructions in combinatorial optimization and in poset theory to enumerative problems on tope committees; an attempt to clarify how elementary changes (one-element reorientations) in an oriented matroid affect the family of its tope committees; a discrete Fourier analysis of the important family of critical tope committees through rank and distance relations in the tope poset and the tope graph; the characterization of a key combinatorial role played by the symmetric cycles in hypercube graphs.
Contents:
- Oriented Matroids, the Pattern Recognition Problem, and Tope Committees
- Boolean Intervals
- Dehn-Sommerville Type Relations
- Farey Subsequences
- Blocking Sets of Set Families, and Absolute Blocking Constructions in Posets
- Committees of Set Families, and Relative Blocking Constructions in Posets
- Layers of Tope Committees
- Three-Tope Committees
- Halfspaces, Convex Sets, and Tope Committees
- Tope Committees and Reorientations of Oriented Matroids
- Topes and Critical Committees
- Critical Committees and Distance Signals
- Symmetric Cycles in the Hypercube Graphs
This textbook presents the main principles of visual analytics and describes techniques and approaches that have proven their utility and can be readily reproduced. Special emphasis is placed on various instructive examples of analyses, in which the need for and the use of visualisations are explained in detail. The book begins by introducing the main ideas and concepts of visual analytics and explaining why it should be considered an essential part of data science methodology and practices. It then describes the general principles underlying the visual analytics approaches, including those on appropriate visual representation, the use of interactive techniques, and classes of computational methods. It continues with discussing how to use visualisations for getting aware of data properties that need to be taken into account and for detecting possible data quality issues that may impair the analysis. The second part of the book describes visual analytics methods and workflows, organised by various data types including multidimensional data, data with spatial and temporal components, data describing binary relationships, texts, images and video. For each data type, the specific properties and issues are explained, the relevant analysis tasks are discussed, and appropriate methods and procedures are introduced. The focus here is not on the micro-level details of how the methods work, but on how the methods can be used and how they can be applied to data. The limitations of the methods are also discussed and possible pitfalls are identified. The textbook is intended for students in data science and, more generally, anyone doing or planning to do practical data analysis. It includes numerous examples demonstrating how visual analytics techniques are used and how they can help analysts to understand the properties of data, gain insights into the subject reflected in the data, and build good models that can be trusted. 
Based on several years of teaching related courses at the City, University of London, the University of Bonn and TU Munich, as well as industry training at the Fraunhofer Institute IAIS and numerous summer schools, the main content is complemented by sample datasets and detailed, illustrated descriptions of exercises to practice applying visual analytics methods and workflows.
This edited volume addresses the vast challenges of adapting Online Social Media (OSM) to developing research methods and applications. The topics cover generating realistic social network topologies, awareness of user activities, topic and trend generation, estimation of user attributes from their social content, behavior detection, mining social content for common trends, identifying and ranking social content sources, building friend-comprehension tools, and many others. Each of the ten chapters tackles one or more of these issues by proposing new analysis methods, new visualization techniques, or both, for popular OSM applications such as Twitter and Facebook. Online Social Media has become part of the daily lives of hundreds of millions of users, generating an immense amount of 'social content'. Addressing the challenges that stem from this wide adoption of OSM is what makes this book a valuable contribution to the field of social networks.
This book reviews the latest developments in nature-inspired computation, with a focus on the cross-disciplinary applications in data mining and machine learning. Data mining, machine learning and nature-inspired computation are current hot research topics due to their importance in both theory and practical applications. Adopting an application-focused approach, each chapter introduces a specific topic, with detailed descriptions of relevant algorithms, extensive literature reviews and implementation details. Covering topics such as nature-inspired algorithms, swarm intelligence, classification, clustering, feature selection, cybersecurity, learning algorithms over cloud, extreme learning machines, object categorization, particle swarm optimization, flower pollination and firefly algorithms, and neural networks, it also presents case studies and applications, including classifications of crisis-related tweets, extraction of named entities in the Tamil language, performance-based prediction of diseases, and healthcare services. This book is both a valuable reference resource and a practical guide for students, researchers and professionals in computer science, data and management sciences, artificial intelligence and machine learning.
"Reliable Knowledge Discovery" focuses on theory, methods, and techniques for RKDD, a new sub-field of KDD. It studies the theory and methods to assure the reliability and trustworthiness of discovered knowledge and to maintain the stability and consistency of knowledge discovery processes. RKDD has a broad spectrum of applications, especially in critical domains like medicine, finance, and military. "Reliable Knowledge Discovery" also presents methods and techniques for designing robust knowledge-discovery processes. Approaches to assessing the reliability of the discovered knowledge are introduced. Particular attention is paid to methods for reliable feature selection, reliable graph discovery, reliable classification, and stream mining. Estimating the data trustworthiness is covered in this volume as well. Case studies are provided in many chapters. "Reliable Knowledge Discovery" is designed for researchers and advanced-level students focused on computer science and electrical engineering as a secondary text or reference. Professionals working in this related field and KDD application developers will also find this book useful.
Sustainable development is based on the idea that societies should advance without compromising their future development requirements. This book explores how the application of data analytics and digital technologies can ensure that development changes are executed on the basis of factual data and information. It addresses how innovations that rely on digital technologies can support sustainable development across all sectors and all social, economic, and environmental aspects and help us achieve the Sustainable Development Goals (SDGs). The book also highlights techniques, processes, models, tools, and practices used to achieve sustainable development through data analysis. The various topics covered in this book are critically evaluated, not only theoretically, but also from an application perspective. It will be of interest to researchers and students, especially those in the fields of applied data analytics, business intelligence and knowledge management.
This book approaches big data, artificial intelligence, machine learning, and business intelligence through the lens of Data Science. We have grown accustomed to seeing these terms mentioned time and time again in the mainstream media. However, our understanding of what they actually mean often remains limited. This book provides a general overview of the terms and approaches used broadly in data science, and provides detailed information on the underlying theories, models, and application scenarios. Divided into three main parts, it addresses what data science is; how and where it is used; and how it can be implemented using modern open source software. The book offers an essential guide to modern data science for all students, practitioners, developers and managers seeking a deeper understanding of how various aspects of data science work, and of how they can be employed to gain a competitive advantage.
Big data analytics (BDA) can be an important tool in this context, given that many analytic techniques within the big data world have been created specifically to deal with complexity and rapidly changing conditions. The important task for public sector organizations is to liberate analytics from narrow scientific silos and expand it internally to reap maximum benefit across their portfolios of programs. This book highlights contextual factors important to better situating the use of BDA within government organizations and demonstrates the wide range of applications of different BDA techniques. It emphasizes the importance of leadership and organizational practices that can improve performance. It explains that BDA initiatives should not be bolted on but should be integrated into the organization's performance management processes. Equally important, the book includes chapters that demonstrate the diversity of factors that need to be managed to launch and sustain BDA initiatives in public sector organizations.
This book provides information on data-driven infrastructure design, analytical approaches, and technological solutions, with case studies, for smart cities. It gathers multidisciplinary research spanning computer science and engineering, environmental studies, services, urban planning and development, social sciences, and industrial engineering, covering technologies, case studies, novel approaches, and visionary ideas related to data-driven innovative solutions and big data-powered applications that address real-world challenges in building smart cities.
This book presents the combined proceedings of the 7th International Conference on Computer Science and its Applications (CSA-15) and the International Conference on Ubiquitous Information Technologies and Applications (CUTE 2015), both held in Cebu, Philippines, December 15 - 17, 2015. The aim of these two meetings was to promote discussion and interaction among academics, researchers and professionals in the field of computer science covering topics including mobile computing, security and trust management, multimedia systems and devices, networks and communications, databases and data mining, and ubiquitous computing technologies such as ubiquitous communication and networking, ubiquitous software technology, ubiquitous systems and applications, security and privacy. These proceedings reflect the state-of-the-art in the development of computational methods, numerical simulations, error and uncertainty analysis and novel applications of new processing techniques in engineering, science, and other disciplines related to computer science.
This text integrates different mobility data handling processes, from database management to multi-dimensional analysis and mining, into a unified presentation driven by the spectrum of requirements raised by real-world applications. It presents a step-by-step methodology to understand and exploit mobility data: collecting and cleansing data, storage in Moving Object Database (MOD) engines, indexing, processing, analyzing and mining mobility data. Emerging issues, such as semantic and privacy-aware querying and mining as well as distributed data processing, are also covered. Theoretical presentation is smoothly interchanged with hands-on exercises and case studies involving an actual MOD engine. The authors are established experts who address both theoretical and practical dimensions of the field but also present valuable prototype software. The background context, clear explanations and sample exercises make this an ideal textbook for graduate students studying database management, data mining and geographic information systems.
This book contains the proceedings as well as invited papers for the first annual conference of the UNESCO UniTwin Complex Systems Digital Campus (CSDC), an international initiative gathering 120 universities on four continents and structured in ten E-Departments. First Complex Systems Digital Campus World E-Conference 2015 features chapters from the latest research results on theoretical questions of complex systems and their experimental domains. The content bridges the gap between the individual and the collective within complex systems science and new integrative sciences, on topics such as: genes to organisms to ecosystems, atoms to materials to products, and digital media to the Internet. The conference breaks new ground through a dedicated video-conferencing system - a concept at the heart of the international UNESCO UniTwin, embracing scientists from low-income and distant countries. This book promotes an integrated system of research, education, and training. It also aims at contributing to global development by taking into account its social, economic, and cultural dimensions. First Complex Systems Digital Campus World E-Conference 2015 will appeal to students and researchers working in the fields of complex systems, statistical physics, computational intelligence, and biological physics.
This fully updated book collects numerous data mining techniques, reflecting the acceleration and diversity of the development of data-driven approaches to the life sciences. The first half of the volume examines genomics, particularly metagenomics and epigenomics, which promise to deepen our knowledge of genes and genomes, while the second half of the book emphasizes metabolism and the metabolome as well as relevant medicine-oriented subjects. Written for the highly successful Methods in Molecular Biology series, chapters include the kind of detail and expert implementation advice that is useful for getting optimal results. Authoritative and practical, Data Mining for Systems Biology: Methods and Protocols, Second Edition serves as an ideal resource for researchers of biology and relevant fields, such as medical, pharmaceutical, and agricultural sciences, as well as for the scientists and engineers who are working on developing data-driven techniques, such as databases, data sciences, data mining, visualization systems, and machine learning or artificial intelligence that now are central to the paradigm-altering discoveries being made with a higher frequency.
This book presents a comprehensive report on the evolution of Fuzzy Logic since its formulation in Lotfi Zadeh's seminal paper on "fuzzy sets," published in 1965. In addition, it features a stimulating sampling from the broad field of research and development inspired by Zadeh's paper. The chapters, written by pioneers and prominent scholars in the field, show how fuzzy sets have been successfully applied to artificial intelligence, control theory, inference, and reasoning. The book also reports on theoretical issues; features recent applications of Fuzzy Logic in the fields of neural networks, clustering, data mining and software testing; and highlights an important paradigm shift caused by Fuzzy Logic in the area of uncertainty management. Conceived by the editors as an academic celebration of the fiftieth anniversary of the 1965 paper, this work is a must-have for students and researchers wishing to get an inspiring picture of the potentialities, limitations, achievements and accomplishments of Fuzzy Logic-based systems.
This book is devoted to the modeling and understanding of complex urban systems. This second volume of Understanding Complex Urban Systems focuses on the challenges of the modeling tools, concerning, e.g., the quality and quantity of data and the selection of an appropriate modeling approach. It is meant to support urban decision-makers (including municipal politicians, spatial planners, and citizen groups) in choosing an appropriate modeling approach for their particular modeling requirements. The contributors to this volume are from different disciplines, but all share the same goal: optimizing the representation of complex urban systems. They present and discuss a variety of approaches for dealing with data-availability problems and finding appropriate modeling approaches, and not only in terms of computer modeling. The selection of articles featured in this volume reflects a broad variety of new and established modeling approaches, such as:
- An argument for using Big Data methods in conjunction with Agent-based Modeling;
- The introduction of a participatory approach involving citizens, in order to utilize an Agent-based Modeling approach to simulate urban-growth scenarios;
- A presentation of semantic modeling to enable a flexible application of modeling methods and a flexible exchange of data;
- An article about a nested-systems approach to analyzing a city's interdependent subsystems (according to these subsystems' different velocities of change);
- An article about methods that use Luhmann's system theory to characterize cities as systems that are composed of flows;
- An article that demonstrates how the Sen-Nussbaum Capabilities Approach can be used in urban systems to measure household well-being shifts that occur in response to the resettlement of urban households;
- A final article that illustrates how Adaptive Cycles of Complex Adaptive Systems, as well as innovation, can be applied to gain a better understanding of cities and to promote more resilient and more sustainable urban futures.
The book presents some of the most efficient statistical and deterministic methods for information processing and applications, in order to extract targeted information and find hidden patterns. The techniques presented range from Bayesian approaches and their variations, such as sequential Monte Carlo methods, Markov Chain Monte Carlo filters and Rao-Blackwellization, to the biologically inspired paradigm of Neural Networks, and decomposition techniques such as Empirical Mode Decomposition, Independent Component Analysis and Singular Spectrum Analysis. The book is directed to research students, professors, researchers and practitioners interested in exploring advanced techniques in intelligent signal processing and data mining paradigms.
Perspectives on Data Science for Software Engineering presents the best practices of seasoned data miners in software engineering. The idea for this book was conceived during the 2014 conference at Dagstuhl, an invitation-only gathering of leading computer scientists who meet to identify and discuss cutting-edge informatics topics. At the 2014 conference, the question of how to transfer the expertise of seasoned software engineers and data scientists to newcomers in the field dominated many discussions. While there are many books covering data mining and software engineering basics, they present only the fundamentals and lack the perspective that comes from real-world experience. This book offers unique insights into the wisdom of the community's leaders, gathered to share hard-won lessons from the trenches. Ideas are presented in digestible chapters designed to be applicable across many domains. Topics covered include data collection, data sharing, data mining, and how to utilize these techniques in successful software projects. Newcomers to software engineering data science will learn the tips and tricks of the trade, while more experienced data scientists will benefit from war stories that show what traps to avoid.
This book describes in detail sampling techniques that can be used for unsupervised and supervised cases, with a focus on sampling techniques for machine learning algorithms. It covers theory and models of sampling methods for managing scalability and the "curse of dimensionality", their implementations, evaluations, and applications. A large part of the book is dedicated to databases comprising standard feature vectors, and a special section is reserved for the handling of more complex objects and dynamic scenarios. The book is ideal for anyone teaching or learning pattern recognition who is interested in the big data challenge. It provides an accessible introduction to the field and discusses the state of the art concerning sampling techniques for supervised and unsupervised tasks.
- Provides a comprehensive description of sampling techniques for unsupervised and supervised tasks;
- Describes the implementation and evaluation of algorithms that simultaneously manage scalable problems and the curse of dimensionality;
- Addresses the role of sampling in dynamic scenarios, sampling when dealing with complex objects, and new challenges arising from big data.
"This book represents a timely collection of state-of-the-art research on sampling techniques, suitable for anyone who wants to become more familiar with these helpful techniques for tackling the big data challenge." M. Emre Celebi, Ph.D., Professor and Chair, Department of Computer Science, University of Central Arkansas
"In science the difficulty is not to have ideas, but it is to make them work" Carlo Rovelli
This book provides a review of advanced topics relating to the theory, research, analysis and implementation in the context of big data platforms and their applications, with a focus on methods, techniques, and performance evaluation. The explosive growth in the volume, speed, and variety of data being produced every day requires a continuous increase in the processing speeds of servers and of entire network infrastructures, as well as new resource management models. This poses significant challenges (and provides striking development opportunities) for data intensive and high-performance computing, i.e., how to efficiently turn extremely large datasets into valuable information and meaningful knowledge. The task of context data management is further complicated by the variety of sources such data derives from, resulting in different data formats, with varying storage, transformation, delivery, and archiving requirements. At the same time rapid responses are needed for real-time applications. With the emergence of cloud infrastructures, achieving highly scalable data management in such contexts is a critical problem, as the overall application performance is highly dependent on the properties of the data management service.
This book highlights an innovative approach for extracting terminological cores from subject-domain-bounded collections of professional texts. The approach is based on exploiting the phenomenon of terminological saturation. The book presents a formal framework for the method of detecting and measuring terminological saturation as a successive approximation process. It further offers a suite of algorithms that implement the method in software, and comprehensively evaluates all aspects of the method and possible input configurations in experiments on synthetic and real collections of texts in several subject domains. The book demonstrates the use of the developed method and software pipeline in industrial and academic use cases. It also outlines the potential benefits of the method for adoption in industry.
In the fields of data mining and control, the huge amount of unstructured data and the presence of uncertainty in system descriptions have always been critical issues. The book Randomized Algorithms in Automatic Control and Data Mining introduces readers to the fundamentals of randomized algorithm applications in data mining (especially clustering) and in automatic control synthesis. The methods proposed in this book reduce the computational complexity of classical algorithms and the conservativeness of standard robust control techniques. It is shown that when a problem requires "brute force" in selecting among options, algorithms based on random selection of alternatives offer good results with a certain probability within a restricted time, and significantly reduce the volume of operations.