This book offers an original and broad exploration of the fundamental methods in Clustering and Combinatorial Data Analysis, presenting new formulations and ideas within this very active field. With extensive introductions, formal and mathematical developments and real case studies, this book provides readers with a deeper understanding of the mutual relationships between these methods, which are clearly expressed with respect to three facets: logical, combinatorial and statistical. Using relational mathematical representation, all types of data structures can be handled in precise and unified ways, which the author highlights in three stages: clustering a set of descriptive attributes; clustering a set of objects or a set of object categories; and establishing correspondence between these two dual clusterings. Tools for interpreting the reasons for a given cluster or clustering are also included. Foundations and Methods in Combinatorial and Statistical Data Analysis and Clustering will be a valuable resource for students and researchers who are interested in the areas of Data Analysis, Clustering, Data Mining and Knowledge Discovery.
This book highlights recent research advances in unsupervised learning using natural computing techniques such as artificial neural networks, evolutionary algorithms, swarm intelligence, artificial immune systems, artificial life, quantum computing, DNA computing, and others. The book also includes information on the use of natural computing techniques for unsupervised learning tasks. It features several trending topics, such as big data scalability, wireless network analysis, engineering optimization, social media, and complex network analytics. It shows how these applications have triggered a number of new natural computing techniques to improve the performance of unsupervised learning methods. With this book, readers can readily grasp new advances in this area and gain a systematic, in-depth understanding of its scope. Readers can rapidly explore new methods and new applications at the junction between natural computing and unsupervised learning. The book includes advances on unsupervised learning using natural computing techniques; reports on topics in emerging areas such as evolutionary multi-objective unsupervised learning; and features natural computing techniques such as evolutionary multi-objective algorithms and many-objective swarm intelligence algorithms.
Sequential data from Web server logs, online transaction logs, and performance measurements is collected each day. This sequential data is a valuable source of information, as it allows individuals to search for a particular value or event and also facilitates analysis of the frequency of certain events or sets of related events. Finding patterns in sequences is of utmost importance in many areas of science, engineering, and business scenarios. Pattern Discovery Using Sequence Data Mining: Applications and Studies provides a comprehensive view of sequence mining techniques and presents current research and case studies in pattern discovery in sequential data by researchers and practitioners. This research identifies industry applications introduced by various sequence mining approaches.
The ideas introduced in this book explore the relationships among rule based systems, machine learning and big data. Rule based systems are seen as a special type of expert systems, which can be built by using expert knowledge or learning from real data. The book focuses on the development and evaluation of rule based systems in terms of accuracy, efficiency and interpretability. In particular, a unified framework for building rule based systems, which consists of the operations of rule generation, rule simplification and rule representation, is presented. Each of these operations is detailed using specific methods or techniques. In addition, this book also presents some ensemble learning frameworks for building ensemble rule based systems.
This edited volume focuses on big data implications for computational social science and humanities from management to usage. The first part of the book covers geographic data, text corpus data, and social media data, and exemplifies their concrete applications in a wide range of fields including anthropology, economics, finance, geography, history, linguistics, political science, psychology, public health, and mass communications. The second part of the book provides a panoramic view of the development of big data in the fields of computational social sciences and humanities. The following questions are addressed: Why is there a need for novel data governance for this new type of data? Why is big data important for social scientists? How will it revolutionize the way social scientists conduct research? With the advent of the information age and technologies such as Web 2.0, ubiquitous computing, wearable devices, and the Internet of Things, digital society has fundamentally changed what we now know as "data", the very use of this data, and what we now call "knowledge". Big data has become the standard in social sciences, and has made these sciences more computational. Big Data in Computational Social Science and Humanities will appeal to graduate students and researchers working in the many subfields of the social sciences and humanities.
This book presents a unique approach to stream data mining. Unlike the vast majority of previous approaches, which are largely based on heuristics, it highlights methods and algorithms that are mathematically justified. First, it describes how to adapt static decision trees to accommodate data streams; in this regard, new splitting criteria are developed to guarantee that they are asymptotically equivalent to the classical batch tree. Moreover, new decision trees are designed, leading to the original concept of hybrid trees. In turn, nonparametric techniques based on Parzen kernels and orthogonal series are employed to address concept drift in the problem of non-stationary regressions and classification in a time-varying environment. Lastly, an extremely challenging problem that involves designing ensembles and automatically choosing their sizes is described and solved. Given its scope, the book is intended for a professional audience of researchers and practitioners who deal with stream data, e.g. in telecommunication, banking, and sensor networks.
This book introduces Meaningful Purposive Interaction Analysis (MPIA) theory, which combines social network analysis (SNA) with latent semantic analysis (LSA) to help create and analyse a meaningful learning landscape from the digital traces left by a learning community in the co-construction of knowledge. The hybrid algorithm is implemented in the statistical programming language and environment R, introducing packages which capture - through matrix algebra - elements of learners' work with more knowledgeable others and resourceful content artefacts. The book provides comprehensive package-by-package application examples, and code samples that guide the reader through the MPIA model to show how the MPIA landscape can be constructed and the learner's journey mapped and analysed. This building block application will allow the reader to progress to using and building analytics to guide students and support decision-making in learning.
This book provides information on data-driven infrastructure design, analytical approaches, and technological solutions with case studies for smart cities. It aims to bring together multidisciplinary research spanning computer science and engineering, environmental studies, services, urban planning and development, social sciences and industrial engineering, covering technologies, case studies, novel approaches, and visionary ideas related to data-driven innovative solutions and big data-powered applications that address the real-world challenges of building smart cities.
The work presented in this book is a combination of theoretical advancements of big data analysis, cloud computing, and their potential applications in scientific computing. The theoretical advancements are supported with illustrative examples and their applications in handling real-life problems. The applications are mostly drawn from real-life situations. The book discusses major issues pertaining to big data analysis using computational intelligence techniques and some issues of cloud computing. An elaborate bibliography is provided at the end of each chapter. The material in this book includes concepts, figures, graphs, and tables to guide researchers in the area of big data analysis and cloud computing.
This textbook presents the main principles of visual analytics and describes techniques and approaches that have proven their utility and can be readily reproduced. Special emphasis is placed on various instructive examples of analyses, in which the need for and the use of visualisations are explained in detail. The book begins by introducing the main ideas and concepts of visual analytics and explaining why it should be considered an essential part of data science methodology and practices. It then describes the general principles underlying the visual analytics approaches, including those on appropriate visual representation, the use of interactive techniques, and classes of computational methods. It continues by discussing how to use visualisations for becoming aware of data properties that need to be taken into account and for detecting possible data quality issues that may impair the analysis. The second part of the book describes visual analytics methods and workflows, organised by various data types including multidimensional data, data with spatial and temporal components, data describing binary relationships, texts, images and video. For each data type, the specific properties and issues are explained, the relevant analysis tasks are discussed, and appropriate methods and procedures are introduced. The focus here is not on the micro-level details of how the methods work, but on how the methods can be used and how they can be applied to data. The limitations of the methods are also discussed and possible pitfalls are identified. The textbook is intended for students in data science and, more generally, anyone doing or planning to do practical data analysis. It includes numerous examples demonstrating how visual analytics techniques are used and how they can help analysts to understand the properties of data, gain insights into the subject reflected in the data, and build good models that can be trusted.
Based on several years of teaching related courses at the City, University of London, the University of Bonn and TU Munich, as well as industry training at the Fraunhofer Institute IAIS and numerous summer schools, the main content is complemented by sample datasets and detailed, illustrated descriptions of exercises to practice applying visual analytics methods and workflows.
This book provides fresh insights into the cutting edge of multimedia data mining, reflecting how the research focus has shifted towards networked social communities, mobile devices and sensors. The work describes how the history of multimedia data processing can be viewed as a sequence of disruptive innovations. Across the chapters, the discussion covers the practical frameworks, libraries, and open source software that enable the development of ground-breaking research into practical applications. Features: reviews how innovations in mobile, social, cognitive, cloud and organic based computing impact the development of multimedia data mining; provides practical details on implementing the technology for solving real-world problems; includes chapters devoted to privacy issues in multimedia social environments and large-scale biometric data processing; covers content and concept based multimedia search and advanced algorithms for multimedia data representation, processing and visualization.
This book addresses the current status, challenges and future directions of data-driven materials discovery and design. It presents the analysis and learning from data as a key theme in many science and cyber related applications. The challenging open questions as well as future directions in the application of data science to materials problems are sketched. Computational and experimental facilities today generate vast amounts of data at an unprecedented rate. The book gives guidance on discovering new knowledge that enables materials innovation to address grand challenges in energy, environment and security, and on the clearer link needed between the data from these facilities and the theory and underlying science. The role of inference and optimization methods in distilling the data and constraining predictions using insights and results from theory is key to achieving the desired goals of real time analysis and feedback. Thus, the importance of this book lies in emphasizing that the full value of knowledge driven discovery using data can only be realized by integrating statistical and information sciences with materials science, which is increasingly dependent on high throughput and large scale computational and experimental data gathering efforts. This is especially the case as we enter a new era of big data in materials science with the planning of future experimental facilities such as the Linac Coherent Light Source at Stanford (LCLS-II), the European X-ray Free Electron Laser (EXFEL) and MaRIE (Matter Radiation in Extremes), the signature concept facility from Los Alamos National Laboratory. These facilities are expected to generate hundreds of terabytes to several petabytes of in situ spatially and temporally resolved data per sample.
The questions that then arise include how we can learn from the data to accelerate the processing and analysis of reconstructed microstructure, rapidly map spatially resolved properties from high throughput data, devise diagnostics for pattern detection, and guide experiments towards desired targeted properties. The authors are an interdisciplinary group of leading experts who bring the excitement of the nascent and rapidly emerging field of materials informatics to the reader.
In this work we plan to review the main techniques for enumeration algorithms and to show four examples of enumeration algorithms that can be applied to efficiently deal with some biological problems modelled by using biological networks: enumerating central and peripheral nodes of a network, enumerating stories, enumerating paths or cycles, and enumerating bubbles. Notice that the corresponding computational problems we define are of more general interest and our results hold in the case of arbitrary graphs. Enumerating the most and least central vertices in a network according to their eccentricity is an example of an enumeration problem whose solutions are polynomial in number and can be listed in polynomial time, very often in linear or almost linear time in practice. Enumerating stories, i.e. all maximal directed acyclic subgraphs of a graph G whose sources and targets belong to a predefined subset of the vertices, is on the other hand an example of an enumeration problem with an exponential number of solutions, that can be solved by using a non-trivial brute-force approach. Given a metabolic network, each individual story should explain how some interesting metabolites are derived from some others through a chain of reactions, by keeping all alternative pathways between sources and targets. Enumerating cycles or paths in an undirected graph, such as a protein-protein interaction undirected network, is an example of an enumeration problem in which all the solutions can be listed through an optimal algorithm, i.e. the time required to list all the solutions is dominated by the time to read the graph plus the time required to print all of them. By extending this result to directed graphs, it would be possible to deal more efficiently with feedback loops and signed paths analysis in signed or interaction directed graphs, such as gene regulatory networks.
Finally, enumerating mouths or bubbles with a source s in a directed graph, that is enumerating all the two vertex-disjoint directed paths between the source s and all the possible targets, is an example of an enumeration problem in which all the solutions can be listed through a linear delay algorithm, meaning that the delay between any two consecutive solutions is linear, by turning the problem into a constrained cycle enumeration problem. Such patterns, in a de Bruijn graph representation of the reads obtained by sequencing, are related to polymorphisms in DNA- or RNA-seq data.
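As a rough illustration of the first problem named in the description above (listing the most and least central vertices of a network by eccentricity), here is a minimal Python sketch. It is not the book's algorithm: it runs a breadth-first search from every vertex, which is the simple quadratic baseline rather than the near-linear behaviour the description mentions. The function names and the small example graph are assumptions made for illustration.

```python
from collections import deque


def eccentricities(adj):
    """Return {vertex: eccentricity} for a connected, unweighted, undirected graph.

    `adj` maps each vertex to a list of its neighbours (hypothetical input format).
    """
    ecc = {}
    for source in adj:
        # Breadth-first search gives shortest hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("eccentricity is only defined here for connected graphs")
        # Eccentricity: distance to the farthest vertex.
        ecc[source] = max(dist.values())
    return ecc


def central_and_peripheral(adj):
    """Return (center, periphery): vertices of minimum and maximum eccentricity."""
    ecc = eccentricities(adj)
    radius = min(ecc.values())
    diameter = max(ecc.values())
    center = sorted(v for v, e in ecc.items() if e == radius)
    periphery = sorted(v for v, e in ecc.items() if e == diameter)
    return center, periphery


# Example: the path a - b - c - d - e.
path = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
print(central_and_peripheral(path))  # (['c'], ['a', 'e'])
```

On the five-vertex path, the middle vertex has the smallest eccentricity (2) and the two endpoints have the largest (4), so they are reported as the center and the periphery respectively.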
This book highlights the state of the art and recent advances in Big Data clustering methods and their innovative applications in contemporary AI-driven systems. The book chapters discuss Deep Learning for Clustering, Blockchain data clustering, Cybersecurity applications such as insider threat detection, scalable distributed clustering methods for massive volumes of data; clustering Big Data Streams such as streams generated by the confluence of Internet of Things, digital and mobile health, human-robot interaction, and social networks; Spark-based Big Data clustering using Particle Swarm Optimization; and Tensor-based clustering for Web graphs, sensor streams, and social networks. The chapters in the book include a balanced coverage of big data clustering theory, methods, tools, frameworks, applications, representation, visualization, and clustering validation.
This book demonstrates how quantitative methods for text analysis can successfully combine with qualitative methods in the study of different disciplines of the Humanities and Social Sciences (HSS). The book focuses on learning about the evolution of ideas of HSS disciplines through a distant reading of the contents conveyed by scientific literature, in order to retrieve the most relevant topics being debated over time. Quantitative methods, statistical techniques and software packages are used to identify and study the main subject matters of a discipline from raw textual data, both in the past and today. The book also deals with the concept of quality of life of words and aims to foster a discussion about the life cycle of scientific ideas. Textual data retrieved from large corpora pose interesting challenges for any data analysis method and today represent a growing area of research in many fields. New problems emerge from the growing availability of large databases and new methods are needed to retrieve significant information from those large information sources. This book can be used to explain how quantitative methods can be part of the research instrumentation and the "toolbox" of scholars of Humanities and Social Sciences. The book contains numerous examples and a description of the main methods in use, with references to literature and available software. Most of the chapters of the book have been written in a non-technical language for HSS researchers without mathematical, computer or statistical backgrounds.
The purpose of this book is to review the recent advances in E-health technologies and applications. In particular, the book investigates the recent advancements in physical design of medical devices, signal processing and emergent wireless technologies for E-health. In a second part, novel security and privacy solutions for IoT-based E-health applications are presented. The last part of the book is focused on applications, data mining and data analytics for E-health using artificial intelligence and cloud infrastructure. E-health has been an evolving concept since its inception, due to the numerous technologies that can be adapted to offer new innovative and efficient E-health applications. Recently, with the tremendous advancement of wireless technologies, sensors and wearable devices and software technologies, new opportunities have arisen and transformed the E-health field. Moreover, with the expansion of the Internet of Things, and the huge amount of data that connected E-health devices and applications are generating, it is also mandatory to address new challenges related to the data management, applications management and their security. Through this book, readers will be introduced to all these concepts. This book is intended for all practitioners (industrial and academic) interested in widening their knowledge in wireless communications and embedded technologies applied to E-health, cloud computing, artificial intelligence and big data for E-health applications and security issues in E-health.
This thesis focuses on the problem of optimizing the quality of network multimedia services. This problem spans multiple domains, from subjective perception of multimedia quality to computer networks management. The work done in this thesis approaches the problem at different levels, developing methods for modeling the subjective perception of quality based on objectively measurable parameters of the multimedia coding process as well as the transport over computer networks. The modeling of subjective perception is motivated by work done in psychophysics, while using Machine Learning techniques to map network conditions to the human perception of video services. Furthermore, the work develops models for efficient control of multimedia systems operating in dynamic networked environments with the goal of delivering optimized Quality of Experience. Overall this thesis delivers a set of methods for monitoring and optimizing the quality of multimedia services that adapt to the dynamic environment of computer networks in which they operate.
This thesis covers a diverse set of topics related to space-based gravitational wave detectors such as the Laser Interferometer Space Antenna (LISA). The core of the thesis is devoted to the preprocessing of the interferometric link data for a LISA constellation, specifically developing optimal Kalman filters to reduce arm length noise due to clock noise. The approach is to apply Kalman filters of increasing complexity to make optimal estimates of relevant quantities such as constellation arm length, relative clock drift, and Doppler frequencies based on the available measurement data. Depending on the complexity of the filter and the simulated data, these Kalman filter estimates can provide up to a few orders of magnitude improvement over simpler estimators. While the basic concept of the LISA measurement (Time Delay Interferometry) was worked out some time ago, this work brings a level of rigor to the processing of the constellation-level data products. The thesis concludes with some topics related to eLISA, such as a new class of phenomenological waveforms for extreme mass-ratio inspiral sources (EMRIs, one of the main sources for eLISA), an octahedral space-based GW detector that does not require drag-free test masses, and some efficient template-search algorithms for the case of relatively high SNR signals.
This book will provide a comprehensive overview of business analytics, for those who have either a technical background (quantitative methods) or a practitioner business background. Business analytics, in the context of the 4th Industrial Revolution, is the "new normal" for businesses that operate in this digital age. This book provides a comprehensive primer and overview of the field (and related fields such as Business Intelligence and Data Science). It will discuss the field as it applies to financial institutions, with some minor departures to other industries. Readers will gain understanding and insight into the field of data science, including traditional as well as emerging techniques. Further, many chapters are dedicated to the establishment of a data-driven team - from executive buy-in and corporate governance to managing and quantifying the return of data-driven projects.
This book examines the principles of and advances in personalized task recommendation in crowdsourcing systems, with the aim of improving their overall efficiency. It discusses the challenges faced by personalized task recommendation when crowdsourcing systems channel human workforces, knowledge, skills and perspectives beyond traditional organizational boundaries. The solutions presented help interested individuals find tasks that closely match their personal interests and capabilities in a context of ever-increasing opportunities of participating in crowdsourcing activities. In order to explore the design of mechanisms that generate task recommendations based on individual preferences, the book first lays out a conceptual framework that guides the analysis and design of crowdsourcing systems. Based on a comprehensive review of existing research, it then develops and evaluates a new kind of task recommendation service that integrates with existing systems. The resulting prototype provides a platform for both the field study and the practical implementation of task recommendation in productive environments.
This book reviews the latest developments in nature-inspired computation, with a focus on the cross-disciplinary applications in data mining and machine learning. Data mining, machine learning and nature-inspired computation are current hot research topics due to their importance in both theory and practical applications. Adopting an application-focused approach, each chapter introduces a specific topic, with detailed descriptions of relevant algorithms, extensive literature reviews and implementation details. Covering topics such as nature-inspired algorithms, swarm intelligence, classification, clustering, feature selection, cybersecurity, learning algorithms over cloud, extreme learning machines, object categorization, particle swarm optimization, flower pollination and firefly algorithms, and neural networks, it also presents case studies and applications, including classifications of crisis-related tweets, extraction of named entities in the Tamil language, performance-based prediction of diseases, and healthcare services. This book is both a valuable reference resource and a practical guide for students, researchers and professionals in computer science, data and management sciences, artificial intelligence and machine learning.
As the applications of data mining, the non-trivial extraction of implicit information in a data set, have expanded in recent years, so has the need for techniques that are tolerant of imprecision, uncertainty, and approximation. Intelligent Soft Computation and Evolving Data Mining: Integrating Advanced Technologies is a compendium that addresses this need. It integrates contrasting techniques of conventional hard computing and soft computing to exploit the tolerance for imprecision, uncertainty, partial truth, and approximation to achieve tractability, robustness and low-cost solutions. This book provides a reference to researchers, practitioners, and students in both soft computing and data mining communities, forming a foundation for the development of the field.
Pattern Recognition on Oriented Matroids covers a range of innovative problems in combinatorics, poset and graph theories, optimization, and number theory that constitute a far-reaching extension of the arsenal of committee methods in pattern recognition. The groundwork for the modern committee theory was laid in the mid-1960s, when it was shown that the familiar notion of solution to a feasible system of linear inequalities has ingenious analogues which can serve as collective solutions to infeasible systems. A hierarchy of dialects in the language of mathematics, for instance, open cones in the context of linear inequality systems, regions of hyperplane arrangements, and maximal covectors (or topes) of oriented matroids, provides an excellent opportunity to take a fresh look at the infeasible system of homogeneous strict linear inequalities - the standard working model for the contradictory two-class pattern recognition problem in its geometric setting. The universal language of oriented matroid theory considerably simplifies a structural and enumerative analysis of applied aspects of the infeasibility phenomenon. 
The present book is devoted to several selected topics in the emerging theory of pattern recognition on oriented matroids: the questions of existence and applicability of matroidal generalizations of committee decision rules and related graph-theoretic constructions to oriented matroids with very weak restrictions on their structural properties; a study (in which, in particular, interesting subsequences of the Farey sequence appear naturally) of the hierarchy of the corresponding tope committees; a description of the three-tope committees that are the most attractive approximation to the notion of solution to an infeasible system of linear constraints; an application of convexity in oriented matroids as well as blocker constructions in combinatorial optimization and in poset theory to enumerative problems on tope committees; an attempt to clarify how elementary changes (one-element reorientations) in an oriented matroid affect the family of its tope committees; a discrete Fourier analysis of the important family of critical tope committees through rank and distance relations in the tope poset and the tope graph; the characterization of a key combinatorial role played by the symmetric cycles in hypercube graphs. Contents: Oriented Matroids, the Pattern Recognition Problem, and Tope Committees; Boolean Intervals; Dehn-Sommerville Type Relations; Farey Subsequences; Blocking Sets of Set Families, and Absolute Blocking Constructions in Posets; Committees of Set Families, and Relative Blocking Constructions in Posets; Layers of Tope Committees; Three-Tope Committees; Halfspaces, Convex Sets, and Tope Committees; Tope Committees and Reorientations of Oriented Matroids; Topes and Critical Committees; Critical Committees and Distance Signals; Symmetric Cycles in the Hypercube Graphs.
'Data Mining Patterns' gives an overall view of the recent solutions for mining and covers mining new kinds of patterns, mining patterns under constraints, new kinds of complex data and real-world applications of these concepts.