This book provides an overview of predictive methods demonstrated by open source software modeling with Rattle (R) and WEKA. Knowledge management involves application of human knowledge (epistemology) with the technological advances of our current society (computer systems) and big data, both in terms of collecting data and in analyzing it. We see three types of analytic tools. Descriptive analytics focus on reports of what has happened. Predictive analytics extend statistical and/or artificial intelligence to provide forecasting capability; they also include classification modeling. Prescriptive analytics apply quantitative models to optimize systems, or at least to identify improved systems. Data mining includes descriptive and predictive modeling. Operations research includes all three. This book focuses on prescriptive analytics. The book seeks to provide simple explanations and demonstration of some descriptive tools. This second edition provides more examples of big data impact, updates the content on visualization, clarifies some points, and expands coverage of association rules and cluster analysis. Chapter 1 gives an overview in the context of knowledge management. Chapter 2 discusses some basic data types. Chapter 3 covers fundamental time series modeling tools, and Chapter 4 provides a demonstration of multiple regression modeling. Chapter 5 demonstrates regression tree modeling. Chapter 6 presents autoregressive/integrated/moving average models, as well as GARCH models. Chapter 7 covers the set of data mining tools used in classification, including special variants such as support vector machines, random forests, and boosting. Models are demonstrated using business-related data. The style of the book is intended to be descriptive, seeking to explain how methods work, with some citations, but without deep scholarly reference. The data sets and software are all selected for widespread availability and access by any reader with computer links.
This book describes efforts to improve subject-independent automated classification techniques using a better feature extraction method and a more efficient model of classification. It evaluates three popular saliency criteria for feature selection, showing that they share common limitations, including time-consuming and subjective manual de-facto standard practice, and that existing automated efforts have been predominantly used for subject dependent setting. It then proposes a novel approach for anomaly detection, demonstrating its effectiveness and accuracy for automated classification of biomedical data, and arguing its applicability to a wider range of unsupervised machine learning applications in subject-independent settings.
This book introduces research presented at the "International Conference on Artificial Intelligence: Advances and Applications-2019 (ICAIAA 2019)," a two-day conference and workshop bringing together leading academicians, researchers as well as students to share their experiences and findings on all aspects of engineering applications of artificial intelligence. The book covers research in the areas of artificial intelligence, machine learning, and deep learning applications in health care, agriculture, business and security. It also includes research in core concepts of computer networks, intelligent system design and deployment, real-time systems, WSN, sensors and sensor nodes, SDN and NFV. As such it is a valuable resource for students, academics and practitioners in industry working on AI applications.
The use of game theoretic techniques is playing an increasingly important role in the network design domain. Understanding the background, concepts, and principles in using game theory approaches is necessary for engineers in network design. Game Theory Applications in Network Design provides the basic idea of game theory and the fundamental understanding of game theoretic interactions among network entities. The material in this book also covers recent advances and open issues, offering game theoretic solutions for specific network design issues. This publication will benefit students, educators, research strategists, scientists, researchers, and engineers in the field of network design.
This PALO volume constitutes the Proceedings of the 19th Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES 2015), held in Bangkok, Thailand, November 22-25, 2015. The IES series of conferences is an annual event that was initiated back in 1997 in Canberra, Australia. IES aims to bring together researchers from countries of the Asian Pacific Rim, in the fields of intelligent systems and evolutionary computation, to exchange ideas, present recent results and discuss possible collaborations. Researchers beyond Asian Pacific Rim countries are also welcome and encouraged to participate. The theme for IES 2015 is "Transforming Big Data into Knowledge and Technological Breakthroughs". The host organization for IES 2015 is the School of Information Technology (SIT), King Mongkut's University of Technology Thonburi (KMUTT), and it is technically sponsored by the International Neural Network Society (INNS). IES 2015 is collocated with three other conferences, namely The 6th International Conference on Computational Systems-Biology and Bioinformatics 2015 (CSBio 2015), The 7th International Conference on Advances in Information Technology 2015 (IAIT 2015) and The 10th International Conference on e-Business 2015 (iNCEB 2015), as a major part of a series of events to celebrate the SIT 20th anniversary and the KMUTT 55th anniversary.
This book comprises the best deliberations on the theme "Machine Learning Technologies and Applications" from the "International Conference on Advances in Computer Engineering and Communication Systems (ICACECS 2020)," organized by the Department of Computer Science and Engineering, VNR Vignana Jyothi Institute of Engineering and Technology. The book provides insights into the recent trends and developments in the field of computer science with a special focus on machine learning and big data. The book focuses on advanced topics in artificial intelligence, machine learning, data mining and big data computing, cloud computing, Internet of things, distributed computing and smart systems.
This book contains selected papers from the International Conference on Extreme Learning Machine 2015, which was held in Hangzhou, China, December 15-17, 2015. This conference brought together researchers and engineers to share and exchange R&D experience on both theoretical studies and practical applications of the Extreme Learning Machine (ELM) technique and brain learning. This book covers theories, algorithms and applications of ELM. It gives readers a glimpse of the most recent advances in ELM.
The widespread use of XML in business and scientific databases has prompted the development of methodologies, techniques, and systems for effectively managing and analyzing XML data. This has increasingly attracted the attention of different research communities, including database, information retrieval, pattern recognition, and machine learning, from which several proposals have been offered to address problems in XML data management and knowledge discovery. XML Data Mining: Models, Methods, and Applications aims to collect knowledge from experts of database, information retrieval, machine learning, and knowledge management communities in developing models, methods, and systems for XML data mining. This book addresses key issues and challenges in XML data mining, offering insights into the various existing solutions and best practices for modeling, processing, analyzing XML data, and for evaluating performance of XML data mining algorithms and systems.
This book, written by an international team of prominent authors, gathers the latest developments in mobile technologies for the acquisition, management, analysis and sharing of Volunteered Geographic Information (VGI) in the context of Earth observation. It is divided into three parts, the first of which presents case studies on the implementation of VGI for Earth observation, discusses the characteristics of volunteers' engagement in relation to their expertise and motivation, analyzes the tasks they are called upon to perform, and examines the available tools for developing VGI. The second part introduces readers to essential methods, techniques and algorithms used to develop mobile information systems based on VGI for distinct Earth observation tasks, while the last part focuses on the drawbacks and limitations of VGI with regard to the above-mentioned tasks and proposes innovative methods and techniques to help overcome them. Given its breadth of coverage, the book offers a comprehensive, practice-oriented reference guide for researchers and practitioners in the field of geo-information management.
Sequential data from Web server logs, online transaction logs, and performance measurements is collected each day. This sequential data is a valuable source of information, as it allows individuals to search for a particular value or event and also facilitates analysis of the frequency of certain events or sets of related events. Finding patterns in sequences is of utmost importance in many areas of science, engineering, and business scenarios. Pattern Discovery Using Sequence Data Mining: Applications and Studies provides a comprehensive view of sequence mining techniques and presents current research and case studies in pattern discovery in sequential data by researchers and practitioners. This research identifies industry applications introduced by various sequence mining approaches.
This book provides a comprehensive overview of different biomedical data types, including both clinical and genomic data. Thorough explanations enable readers to explore key topics ranging from electrocardiograms to Big Data health mining and EEG analysis techniques. Each chapter offers a summary of the field and a sample analysis. Also covered are telehealth infrastructure, healthcare information association rules, methods for mass spectrometry imaging, environmental biodiversity, and the global nonlinear fitness function for protein structures. Diseases are addressed in chapters on functional annotation of lncRNAs in human disease, metabolomics characterization of human diseases, disease risk factors using SNP data and Bayesian methods, and imaging informatics for diagnostic imaging marker selection. With the exploding accumulation of Electronic Health Records (EHRs), there is an urgent need for computer-aided analysis of heterogeneous biomedical datasets. Biomedical data is notorious for its diversified scales, dimensions, and volumes, and requires interdisciplinary technologies for visual illustration and digital characterization. Various computer programs and servers have been developed for these purposes by both theoreticians and engineers. This book is an essential reference for investigating the tools available for analyzing heterogeneous biomedical data. It is designed for professionals, researchers, and practitioners in biomedical engineering, diagnostics, medical electronics, and related industries.
This book provides a systematic review of many advanced techniques to support the analysis of large collections of documents, ranging from the elementary to the profound, covering all the aspects of the visualization of text documents. In particular, we start by introducing the fundamental concepts of information visualization and visual analysis, followed by a brief survey of the field of text visualization and of commonly used data models for converting documents into a structured form for visualization. The rest of the book then introduces the key visualization techniques in detail with concrete examples, including the visualization of document similarity, content, and sentiment, as well as text corpus exploration systems.
The ideas introduced in this book explore the relationships among rule based systems, machine learning and big data. Rule based systems are seen as a special type of expert systems, which can be built by using expert knowledge or learning from real data. The book focuses on the development and evaluation of rule based systems in terms of accuracy, efficiency and interpretability. In particular, a unified framework for building rule based systems, which consists of the operations of rule generation, rule simplification and rule representation, is presented. Each of these operations is detailed using specific methods or techniques. In addition, this book also presents some ensemble learning frameworks for building ensemble rule based systems.
This book offers an original and broad exploration of the fundamental methods in Clustering and Combinatorial Data Analysis, presenting new formulations and ideas within this very active field. With extensive introductions, formal and mathematical developments and real case studies, this book provides readers with a deeper understanding of the mutual relationships between these methods, which are clearly expressed with respect to three facets: logical, combinatorial and statistical. Using relational mathematical representation, all types of data structures can be handled in precise and unified ways which the author highlights in three stages: clustering a set of descriptive attributes; clustering a set of objects or a set of object categories; and establishing correspondence between these two dual clusterings. Tools for interpreting the reasons of a given cluster or clustering are also included. Foundations and Methods in Combinatorial and Statistical Data Analysis and Clustering will be a valuable resource for students and researchers who are interested in the areas of Data Analysis, Clustering, Data Mining and Knowledge Discovery.
This book discusses the fusion of mobile and WiFi network data with semantic technologies and diverse context sources for offering semantically enriched context-aware services in the telecommunications domain. It presents the OpenMobileNetwork as a platform for providing estimated and semantically enriched mobile and WiFi network topology data using the principles of Linked Data. This platform is based on the OpenMobileNetwork Ontology consisting of a set of network context ontology facets that describe mobile network cells as well as WiFi access points from a topological perspective and geographically relate their coverage areas to other context sources. The book also introduces Linked Crowdsourced Data and its corresponding Context Data Cloud Ontology, which is a crowdsourced dataset combining static location data with dynamic context information. Linked Crowdsourced Data supports the OpenMobileNetwork by providing the necessary context data richness for more sophisticated semantically enriched context-aware services. Various application scenarios and proof of concept services as well as two separate evaluations are part of the book. As the usability of the provided services closely depends on the quality of the approximated network topologies, it compares the estimated positions for mobile network cells within the OpenMobileNetwork to a small set of real-world cell positions. The results prove that context-aware services based on the OpenMobileNetwork rely on a solid and accurate network topology dataset. The book also evaluates the performance of the exemplary Semantic Tracking as well as Semantic Geocoding services, verifying the applicability and added value of semantically enriched mobile and WiFi network data.
The work presented in this book is a combination of theoretical advancements in big data analysis and cloud computing, and their potential applications in scientific computing. The theoretical advancements are supported with illustrative examples and their applications in handling real life problems. The applications are mostly drawn from real life situations. The book discusses major issues pertaining to big data analysis using computational intelligence techniques and some issues of cloud computing. An elaborate bibliography is provided at the end of each chapter. The material in this book includes concepts, figures, graphs, and tables to guide researchers in the area of big data analysis and cloud computing.
This book introduces Meaningful Purposive Interaction Analysis (MPIA) theory, which combines social network analysis (SNA) with latent semantic analysis (LSA) to help create and analyse a meaningful learning landscape from the digital traces left by a learning community in the co-construction of knowledge. The hybrid algorithm is implemented in the statistical programming language and environment R, introducing packages which capture - through matrix algebra - elements of learners' work with more knowledgeable others and resourceful content artefacts. The book provides comprehensive package-by-package application examples, and code samples that guide the reader through the MPIA model to show how the MPIA landscape can be constructed and the learner's journey mapped and analysed. This building block application will allow the reader to progress to using and building analytics to guide students and support decision-making in learning.
This book highlights recent research advances in unsupervised learning using natural computing techniques such as artificial neural networks, evolutionary algorithms, swarm intelligence, artificial immune systems, artificial life, quantum computing, DNA computing, and others. The book also includes information on the use of natural computing techniques for unsupervised learning tasks. It features several trending topics, such as big data scalability, wireless network analysis, engineering optimization, social media, and complex network analytics. It shows how these applications have triggered a number of new natural computing techniques to improve the performance of unsupervised learning methods. With this book, the readers can easily capture new advances in this area with systematic understanding of the scope in depth. Readers can rapidly explore new methods and new applications at the junction between natural computing and unsupervised learning. The book includes advances on unsupervised learning using natural computing techniques; reports on topics in emerging areas such as evolutionary multi-objective unsupervised learning; and features natural computing techniques such as evolutionary multi-objective algorithms and many-objective swarm intelligence algorithms.
In this work we plan to review the main techniques for enumeration algorithms and to show four examples of enumeration algorithms that can be applied to efficiently deal with some biological problems modelled by using biological networks: enumerating central and peripheral nodes of a network, enumerating stories, enumerating paths or cycles, and enumerating bubbles. Notice that the corresponding computational problems we define are of more general interest and our results hold in the case of arbitrary graphs. Enumerating all the most and least central vertices in a network according to their eccentricity is an example of an enumeration problem whose solutions are polynomial in number and can be listed in polynomial time, very often in linear or almost linear time in practice. Enumerating stories, i.e. all maximal directed acyclic subgraphs of a graph G whose sources and targets belong to a predefined subset of the vertices, is on the other hand an example of an enumeration problem with an exponential number of solutions, which can be solved by using a non-trivial brute-force approach. Given a metabolic network, each individual story should explain how some interesting metabolites are derived from some others through a chain of reactions, by keeping all alternative pathways between sources and targets. Enumerating cycles or paths in an undirected graph, such as a protein-protein interaction undirected network, is an example of an enumeration problem in which all the solutions can be listed through an optimal algorithm, i.e. the time required to list all the solutions is dominated by the time to read the graph plus the time required to print all of them. By extending this result to directed graphs, it would be possible to deal more efficiently with feedback loops and signed paths analysis in signed or interaction directed graphs, such as gene regulatory networks.
Finally, enumerating mouths or bubbles with a source s in a directed graph, that is enumerating all the two vertex-disjoint directed paths between the source s and all the possible targets, is an example of an enumeration problem in which all the solutions can be listed through a linear delay algorithm, meaning that the delay between any two consecutive solutions is linear, by turning the problem into a constrained cycle enumeration problem. Such patterns, in a de Bruijn graph representation of the reads obtained by sequencing, are related to polymorphisms in DNA- or RNA-seq data.
This edited volume focuses on big data implications for computational social science and humanities from management to usage. The first part of the book covers geographic data, text corpus data, and social media data, and exemplifies their concrete applications in a wide range of fields including anthropology, economics, finance, geography, history, linguistics, political science, psychology, public health, and mass communications. The second part of the book provides a panoramic view of the development of big data in the fields of computational social sciences and humanities. The following questions are addressed: why is there a need for novel data governance for this new type of data, why is big data important for social scientists, and how will it revolutionize the way social scientists conduct research? With the advent of the information age and technologies such as Web 2.0, ubiquitous computing, wearable devices, and the Internet of Things, digital society has fundamentally changed what we now know as "data", the very use of this data, and what we now call "knowledge". Big data has become the standard in social sciences, and has made these sciences more computational. Big Data in Computational Social Science and Humanities will appeal to graduate students and researchers working in the many subfields of the social sciences and humanities.
This book provides fresh insights into the cutting edge of multimedia data mining, reflecting how the research focus has shifted towards networked social communities, mobile devices and sensors. The work describes how the history of multimedia data processing can be viewed as a sequence of disruptive innovations. Across the chapters, the discussion covers the practical frameworks, libraries, and open source software that enable the development of ground-breaking research into practical applications. Features: reviews how innovations in mobile, social, cognitive, cloud and organic based computing impact upon the development of multimedia data mining; provides practical details on implementing the technology for solving real-world problems; includes chapters devoted to privacy issues in multimedia social environments and large-scale biometric data processing; covers content and concept based multimedia search and advanced algorithms for multimedia data representation, processing and visualization.
This book presents a unique approach to stream data mining. Unlike the vast majority of previous approaches, which are largely based on heuristics, it highlights methods and algorithms that are mathematically justified. First, it describes how to adapt static decision trees to accommodate data streams; in this regard, new splitting criteria are developed to guarantee that they are asymptotically equivalent to the classical batch tree. Moreover, new decision trees are designed, leading to the original concept of hybrid trees. In turn, nonparametric techniques based on Parzen kernels and orthogonal series are employed to address concept drift in the problem of non-stationary regressions and classification in a time-varying environment. Lastly, an extremely challenging problem that involves designing ensembles and automatically choosing their sizes is described and solved. Given its scope, the book is intended for a professional audience of researchers and practitioners who deal with stream data, e.g. in telecommunication, banking, and sensor networks.
This thesis focuses on the problem of optimizing the quality of network multimedia services. This problem spans multiple domains, from subjective perception of multimedia quality to computer networks management. The work done in this thesis approaches the problem at different levels, developing methods for modeling the subjective perception of quality based on objectively measurable parameters of the multimedia coding process as well as the transport over computer networks. The modeling of subjective perception is motivated by work done in psychophysics, while using Machine Learning techniques to map network conditions to the human perception of video services. Furthermore, the work develops models for efficient control of multimedia systems operating in dynamic networked environments with the goal of delivering optimized Quality of Experience. Overall this thesis delivers a set of methods for monitoring and optimizing the quality of multimedia services that adapt to the dynamic environment of computer networks in which they operate.
This thesis covers a diverse set of topics related to space-based gravitational wave detectors such as the Laser Interferometer Space Antenna (LISA). The core of the thesis is devoted to the preprocessing of the interferometric link data for a LISA constellation, specifically developing optimal Kalman filters to reduce arm length noise due to clock noise. The approach is to apply Kalman filters of increasing complexity to make optimal estimates of relevant quantities such as constellation arm length, relative clock drift, and Doppler frequencies based on the available measurement data. Depending on the complexity of the filter and the simulated data, these Kalman filter estimates can provide up to a few orders of magnitude improvement over simpler estimators. While the basic concept of the LISA measurement (Time Delay Interferometry) was worked out some time ago, this work brings a level of rigor to the processing of the constellation-level data products. The thesis concludes with some topics related to eLISA, such as a new class of phenomenological waveforms for extreme mass-ratio inspiral sources (EMRIs, one of the main sources for eLISA), an octahedral space-based GW detector that does not require drag-free test masses, and some efficient template-search algorithms for the case of relatively high SNR signals.
'Data Mining Patterns' gives an overall view of the recent solutions for mining and covers mining new kinds of patterns, mining patterns under constraints, new kinds of complex data and real-world applications of these concepts.