This book discusses the development of a theory of info-statics as a sub-theory of the general theory of information. It describes the factors required to establish a definition of the concept of information that fixes the applicable boundaries of the phenomenon of information, its linguistic structure and its scientific applications. The book establishes the definitional foundations of information and shows how the concepts of uncertainty, data, fact, evidence and evidential things are sequential derivatives of information as the primary category, which is a property of matter and energy. The sub-definitions are extended to include the concepts of possibility, probability, expectation, anticipation, surprise, discounting, forecasting, prediction and the nature of past-present-future information structures. It shows that the factors required to define the concept of information are those that allow differences and similarities to be established among universal objects over the ontological and epistemological spaces in terms of varieties and identities. These factors are characteristic and signal dispositions, on the basis of which general definitional foundations are developed to construct the general information definition (GID). The book then demonstrates that this definition is applicable to all types of information over the ontological and epistemological spaces. It also defines the concepts of uncertainty, data, fact, evidence and knowledge based on the GID. Lastly, it uses set-theoretic analytics to enhance the definitional foundations, shows the value of the theory of info-statics in establishing varieties and categorial varieties at every point in time, and thus initiates the construction of the theory of info-dynamics.
Virtually all nontrivial, modern service-related problems and systems involve data volumes and types that clearly fall under what is presently meant by "big data": they are huge, heterogeneous, complex, distributed, and so on. Data mining is a series of processes that includes collecting and accumulating data, modeling phenomena, and discovering new information, and it is one of the most important steps in the scientific analysis of service processes. Applying data mining in services requires a thorough understanding of the characteristics of each service and knowledge of how well data mining technology fits each particular service, rather than knowledge only of calculation speed and prediction accuracy. The varied examples of services provided in this book will help readers understand the relation between services and data mining technology. The book is intended to stimulate interest among researchers and practitioners in the relation between data mining technology and its application to other fields.
Social network analysis applications have experienced tremendous advances within the last few years, due in part to increasing trends towards users interacting with each other on the internet. Social networks are organized as graphs, and the data on social networks takes the form of massive streams, which are mined for a variety of purposes. Social Network Data Analytics covers an important niche in the social network analytics field. This edited volume, contributed by prominent researchers in the field, presents a wide selection of topics on social network data mining, such as Structural Properties of Social Networks, Algorithms for Structural Discovery of Social Networks and Content Analysis in Social Networks. The book is also unique in focusing on the data-analytical aspects of social networks on the internet, rather than the traditional sociology-driven emphasis prevalent in existing books, which do not address the unique data-intensive characteristics of online social networks. Emphasis is placed on simplifying the content so that both students and practitioners benefit. The book targets advanced-level students and researchers in computer science as a secondary text or reference. Data mining, database, information security, electronic commerce and machine learning professionals will find it a valuable asset, as will members of professional associations such as ACM, IEEE and Management Science.
This book gathers authoritative contributions in the field of Soft Computing. Based on selected papers presented at the 7th World Conference on Soft Computing, which was held on May 29-31, 2018, in Baku, Azerbaijan, it describes new theoretical advances, as well as cutting-edge methods and applications. New theories and algorithms in fuzzy logic, cognitive modeling, graph theory and metaheuristics are discussed, and applications in data mining, social networks, control and robotics, geoscience, biomedicine and industrial management are described. This book offers a timely, broad snapshot of recent developments, including thought-provoking trends and challenges that are yielding new research directions in the diverse areas of Soft Computing.
This book gathers high-quality papers presented at the International Conference on Smart Trends for Information Technology and Computer Communications (SmartCom 2020), organized by the Global Knowledge Research Foundation (GR Foundation) from 23 to 24 January 2020. It covers the state-of-the-art and emerging topics in information, computer communications, and effective strategies for their use in engineering and managerial applications. It also explores and discusses the latest technological advances in, and future directions for, information and knowledge computing and its applications.
This book includes original, peer-reviewed research from the 2nd International Conference on Emerging Trends in Electrical, Communication and Information Technologies (ICECIT 2015), held in December 2015 at Srinivasa Ramanujan Institute of Technology, Ananthapuramu, Andhra Pradesh, India. It covers the latest research trends and developments in the areas of Electrical Engineering, Electronic and Communication Engineering, and Computer Science and Information Technology.
This book brings together two major trends: data science and blockchains. It is one of the first books to systematically cover the analytics aspects of blockchains, with the goal of linking traditional data mining research communities with novel data sources. Data science and big data technologies can be considered cornerstones of the data-driven digital transformation of organizations and society. The concept of blockchain is predicted to enable and spark transformation on a par with that associated with the invention of the Internet. Cryptocurrencies are the first successful use case of highly distributed blockchains, much as the World Wide Web was for the Internet. The book takes the reader through basic data exploration topics, proceeding systematically, method by method, through supervised and unsupervised learning approaches and information visualization techniques, all the way to understanding blockchain data from the network science perspective. Chapters introduce the cryptocurrency blockchain data model and methods to explore it using structured query language, association rules, clustering, classification, visualization, and network science. Each chapter introduces basic concepts, presents examples with real cryptocurrency blockchain data, and offers exercises and questions for further discussion. This approach is intended to serve as a good starting point for undergraduate and graduate students learning data science topics through cryptocurrency blockchain examples. It is also aimed at researchers and analysts who already possess good analytical and data skills, but who do not yet have the specific knowledge to tackle analytic questions about blockchain transactions. Readers will improve their knowledge of the essential data science techniques needed to turn mere transactional information into social, economic, and business insights.
The objective of this monograph is to improve the performance of sentiment analysis models by incorporating semantic, syntactic and common-sense knowledge. The book proposes a novel semantic concept extraction approach that uses dependency relations between words to extract features from text. The proposed approach combines semantic and common-sense knowledge for a better understanding of the text. In addition, the book aims to extract prominent features from unstructured text by eliminating noisy, irrelevant and redundant features. Readers will also discover a proposed method for efficient dimensionality reduction that alleviates the data sparseness problem faced by machine learning models. The authors highlight four main findings:
- The performance of sentiment analysis can be improved by reducing redundancy among the features. Experimental results show that the minimum Redundancy Maximum Relevance (mRMR) feature selection technique improves sentiment analysis performance by eliminating redundant features.
- The Boolean Multinomial Naive Bayes (BMNB) machine learning algorithm with mRMR feature selection performs better than a Support Vector Machine (SVM) classifier for sentiment analysis.
- The problem of data sparseness is alleviated by semantic clustering of features, which in turn improves the performance of sentiment analysis.
- Semantic relations among the words in a text provide useful cues for sentiment analysis. Common-sense knowledge in the form of the ConceptNet ontology provides a better understanding of the text, which improves the performance of sentiment analysis.
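The mRMR idea in the findings above, scoring each candidate feature by its relevance to the label minus its redundancy with already-selected features, can be sketched in a few lines. This is a generic illustration on invented toy data, not code or data from the monograph:

```python
from collections import Counter
from math import log

def mutual_info(xs, ys):
    """Mutual information (in nats) between two discrete sequences,
    estimated from the empirical joint distribution."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def mrmr(features, labels, k):
    """Greedy mRMR: pick the feature maximizing relevance MI(f, y)
    minus mean redundancy MI(f, s) with features s already selected."""
    selected = []
    while len(selected) < k:
        def score(name):
            rel = mutual_info(features[name], labels)
            red = (sum(mutual_info(features[name], features[s]) for s in selected)
                   / len(selected)) if selected else 0.0
            return rel - red
        remaining = [f for f in features if f not in selected]
        selected.append(max(remaining, key=score))
    return selected

# Toy data: f_b duplicates f_a, while f_c carries complementary information.
y = [0, 0, 0, 1, 0, 0, 0, 1]            # y = f_a AND f_c
feats = {
    "f_a": [0, 0, 1, 1, 0, 0, 1, 1],
    "f_b": [0, 0, 1, 1, 0, 0, 1, 1],    # redundant copy of f_a
    "f_c": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(mrmr(feats, y, 2))   # the redundant copy f_b is skipped
```

The redundancy penalty is what distinguishes mRMR from plain relevance ranking: ranked by relevance alone, the duplicate feature would be picked second.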
This text is about the spreading of information and influence in complex networks. Although previously considered similar and modeled with parallel approaches, there is now experimental evidence that epidemic and social spreading work in subtly different ways. While previously explored through modeling, there is currently an explosion of work on revealing the mechanisms underlying complex contagion based on big data and data-driven approaches. This volume consists of four parts. Part 1 is an introduction, providing an accessible summary of the state of the art. Part 2 provides an overview of the central theoretical developments in the field. Part 3 describes the empirical work on observing spreading processes in real-world networks. Finally, Part 4 goes into detail on recent and exciting new developments: dedicated studies designed to measure specific aspects of the spreading processes, often using randomized control trials to isolate the network effect from confounders such as homophily. Each contribution is authored by leading experts in the field. This volume, though based on technical selections of the most important results on complex spreading, remains quite accessible to newcomers. Its main benefit is that the topics are carefully structured to take the novice to the level of expert on the topic of social spreading processes. This book will be of great importance to a wide audience: from researchers in physics, computer science, and sociology to professionals in public policy and public health.
Abstraction is a fundamental mechanism underlying both human and artificial perception, representation of knowledge, reasoning and learning. This mechanism plays a crucial role in many disciplines, notably Computer Programming, Natural and Artificial Vision, Complex Systems, Artificial Intelligence and Machine Learning, Art, and Cognitive Sciences. This book first provides the reader with an overview of the notions of abstraction proposed in various disciplines, comparing both commonalities and differences. After discussing the characterizing properties of abstraction, a formal model, the KRA model, is presented to capture them. This model makes the notion of abstraction easily applicable through the introduction of a set of abstraction operators and abstraction patterns, reusable across different domains and applications. The impact of abstraction in Artificial Intelligence, Complex Systems and Machine Learning forms the core of the book. A general framework, based on the KRA model, is presented, and its pragmatic power is illustrated with three case studies: model-based diagnosis, cartographic generalization, and learning Hierarchical Hidden Markov Models.
The book covers tools in the study of online social networks such as machine learning techniques, clustering, and deep learning. A variety of theoretical aspects, application domains, and case studies for analyzing social network data are covered. The aim is to provide new perspectives on utilizing machine learning and related scientific methods and techniques for social network analysis. Machine Learning Techniques for Online Social Networks will appeal to researchers and students in these fields.
This book reports on the development and validation of a generic defeasible logic programming framework for carrying out argumentative reasoning in Semantic Web applications (GF@SWA). The proposed methodology is unique in providing a solution for representing incomplete and/or contradictory information coming from different sources, and reasoning with it. GF@SWA is able to represent this type of information, perform argumentation-driven hybrid reasoning to resolve conflicts, and generate graphical representations of the integrated information, thus assisting decision makers in decision making processes. GF@SWA represents the first argumentative reasoning engine for carrying out automated reasoning in the Semantic Web context and is expected to have a significant impact on future business applications. The book provides the readers with a detailed and clear exposition of different argumentation-based reasoning techniques, and of their importance and use in Semantic Web applications. It addresses both academics and professionals, and will be of primary interest to researchers, students and practitioners in the area of Web-based intelligent decision support systems and their application in various domains.
Data Preprocessing for Data Mining addresses one of the most important issues within the well-known Knowledge Discovery from Data process. Data taken directly from the source will likely have inconsistencies and errors, and, most importantly, it will not be ready for the data mining process. Furthermore, the increasing amount of data in recent science, industry and business applications calls for more complex tools to analyze it. Thanks to data preprocessing, it is possible to convert the impossible into the possible, adapting the data to fulfill the input demands of each data mining algorithm. Data preprocessing includes data reduction techniques, which aim at reducing the complexity of the data by detecting or removing irrelevant and noisy elements. This book is intended to review the tasks that fill the gap between data acquisition from the source and the data mining process. It takes a comprehensive and practical look at the field, introducing basic concepts and surveying the techniques proposed in the specialized literature. Each chapter is a stand-alone guide to a particular data preprocessing topic, from basic concepts and detailed descriptions of classical algorithms to an exhaustive catalog of recent developments. The in-depth technical descriptions make this book suitable for technical professionals, researchers, and senior undergraduate and graduate students in data science, computer science and engineering.
This book contributes to an improved understanding of knowledge-intensive business services and knowledge management issues. It offers a complex overview of literature devoted to these topics and introduces the concept of 'knowledge flows', which constitutes a missing link in the previous knowledge management theories. The book provides a detailed analysis of knowledge flows, with their types, relations and factors influencing them. It offers a novel approach to understand the aspects of knowledge and its management not only inside the organization, but also outside, in its environment.
This book presents new approaches that advance research in all aspects of agent-based models, technologies, simulations and implementations for data intensive applications. The nine chapters contain a review of recent cross-disciplinary approaches in cloud environments and multi-agent systems, and important formulations of data intensive problems in distributed computational environments together with the presentation of new agent-based tools to handle those problems and Big Data in general. This volume can serve as a reference for students, researchers and industry practitioners working in or interested in joining interdisciplinary work in the areas of data intensive computing and Big Data systems using emergent large-scale distributed computing paradigms. It will also allow newcomers to grasp key concepts and potential solutions on advanced topics of theory, models, technologies, system architectures and implementation of applications in Multi-Agent systems and data intensive computing.
Learn how to apply the principles of machine learning to time series modeling with this indispensable resource. Machine Learning for Time Series Forecasting with Python is an incisive and straightforward examination of one of the most crucial elements of decision-making in finance, marketing, education, and healthcare: time series modeling. Despite the centrality of time series forecasting, few business analysts are familiar with the power or utility of applying machine learning to time series modeling. Author Francesca Lazzeri, a distinguished machine learning scientist and economist, corrects that deficiency by providing readers with a comprehensive and approachable explanation and treatment of the application of machine learning to time series forecasting. Written for readers who have little to no experience in time series forecasting or machine learning, the book comprehensively covers all the topics necessary to:
- Understand time series forecasting concepts, such as stationarity, horizon, trend, and seasonality
- Prepare time series data for modeling
- Evaluate the performance and accuracy of time series forecasting models
- Understand when to use neural networks instead of traditional time series models
Machine Learning for Time Series Forecasting with Python is full of real-world examples, resources and concrete strategies to help readers explore and transform data and develop usable, practical time series forecasts. Perfect for entry-level data scientists, business analysts, developers, and researchers, this book is an invaluable guide to the fundamental and advanced concepts of machine learning applied to time series modeling.
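Several of the concepts listed above (trend, seasonality, and model evaluation over a holdout horizon) can be illustrated with a tiny baseline experiment. This sketch uses invented toy data, not an example from the book: it compares a naive forecast against a seasonal-naive one on a series with a linear trend and period-4 seasonality, scoring both by mean absolute error (MAE).

```python
# Toy series: linear trend + period-4 seasonality (illustrative data only).
season = [0.0, 5.0, 2.0, 7.0]
series = [0.1 * t + season[t % 4] for t in range(40)]

split = 32                      # hold out the last 8 points for evaluation
train, test = series[:split], series[split:]

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Naive one-step forecast: repeat the previous observed value.
naive = [series[t - 1] for t in range(split, len(series))]
# Seasonal-naive forecast: repeat the value from one season (4 steps) ago.
seasonal = [series[t - 4] for t in range(split, len(series))]

print(f"naive MAE          = {mae(test, naive):.2f}")
print(f"seasonal-naive MAE = {mae(test, seasonal):.2f}")  # far lower here
```

Simple baselines like these are the usual yardstick before reaching for machine learning models: a learned model is only worth deploying if it beats the seasonal-naive error on the holdout set.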
This book explores all relevant aspects of net scoring, also known as uplift modeling: a data mining approach used to analyze and predict the effects of a given treatment on a desired target variable for an individual observation. After discussing modern net score modeling methods, data preparation, and the assessment of uplift models, the book investigates software implementations and real-world scenarios. Focusing on the application of theoretical results and on practical issues of uplift modeling, it also includes a dedicated chapter on software solutions in SAS, R, Spectrum Miner, and KNIME, which compares the respective tools. This book also presents the applications of net scoring in various contexts, e.g. medical treatment, with a special emphasis on direct marketing and corresponding business cases. The target audience primarily includes data scientists, especially researchers and practitioners in predictive modeling and scoring, mainly, but not exclusively, in the marketing context.
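The core quantity in net scoring / uplift modeling, as described above, is the difference between the response rate under treatment and under control, rather than the response rate itself. A minimal sketch with hypothetical campaign numbers (the segment names and counts are invented for illustration, not taken from the book):

```python
# Per-segment uplift = treated response rate - control response rate.
# Hypothetical direct-marketing numbers for illustration only.
segments = {
    #                 (treated_n, treated_resp, control_n, control_resp)
    "persuadables":   (1000, 180, 1000, 60),
    "sure_things":    (1000, 400, 1000, 395),
    "do_not_disturb": (1000, 50, 1000, 120),
}

def uplift(treated_n, treated_resp, control_n, control_resp):
    """Estimated effect of the treatment on the response probability."""
    return treated_resp / treated_n - control_resp / control_n

for name, stats in segments.items():
    print(f"{name:>14}: uplift = {uplift(*stats):+.3f}")
# Target segments with positive uplift; a negative uplift ("do not disturb")
# means the treatment actively hurts, even if raw response looks acceptable.
```

This is why uplift models are trained on the treatment effect rather than the outcome: "sure things" respond at a high rate with or without the campaign, so targeting them by raw response rate wastes budget.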
This work takes a critical look at the current concept of isotopic landscapes ("isoscapes") in bioarchaeology and its application in future research. It specifically addresses the research potential of cremated finds, a somewhat neglected bioarchaeological substrate, resulting primarily from the inherent osteological challenges and complex mineralogy associated with it. In addition, for the first time data mining methods are applied. The chapters are the outcome of an international workshop sponsored by the German Science Foundation and the Centre of Advanced Studies at the Ludwig-Maximilian-University in Munich. Isotopic landscapes are indispensable tracers for the monitoring of the flow of matter through geo/ecological systems since they comprise existing temporally and spatially defined stable isotopic patterns found in geological and ecological samples. Analyses of stable isotopes of the elements nitrogen, carbon, oxygen, strontium, and lead are routinely utilized in bioarchaeology to reconstruct biodiversity, palaeodiet, palaeoecology, palaeoclimate, migration and trade. The interpretive power of stable isotopic ratios depends not only on firm, testable hypotheses, but most importantly on the cooperative networking of scientists from both natural and social sciences. Application of multi-isotopic tracers generates isotopic patterns with multiple dimensions, which accurately characterize a find, but can only be interpreted by use of modern data mining methods.
Mining of Data with Complex Structures:
- Clarifies the type and nature of data with complex structures, including sequences, trees and graphs
- Provides a detailed background on the state of the art in sequence mining, tree mining and graph mining
- Defines the essential aspects of the tree mining problem: subtree types, support definitions, constraints
- Outlines the implementation issues to consider when developing tree mining algorithms (enumeration strategies, data structures, etc.)
- Details the Tree Model Guided (TMG) approach to tree mining and provides the mathematical model for the worst-case estimate of the complexity of mining ordered induced and embedded subtrees
- Explains the mechanism of the TMG framework for mining ordered/unordered induced/embedded and distance-constrained embedded subtrees
- Provides a detailed comparison of the different tree mining approaches, highlighting the characteristics and benefits of each
- Overviews the implications and potential applications of tree mining in general knowledge management tasks, using Web, health and bioinformatics applications as case studies
- Details the extension of the TMG framework to sequence mining
- Provides an overview of future research directions with respect to technical extensions and application areas
The primary audience is third- and fourth-year undergraduate students, Masters and PhD students, and academics. The book can be used for both teaching and research. The secondary audience is practitioners in industry, business, commerce, government and consortiums, alliances and partnerships who want to learn how to introduce and efficiently make use of techniques for mining data with complex structures in their applications. The scope of the book is both theoretical and practical, and as such it will reach a broad market within both academia and industry. In addition, its subject matter is a rapidly emerging field that is critical for the efficient analysis of knowledge stored in various domains.
A hands-on guide to web scraping and text mining for both beginners and experienced users of R:
- Introduces fundamental concepts of the main architecture of the web and databases, and covers HTTP, HTML, XML, JSON and SQL
- Provides basic techniques to query web documents and data sets (XPath and regular expressions)
- Presents an extensive set of exercises to guide the reader through each technique
- Explores both supervised and unsupervised techniques, as well as advanced techniques such as data scraping and text management
- Features case studies throughout, along with examples for each technique presented
- Provides R code and solutions to the exercises on a supporting website
This volume gathers selected peer-reviewed papers presented at the XXVI International Joint Conference on Industrial Engineering and Operations Management (IJCIEOM), held on July 8-11, 2020 in Rio de Janeiro, Brazil. The respective chapters address a range of timely topics in industrial engineering, including operations and process management, global operations, managerial economics, data science and stochastic optimization, logistics and supply chain management, quality management, product development, strategy and organizational engineering, knowledge and information management, work and human factors, sustainability, production engineering education, healthcare operations management, disaster management, and more. These topics broadly involve fields like operations, manufacturing, industrial and production engineering, and management. Given its scope, the book offers a valuable resource for researchers in optimization and operations research, and for practitioners alike.
This monograph addresses advances in representation learning, a cutting-edge research area of machine learning. Representation learning refers to modern data transformation techniques that convert data of different modalities and complexity, including texts, graphs, and relations, into compact tabular representations, which effectively capture their semantic properties and relations. The monograph focuses on (i) propositionalization approaches, established in relational learning and inductive logic programming, and (ii) embedding approaches, which have gained popularity with recent advances in deep learning. The authors establish a unifying perspective on representation learning techniques developed in these various areas of modern data science, enabling the reader to understand the common underlying principles and to gain insight using selected examples and sample Python code. The monograph should be of interest to a wide audience, ranging from data scientists, machine learning researchers and students to developers, software engineers and industrial researchers interested in hands-on AI solutions.
This book proposes new technologies and discusses future solutions for ICT design infrastructure. It contains high-quality submissions presented at the Second International Conference on Information and Communication Technology for Sustainable Development (ICT4SD 2016), held in Goa, India, on 1-2 July 2016. The conference stimulated cutting-edge research discussions among pioneering academic researchers, scientists, industrial engineers, and students from around the world. The topics covered in this book also focus on innovative issues at the international level by bringing together experts from different countries.
This book enriches unsupervised outlier detection research by proposing several new distance-based and density-based outlier scores in a k-nearest-neighbor setting. The respective chapters highlight the latest developments in k-nearest-neighbor-based outlier detection research and cover such topics as our present understanding of unsupervised outlier detection in general; distance-based and density-based outlier detection in particular; and the applications of the latest findings to boundary point detection and novel object detection. The book also offers a new perspective on bridging the gap between k-nearest-neighbor-based outlier detection and clustering-based outlier detection, laying the groundwork for future advances in unsupervised outlier detection research. The authors hope the algorithms and applications proposed here will serve as valuable resources for outlier detection researchers for years to come.
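A simple instance of the distance-based scores discussed above is to score each point by its distance to its k-th nearest neighbor: points far from all of their neighbors score high and are flagged as outliers. This is a generic textbook-style sketch on made-up data, not one of the book's proposed scores:

```python
def knn_outlier_scores(points, k):
    """Score each point by its Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
            for j, q in enumerate(points) if j != i
        )
        scores.append(dists[k - 1])   # large k-NN distance => likely outlier
    return scores

# A tight cluster near the origin plus one far-away point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (8, 8)]
scores = knn_outlier_scores(data, k=2)
outlier = max(range(len(data)), key=lambda i: scores[i])
print(data[outlier])   # (8, 8)
```

Density-based scores such as those the book extends refine this idea by comparing a point's k-NN distances to those of its neighbors, so that outliers relative to a locally dense cluster are also caught.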