Data volumes have increased due to the growing use of web applications and communication devices, and new techniques for managing data must be developed to ensure adequate usage. Modern Technologies for Big Data Classification and Clustering is an essential reference source for the latest scholarly research on handling large data sets with conventional data mining, and it provides information about the new technologies developed for the management of large data. Featuring coverage of a broad range of topics such as text and web data analytics, risk analysis, and opinion mining, this publication is ideally designed for professionals, researchers, and students seeking current research on various concepts of big data analytics. The many academic areas covered in this publication include, but are not limited to: data visualization, distributed computing systems, opinion mining, privacy and security, risk analysis, social network analysis, text data analytics, and web data analytics.
Representation learning in heterogeneous graphs (HG) is intended to provide a meaningful vector representation for each node so as to facilitate downstream applications such as link prediction, personalized recommendation, node classification, etc. This task, however, is challenging not only because of the need to incorporate heterogeneous structural (graph) information consisting of multiple types of nodes and edges, but also because of the need to consider heterogeneous attributes or types of content (e.g. text or image) associated with each node. Although considerable advances have been made in homogeneous (and heterogeneous) graph embedding, attributed graph embedding and graph neural networks, few methods are capable of simultaneously and effectively taking into account heterogeneous structural (graph) information as well as the heterogeneous content information of each node. In this book, we provide a comprehensive survey of current developments in HG representation learning. More importantly, we present the state of the art in this field, including theoretical models and real applications that have been showcased at top conferences and journals, such as TKDE, KDD, WWW, IJCAI and AAAI. The book has two major objectives: (1) to provide researchers with an understanding of the fundamental issues and a good point of departure for working in this rapidly expanding field, and (2) to present the latest research on applying heterogeneous graphs to model real systems and on learning the structural features of interaction systems. To the best of our knowledge, it is the first book to summarize the latest developments and present cutting-edge research on heterogeneous graph representation learning. To gain the most from it, readers should have a basic grasp of computer science, data mining and machine learning.
This book discusses the application of data systems and data-driven infrastructure in existing industrial systems in order to optimize workflow, utilize hidden potential, and make existing systems free from vulnerabilities. The book discusses the application of data in the health sector, public transportation, financial institutions, and battling natural disasters, among others. Topics include real-time applications in the current big data perspective; improving security in IoT devices; data backup techniques for systems; artificial intelligence-based outlier prediction; machine learning in OpenFlow networks; and the application of deep learning in blockchain-enabled applications. This book is intended for a variety of readers from professional industries, organizations, and students.
Advances in hardware technology have led to an ability to collect data with the use of a variety of sensor technologies. In particular, sensor nodes have become cheaper and more efficient, and have even been integrated into everyday devices such as mobile phones. This has led to a much larger scale of applicability and mining of sensor data sets. The human-centric aspect of sensor data has created tremendous opportunities for integrating social aspects of sensor data collection into the mining process. Managing and Mining Sensor Data is a contributed volume by prominent leaders in this field, targeting advanced-level students in computer science as a secondary textbook or reference. Practitioners and researchers working in this field will also find this book useful.
This book presents established and state-of-the-art methods in Language Technology (including text mining, corpus linguistics, computational linguistics, and natural language processing), and demonstrates how they can be applied by humanities scholars working with textual data. The landscape of humanities research has recently changed thanks to the proliferation of big data and large textual collections such as Google Books, Early English Books Online, and Project Gutenberg. These resources have yet to be fully explored by new generations of scholars, and the authors argue that Language Technology has a key role to play in the exploration of large-scale textual data. The authors use a series of illustrative examples from various humanistic disciplines (mainly but not exclusively from History, Classics, and Literary Studies) to demonstrate basic and more complex use-case scenarios. This book will be useful to graduate students and researchers in humanistic disciplines working with textual data, including History, Modern Languages, Literary studies, Classics, and Linguistics. This is also a very useful book for anyone teaching or learning Digital Humanities and interested in the basic concepts from computational linguistics, corpus linguistics, and natural language processing.
This is the second edition of the comprehensive treatment of statistical inference using permutation techniques. It makes available to practitioners a variety of useful and powerful data analytic tools that rely on very few distributional assumptions. Although many of these procedures have appeared in journal articles, they are not readily available to practitioners. This new and updated edition places increased emphasis on the use of alternative permutation statistical tests based on metric Euclidean distance functions that have excellent robustness characteristics. These alternative permutation techniques provide many powerful multivariate tests including multivariate multiple regression analyses.
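The book's procedures go far beyond this, but the core mechanism behind any permutation test can be illustrated with a minimal two-sample example: shuffle the group labels many times and count how often the shuffled statistic is at least as extreme as the observed one. This is a generic sketch under stated assumptions, not a procedure reproduced from the book; the function name and parameters are illustrative.

```python
import random

def permutation_test(x, y, n_perm=10000, seed=0):
    """Two-sample permutation test on the difference of means.

    Returns an approximate two-sided p-value: the fraction of label
    permutations whose absolute mean difference is at least as extreme
    as the observed one. Requires very few distributional assumptions.
    """
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(sum(pooled[:n_x]) / n_x -
                   sum(pooled[n_x:]) / (len(pooled) - n_x))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)  # add-one smoothing avoids p = 0

# Clearly separated samples should yield a small p-value,
# heavily overlapping samples a large one.
p_far = permutation_test([1, 2, 3, 4], [10, 11, 12, 13])
p_near = permutation_test([1, 2, 3, 4], [2, 3, 4, 5])
```

Because the null distribution is built from the data itself, no normality assumption is needed; the metric Euclidean-distance variants the book emphasizes generalize this same resampling idea to multivariate statistics.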
Data mining essentially relies on several mathematical disciplines, many of which are presented in this second edition of this book. Topics include partially ordered sets, combinatorics, general topology, metric spaces, linear spaces, and graph theory. To motivate the reader, a significant number of applications of these mathematical tools are included, ranging from association rules and clustering algorithms to classification, data constraints, and logical data analysis. The book is intended as a reference for researchers and graduate students. The current edition is a significant expansion of the first edition. We have strived to make the book self-contained, and only a general knowledge of mathematics is required. More than 700 exercises are included, and they form an integral part of the material. Many exercises are in reality supplemental material, and their solutions are included.
These proceedings gather outstanding research papers presented at the Second International Conference on Data Engineering 2015 (DaEng-2015) and offer a consolidated overview of the latest developments in databases, information retrieval, data mining and knowledge management. The conference brought together researchers and practitioners from academia and industry to address key challenges in these fields, discuss advanced data engineering concepts and form new collaborations. The topics covered include, but are not limited to:
* Data engineering
* Big data
* Data and knowledge visualization
* Data management
* Data mining and warehousing
* Data privacy & security
* Database theory
* Heterogeneous databases
* Knowledge discovery in databases
* Mobile, grid and cloud computing
* Knowledge management
* Parallel and distributed data
* Temporal data
* Web data, services and information engineering
* Decision support systems
* E-business engineering and management
* E-commerce and e-learning
* Geographical information systems
* Information management
* Information quality and strategy
* Information retrieval, integration and visualization
* Information security
* Information systems and technologies
This book organizes key concepts, theories, standards, methodologies, trends, challenges and applications of data mining and knowledge discovery in databases. It first surveys, then provides comprehensive yet concise algorithmic descriptions of methods, including classic methods plus the extensions and novel methods developed recently. It also gives in-depth descriptions of data mining applications in various interdisciplinary industries.
Having overcome many challenges, data mining has already established itself as a discipline in many domains. "Dynamic and Advanced Data Mining for Progressing Technological Development: Innovations and Systemic Approaches" discusses advances in modern data mining research in today's rapidly growing global and technological environment. A critical mass of the most sought-after knowledge, this publication serves as an important reference tool for leading research in information search and retrieval techniques.
As data mining algorithms are typically applied to sizable volumes of high-dimensional data, they can result in large storage requirements and inefficient computation times. This unique text/reference addresses the challenges of generating data abstractions using a minimal number of database scans, compressing data through novel lossy and non-lossy schemes, and carrying out clustering and classification directly in the compressed domain. Schemes are presented which are shown to be efficient both in terms of space and time, while simultaneously providing the same or better classification accuracy, as illustrated using high-dimensional handwritten digit data and a large intrusion detection dataset. Topics and features: presents a concise introduction to data mining paradigms, data compression, and mining compressed data; describes a non-lossy compression scheme based on run-length encoding of patterns with binary-valued features; proposes a lossy compression scheme that recognizes a pattern as a sequence of features and identifies subsequences; examines whether the identification of prototypes and features can be achieved simultaneously through lossy compression and efficient clustering; discusses ways to make use of domain knowledge in generating abstractions; reviews optimal prototype selection using genetic algorithms; suggests possible ways of dealing with big data problems using multiagent systems. A must-read for all researchers involved in data mining and big data, the book presents each algorithm within a discussion of the wider context, implementation details and experimental results, further supported by bibliographic notes and a glossary.
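The non-lossy scheme described above builds on run-length encoding of binary-valued feature vectors. As a minimal sketch of that underlying idea (not the book's actual scheme; function names and the example pattern are illustrative), a bit vector can be stored as its first bit plus the lengths of its alternating runs, which is lossless and compact when long runs of 0s or 1s dominate:

```python
def rle_encode(bits):
    """Run-length encode a binary feature vector.

    Returns (first_bit, run_lengths). Since runs strictly alternate
    between 0 and 1, the run lengths plus the first bit fully and
    losslessly determine the original vector.
    """
    if not bits:
        return (0, [])
    runs = [1]
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            runs[-1] += 1
        else:
            runs.append(1)
    return (bits[0], runs)

def rle_decode(first, runs):
    """Invert rle_encode, recovering the original bit vector."""
    out, bit = [], first
    for length in runs:
        out.extend([bit] * length)
        bit ^= 1  # runs alternate between 0 and 1
    return out

pattern = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1]
encoded = rle_encode(pattern)
assert rle_decode(*encoded) == pattern  # non-lossy round trip
```

Operating "directly in the compressed domain", as the book proposes, means computing distances or class decisions from representations like these without first decoding them.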
This book introduces readers to advanced data science techniques for signal mining in connection with agriculture. It shows how to apply heuristic modeling to improve farm-level efficiency, and how to use sensors and data intelligence to provide closed-loop feedback, while also providing recommendation techniques that yield actionable insights. The book also proposes certain macroeconomic pricing models, which data-mine macroeconomic signals and the influence of global economic trends on small-farm sustainability to provide actionable insights to farmers, helping them avoid financial disasters due to recurrent economic crises. The book is intended to equip current and future software engineering teams and operations research experts with the skills and tools they need in order to fully utilize advanced data science, artificial intelligence, heuristics, and economic models to develop software capabilities that help to achieve sustained food security for future generations.
This volume presents techniques and theories drawn from mathematics, statistics, computer science, and information science to analyze problems in business, economics, finance, insurance, and related fields. The authors present proposals for solutions to common problems in these fields. To this end, they show the use of mathematical, statistical, and actuarial modeling, apply concepts from data science to construct appropriate models with real-life data, and employ the design and implementation of computer algorithms to evaluate decision-making processes. This book is unique in that it connects data science - as practiced by data scientists from different backgrounds - with basic and advanced concepts and tools used in econometrics, operational research, and actuarial sciences. It is therefore a must-read for scholars, students, and practitioners interested in a better understanding of the techniques and theories of these fields.
After a short description of the key concepts of big data, the book explores the secrecy and security threats posed especially by cloud-based data storage. It delivers conceptual frameworks and models, along with case studies of recent technology.
Enterprise Architecture, Integration, and Interoperability and the Networked Enterprise have become the theme of many conferences in the past few years. These conferences were organised by IFIP TC5 with the support of its two working groups: WG 5.12 (Architectures for Enterprise Integration) and WG 5.8 (Enterprise Interoperability), both concerned with aspects of the topic: how is it possible to architect and implement businesses that are flexible and able to change, to interact, and to use one another's services in a dynamic manner for the purpose of (joint) value creation. The original question of enterprise integration in the 1980s was: how can we achieve an integrated information and material flow in the enterprise? Various methods and reference models were developed or proposed - ranging from tightly integrated monolithic system architectures, through cell-based manufacturing, to on-demand interconnection of businesses to form virtual enterprises in response to market opportunities. Two camps have emerged in the endeavour to achieve the same goal, namely, interoperability between businesses (whereupon interoperability is the ability to exchange information in order to use one another's services or to jointly implement a service). One school of researchers addresses the technical aspects of creating dynamic (and static) interconnections between disparate businesses (or parts thereof).
Deep learning models are at the core of artificial intelligence research today. It is well known that deep learning techniques are disruptive for Euclidean data, such as images, and for sequence data, such as text, but are not immediately applicable to graph-structured data. This gap has driven a wave of research on deep learning for graphs, including graph representation learning, graph generation, and graph classification. The new neural network architectures for graph-structured data (graph neural networks, GNNs for short) have performed remarkably well on these tasks, as demonstrated by applications in social networks, bioinformatics, and medical informatics. Despite these successes, GNNs still face many challenges, ranging from foundational methodologies to the theoretical understanding of the power of graph representation learning. This book provides a comprehensive introduction to GNNs. It first discusses the goals of graph representation learning and then reviews the history, current developments, and future directions of GNNs. The second part presents and reviews fundamental methods and theories concerning GNNs, while the third part describes various frontiers that are built on GNNs. The book concludes with an overview of recent developments in a number of applications using GNNs. This book is suitable for a wide audience including undergraduate and graduate students, postdoctoral researchers, professors and lecturers, as well as industrial and government practitioners who are new to this area or who already have some basic background but want to learn more about advanced and promising techniques and applications.
Traditional methods for handling spatial data are encumbered by the assumption of separate origins for horizontal and vertical measurements, but modern measurement systems operate in a 3-D spatial environment. The 3-D Global Spatial Data Model: Principles and Applications, Second Edition presents a model for handling digital spatial data: the global spatial data model, or GSDM. The GSDM preserves the integrity of three-dimensional spatial data while also providing additional benefits such as simpler equations, worldwide standardization, and the ability to track spatial data accuracy with greater specificity and convenience. This second edition expands to cover new topics that satisfy a growing need in the GIS, professional surveying, machine control, and Big Data communities, while continuing to embrace the earth-centered fixed coordinate system as the fundamental point of origin of one-, two-, and three-dimensional data sets. Ideal for both beginner and advanced levels, this book also provides guidance and insight on how to link to data collected and stored in legacy systems.
This edited book covers recent advances in techniques, methods and tools for the problem of learning from data streams generated by evolving, non-stationary processes. The goal is to discuss and review the advanced techniques, methods and tools dedicated to managing, exploiting and interpreting data streams in non-stationary environments. The book includes the notions, definitions, and background required to understand the problem of learning from data streams in non-stationary environments, and synthesizes the state of the art in the domain, discussing advanced aspects and concepts and presenting open problems and future challenges in this field. Provides multiple examples to facilitate the understanding of data streams in non-stationary environments; presents several application cases to show how the methods solve different real-world problems; discusses the links between methods to help stimulate new research and application directions.
This open access book describes the results of natural language processing and machine learning methods applied to clinical text from electronic patient records. It is divided into twelve chapters. Chapters 1-4 discuss the history and background of the original paper-based patient records, their purpose, and how they are written and structured. These initial chapters do not require any technical or medical background knowledge. The remaining eight chapters are more technical in nature and describe various medical classifications and terminologies such as ICD diagnosis codes, SNOMED CT, MeSH, UMLS, and ATC. Chapters 5-10 cover basic tools for natural language processing and information retrieval, and how to apply them to clinical text. The differences between rule-based and machine learning-based methods, as well as between supervised and unsupervised machine learning methods, are also explained. Next, ethical concerns regarding the use of sensitive patient records for research purposes are discussed, including methods for de-identifying electronic patient records and safely storing patient records. The book's closing chapters present a number of applications in clinical text mining and summarise the lessons learned from the previous chapters. The book provides a comprehensive overview of technical issues arising in clinical text mining, and offers a valuable guide for advanced students in health informatics, computational linguistics, and information retrieval, and for researchers entering these fields.
The post-genomic revolution is witnessing the generation of petabytes of data annually, with deep implications ranging across evolutionary theory, developmental biology, agriculture, and disease processes. "Data Mining for Systems Biology: Methods and Protocols" surveys and demonstrates the science and technology of converting an unprecedented data deluge to new knowledge and biological insight. The volume is organized around two overlapping themes, network inference and functional inference. Written in the highly successful "Methods in Molecular Biology" series format, chapters include introductions to their respective topics, lists of the necessary materials and reagents, step-by-step, readily reproducible protocols, and key tips on troubleshooting and avoiding known pitfalls. Authoritative and practical, "Data Mining for Systems Biology: Methods and Protocols" also seeks to aid researchers in the further development of the databases, mining and visualization systems that are central to the paradigm-altering discoveries being made with increasing frequency.
This book addresses different methods and techniques of integration for enhancing the overall goal of data mining. The book is a collection of high-quality peer-reviewed research papers presented at the Sixth International Conference on Computational Intelligence in Data Mining (ICCIDM 2021), held at Aditya Institute of Technology and Management, Tekkali, Andhra Pradesh, India, during December 11-12, 2021. The book addresses the difficulties and challenges of seamlessly integrating two core disciplines of computer science, i.e., computational intelligence and data mining. The book helps to disseminate knowledge about innovative, active research directions in the fields of data mining, machine learning and computational intelligence, along with current issues and applications of related topics.
Making use of data is no longer a niche pursuit but central to almost every project. With access to massive compute resources and vast amounts of data, it seems at least in principle possible to solve any problem. However, successful data science projects result from the intelligent application of: human intuition in combination with computational power; sound background knowledge with computer-aided modelling; and critical reflection on the obtained insights and results. Substantially updating the previous edition, then entitled Guide to Intelligent Data Analysis, this core textbook continues to provide a hands-on instructional approach to many data science techniques, and explains how these are used to solve real-world problems. The work balances the practical aspects of applying and using data science techniques with the theoretical and algorithmic underpinnings from mathematics and statistics. Major updates on techniques and subject coverage (including deep learning) are included. Topics and features: guides the reader through the process of data science, following the interdependent steps of project understanding, data understanding, data blending and transformation, modeling, as well as deployment and monitoring; includes numerous examples using the open source KNIME Analytics Platform, together with an introductory appendix; provides a review of the basics of classical statistics that support and justify many data analysis methods, and a glossary of statistical terms; integrates illustrations and case-study-style examples to support pedagogical exposition; supplies further tools and information at an associated website. This practical and systematic textbook/reference is a "need-to-have" tool for graduate and advanced undergraduate students and essential reading for all professionals who face data science problems. Moreover, it is a "need to use, need to keep" resource following one's exploration of the subject.
The latest inventions in internet technology influence most business and daily activities. Internet security, internet data management, web search, data grids, cloud computing, and web-based applications play vital roles, especially in business and industry, as more transactions go online and mobile. Issues related to ubiquitous computing are becoming critical. Internet technology and data engineering should reinforce the efficiency and effectiveness of business processes. These technologies should help people make better and more accurate decisions by presenting the necessary information and the possible consequences of their decisions. Intelligent information systems should help us better understand and manage information with ubiquitous data repositories and cloud computing. This book is a compilation of recent research findings in internet technology and data engineering. It provides state-of-the-art accounts of computational algorithms and tools, database management and database technologies, intelligent information systems, data engineering applications, internet security, internet data management, web search, data grids, cloud computing, web-based applications, and other related topics.