This book is a comprehensive, hands-on guide to the basics of data mining and machine learning, with a special emphasis on supervised and unsupervised learning methods. The book stresses the new ways of thinking needed to master machine learning on the Python, R, and Java programming platforms. It first provides an understanding of data mining, machine learning and their applications, giving special attention to classification and clustering techniques. The authors discuss data mining and machine learning techniques with case studies and examples. The book also presents hands-on coding examples of some well-known supervised and unsupervised learning techniques using three popular coding platforms: R, Python, and Java. It explains some of the most popular classification techniques (K-NN, Naive Bayes, decision tree, random forest, support vector machine, etc.) along with a basic description of artificial neural networks and deep neural networks. The book is useful for professionals, students studying data mining and machine learning, and researchers in supervised and unsupervised learning techniques.
This monograph offers an original, broad and very diverse exploration of the seriation domain in data analysis, together with building a specific relation to clustering. Relative to a data table crossing a set of objects and a set of descriptive attributes, the search for orders which correspond respectively to these two sets is formalized mathematically and statistically. State-of-the-art methods are created and compared with classical methods, and a thorough understanding of the mutual relationships between these methods is clearly expressed. The authors distinguish two families of methods: geometric representation methods, and algorithmic and combinatorial methods. Original and accurate methods are provided in the framework for both families. Their basis and comparison are established on both theoretical and experimental levels. The experimental analysis is very varied and very comprehensive. Seriation in Combinatorial and Statistical Data Analysis has a unique character in the literature falling within the fields of Data Analysis, Data Mining and Knowledge Discovery. It will be a valuable resource for students and researchers in the latter fields.
This important text/reference presents a comprehensive review of techniques for taxonomy matching, discussing matching algorithms, analyzing matching systems, and comparing matching evaluation approaches. Different methods are investigated in accordance with the criteria of the Ontology Alignment Evaluation Initiative (OAEI). The text also highlights promising developments and innovative guidelines, to further motivate researchers and practitioners in the field. Topics and features: discusses the fundamentals and the latest developments in taxonomy matching, including the related fields of ontology matching and schema matching; reviews next-generation matching strategies, matching algorithms, matching systems, and OAEI campaigns, as well as alternative evaluations; examines how the latest techniques make use of different sources of background knowledge to enable precise matching between repositories; describes the theoretical background, state-of-the-art research, and practical real-world applications; covers the fields of dynamic taxonomies, personalized directories, catalog segmentation, and recommender systems. This stimulating book is an essential reference for practitioners engaged in data science and business intelligence, and for researchers specializing in taxonomy matching and semantic similarity assessment. The work is also suitable as a supplementary text for advanced undergraduate and postgraduate courses on information and metadata management.
This monograph discusses software reuse and how it can be applied at different stages of the software development process, on different types of data and at different levels of granularity. Several challenging hypotheses are analyzed and confronted using novel data-driven methodologies, in order to solve problems in requirements elicitation and specification extraction, software design and implementation, as well as software quality assurance. The book is accompanied by a number of tools, libraries and working prototypes in order to practically illustrate how the phases of the software engineering life cycle can benefit from unlocking the potential of data. Software engineering researchers, experts, and practitioners can benefit from the various methodologies presented and can better understand how knowledge extracted from software data residing in various repositories can be combined and used to enable effective decision making and save considerable time and effort through software reuse. Mining Software Engineering Data for Software Reuse can also prove handy for graduate-level students in software engineering.
Data Mining for Business Applications presents the state-of-the-art research and development outcomes on methodologies, techniques, approaches and successful applications in the area. The contributions mark a paradigm shift from data-centered pattern mining to domain-driven actionable knowledge discovery for next-generation KDD research and applications. The contents identify how KDD techniques can better contribute to critical domain problems in theory and practice, and strengthen business intelligence in complex enterprise applications. The volume also explores challenges and directions for future research and development in the dialogue between academia and business.
This book offers a comprehensive review of multilabel techniques widely used to classify and label texts, pictures, videos and music on the Internet. A deep review of the specialized literature in the field includes the available software needed to work with this kind of data. It provides the user with the software tools needed to deal with multilabel data, as well as step-by-step instruction on how to use them. The main topics covered are:
* The special characteristics of multilabel data and the metrics available to measure them.
* The importance of taking advantage of label correlations to improve the results.
* The different approaches followed to face multilabel classification.
* The preprocessing techniques applicable to multilabel datasets.
* The available software tools to work with multilabel data.
This book is beneficial for professionals and researchers in a variety of fields because of the wide range of potential applications for multilabel classification. Besides its multiple applications to classify different types of online information, it is also useful in many other areas, such as genomics and biology. No previous knowledge about the subject is required. The book introduces all the concepts needed to understand multilabel data characterization, treatment and evaluation.
This book provides a comprehensive set of characterization, prediction, optimization, evaluation, and evolution techniques for a diagnosis system for fault isolation in large electronic systems. Readers with a background in electronics design or system engineering can use this book as a reference to derive insightful knowledge from data analysis and use this knowledge as guidance for designing reasoning-based diagnosis systems. Moreover, readers with a background in statistics or data analytics can use this book as a practical case study for adapting data mining and machine learning techniques to electronic system design and diagnosis. This book identifies the key challenges in reasoning-based, board-level diagnosis system design and presents the solutions and corresponding results that have emerged from leading-edge research in this domain. It covers topics ranging from highly accurate fault isolation, adaptive fault isolation, diagnosis-system robustness assessment, to system performance analysis and evaluation, knowledge discovery and knowledge transfer. With its emphasis on the above topics, the book provides an in-depth and broad view of reasoning-based fault diagnosis system design.
* Explains and applies optimized techniques from the machine-learning domain to solve the fault diagnosis problem in the realm of electronic system design and manufacturing;
* Demonstrates techniques based on industrial data and feedback from an actual manufacturing line;
* Discusses practical problems, including diagnosis accuracy, diagnosis time cost, evaluation of diagnosis system, handling of missing syndromes in diagnosis, and need for fast diagnosis-system development.
In this book, Dr. Soofastaei and his colleagues reveal how all mining managers can effectively deploy advanced analytics in their day-to-day operations, one business decision at a time. Most mining companies have a massive amount of data at their disposal. However, they cannot use the stored data in any meaningful way. Advanced analytics, a powerful new business tool, enables many mining companies to aggressively leverage their data in key business decisions and processes with impressive results. From statistical analysis to machine learning and artificial intelligence, the authors show how many analytical tools can improve decisions about everything in the mine value chain, from exploration to marketing. Combining the science of advanced analytics with mining industry business solutions, the authors introduce the "Advanced Analytics in Mining Engineering Book" as a practical road map and set of tools for unleashing the potential buried in your company's data. The book is aimed at providing mining executives, managers, and research and development teams with an understanding of the business value and applicability of different analytic approaches, and at helping data analytics leads by giving them a business framework in which to assess the value, cost, and risk of potential analytical solutions. In addition, the book will provide the next generation of miners (undergraduate and graduate IT and mining engineering students) with an understanding of data analytics applied to the mining industry. With chapters structured in line with the mining value chain, the book provides a clear, enterprise-level view of where and how advanced data analytics can best be applied. This book highlights the potential to better interconnect activities in the mining enterprise.
Furthermore, the book explores the opportunities for optimization and increased productivity offered by better interoperability along the mining value chain, in line with the emerging vision of a digital mine with much-enhanced capabilities for modeling, simulation, and the use of digital twins, as seen in leading "digital" industries.
This book is a collection of novel, high-quality scientific contributions addressing several of these challenges. These articles are extended versions of a selection of the best papers initially presented at the French-speaking conference EGC'2019, held in Metz (France, January 21-25, 2019). These extended versions were accepted after an additional peer-review process among papers already accepted in long format at the conference. At the conference itself, the selection of long and short papers was also the result of a double-blind peer-review process among the hundreds of papers initially submitted to each edition of the conference (the acceptance rate for long papers is about 25%).
Covid-19 hit an unprepared world as the deadliest pandemic of the century. Governments and authorities, as leaders and decision makers fighting the virus, have drawn heavily on the power of AI and its data analytics models for urgent decision support, on a scale never before seen in human history. This book showcases a collection of important data analytics models that were used during the epidemic, and discusses and compares their efficacy and limitations. Readers from both the healthcare industry and academia can gain unique insights into how data analytics models were designed and applied to epidemic data. Taking Covid-19 as a case study, readers, especially those working in similar fields, will be better prepared should a new wave of viral epidemic arise in the near future.
Advances in Computational Algorithms and Data Analysis presents state-of-the-art advances in computational algorithms and data analysis. The selected articles are representative of these subjects, which sit at the leading edge of technology. The volume serves as an excellent reference work for researchers and graduate students working on computational algorithms and data analysis.
This book provides a thorough summary of the means currently available to the investigators of Artificial Intelligence for making criminal behavior (both individual and collective) foreseeable, and for assisting their investigative capacities. The volume provides introductory chapters on artificial intelligence and machine learning suitable for an upper-level undergraduate course, for students with exposure to mathematics and some programming skill, or for a graduate course. It also brings the latest research in Artificial Intelligence to life with its chapters on fascinating applications in the area of law enforcement, though much is also being accomplished in the fields of medicine and bioengineering. Individuals with a background in Artificial Intelligence will find the opening chapters to be an excellent refresher, but the greatest excitement will likely be the law enforcement examples, for little has been done in that area. The editors have chosen to shine a bright light on law enforcement analytics utilizing artificial neural network technology to encourage other researchers to become involved in this very important and timely field of study.
The book features original papers from the 2nd International Conference on Smart IoT Systems: Innovations and Computing (SSIC 2019), presenting scientific work related to smart solution concepts. It discusses computational collective intelligence, which includes interactions between smart devices, smart environments and smart interactions, as well as information technology support for such areas. It also describes how to successfully approach various government organizations for funding for business and humanitarian technology development projects. Thanks to the high-quality content and the broad range of topics covered, the book appeals to researchers pursuing advanced studies.
This book presents recent advances in knowledge discovery in databases (KDD) with a focus on the areas of market basket databases, time-stamped databases and multiple related databases. Various interesting and intelligent algorithms are reported for data mining tasks. A large number of association measures are presented, which play significant roles in decision support applications. This book presents, discusses and contrasts new developments in mining time-stamped data, time-based data analyses, the identification of temporal patterns, the mining of multiple related databases, as well as local pattern analysis.
This book examines the field of parallel database management systems and illustrates the great variety of solutions based on a shared-storage or a shared-nothing architecture. Constantly dropping memory prices and the desire to operate with low-latency responses on large sets of data paved the way for main memory-based parallel database management systems. However, this area is currently dominated by the shared-nothing approach in order to preserve the in-memory performance advantage by processing data locally on each server. The main argument this book makes is that such a unilateral development will cease due to the combination of the following three trends: a) Today's network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory on a local server and on a remote server to, and even below, a single order of magnitude. b) Modern storage systems scale gracefully, are elastic and provide high availability. c) A modern storage system such as Stanford's RAMCloud even keeps all data resident in main memory. Exploiting these characteristics in the context of a main memory-based parallel database management system is desirable. The book demonstrates that the advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible.
This book reports on cutting-edge research carried out within the context of the EU-funded Dicode project, which aims at facilitating and augmenting collaboration and decision making in data-intensive and cognitively complex settings. Whenever appropriate, Dicode builds on prominent high-performance computing paradigms and large data processing technologies to meaningfully search, analyze, and aggregate data from diverse, extremely large and rapidly evolving sources. The Dicode approach and services are fully explained and particular emphasis is placed on deepening insights regarding the exploitation of big data, as well as on collaboration and issues relating to sense-making support. Building on current advances, the solution developed in the Dicode project brings together the reasoning capabilities of both the machine and humans. It can be viewed as an innovative "workbench" incorporating and orchestrating a set of interoperable services that reduce the data intensiveness and complexity overload at critical decision points to a manageable level, thus permitting stakeholders to be more productive and effective in their work practices.
This book includes high-quality papers presented at the Second International Conference on Data Science and Management (ICDSM 2021), organized by the Gandhi Institute for Education and Technology, Bhubaneswar, from 19 to 20 February 2021. It features research in which data science is used to facilitate the decision-making process in various application areas, and also covers a wide range of learning methods and their applications in a number of learning problems. The empirical studies, theoretical analyses and comparisons to psychological phenomena described contribute to the development of products to meet market demands.
This book provides a practical and fairly comprehensive review of Data Science through the lens of dimensionality reduction, as well as hands-on techniques to tackle problems with data collected in the real world. State-of-the-art results and solutions from statistics, computer science and mathematics are explained from the point of view of a practitioner in any domain science, such as biology, cyber security, chemistry, sports science and many others. Quantitative and qualitative assessment methods are described to implement and validate the solutions back in the real world where the problems originated. The ability to generate, gather and store volumes of data on the order of tera- and exabytes daily has far outpaced our ability to derive useful information with available computational resources for many domains. This book focuses on data science and problem definition, data cleansing, feature selection and extraction, statistical, geometric, information-theoretic, biomolecular and machine learning methods for dimensionality reduction of big datasets and problem solving, as well as a comparative assessment of solutions in a real-world setting. This book targets professionals working within related fields with an undergraduate degree in any science area, particularly a quantitative one. Readers should be able to follow the examples in this book that introduce each method or technique. These motivating examples are followed by precise definitions of the technical concepts required and presentation of the results in general situations. These concepts require a degree of abstraction that can be followed by re-interpreting the concepts as in the original example(s). Finally, each section closes with solutions to the original problem(s) afforded by these techniques, sometimes in several ways, to compare and contrast their advantages and disadvantages relative to other solutions.
Locating empirical information on specific service industry characteristics is not an easy task, even for an individual familiar with various sources of data. This book is a quick source of information on service industry statistics across many nations of the world. The reader is introduced to finding key sources of data, building analytical ratios from diverse sources, and understanding the advantages and disadvantages of data selection methods in the service sector. The global nature of the data compiled in this book, especially an extensive coverage of the United States, makes it an invaluable resource to active researchers and stakeholders in the service industry as well as those who seek to enter it.
Social network analysis increasingly bridges the discovery of patterns in diverse areas of study as data becomes more available and more complex. Yet the construction of huge networks from large data often requires entirely different approaches for analysis, including graph theory, statistics, machine learning and data mining. This work covers frontier studies on social network analysis and mining from different perspectives such as social network sites, financial data, e-mails, forums, academic research funds, XML technology, blog content, community detection and clique finding, prediction of users' behavior, privacy in social network analysis, mobility from a spatio-temporal point of view, agent technology and political parties in parliament. These topics will be of interest to researchers and practitioners from different disciplines including, but not limited to, social sciences and engineering.
The textbook at hand aims to provide an introduction to the use of automated methods for gathering strategic competitive intelligence. The text does not describe a single research discipline in its own right, such as machine learning or Web mining. Rather, it contemplates an "application scenario," namely the gathering of knowledge that appears of paramount importance to organizations, e.g., companies and corporations. To this end, the book first summarizes the range of research disciplines that contribute to addressing the issue, extracting from each those grains that are of utmost relevance to the depicted application scope. Moreover, the book presents systems that put these techniques to practical use (e.g., reputation monitoring platforms) and takes an inductive approach to define the "gestalt" of mining for competitive strategic intelligence by selecting major use cases that are laid out and explained in detail. These pieces form the first part of the book. Each of those use cases is backed by a number of research papers, some of which are contained in their largely original versions in the second part of the monograph.
This book constitutes the refereed post-conference proceedings of the Fifth IFIP TC 12 International Conference on Computational Intelligence in Data Science, ICCIDS 2022, held virtually, in March 2022. The 28 revised full papers presented were carefully reviewed and selected from 96 submissions. The papers cover topics such as computational intelligence for text analysis; computational intelligence for image and video analysis; blockchain and data science.
This book includes the proceedings of the second workshop on recommender systems in fashion and retail (2020). It aims to present a state-of-the-art view of the advancements within the field of recommendation systems, with focused application to e-commerce, retail, and fashion, through chapters covering contributions from academic as well as industrial researchers active within this emerging field. Recommender systems are often used to solve different complex problems in this scenario, such as product recommendations, size and fit recommendations, and social media-influenced recommendations (outfits worn by influencers).
In many real-world problems, rare categories (minority classes) play essential roles despite their extreme scarcity. The discovery, characterization and prediction of rare categories of rare examples may protect us from fraudulent or malicious behavior, aid scientific discovery, and even save lives. This book focuses on rare category analysis, where the majority classes have smooth distributions, and the minority classes exhibit the compactness property. Furthermore, it focuses on the challenging cases where the support regions of the majority and minority classes overlap. The author has developed effective algorithms with theoretical guarantees and good empirical results for the related techniques, and these are explained in detail. The book is suitable for researchers in the area of artificial intelligence, in particular machine learning and data mining.
Analyzing Social Media Networks with NodeXL: Insights from a Connected World, Second Edition, provides readers with a thorough, practical and updated guide to NodeXL, the open-source social network analysis (SNA) plug-in for use with Excel. The book analyzes social media, provides a NodeXL tutorial, and presents network analysis case studies, all of which are revised to reflect the latest developments. Sections cover history and concepts, mapping and modeling, the detailed operation of NodeXL, and case studies, including e-mail, Twitter, Facebook, Flickr and YouTube. In addition, there are descriptions of each system and types of analysis for identifying people, documents, groups and events. This book is perfect for use as a course text in social network analysis or as a guide for practicing NodeXL users.