Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining, the science of exploring large and complex bodies of data in order to discover useful patterns. Decision tree learning continues to evolve over time: existing methods are constantly being improved and new methods introduced. This 2nd Edition is dedicated entirely to the field of decision trees in data mining. It covers all aspects of this important technique, as well as improved or new methods and techniques developed after the publication of our first edition. In this new edition, all chapters have been revised and new topics brought in. New topics include Cost-Sensitive Active Learning, Learning with Uncertain and Imbalanced Data, Using Decision Trees beyond Classification Tasks, Privacy Preserving Decision Tree Learning, Lessons Learned from Comparative Studies, and Learning Decision Trees for Big Data. A walk-through guide to existing open-source data mining software is also included in this edition. This book invites readers to explore the many benefits in data mining that decision trees offer.
This book provides an overview of crowdsourced data management. Covering all aspects including the workflow, algorithms and research potential, it particularly focuses on the latest techniques and recent advances. The authors identify three key aspects in determining the performance of crowdsourced data management: quality control, cost control and latency control. By surveying and synthesizing a wide spectrum of studies on crowdsourced data management, the book outlines important factors that need to be considered to improve crowdsourced data management. It also introduces a practical crowdsourced-database-system design and presents a number of crowdsourced operators. Self-contained and covering theory, algorithms, techniques and applications, it is a valuable reference resource for researchers and students new to crowdsourced data management with a basic knowledge of data structures and databases.
This book is a timely collection of chapters that present the state of the art in the analysis and application of big data. Working within the broader context of big data, this text focuses on hot topics in social network modelling and analysis, such as online dating recommendations, hiring practices, and subscription-type prediction in mobile phone services. The chapters are substantially improved and extended versions of the best papers presented at the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM'2016), which was held in August 2016. Social Network Based Big Data Analysis and Applications will appeal to students and researchers in the field.
This book presents the proceedings of the Conference on Algorithms and Applications (ALAP 2018), which focuses on various areas of computing, such as distributed systems and security, big data and analytics, and very-large-scale integration (VLSI) design. The book provides solutions to a broad class of problems in diverse areas of algorithms in our daily lives, in a world designed for, and increasingly controlled by, algorithms. Written by eminent personalities from academia and industry, the papers included offer insights from a number of perspectives, providing an overview of the state of the art in the field. The book consists of invited talks by respected speakers, papers presented in technical sessions, and tutorials that offer ideas, results, work-in-progress and experiences of various algorithmic aspects of computational science and engineering.
This book features research papers presented at the International Conference on Emerging Technologies in Data Mining and Information Security (IEMIS 2018) held at the University of Engineering & Management, Kolkata, India, on February 23-25, 2018. It comprises high-quality research work by academicians and industrial experts in the field of computing and communication, including full-length papers, research-in-progress papers, and case studies related to all the areas of data mining, machine learning, Internet of Things (IoT) and information security.
This book constitutes the refereed proceedings of the 6th IFIP TC 5 International Conference on Computational Intelligence and Its Applications, CIIA 2018, held in Oran, Algeria, in May 2018. The 56 full papers presented were carefully reviewed and selected from 202 submissions. They are organized in the following topical sections: data mining and information retrieval; evolutionary computation; machine learning; optimization; planning and scheduling; wireless communication and mobile computing; Internet of Things (IoT) and decision support systems; pattern recognition and image processing; and semantic web services.
This book provides a comprehensive picture of mobile big data, from data sources to mobile data-driven applications. Mobile Big Data comprises two main components: an overview of mobile big data, and case studies based on real-world data recently collected by one of the largest mobile network carriers in China. In the first component, four areas of the mobile big data life cycle are surveyed: data source and collection, transmission, computing platform and applications. In the second component, two case studies are provided, based on the signaling data collected in the cellular core network, covering subscriber privacy evaluation and demand forecasting for network management. These cases respectively give a vivid demonstration of what mobile big data looks like, and how it can be analyzed and mined to generate useful and meaningful information and knowledge. This book targets researchers, practitioners and professors working in this field. Advanced-level students studying computer science and electrical engineering will also be interested in this book as supplemental reading.
This richly illustrated book provides an easy-to-read introduction to the challenges of organizing and integrating modern data worlds, explaining the contribution of public statistics and the ISO standard SDMX (Statistical Data and Metadata Exchange). As such, it is a must for data experts as well as those aspiring to become one. Today, exponentially growing data worlds are increasingly determining our professional and private lives. The rapid increase in the amount of globally available data, fueled by search engines and social networks but also by new technical possibilities such as Big Data, offers great opportunities. But whatever the undertaking - driving the blockchain revolution or making smartphones even smarter - success will be determined by how well it is possible to integrate, i.e. to collect, link and evaluate, the required data. One crucial factor in this is the introduction of a cross-domain order system in combination with a standardization of the data structure. Using everyday examples, the authors show how the concepts of statistics provide the basis for the universal and standardized presentation of any kind of information. They also introduce the international statistics standard SDMX, describing the profound changes it has made possible and the related order system for the international statistics community.
This text is about the spreading of information and influence in complex networks. Although epidemic and social spreading were previously considered similar and modeled with parallel approaches, there is now experimental evidence that they work in subtly different ways. While complex contagion was previously explored through modeling, there is currently an explosion of work on revealing its underlying mechanisms based on big data and data-driven approaches. This volume consists of four parts. Part 1 is an Introduction, providing an accessible summary of the state of the art. Part 2 provides an overview of the central theoretical developments in the field. Part 3 describes the empirical work on observing spreading processes in real-world networks. Finally, Part 4 goes into detail with recent and exciting new developments: dedicated studies designed to measure specific aspects of the spreading processes, often using randomized controlled trials to isolate the network effect from confounders, such as homophily. Each contribution is authored by leading experts in the field. This volume, though based on technical selections of the most important results on complex spreading, remains quite accessible to the newly interested. The main benefit to the reader is that the topics are carefully structured to take the novice to the level of expert on the topic of social spreading processes. This book will be of great importance to a wide field: from researchers in physics, computer science, and sociology to professionals in public policy and public health.
This open access book describes the results of natural language processing and machine learning methods applied to clinical text from electronic patient records. It is divided into twelve chapters. Chapters 1-4 discuss the history and background of the original paper-based patient records, their purpose, and how they are written and structured. These initial chapters do not require any technical or medical background knowledge. The remaining eight chapters are more technical in nature and describe various medical classifications and terminologies such as ICD diagnosis codes, SNOMED CT, MeSH, UMLS, and ATC. Chapters 5-10 cover basic tools for natural language processing and information retrieval, and how to apply them to clinical text. The differences between rule-based and machine learning-based methods, as well as between supervised and unsupervised machine learning methods, are also explained. Next, ethical concerns regarding the use of sensitive patient records for research purposes are discussed, including methods for de-identifying electronic patient records and safely storing patient records. The book's closing chapters present a number of applications in clinical text mining and summarise the lessons learned from the previous chapters. The book provides a comprehensive overview of technical issues arising in clinical text mining, and offers a valuable guide for advanced students in health informatics, computational linguistics, and information retrieval, and for researchers entering these fields.
The book comprises selected high-quality papers presented at the International Conference on Computing, Power and Communication Technologies 2018 (GUCON 2018), organised by Galgotias University, India, in September 2018. It discusses issues in electrical, computer and electronics engineering and technologies. The selected papers are organised into three sections - cloud computing and computer networks; data mining and big data analysis; and bioinformatics and machine learning. In-depth discussions on various issues under these topics provide an interesting compilation for researchers, engineers, and students.
The two-volume set LNAI 11288 and 11289 constitutes the proceedings of the 17th Mexican International Conference on Artificial Intelligence, MICAI 2018, held in Guadalajara, Mexico, in October 2018. The total of 62 papers presented in these two volumes was carefully reviewed and selected from 149 submissions. The contributions are organized in topical sections as follows: Part I: evolutionary and nature-inspired intelligence; machine learning; fuzzy logic and uncertainty management. Part II: knowledge representation, reasoning, and optimization; natural language processing; and robotics and computer vision.
This edited volume presents examples of social science research projects that employ new methods of quantitative analysis and mathematical modeling of social processes. It covers areas of empirical and theoretical investigation that use formal mathematics in a way that is accessible to individuals who lack extensive expertise but still wish to expand their scope of research methodology and add to their data analysis toolbox. Mathematical Modeling of Social Relationships shows how mathematical modeling can help us understand the fundamental, compelling, and yet sometimes complicated concepts that arise in the social sciences. This volume will appeal to upper-level students and researchers in a broad range of fields within the social sciences, as well as the disciplines of social psychology, complex systems, and applied mathematics.
This book presents modeling methods and algorithms for data-driven prediction and forecasting of practical industrial processes by employing machine learning and statistics methodologies. Related case studies, especially on energy systems in the steel industry, are also addressed and analyzed. The case studies in this volume are entirely rooted in both classical data-driven prediction problems and industrial practice requirements. Detailed figures and tables demonstrate the effectiveness and generalization of the methods addressed, and the classifications of the addressed prediction problems come from practical industrial demands, rather than from academic categories. As such, readers will learn the corresponding approaches for resolving their industrial technical problems. Although the contents of this book and its case studies come from the steel industry, these techniques can also be used for other process industries. This book appeals to students, researchers, and professionals within the machine learning and data analysis and mining communities.
This book demonstrates how quantitative methods for text analysis can successfully combine with qualitative methods in the study of different disciplines of the Humanities and Social Sciences (HSS). The book focuses on learning about the evolution of ideas of HSS disciplines through a distant reading of the contents conveyed by scientific literature, in order to retrieve the most relevant topics being debated over time. Quantitative methods, statistical techniques and software packages are used to identify and study the main subject matters of a discipline from raw textual data, both in the past and today. The book also deals with the concept of quality of life of words and aims to foster a discussion about the life cycle of scientific ideas. Textual data retrieved from large corpora pose interesting challenges for any data analysis method and today represent a growing area of research in many fields. New problems emerge from the growing availability of large databases and new methods are needed to retrieve significant information from those large information sources. This book can be used to explain how quantitative methods can be part of the research instrumentation and the "toolbox" of scholars of Humanities and Social Sciences. The book contains numerous examples and a description of the main methods in use, with references to literature and available software. Most of the chapters of the book have been written in non-technical language for HSS researchers without mathematical, computer or statistical backgrounds.
This book addresses and examines the impacts of applications and services for data management and analysis, such as infrastructure, platforms, software, and business processes, on both academia and industry. The chapters cover effective approaches in dealing with the inherent complexity and increasing demands of big data management from an applications perspective. Various case studies included have been reported by data analysis experts who work closely with their clients in such fields as education, banking, and telecommunications. Understanding how data management has been adapted to these applications will help students, instructors and professionals in the field. Application areas also include the fields of social network analysis, bioinformatics, and the oil and gas industries.
In this book, the authors first address the research issues by providing a motivating scenario, followed by the exploration of the principles and techniques of the challenging topics. Then they solve the raised research issues by developing a series of methodologies. More specifically, the authors study query optimization and tackle query performance prediction for knowledge retrieval. They also handle unstructured data processing and data clustering for knowledge extraction. To optimize the queries issued through interfaces against knowledge bases, the authors propose a cache-based optimization layer between consumers and the querying interface to facilitate the querying and solve the latency issue. The cache depends on a novel learning method that considers the querying patterns from individuals' historical queries without having knowledge of the backing systems of the knowledge base. To predict the query performance for appropriate query scheduling, the authors examine the queries' structural and syntactical features and apply multiple widely adopted prediction models. Their feature modelling approach eschews the knowledge requirement on both the querying languages and system. To extract knowledge from unstructured Web sources, the authors examine two kinds of Web sources containing unstructured data: the source code from Web repositories and the posts in programming question-answering communities. They use natural language processing techniques to pre-process the source code and obtain the natural language elements. Then they apply traditional knowledge extraction techniques to extract knowledge. For the data from programming question-answering communities, the authors take a step towards building a programming knowledge base by starting with paraphrase identification problems and developing novel features to accurately identify duplicate posts.
For domain-specific knowledge extraction, the authors propose to use a clustering technique to separate knowledge into different groups. They focus on developing a new clustering algorithm that uses manifold constraints in the optimization task and achieves fast and accurate performance. For each model and approach presented in this dissertation, the authors have conducted extensive experiments to evaluate it using either public datasets or synthetic data they generated.
This book addresses the current status, challenges and future directions of data-driven materials discovery and design. It presents the analysis and learning from data as a key theme in many science and cyber-related applications. The challenging open questions as well as future directions in the application of data science to materials problems are sketched. Computational and experimental facilities today generate vast amounts of data at an unprecedented rate. The book gives guidance on discovering new knowledge that enables materials innovation to address grand challenges in energy, environment and security, and on establishing the clearer link needed between the data from these facilities and the theory and underlying science. The role of inference and optimization methods in distilling the data and constraining predictions using insights and results from theory is key to achieving the desired goals of real-time analysis and feedback. Thus, the importance of this book lies in emphasizing that the full value of knowledge-driven discovery using data can only be realized by integrating statistical and information sciences with materials science, which is increasingly dependent on high-throughput and large-scale computational and experimental data gathering efforts. This is especially the case as we enter a new era of big data in materials science with the planning of future experimental facilities such as the Linac Coherent Light Source at Stanford (LCLS-II), the European X-ray Free Electron Laser (EXFEL) and MaRIE (Matter Radiation in Extremes), the signature concept facility from Los Alamos National Laboratory. These facilities are expected to generate hundreds of terabytes to several petabytes of in situ spatially and temporally resolved data per sample.
The questions that then arise include how we can learn from the data to accelerate the processing and analysis of reconstructed microstructure, rapidly map spatially resolved properties from high throughput data, devise diagnostics for pattern detection, and guide experiments towards desired targeted properties. The authors are an interdisciplinary group of leading experts who bring the excitement of the nascent and rapidly emerging field of materials informatics to the reader.
This book introduces the concepts, applications and development of data science in the telecommunications industry by focusing on advanced machine learning and data mining methodologies in the wireless networks domain. Mining Over Air describes the problems and their solutions for wireless network performance and quality, device quality readiness and returns analytics, wireless resource usage profiling, network traffic anomaly detection, intelligence-based self-organizing networks, telecom marketing, social influence, and other important applications in the telecom industry. Written by authors who study big data analytics in wireless networks and telecommunication markets from both industrial and academic perspectives, the book targets the pain points in telecommunication networks and markets through big data. Designed for both practitioners and researchers, the book explores the intersection between the development of new engineering technology and uses data from the industry to understand consumer behavior. It combines engineering savvy with insights about human behavior. Engineers will understand how the data generated from the technology can be used to understand consumer behavior, and social scientists will get a better understanding of the data generation process.
This book reports on new theories and applications in the field of intelligent systems and computing. It covers computational and artificial intelligence methods, as well as advances in computer vision, current issues in big data and cloud computing, computational linguistics, and cyber-physical systems. It also reports on data mining and knowledge extraction technologies, as well as central issues in intelligent information management. Written by active researchers, the respective chapters are based on papers presented at the International Conference on Computer Science and Information Technologies (CSIT 2018), held on September 11-14, 2018, in Lviv, Ukraine, and jointly organized by the Lviv Polytechnic National University, Ukraine, the Kharkiv National University of Radio Electronics, Ukraine, and the Technical University of Lodz, Poland, under the patronage of the Ministry of Education and Science of Ukraine. Given its breadth of coverage, the book provides academics and professionals with extensive information and a timely snapshot of the field of intelligent systems, and is sure to foster new discussions and collaborations among different groups.
The seven-volume set LNCS 11301-11307 constitutes the proceedings of the 25th International Conference on Neural Information Processing, ICONIP 2018, held in Siem Reap, Cambodia, in December 2018. The 401 full papers presented were carefully reviewed and selected from 575 submissions. The papers address the emerging topics of theoretical research, empirical studies, and applications of neural information processing techniques across different domains. The first volume, LNCS 11301, is organized in topical sections on deep neural networks, convolutional neural networks, recurrent neural networks, and spiking neural networks.
The book includes high-quality research papers presented at the International Conference on Innovative Computing and Communication (ICICC 2018), which was held at the Guru Nanak Institute of Management (GNIM), Delhi, India on 5-6 May 2018. Introducing the innovative works of scientists, professors, research scholars, students and industrial experts in the field of computing and communication, the book promotes the transformation of fundamental research into institutional and industrialized research and the conversion of applied exploration into real-time applications.
This book constitutes the refereed proceedings of the 6th International Conference on Big Data Analytics, BDA 2018, held in Warangal, India, in December 2018. The 29 papers presented in this volume were carefully reviewed and selected from 93 submissions. The papers are organized in topical sections named: big data analytics: vision and perspectives; financial data analytics and data streams; web and social media data; big data systems and frameworks; predictive analytics in healthcare and agricultural domains; and machine learning and pattern mining.
This book constitutes the refereed post-conference proceedings of the 8th International Conference on Big Data Technologies and Applications, BDTA 2017, held in Gwangju, South Korea, in November 2017. The 15 revised full papers were carefully reviewed and selected from 25 submissions and cover theoretical foundations and practical applications that underpin the new generation of data analytics and engineering. The contributions deal with the following topics: privacy and security, image processing, context awareness, software engineering and e-commerce, social media and health care.