Data Mining: Concepts and Techniques, Fourth Edition introduces concepts, principles, and methods for mining patterns, knowledge, and models from various kinds of data for diverse applications. Specifically, it delves into the processes for uncovering patterns and knowledge from massive collections of data, known as knowledge discovery from data, or KDD. It focuses on the feasibility, usefulness, effectiveness, and scalability of data mining techniques for large data sets. After an introduction to the concept of data mining, the authors explain the methods for preprocessing, characterizing, and warehousing data. They then partition the data mining methods into several major tasks, introducing concepts and methods for mining frequent patterns, associations, and correlations for large data sets; data classification and model construction; cluster analysis; and outlier detection. Concepts and methods for deep learning are systematically introduced as one chapter. Finally, the book covers the trends, applications, and research frontiers in data mining.
The book features original papers from International Conference on Computational Methods and Data Engineering (ICCMDE 2021), organized by School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India, during November 25-26, 2021. The book covers innovative and cutting-edge work of researchers, developers, and practitioners from academia and industry working in the area of advanced computing.
This book presents innovative research works to demonstrate the potential and the advancements of computing approaches to utilize healthcare centric and medical datasets in solving complex healthcare problems. Computing technique is one of the key technologies that are being currently used to perform medical diagnostics in the healthcare domain, thanks to the abundance of medical data being generated and collected. Nowadays, medical data is available in many different forms like MRI images, CT scan images, EHR data, test reports, histopathological data and doctor patient conversation data. This opens up huge opportunities for the application of computing techniques, to derive data-driven models that can be of very high utility, in terms of providing effective treatment to patients. Moreover, machine learning algorithms can uncover hidden patterns and relationships present in medical datasets, which are too complex to uncover, if a data-driven approach is not taken. With the help of computing systems, today, it is possible for researchers to predict an accurate medical diagnosis for new patients, using models built from previous patient data. Apart from automatic diagnostic tasks, computing techniques have also been applied in the process of drug discovery, by which a lot of time and money can be saved. Utilization of genomic data using various computing techniques is another emerging area, which may in fact be the key to fulfilling the dream of personalized medications. Medical prognostics is another area in which machine learning has shown great promise recently, where automatic prognostic models are being built that can predict the progress of the disease, as well as can suggest the potential treatment paths to get ahead of the disease progression.
Delve into your data for the key to success
Data mining is quickly becoming integral to creating value and business momentum. The ability to detect unseen patterns hidden in the numbers exhaustively generated by day-to-day operations allows savvy decision-makers to exploit every tool at their disposal in the pursuit of better business. By creating models and testing whether patterns hold up, it is possible to discover new intelligence that could change your business's entire paradigm for a more successful outcome. Data Mining for Dummies shows you why it doesn't take a data scientist to gain this advantage, and empowers average business people to start shaping a process relevant to their business's needs. In this book, you'll learn the hows and whys of mining to the depths of your data, and how to make the case for heavier investment into data mining capabilities. The book explains the details of the knowledge discovery process including:
* Model creation, validity testing, and interpretation
* Effective communication of findings
* Available tools, both paid and open-source
* Data selection, transformation, and evaluation
Data Mining for Dummies takes you step-by-step through a real-world data-mining project using open-source tools that allow you to get immediate hands-on experience working with large amounts of data. You'll gain the confidence you need to start making data mining practices a routine part of your successful business. If you're serious about doing everything you can to push your company to the top, Data Mining for Dummies is your ticket to effective data mining.
Text Mining and Visualization: Case Studies Using Open-Source Tools provides an introduction to text mining using some of the most popular and powerful open-source tools: KNIME, RapidMiner, Weka, R, and Python. The contributors, all highly experienced with text mining and open-source software, explain how text data are gathered and processed from a wide variety of sources, including books, server access logs, websites, social media sites, and message boards. Each chapter presents a case study that you can follow as part of a step-by-step, reproducible example. You can also easily apply and extend the techniques to other problems. All the examples are available on a supplementary website. The book shows you how to exploit your text data, offering successful application examples and blueprints for you to tackle your text mining tasks and benefit from open and freely available tools. It gets you up to date on the latest and most powerful tools, the data mining process, and specific text mining activities.
Solutions Manual to accompany Statistical Data Analytics: Foundations for Data Mining, Informatics, and Knowledge Discovery A comprehensive introduction to statistical methods for data mining and knowledge discovery. Extensive solutions using actual data (with sample R programming code) are provided, illustrating diverse informatic sources in genomics, biomedicine, ecological remote sensing, astronomy, socioeconomics, marketing, advertising and finance, among many others.
This book brings all of the elements of data mining together in a single volume, saving the reader the time and expense of making multiple purchases. It consolidates both introductory and advanced topics, thereby covering the gamut of data mining and machine learning tactics, from data integration and pre-processing, to fundamental algorithms, to optimization techniques and web mining methodology.
This book presents high-quality research papers from the 2nd International Conference on Smart Data Intelligence (ICSMDI 2022), organized by Kongunadu College of Engineering and Technology at Trichy, Tamil Nadu, India, during April 2022. This book brings out the new advances and research results in the fields of algorithmic design, data analysis, and implementation on various real-time applications. It discusses many emerging related fields such as big data, data science, artificial intelligence, machine learning, and deep learning, which have driven a paradigm shift in data-driven approaches and tend to open new data-driven research opportunities in influential domains such as social networks, healthcare, information, and communication applications.
Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they're to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools. Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, ElasticSearch, d3.js, scikit-learn, and Apache Airflow. You'll learn an iterative approach that lets you quickly change the kind of analysis you're doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization.
* Build value from your data in a series of agile sprints, using the data-value pyramid
* Extract features for statistical models from a single dataset
* Visualize data with charts, and expose different aspects through interactive reports
* Use historical data to predict the future via classification and regression
* Translate predictions into actions
* Get feedback from users after each sprint to keep your project on track
After the start of the Syrian Civil War in 2011-12, increasing numbers of civilians sought refuge in neighboring countries. By May 2017, Turkey had received over 3 million refugees - the largest refugee population in the world. Some lived in government-run camps near the Syrian border, but many have moved to cities looking for work and better living conditions. They faced problems of integration, income, welfare, employment, health, education, language, social tension, and discrimination. In order to develop sound policies to solve these interlinked problems, a good understanding of refugee dynamics is necessary. This book summarizes the most important findings of the Data for Refugees (D4R) Challenge, which was a non-profit project initiated to improve the conditions of the Syrian refugees in Turkey by providing a database for the scientific community to enable research on urgent problems concerning refugees. The database, based on anonymized mobile call detail records (CDRs) of phone calls and SMS messages of one million Turk Telekom customers, indicates the broad activity and mobility patterns of refugees and citizens in Turkey for the period 1 January to 31 December 2017. Over 100 teams from around the globe applied to take part in the challenge, and 61 teams were granted access to the data. This book describes the challenge, and presents selected and revised project reports on the five major themes: unemployment, health, education, social integration, and safety, respectively. These are complemented by additional invited chapters describing related projects from international governmental organizations, technological infrastructure, as well as ethical aspects. The last chapter includes policy recommendations, based on the lessons learned. The book will serve as a guideline for creating innovative data-centered collaborations between industry, academia, government, and non-profit humanitarian agencies to deal with complex problems in refugee scenarios.
It illustrates the possibilities of big data analytics in coping with refugee crises and humanitarian responses, by showcasing innovative approaches drawing on multiple data sources, information visualization, pattern analysis, and statistical analysis. It will also provide researchers and students working with mobility data with excellent coverage across data science, economics, sociology, urban computing, education, migration studies, and more.
This book describes data analytics and data mining in the commercial world and how similar techniques (learner analytics and educational data mining) are starting to be applied in education. The book examines the challenges being encountered and the potential of such efforts for improving student outcomes and the productivity of K12 education systems. The goal is to help education policymakers and administrators understand how data mining and analytics work and how they can be applied within online learning systems to support education-related decision making.
This book is the proceedings of the 1st International Conference on Distributed Sensing and Intelligent Systems (ICDSIS2020), held at The National School of Applied Sciences of Agadir, Ibn Zohr University, Agadir, Morocco on February 01-03, 2020. ICDSIS2020 is co-organized by Computer Vision and Intelligent Systems Lab, University of North Texas, USA as a scientific collaboration event with The National School of Applied Sciences of Agadir, Ibn Zohr University. ICDSIS2020 aims to bring together students, researchers, academicians and industry persons in the field of Computer and Information Science, Intelligent Systems, and Electronics and Communication Engineering in general. The volume collects contributions from leading experts around the globe with the latest insights on emerging topics, and includes reviews, surveys, and research chapters covering all aspects of distributed sensing and intelligent systems. The volume is divided into 5 key sections: Distributed Sensing Applications; Intelligent Systems; Advanced theories and algorithms in machine learning and data mining; Artificial intelligence and optimization, and application to Internet of Things (IoT); and Cybersecurity and Secure Distributed Systems. This conference proceedings volume is an academic book which can be read by students, analysts, policymakers, and regulators interested in Distributed Sensing, Smart Network approaches, Smart Cities, IoT Applications, and Intelligent Applications. It is written in plain and easy language, and describes new concepts when they first appear so that a reader without prior background in the field will find it readable. The book is primarily intended for research students in sensor networks and IoT applications (including intelligent information systems, and smart sensors applications), academics in higher education institutions including universities and vocational colleges, policy makers and legislators.
This book includes high-quality research papers presented at the First International Conference on Human-Centric Smart Computing (ICHCSC 2022), organized by the University of Engineering and Management, Jaipur, India, on 27-29 April 2022. The topics covered in the book are human-centric computing, hyper connectivity, and data science. The book presents innovative work by leading academics, researchers, and experts from industry.
A new approach to unsupervised learning
Evolving technologies have brought about an explosion of information in recent years, but the question of how such information might be effectively harvested, archived, and analyzed remains a monumental challenge, for the processing of such information is often fraught with the need for conceptual interpretation: a relatively simple task for humans, yet an arduous one for computers. Inspired by the relative success of existing popular research on self-organizing neural networks for data clustering and feature extraction, Unsupervised Learning: A Dynamic Approach presents information within the family of generative, self-organizing maps, such as the self-organizing tree map (SOTM) and the more advanced self-organizing hierarchical variance map (SOHVM). It covers a series of pertinent, real-world applications with regard to the processing of multimedia data from its role in generic image processing techniques, such as the automated modeling and removal of impulse noise in digital images, to problems in digital asset management and its various roles in feature extraction, visual enhancement, segmentation, and analysis of microbiological image data. Self-organization concepts and applications discussed include:
* Distance metrics for unsupervised clustering
* Synaptic self-amplification and competition
* Image retrieval
* Impulse noise removal
* Microbiological image analysis
Unsupervised Learning: A Dynamic Approach introduces a new family of unsupervised algorithms that have a basis in self-organization, making it an invaluable resource for researchers, engineers, and scientists who want to create systems that effectively model oppressive volumes of data with little or no user intervention.
The book presents selected research papers on current developments in the field of soft computing and signal processing from the International Conference on Soft Computing and Signal Processing (ICSCSP 2018). It includes papers on current topics such as soft sets, rough sets, fuzzy logic, neural networks, genetic algorithms and machine learning, discussing various aspects of these topics, like technological, product implementation, contemporary research as well as application issues.
Discover how graph databases can help you manage and query highly connected data. With this practical book, you'll learn how to design and implement a graph database that brings the power of graphs to bear on a broad range of problem domains. Whether you want to speed up your response to user queries or build a database that can adapt as your business evolves, this book shows you how to apply the schema-free graph model to real-world problems. This second edition includes new code samples and diagrams, using the latest Neo4j syntax, as well as information on new functionality. Learn how different organizations are using graph databases to outperform their competitors. With this book's data modeling, query, and code examples, you'll quickly be able to implement your own solution.
* Model data with the Cypher query language and property graph model
* Learn best practices and common pitfalls when modeling with graphs
* Plan and implement a graph database solution in test-driven fashion
* Explore real-world examples to learn how and why organizations use a graph database
* Understand common patterns and components of graph database architecture
* Use analytical techniques and algorithms to mine graph database information
From the Foreword: "While large-scale machine learning and data mining have greatly impacted a range of commercial applications, their use in the field of Earth sciences is still in the early stages. This book, edited by Ashok Srivastava, Ramakrishna Nemani, and Karsten Steinhaeuser, serves as an outstanding resource for anyone interested in the opportunities and challenges for the machine learning community in analyzing these data sets to answer questions of urgent societal interest...I hope that this book will inspire more computer scientists to focus on environmental applications, and Earth scientists to seek collaborations with researchers in machine learning and data mining to advance the frontiers in Earth sciences." --Vipin Kumar, University of Minnesota
Large-Scale Machine Learning in the Earth Sciences provides researchers and practitioners with a broad overview of some of the key challenges in the intersection of Earth science, computer science, statistics, and related fields. It explores a wide range of topics and provides a compilation of recent research in the application of machine learning in the field of Earth Science. Making predictions based on observational data is a theme of the book, and the book includes chapters on the use of network science to understand and discover teleconnections in extreme climate and weather events, as well as using structured estimation in high dimensions. The use of ensemble machine learning models to combine predictions of global climate models using information from spatial and temporal patterns is also explored. The second part of the book features a discussion on statistical downscaling in climate with state-of-the-art scalable machine learning, as well as an overview of methods to understand and predict the proliferation of biological species due to changes in environmental conditions. The problem of using large-scale machine learning to study the formation of tornadoes is also explored in depth.
The last part of the book covers the use of deep learning algorithms to classify images that have very high resolution, as well as the unmixing of spectral signals in remote sensing images of land cover. The authors also apply long-tail distributions to geoscience resources, in the final chapter of the book.
Quantile regression is an approach to data that lack homogeneity, for example (1) data with outliers, (2) skewed data such as coronavirus deaths data, (3) data with non-constant variability, and (4) big data. In clinical research many examples can be given, such as circadian phenomena, and diseases whose spread may depend on subsets with frailty, low weight, low hygiene, and many forms of lack of healthiness. Stratified analysis is a laborious and rather explorative way of analyzing such data, but quantile analysis is a more fruitful, faster, and more complete alternative for the purpose. Considering all of this, we are on the verge of a revolution in data analysis. The current edition is the first textbook and tutorial on quantile regression for medical and healthcare students, and also serves as a recollection/update bench and help desk for professionals. Each chapter can be studied as a standalone and covers one of the many fields in the fast-growing world of quantile regression. Step-by-step analyses of over 20 data files stored at extras.springer.com are included for self-assessment. We should add that the authors are well qualified in their field. Professor Zwinderman is past-president of the International Society of Biostatistics (2012-2015) and Professor Cleophas is past-president of the American College of Angiology (2000-2002). From their expertise they should be able to make adequate selections of modern quantile regression methods for the benefit of physicians, students, and investigators.
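As a rough illustration of why quantile methods suit data with outliers, the sketch below is not taken from the book. It uses the standard pinball (quantile) loss on a small made-up sample and compares the constant prediction that minimizes that loss with the ordinary mean. The function names `pinball_loss` and `best_constant` and the sample values are hypothetical, chosen only for this example.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss at quantile level tau (0 < tau < 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def best_constant(y, tau):
    """Return the observed value that minimizes the pinball loss when used
    as a single constant prediction for every point in y."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), tau))


# Hypothetical sample: four typical values and one extreme outlier.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]

mean = sum(sample) / len(sample)          # pulled far upward by the outlier
median_fit = best_constant(sample, 0.5)   # tau = 0.5 targets the median
upper_fit = best_constant(sample, 0.75)   # tau = 0.75 targets the upper quartile

print(mean, median_fit, upper_fit)
```

In this toy sample the mean is dragged toward the outlier, while the constants chosen by the pinball loss stay among the typical values. A full quantile regression replaces the constant with a model of the covariates and minimizes the same loss.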
From data collection to evaluation and visualization of prediction results, this book provides a comprehensive overview of the process of predicting demand for retailers. Each step is illustrated with the relevant code and implementation details to demystify how historical data can be leveraged to predict future demand. The tools and methods presented can be applied to most retail settings, both online and brick-and-mortar, such as fashion, electronics, groceries, and furniture. This book is intended to help students in business analytics and data scientists better master how to leverage data for predicting demand in retail applications. It can also be used as a guide for supply chain practitioners who are interested in predicting demand. It enables readers to understand how to leverage data to predict future demand, how to clean and pre-process the data to make it suitable for predictive analytics, what the common caveats are in terms of implementation and how to assess prediction accuracy.
This book presents novel work of academicians, researchers, industry professionals, practitioners, and budding engineers to disseminate the most recent innovations, trends, and concerns along with the present-day challenges and the solving approaches for implementation in the domains of data science, intelligent computing, and computer networks and security. It is a collection of selected high-quality research papers from the International Conference on Data Science, Intelligent Computing and Cyber Security (ICDIC 2020) organized by Sree Vidyanikethan Engineering College, Tirupati, India, during 27-29 February 2020. It discusses the latest challenges and solutions in the field of data innovation, data management, data analysis, data security, and intelligent methods and applications.
This book introduces readers to ecological informatics as an emerging discipline that takes into account the data-intensive nature of ecology, the valuable information to be found in ecological data, and the need to communicate results and inform decisions, including those related to research, conservation and resource management. At its core, ecological informatics combines developments in information technology and ecological theory with applications that facilitate ecological research and the dissemination of results to scientists and the public. Its conceptual framework links ecological entities (genomes, organisms, populations, communities, ecosystems, landscapes) with data management, analysis and synthesis, and communicates new findings to inform decisions by following the course of a loop. In comparison to the 2nd edition published in 2006, the 3rd edition of Ecological Informatics has been completely restructured on the basis of the generic conceptual framework provided in Figure 1. It reflects the significant advances in data management, analysis and synthesis that have been made over the past 10 years, including new remote and in situ sensing techniques, the emergence of ecological and environmental observatories, novel evolutionary computations for knowledge discovery and forecasting, and new approaches to communicating results and informing decisions.
The increasing availability of data in our current, information overloaded society has led to the need for valid tools for its modelling and analysis. Data mining and applied statistical methods are the appropriate tools to extract knowledge from such data. This book provides an accessible introduction to data mining methods in a consistent and application oriented statistical framework, using case studies drawn from real industry projects and highlighting the use of data mining methods in a variety of business applications.
* Introduces data mining methods and applications.
* Covers classical and Bayesian multivariate statistical methodology as well as machine learning and computational data mining methods.
* Includes many recent developments such as association and sequence rules, graphical Markov models, lifetime value modelling, credit risk, operational risk and web mining.
* Features detailed case studies based on applied projects within industry.
* Incorporates discussion of data mining software, with case studies analysed using R.
* Is accessible to anyone with a basic knowledge of statistics or data analysis.
* Includes an extensive bibliography and pointers to further reading within the text.
"Applied Data Mining for Business and Industry, 2nd edition" is aimed at advanced undergraduate and graduate students of data mining, applied statistics, database management, computer science and economics. The case studies will provide guidance to professionals working in industry on projects involving large volumes of data, such as customer relationship management, web design, risk management, marketing, economics and finance.