This book describes in detail sampling techniques that can be used for unsupervised and supervised cases, with a focus on sampling techniques for machine learning algorithms. It covers theory and models of sampling methods for managing scalability and the "curse of dimensionality", their implementations, evaluations, and applications. A large part of the book is dedicated to databases comprising standard feature vectors, and a special section is reserved for the handling of more complex objects and dynamic scenarios. The book is ideal for anyone teaching or learning pattern recognition who is interested in the big data challenge. It provides an accessible introduction to the field and discusses the state of the art concerning sampling techniques for supervised and unsupervised tasks. Features: provides a comprehensive description of sampling techniques for unsupervised and supervised tasks; describes implementation and evaluation of algorithms that simultaneously manage scalable problems and the curse of dimensionality; addresses the role of sampling in dynamic scenarios, sampling when dealing with complex objects, and new challenges arising from big data. "This book represents a timely collection of state-of-the-art research of sampling techniques, suitable for anyone who wants to become more familiar with these helpful techniques for tackling the big data challenge." M. Emre Celebi, Ph.D., Professor and Chair, Department of Computer Science, University of Central Arkansas. "In science the difficulty is not to have ideas, but it is to make them work." From Carlo Rovelli
This book provides a review of advanced topics relating to the theory, research, analysis and implementation in the context of big data platforms and their applications, with a focus on methods, techniques, and performance evaluation. The explosive growth in the volume, speed, and variety of data being produced every day requires a continuous increase in the processing speeds of servers and of entire network infrastructures, as well as new resource management models. This poses significant challenges (and provides striking development opportunities) for data intensive and high-performance computing, i.e., how to efficiently turn extremely large datasets into valuable information and meaningful knowledge. The task of context data management is further complicated by the variety of sources such data derives from, resulting in different data formats, with varying storage, transformation, delivery, and archiving requirements. At the same time rapid responses are needed for real-time applications. With the emergence of cloud infrastructures, achieving highly scalable data management in such contexts is a critical problem, as the overall application performance is highly dependent on the properties of the data management service.
This book highlights an innovative approach for extracting terminological cores from subject domain-bounded collections of professional texts. The approach is based on exploiting the phenomenon of terminological saturation. The book presents the formal framework for the method of detecting and measuring terminological saturation as a successive approximation process. It further offers a suite of algorithms that implement the method in software and comprehensively evaluates all aspects of the method and possible input configurations in experiments on synthetic and real collections of texts in several subject domains. The book demonstrates the use of the developed method and software pipeline in industrial and academic use cases. It also outlines the potential benefits of adopting the method in industry.
In the fields of data mining and control, the huge amount of unstructured data and the presence of uncertainty in system descriptions have always been critical issues. The book Randomized Algorithms in Automatic Control and Data Mining introduces the readers to the fundamentals of randomized algorithm applications in data mining (especially clustering) and in automatic control synthesis. The methods proposed in this book guarantee that the computational complexity of classical algorithms and the conservativeness of standard robust control techniques will be reduced. It is shown that when a problem requires "brute force" in selecting among options, algorithms based on random selection of alternatives offer good results with certain probability for a restricted time and significantly reduce the volume of operations.
Introduction to the Theories and Varieties of Modern Crime in Financial Markets explores statistical methods and data mining techniques that, if used correctly, can help with crime detection and prevention. The three sections of the book present the methods, techniques, and approaches for recognizing, analyzing, and ultimately detecting and preventing financial frauds, especially complex and sophisticated crimes that characterize modern financial markets. The first two sections appeal to readers with technical backgrounds, describing data analysis and ways to manipulate markets and commit crimes. The third section gives life to the information through a series of interviews with bankers, regulators, lawyers, investigators, rogue traders, and others. The book is sharply focused on analyzing the origin of a crime from an economic perspective, showing Big Data in action, noting both the pros and cons of this approach.
This book explains the Linked Data domain by adopting a bottom-up approach: it introduces the fundamental Semantic Web technologies and building blocks, which are then combined into methodologies and end-to-end examples for publishing datasets as Linked Data, and use cases that harness scholarly information and sensor data. It presents how Linked Data is used for web-scale data integration, information management and search. Special emphasis is given to the publication of Linked Data from relational databases as well as from real-time sensor data streams. The authors also trace the transformation from the document-based World Wide Web into a Web of Data. Materializing the Web of Linked Data is addressed to researchers and professionals studying software technologies, tools and approaches that drive the Linked Data ecosystem, and the Web in general.
This text presents an overview of smart information systems for both the private and public sector, highlighting the research questions that can be studied by applying computational intelligence. The book demonstrates how to transform raw data into effective smart information services, covering the challenges and potential of this approach. Each chapter describes the algorithms, tools, measures and evaluations used to answer important questions. This is then further illustrated by a diverse selection of case studies reflecting genuine problems faced by SMEs, multinational manufacturers, service companies, and the public sector. Features: provides a state-of-the-art introduction to the field, integrating contributions from both academia and industry; reviews novel information aggregation services; discusses personalization and recommendation systems; examines sensor-based knowledge acquisition services, describing how the analysis of sensor data can be used to provide a clear picture of our world.
This book provides a general introduction to the most important geophysical exploration methods and their application to forensic sciences. It describes physical principles, campaign procedures and processing, as well as interpretation techniques, while also highlighting new acquisition and data analysis procedures. A large section of the book is devoted to applications, from measurements to the interpretation of data. Further, the book shows how to design and perform a forensic survey, and offers guidance on selecting the best method for the problem at hand, and on selecting the best type of data acquisition and processing. Written in straightforward language and chiefly intended as an introductory text for students in several scientific fields, the book also offers a useful guide for specialists who want to expand their expertise in this fascinating discipline.
Data Analysis in the Cloud introduces and discusses models, methods, techniques, and systems to analyze the large number of digital data sources available on the Internet using the computing and storage facilities of the cloud. Coverage includes scalable data mining and knowledge discovery techniques together with cloud computing concepts, models, and systems. Specific sections focus on map-reduce and NoSQL models. The book also includes techniques for conducting high-performance distributed analysis of large data on clouds. Finally, the book examines research trends such as Big Data pervasive computing, data-intensive exascale computing, and massive social network analysis.
Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis, 2nd Edition, describes clearly and simply how crime clusters and other intelligence can be used to deploy security resources most effectively. Rather than being reactive, security agencies can anticipate and prevent crime through the appropriate application of data mining and the use of standard computer programs. Data Mining and Predictive Analysis offers a clear, practical starting point for professionals who need to use data mining in homeland security, security analysis, and operational law enforcement settings. This revised text highlights new and emerging technology, discusses the importance of analytic context for ensuring successful implementation of advanced analytics in the operational setting, and covers new analytic service delivery models that increase ease of use and access to high-end technology and analytic capabilities. The use of predictive analytics in intelligence and security analysis enables the development of meaningful, information based tactics, strategy, and policy decisions in the operational public safety and security environment.
"Turn yourself into a Data Head. You'll become a more valuable employee and make your organization more successful." Thomas H. Davenport, Research Fellow, Author of Competing on Analytics, Big Data @ Work, and The AI Advantage. You've heard the hype around data--now get the facts. In Becoming a Data Head: How to Think, Speak, and Understand Data Science, Statistics, and Machine Learning, award-winning data scientists Alex Gutman and Jordan Goldmeier pull back the curtain on data science and give you the language and tools necessary to talk and think critically about it. You'll learn how to: think statistically and understand the role variation plays in your life and decision making; speak intelligently and ask the right questions about the statistics and results you encounter in the workplace; understand what's really going on with machine learning, text analytics, deep learning, and artificial intelligence; avoid common pitfalls when working with and interpreting data. Becoming a Data Head is a complete guide for data science in the workplace, covering everything from the personalities you'll work with to the math behind the algorithms. The authors have spent years in data trenches and sought to create a fun, approachable, and eminently readable book. Anyone can become a Data Head--an active participant in data science, statistics, and machine learning. Whether you're a business professional, engineer, executive, or aspiring data scientist, this book is for you.
This book covers key issues related to the Geospatial Semantic Web, including geospatial web services for spatial data interoperability; geospatial ontology for semantic interoperability; ontology creation, sharing, and integration; querying knowledge and information from heterogeneous data sources; interfaces for the Geospatial Semantic Web; VGI (Volunteered Geographic Information) and the Geospatial Semantic Web; challenges of the Geospatial Semantic Web; and development of Geospatial Semantic Web applications. This book also describes state-of-the-art technologies that attempt to solve these problems, such as WFS, WMS, RDF, OWL and GeoSPARQL, and demonstrates how to use Geospatial Semantic Web technologies to solve practical real-world problems such as spatial data interoperability.
This monograph will provide an in-depth mathematical treatment of modern multiple test procedures controlling the false discovery rate (FDR) and related error measures, particularly addressing applications to fields such as genetics, proteomics, neuroscience and general biology. The book will also include a detailed description of how to implement these methods in practice. Moreover, new developments focusing on non-standard assumptions are also included, especially multiple tests for discrete data. The book primarily addresses researchers and practitioners but will also be beneficial for graduate students.
This book offers a clear understanding of the concept of context-aware machine learning, including an automated rule-based framework within the broad area of data science and analytics, particularly with the aim of data-driven intelligent decision making. It presents a comprehensive study of this topic that explores multi-dimensional contexts in machine learning modeling, context discretization with time-series modeling, contextual rule discovery and predictive analytics, recent-pattern or rule-based behavior modeling, and their usefulness in various context-aware intelligent applications and services. The presented machine learning-based techniques can be employed in a wide range of real-world application areas, ranging from personalized mobile services to security intelligence, as highlighted in the book. As the interpretability of a rule-based system is high, the automation in discovering rules from contextual raw data can make this book more impactful for application developers as well as researchers. Overall, this book provides a good reference for both academics and industry practitioners in the broad area of data science, machine learning, AI-driven computing, human-centered computing and personalization, behavioral analytics, IoT and mobile applications, and cybersecurity intelligence.
Many important planning decisions in society and business depend on proper knowledge and a correct understanding of movement, be it in transportation, logistics, biology, or the life sciences. Today the widespread use of mobile phones and technologies like GPS and RFID provides an immense amount of data on location and movement. What is needed are new methods of visualization and algorithmic data analysis that are tightly integrated and complement each other to allow end-users and analysts to extract useful knowledge from these extremely large data volumes. This is exactly the topic of this book. As the authors show, modern visual analytics techniques are ready to tackle the enormous challenges brought about by movement data, and the technology and software needed to exploit them are available today. The authors start by illustrating the different kinds of data available to describe movement, from individual trajectories of single objects to multiple trajectories of many objects, and then proceed to detail a conceptual framework, which provides the basis for a fundamental understanding of movement data. With this basis, they move on to more practical and technical aspects, focusing on how to transform movement data to make it more useful, and on the infrastructure necessary for performing visual analytics in practice. In so doing they demonstrate that visual analytics of movement data can yield exciting insights into the behavior of moving persons and objects, but can also lead to an understanding of the events that transpire when things move. Throughout the book, they use sample applications from various domains and illustrate the examples with graphical depictions of both the interactive displays and the analysis results. In summary, readers will benefit from this detailed description of the state of the art in visual analytics in various ways. Researchers will appreciate the scientific precision involved, software technologists will find essential information on algorithms and systems, and practitioners will profit from readily accessible examples with detailed illustrations for practical purposes.
Data Mining for Bioinformatics Applications provides valuable information on the data mining methods that have been widely used for solving real bioinformatics problems, including problem definition, data collection, data preprocessing, modeling, and validation. The text uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems, containing 45 bioinformatics problems that have been investigated in recent research. For each example, the entire data mining process is described, ranging from data preprocessing to modeling and result validation.
This textbook brings together both new and traditional research methods in Human Computer Interaction (HCI). Research methods include interviews and observations, ethnography, grounded theory and analysis of digital traces of behavior. Readers will gain an understanding of the type of knowledge each method provides, its disciplinary roots and how each contributes to understanding users, user behavior and the context of use. The background context, clear explanations and sample exercises make this an ideal textbook for graduate students, as well as a valuable reference for researchers and practitioners. 'It is an impressive collection in terms of the level of detail and variety.' (M. Sasikumar, ACM Computing Reviews #CR144066)
This book provides comprehensive coverage of neural networks, their evolution, their structure, the problems they can solve, and their applications. The first half of the book looks at theoretical investigations on artificial neural networks and addresses the key architectures that are capable of implementation in various application scenarios. The second half is designed specifically for the production of solutions using artificial neural networks to solve practical problems arising from different areas of knowledge. It also describes the various implementation details that were taken into account to achieve the reported results. These aspects contribute to the maturation and improvement of experimental techniques to specify the neural network architecture that is most appropriate for a particular application scope. The book is appropriate for students in graduate and upper undergraduate courses in addition to researchers and professionals.
This book presents innovative research demonstrating the potential and advancement of computing approaches that use healthcare-centric and medical datasets to solve complex healthcare problems. Computing techniques are among the key technologies currently used to perform medical diagnostics in the healthcare domain, thanks to the abundance of medical data being generated and collected. Nowadays, medical data is available in many different forms, such as MRI images, CT scan images, EHR data, test reports, histopathological data and doctor-patient conversation data. This opens up huge opportunities for the application of computing techniques to derive data-driven models that can be of very high utility in providing effective treatment to patients. Moreover, machine learning algorithms can uncover hidden patterns and relationships in medical datasets that are too complex to uncover without a data-driven approach. With the help of computing systems, it is now possible for researchers to predict an accurate medical diagnosis for new patients using models built from previous patient data. Apart from automatic diagnostic tasks, computing techniques have also been applied in the process of drug discovery, saving considerable time and money. Utilization of genomic data using various computing techniques is another emerging area, which may in fact be the key to fulfilling the dream of personalized medications. Medical prognostics is another area in which machine learning has recently shown great promise: automatic prognostic models are being built that can predict the progress of a disease and suggest potential treatment paths to get ahead of the disease progression.
This book introduces basic computing skills designed for industry professionals without a strong computer science background. Written in an easily accessible manner, and accompanied by a user-friendly website, it serves as a self-study guide to survey data science and data engineering for those who aspire to start a computing career, or expand on their current roles, in areas such as applied statistics, big data, machine learning, data mining, and informatics. The authors draw from their combined experience working at software and social network companies, on big data products at several major online retailers, as well as their experience building big data systems for an AI startup. Spanning from the basic inner workings of a computer to advanced data manipulation techniques, this book opens doors for readers to quickly explore and enhance their computing knowledge. Computing with Data comprises a wide range of computational topics essential for data scientists, analysts, and engineers, providing them with the necessary tools to be successful in any role that involves computing with data. The introduction is self-contained, and chapters progress from basic hardware concepts to operating systems, programming languages, graphing and processing data, testing and programming tools, big data frameworks, and cloud computing. The book is fashioned with several audiences in mind. Readers without a strong educational background in CS--or those who need a refresher--will find the chapters on hardware, operating systems, and programming languages particularly useful. Readers with a strong educational background in CS, but without significant industry background, will find the following chapters especially beneficial: learning R, testing, programming, visualizing and processing data in Python and R, system design for big data, data stores, and software craftsmanship.
Easy-to-follow step-by-step concepts and methods. Every chapter is introduced in a very gentle and intuitive way so students can understand the WHYs, WHAT-IFs, WHAT-IS-THIS-FORs, HOWs, etc by themselves. Practical programming exercises in Python for each chapter. Includes theory and practice for every chapter, summaries, practical coding exercises for target problems, QA, and a companion website with the sample code and data.
This book addresses the usefulness of knowledge discovery through data mining. With this aim, contributors from different fields propose concrete problems and applications showing how data mining and discovering embedded knowledge from raw data can be beneficial to social organizations, domestic spheres, and ICT markets. Data mining or knowledge discovery in databases (KDD) has received increasing interest due to its focus on transforming large amounts of data into novel, valid, useful, and structured knowledge by detecting concealed patterns and relationships. The concept of knowledge is broad and speculative and has promoted epistemological debates in western philosophies. The intensified interest in knowledge management and data mining stems from the difficulty in identifying computational models able to approximate human behaviors and abilities in resolving organizational, social, and physical problems. Current ICT interfaces are not yet adequately advanced to support and simulate the abilities of physicians, teachers, assistants or housekeepers in domestic spheres. And unlike in industrial contexts where abilities are routinely applied, the domestic world is continuously changing and unpredictable. There are challenging questions in this field: Can knowledge locked in conventions, rules of conduct, common sense, ethics, emotions, laws, cultures, and experiences be mined from data? Is it acceptable for automatic systems displaying emotional behaviors to govern complex interactions based solely on the mining of large volumes of data? Discussing multidisciplinary themes, the book proposes computational models able to approximate, to a certain degree, human behaviors and abilities in resolving organizational, social, and physical problems. The innovations presented are of primary importance for: a. the academic research community; b. the ICT market; c. Ph.D. students and early stage researchers; d. schools, hospitals, rehabilitation and assisted-living centers; e. representatives from multimedia industries and standardization bodies.
R is a powerful and free software system for data analysis and graphics, with over 5,000 add-on packages available. This book introduces R using SAS and SPSS terms with which you are already familiar. It demonstrates which of the add-on packages are most like SAS and SPSS and compares them to R's built-in functions. It steps through over 30 programs written in all three packages, comparing and contrasting the packages' differing approaches. The programs and practice datasets are available for download. The glossary defines over 50 R terms using SAS/SPSS jargon and again using R jargon. The table of contents and the index allow you to find equivalent R functions by looking up both SAS statements and SPSS commands. When finished, you will be able to import data, manage and transform it, create publication quality graphics, and perform basic statistical analyses. This new edition has updated programming, an expanded index, and even more statistical methods covered in over 25 new sections.