This volume features selected, refereed papers on various aspects of statistics, matrix theory and its applications to statistics, as well as related numerical linear algebra topics and numerical solution methods, which are relevant for problems arising in statistics and in big data. The contributions were originally presented at the 25th International Workshop on Matrices and Statistics (IWMS 2016), held in Funchal (Madeira), Portugal on June 6-9, 2016. The IWMS workshop series brings together statisticians, computer scientists, data scientists and mathematicians, helping them better understand each other's tools, and fostering new collaborations at the interface of matrix theory and statistics.
This book collects articles written by Big Data experts that describe cutting-edge methods and applications from their respective areas of interest, and it provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cover technical aspects of key areas that generate and use Big Data, such as management and finance; medicine and healthcare; genome, cytome and microbiome; graphs and networks; Internet of Things; Big Data standards; benchmarking of systems; and others. In addition to different applications, key algorithmic approaches such as graph partitioning, clustering and finite mixture modelling of high-dimensional data are also covered. The varied collection of themes in this volume introduces the reader to the richness of the emerging field of Big Data Analytics.
In discrete choice models the relationships between the independent variables and the choice probabilities are nonlinear, depending on both the value of the particular independent variable being interpreted and the values of the other independent variables. Thus, interpreting the magnitude of the effects (the "substantive effects") of the independent variables on choice behavior requires the use of additional interpretative techniques. Three common techniques for interpretation are described here: first differences, marginal effects and elasticities, and odds ratios. Concepts related to these techniques are also discussed, as well as methods to account for estimation uncertainty. Interpretation of binary logits, ordered logits, multinomial and conditional logits, and mixed discrete choice models such as mixed multinomial logits and random effects logits for panel data are covered in detail. The techniques discussed here are general, and can be applied to other models with discrete dependent variables which are not specifically described here.
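As a rough illustration of the three interpretation techniques named in the blurb above, here is a minimal sketch for a binary logit with two independent variables. The coefficient values and function names are hypothetical and chosen only for illustration; they are not taken from the book, and the sketch does not cover estimation uncertainty.

```python
import math


def logistic(z: float) -> float:
    """Logistic function: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def choice_probability(b0, b1, b2, x1, x2):
    """P(y = 1 | x1, x2) in a binary logit with two independent variables."""
    return logistic(b0 + b1 * x1 + b2 * x2)


def first_difference(b0, b1, b2, x1_from, x1_to, x2):
    """Change in probability when x1 moves from x1_from to x1_to, holding x2 fixed."""
    return (choice_probability(b0, b1, b2, x1_to, x2)
            - choice_probability(b0, b1, b2, x1_from, x2))


def marginal_effect(b0, b1, b2, x1, x2):
    """Derivative of the probability with respect to x1: b1 * p * (1 - p)."""
    p = choice_probability(b0, b1, b2, x1, x2)
    return b1 * p * (1.0 - p)


def odds_ratio(b1):
    """Multiplicative change in the odds for a one-unit increase in x1."""
    return math.exp(b1)


if __name__ == "__main__":
    # Hypothetical coefficients, assumed purely for illustration.
    B0, B1, B2 = -1.0, 0.8, 0.5
    for x2 in (0.0, 2.0):
        print(
            f"x2={x2}: "
            f"first difference (x1: 0 -> 1) = {first_difference(B0, B1, B2, 0.0, 1.0, x2):.4f}, "
            f"marginal effect at x1=0 = {marginal_effect(B0, B1, B2, 0.0, x2):.4f}"
        )
    print(f"odds ratio for x1 = {odds_ratio(B1):.4f}")
```

In this sketch the first difference and the marginal effect of `x1` change with the value of `x2`, which is the nonlinearity the blurb describes, while the odds ratio depends only on the coefficient.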
This book constitutes selected, revised and extended papers from the 13th International Conference on Computer Supported Education, CSEDU 2021, held as a virtual event in April 2021. The 27 revised full papers were carefully reviewed and selected from 143 submissions. They were organized in topical sections as follows: artificial intelligence in education; information technologies supporting learning; learning/teaching methodologies and assessment; social context and learning environments; ubiquitous learning; current topics.
This is the first study of Boko Haram that applies advanced, data-driven machine learning both to learn models capable of predicting a wide range of attacks carried out by Boko Haram and to develop data-driven policies to shape Boko Haram's behavior and reduce its attacks. This book also identifies conditions that predict sexual violence, suicide bombings and attempted bombings, abduction, arson, looting, and targeting of government officials and security installations. After reducing Boko Haram's history to a spreadsheet containing monthly information about different types of attacks and different circumstances prevailing over a 9-year period, this book introduces Temporal Probabilistic (TP) rules that can be automatically learned from data and are easy to explain to policy makers and security experts. This book additionally reports on over 1 year of forecasts made using the model in order to validate predictive accuracy. It also introduces a policy computation method to rein in Boko Haram's attacks. Applied machine learning researchers, machine learning experts and predictive modeling experts agree that this book is a valuable learning asset. Counter-terrorism experts, national and international security experts, public policy experts and Africa experts will also agree that this book is a valuable learning tool.
This book presents recent achievements in the processing of representative user generated content (UGC) on E-commerce websites. This large volume of UGC is a valuable source of information for data mining and supports customer/object profiling. The book provides a comprehensive overview of the concept of customer credibility, object-oriented review summarization technology and content-based collaborative filtering algorithms. It covers a feedback mechanism designed to discover customer credibility, which is used to define the professional degree of review content; product-oriented review summarization for restaurants or trip arrangements; and content-based collaborative filtering for product recommendation.
Group method of data handling (GMDH) is a typical inductive modeling method built on the principles of self-organization. Since its introduction, inductive modelling has been developed to support complex systems in prediction, clustering, system identification, as well as data mining and knowledge extraction technologies in social science, science, engineering, and medicine. This is the first book to explore GMDH using MATLAB (matrix laboratory) language. Readers will learn how to implement GMDH in MATLAB as a method of dealing with big data analytics. Error-free source codes in MATLAB have been included in supplementary material (accessible online) to assist users in their understanding of GMDH and to make it easy for users to further develop variations of GMDH algorithms.
While laboratory research is the backbone of collecting experimental data in cognitive science, a rapidly increasing amount of research is now capitalizing on large-scale and real-world digital data. Each piece of data is a trace of human behavior and offers us a potential clue to understanding basic cognitive principles. However, we have to be able to put the pieces together in a reasonable way, which necessitates both advances in our theoretical models and development of new methodological techniques. The primary goal of this volume is to present cutting-edge examples of mining large-scale and naturalistic data to discover important principles of cognition and evaluate theories that would not be possible without such a scale. This book also has a mission to stimulate cognitive scientists to consider new ways to harness big data in order to enhance our understanding of fundamental cognitive processes. Finally, this book aims to warn of the potential pitfalls of using, or being over-reliant on, big data and to show how big data can work alongside traditional, rigorously gathered experimental data rather than simply supersede it. In sum, this groundbreaking volume presents cognitive scientists and those in related fields with an exciting, detailed, stimulating, and realistic introduction to big data, and shows how it may greatly advance our understanding of the principles of human memory, perception, categorization, decision-making, language, problem-solving, and representation.
Big Data in medical science - what exactly is that? What is its potential for healthcare management? Where is Big Data at the moment? Which risk factors need to be kept in mind? What is hype and what is real potential? This book provides an impression of the new possibilities of networked data analysis and "Big Data" - for and within medical science and healthcare management. Big Data is about the collection, storage, search, distribution, statistical analysis and visualization of large amounts of data. This is especially relevant in healthcare management, as the amount of digital information is growing exponentially. An amount of data corresponding to 12 million novels emerges during the time of a single hospital stay. These are dimensions that cannot be dealt with without information technology. What can we do with the data that are available today? What will be possible in the next few years? Do we want everything that is possible? Who protects the data from misuse? More importantly, who protects the data from NOT being used? Big Data is the "resource of the 21st century" and might change the world of medical science more than we understand, realize and want at the moment. The core competence of Big Data will be the complete and correct collection, evaluation and interpretation of data. This also makes it possible to estimate the framework conditions and possibilities for automating daily (medical) routine. Can Big Data in medical science help to better understand fundamental problems of health and illness, and draw consequences accordingly? Big Data also means overcoming sector boundaries in healthcare management. What distinguishes Big Data analysis is the new quality of results obtained by combining data that were previously unrelated. That is why the editor of the book gives a voice to 30 experts working in a variety of fields, such as in hospitals, in health insurance or as medical practitioners.
The authors show potentials, risks, concrete practical examples, future scenarios, and come up with possible answers for the field of information technology and data privacy.
This edited collection discusses the emerging topics in statistical modeling for biomedical research. Leading experts in the frontiers of biostatistics and biomedical research discuss the statistical procedures, useful methods, and their novel applications in biostatistics research. Interdisciplinary in scope, the volume as a whole reflects the latest advances in statistical modeling in biomedical research, identifies impactful new directions, and seeks to drive the field forward. It also fosters the interaction of scholars in the arena, offering great opportunities to stimulate further collaborations. This book will appeal to industry data scientists and statisticians, researchers, and graduate students in biostatistics and biomedical science. It covers topics in: next generation sequence data analysis; deep learning, precision medicine, and their applications; large scale data analysis and its applications; biomedical research and modeling; and survival analysis with complex data structure and its applications.
This book discusses the challenges facing current research in knowledge discovery and data mining posed by the huge volumes of complex data now gathered in various real-world applications (e.g., business process monitoring, cybersecurity, medicine, language processing, and remote sensing). The book consists of 14 chapters covering the latest research by the authors and the research centers they represent. It illustrates techniques and algorithms that have recently been developed to preserve the richness of the data and allow us to efficiently and effectively identify the complex information it contains. Presenting the latest developments in complex pattern mining, this book is a valuable reference resource for data science researchers and professionals in academia and industry.
What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
Recent advances in high-throughput technologies have resulted in a deluge of biological information. Yet the storage, analysis, and interpretation of such multifaceted data require effective and efficient computational tools. This unique text/reference addresses the need for a unified framework describing how soft computing and machine learning techniques can be judiciously formulated and used in building efficient pattern recognition models. The book reviews both established and cutting-edge research, following a clear structure reflecting the major phases of a pattern recognition system: classification, feature selection, and clustering. The text provides a careful balance of theory, algorithms, and applications, with a particular emphasis given to applications in computational biology and bioinformatics. Topics and features: reviews the development of scalable pattern recognition algorithms for computational biology and bioinformatics; integrates different soft computing and machine learning methodologies with pattern recognition tasks; discusses in detail the integration of different techniques for handling uncertainties in decision-making and efficiently mining large biological datasets; presents a particular emphasis on real-life applications, such as microarray expression datasets and magnetic resonance images; includes numerous examples and experimental results to support the theoretical concepts described; concludes each chapter with directions for future research and a comprehensive bibliography. This important work will be of great use to graduate students and researchers in the fields of computer science, electrical and biomedical engineering. Researchers and practitioners involved in pattern recognition, machine learning, computational biology and bioinformatics, data mining, and soft computing will also find the book invaluable.
Focused on the mathematical foundations of social media analysis, Graph-Based Social Media Analysis provides a comprehensive introduction to the use of graph analysis in the study of social and digital media. It addresses an important scientific and technological challenge, namely the confluence of graph analysis and network theory with linear algebra, digital media, machine learning, big data analysis, and signal processing. Supplying an overview of graph-based social media analysis, the book provides readers with a clear understanding of social media structure. It uses graph theory, particularly the algebraic description and analysis of graphs, in social media studies. The book emphasizes the big data aspects of social and digital media. It presents various approaches to storing vast amounts of data online and retrieving that data in real-time. It demystifies complex social media phenomena, such as information diffusion, marketing and recommendation systems in social media, and evolving systems. It also covers emerging trends, such as big data analysis and social media evolution. Describing how to conduct proper analysis of the social and digital media markets, the book provides insights into processing, storing, and visualizing big social media data and social graphs. It includes coverage of graphs in social and digital media, graph and hyper-graph fundamentals, mathematical foundations coming from linear algebra, algebraic graph analysis, graph clustering, community detection, graph matching, web search based on ranking, label propagation and diffusion in social media, graph-based pattern recognition and machine learning, graph-based pattern classification and dimensionality reduction, and much more. This book is an ideal reference for scientists and engineers working in social media and digital media production and distribution. It is also suitable for use as a textbook in undergraduate or graduate courses on digital media, social media, or social networks.
The growth of machines and users of the Internet has led to the proliferation of all sorts of data concerning individuals, institutions, companies, governments, universities, and all kinds of known objects and events happening everywhere in daily life. Scientific knowledge is not an exception to the data boom. The phenomenon of data growth in science pushes forth as the number of scientific papers published doubles every 9-15 years, and the need for methods and tools to understand what is reported in scientific literature becomes evident. As the number of academicians and innovators swells, so does the number of publications of all types, yielding outlets of documents and depots of authors and institutions that need to be found in Bibliometric databases. These databases are dug into and treated to hand over metrics of research performance by means of Scientometrics that analyze the toil of individuals, institutions, journals, countries, and even regions of the world. The objective of this book is to help students, professors, university managers, government, industry, and stakeholders in general understand which Bibliometric databases are the main ones, what the key research indicators are, and who the main players in university rankings are, along with the methodologies and approaches that they employ in producing ranking tables. The book is divided into two sections. The first looks at Scientometric databases, including Scopus and Google Scholar as well as institutional repositories. The second section examines the application of Scientometrics to world-class universities and the role that Scientometrics can play in competition among them. It looks at university rankings and the methodologies used to create these rankings. Individual chapters examine specific rankings, including QS World University, Scimago Institutions, Webometrics, U-Multirank, and U.S. News & World Report. The book concludes with a discussion of university performance in the age of research analytics.
Informed decision making rests on three important elements: intuition, trust, and analytics. Intuition is based on experiential learning, and recent research has shown that those who rely on their "gut feelings" may do better than those who don't. Analytics, however, are also important for informing decision making in a data-driven environment. The third element, trust, is critical for knowledge sharing to take place. These three elements (intuition, analytics, and trust) make a perfect combination for decision making. This book gathers leading researchers who explore the role of these three elements in the process of decision-making.
The Art and Science of Analyzing Software Data provides valuable information on analysis techniques often used to derive insight from software data. This book shares best practices in the field generated by leading data scientists, collected from their experience training software engineering students and practitioners to master data science. The book covers topics such as the analysis of security data, code reviews, app stores, log files, and user telemetry, among others. It covers a wide variety of techniques such as co-change analysis, text analysis, topic analysis, and concept analysis, as well as advanced topics such as release planning and generation of source code comments. It includes stories from the trenches from expert data scientists illustrating how to apply data analysis in industry and open source, present results to stakeholders, and drive decisions.
The book proposes new technologies and discusses future solutions for the design of ICT infrastructure. The book contains high-quality submissions presented at the Second International Conference on Information and Communication Technology for Sustainable Development (ICT4SD - 2016), held in Goa, India during 1 - 2 July, 2016. The conference stimulated cutting-edge research discussions among pioneering academic researchers, scientists, industrial engineers, and students from all around the world. The topics covered in this book also focus on innovative issues at the international level by bringing together experts from different countries.
Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining, the science of exploring large and complex bodies of data in order to discover useful patterns. Decision tree learning continues to evolve over time. Existing methods are constantly being improved and new methods introduced. This 2nd Edition is dedicated entirely to the field of decision trees in data mining; it covers all aspects of this important technique, as well as improved or new methods and techniques developed after the publication of our first edition. In this new edition, all chapters have been revised and new topics brought in. New topics include Cost-Sensitive Active Learning, Learning with Uncertain and Imbalanced Data, Using Decision Trees beyond Classification Tasks, Privacy Preserving Decision Tree Learning, Lessons Learned from Comparative Studies, and Learning Decision Trees for Big Data. A walk-through guide to existing open-source data mining software is also included in this edition. This book invites readers to explore the many benefits in data mining that decision trees offer:
The book presents the results of studies on selected problems (such as predictive model of transcription initiation and termination, protein recognition codes, protein structure prediction, feature selection for disease prediction, information retrieval from medical imaging) of Bioinformatics and Information Retrieval. Information Retrieval is one of the contemporary answers to new challenges in threat evaluation of composite systems. This book provides a practical course in computational data analysis suitable for students or researchers with no previous exposure to computer programming. It describes in detail the theoretical basis for statistical analysis techniques used throughout the textbook, from basic principles. It presents walk-throughs of data analysis tasks using different tools to help in taking decisions in healthcare management.
Biologists are stepping up their efforts to understand the biological processes that underlie disease pathways in clinical contexts. This has resulted in a flood of biological and clinical data, ranging from genomic and protein sequences, DNA microarrays, protein interactions, and biomedical images to disease pathways and electronic health records. To exploit these data for discovering new knowledge that can be translated into clinical applications, there are fundamental data analysis difficulties that have to be overcome. Practical issues such as handling noisy and incomplete data, processing compute-intensive tasks, and integrating various data sources are new challenges faced by biologists in the post-genome era. This book will cover the fundamentals of state-of-the-art data mining techniques which have been designed to handle such challenging data analysis problems, and demonstrate with real applications how biologists and clinical scientists can employ data mining to enable them to make meaningful observations and discoveries from a wide array of heterogeneous data from molecular biology to pharmaceutical and clinical domains.
Originating from Facebook, LinkedIn, Twitter, Instagram, YouTube, and many other networking sites, the social media shared by users and the associated metadata are collectively known as user generated content (UGC). To analyze UGC and glean insight about user behavior, robust techniques are needed to tackle the huge amount of real-time, multimedia, and multilingual data. Researchers must also know how to assess the social aspects of UGC, such as user relations and influential users. Mining User Generated Content is the first focused effort to compile state-of-the-art research and address future directions of UGC. It explains how to collect, index, and analyze UGC to uncover social trends and user habits. Divided into four parts, the book focuses on the mining and applications of UGC. The first part presents an introduction to this new and exciting topic. Covering the mining of UGC of different medium types, the second part discusses the social annotation of UGC, social network graph construction and community mining, mining of UGC to assist in music retrieval, and the popular but difficult topic of UGC sentiment analysis. The third part describes the mining and searching of various types of UGC, including knowledge extraction, search techniques for UGC content, and a specific study on the analysis and annotation of Japanese blogs. The fourth part on applications explores the use of UGC to support question-answering, information summarization, and recommendations.
The series, Contemporary Perspectives on Data Mining, is composed of blind refereed scholarly research methods and applications of data mining. This series will be targeted both at the academic community and at the business practitioner. Data mining seeks to discover knowledge from vast amounts of data with the use of statistical and mathematical techniques. The knowledge is extracted from this data by examining the patterns of the data, whether they be associations of groups or things, predictions, sequential relationships between time-ordered events or natural groups. Data mining applications are seen in finance (banking, brokerage, insurance), marketing (customer relationships, retailing, logistics, travel), as well as in manufacturing, health care, fraud detection, homeland security, and law enforcement.
Learn How to Properly Use the Latest Analytics Approaches in Your Organization. Computational Business Analytics presents tools and techniques for descriptive, predictive, and prescriptive analytics applicable across multiple domains. Through many examples and challenging case studies from a variety of fields, practitioners easily see the connections to their own problems and can then formulate their own solution strategies. The book first covers core descriptive and inferential statistics for analytics. The author then enhances numerical statistical techniques with symbolic artificial intelligence (AI) and machine learning (ML) techniques for richer predictive and prescriptive analytics. With a special emphasis on methods that handle time and textual data, the text: enriches principal component and factor analyses with subspace methods, such as latent semantic analyses; combines regression analyses with probabilistic graphical modeling, such as Bayesian networks; extends autoregression and survival analysis techniques with the Kalman filter, hidden Markov models, and dynamic Bayesian networks; embeds decision trees within influence diagrams; and augments nearest-neighbor and k-means clustering techniques with support vector machines and neural networks. These approaches are not replacements of traditional statistics-based analytics; rather, in most cases, a generalized technique can be reduced to the underlying traditional base technique under very restrictive conditions. The book shows how these enriched techniques offer efficient solutions in areas including customer segmentation, churn prediction, credit risk assessment, fraud detection, and advertising campaigns.