![]() |
Welcome to Loot.co.za!
Sign in / Register |Wishlists & Gift Vouchers |Help | Advanced search
|
Your cart is empty |
||
|
Books > Computing & IT > Applications of computing > Databases > Data mining
Web mining is the application of data mining strategies to excerpt learning from web information, i.e. web content, web structure, and web usage data. With the emergence of the web as the predominant and converging platform for communication, business and scholastic information dissemination, especially in the last five years, there are ever increasing research groups working on different aspects of web mining mainly in three directions. These are: mining of web content, web structure and web usage. In this context there are good number of frameworks and benchmarks related to the metrics of the websites which is certainly weighty for B2B, B2C and in general in any e-commerce paradigm. Owing to the popularity of this topic there are few books in the market, dealing more on such performance metrics and other related issues. This book, however, omits all such routine topics and lays more emphasis on the classification and clustering aspects of the websites in order to come out with the true perception of the websites in light of its usability. In nutshell, Web Mining: A Synergic Approach Resorting to Classifications and Clustering showcases an effective methodology for classification and clustering of web sites from their usability point of view. While the clustering and classification is accomplished by using an open source tool WEKA, the basic dataset for the selected websites has been emanated by using a free tool site-analyzer. As a case study, several commercial websites have been analyzed. The dataset preparation using site-analyzer and classification through WEKA by embedding different algorithms is one of the unique selling points of this book. This text projects a complete spectrum of web mining from its very inception through data mining and takes the reader up to the application level. Salient features of the book include: - Literature review of research work in the area of web mining - Business websites domain researched, and data collected using site-analyzer tool - Accessibility, design, text, multimedia, and networking are assessed - Datasets are filtered further by selecting vital attributes which are Search Engine Optimized for processing using the Weka attributed tool - Dataset with labels have been classified using J48, RBFNetwork, NaiveBayes, and SMO techniques using Weka - A comparative analysis of all classifiers is reported - Commercial applications for improving website performance based on SEO is given
In discrete choice models the relationships between the independent variables and the choice probabilities are nonlinear, depending on both the value of the particular independent variable being interpreted and the values of the other independent variables. Thus, interpreting the magnitude of the effects (the "substantive effects") of the independent variables on choice behavior requires the use of additional interpretative techniques. Three common techniques for interpretation are described here: first differences, marginal effects and elasticities, and odds ratios. Concepts related to these techniques are also discussed, as well as methods to account for estimation uncertainty. Interpretation of binary logits, ordered logits, multinomial and conditional logits, and mixed discrete choice models such as mixed multinomial logits and random effects logits for panel data are covered in detail. The techniques discussed here are general, and can be applied to other models with discrete dependent variables which are not specifically described here.
This book presents the recent achievements on the processing of representative user generated content (UGC) on E-commerce websites. This large size of UGC is valuable information for data mining to help customer/object profiling. It provides a comprehensive overview on the concept of customer credibility, object-oriented review summarization technology and content-based collaborative filtering algorithm. It covers a feedback mechanism which is designed to discover customer credibility, which is used to define the professional degree of review content; product-oriented review summarization for restaurants or trip arrangements, and introduced content-based collaborative filtering for product recommendation.
This book presents the recent achievements on the processing of representative user generated content (UGC) on E-commerce websites. This large size of UGC is valuable information for data mining to help customer/object profiling. It provides a comprehensive overview on the concept of customer credibility, object-oriented review summarization technology and content-based collaborative filtering algorithm. It covers a feedback mechanism which is designed to discover customer credibility, which is used to define the professional degree of review content; product-oriented review summarization for restaurants or trip arrangements, and introduced content-based collaborative filtering for product recommendation.
Group method of data handling (GMDH) is a typical inductive modeling method built on the principles of self-organization. Since its introduction, inductive modelling has been developed to support complex systems in prediction, clusterization, system identification, as well as data mining and knowledge extraction technologies in social science, science, engineering, and medicine.This is the first book to explore GMDH using MATLAB (matrix laboratory) language. Readers will learn how to implement GMDH in MATLAB as a method of dealing with big data analytics. Error-free source codes in MATLAB have been included in supplementary material (accessible online) to assist users in their understanding in GMDH and to make it easy for users to further develop variations of GMDH algorithms.
This is the first study of Boko Haram that brings advanced data-driven, machine learning models to both learn models capable of predicting a wide range of attacks carried out by Boko Haram, as well as develop data-driven policies to shape Boko Haram's behavior and reduce attacks by them. This book also identifies conditions that predict sexual violence, suicide bombings and attempted bombings, abduction, arson, looting, and targeting of government officials and security installations. After reducing Boko Haram's history to a spreadsheet containing monthly information about different types of attacks and different circumstances prevailing over a 9 year period, this book introduces Temporal Probabilistic (TP) rules that can be automatically learned from data and are easy to explain to policy makers and security experts. This book additionally reports on over 1 year of forecasts made using the model in order to validate predictive accuracy. It also introduces a policy computation method to rein in Boko Haram's attacks. Applied machine learning researchers, machine learning experts and predictive modeling experts agree that this book is a valuable learning asset. Counter-terrorism experts, national and international security experts, public policy experts and Africa experts will also agree this book is a valuable learning tool.
This book constitutes selected, revised and extended papers from the 13th International Conference on Computer Supported Education, CSEDU 2021, held as a virtual event in April 2021. The 27 revised full papers were carefully reviewed and selected from 143 submissions. They were organized in topical sections as follows: artificial intelligence in education; information technologies supporting learning; learning/teaching methodologies and assessment; social context and learning environments; ubiquitous learning; current topics.
This book has a collection of articles written by Big Data experts to describe some of the cutting-edge methods and applications from their respective areas of interest, and provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cover technical aspects of key areas that generate and use Big Data such as management and finance; medicine and healthcare; genome, cytome and microbiome; graphs and networks; Internet of Things; Big Data standards; bench-marking of systems; and others. In addition to different applications, key algorithmic approaches such as graph partitioning, clustering and finite mixture modelling of high-dimensional data are also covered. The varied collection of themes in this volume introduces the reader to the richness of the emerging field of Big Data Analytics.
Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications. With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.
While laboratory research is the backbone of collecting experimental data in cognitive science, a rapidly increasing amount of research is now capitalizing on large-scale and real-world digital data. Each piece of data is a trace of human behavior and offers us a potential clue to understanding basic cognitive principles. However, we have to be able to put the pieces together in a reasonable way, which necessitates both advances in our theoretical models and development of new methodological techniques. The primary goal of this volume is to present cutting-edge examples of mining large-scale and naturalistic data to discover important principles of cognition and evaluate theories that would not be possible without such a scale. This book also has a mission to stimulate cognitive scientists to consider new ways to harness big data in order to enhance our understanding of fundamental cognitive processes. Finally, this book aims to warn of the potential pitfalls of using, or being over-reliant on, big data and to show how big data can work alongside traditional, rigorously gathered experimental data rather than simply supersede it. In sum, this groundbreaking volume presents cognitive scientists and those in related fields with an exciting, detailed, stimulating, and realistic introduction to big data - and to show how it may greatly advance our understanding of the principles of human memory, perception, categorization, decision-making, language, problem-solving, and representation.
This volume features selected, refereed papers on various aspects of statistics, matrix theory and its applications to statistics, as well as related numerical linear algebra topics and numerical solution methods, which are relevant for problems arising in statistics and in big data. The contributions were originally presented at the 25th International Workshop on Matrices and Statistics (IWMS 2016), held in Funchal (Madeira), Portugal on June 6-9, 2016. The IWMS workshop series brings together statisticians, computer scientists, data scientists and mathematicians, helping them better understand each other's tools, and fostering new collaborations at the interface of matrix theory and statistics.
This book discusses the challenges facing current research in knowledge discovery and data mining posed by the huge volumes of complex data now gathered in various real-world applications (e.g., business process monitoring, cybersecurity, medicine, language processing, and remote sensing). The book consists of 14 chapters covering the latest research by the authors and the research centers they represent. It illustrates techniques and algorithms that have recently been developed to preserve the richness of the data and allow us to efficiently and effectively identify the complex information it contains. Presenting the latest developments in complex pattern mining, this book is a valuable reference resource for data science researchers and professionals in academia and industry.
What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
Recent advances in high-throughput technologies have resulted in a deluge of biological information. Yet the storage, analysis, and interpretation of such multifaceted data require effective and efficient computational tools. This unique text/reference addresses the need for a unified framework describing how soft computing and machine learning techniques can be judiciously formulated and used in building efficient pattern recognition models. The book reviews both established and cutting-edge research, following a clear structure reflecting the major phases of a pattern recognition system: classification, feature selection, and clustering. The text provides a careful balance of theory, algorithms, and applications, with a particular emphasis given to applications in computational biology and bioinformatics. Topics and features: reviews the development of scalable pattern recognition algorithms for computational biology and bioinformatics; integrates different soft computing and machine learning methodologies with pattern recognition tasks; discusses in detail the integration of different techniques for handling uncertainties in decision-making and efficiently mining large biological datasets; presents a particular emphasis on real-life applications, such as microarray expression datasets and magnetic resonance images; includes numerous examples and experimental results to support the theoretical concepts described; concludes each chapter with directions for future research and a comprehensive bibliography. This important work will be of great use to graduate students and researchers in the fields of computer science, electrical and biomedical engineering. Researchers and practitioners involved in pattern recognition, machine learning, computational biology and bioinformatics, data mining, and soft computing will also find the book invaluable.
Focused on the mathematical foundations of social media analysis, Graph-Based Social Media Analysis provides a comprehensive introduction to the use of graph analysis in the study of social and digital media. It addresses an important scientific and technological challenge, namely the confluence of graph analysis and network theory with linear algebra, digital media, machine learning, big data analysis, and signal processing. Supplying an overview of graph-based social media analysis, the book provides readers with a clear understanding of social media structure. It uses graph theory, particularly the algebraic description and analysis of graphs, in social media studies. The book emphasizes the big data aspects of social and digital media. It presents various approaches to storing vast amounts of data online and retrieving that data in real-time. It demystifies complex social media phenomena, such as information diffusion, marketing and recommendation systems in social media, and evolving systems. It also covers emerging trends, such as big data analysis and social media evolution. Describing how to conduct proper analysis of the social and digital media markets, the book provides insights into processing, storing, and visualizing big social media data and social graphs. It includes coverage of graphs in social and digital media, graph and hyper-graph fundamentals, mathematical foundations coming from linear algebra, algebraic graph analysis, graph clustering, community detection, graph matching, web search based on ranking, label propagation and diffusion in social media, graph-based pattern recognition and machine learning, graph-based pattern classification and dimensionality reduction, and much more. This book is an ideal reference for scientists and engineers working in social media and digital media production and distribution. It is also suitable for use as a textbook in undergraduate or graduate courses on digital media, social media, or social networks.
The growth of machines and users of the Internet has led to the proliferation of all sorts of data concerning individuals, institutions, companies, governments, universities, and all kinds of known objects and events happening everywhere in daily life. Scientific knowledge is not an exception to the data boom. The phenomenon of data growth in science pushes forth as the number of scientific papers published doubles every 9-15 years, and the need for methods and tools to understand what is reported in scientific literature becomes evident. As the number of academicians and innovators swells, so do the number of publications of all types, yielding outlets of documents and depots of authors and institutions that need to be found in Bibliometric databases. These databases are dug into and treated to hand over metrics of research performance by means of Scientometrics that analyze the toil of individuals, institutions, journals, countries, and even regions of the world. The objective of this book is to assist students, professors, university managers, government, industry, and stakeholders in general, understand which are the main Bibliometric databases, what are the key research indicators, and who are the main players in university rankings and the methodologies and approaches that they employ in producing ranking tables. The book is divided into two sections. The first looks at Scientometric databases, including Scopus and Google Scholar as well as institutional repositories. The second section examines the application of Scientometrics to world-class universities and the role that Scientometrics can play in competition among them. It looks at university rankings and the methodologies used to create these rankings. Individual chapters examine specific rankings that include: QS World University Scimago Institutions Webometrics U-Multirank U.S. News & World Report The book concludes with a discussion of university performance in the age of research analytics.
In order to make informed decisions, there are three important elements: intuition, trust, and analytics. Intuition is based on experiential learning and recent research has shown that those who rely on their "gut feelings" may do better than those who don't. Analytics, however, are important in a data-driven environment to also inform decision making. The third element, trust, is critical for knowledge sharing to take place. These three elements-intuition, analytics, and trust-make a perfect combination for decision making. This book gathers leading researchers who explore the role of these three elements in the process of decision-making.
This edited collection discusses the emerging topics in statistical modeling for biomedical research. Leading experts in the frontiers of biostatistics and biomedical research discuss the statistical procedures, useful methods, and their novel applications in biostatistics research. Interdisciplinary in scope, the volume as a whole reflects the latest advances in statistical modeling in biomedical research, identifies impactful new directions, and seeks to drive the field forward. It also fosters the interaction of scholars in the arena, offering great opportunities to stimulate further collaborations. This book will appeal to industry data scientists and statisticians, researchers, and graduate students in biostatistics and biomedical science. It covers topics in: Next generation sequence data analysis Deep learning, precision medicine, and their applications Large scale data analysis and its applications Biomedical research and modeling Survival analysis with complex data structure and its applications.
The Art and Science of Analyzing Software Data provides valuable information on analysis techniques often used to derive insight from software data. This book shares best practices in the field generated by leading data scientists, collected from their experience training software engineering students and practitioners to master data science. The book covers topics such as the analysis of security data, code reviews, app stores, log files, and user telemetry, among others. It covers a wide variety of techniques such as co-change analysis, text analysis, topic analysis, and concept analysis, as well as advanced topics such as release planning and generation of source code comments. It includes stories from the trenches from expert data scientists illustrating how to apply data analysis in industry and open source, present results to stakeholders, and drive decisions.
Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining; it is the science of exploring large and complex bodies of data in order to discover useful patterns. Decision tree learning continues to evolve over time. Existing methods are constantly being improved and new methods introduced.This 2nd Edition is dedicated entirely to the field of decision trees in data mining; to cover all aspects of this important technique, as well as improved or new methods and techniques developed after the publication of our first edition. In this new edition, all chapters have been revised and new topics brought in. New topics include Cost-Sensitive Active Learning, Learning with Uncertain and Imbalanced Data, Using Decision Trees beyond Classification Tasks, Privacy Preserving Decision Tree Learning, Lessons Learned from Comparative Studies, and Learning Decision Trees for Big Data. A walk-through guide to existing open-source data mining software is also included in this edition.This book invites readers to explore the many benefits in data mining that decision trees offer:
The book proposes new technologies and discusses future solutions for design infrastructure for ICT. The book contains high quality submissions presented at Second International Conference on Information and Communication Technology for Sustainable Development (ICT4SD - 2016) held at Goa, India during 1 - 2 July, 2016. The conference stimulates the cutting-edge research discussions among many academic pioneering researchers, scientists, industrial engineers, and students from all around the world. The topics covered in this book also focus on innovative issues at international level by bringing together the experts from different countries.
Biologists are stepping up their efforts in understanding the biological processes that underlie disease pathways in the clinical contexts. This has resulted in a flood of biological and clinical data from genomic and protein sequences, DNA microarrays, protein interactions, biomedical images, to disease pathways and electronic health records. To exploit these data for discovering new knowledge that can be translated into clinical applications, there are fundamental data analysis difficulties that have to be overcome. Practical issues such as handling noisy and incomplete data, processing compute-intensive tasks, and integrating various data sources, are new challenges faced by biologists in the post-genome era. This book will cover the fundamentals of state-of-the-art data mining techniques which have been designed to handle such challenging data analysis problems, and demonstrate with real applications how biologists and clinical scientists can employ data mining to enable them to make meaningful observations and discoveries from a wide array of heterogeneous data from molecular biology to pharmaceutical and clinical domains.
Originating from Facebook, LinkedIn, Twitter, Instagram, YouTube, and many other networking sites, the social media shared by users and the associated metadata are collectively known as user generated content (UGC). To analyze UGC and glean insight about user behavior, robust techniques are needed to tackle the huge amount of real-time, multimedia, and multilingual data. Researchers must also know how to assess the social aspects of UGC, such as user relations and influential users. Mining User Generated Content is the first focused effort to compile state-of-the-art research and address future directions of UGC. It explains how to collect, index, and analyze UGC to uncover social trends and user habits. Divided into four parts, the book focuses on the mining and applications of UGC. The first part presents an introduction to this new and exciting topic. Covering the mining of UGC of different medium types, the second part discusses the social annotation of UGC, social network graph construction and community mining, mining of UGC to assist in music retrieval, and the popular but difficult topic of UGC sentiment analysis. The third part describes the mining and searching of various types of UGC, including knowledge extraction, search techniques for UGC content, and a specific study on the analysis and annotation of Japanese blogs. The fourth part on applications explores the use of UGC to support question-answering, information summarization, and recommendations.
Learn How to Properly Use the Latest Analytics Approaches in Your Organization Computational Business Analytics presents tools and techniques for descriptive, predictive, and prescriptive analytics applicable across multiple domains. Through many examples and challenging case studies from a variety of fields, practitioners easily see the connections to their own problems and can then formulate their own solution strategies. The book first covers core descriptive and inferential statistics for analytics. The author then enhances numerical statistical techniques with symbolic artificial intelligence (AI) and machine learning (ML) techniques for richer predictive and prescriptive analytics. With a special emphasis on methods that handle time and textual data, the text: Enriches principal component and factor analyses with subspace methods, such as latent semantic analyses Combines regression analyses with probabilistic graphical modeling, such as Bayesian networks Extends autoregression and survival analysis techniques with the Kalman filter, hidden Markov models, and dynamic Bayesian networks Embeds decision trees within influence diagrams Augments nearest-neighbor and k-means clustering techniques with support vector machines and neural networks These approaches are not replacements of traditional statistics-based analytics; rather, in most cases, a generalized technique can be reduced to the underlying traditional base technique under very restrictive conditions. The book shows how these enriched techniques offer efficient solutions in areas, including customer segmentation, churn prediction, credit risk assessment, fraud detection, and advertising campaigns.
Powerful, Flexible Tools for a Data-Driven WorldAs the data deluge continues in today's world, the need to master data mining, predictive analytics, and business analytics has never been greater. These techniques and tools provide unprecedented insights into data, enabling better decision making and forecasting, and ultimately the solution of increasingly complex problems. Learn from the Creators of the RapidMiner Software Written by leaders in the data mining community, including the developers of the RapidMiner software, RapidMiner: Data Mining Use Cases and Business Analytics Applications provides an in-depth introduction to the application of data mining and business analytics techniques and tools in scientific research, medicine, industry, commerce, and diverse other sectors. It presents the most powerful and flexible open source software solutions: RapidMiner and RapidAnalytics. The software and their extensions can be freely downloaded at www.RapidMiner.com. Understand Each Stage of the Data Mining ProcessThe book and software tools cover all relevant steps of the data mining process, from data loading, transformation, integration, aggregation, and visualization to automated feature selection, automated parameter and process optimization, and integration with other tools, such as R packages or your IT infrastructure via web services. The book and software also extensively discuss the analysis of unstructured data, including text and image mining. Easily Implement Analytics Approaches Using RapidMiner and RapidAnalytics Each chapter describes an application, how to approach it with data mining methods, and how to implement it with RapidMiner and RapidAnalytics. These application-oriented chapters give you not only the necessary analytics to solve problems and tasks, but also reproducible, step-by-step descriptions of using RapidMiner and RapidAnalytics. The case studies serve as blueprints for your own data mining applications, enabling you to effectively solve similar problems. |
You may like...
Opinion Mining and Text Analytics on…
Pantea Keikhosrokiani, Moussa Pourya Asl
Hardcover
R9,276
Discovery Miles 92 760
New Opportunities for Sentiment Analysis…
Aakanksha Sharaff, G. R. Sinha, …
Hardcover
R6,648
Discovery Miles 66 480
Intelligent Multidimensional Data…
Siddhartha Bhattacharyya, Sourav De, …
Hardcover
R5,351
Discovery Miles 53 510
Transforming Businesses With Bitcoin…
Dharmendra Singh Rajput, Ramjeevan Singh Thakur, …
Hardcover
R5,938
Discovery Miles 59 380
|