From the Foreword: "While large-scale machine learning and data mining have greatly impacted a range of commercial applications, their use in the field of Earth sciences is still in the early stages. This book, edited by Ashok Srivastava, Ramakrishna Nemani, and Karsten Steinhaeuser, serves as an outstanding resource for anyone interested in the opportunities and challenges for the machine learning community in analyzing these data sets to answer questions of urgent societal interest...I hope that this book will inspire more computer scientists to focus on environmental applications, and Earth scientists to seek collaborations with researchers in machine learning and data mining to advance the frontiers in Earth sciences." --Vipin Kumar, University of Minnesota
Large-Scale Machine Learning in the Earth Sciences provides researchers and practitioners with a broad overview of some of the key challenges at the intersection of Earth science, computer science, statistics, and related fields. It explores a wide range of topics and provides a compilation of recent research in the application of machine learning in the field of Earth Science. Making predictions based on observational data is a theme of the book, and the book includes chapters on the use of network science to understand and discover teleconnections in extreme climate and weather events, as well as using structured estimation in high dimensions. The use of ensemble machine learning models to combine predictions of global climate models using information from spatial and temporal patterns is also explored. The second part of the book features a discussion on statistical downscaling in climate with state-of-the-art scalable machine learning, as well as an overview of methods to understand and predict the proliferation of biological species due to changes in environmental conditions. The problem of using large-scale machine learning to study the formation of tornadoes is also explored in depth.
The last part of the book covers the use of deep learning algorithms to classify images that have very high resolution, as well as the unmixing of spectral signals in remote sensing images of land cover. In the final chapter, the authors also apply long-tail distributions to geoscience resources.
Visual Analytics and Interactive Technologies: Data, Text and Web Mining Applications is a comprehensive reference on concepts, algorithms, theories, applications, software, and visualization of data mining, text mining, Web mining and computing/supercomputing. This publication provides a coherent set of related works on the state-of-the-art of the theory and applications of mining, making it a useful resource for researchers, practitioners, professionals and intellectuals in technical and non-technical fields.
This book features multi-omics big-data integration and data-mining techniques. In the omics age, the sheer volume of multi-omics data from various sources is the new challenge we face, but it also provides clues for several biomedical or clinical applications. This book focuses on data integration and data mining methods for multi-omics research, explaining in detail and with supporting examples the “What”, “Why” and “How” of the topic. The contents are organized into eight chapters: one introductory chapter, followed by four chapters dedicated to omics integration techniques, focusing on several omics data resources and data-mining methods, and three chapters dedicated to applications of multi-omics analyses, each demonstrated with several data mining methods. This book is an attempt to bridge the gap between biomedical multi-omics big data and data-mining techniques, in support of best practice in contemporary bioinformatics and in-depth insight into biomedical questions. It will be of interest to researchers and practitioners who want to conduct multi-omics studies in cancer, inflammatory disease, and microbiome research.
This book equips readers to handle complex multi-view data representation, centered around several major visual applications, sharing many tips and insights through a unified learning framework. This framework is able to model most existing multi-view learning and domain adaptation approaches, enriching readers' understanding of their similarities and differences in terms of data organization, problem settings, and research goals. A comprehensive review covers the key recent research on multi-view data analysis, i.e., multi-view clustering, multi-view classification, zero-shot learning, and domain adaptation. More practical challenges in multi-view data analysis are also discussed, including incomplete, unbalanced and large-scale multi-view learning. Learning Representation for Multi-View Data Analysis covers a wide range of applications in the research fields of big data, human-centered computing, pattern recognition, digital marketing, web mining, and computer vision.
This volume comprises eight contributed chapters reporting the latest findings on intelligent approaches to multimedia data analysis. Multimedia data is a combination of different discrete and continuous content forms like text, audio, images, videos, animations and interactional data. The presence of at least one continuous medium in the transmitted information makes it multimedia information. Because of this variety, multimedia data present varying degrees of uncertainty and imprecision, which the conventional computing paradigm cannot easily handle. Soft computing technologies handle the imprecision and uncertainty of multimedia data efficiently and are flexible enough to process real-world information. Proper analysis of multimedia data finds wide applications in medical diagnosis, video surveillance, text annotation etc. This volume is intended to be used as a reference by undergraduate and postgraduate students of the disciplines of computer science, electronics and telecommunication, information science and electrical engineering.
THE SERIES: FRONTIERS IN COMPUTATIONAL INTELLIGENCE
The series Frontiers In Computational Intelligence is envisioned to provide comprehensive coverage and understanding of cutting edge research in computational intelligence. It intends to augment the scholarly discourse on all topics relating to the advances in artificial life and machine learning in the form of metaheuristics, approximate reasoning, and robotics. Latest research findings are coupled with applications to varied domains of engineering and computer sciences. This field is steadily growing, especially with the advent of novel machine learning algorithms being applied to different domains of engineering and technology. The series brings together leading researchers who intend to continue to advance the field and create a broad knowledge about the most recent state of the art.
The book illustrates the inter-relationship between several data management, analytics and decision support techniques and methods commonly adopted in Cybersecurity-oriented frameworks. The recent advent of Big Data paradigms and the use of data science methods have resulted in a higher demand for effective data-driven models that support decision-making at a strategic level. This motivates the need for defining novel data analytics and decision support approaches in a myriad of real-life scenarios and problems, with Cybersecurity-related domains being no exception. This contributed volume comprises nine chapters, written by leading international researchers, covering a compilation of recent advances in Cybersecurity-related applications of data analytics and decision support approaches. In addition to theoretical studies and overviews of existing relevant literature, this book comprises a selection of application-oriented research contributions. The investigations undertaken across these chapters focus on diverse and critical Cybersecurity problems, such as Intrusion Detection, Insider Threats, Collusion Detection, Run-Time Malware Detection, E-Learning, Online Examinations, Cybersecurity noisy data removal, Secure Smart Power Systems, Security Visualization and Monitoring. Researchers and professionals alike will find the chapters an essential read for further research on the topic.
Geographic Information has an important role to play in linking and combining datasets through shared location, but the potential is still far from fully realized because the data is not well organized and the technology to aid this process has not been available. Developments in the Semantic Web and Linked Data, however, are making it possible to integrate data based on Geographic Information in a way that is more accessible to users. Drawing on the industry experience of a geographer and a computer scientist, Linked Data: A Geographic Perspective is a practical guide to implementing Geographic Information as Linked Data.
Combine Geographic Information from Multiple Sources Using Linked Data
After an introduction to the building blocks of Geographic Information, the Semantic Web, and Linked Data, the book explores how Geographic Information can become part of the Semantic Web as Linked Data. In easy-to-understand terms, the authors explain the complexities of modeling Geographic Information using Semantic Web technologies and publishing it as Linked Data. They review the software tools currently available for publishing and modeling Linked Data and provide a framework to help you evaluate new tools in a rapidly developing market. They also give an overview of the important languages and syntaxes you will need to master. Throughout, extensive examples demonstrate why and how you can use ontologies and Linked Data to manipulate and integrate real-world Geographic Information data from multiple sources.
A Practical, Readable Guide for Geographers, Software Engineers, and Laypersons
A coherent, readable introduction to a complex subject, this book supplies the durable knowledge and insight you need to think about Geographic Information through the lens of the Semantic Web. It provides a window to Linked Data for geographers, as well as a geographic perspective for so
Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents the state of the art of both well-established and more recent methods of co-clustering. The authors mainly deal with two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixtures adapted to different types of data. The algorithms used are described, and related works with different classical methods are presented and commented upon. This chapter is useful in tackling the problem of co-clustering under the mixture approach. Chapter 2 is devoted to the latent block model proposed in the mixture approach context. The authors discuss this model in detail and present its relevance to co-clustering. Various algorithms are presented in a general context. Chapter 3 focuses on binary and categorical data. It presents, in detail, the appropriate latent block mixture models. Variants of these models and algorithms are presented and illustrated using examples. Chapter 4 focuses on contingency data. Mutual information, phi-squared and model-based co-clustering are studied. Models, algorithms and connections among different approaches are described and illustrated. Chapter 5 presents the case of continuous data. In the same way, the different approaches used in the previous chapters are extended to this situation.
Contents: 1. Cluster Analysis. 2. Model-Based Co-Clustering. 3. Co-Clustering of Binary and Categorical Data. 4. Co-Clustering of Contingency Tables. 5. Co-Clustering of Continuous Data.
About the Authors: Gerard Govaert is Professor at the University of Technology of Compiegne, France. He is also a member of the CNRS Laboratory Heudiasyc (Heuristic and diagnostic of complex systems). His research interests include latent structure modeling, model selection, model-based cluster analysis, block clustering and statistical pattern recognition. He is one of the authors of the MIXMOD (MIXtureMODelling) software. Mohamed Nadif is Professor at the University of Paris-Descartes, France, where he is a member of LIPADE (Paris Descartes computer science laboratory) in the Mathematics and Computer Science department. His research interests include machine learning, data mining, model-based cluster analysis, co-clustering, factorization and data analysis.
Cluster Analysis is an important tool in a variety of scientific areas. Chapter 1 briefly presents the state of the art of well-established as well as more recent methods. The hierarchical, partitioning and fuzzy approaches are discussed, amongst others. The authors review the difficulty these classical methods have in tackling high dimensionality, sparsity and scalability. Chapter 2 discusses the interest of co-clustering, presenting different approaches and defining a co-cluster. The authors focus on co-clustering as a simultaneous clustering and discuss the cases of binary, continuous and co-occurrence data. The criteria and algorithms are described and illustrated on simulated and real data. Chapter 3 considers co-clustering as a model-based co-clustering. A latent block model is defined for different kinds of data. The estimation of parameters and co-clustering is tackled under two approaches: maximum likelihood and classification maximum likelihood. Hard and soft algorithms are described and applied on simulated and real data. Chapter 4 considers co-clustering as a matrix approximation. The trifactorization approach is considered and algorithms based on update rules are described. Links with numerical and probabilistic approaches are established. A combination of algorithms is proposed and evaluated on simulated and real data. Chapter 5 considers co-clustering or bi-clustering as the search for coherent co-clusters in biological terms or the extraction of co-clusters under conditions. Classical algorithms are described and evaluated on simulated and real data. Different indices to evaluate the quality of co-clusters are noted and used in numerical experiments.
This book offers a clear and comprehensive introduction to broad learning, one of the novel learning problems studied in data mining and machine learning. Broad learning aims at fusing multiple large-scale information sources of diverse varieties together, and carrying out synergistic data mining tasks across these fused sources in one unified analytic. This book takes online social networks as an application example to introduce the latest alignment and knowledge discovery algorithms. Besides the overview of broad learning, machine learning and social network basics, specific topics covered in this book include network alignment, link prediction, community detection, information diffusion, viral marketing, and network embedding.
Covering research at the frontier of this field, Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques presents state-of-the-art privacy-preserving data mining techniques for application domains, such as medicine and social networks, that face the increasing heterogeneity and complexity of new forms of data. Renowned authorities from prominent organizations not only cover well-established results but also explore complex domains where privacy issues are generally clear and well defined, but the solutions are still preliminary and in continuous development. Divided into seven parts, the book provides in-depth coverage of the most novel reference scenarios for privacy-preserving techniques. The first part gives general techniques that can be applied to various applications discussed in the rest of the book. The second section focuses on the sanitization of network traces and privacy in data stream mining. After the third part on privacy in spatio-temporal data mining and mobility data analysis, the book examines time series analysis in the fourth section, explaining how a perturbation method and a segment-based method can tackle privacy issues of time series data. The fifth section on biomedical data addresses genomic data as well as the problem of privacy-aware information sharing of health data. In the sixth section on web applications, the book deals with query log mining and web recommender systems. The final part on social networks analyzes privacy issues related to the management of social network data under different perspectives. While several new results have recently occurred in the privacy, database, and data mining research communities, a uniform presentation of up-to-date techniques and applications is lacking. Filling this void, Privacy-Aware Knowledge Discovery presents novel algorithms, patterns, and models, along with a significant collection of open problems for future investigation.
Web mining aims to discover useful information and knowledge from Web hyperlinks, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining due to the semi-structured and unstructured nature of the Web data. The field has also developed many of its own algorithms and techniques. Liu has written a comprehensive text on Web mining, which consists of two parts. The first part covers the data mining and machine learning foundations, where all the essential concepts and algorithms of data mining and machine learning are presented. The second part covers the key topics of Web mining, where Web crawling, search, social network analysis, structured data extraction, information integration, opinion mining and sentiment analysis, Web usage mining, query log mining, computational advertising, and recommender systems are all treated both in breadth and in depth. His book thus brings all the related concepts and algorithms together to form an authoritative and coherent text. The book offers a rich blend of theory and practice. It is suitable for students, researchers and practitioners interested in Web mining and data mining both as a learning text and as a reference book. Professors can readily use it for classes on data mining, Web mining, and text mining. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
This book explores the possibility of using social media data for detecting socio-economic recovery activities. In the last decade, there have been intensive research activities focusing on social media during and after disasters. This approach, which views people's communication on social media as a sensor for real-time situations, has been widely adopted as the "people as sensor" approach. Furthermore, to improve recovery efforts after large-scale disasters, detecting communities' real-time recovery situations is essential, since conventional socio-economic recovery indicators, such as governmental statistics, are not published in real time. Thanks to its timeliness, using social media data can fill the gap. Motivated by this possibility, this book especially focuses on the relationships between people's communication on Twitter and Facebook pages, and socio-economic recovery activities as reflected in the used-car market data and the housing market data in the case of two major disasters: the Great East Japan Earthquake and Tsunami of 2011 and Hurricane Sandy in 2012. The book pursues an interdisciplinary approach, combining e.g. disaster recovery studies, crisis informatics, and economics. In terms of its contributions, firstly, the book sheds light on the "people as sensors" approach for detecting socio-economic recovery activities, which has not been thoroughly studied to date but has the potential to improve situation awareness during the recovery phase. Secondly, the book proposes new socio-economic recovery indicators: used-car market data and housing market data. Thirdly, in the context of using social media during the recovery phase, the results demonstrate the importance of distinguishing between social media data posted both by people who are at or near disaster-stricken areas and by those who are farther away.
This book provides a unique, in-depth discussion of multiview learning, one of the fastest developing branches in machine learning. Multiview Learning has been proved to have good theoretical underpinnings and great practical success. This book describes the models and algorithms of multiview learning in real data analysis. Incorporating multiple views to improve the generalization performance, multiview learning is also known as data fusion or data integration from multiple feature sets. This self-contained book is applicable for multi-modal learning research, and requires minimal prior knowledge of the basic concepts in the field. It is also a valuable reference resource for researchers working in the field of machine learning and also those in various application domains.
Informed decision making rests on three important elements: intuition, trust, and analytics. Intuition is based on experiential learning, and recent research has shown that those who rely on their "gut feelings" may do better than those who don't. Analytics, however, are also important for informing decision making in a data-driven environment. The third element, trust, is critical for knowledge sharing to take place. Together, intuition, analytics, and trust make a perfect combination for decision making. This book gathers leading researchers who explore the role of these three elements in the process of decision-making.
The growth of machines and users of the Internet has led to the proliferation of all sorts of data concerning individuals, institutions, companies, governments, universities, and all kinds of known objects and events happening everywhere in daily life. Scientific knowledge is not an exception to the data boom. The phenomenon of data growth in science pushes forth as the number of scientific papers published doubles every 9-15 years, and the need for methods and tools to understand what is reported in scientific literature becomes evident. As the number of academicians and innovators swells, so do the number of publications of all types, yielding outlets of documents and depots of authors and institutions that need to be found in Bibliometric databases. These databases are dug into and treated to hand over metrics of research performance by means of Scientometrics that analyze the toil of individuals, institutions, journals, countries, and even regions of the world. The objective of this book is to assist students, professors, university managers, government, industry, and stakeholders in general, understand which are the main Bibliometric databases, what are the key research indicators, and who are the main players in university rankings and the methodologies and approaches that they employ in producing ranking tables. The book is divided into two sections. The first looks at Scientometric databases, including Scopus and Google Scholar as well as institutional repositories. The second section examines the application of Scientometrics to world-class universities and the role that Scientometrics can play in competition among them. It looks at university rankings and the methodologies used to create these rankings. Individual chapters examine specific rankings that include: QS World University, Scimago Institutions, Webometrics, U-Multirank, and U.S. News & World Report. The book concludes with a discussion of university performance in the age of research analytics.
This book shows healthcare professionals how to turn data points into meaningful knowledge upon which they can take effective action. Actionable intelligence can take many forms, from informing health policymakers on effective strategies for the population to providing direct and predictive insights on patients to healthcare providers so they can achieve positive outcomes. It can assist those performing clinical research where relevant statistical methods are applied to both identify the efficacy of treatments and improve clinical trial design. It also benefits healthcare data standards groups through which pertinent data governance policies are implemented to ensure quality data are obtained, measured, and evaluated for the benefit of all involved. Although the obvious constant thread among all of these important healthcare use cases of actionable intelligence is the data at hand, such data in and of itself merely represents one element of the full structure of healthcare data analytics. This book examines the structure for turning data into actionable knowledge and discusses:
* The importance of establishing research questions
* Data collection policies and data governance
* Principle-centered data analytics to transform data into information
* Understanding the "why" of classified causes and effects
* Narratives and visualizations to inform all interested parties
Actionable Intelligence in Healthcare is an important examination of how proper healthcare-related questions should be formulated, how relevant data must be transformed to associated information, and how the processing of information relates to knowledge. It indicates to clinicians and researchers why this relative knowledge is meaningful and how best to apply such newfound understanding for the betterment of all.
Compiled by world-class leaders in the field of collaborative information retrieval and search (CIS), this book centres on the notion that information seeking is not always a solitary activity and that working in collaboration to perform information-seeking tasks should be studied and supported. Covering aspects of theories, models, and applications, the book is divided into three parts:
* Best Practices and Studies: provides an overview of current knowledge and the state of the art in the field.
* New Domains: covers some of the new and exciting opportunities of applying CIS.
* New Thoughts: focuses on new research directions by scholars from academia and industry from around the world.
Collaborative Information Seeking provides a valuable reference for students, teachers, and researchers interested in the area of collaborative work, information seeking/retrieval, and human-computer interaction.
This volume unpacks an intriguing challenge for the field of media research: combining media research with the study of complex networks. Bringing together research on the small-world idea and digital culture it questions the assumption that we are separated from any other person on the planet by just a few steps, and that this distance decreases within digital social networks. The book argues that the role of languages is decisive to understand how people connect, and it looks at the consequences this has on the ways knowledge spreads digitally. This volume offers a first conceptual venue to analyse emerging phenomena at the innovative intersection of media and complex network research.
In discrete choice models the relationships between the independent variables and the choice probabilities are nonlinear, depending on both the value of the particular independent variable being interpreted and the values of the other independent variables. Thus, interpreting the magnitude of the effects (the "substantive effects") of the independent variables on choice behavior requires the use of additional interpretative techniques. Three common techniques for interpretation are described here: first differences, marginal effects and elasticities, and odds ratios. Concepts related to these techniques are also discussed, as well as methods to account for estimation uncertainty. Interpretation of binary logits, ordered logits, multinomial and conditional logits, and mixed discrete choice models such as mixed multinomial logits and random effects logits for panel data are covered in detail. The techniques discussed here are general, and can be applied to other models with discrete dependent variables which are not specifically described here.
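The three interpretation techniques named in the description above can be illustrated for the simplest case, a binary logit. The following is a minimal sketch, not taken from the book: the function names and the coefficient values in the usage note are hypothetical, and it uses only the standard logistic formulas (probability p = 1 / (1 + exp(-z)), marginal effect b_k * p * (1 - p), odds ratio exp(b_k)).

```python
import math


def logit_prob(intercept, coefs, x):
    """Choice probability of a binary logit at the covariate values x."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def marginal_effect(intercept, coefs, x, k):
    """Derivative of the probability with respect to x[k].

    Equals b_k * p * (1 - p), so it depends on every covariate through p.
    """
    p = logit_prob(intercept, coefs, x)
    return coefs[k] * p * (1.0 - p)


def first_difference(intercept, coefs, x, k, new_value):
    """Change in probability when x[k] moves to new_value, others held fixed."""
    x_new = list(x)
    x_new[k] = new_value
    return logit_prob(intercept, coefs, x_new) - logit_prob(intercept, coefs, x)


def odds_ratio(coefs, k):
    """Multiplicative change in the odds for a one-unit increase in x[k]."""
    return math.exp(coefs[k])
```

With a hypothetical intercept of 0 and a single coefficient of 1, `marginal_effect(0.0, [1.0], [0.0], 0)` and `marginal_effect(0.0, [1.0], [2.0], 0)` give different values, which is the nonlinearity the description refers to, while `odds_ratio([1.0], 0)` is the same at every covariate value.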
As cameras become more pervasive in our daily life, vast amounts of video data are generated. The popularity of YouTube and similar websites such as Tudou and Youku provides strong evidence for the increasing role of video in society. One of the main challenges confronting us in the era of information technology is to effectively rely on the huge and rapidly growing video data accumulating in large multimedia archives. Innovative video processing and analysis techniques will play an increasingly important role in resolving the difficult task of video search and retrieval. A wide range of video-based applications have benefited from advances in video search and mining including multimedia information management, human-computer interaction, security and surveillance, copyright protection, and personal entertainment, to name a few. This book provides an overview of emerging new approaches to video search and mining based on promising methods being developed in the computer vision and image analysis community. Video search and mining is a rapidly evolving discipline whose aim is to capture interesting patterns in video data. It has become one of the core areas in the data mining research community. In comparison to other types of data mining (e.g. text), video mining is still in its infancy. Many challenging research problems are facing video mining researchers.
Online social networking sites like Facebook, LinkedIn, and Twitter offer millions of members the opportunity to befriend one another, send messages to each other, and post content on the site - actions which generate mind-boggling amounts of data every day. To make sense of the massive data from these sites, we resort to social media mining to answer questions like the following:
This book, drawing on recent literature, highlights several methodologies for the detection of outliers and explains how to apply them to solve several interesting real-life problems. The detection of objects that deviate from the norm in a data set is an essential task in data mining due to its significance in many contemporary applications. More specifically, the detection of fraud in e-commerce transactions and discovering anomalies in network data have become prominent tasks, given recent developments in the field of information and communication technologies and security. Accordingly, the book sheds light on specific state-of-the-art algorithmic approaches such as the community-based analysis of networks and characterization of temporal outliers present in dynamic networks. It offers a valuable resource for young researchers working in data mining, helping them understand the technical depth of the outlier detection problem and devise innovative solutions to address related challenges.
This book constitutes selected, revised and extended papers from the 13th International Conference on Computer Supported Education, CSEDU 2021, held as a virtual event in April 2021. The 27 revised full papers were carefully reviewed and selected from 143 submissions. They were organized in topical sections as follows: artificial intelligence in education; information technologies supporting learning; learning/teaching methodologies and assessment; social context and learning environments; ubiquitous learning; current topics.
Your logical, linear guide to the fundamentals of data science programming
Data science is exploding--in a good way--with a forecast of 1.7 megabytes of new information created every second for each human being on the planet by 2020 and 11.5 million job openings by 2026. It clearly pays dividends to be in the know. This friendly guide charts a path through the fundamentals of data science and then delves into the actual work: linear regression, logistic regression, machine learning, neural networks, recommender engines, and cross-validation of models. Data Science Programming All-In-One For Dummies is a compilation of the key data science, machine learning, and deep learning programming languages: Python and R. It helps you decide which programming languages are best for specific data science needs. It also gives you the guidelines to build your own projects to solve problems in real time.
* Get grounded: the ideal start for new data professionals
* What lies ahead: learn about specific areas that data is transforming
* Be meaningful: find out how to tell your data story
* See clearly: pick up the art of visualization
Whether you're a beginning student or already mid-career, get your copy now and add even more meaning to your life--and everyone else's!