Data are not only ubiquitous in society, but are increasingly complex in both size and dimensionality. Dimension reduction offers researchers and scholars the ability to make such complex, high dimensional data spaces simpler and more manageable. This Element offers readers a suite of modern unsupervised dimension reduction techniques, along with hundreds of lines of R code, to efficiently represent the original high dimensional data space in a simplified, lower dimensional subspace. Launching from the earliest dimension reduction technique, principal components analysis, and using real social science data, I introduce and walk readers through application of the following techniques: locally linear embedding, t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection, self-organizing maps, and deep autoencoders. The result is a well-stocked toolbox of unsupervised algorithms for tackling the complexities of high dimensional data so common in modern society. All code is publicly accessible on GitHub.
Learn from Today's Most Successful Workforce Analytics Leaders. Transforming the immense potential of workforce analytics into reality isn't easy. Pioneering practitioners have learned crucial lessons that can help you succeed. The Power of People shares their journeys and their indispensable insights. Drawing on incisive case studies and vignettes, three experts help you bring purpose and clarity to any workforce analytics project, with robust research design and analysis to get reliable insights. They reveal where to start, where to find stakeholder support, and how to earn "quick wins" to build upon. You'll learn how to sustain success through best-practice data management, technology usage, partnering, and skill building. Finally, you'll discover how to earn even more value by establishing an analytical mindset throughout HR, and building two key skills: storytelling and visualization. The Power of People will be invaluable to HR executives establishing or leading analytics functions; HR professionals planning analytics projects; and any business executive who wants more value from HR.
Data analysis is changing fast. Driven by a vast range of application domains and affordable tools, machine learning has become mainstream. Unsupervised data analysis, including cluster analysis, factor analysis, and continually updated low dimensionality mapping methods, has reached new heights of achievement in the incredibly rich data world that we inhabit. Statistical Learning and Data Science is a work of reference in the rapidly evolving context of converging methodologies. It gathers contributions from some of the foundational thinkers in the different fields of data analysis on the major theoretical results in the domain. On the methodological front, the volume includes conformal prediction and frameworks for assessing confidence in outputs, together with attendant risk. It illustrates a wide range of applications, including semantics, credit risk, energy production, genomics, and ecology. The book also addresses issues of origin and evolution in the unsupervised data analysis arena, and presents some approaches for time series, symbolic data, and functional data. Over the history of multidimensional data analysis, more and more complex data have become available for processing. Supervised machine learning, semi-supervised analysis approaches, and unsupervised data analysis provide great capability for addressing the digital data deluge. Exploring the foundations and recent breakthroughs in the field, Statistical Learning and Data Science demonstrates how data analysis can improve personal and collective health and the well-being of our social, business, and physical environments.
This book examines how cloud-based services challenge the current application of antitrust and privacy laws in the EU and the US. The author looks at the elements of data centers, the way information is organized, and how antitrust, competition and privacy laws in the US and the EU regulate cloud-based services and their market practices. She discusses how platform interoperability can be a driver of incremental innovation and the consequences of not promoting radical innovation. She evaluates applications of predictive analysis based on big data as well as deriving privacy-invasive conduct. She looks at the way antitrust and privacy laws approach consumer protection and how lawmakers can reach more balanced outcomes by understanding the technical background of cloud-based services.
The Cognitive Approach in Cloud Computing and Internet of Things Technologies for Surveillance Tracking Systems discusses the recent, rapid development of the Internet of Things (IoT) and its focus on research in smart cities, especially on surveillance tracking systems in which computing devices are widely distributed and huge amounts of dynamic real-time data are collected and processed. Efficient surveillance tracking systems in the Big Data era require the capability of quickly abstracting useful information from the increasing amounts of data. Real-time information fusion is imperative and part of the challenge of mission-critical surveillance tasks for various applications. This book presents all of these concepts, with a goal of creating automated IT systems that are capable of resolving problems without demanding human aid.
Building Big Data Applications helps data managers and their organizations make the most of unstructured data with an existing data warehouse. It provides readers with what they need to know to make sense of how Big Data fits into the world of Data Warehousing. Readers will learn about infrastructure options and integration and come away with a solid understanding of how to leverage various architectures for integration. The book includes a wide range of use cases that will help data managers visualize reference architectures in the context of specific industries (healthcare, big oil, transportation, software, etc.).
The Critical Infrastructure Protection Survey recently released by Symantec found that 53% of interviewed IT security experts from international companies experienced at least ten cyber attacks in the last five years, and financial institutions were often subject to some of the most sophisticated and large-scale cyber attacks and frauds. The book by Baldoni and Chockler analyzes the structure of software infrastructures found in the financial domain, their vulnerabilities to cyber attacks and the existing protection mechanisms. It then shows the advantages of sharing information among financial players in order to detect and quickly react to cyber attacks. Various aspects associated with information sharing are investigated from the organizational, cultural and legislative perspectives. The presentation is organized in two parts: Part I explores general issues associated with information sharing in the financial sector and is intended to set the stage for the vertical IT middleware solution proposed in Part II. Nonetheless, it is self-contained and details a survey of various types of critical infrastructure along with their vulnerability analysis, which has not yet appeared in a textbook-style publication elsewhere. Part II then presents the CoMiFin middleware for collaborative protection of the financial infrastructure. The material is presented in an accessible style and does not require specific prerequisites. It appeals to both researchers in the areas of security, distributed systems, and event processing working on new protection mechanisms, and practitioners looking for a state-of-the-art middleware technology to enhance the security of their critical infrastructures in e.g. banking, military, and other highly sensitive applications. The latter group will especially appreciate the concrete usage scenarios included.
This book is designed to enable and encourage health professionals and family support workers to include fathers in the process of their work. It focuses on the enormous potential value of accessing men at a time they are known to be particularly receptive - before and after the birth - within the context of providing solutions in the debate about problematic aspects of masculinity and fatherhood. It looks at how important the father's role is within the family environment and how fathers should be encouraged to take part in the upbringing of their children.
When it comes to data analytics, it pays to think big. PySpark blends the powerful Spark big data processing engine with the Python programming language to provide a data analysis platform that can scale up for nearly any task. Data Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects. Data Analysis with Python and PySpark is a carefully engineered tutorial that helps you use PySpark to deliver your data-driven applications at any scale. This clear and hands-on guide shows you how to enlarge your processing capabilities across multiple machines with data from any source, ranging from Hadoop-based clusters to Excel worksheets. You'll learn how to break down big analysis tasks into manageable chunks and how to choose and use the best PySpark data abstraction for your unique needs. The Spark data processing engine is an amazing analytics factory: raw data comes in, and insight comes out. Thanks to its ability to handle massive amounts of data distributed across a cluster, Spark has been adopted as standard by organizations both big and small. PySpark, which wraps the core Spark engine with a Python-based API, puts Spark-based data pipelines in the hands of programmers and data scientists working with the Python programming language. PySpark simplifies Spark's steep learning curve, and provides a seamless bridge between Spark and an ecosystem of Python-based data science tools.
This volume deals with two complementary topics. On one hand the book deals with the problem of determining the probability distribution of a positive compound random variable, a problem which appears in the banking and insurance industries, in many areas of operational research and in reliability problems in the engineering sciences. On the other hand, the methodology proposed to solve such problems, which is based on an application of the maximum entropy method to invert the Laplace transform of the distributions, can be applied to many other problems. The book contains applications to a large variety of problems, including the problem of dependence of the sample data used to estimate empirically the Laplace transform of the random variable. Contents: Introduction; Frequency models; Individual severity models; Some detailed examples; Some traditional approaches to the aggregation problem; Laplace transforms and fractional moment problems; The standard maximum entropy method; Extensions of the method of maximum entropy; Superresolution in maxentropic Laplace transform inversion; Sample data dependence; Disentangling frequencies and decompounding losses; Computations using the maxentropic density; Review of statistical procedures.
This compact course is written for the mathematically literate reader who wants to learn to analyze data in a principled fashion. The language of mathematics enables clear exposition that can go quite deep, quite quickly, and naturally supports an axiomatic and inductive approach to data analysis. Starting with a good grounding in probability, the reader moves to statistical inference via topics of great practical importance - simulation and sampling, as well as experimental design and data collection - that are typically displaced from introductory accounts. The core of the book then covers both standard methods and such advanced topics as multiple testing, meta-analysis, and causal inference.
As qualitative researchers incorporate computer assistance into their analytic approaches, important questions arise about the adoption of new technology. Is it worth learning computer-assisted methods? Will the pay-off be sufficient to justify the investment? Which programs are worth learning? What are the effects on the analysis process? This book complements the existing literature by giving a detailed account of the use of four major programs in analyzing the same data. Priority is given to the tasks of qualitative analysis rather than to program capability, and the programs are treated as tools rather than as a discipline to be acquired. The key is not what the programs allow researchers to do, but whether the tasks that researchers need to undertake are facilitated by the software. Thus the study develops a user-centred approach to the adoption of computer-assisted qualitative data analysis. The author emphasises qualitative analysis as a creative craft, but one which must increasingly be subject to rigorous methodological scrutiny. The adoption of computer-aided methods offers opportunities, but also dangers, and ultimately this book is about scientific qualitative research. Written in a distinctive and succinct style, this book will be valuable to social science researchers and students interested in qualitative research and in the potential for computer-assisted analysis.
The technique of DNA Sequencing lies at the heart of modern molecular biology. Since current methods were first introduced, sequence databases have grown exponentially, and are now an indispensable research tool. This up-to-date, practical guide is unique in covering all aspects of the methodology of DNA sequencing, as well as sequence analysis. It describes the basic methods (both manual and automated) and the more advanced techniques (for example, those based on PCR) before moving on to key applications. The final section focuses on the analysis of sequence data; it details the software available, and explains how the Internet can be used for accessing software and major databases. By explaining the options available and their merits, DNA Sequencing allows newcomers to the field to decide which method is the most suitable for their application. For experienced sequencers the book is a useful reference source for details of the less common techniques and as a means of updating knowledge.
This accessible introduction to statistics using the program Minitab explains when to apply and how to calculate and interpret a wide range of statistical procedures commonly used in the social sciences. Keeping statistical symbols and formulae to a minimum and using simple examples, this book:
Big Data Analytics for Sensor-Network Collected Intelligence explores state-of-the-art methods for using advanced ICT technologies to perform intelligent analysis on sensor-collected data. The book shows how to develop systems that automatically detect natural and human-made events, how to examine people's behaviors, and how to unobtrusively provide better services. It begins by exploring big data architecture and platforms, covering the cloud computing infrastructure and how data is stored and visualized. The book then explores how big data is processed and managed, the key security and privacy issues involved, and the approaches used to ensure data quality. In addition, readers will find a thorough examination of big data analytics, analyzing statistical methods for data analytics and data mining, along with a detailed look at big data intelligence, ubiquitous and mobile computing, and designing intelligent systems based on context and situation. Indexing: The books of this series are submitted to EI-Compendex and SCOPUS.
Quantitative data analysis is now a compulsory component of most degree courses in the social sciences, and students are increasingly reliant on computers for the analysis of data. Quantitative Data Analysis with Minitab explains statistical tests for Mac users using the same formulae-free, non-technical approach as the very successful SPSS version. Students will learn a wide range of quantitative data analysis techniques and become familiar with how these techniques can be implemented through the latest version of Minitab. Techniques covered include univariate analysis (with frequency tables, dispersion and histograms), bivariate analysis (with contingency tables, correlation, analysis of variance and non-parametric tests) and multivariate analysis (with multiple regression, path analysis, covariance and factor analysis). In addition, the book covers issues such as sampling, statistical significance, conceptualization and measurement, and the selection of appropriate tests. Each chapter concludes with a set of exercises. Social science students will be interested in this integrated, non-mathematical introduction to quantitative data analysis and the Minitab package.
This text provides deep and comprehensive coverage of the mathematical background for data science, including machine learning, optimal recovery, compressed sensing, optimization, and neural networks. In the past few decades, heuristic methods adopted by big tech companies have complemented existing scientific disciplines to form the new field of Data Science. This text takes readers on an engaging itinerary through the theory supporting the field. Altogether, twenty-seven lecture-length chapters with exercises provide all the details necessary for a solid understanding of key topics in data science. While the book covers standard material on machine learning and optimization, it also includes distinctive presentations of topics such as reproducing kernel Hilbert spaces, spectral clustering, optimal recovery, compressed sensing, group testing, and applications of semidefinite programming. Students and data scientists with less mathematical background will appreciate the appendices that provide more background on some of the more abstract concepts.
Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, and computer vision in the cloud environment. Generic design patterns in Python programming are clearly explained, emphasizing architectural practices such as hot potato anti-patterns. You'll review recent advances in databases such as Neo4j, Elasticsearch, and MongoDB. You'll then study feature engineering in images and texts with implementing business logic and see how to build machine learning and deep learning models using transfer learning. Advanced Analytics with Python, 2nd edition features a chapter on clustering with a neural network, regularization techniques, and algorithmic design patterns in data analytics with reinforcement learning. Finally, the recommender system in PySpark explains how to optimize models for a specific application. What You'll Learn: build intelligent systems for enterprise; review time series analysis, classifications, regression, and clustering; explore supervised learning, unsupervised learning, reinforcement learning, and transfer learning; use cloud platforms like GCP and AWS in data analytics; understand design patterns in Python. Who This Book Is For: data scientists and software developers interested in the field of data analytics.
Advanced Data Science and Analytics with Python enables data scientists to continue developing their skills and apply them in business as well as academic settings. The subjects discussed in this book are complementary and a follow-up to the topics discussed in Data Science and Analytics with Python. The aim is to cover important advanced areas in data science using tools developed in Python such as scikit-learn, pandas, NumPy, Beautiful Soup, NLTK, NetworkX and others. The model development is supported by the use of frameworks such as Keras, TensorFlow and Core ML, as well as Swift for the development of iOS and macOS applications. Features: targets readers with a background in programming who are interested in the tools used in data analytics and data science; uses Python throughout; presents tools, alongside solved examples, with steps that the reader can easily reproduce and adapt to their needs; focuses on the practical use of the tools rather than on lengthy explanations; provides the reader with the opportunity to use the book whenever needed rather than following a sequential path. The book can be read independently from the previous volume and each of the chapters in this volume is sufficiently independent from the others, providing flexibility for the reader. Each of the topics addressed in the book tackles the data science workflow from a practical perspective, concentrating on the process and results obtained. The implementation and deployment of trained models are central to the book. Time series analysis, natural language processing, topic modelling, social network analysis, neural networks and deep learning are comprehensively covered. The book discusses the need to develop data products and addresses the subject of bringing models to their intended audiences - in this case, literally to the users' fingertips in the form of an iPhone app. About the Author: Dr. Jesus Rogel-Salazar is a lead data scientist in the field, working for companies such as Tympa Health Technologies, Barclays, AKQA, IBM Data Science Studio and Dow Jones. He is a visiting researcher at the Department of Physics at Imperial College London, UK and a member of the School of Physics, Astronomy and Mathematics at the University of Hertfordshire, UK.
Topological data analysis (TDA) has emerged recently as a viable tool for analyzing complex data, and the area has grown substantially both in its methodologies and applicability. Providing a computational and algorithmic foundation for techniques in TDA, this comprehensive, self-contained text introduces students and researchers in mathematics and computer science to the current state of the field. The book features a description of mathematical objects and constructs behind recent advances, the algorithms involved, computational considerations, as well as examples of topological structures or ideas that can be used in applications. It provides a thorough treatment of persistent homology together with various extensions - like zigzag persistence and multiparameter persistence - and their applications to different types of data, like point clouds, triangulations, or graph data. Other important topics covered include discrete Morse theory, the Mapper structure, optimal generating cycles, as well as recent advances in embedding TDA within machine learning frameworks.
A state-of-the-art view of recent developments in the use of artificial neural networks for analysing remotely sensed satellite data. Neural networks, as a new form of computational paradigm, appear well suited to many of the tasks involved in this image analysis. This book demonstrates a wide range of uses of neural networks for remote sensing applications and reports the views of a large number of European experts brought together as part of a concerted action supported by the European Commission.
This is the first rigorous, self-contained treatment of the theory of deep learning. Starting with the foundations of the theory and building it up, this is essential reading for any scientists, instructors, and students interested in artificial intelligence and deep learning. It provides guidance on how to think about scientific questions, and leads readers through the history of the field and its fundamental connections to neuroscience. The author discusses many applications to beautiful problems in the natural sciences, in physics, chemistry, and biomedicine. Examples include the search for exotic particles and dark matter in experimental physics, the prediction of molecular properties and reaction outcomes in chemistry, and the prediction of protein structures and the diagnostic analysis of biomedical images in the natural sciences. The text is accompanied by a full set of exercises at different difficulty levels and encourages out-of-the-box thinking.
Many students find it daunting to move from studying environmental science to designing and implementing their own research proposals. This book provides a practical introduction to help develop scientific thinking, aimed at undergraduate and new graduate students in the earth and environmental sciences. Students are guided through the steps of scientific thinking using published scientific literature and real environmental data. The book starts with advice on how to effectively read scientific papers, before outlining how to articulate testable questions and answer them using basic data analysis. The Mauna Loa CO2 dataset is used to demonstrate how to read metadata, prepare data, generate effective graphs and identify dominant cycles on various timescales. Practical, question-driven examples are explored to explain running averages, anomalies, correlations and simple linear models. The final chapter provides a framework for writing persuasive research proposals, making this an essential guide for students embarking on their first research project.