The book describes the emergence of big data technologies and the role of Spark in the entire big data stack. It compares Spark and Hadoop and identifies the shortcomings of Hadoop that have been overcome by Spark. The book mainly focuses on the in-depth architecture of Spark and on Spark RDDs: how the RDD complements big data's immutable nature and addresses it through lazy evaluation, caching, and type inference. It also addresses advanced topics in Spark, starting with the basics of Scala and the core Spark framework, and exploring Spark data frames, machine learning using MLlib, graph analytics using GraphX, and real-time processing with Apache Kafka, AWS Kinesis, and Azure Event Hubs. It then goes on to investigate Spark using PySpark and R. Focusing on the current big data stack, the book examines the interaction with current big data tools, with Spark being the core processing layer for all types of data. The book is intended for data engineers and scientists working on massive datasets and big data technologies in the cloud. In addition to industry professionals, it is helpful for aspiring data processing professionals and students working in big data processing and cloud computing environments.
Spatial data analysis has seen explosive growth in recent years. Both in mainstream statistics and econometrics as well as in many applied fields, the attention to space, location, and interaction has become an important feature of scholarly work. The methods developed to deal with problems of spatial pattern recognition, spatial autocorrelation, and spatial heterogeneity have seen greatly increased adoption, in part due to the availability of user-friendly desktop software. Through his theoretical and applied work, Arthur Getis has been a major contributing figure in this development. In this volume, we take both a retrospective and a prospective view of the field. We use the occasion of the retirement and move to emeritus status of Arthur Getis to highlight the contributions of his work. In addition, we aim to place it into perspective in light of the current state of the art and future directions in spatial data analysis. To this end, we elected to combine reprints of selected classic contributions by Getis with chapters written by key spatial scientists. These scholars were specifically invited to react to the earlier work by Getis with an eye toward assessing its impact, tracing out the evolution of related research, and to reflect on the future broadening of spatial analysis. The organization of the book follows four main themes in Getis' contributions:
* Spatial analysis
* Pattern analysis
* Local statistics
* Applications
For each of these themes, the chapters provide a historical perspective on early methodological developments and theoretical insights, assessments of these contributions in light of the current state of the art, as well as descriptions of new techniques and applications.
Recently, there has been a rapid increase in interest regarding social network analysis in the data mining community. Cognitive radios are expected to play a major role in meeting this exploding traffic demand on social networks due to their ability to sense the environment, analyze outdoor parameters, and then make decisions for dynamic time, frequency, space, resource allocation, and management to improve the utilization of mining the social data. Cognitive Social Mining Applications in Data Analytics and Forensics is an essential reference source that reviews cognitive radio concepts and examines their applications to social mining using a machine learning approach so that adaptive and intelligent mining is achieved. Featuring research on topics such as data mining, real-time ubiquitous social mining services, and cognitive computing, this book is ideally designed for social network analysts, researchers, academicians, and industry professionals.
Statistics for Social Work with SPSS provides readers with a user-friendly, evidence-based, and practical resource to help them make sense of, organize, analyze, and interpret data in contemporary contexts. It incorporates one of the most well-known statistics software applications, the Statistical Package for the Social Sciences (SPSS), within each chapter to help readers integrate their knowledge either manually or with the assistance of technology. The book begins with a brief introduction to statistics and research, followed by chapters that address variables, frequency distributions, measures of central tendency, and measures of variability. Additional chapters cover probability and hypothesis testing; normal distribution and Z score; correlation; simple linear regression; one-way ANOVA; and more. Each chapter features concise, simple explanations of key terms, formulas, and calculations; study questions and answers; specific SPSS instructions on computerized computations; and evidence-based, practical examples to support the learning experience. Presenting students with highly accessible and universally understandable statistical concepts, Statistics for Social Work with SPSS is an ideal textbook for undergraduate and graduate-level courses in social work statistics, as well as research-based courses within the social and behavioral sciences.
This is a book about the scientific process and how you apply it to data in ecology. You will learn how to plan for data collection, how to assemble data, how to analyze data and finally how to present the results. The book uses Microsoft Excel and the powerful Open Source R program to carry out data handling as well as producing graphs. Statistical approaches covered include: data exploration; tests for difference - t-test and U-test; correlation - Spearman's rank test and Pearson product-moment; association including Chi-squared tests and goodness of fit; multivariate testing using analysis of variance (ANOVA) and Kruskal-Wallis test; and multiple regression. Key skills taught in this book include: how to plan ecological projects; how to record and assemble your data; how to use R and Excel for data analysis and graphs; how to carry out a wide range of statistical analyses including analysis of variance and regression; how to create professional looking graphs; and how to present your results. New in this edition: a completely revised chapter on graphics including graph types and their uses, Excel Chart Tools, R graphics commands and producing different chart types in Excel and in R; an expanded range of support material online, including example data, exercises and additional notes and explanations; a new chapter on basic community statistics, biodiversity and similarity; chapter summaries and end-of-chapter exercises.
Praise for the first edition: "This book is a superb way in for all those looking at how to design investigations and collect data to support their findings." - Sue Townsend, Biodiversity Learning Manager, Field Studies Council. "[M]akes it easy for the reader to synthesise R and Excel and there is extra help and sample data available on the free companion webpage if needed. I recommended this text to the university library as well as to colleagues at my student workshops on R. Although I initially bought this book when I wanted to discover R I actually also learned new techniques for data manipulation and management in Excel." - Mark Edwards, EcoBlogging. "A must for anyone getting to grips with data analysis using R and Excel." - Amazon 5-star review. "It has been very easy to follow and will be perfect for anyone." - Amazon 5-star review. "A solid introduction to working with Excel and R. The writing is clear and informative, the book provides plenty of examples and figures so that each string of code in R or step in Excel is understood by the reader." - Goodreads, 4-star review
Data literacy is one of the key skills that companies are looking for but it's a specialist skill - currently. This book is your comprehensive guide to becoming data literate: understand data analytics, how to use data insights effectively in your organisation, and how to talk about data with experts and non-experts confidently.
Statistical Tools for Nonlinear Regression, Second Edition, presents methods for analyzing data using parametric nonlinear regression models. The new edition has been expanded to include binomial, multinomial and Poisson nonlinear models. Using examples from experiments in agronomy and biochemistry, it shows how to apply these methods. It concentrates on presenting the methods in an intuitive way rather than developing the theoretical backgrounds. The examples are analyzed with the free software nls2, updated to deal with the new models included in the second edition. The nls2 package is implemented in S-PLUS and R. Its main advantage is that it makes the model building, estimation and validation tasks easy to do. More precisely, complex models can be easily described using a symbolic syntax. The regression function as well as the variance function can be defined explicitly as functions of independent variables and of unknown parameters, or they can be defined as the solution to a system of differential equations. Moreover, constraints on the parameters can easily be added to the model. It is thus possible to test nested hypotheses and to compare several data sets. Several additional tools are included in the package for calculating confidence regions for functions of parameters or calibration intervals, using classical methodology or bootstrap. Some graphical tools are proposed for visualizing the fitted curves, the residuals, the confidence regions, and the numerical estimation procedure.
Open government data (OGD) has developed rapidly in recent years due to various benefits that can be derived through transparency and public access. However, researchers emphasize a lack of use instead of lack of disclosure as a key problem in OGD's present development. Previous studies have approached this issue either from the supply-side, focusing on data quantity and quality, or from the demand-side, focusing on factors that affect users' acceptance of OGD, but seldom consider both sides at the same time. This unique study compares the supply and demand sides of OGD and explores possible directions for the future development of OGD portals based on the discovered mismatches between the two. The authors improve OGD utilization by balancing the supply-side and demand-side according to citizens' demands through OGD portals. Based on the concept of an OGD ecosystem, four connected studies are explored. The first study built an evaluation framework for understanding the development of the OGD supply-side. The second study focuses on a survey conducted to analyze the awareness and utilization of OGD portals by citizens, who are the primary users and major beneficiaries of OGD on the demand-side. A third study compares the supply and demand sides based on Diffusion of Innovation theory. A final study tests the proposed usability criteria for building an OGD portal by carrying out a between-subjects experiment including a virtual agent. Each case study examines a unique aspect of OGD in China, and also offers reflections on future directions for developing OGD. Providing a unique and enhanced theoretical and practical understanding of OGD and its usage, as well as proposing directions for OGD portals' future development in order to encourage citizens' OGD utilization, this is a must-read for researchers and policymakers examining the impact and possibilities of OGD.
This book thoroughly covers the remote sensing visualization and analysis techniques based on computational imaging and vision in Earth science. Remote sensing is considered a significant information source for monitoring and mapping natural and man-made land through the development of sensor resolutions that committed different Earth observation platforms. The book includes related topics for the different systems, models, and approaches used in the visualization of remote sensing images. It offers flexible and sophisticated solutions for removing uncertainty from the satellite data. It introduces real-time big data analytics to derive intelligence systems in enterprise earth science applications. Furthermore, the book integrates statistical concepts with computer-based geographic information systems (GIS). It focuses on image processing techniques for observing data together with uncertainty information raised by spectral, spatial, and positional accuracy of GPS data. The book addresses several advanced improvement models to guide engineers in developing different remote sensing visualization and analysis schemes. It highlights advanced improvement models for supervised/unsupervised classification algorithms, support vector machines, artificial neural networks, fuzzy logic, decision-making algorithms, and time series modeling and forecasting. This book guides engineers, designers, and researchers to exploit the intrinsic design of remote sensing systems. The book gathers remarkable material from an international experts' panel to guide readers during the development of earth big data analytics and their challenges.
During the last decades, there has been an explosion in computation and information technology. This development comes with an expansion of complex observational studies and clinical trials in a variety of fields such as medicine, biology, epidemiology, sociology, and economics among many others, which involve collection of large amounts of data on subjects or organisms over time. The goal of such studies can be formulated as estimation of a finite dimensional parameter of the population distribution corresponding to the observed time-dependent process. Such estimation problems arise in survival analysis, causal inference and regression analysis. This book provides a fundamental statistical framework for the analysis of complex longitudinal data. It provides the first comprehensive description of optimal estimation techniques based on time-dependent data structures subject to informative censoring and treatment assignment in so-called semiparametric models. Semiparametric models are particularly attractive since they allow the presence of large unmodeled nuisance parameters. These techniques include estimation of regression parameters in the familiar (multivariate) generalized linear regression and multiplicative intensity models. They go beyond standard statistical approaches by incorporating all the observed data to allow for informative censoring, to obtain maximal efficiency, and by developing estimators of causal effects. It can be used to teach master's and Ph.D. students in biostatistics and statistics and is suitable for researchers in statistics with a strong interest in the analysis of complex longitudinal data.
Multivariate data analysis is a central tool whenever several variables need to be considered at the same time. The present book explains a powerful and versatile way to analyse data tables, suitable also for researchers without formal training in statistics. This method for extracting useful information from data is demonstrated for various types of quality assessment, ranging from human quality perception via industrial quality monitoring to health quality and its molecular basis. Key features include:
The book is written with ISO certified businesses and laboratories in mind, to enhance Total Quality Management (TQM). As yet there are no clear guidelines for realistic data analysis of quality in complex systems - this volume bridges the gap.
The modern world is awash with data. The R Project is a statistical environment and programming language that can help to make sense of it all. A huge open-source project, R has become enormously popular because of its power and flexibility. With R you can organise, analyse and visualise data. This clear and methodical book will help you learn how to use R from the ground up, giving you a start in the world of data science. Learning about data is important in many academic and business settings, and R offers a potent and adaptable programming toolbox. The book covers a range of topics, including: importing/exporting data, summarising data, visualising data, managing and manipulating data objects, data analysis (regression, ANOVA and association among others) and programming functions. Regardless of your background or specialty, you'll find this book the perfect primer on data analysis, data visualisation and data management, and a springboard for further exploration.
This book presents both the theory of financial data analytics and comprehensive insights into the application of financial data analytics techniques in real-world financial situations. It offers solutions on how to logically analyze the enormous amount of structured and unstructured data generated every moment in the finance sector. This data can be used by companies, organizations, and investors to create strategies, as the finance sector rapidly moves towards data-driven optimization. This book provides an efficient resource, addressing all applications of data analytics in the finance sector. International experts from around the globe cover the most important subjects in finance, including data processing, knowledge management, machine learning models, data modeling, visualization, optimization for financial problems, financial econometrics, financial time series analysis, project management, and decision making. The authors provide empirical evidence as examples of specific topics. By combining both applications and theory, the book offers a holistic approach. Therefore, it is a must-read for researchers and scholars of financial economics and finance, as well as practitioners interested in a better understanding of financial data analytics.
The American Statistical Association (ASA) and the Association for Computing Machinery (ACM) have longstanding ethical practice standards that are explicitly intended to be utilized by all who use statistical practices or computing, or both. Since statistics and computing are critical in any data-centered activity, these practice standards are essential to instruction in the uses of statistical practices or computing across disciplines. Ethical Reasoning for a Data-Centered World is aimed at any undergraduate or graduate students utilizing data. Whether the career goal is research, teaching, business, government, or a combination, this book presents a method for understanding and prioritizing ethical statistics, computing, and data science - featuring the ASA and ACM practice standards. To facilitate engagement, integration with prior learning, and authenticity, the material is organized around seven tasks: Planning/Designing; Data collection; Analysis; Interpretation; Reporting; Documenting; and Engaging in Team Work. This book is a companion volume to Ethical Practice of Statistics and Data Science, also published by Ethics International Press (2022). These are the first and only books to be based on, and to provide guidance to, the American Statistical Association (ASA) and Association for Computing Machinery (ACM) ethical guideline documents.
Achieve successful digital transformation with this authoritative guide designed specifically for established organizations. At a time when even the most recognized business models are under threat, organizations risk devastation if they do not transition successfully to the new digital reality. Yet what works for digital natives does not always work for established organizations. Recognized as one of the world's top global executives leading innovative transformation, Neetan Chopra's deep experience of steering organizations through digital disruption drives the practical approach of Accelerated Digital Transformation. Having designed transformation journeys, overcome setbacks and driven outcomes within multiple leading companies, Neetan Chopra tackles key factors for established organizations including inertia, impetus, outcomes, digital capabilities and culture. The book is underpinned by a tried and tested framework that will guide readers step by step through the entire digital transformation journey. This will be an essential resource for leaders, managers and practitioners leading and executing digital transformation.
Value-Driven Data explains how data and business leaders can co-create and deploy data-driven solutions for their organizations. Value-Driven Data explores how organizations can understand their problems and come up with better solutions, aligning data storytelling with business needs. The book reviews the main challenges that plague most data-to-business interactions and offers actionable strategies for effective data value implementation, including methods for tackling obstacles and incentivizing change. Value-Driven Data is supported by tried-and-tested frameworks that can be applied to different contexts and organizations. It features cutting-edge examples relating to digital transformation, data strategy, resolving conflicts of interests, building a data P&L and AI value prediction methodology. Recognizing different types of data value, this book presents tangible methodologies for identifying, capturing, communicating, measuring and deploying data-enabled opportunities. This is essential reading for data specialists, business stakeholders and leaders involved in capturing and executing data value opportunities for organizations and for informing data value strategies.
This handbook is the first book ever covering the area of Multimodal Learning Analytics (MMLA). The field of MMLA is an emerging domain of Learning Analytics and plays an important role in expanding the Learning Analytics goal of understanding and improving learning in all the different environments where it occurs. The challenge for research and practice in this field is how to develop theories about the analysis of human behaviors during diverse learning processes and to create useful tools that could augment the capabilities of learners and instructors in a way that is ethical and sustainable. Behind this area, the CrossMMLA research community exchanges ideas on how we can analyze evidence from multimodal and multisystem data, how we can extract meaning from this increasingly fluid and complex data coming from different kinds of transformative learning situations, and how to best feed back the results of these analyses to achieve positive transformative actions on those learning processes. This handbook also describes how MMLA uses the advances in machine learning and affordable sensor technologies to act as a virtual observer/analyst of learning activities. The book describes how this "virtual nature" allows MMLA to provide new insights into learning processes that happen across multiple contexts between stakeholders, devices and resources. Using such technologies in combination with machine learning, Learning Analytics researchers can now perform text, speech, handwriting, sketches, gesture, affective, or eye-gaze analysis, improve the accuracy of their predictions and learned models and provide automated feedback to enable learner self-reflection. However, with this increased complexity in data, new challenges also arise. Conducting the data gathering, pre-processing, analysis, annotation and sense-making in a way that is meaningful for learning scientists and other stakeholders (e.g., students or teachers) still poses challenges in this emergent field.
This handbook aims to serve as a unique resource for state of the art methods and processes. Chapter 11 of this book is available open access under a CC BY 4.0 license at link.springer.com.
Although there has been a surge of interest in density estimation in recent years, much of the published research has been concerned with purely technical matters with insufficient emphasis given to the technique's practical value. Furthermore, the subject has been rather inaccessible to the general statistician.
Don't simply show your data; tell a story with it! Storytelling with Data teaches you the fundamentals of data visualization and how to communicate effectively with data. You'll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples ready for immediate application to your next graph or presentation. Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don't make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story. Specifically, you'll learn how to:
* Understand the importance of context and audience
* Determine the appropriate type of graph for your situation
* Recognize and eliminate the clutter clouding your information
* Direct your audience's attention to the most important parts of your data
* Think like a designer and utilize concepts of design in data visualization
* Leverage the power of storytelling to help your message resonate with your audience
Together, the lessons in this book will help you turn your data into high impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data; Storytelling with Data will give you the skills and power to tell it!
Aspects of Robust Statistics are important in many areas. Based on the International Conference on Robust Statistics 2001 (ICORS 2001) in Vorau, Austria, this volume discusses future directions of the discipline, bringing together leading scientists, experienced researchers and practitioners, as well as younger researchers. The papers cover a multitude of different aspects of Robust Statistics. For instance, the fundamental problem of data summary (weights of evidence) is considered and its robustness properties are studied. Further theoretical subjects include, for example, robust methods for skewness, time series, longitudinal data, multivariate methods, and tests. Some papers deal with computational aspects and algorithms. Finally, the aspects of application and programming tools complete the volume.
The importance of data analytics is well known, but how can you get end users to engage with analytics and business intelligence (BI) when adoption of new technology can be frustratingly slow or may not happen at all? Avoid wasting time on dashboards and reports that no one uses with this practical guide to increasing analytics adoption by focusing on people and process, not technology. Pulling together agile, UX and change management principles, Delivering Data Analytics outlines a step-by-step, technology agnostic process designed to shift the organizational data culture and gain buy-in from users and stakeholders at every stage of the project. This book outlines how to succeed and build trust with stakeholders amid the politics, ambiguity and lack of engagement in business. With case studies, templates, checklists and scripts based on the author's considerable experience in analytics and data visualisation, this book covers the full cycle from requirements gathering and data assessment to training and launch. Ensure lasting adoption, trust and, most importantly, actionable business value with this roadmap to creating user-centric analytics projects.
This book includes high-quality papers presented at the Second International Conference on Data Science and Management (ICDSM 2021), organized by the Gandhi Institute for Education and Technology, Bhubaneswar, from 19 to 20 February 2021. It features research in which data science is used to facilitate the decision-making process in various application areas, and also covers a wide range of learning methods and their applications in a number of learning problems. The empirical studies, theoretical analyses and comparisons to psychological phenomena described contribute to the development of products to meet market demands.
The massive volume of data generated in modern applications can overwhelm our ability to conveniently transmit, store, and index it. For many scenarios, building a compact summary of a dataset that is vastly smaller enables flexibility and efficiency in a range of queries over the data, in exchange for some approximation. This comprehensive introduction to data summarization, aimed at practitioners and students, showcases the algorithms, their behavior, and the mathematical underpinnings of their operation. The coverage starts with simple sums and approximate counts, building to more advanced probabilistic structures such as the Bloom Filter, distinct value summaries, sketches, and quantile summaries. Summaries are described for specific types of data, such as geometric data, graphs, and vectors and matrices. The authors offer detailed descriptions of and pseudocode for key algorithms that have been incorporated in systems from companies such as Google, Apple, Microsoft, Netflix and Twitter.