Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark's scalable machine-learning library.
* Get a gentle overview of big data and Spark
* Learn about DataFrames, SQL, and Datasets (Spark's core APIs) through worked examples
* Dive into Spark's low-level APIs, RDDs, and execution of SQL and DataFrames
* Understand how Spark runs on a cluster
* Debug, monitor, and tune Spark clusters and applications
* Learn the power of Structured Streaming, Spark's stream-processing engine
* Learn how you can apply MLlib to a variety of problems, including classification or recommendation
Newcomers to quantitative analysis need practical guidance on how to analyze data in the real world, yet most introductory books focus on lengthy derivations and justifications instead of practical techniques. Covering the technical and professional skills needed by analysts in the academic, private, and public sectors, Applying Analytics: A Practical Introduction systematically teaches novices how to apply algorithms to real data and how to recognize potential pitfalls. It is one of the first textbooks for the emerging first course in analytics. The text concentrates on the interpretation, strengths, and weaknesses of analytical techniques, along with challenges encountered by analysts in their daily work. The author shares various lessons learned from applying analytics in the real world. He supplements the technical material with coverage of professional skills traditionally learned through experience, such as project management, analytic communication, and using analysis to inform decisions. Example data sets used in the text are available for download online so that readers can test their own analytic routines. Suitable for beginning analysts in the sciences, business, engineering, and government, this book provides an accessible, example-driven introduction to the emerging field of analytics. It shows how to interpret data and identify trends across a range of fields.
Actuarial Principles: Lifetables and Mortality Models explores the core of actuarial science: the study of mortality and other risks and applications. Including the CT4 and CT5 UK courses, but applicable to a global audience, this work lightly covers the mathematical and theoretical background of the subject to focus on real life practice. It offers a brief history of the field, why actuarial notation has become universal, and how theory can be applied to many situations. Uniquely covering both life contingency risks and survival models, the text provides numerous exercises (and their solutions), along with complete self-contained real-world assignments.
This is the first comprehensive overview of the 'science of science,' an emerging interdisciplinary field that relies on big data to unveil the reproducible patterns that govern individual scientific careers and the workings of science. It explores the roots of scientific impact, the role of productivity and creativity, when and what kind of collaborations are effective, the impact of failure and success in a scientific career, and what metrics can tell us about the fundamental workings of science. The book relies on data to draw actionable insights, which can be applied by individuals to further their career or decision makers to enhance the role of science in society. With anecdotes and detailed, easy-to-follow explanations of the research, this book is accessible to all scientists and graduate students, policymakers, and administrators with an interest in the wider scientific enterprise.
The general theme of this book is to present innovative psychometric modeling and methods. In particular, this book includes research and successful examples of modeling techniques for new data sources from digital assessments, such as eye-tracking data, hint use, and process data from game-based assessments. In addition, innovative psychometric modeling approaches, such as graphical models, item tree models, network analysis, and cognitive diagnostic models, are included. Chapters 1, 2, 4, and 6 are about psychometric models and methods for learning analytics. The first two chapters focus on advanced cognitive diagnostic models for tracking learning and the improvement of attribute classification accuracy. Chapter 4 demonstrates the use of network analysis for learning analytics. Chapter 6 introduces the conjunctive root causes model for the understanding of prerequisite skills in learning. Chapters 3, 5, 8, and 9 are about innovative psychometric techniques to model process data. Specifically, Chapters 3 and 5 illustrate the usage of generalized linear mixed effect models and item tree models to analyze eye-tracking data. Chapter 8 discusses an approach to modeling hint use and response accuracy in learning environments. Chapter 9 demonstrates the identification of observable outcomes in game-based assessments. Chapters 7 and 10 introduce innovative latent variable modeling approaches, including the graphical and generalized linear model approach and the dynamic modeling approach. In summary, the book includes theoretical, methodological, and applied research and practices that serve as the foundation for future development. These chapters provide illustrations of efforts to model and analyze multiple data sources from digital assessments. As computer-based assessments emerge and evolve, it is important that researchers can expand and improve the methods for modeling and analyzing new data sources.
This book provides a useful resource to researchers who are interested in the development of psychometric methods to solve issues in this digital assessment age.
Numerical simulation models are used in all engineering disciplines for modeling physical phenomena to learn how the phenomena work, and to identify problems and optimize behavior. Smart Proxy Models provide an opportunity to replicate numerical simulations with very high accuracy and can be run on a laptop within a few minutes, thereby simplifying the use of complex numerical simulations, which can otherwise take tens of hours. This book focuses on Smart Proxy Modeling and provides readers with all the essential details on how to develop Smart Proxy Models using Artificial Intelligence and Machine Learning, as well as how it may be used in real-world cases.
* Covers replication of highly accurate numerical simulations using Artificial Intelligence and Machine Learning
* Details application in reservoir simulation and modeling and computational fluid dynamics
* Includes real case studies based on commercially available simulators
Smart Proxy Modeling is ideal for petroleum, chemical, environmental, and mechanical engineers, as well as statisticians and others working with applications of data-driven analytics.
Poor data quality is known to compromise the credibility and efficiency of commercial and public endeavours. The importance of managing data quality has also increased manifold as the diversity of sources, formats and volume of data grows. This volume addresses data quality in the light of collaborative information systems, where data creation and ownership are increasingly difficult to establish.
The authors provide an understanding of big data and MapReduce by clearly presenting the basic terminologies and concepts. They have employed over 100 illustrations and many worked-out examples to convey the concepts and methods used in big data, the inner workings of MapReduce, and single node/multi-node installation on physical/virtual machines. This book covers almost all the necessary information on Hadoop MapReduce for most online certification exams. Upon completing this book, readers will find it easy to understand other big data processing tools such as Spark, Storm, etc. Ultimately, readers will be able to:
* understand what big data is and the factors that are involved
* understand the inner workings of MapReduce, which is essential for certification exams
* learn the features and weaknesses of MapReduce
* set up Hadoop clusters with 100s of physical/virtual machines
* create a virtual machine in AWS
* write MapReduce with Eclipse in a simple way
* understand other big data processing tools and their applications
Comprehensive Coverage of the Entire Area of Classification
Research on the problem of classification tends to be fragmented across such areas as pattern recognition, database, data mining, and machine learning. Addressing the work of these different communities in a unified way, Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data. This comprehensive book focuses on three primary aspects of data classification:
* Methods: The book first describes common techniques used for classification, including probabilistic methods, decision trees, rule-based methods, instance-based methods, support vector machine methods, and neural networks.
* Domains: The book then examines specific methods used for data domains such as multimedia, text, time-series, network, discrete sequence, and uncertain data. It also covers large data sets and data streams due to the recent importance of the big data paradigm.
* Variations: The book concludes with insight on variations of the classification process. It discusses ensembles, rare-class learning, distance function learning, active learning, visual learning, transfer learning, and semi-supervised learning as well as evaluation aspects of classifiers.
Review of Marketing Research pushes the boundaries of marketing, broadening the marketing concept to make the world a better place. Here, leading scholars explore how marketing is currently shaping, and being shaped by, the evolution of Artificial Intelligence (AI). Topics covered include the effects of AI on: economics; personalisation; pricing; content generation; the identification, structuring, and prioritization of customer needs; customer feedback; Natural Language Processing; image analytics; deep learning; and the anthropomorphism of AI, such as in virtual assistants and chatbots. Each chapter provides thought-provoking discussions which will be relevant to researchers, professionals, and students.
Build predictive models from time-based patterns in your data. Master statistical models including new deep learning approaches for time series forecasting. In Time Series Forecasting in Python you will learn how to:
* Recognize a time series forecasting problem and build a performant predictive model
* Create univariate forecasting models that account for seasonal effects and external variables
* Build multivariate forecasting models to predict many time series at once
* Leverage large datasets by using deep learning for forecasting time series
* Automate the forecasting process
Time Series Forecasting in Python teaches you to build powerful predictive models from time-based data. Every model you create is relevant, useful, and easy to implement with Python. You'll explore interesting real-world datasets like Google's daily stock price and economic data for the USA, quickly progressing from the basics to developing large-scale models that use deep learning tools like TensorFlow.
About the technology: Time series forecasting reveals hidden trends and makes predictions about the future from your data. This powerful technique has proven incredibly valuable across multiple fields, from tracking business metrics to healthcare and the sciences. Modern Python libraries and powerful deep learning tools have opened up new methods and utilities for making practical time series forecasts.
About the book: Time Series Forecasting in Python teaches you to apply time series forecasting and get immediate, meaningful predictions. You'll learn both traditional statistical and new deep learning models for time series forecasting, all fully illustrated with Python source code. Test your skills with hands-on projects for forecasting air travel, volume of drug prescriptions, and the earnings of Johnson & Johnson. By the time you're done, you'll be ready to build accurate and insightful forecasting models with tools from the Python ecosystem.
Leverage the power of Talent Intelligence (TI) to make evidence-informed decisions that drive business performance by using data about people, skills, jobs, business functions and geographies. Improved access to people and business data has created huge opportunities for the HR function. However, simply having access to this data is not enough. HR professionals need to know how to analyse the data, what questions to ask of it, and where and how the insights from the data can add the most value. Talent Intelligence is a practical guide that explains everything HR professionals need to know to achieve this. It outlines what Talent Intelligence (TI) is, why it's important, and how to use it to improve business results, and includes guidance on how HR professionals can build the business case for it. This book also explains how and why talent intelligence is different from workforce planning, sourcing research and standard predictive HR analytics, and shows how to assess where in the organization talent intelligence can have the biggest impact and how to demonstrate the results to all stakeholders. Most importantly, this book covers KPIs and metrics for success, short-term and long-term TI goals, an outline of what success looks like and the skills needed for effective Talent Intelligence. It also features case studies from organizations including Philips, Barclays and Kimberly-Clark.
Praise for the First Edition "A very useful book for self study and reference." "Very well written. It is concise and really packs a lot of material in a valuable reference book." "An informative and well-written book . . . presented in an easy-to-understand style with many illustrative numerical examples taken from engineering and scientific studies." Practicing engineers and scientists often have a need to utilize statistical approaches to solving problems in an experimental setting. Yet many have little formal training in statistics. Statistical Design and Analysis of Experiments gives such readers a carefully selected, practical background in the statistical techniques that are most useful to experimenters and data analysts who collect, analyze, and interpret data. The First Edition of this now-classic book garnered praise in the field. Now its authors update and revise their text, incorporating readers’ suggestions as well as a number of new developments. Statistical Design and Analysis of Experiments, Second Edition emphasizes the strategy of experimentation, data analysis, and the interpretation of experimental results, presenting statistics as an integral component of experimentation from the planning stage to the presentation of conclusions. Giving an overview of the conceptual foundations of modern statistical practice, the revised text features discussions of:
Ideal for both students and professionals, this focused and cogent reference has proven to be an excellent classroom textbook with numerous examples. It deserves a place among the tools of every engineer and scientist working in an experimental setting.
Our present and our past are manifestly intertwined. Memories are not identical simulations of the past, but are stories shaped by our current perspectives of others, the world, and ourselves. As a result, the gathering of early recollections can be used as a projective technique that indicates our strengths, goals, lines of movement, fears, and a host of other relevant psychological data. Early Recollections are a quick, accurate, and cost-effective personality assessment demonstrated to have similar reliability and validity to other personality measures. Both comprehensive and accessible, Early Recollections: Interpretative Method and Application presents a constructivist approach and systematic development of early recollection theory. Mosak and Di Pietro invite students to think and actively engage in problem solving rather than merely read for content. Supported by step-by-step examples, this book also offers a perspective suitable for application by Adlerian practitioners, non-Adlerian clinicians, and all other mental health professionals and students seeking a new framework for evaluating personality.
Value-Driven Data explains how data and business leaders can co-create and deploy data-driven solutions for their organizations. Value-Driven Data explores how organizations can understand their problems and come up with better solutions, aligning data storytelling with business needs. The book reviews the main challenges that plague most data-to-business interactions and offers actionable strategies for effective data value implementation, including methods for tackling obstacles and incentivizing change. Value-Driven Data is supported by tried-and-tested frameworks that can be applied to different contexts and organizations. It features cutting-edge examples relating to digital transformation, data strategy, resolving conflicts of interests, building a data P&L and AI value prediction methodology. Recognizing different types of data value, this book presents tangible methodologies for identifying, capturing, communicating, measuring and deploying data-enabled opportunities. This is essential reading for data specialists, business stakeholders and leaders involved in capturing and executing data value opportunities for organizations and for informing data value strategies.
Multivariate data analysis is a central tool whenever several variables need to be considered at the same time. The present book explains a powerful and versatile way to analyse data tables, suitable also for researchers without formal training in statistics. This method for extracting useful information from data is demonstrated for various types of quality assessment, ranging from human quality perception via industrial quality monitoring to health quality and its molecular basis. Key features include:
The book is written with ISO certified businesses and laboratories in mind, to enhance Total Quality Management (TQM). As yet there are no clear guidelines for realistic data analysis of quality in complex systems - this volume bridges the gap.
A large international conference on Advances in Machine Learning and Data Analysis was held at UC Berkeley, California, USA, October 22-24, 2008, under the auspices of the World Congress on Engineering and Computer Science (WCECS 2008). This volume contains sixteen revised and extended research articles written by prominent researchers participating in the conference. Topics covered include expert systems, intelligent decision making, knowledge-based systems, knowledge extraction, data analysis tools, computational biology, optimization algorithms, experiment designs, complex system identification, computational modeling, and industrial applications. Advances in Machine Learning and Data Analysis presents the state of the art in machine learning and data analysis and also serves as an excellent reference text for researchers and graduate students working on machine learning and data analysis.
There is a lack of an exposition on interdisciplinary and innovative methods of data mining and visualization for biodata. This book fills the gap by introducing an interdisciplinary set of the most recent methods and references on novel techniques from artificial intelligence, data mining, engineering, pattern recognition, and ontological data mining fields that are applicable to bioinformatics. The latest novel approaches are explained in detail, their advantages and disadvantages are summarized, and pointers to the future development of new applications are given. By widening the pool from which biologists and bioinformaticians can adopt methods for biodata mining and visualization, computational data mining experts in nonbiological fields are also encouraged to utilize their expertise in order to contribute to the progress of computational biology, thus enhancing the collaboration between these two disciplines.
A practical, skill-based introduction to data analysis and literacy
We are swimming in a world of data, and this handy guide will keep you afloat while you learn to make sense of it all. In Data Literacy: A User's Guide, David Herzog, a journalist with a decade of experience using data analysis to transform information into captivating storytelling, introduces students and professionals to the fundamentals of data literacy, a key skill in today's world. Assuming the reader has no advanced knowledge of data analysis or statistics, this book shows how to create insight from publicly available data through exercises using simple Excel functions. Extensively illustrated, step-by-step instructions within a concise, yet comprehensive, reference will help readers identify, obtain, evaluate, clean, analyze and visualize data. A concluding chapter introduces more sophisticated data analysis methods and tools including database managers such as Microsoft Access and MySQL and standalone statistical programs such as SPSS, SAS and R.
Learn from Today's Most Successful Workforce Analytics Leaders
Transforming the immense potential of workforce analytics into reality isn't easy. Pioneering practitioners have learned crucial lessons that can help you succeed. The Power of People shares their journeys and their indispensable insights. Drawing on incisive case studies and vignettes, three experts help you bring purpose and clarity to any workforce analytics project, with robust research design and analysis to get reliable insights. They reveal where to start, where to find stakeholder support, and how to earn "quick wins" to build upon. You'll learn how to sustain success through best-practice data management, technology usage, partnering, and skill building. Finally, you'll discover how to earn even more value by establishing an analytical mindset throughout HR, and building two key skills: storytelling and visualization. The Power of People will be invaluable to HR executives establishing or leading analytics functions; HR professionals planning analytics projects; and any business executive who wants more value from HR.
When it comes to data analytics, it pays to think big. PySpark blends the powerful Spark big data processing engine with the Python programming language to provide a data analysis platform that can scale up for nearly any task. Data Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects. Data Analysis with Python and PySpark is a carefully engineered tutorial that helps you use PySpark to deliver your data-driven applications at any scale. This clear and hands-on guide shows you how to enlarge your processing capabilities across multiple machines with data from any source, ranging from Hadoop-based clusters to Excel worksheets. You'll learn how to break down big analysis tasks into manageable chunks and how to choose and use the best PySpark data abstraction for your unique needs. The Spark data processing engine is an amazing analytics factory: raw data comes in, and insight comes out. Thanks to its ability to handle massive amounts of data distributed across a cluster, Spark has been adopted as standard by organizations both big and small. PySpark, which wraps the core Spark engine with a Python-based API, puts Spark-based data pipelines in the hands of programmers and data scientists working with the Python programming language. PySpark simplifies Spark's steep learning curve, and provides a seamless bridge between Spark and an ecosystem of Python-based data science tools.
Fuzzy Cluster Analysis presents advanced and powerful fuzzy clustering techniques. This thorough and self-contained introduction to fuzzy clustering methods and applications covers classification, image recognition, data analysis and rule generation. Combining theoretical and practical perspectives, each method is analysed in detail and fully illustrated with examples. Features include:
The first magnetic recording device was demonstrated and patented by the Danish inventor Valdemar Poulsen in 1898. Poulsen made a magnetic recording of his voice on a length of piano wire. MAGNETIC RECORDING traces the development of the watershed products and the technical breakthroughs in magnetic recording that took place during the century from Poulsen's experiment to today's ubiquitous audio, video, and data recording technologies including tape recorders, video cassette recorders, and computer hard drives.