Knowing everything you can about each click to your Web site can help you make strategic decisions regarding your business. This book is about the why, not just the how, of web analytics and the rules for developing a "culture of analysis" inside your organization. Why you should collect various types of data. Why you need a strategy. Why it must remain flexible. Why your data must generate meaningful action. The authors answer these critical questions--and many more--using their decade of experience in Web analytics.
Statistical methods are a key tool for all scientists working with data, but learning the basics continues to challenge successive generations of students. This accessible textbook provides an up-to-date introduction to the classical techniques and modern extensions of linear model analysis, one of the most useful approaches for investigating scientific data in the life and environmental sciences. While some of the foundational analyses (e.g. t tests, regression, ANOVA) are as useful now as ever, best practice moves on and there are many new general developments that offer great potential. The book emphasizes an estimation-based approach that takes account of recent criticisms of over-use of probability values and introduces the alternative approach that uses information criteria. This new edition includes the latest advances in R and related software and has been thoroughly "road-tested" over the last decade to create a proven textbook that teaches linear and generalized linear model analysis to students of ecology, evolution, and environmental studies (including worked analyses of data sets relevant to all three disciplines). While R is used throughout, the focus remains firmly on statistical analysis. The New Statistics with R is suitable for senior undergraduate and graduate students, professional researchers, and practitioners in the fields of ecology, evolution and environmental studies.
The World Wide Web has a massive and permanent influence on our lives. Economy, industry, education, healthcare, public administration, entertainment - there is hardly any part of our daily lives which has not been pervaded by the Internet. Accordingly, modern Web applications are fully-fledged, complex software systems, and in order to be successful their development must be thorough and systematic. Web Engineering is the application of quantifiable approaches to the cost-effective requirements analysis, design, implementation, testing, operation and maintenance of high quality Web applications. Web Engineers face the same traditional concerns as Software Engineers: the risks of failure to meet business needs, project schedule delays, budget overruns and poor quality of deliverables. But in the Web environment new and complicated issues demand attention, too. Web Engineering addresses the problems associated with shorter lead times which require rapid prototyping and agile methods, the interactivity and visual nature of the medium which make HCI aspects highly significant, and multimedia features of Web applications. This well-organized guide takes a rigorous interdisciplinary approach to Web Engineering, covering Web development concepts, methods, tools and techniques, and is ideal for undergraduate and graduate students on Web-focused or Software Engineering courses, as well as Web software developers, Web designers and project managers.
This handbook is a comprehensive reference guide for researchers, funding agencies and organizations engaged in survey research. Drawing on research from a world-class team of experts, this collection addresses the challenges facing survey-based data collection today as well as the potential opportunities presented by new approaches to survey research, including in the development of policy. It examines innovations in survey methodology and how survey scholars and practitioners should think about survey data in the context of the explosion of new digital sources of data. The Handbook is divided into four key sections: the challenges faced in conventional survey research; opportunities to expand data collection; methods of linking survey data with external sources; and, improving research transparency and data dissemination, with a focus on data curation, evaluating the usability of survey project websites, and the credibility of survey-based social science. Chapter 23 of this book is open access under a CC BY 4.0 license at link.springer.com.
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark's scalable machine-learning library.
* Get a gentle overview of big data and Spark
* Learn about DataFrames, SQL, and Datasets (Spark's core APIs) through worked examples
* Dive into Spark's low-level APIs, RDDs, and execution of SQL and DataFrames
* Understand how Spark runs on a cluster
* Debug, monitor, and tune Spark clusters and applications
* Learn the power of Structured Streaming, Spark's stream-processing engine
* Learn how you can apply MLlib to a variety of problems, including classification or recommendation
Enterprise Resource Planning (ERP), Supply Chain Management (SCM), Customer Relationship Management (CRM), Business Intelligence (BI) and Big Data Analytics (BDA) are business related tasks and processes, which are supported by standardized software solutions. The book explains that this requires business oriented thinking and acting from IT specialists and data scientists. It is a good idea to let students experience this directly from the business perspective, for example as executives of a virtual company. The course simulates the stepwise integration of the linked business process chain ERP-SCM-CRM-BI-Big Data of four competing groups of companies. The course participants become board members with full P&L responsibility for business units of one of four beer brewery groups managing supply chains from production to retailer.
An accessible primer on how to create effective graphics from data
This book provides students and researchers a hands-on introduction to the principles and practice of data visualization. It explains what makes some graphs succeed while others fail, how to make high-quality figures from data using powerful and reproducible methods, and how to think about data visualization in an honest and effective way. Data Visualization builds the reader's expertise in ggplot2, a versatile visualization library for the R programming language. Through a series of worked examples, this accessible primer then demonstrates how to create plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics include plotting continuous and categorical variables; layering information on graphics; producing effective "small multiple" plots; grouping, summarizing, and transforming data for plotting; creating maps; working with the output of statistical models; and refining plots to make them more comprehensible. Effective graphics are essential to communicating ideas and a great way to better understand data. This book provides the practical skills students and practitioners need to visualize quantitative data and get the most out of their research findings.
* Provides hands-on instruction using R and ggplot2
* Shows how the "tidyverse" of data analysis tools makes working with R easier and more consistent
* Includes a library of data sets, code, and functions
Gain the basics of Ruby's map, reduce, and select functions and discover how to use them to solve data-processing problems. This compact hands-on book explains how you can encode certain complex programs in 10 lines of Ruby code, an astonishingly small number. You will walk through problems and solutions which are effective because they use map, reduce, and select. As you read Ruby Data Processing, type in the code, run the code, and ponder the results. Tweak the code to test the code and see how the results change. After reading this book, you will have a deeper understanding of how to break data-processing problems into processing stages, each of which is understandable, debuggable, and composable, and how to combine the stages to solve your data-processing problem. As a result, your Ruby coding will become more efficient and your programs will be more elegant and robust.
What You Will Learn
* Discover Ruby data processing and how to do it using the map, reduce, and select functions
* Develop complex solutions including debugging, randomizing, sorting, grouping, and more
* Reverse engineer complex data-processing solutions
Who This Book Is For
Those who have at least some prior experience programming in Ruby and who have a background and interest in data analysis and processing using Ruby.
This book shows business and data analysts how to use BigQuery most effectively, avoid common pitfalls, and ultimately execute sophisticated queries against large, complex data sets. The authors will share tips and recipes for running complex queries. And they will also show how to write code to communicate with the BigQuery API. The authors will demonstrate best practices and techniques against an extended real-world example: a web application that collects sensor data from mobile devices and displays a dashboard visualizing the data in real time. Along the way, the authors will use examples to demonstrate streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine datastore integration, and using GViz with Tableau to generate charts of query results. The authors will not just cover the mechanics of using BigQuery; they will also cover the architecture of the underlying Dremel query engine: understanding how a query will execute is a key to getting good results from BigQuery. The book describes how Dremel works, and pairs it with concrete query examples showing how to work around limitations in the architecture. The query samples will be in BigQuery's variant of SQL. And the web application examples will be in Python, the most popular language for analytics. Where the Java analogue of the Python samples would differ significantly, Java samples will be given as well. All code and data sets will be available on the book's companion website.
Power BI Data Analysis and Visualization provides a roadmap to vendor choices and highlights why Microsoft's Power BI is a very viable, cost effective option for data visualization. The book covers the fundamentals and most commonly used features of Power BI, but also includes an in-depth discussion of advanced Power BI features such as natural language queries; embedding Power BI dashboards; and live streaming data. It discusses real solutions to extract data from the ERP application, Microsoft Dynamics CRM, and also offers ways to host the Power BI Dashboard as an Azure application, extracting data from popular data sources like Microsoft SQL Server and open-source PostgreSQL. Authored by Microsoft experts, this book uses real-world coding samples and screenshots to spotlight how to create reports, embed them in a webpage, view them across multiple platforms, and more. Business owners, IT professionals, data scientists, and analysts will benefit from this thorough presentation of Power BI and its functions.
Data clustering is a highly interdisciplinary field, the goal of which is to divide a set of objects into homogeneous groups such that objects in the same group are similar and objects in different groups are quite distinct. Thousands of theoretical papers and a number of books on data clustering have been published over the past 50 years. However, few books exist to teach people how to implement data clustering algorithms. This book was written for anyone who wants to implement or improve their data clustering algorithms. Using object-oriented design and programming techniques, Data Clustering in C++ exploits the commonalities of all data clustering algorithms to create a flexible set of reusable classes that simplifies the implementation of any data clustering algorithm. Readers can follow the development of the base data clustering classes and several popular data clustering algorithms. Additional topics such as data pre-processing, data visualization, cluster visualization, and cluster interpretation are briefly covered. This book is divided into three parts:
* Data Clustering and C++ Preliminaries: A review of basic concepts of data clustering, the unified modeling language, object-oriented programming in C++, and design patterns
* A C++ Data Clustering Framework: The development of data clustering base classes
* Data Clustering Algorithms: The implementation of several popular data clustering algorithms
A key to learning a clustering algorithm is to implement and experiment with the clustering algorithm. Complete listings of classes, examples, unit test cases, and GNU configuration files are included in the appendices of this book as well as in the downloadable resources. The only requirements to compile the code are a modern C++ compiler and the Boost C++ libraries.
This SpringerBrief reviews the knowledge engineering problem of engineering objectivity in top-k query answering; essentially, answers must be computed taking into account the user's preferences and a collection of (subjective) reports provided by other users. Each report is assumed to consist of a set of scores for a list of features, its author's preferences among the features, and other information discussed in this brief. These pieces of information for every report are then combined, along with the querying user's preferences and their trust in each report, to rank the query results. Everyday examples of this setup are the online reviews that can be found in sites like Amazon, Trip Advisor, and Yelp, among many others. Throughout this knowledge engineering effort the authors adopt the Datalog+/- family of ontology languages as the underlying knowledge representation and reasoning formalism, and investigate several alternative ways in which rankings can be derived, along with algorithms for top-k (atomic) query answering under these rankings. This SpringerBrief also investigates assumptions under which these algorithms run in polynomial time in the data complexity. Since this SpringerBrief contains a gentle introduction to the main building blocks (OBDA, Datalog+/-, and reasoning with preferences), it should be of value to students, researchers, and practitioners who are interested in the general problem of incorporating user preferences into related formalisms and tools. Practitioners interested in using Ontology-based Data Access to leverage information contained in reviews of products and services for a better customer experience will also be interested in this brief, as will researchers working in the areas of Ontological Languages, Semantic Web, Data Provenance, and Reasoning with Preferences.
This book constitutes the proceedings of the 22nd Annual Conference on Research in Computational Molecular Biology, RECOMB 2018, held in Paris, France, in April 2018. The 16 extended and 22 short abstracts presented were carefully reviewed and selected from 193 submissions. The short abstracts are included in the back matter of the volume. They report on original research in all areas of computational molecular biology and bioinformatics.
This book provides an account of the use of computational tactical metrics in improving sports analysis, in particular the use of Global Positioning System (GPS) data in soccer. As well as offering a practical perspective on collective behavioural analysis, it introduces the computational metrics available in the literature that allow readers to identify collective behaviour and patterns of play in team sports. These metrics only require the bi-dimensional geo-referencing information from GPS or video-tracking systems to provide qualitative and quantitative information about the tactical behaviour of players and the inter-relationships between teammates and their opponents. Exercises, experimental cases and algorithms enable readers to fully comprehend how to compute these metrics, as well as introducing them to the ultimate performance analysis tool, which is the basis to run them on. The script to compute the metrics is presented in Python. The book is a valuable resource for professional analysts as well as students and researchers in the field of sports analysis wanting to optimise the use of GPS trackers in soccer.
Statistical data and evidence-based claims are increasingly central to our everyday lives. Critically examining 'Big Data', this book charts the recent explosion in sources of data, including those precipitated by global developments and technological change. It sets out changes and controversies related to data harvesting and construction, dissemination and data analytics by a range of private, governmental and social organisations in multiple settings. Analysing the power of data to shape political debate, the presentation of ideas to us by the media, and issues surrounding data ownership and access, the authors suggest how data can be used to uncover injustices and to advance social progress.
This book constitutes the thoroughly refereed proceedings of the Fourth International Conference on Data Technologies and Applications, DATA 2016, held in Colmar, France, in July 2016. The 9 revised full papers were carefully reviewed and selected from 50 submissions. The papers deal with the following topics: databases, data warehousing, data mining, data management, data security, knowledge and information systems and technologies; advanced application of data.
This text introduces and provides instruction on the design and analysis of experiments for a broad audience. Formed by decades of teaching, consulting, and industrial experience in the Design of Experiments field, this new edition contains updated examples, exercises, and situations covering the science and engineering practice. This text minimizes the amount of mathematical detail, while still doing full justice to the mathematical rigor of the presentation and the precision of statements, making the text accessible for those who have little experience with design of experiments and who need some practical advice on using such designs to solve day-to-day problems. Additionally, an intuitive understanding of the principles is always emphasized, with helpful hints throughout.
This book gathers papers presented at the ECC 2016, the Third Euro-China Conference on Intelligent Data Analysis and Applications, which was held in Fuzhou City, China from November 7 to 9, 2016. The aim of the ECC is to provide an internationally respected forum for scientific research in the broad areas of intelligent data analysis, computational intelligence, signal processing, and all associated applications of artificial intelligence (AI). The third installment of the ECC was jointly organized by Fujian University of Technology, China, and VSB-Technical University of Ostrava, Czech Republic. The conference was co-sponsored by Taiwan Association for Web Intelligence Consortium, and Immersion Co., Ltd.
Unique reference book covering the entire field of accounting information systems. Contributions from an international range of accounting and information systems experts. Includes coverage of contemporary themes such as big data, data security, cloud computing, IoT and blockchain.
Discover relevant questions, and detailed answers, to help you prepare for job interviews and break into the field of analytics. This book contains more than 200 questions based on consultations with hiring managers and technical professionals already working in analytics. Interview Questions in Business Analytics: How to Ace Interviews and Get the Job You Want fills a gap in information on business analytics for job seekers. Bhasker Gupta, the founder and editor of Analytics India Magazine, has come up with more than 200 questions job applicants are likely to face in an interview. Covering data preparation, statistics, analytics implementation, as well as other crucial topics favored by interviewers, this book:
* Provides 200+ interview questions often asked by recruiters and hiring managers in global corporations
* Offers short and to-the-point answers to the depth required, while looking at the problem from all angles
* Provides a full range of interview questions for jobs ranging from junior analytics to senior data scientists and managers
* Offers analytics professionals a quick reference on topics in analytics
Using a question-and-answer format from start to finish, Interview Questions in Business Analytics: How to Ace Interviews and Get the Job You Want will help you grasp concepts sooner and with deep clarity. The book therefore also serves as a primer on analytics and covers issues relating to business implementation. You will learn about not just the how and what of analytics, but also the why and when. This book will thus ensure that you are well prepared for interviews, putting your dream job well within reach. Business analytics is currently one of the hottest and trendiest areas for technical professionals. With the rise of the profession, there is significant job growth. Even so, it's not easy to get a job in the field, because you need knowledge of subjects such as statistics, databases, and IT services. Candidates must also possess keen business acumen. What's more, employers cast a cold critical eye on all applicants, making the task of getting a job even more difficult.
What You'll Learn
The 200 questions in this book cover such topics as:
* The different types of data used in analytics
* How analytics are put to use in different industries
* The process of hypothesis testing
* Predictive vs. descriptive analytics
* Correlation, regression, segmentation and advanced statistics
* Predictive modeling
Who This Book Is For
Those aspiring to jobs in business analytics, including recent graduates and technical professionals looking for a new or better job. Job interviewers will also find the book helpful in preparing interview questions.
This book provides comprehensive reviews of recent progress in matrix variate and tensor variate data analysis from applied points of view. Matrix and tensor approaches for data analysis are known to be extremely useful for recently emerging complex and high-dimensional data in various applied fields. The reviews contained herein cover recent applications of these methods in psychology (Chap. 1), audio signals (Chap. 2), image analysis from tensor principal component analysis (Chap. 3), image analysis from decomposition (Chap. 4), and genetic data (Chap. 5). Readers will be able to understand the present status of these techniques as applicable to their own fields. In Chapter 5 especially, a theory of tensor normal distributions, which is basic to statistical inference, is developed, and multi-way regression, classification, clustering, and principal component analysis are exemplified under tensor normal distributions. Chapter 6 treats one-sided tests under matrix variate and tensor variate normal distributions, whose theory under multivariate normal distributions has been a popular topic in statistics since the books of Barlow et al. (1972) and Robertson et al. (1988). Chapters 1, 5, and 6 distinguish this book from ordinary engineering books on these topics.
What is the cost of employees today and what will this be in the future? This book explains how to take a data-driven approach to workforce planning and allow the business to reach its strategic goals. Organizational Planning and Analysis (OP&A) is a data-driven approach to workforce planning. It allows HR professionals, OD practitioners and business leaders to monitor an organization's activities and analyse business data to regularly adjust plans to ensure that the business succeeds. This book covers everything from how to build an OP&A function, the difference between strategic and operational workforce planning and how to manage demand and supply through to how to match people to new or changing roles and develop robust succession planning. Organizational Planning and Analysis also covers how OP&A works with HR operations including recruitment, L&D, reward and performance management and includes a chapter on new human capital analytics which allow a business to improve the return on investment for each of its employees. Full of practical advice and step by step guidance, this book is also supported by case studies from organizations including KPMG, Sainsbury's, WPP, Accenture, TSB, Johnson & Johnson, Aer Lingus and FedEx.
Disaster management is a process or strategy that is implemented when any type of catastrophic event takes place. The process may be initiated when anything threatens to disrupt normal operations or puts the lives of human beings at risk. Governments at all levels, as well as many businesses, create some sort of disaster plan that makes it possible to overcome the catastrophe and return to normal function as quickly as possible. Response to natural disasters (e.g., floods, earthquakes) or technological disasters (e.g., nuclear, chemical) is an extremely complex process that involves severe time pressure, various uncertainties, high non-linearity and many stakeholders. Disaster management often requires several autonomous agencies to collaboratively mitigate, prepare for, respond to, and recover from heterogeneous and dynamic sets of hazards to society. Almost all disasters involve a high degree of novelty, with unexpected uncertainties and dynamic time pressures to deal with. Existing studies and approaches within disaster management have mainly focused on specific types of disaster and on particular agencies. There is a lack of a general framework that deals with similarities and synergies among different disasters while taking their specific features into account. This book presents various decision analysis theories and support tools for complex systems in general and for disaster management in particular. The book also grew out of the long-term preparation of a European project proposal among leading experts in the areas related to the book title. Chapters were evaluated on quality and originality in theory and methodology, application orientation, and relevance to the title of the book.
This book constitutes the thoroughly refereed post-conference proceedings of the International Conference on Scalable Information Systems, INFOSCALE 2014, held in September 2014 in Seoul, South Korea. The 9 revised full papers presented were carefully reviewed and selected from 14 submissions. The papers cover a wide range of topics such as scalable data analysis and big data applications.
Big Data Analytics with Spark is a step-by-step guide for learning Spark, which is an open-source fast and general-purpose cluster computing framework for large-scale data analysis. You will learn how to use Spark for different types of big data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. In addition, this book will help you become a much sought-after Spark expert. Spark is one of the hottest Big Data technologies. The amount of data generated today by devices, applications and users is exploding. Therefore, there is a critical need for tools that can analyze large-scale data and unlock value from it. Spark is a powerful technology that meets that need. You can, for example, use Spark to perform low latency computations through the use of efficient caching and iterative algorithms; leverage the features of its shell for easy and interactive data analysis; employ its fast batch processing and low latency features to process your real time data streams and so on. As a result, adoption of Spark is rapidly growing and is replacing Hadoop MapReduce as the technology of choice for big data analytics. This book provides an introduction to Spark and related big-data technologies. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, and MLlib. Big Data Analytics with Spark is therefore written for busy professionals who prefer learning a new technology from a consolidated source instead of spending countless hours on the Internet trying to pick bits and pieces from different sources. The book also provides a chapter on Scala, the hottest functional programming language, and the language that underlies Spark. You'll learn the basics of functional programming in Scala, so that you can write Spark applications in it.
What's more, Big Data Analytics with Spark provides an introduction to other big data technologies that are commonly used along with Spark, like Hive, Avro, Kafka and so on. So the book is self-sufficient; all the technologies that you need to know to use Spark are covered. The only thing that you are expected to know is programming in any language. There is a critical shortage of people with big data expertise, so companies are willing to pay top dollar for people with skills in areas like Spark and Scala. So reading this book and absorbing its principles will provide a boost, possibly a big boost, to your career.