This book presents a collection of articles written by Big Data experts describing cutting-edge methods and applications from their respective areas of interest, and provides the reader with a detailed overview of the field of Big Data Analytics as it is practiced today. The chapters cover technical aspects of key areas that generate and use Big Data, such as management and finance; medicine and healthcare; genome, cytome and microbiome; graphs and networks; Internet of Things; Big Data standards; benchmarking of systems; and others. In addition to these applications, key algorithmic approaches such as graph partitioning, clustering and finite mixture modelling of high-dimensional data are also covered. The varied collection of themes in this volume introduces the reader to the richness of the emerging field of Big Data Analytics.
Reproducible Finance with R: Code Flows and Shiny Apps for Portfolio Analysis is a unique introduction to data science for investment management that explores the three major R/finance coding paradigms, emphasizes data visualization, and explains how to build a cohesive suite of functioning Shiny applications. The full source code, asset price data and live Shiny applications are available at reproduciblefinance.com. The ideal reader works in finance or wants to work in finance and has a desire to learn R code and Shiny through simple, yet practical real-world examples. The book begins with the first step in data science: importing and wrangling data, which in the investment context means importing asset prices, converting to returns, and constructing a portfolio. The next section covers risk and tackles descriptive statistics such as standard deviation, skewness, kurtosis, and their rolling histories. The third section focuses on portfolio theory, analyzing the Sharpe Ratio, CAPM, and Fama French models. The book concludes with applications for finding individual asset contribution to risk and for running Monte Carlo simulations. For each of these tasks, the three major coding paradigms are explored and the work is wrapped into interactive Shiny dashboards.
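As a rough illustration of the workflow this book describes (importing asset prices, converting them to returns, building a portfolio, and computing a Sharpe Ratio), here is a minimal R sketch; the tickers, weights, and the use of the quantmod and PerformanceAnalytics packages are illustrative assumptions, not the book's own code.

```r
# Minimal sketch: import prices -> convert to returns -> build portfolio -> Sharpe Ratio.
# Tickers, weights and packages are assumptions for illustration only.
library(quantmod)
library(PerformanceAnalytics)

symbols <- c("SPY", "EFA", "AGG")                     # hypothetical asset choices
prices  <- do.call(merge, lapply(symbols, function(s)
  Ad(getSymbols(s, from = "2015-01-01", auto.assign = FALSE))))

returns   <- na.omit(Return.calculate(prices, method = "log"))  # daily log returns
weights   <- c(0.5, 0.3, 0.2)                         # hypothetical portfolio weights
portfolio <- Return.portfolio(returns, weights = weights, rebalance_on = "months")

SharpeRatio(portfolio, Rf = 0, FUN = "StdDev")        # risk-adjusted return of the portfolio
```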
This book traces the theory and methodology of multivariate statistical analysis and shows how it can be conducted in practice using the LISREL computer program. It presents not only the typical uses of LISREL, such as confirmatory factor analysis and structural equation models, but also several other multivariate analysis topics, including regression (univariate, multivariate, censored, logistic, and probit), generalized linear models, multilevel analysis, and principal component analysis. It provides numerous examples from several disciplines, and discusses and interprets the results, illustrated with sections of output from the LISREL program, in the context of each example. The book is intended for master's and PhD students and researchers in the social, behavioral, economic and many other sciences who require a basic understanding of multivariate statistical theory and methods for their analysis of multivariate data. It can also be used as a textbook on various topics of multivariate statistical analysis.
This book provides a unified view of a new methodology for Machine Translation (MT). This methodology extracts information from widely available resources (extensive monolingual corpora) while assuming the existence of only a very limited parallel corpus, giving it a starting point that differs from standard Statistical Machine Translation (SMT). A detailed presentation of the methodology's principles and system architecture is followed by a series of experiments in which the proposed system is compared to other MT systems using a set of established metrics, including BLEU, NIST, Meteor and TER. Additionally, free-to-use code is available that allows the creation of new MT systems. The volume is addressed to both language professionals and researchers. Prerequisites for the readers are very limited and include a basic understanding of machine translation as well as of the basic tools of natural language processing.
This book introduces multidimensional scaling (MDS) and unfolding as data analysis techniques for applied researchers. MDS is used for the analysis of proximity data on a set of objects, representing the data as distances between points in a geometric space (usually of two dimensions). Unfolding is a related method that maps preference data (typically evaluative ratings of different persons on a set of objects) as distances between two sets of points (representing the persons and the objects, respectively). This second edition has been completely revised to reflect new developments and the coverage of unfolding has also been substantially expanded. Intended for applied researchers whose main interests are in using these methods as tools for building substantive theories, it discusses numerous applications (classical and recent), highlights practical issues (such as evaluating model fit), presents ways to enforce theoretical expectations for the scaling solutions, and addresses the typical mistakes that MDS/unfolding users tend to make. Further, it shows how MDS and unfolding can be used in practical research work, primarily by using the smacof package in the R environment but also Proxscal in SPSS. It is a valuable resource for psychologists, social scientists, and market researchers with a basic understanding of multivariate statistics (such as multiple regression and factor analysis).
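As a small illustration of the kind of analysis the book covers, the following R sketch fits a two-dimensional MDS solution with the smacof package to R's built-in eurodist distance matrix; it assumes a recent smacof version that provides the mds() interface and is not an example taken from the book.

```r
# Minimal sketch: two-dimensional MDS of inter-city road distances with smacof.
library(smacof)

fit <- mds(eurodist, ndim = 2, type = "ratio")  # fit a ratio MDS in two dimensions
fit$stress                                      # Stress-1 value as a rough badness-of-fit measure
plot(fit)                                       # configuration plot of the cities
```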
This book focuses on statistical methods for the analysis of discrete failure times. Failure time analysis is one of the most important fields in statistical research, with applications affecting a wide range of disciplines, in particular, demography, econometrics, epidemiology and clinical research. Although there are a large variety of statistical methods for failure time analysis, many techniques are designed for failure times that are measured on a continuous scale. In empirical studies, however, failure times are often discrete, either because they have been measured in intervals (e.g., quarterly or yearly) or because they have been rounded or grouped. The book covers well-established methods like life-table analysis and discrete hazard regression models, but also introduces state-of-the-art techniques for model evaluation, nonparametric estimation and variable selection. Throughout, the methods are illustrated by real-life applications, and relationships to survival analysis in continuous time are explained. Each section includes a set of exercises on the respective topics. Various functions and tools for the analysis of discrete survival data are collected in the R package discSurv that accompanies the book.
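For orientation, here is a minimal R sketch of the discrete hazard regression workflow supported by the discSurv package: expand subject-level data to person-period format and fit a binomial GLM. The simulated data and variable names are hypothetical, not taken from the book.

```r
# Minimal sketch of discrete hazard regression with discSurv (simulated toy data).
library(discSurv)

set.seed(1)
n     <- 200
x     <- rnorm(n)                              # a covariate
time  <- sample(1:6, n, replace = TRUE)        # discrete failure/censoring times
event <- rbinom(n, 1, 0.7)                     # 1 = failure observed, 0 = censored
df    <- data.frame(time = time, event = event, x = x)

long <- dataLong(df, "time", "event")          # expand to person-period (long) format
fit  <- glm(y ~ timeInt + x,                   # discrete hazard model as a binomial GLM
            family = binomial(link = "logit"), data = long)
summary(fit)
```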
This textbook examines empirical linguistics from a theoretical linguist's perspective. It provides both a theoretical discussion of what quantitative corpus linguistics entails and detailed, hands-on, step-by-step instructions to implement the techniques in the field. The statistical methodology and R-based coding from this book teach readers the basic and then more advanced skills to work with large data sets in their linguistics research and studies. Massive data sets are now more than ever the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. This book presents much of the methodology in a corpus-based approach. However, the corpus-based methods in this book are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. Material from the book will also be appealing to researchers in digital humanities and the many non-linguistic fields that use textual data analysis and text-based sensorimetrics. Chapters cover topics including corpus processing, frequency data, and clustering methods. Case studies illustrate each chapter with accompanying data sets, R code, and exercises for use by readers. This book may be used in advanced undergraduate courses, graduate courses, and self-study.
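By way of illustration, a word-frequency table of the kind used throughout corpus work can be built in a few lines of base R; the tiny inline "corpus" below is purely illustrative and not material from the book.

```r
# Minimal sketch: tokenise a toy corpus and count word-form frequencies in base R.
corpus <- c("the cat sat on the mat", "the dog sat on the log")
words  <- unlist(strsplit(tolower(corpus), "[^a-z]+"))  # lowercase and split on non-letters
words  <- words[nzchar(words)]                          # drop empty tokens
freq   <- sort(table(words), decreasing = TRUE)
head(freq, 5)                                           # most frequent word forms
```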
This edited book focuses on the latest developments in classification, statistical learning, data analysis and related areas of data science, including statistical analysis of large datasets, big data analytics, time series clustering, integration of data from different sources, as well as social networks. It covers both methodological aspects and applications to a wide range of areas such as economics, marketing, education, social sciences, medicine, environmental sciences and the pharmaceutical industry. In addition, it describes the basic features of the software behind the data analysis results, and provides links to the corresponding codes and data sets where necessary. This book is intended for researchers and practitioners who are interested in the latest developments and applications in the field. The peer-reviewed contributions were presented at the 10th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society, held in Santa Margherita di Pula (Cagliari), Italy, October 8-10, 2015.
This edited volume on the latest advances in data science covers a wide range of topics in the context of data analysis and classification. In particular, it includes contributions on classification methods for high-dimensional data, clustering methods, multivariate statistical methods, and various applications. The book gathers a selection of peer-reviewed contributions presented at the Fifteenth Conference of the International Federation of Classification Societies (IFCS2015), which was hosted by the Alma Mater Studiorum, University of Bologna, from July 5 to 8, 2015.
This tutorial teaches you how to use the statistical programming language R to develop a business case simulation and analysis. It presents a methodology for conducting business case analysis that minimizes decision delay by focusing stakeholders on what matters most and suggests pathways for minimizing the risk in strategic and capital allocation decisions. Business case analysis, often conducted in spreadsheets, exposes decision makers to additional risks that arise just from the use of the spreadsheet environment. R has become one of the most widely used tools for reproducible quantitative analysis, and analysts fluent in this language are in high demand. The R language, traditionally used for statistical analysis, provides a more explicit, flexible, and extensible environment than spreadsheets for conducting business case analysis. The main tutorial follows the case in which a chemical manufacturing company considers constructing a chemical reactor and production facility to bring a new compound to market. There are numerous uncertainties and risks involved, including the possibility that a competitor brings a similar product online. The company must determine the value of making the decision to move forward and where they might prioritize their attention to make a more informed and robust decision. While the example used is a chemical company, the analysis structure it presents can be applied to just about any business decision, from IT projects to new product development to commercial real estate. The supporting tutorials include the perspective of the founder of a professional service firm who wants to grow his business and a member of a strategic planning group in a biomedical device company who wants to know how much to budget in order to refine the quality of information about critical uncertainties that might affect the value of a chosen product development pathway.
What You'll Learn:
- Set up a business case abstraction in an influence diagram to communicate the essence of the problem to other stakeholders
- Model the inherent uncertainties in the problem with Monte Carlo simulation using the R language
- Communicate the results graphically
- Draw appropriate insights from the results
- Develop creative decision strategies for thorough opportunity cost analysis
- Calculate the value of information on critical uncertainties between competing decision strategies to set the budget for deeper data analysis
- Construct appropriate information to satisfy the parameters for the Monte Carlo simulation when little or no empirical data are available
Who This Book Is For: Financial analysts, data practitioners, and risk/business professionals; also appropriate for graduate level finance, business, or data science students.
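To give a flavour of the Monte Carlo modelling described above, here is a minimal R sketch of simulating the profit of a hypothetical decision; every distribution and parameter is an illustrative assumption, not the book's chemical-plant case.

```r
# Minimal sketch: Monte Carlo simulation of a simple business decision in base R.
set.seed(42)
n <- 10000

demand     <- rlnorm(n, meanlog = log(50000), sdlog = 0.3)  # units sold per year (assumed)
unit_price <- rnorm(n, mean = 12, sd = 1)                   # selling price per unit (assumed)
unit_cost  <- rnorm(n, mean = 7,  sd = 0.8)                 # variable cost per unit (assumed)
capex      <- 250000                                        # up-front investment (assumed)

profit <- demand * (unit_price - unit_cost) - capex         # simulated one-year profit
mean(profit > 0)                                            # probability the decision creates value
quantile(profit, c(0.1, 0.5, 0.9))                          # spread of outcomes for decision makers
hist(profit, breaks = 50, main = "Simulated profit", xlab = "Profit")
```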
After the fundamental volume and the advanced technique volume, this volume focuses on R applications in the area of quantitative investment. Quantitative investment has attracted growing interest in recent years, and an increasing number of startups work in this field, combining it with other internet communities and business models. R is widely used in this area and can be a very powerful tool. The author introduces R applications with cases from his own startup, covering topics such as portfolio optimization and risk management.
This volume collects the latest methodological and applied contributions on functional, high-dimensional and other complex data, related statistical models and tools, as well as on operator-based statistics. It contains selected and refereed contributions presented at the Fourth International Workshop on Functional and Operatorial Statistics (IWFOS 2017), held in A Coruña, Spain, from 15 to 17 June 2017. The series of IWFOS workshops was initiated by the Working Group on Functional and Operatorial Statistics at the University of Toulouse in 2008. Since then, many of the major advances in functional statistics and related fields have been periodically presented and discussed at the IWFOS workshops.
The book provides a description of the process of health economic evaluation and modelling for cost-effectiveness analysis, particularly from the perspective of a Bayesian statistical approach. Some relevant theory and introductory concepts are presented using practical examples and two running case studies. The book also describes in detail how to perform health economic evaluations using the R package BCEA (Bayesian Cost-Effectiveness Analysis). BCEA can be used to post-process the results of a Bayesian cost-effectiveness model and perform advanced analyses producing standardised and highly customisable outputs. It presents all the features of the package, including its many functions and their practical application, as well as its user-friendly web interface. The book is a valuable resource for statisticians and practitioners working in the field of health economics wanting to simplify and standardise their workflow, for example in the preparation of dossiers in support of marketing authorisation, or academic and scientific publications.
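As a rough sketch of how BCEA post-processes a cost-effectiveness model, the following R code builds hypothetical matrices of simulated effectiveness and cost (rows are posterior draws, columns are interventions) and passes them to bcea(); the numbers are invented for illustration and are not the book's case studies.

```r
# Minimal sketch of post-processing a Bayesian cost-effectiveness model with BCEA.
library(BCEA)

set.seed(1)
eff  <- cbind(rnorm(1000, 0.60, 0.05), rnorm(1000, 0.65, 0.05))  # effectiveness per arm (assumed)
cost <- cbind(rnorm(1000, 5000, 500),  rnorm(1000, 6500, 600))   # cost per arm (assumed)

m <- bcea(eff, cost, ref = 2,
          interventions = c("standard", "new"), Kmax = 50000)
summary(m)        # ICER and expected incremental benefit
ceplane.plot(m)   # cost-effectiveness plane
```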
This book offers a concise and gentle introduction to finite element programming in Python based on the popular FEniCS software library. Using a series of examples, including the Poisson equation, the equations of linear elasticity, the incompressible Navier-Stokes equations, and systems of nonlinear advection-diffusion-reaction equations, it guides readers through the essential steps to quickly solving a PDE in FEniCS, such as how to define a finite element variational problem, how to set boundary conditions, how to solve linear and nonlinear systems, and how to visualize solutions and structure finite element Python programs. This book is open access under a CC BY license.
This guide for practicing statisticians, data scientists, and R users and programmers will teach the essentials of preprocessing data, leveraging the R programming language to easily and quickly turn noisy data into usable pieces of information. Data wrangling, which is also commonly referred to as data munging, transformation, manipulation, janitor work, etc., can be a painstakingly laborious process. Roughly 80% of data analysis is spent on cleaning and preparing data; however, being a prerequisite to the rest of the data analysis workflow (visualization, analysis, reporting), it is essential that one become fluent and efficient in data wrangling techniques. This book will guide the user through the data wrangling process via a step-by-step tutorial approach and provide a solid foundation for working with data in R. The author's goal is to teach the user how to easily wrangle data in order to spend more time on understanding the content of the data. By the end of the book, the user will have learned:
- How to work with different types of data such as numerics, characters, regular expressions, factors, and dates
- The difference between different data structures and how to create, add additional components to, and subset each data structure
- How to acquire and parse data from locations previously inaccessible
- How to develop functions and use loop control structures to reduce code redundancy
- How to use pipe operators to simplify code and make it more readable
- How to reshape the layout of data and manipulate, summarize, and join data sets
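As a brief taste of the wrangling style the book teaches, here is a minimal R sketch using dplyr pipes on a made-up data frame; the sales data and column names are assumptions, not examples from the book.

```r
# Minimal sketch: filter, group, summarise and sort a toy data set with dplyr.
library(dplyr)

sales <- data.frame(
  region  = c("North", "North", "South", "South"),
  year    = c(2019, 2020, 2019, 2020),
  revenue = c(120, NA, 95, 130)
)

sales %>%
  filter(!is.na(revenue)) %>%                                  # drop incomplete records
  group_by(region) %>%
  summarise(mean_revenue = mean(revenue), .groups = "drop") %>%
  arrange(desc(mean_revenue))
```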
Familiarize yourself with MATLAB using this concise, practical tutorial that is focused on writing code to learn concepts. Starting from the basics, this book covers array-based computing, plotting and working with files, numerical computation formalism, and the primary concepts of approximations. Introduction to MATLAB is useful for industry engineers, researchers, and students who are looking for open-source solutions for numerical computation. In this book you will learn by doing, avoiding technical jargon, which makes the concepts easy to learn. First you'll see how to run basic calculations, absorbing technical complexities incrementally as you progress toward advanced topics. Throughout, the language is kept simple to ensure that readers at all levels can grasp the concepts.
What You'll Learn:
- Apply sample code to your engineering or science problems
- Work with MATLAB arrays, functions, and loops
- Use MATLAB's plotting functions for data visualization
- Solve numerical computing and computational engineering problems with a MATLAB case study
Who This Book Is For: Engineers, scientists, researchers, and students who are new to MATLAB. Some prior programming experience would be helpful but not required.
Discusses sf, a new foundational package that defines a new set of classes for working with spatial data in R. Developed as an open source project on GitHub. Written at an introductory level.
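For context, the sketch below shows the kind of workflow sf supports: building point features from coordinates, measuring distances, and buffering in a projected coordinate system. The coordinates and the chosen projection are illustrative assumptions, not examples from the book.

```r
# Minimal sketch of simple features with sf: create points, measure, transform, buffer.
library(sf)

pts <- st_as_sf(
  data.frame(name = c("A", "B"), lon = c(4.9, 5.1), lat = c(52.4, 52.0)),  # made-up coordinates
  coords = c("lon", "lat"), crs = 4326
)

st_distance(pts)                                          # great-circle distances between points
buf <- st_buffer(st_transform(pts, 28992), dist = 1000)   # 1 km buffers in a projected CRS
plot(st_geometry(buf))
```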
This book constitutes the refereed proceedings of the 19th International Conference on Distributed Computer and Communication Networks, DCCN 2016, held in Moscow, Russia, in November 2016. The 50 revised full papers and the 6 revised short papers presented were carefully reviewed and selected from 141 submissions. The papers cover the following topics: computer and communication network architecture optimization; control in computer and communication networks; performance and QoS/QoE evaluation in wireless networks; analytical modeling and simulation of next-generation communications systems; queuing theory and reliability theory applications in computer networks; wireless 4G/5G networks, cm- and mm-wave radio technologies; RFID technology and its application in intelligent transportation networks; Internet of Things, wearables, and applications of distributed information systems; probabilistic and statistical models in information systems; mathematical modeling of high-tech systems; mathematical modeling and control problems; and distributed and cloud computing systems and big data analytics.
This edited volume lays the groundwork for Social Data Science, addressing epistemological issues, methods, technologies, software and applications of data science in the social sciences. It presents data science techniques for the collection, analysis and use of both online and offline new (big) data in social research and related applications. Among others, the individual contributions cover topics like social media, learning analytics, clustering, statistical literacy, recurrence analysis and network analysis. Data science is a multidisciplinary approach based mainly on the methods of statistics and computer science, and its aim is to develop appropriate methodologies for forecasting and decision-making in response to an increasingly complex reality often characterized by large amounts of data (big data) of various types (numeric, ordinal and nominal variables, symbolic data, texts, images, data streams, multi-way data, social networks etc.) and from diverse sources. This book presents selected papers from the international conference on Data Science & Social Research, held in Naples, Italy in February 2016, and will appeal to researchers in the social sciences working in academia as well as in statistical institutes and offices.
This introductory textbook for business statistics teaches statistical analysis and research methods via business case studies and financial data using Excel, Minitab, and SAS. Every chapter in this textbook engages the reader with data on individual stocks, stock indices, options, and futures. One studies and uses statistics to learn how to study, analyze, and understand a data set of particular interest. Some of the more popular statistical programs that have been developed to use statistical and computational methods to analyze data sets are SAS, SPSS, and Minitab. Of those, we look at Minitab and SAS in this textbook. One of the main reasons to use Minitab is that it is the easiest to use among the popular statistical programs. We look at SAS because it is the leading statistical package used in industry. We also utilize the much less costly and ubiquitous Microsoft Excel to do statistical analysis, as the benefits of Excel have become widely recognized in the academic world and its analytical capabilities extend to about 90 percent of the statistical analysis done in the business world. We demonstrate much of our statistical analysis using Excel and double-check the analysis and outcomes using Minitab and SAS, which are also helpful for analytical methods that are not possible or practical to do in Excel.
Intuitive Probability and Random Processes using MATLAB (R) is an introduction to probability and random processes that merges theory with practice. Based on the author's belief that only "hands-on" experience with the material can promote intuitive understanding, the approach is to motivate the need for theory using MATLAB examples, followed by theory and analysis, and finally descriptions of "real-world" examples to acquaint the reader with a wide variety of applications. The latter is intended to answer the usual question "Why do we have to study this?" Other salient features are:
- heavy reliance on computer simulation for illustration and student exercises
- the incorporation of MATLAB programs and code segments
- discussion of discrete random variables followed by continuous random variables to minimize confusion
- summary sections at the beginning of each chapter
- in-line equation explanations
- warnings on common errors and pitfalls
- over 750 problems designed to help the reader assimilate and extend the concepts
Intuitive Probability and Random Processes using MATLAB (R) is intended for undergraduate and first-year graduate students in engineering. The practicing engineer as well as others having the appropriate mathematical background will also benefit from this book.
About the Author: Steven M. Kay is a Professor of Electrical Engineering at the University of Rhode Island and a leading expert in signal processing. He has received the Education Award "for outstanding contributions in education and in writing scholarly books and texts..." from the IEEE Signal Processing society and has been listed as among the 250 most cited researchers in the world in engineering.
This book discusses the problem of model choice when the statistical models are separate, also called nonnested. Chapter 1 provides an introduction, motivating examples and a general overview of the problem. Chapter 2 presents the classical or frequentist approach to the problem as well as several alternative procedures and their properties. Chapter 3 explores the Bayesian approach, the limitations of the classical Bayes factors and the proposed alternative Bayes factors to overcome these limitations. It also discusses a significance Bayesian procedure. Lastly, Chapter 4 examines the pure likelihood approach. Various real-data examples and computer simulations are provided throughout the text.
The R Companion to Elementary Applied Statistics includes traditional applications covered in elementary statistics courses as well as some additional methods that address questions that might arise during or after the application of commonly used methods. Beginning with basic tasks and computations with R, readers are then guided through ways to bring data into R, manipulate the data as needed, perform common statistical computations and elementary exploratory data analysis tasks, prepare customized graphics, and take advantage of R for a wide range of methods that find use in many elementary applications of statistics.
Features:
- Requires no familiarity with R or programming to begin using this book.
- Can be used as a resource for a project-based elementary applied statistics course, or for researchers and professionals who wish to delve more deeply into R.
- Contains an extensive array of examples that illustrate ideas on various ways to use pre-packaged routines, as well as on developing individualized code.
- Presents quite a few methods that may be considered non-traditional, or advanced.
- Includes accompanying carefully documented script files that contain code for all examples presented, and more.
R is a powerful and free product that is gaining popularity across the scientific community in both the professional and academic arenas. Statistical methods discussed in this book are used to introduce the fundamentals of using R functions and provide ideas for developing further skills in writing R code. These ideas are illustrated through an extensive collection of examples.
About the Author: Christopher Hay-Jahans received his Doctor of Arts in mathematics from Idaho State University in 1999. After spending three years at University of South Dakota, he moved to Juneau, Alaska, in 2002 where he has taught a wide range of undergraduate courses at University of Alaska Southeast.
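As a small illustration of the elementary tasks the book covers, the following base-R sketch uses the built-in iris data so that it runs without any external files; it is not drawn from the book's own examples.

```r
# Minimal sketch of elementary applied statistics in base R using the built-in iris data.
summary(iris)                                              # basic descriptive statistics
aggregate(Sepal.Length ~ Species, data = iris, FUN = mean) # group means

# A common elementary inference task: compare two groups
t.test(Sepal.Length ~ Species,
       data = droplevels(subset(iris, Species %in% c("setosa", "versicolor"))))

# A quick customised graphic
hist(iris$Sepal.Length, main = "Sepal length", xlab = "cm", col = "grey80")
```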
A unique feature of this book is its low threshold: the text is simple and at the same time full of self-assessment opportunities. Other distinguishing features are the succinctness of the chapters, at 3 to 6 pages each; the inclusion of the complete command sequences for the statistical methodologies reviewed; and the omission of dull scientific text that would impose an unnecessary burden on busy and jaded professionals. For readers requesting more background, theoretical and mathematical information, a note section with references is included in each chapter. The first edition in 2010 was the first publication of a complete overview of SPSS methodologies for medical and health statistics. Well over 100,000 copies of various chapters were sold within the first year of publication. There were four reasons for a rewrite. First, many important comments from readers urged a rewrite. Second, SPSS has produced many updates and upgrades, with relevant novel and improved methodologies. Third, the authors felt that the chapter texts needed some improvements for better readability: chapters have now been classified according to the outcome data, which helps readers choose an analysis rapidly, and a schematic overview of the data and explanatory graphs have been added. Fourth, current data are increasingly complex, and many important methods for their analysis were missing from the first edition. For that latter purpose, some more advanced methods seemed unavoidable, such as hierarchical loglinear methods, gamma and Tweedie regressions, and random intercept analyses. In order for the contents of the book to remain covered by the title, the authors renamed the book SPSS for Starters and 2nd Levelers. Special care was, nonetheless, taken to keep things as simple as possible, and simple menu commands are given. The arithmetic is still at no more than a high-school level. Step-by-step analyses of the different statistical methodologies are given with the help of 60 SPSS data files available through the internet. Given the lack of time of this busy group of readers, the authors have made every effort to produce a text as succinct as possible.
The objective of Kai Zhang and his research is to assess existing process monitoring and fault detection (PM-FD) methods. His aim is to provide suggestions and guidance for choosing appropriate PM-FD methods, because the performance assessment of PM-FD methods has become an area of interest in both academia and industry. The author first compares basic FD statistics, and then assesses different PM-FD methods for monitoring the key performance indicators of static processes, steady-state dynamic processes and general dynamic processes including transient states. He validates the theoretical developments using both benchmark and real industrial processes.