Welcome to Loot.co.za!
Straightforward, clear, and applied, this book will give you the theoretical and practical basis you need to apply data analysis techniques to real data. Combining key statistical concepts with detailed technical advice, it addresses common themes and problems presented by real research, and shows you how to adjust your techniques and apply your statistical knowledge to a range of datasets. It also embeds code and software output throughout and is supported by online resources to enable practice and safe experimentation. The book includes: * Original case studies and data sets * Practical exercises and lists of commands for each chapter * Downloadable Stata programmes created to work alongside chapters * A wide range of detailed applications using Stata * Step-by-step guidance on writing the relevant code. This is the perfect text for anyone doing statistical research in the social sciences who is getting started with Stata for data analysis.
The statistical analyses that students of the life-sciences are being expected to perform are becoming increasingly advanced. Whether at the undergraduate, graduate, or post-graduate level, this book provides the tools needed to properly analyze your data in an efficient, accessible, plainspoken, frank, and occasionally humorous manner, ensuring that readers come away with the knowledge of which analyses they should use and when they should use them. The book uses the statistical language R, which is the choice of ecologists worldwide and is rapidly becoming the 'go-to' stats program throughout the life-sciences. Furthermore, by using a single, real-world dataset throughout the book, readers are encouraged to become deeply familiar with an imperfect but realistic set of data. Indeed, early chapters are specifically designed to teach basic data manipulation skills and build good habits in preparation for learning more advanced analyses. This approach also demonstrates the importance of viewing data through different lenses, facilitating an easy and natural progression from linear and generalized linear models through to mixed effects versions of those same analyses. Readers will also learn advanced plotting and data-wrangling techniques, and gain an introduction to writing their own functions. Applied Statistics with R is suitable for senior undergraduate and graduate students, professional researchers, and practitioners throughout the life-sciences, whether in the fields of ecology, evolution, environmental studies, or computational biology.
For problems that require extensive computation, a C++ program can race through billions of examples faster than most other computing choices. C++ enables mathematicians of virtually any discipline to create programs to meet their needs quickly, and is available on most computer systems at no cost. C++ for Mathematicians: An Introduction for Students and Professionals accentuates C++ concepts that are most valuable for pure and applied mathematical research. This is the first book available on C++ programming that is written specifically for a mathematical audience; it omits the language's more obscure features in favor of the aspects of greatest utility for mathematical work. The author explains how to use C++ to formulate conjectures, create images and diagrams, verify proofs, build mathematical structures, and explore myriad examples. Emphasizing the essential role of practice as part of the learning process, the book is ideally designed for undergraduate coursework as well as self-study. Each chapter provides many problems and solutions which complement the text and enable you to learn quickly how to apply them to your own problems. Accompanying downloadable resources provide all numbered programs so that readers can easily use or adapt the code as needed. Presenting clear explanations and examples from the world of mathematics that develop concepts from the ground up, C++ for Mathematicians can be used again and again as a resource for applying C++ to problems that range from the basic to the complex.
Do you want to create data analysis reports without writing a line of code? This book introduces SAS Studio, a browser-based data science product that is free for educational and non-commercial purposes. The power of SAS Studio comes from its visual point-and-click user interface that generates SAS code. It is easier to learn SAS Studio than to learn R and Python to accomplish data cleaning, statistics, and visualization tasks. The book includes a case study about analyzing the data required for predicting the results of presidential elections in the state of Maine for 2016 and 2020. In addition to the presidential elections, the book provides real-life examples including analyzing stocks, oil and gold prices, crime, marketing, and healthcare. You will see data science in action and how easy it is to perform complicated tasks and visualizations in SAS Studio. You will learn, step-by-step, how to do visualizations, including maps. In most cases, you will not need a line of code as you work with the SAS Studio graphical user interface. The book includes explanations of the code that SAS Studio generates automatically. You will learn how to edit this code to perform more advanced tasks. The book introduces you to multiple SAS products such as SAS Viya, SAS Analytics, and SAS Visual Statistics. What You Will Learn: * Become familiar with the SAS Studio IDE * Understand essential visualizations * Know the fundamental statistical analysis required in most data science and analytics reports * Clean the most common data set problems * Use linear regression for data prediction * Write programs in SAS * Get introduced to SAS Viya, which is more powerful than SAS Studio. Who This Book Is For: A general audience of people who are new to data science, students, and data analysts and scientists who are experienced but new to SAS. No programming or in-depth statistics knowledge is needed.
Clinical Data Quality Checks for CDISC Compliance using SAS is the first book focused on identifying and correcting data quality and CDISC compliance issues with real-world innovative SAS programming techniques such as Proc SQL, metadata and macro programming. Learn to master Proc SQL's subqueries and summary functions for multi-tasking processes. Drawing on his more than 25 years' experience in the pharmaceutical industry, the author provides a unique approach that empowers SAS programmers to take control of data quality and CDISC compliance. This book helps you create a system of SDTM and ADaM checks that can be tracked for continuous improvement. How often have you encountered issues such as missing required variables, duplicate records, invalid derived variables and an invalid sequence of two dates? With the SAS programming techniques introduced in this book, you can start to monitor these and more complex data and CDISC compliance issues. With increased standardization in SDTM and ADaM specifications and data values, codelist dictionaries can be created for better organization, planning and maintenance. This book includes a SAS program to create Excel files containing unique values from all SDTM and ADaM variables as columns. In addition, another SAS program compares SDTM and ADaM codelist dictionaries with codelists from define.xml specifications. Having tools to automate this process saves a great deal of time compared with doing it manually. Features: * SDTMs and ADaMs Vitals * SDTMs and ADaMs Data * CDISC Specifications Compliance * CDISC Data Compliance * Protocol Compliance * Codelist Dictionary Compliance
A Tour of Data Science: Learn R and Python in Parallel covers the fundamentals of data science, including programming, statistics, optimization, and machine learning in a single short book. It does not cover everything, but rather teaches the key concepts and topics in data science. It also covers two of the most popular programming languages used in data science, R and Python, in one source. Key features: * Allows you to learn R and Python in parallel * Covers statistics, programming, optimization and predictive modelling, and the popular data manipulation tools data.table and pandas * Provides a concise and accessible presentation * Includes machine learning algorithms implemented from scratch: linear regression, lasso, ridge, logistic regression, gradient boosting trees, etc. The book will appeal to data scientists, statisticians, quantitative analysts, and others who want to learn programming with R and Python from a data science perspective.
This compact course is written for the mathematically literate reader who wants to learn to analyze data in a principled fashion. The language of mathematics enables clear exposition that can go quite deep, quite quickly, and naturally supports an axiomatic and inductive approach to data analysis. Starting with a good grounding in probability, the reader moves to statistical inference via topics of great practical importance - simulation and sampling, as well as experimental design and data collection - that are typically displaced from introductory accounts. The core of the book then covers both standard methods and such advanced topics as multiple testing, meta-analysis, and causal inference.
An Introduction to Survival Analysis Using Stata, Revised Third Edition is the ideal tutorial for professional data analysts who want to learn survival analysis for the first time or who are well versed in survival analysis but are not as dexterous in using Stata to analyze survival data. This text also serves as a valuable reference to those readers who already have experience using Stata's survival analysis routines. The revised third edition has been updated for Stata 14, and it includes a new section on predictive margins and marginal effects, which demonstrates how to obtain and visualize marginal predictions and marginal effects using the margins and marginsplot commands after survival regression models. Survival analysis is a field of its own that requires specialized data management and analysis procedures. To meet this requirement, Stata provides the st family of commands for organizing and summarizing survival data. This book provides statistical theory, step-by-step procedures for analyzing survival data, an in-depth usage guide for Stata's most widely used st commands, and a collection of tips for using Stata to analyze survival data and to present the results. This book develops from first principles the statistical concepts unique to survival data and assumes only a knowledge of basic probability and statistics and a working knowledge of Stata. The first three chapters of the text cover basic theoretical concepts: hazard functions, cumulative hazard functions, and their interpretations; survivor functions; hazard models; and a comparison of nonparametric, semiparametric, and parametric methodologies. Chapter 4 deals with censoring and truncation. The next three chapters cover the formatting, manipulation, stsetting, and error checking involved in preparing survival data for analysis using Stata's st analysis commands. 
Chapter 8 covers nonparametric methods, including the Kaplan-Meier and Nelson-Aalen estimators and the various nonparametric tests for the equality of survival experience. Chapters 9-11 discuss Cox regression and include various examples of fitting a Cox model, obtaining predictions, interpreting results, building models, model diagnostics, and regression with survey data. The next four chapters cover parametric models, which are fit using Stata's streg command. These chapters include detailed derivations of all six parametric models currently supported in Stata and methods for determining which model is appropriate, as well as information on stratification, obtaining predictions, and advanced topics such as frailty models. Chapter 16 is devoted to power and sample-size calculations for survival studies. The final chapter covers survival analysis in the presence of competing risks.
Mathematical Statistics with Applications in R, Third Edition, offers a modern calculus-based theoretical introduction to mathematical statistics and applications. The book covers many modern statistical computational and simulation concepts that are not covered in other texts, such as the jackknife, bootstrap methods, the EM algorithm, and Markov chain Monte Carlo (MCMC) methods, such as the Metropolis algorithm, the Metropolis-Hastings algorithm and the Gibbs sampler. By combining discussion on the theory of statistics with a wealth of real-world applications, the book helps students to approach statistical problem-solving in a logical manner. Step-by-step procedures for solving real problems make the topics very accessible.
Chunyan Li is a course instructor with many years of experience in teaching time series analysis. His book is essential for students and researchers in oceanography and other subjects in the Earth sciences who are looking for complete coverage of the theory and practice of time series data analysis using MATLAB. This textbook covers the topic's core theory in depth, and provides numerous instructional examples, many drawn directly from the author's own teaching experience, using data files, examples, and exercises. The book explores many concepts, including time; distance on Earth; wind, current, and wave data formats; finding a subset of ship-based data along planned or random transects; error propagation; Taylor series expansion for error estimates; the least squares method; base functions and linear independence of base functions; tidal harmonic analysis; Fourier series and the generalized Fourier transform; filtering techniques; sampling theorems; finite sampling effects; wavelet analysis; and EOF analysis.
Monte Carlo statistical methods, particularly those based on Markov chains, are now an essential component of the standard set of techniques used by statisticians. This new edition has been revised towards a coherent and flowing coverage of these simulation techniques, with incorporation of the most recent developments in the field. In particular, the introductory coverage of random variable generation has been totally revised, with many concepts being unified through a fundamental theorem of simulation. There are five completely new chapters that cover Monte Carlo control, reversible jump, slice sampling, sequential Monte Carlo, and perfect sampling. There is a more in-depth coverage of Gibbs sampling, which is now contained in three consecutive chapters. The development of Gibbs sampling starts with slice sampling and its connection with the fundamental theorem of simulation, and builds up to two-stage Gibbs sampling and its theoretical properties. A third chapter covers the multi-stage Gibbs sampler and its variety of applications. Lastly, chapters from the previous edition have been revised towards easier access, with the examples getting more detailed coverage. This textbook is intended for a second year graduate course, but will also be useful to someone who either wants to apply simulation techniques for the resolution of practical problems or wishes to grasp the fundamental principles behind those methods. The authors do not assume familiarity with Monte Carlo techniques (such as random variable generation), with computer programming, or with any Markov chain theory (the necessary concepts are developed in Chapter 6). A solutions manual, which covers approximately 40% of the problems, is available for instructors who require the book for a course. Christian P. Robert is Professor of Statistics in the Applied Mathematics Department at Université Paris Dauphine, France.
He is also Head of the Statistics Laboratory at the Center for Research in Economics and Statistics (CREST) of the National Institute for Statistics and Economic Studies (INSEE) in Paris, and Adjunct Professor at Ecole Polytechnique. He has written three other books, including The Bayesian Choice, Second Edition, Springer 2001. He also edited Discretization and MCMC Convergence Assessment, Springer 1998. He has served as associate editor for the Annals of Statistics and the Journal of the American Statistical Association. He is a fellow of the Institute of Mathematical Statistics, and a winner of the Young Statistician Award of the Société de Statistique de Paris in 1995. George Casella is Distinguished Professor and Chair, Department of Statistics, University of Florida. He has served as the Theory and Methods Editor of the Journal of the American Statistical Association and Executive Editor of Statistical Science. He has authored three other textbooks: Statistical Inference, Second Edition, 2001, with Roger L. Berger; Theory of Point Estimation, 1998, with Erich Lehmann; and Variance Components, 1992, with Shayle R. Searle and Charles E. McCulloch. He is a fellow of the Institute of Mathematical Statistics and the American Statistical Association, and an elected fellow of the International Statistical Institute.
Presents the main ideas of computer-intensive statistical methods. Gives the algorithms for all the methods. Uses various plots and illustrations to explain the main ideas. Features the theoretical background of the main methods. Includes R code for the methods and examples.
This book presents two new methods for decomposing a time series into intrinsic components of low and high frequencies. The methods are based on the Singular Value Decomposition (SVD) of a Hankel matrix (HSVD). The proposed decomposition is used to improve the accuracy of linear and nonlinear auto-regressive models. Linear auto-regressive models (AR, ARMA and ARIMA) and Auto-regressive Neural Networks (ANNs) have been found insufficient because of the highly complicated nature of some time series. Hybrid models, which combine pre-processing techniques with conventional forecasters, are a recent solution for dealing with non-stationary processes; some broadly implemented pre-processing techniques are Singular Spectrum Analysis (SSA) and Stationary Wavelet Transform (SWT). Although the flexibility of SSA and SWT allows their usage in a wide range of forecast problems, there is a lack of standard methods to select their parameters. The proposed decompositions, HSVD and Multilevel SVD, are described in detail through time series coming from the transport and fishery sectors. Further, for comparison purposes, the forecast accuracy reached by SSA and SWT, both jointly with AR-based models and ANNs, is evaluated.
Hands-on Machine Learning with R provides a practical and applied approach to learning and developing intuition into today's most popular machine learning methods. This book serves as a practitioner's guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R, which includes using various R packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from their data. The book favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory. Throughout this book, the reader will be exposed to the entire machine learning process including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. The reader will be exposed to powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more! By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, understand when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R's machine learning stack and be able to implement a systematic approach for producing high quality modeling results. Features: * Offers a practical and applied introduction to the most popular machine learning methods. * Topics covered include feature engineering, resampling, deep learning and more. * Uses a hands-on approach and real-world data.
This textbook on practical data analytics unites fundamental principles, algorithms, and data. Algorithms are the keystone of data analytics and the focal point of this textbook. Clear and intuitive explanations of the mathematical and statistical foundations make the algorithms transparent. But practical data analytics requires more than just the foundations. Problems and data are enormously variable and only the most elementary of algorithms can be used without modification. Programming fluency and experience with real and challenging data are indispensable, and so the reader is immersed in Python and R and real data analysis. By the end of the book, the reader will have gained the ability to adapt algorithms to new problems and carry out innovative analyses. This book has three parts: (a) Data Reduction: Begins with the concepts of data reduction, data maps, and information extraction. The second chapter introduces associative statistics, the mathematical foundation of scalable algorithms and distributed computing. Practical aspects of distributed computing are the subject of the Hadoop and MapReduce chapter. (b) Extracting Information from Data: Linear regression and data visualization are the principal topics of Part II. The authors dedicate a chapter to the critical domain of Healthcare Analytics for an extended example of practical data analytics. The algorithms and analytics will be of much interest to practitioners interested in utilizing the large and unwieldy data sets of the Centers for Disease Control and Prevention's Behavioral Risk Factor Surveillance System. (c) Predictive Analytics: Two foundational and widely used algorithms, k-nearest neighbors and naive Bayes, are developed in detail. A chapter is dedicated to forecasting. The last chapter focuses on streaming data and uses publicly accessible data streams originating from the Twitter API and the NASDAQ stock market in the tutorials.
This book is intended for a one- or two-semester course in data analytics for upper-division undergraduate and graduate students in mathematics, statistics, and computer science. The prerequisites are kept low, and students with one or two courses in probability or statistics, an exposure to vectors and matrices, and a programming course will have no difficulty. The core material of every chapter is accessible to all with these prerequisites. The chapters often expand at the close with innovations of interest to practitioners of data science. Each chapter includes exercises of varying levels of difficulty. The text is eminently suitable for self-study and an exceptional resource for practitioners.
"This book is a great way to both start learning data science through the promising Julia language and to become an efficient data scientist." - Professor Charles Bouveyron, INRIA Chair in Data Science, Université Côte d'Azur, Nice, France. Julia, an open-source programming language, was created to be as easy to use as languages such as R and Python while also as fast as C and Fortran. An accessible, intuitive, and highly efficient base language, with speed that exceeds R and Python, makes Julia a formidable language for data science. Using well-known data science methods that will motivate the reader, Data Science with Julia will get readers up to speed on key features of the Julia language and illustrate its facilities for data science and machine learning work. Features: * Covers the core components of Julia as well as packages relevant to the input, manipulation and representation of data. * Discusses several important topics in data science including supervised and unsupervised learning. * Reviews data visualization using the Gadfly package, which was designed to emulate the very popular ggplot2 package in R. Readers will learn how to make many common plots and how to visualize model results. * Presents how to optimize Julia code for performance. * Will be an ideal source for people who already know R and want to learn how to use Julia (though no previous knowledge of R or any other programming language is required). The advantages of Julia for data science cannot be overstated. Besides speed and ease of use, there are already over 1,900 packages available and Julia can interface (either directly or through packages) with libraries written in R, Python, Matlab, C, C++ or Fortran. The book is for senior undergraduates, beginning graduate students, or practicing data scientists who want to learn how to use Julia for data science.
Reproducible Finance with R: Code Flows and Shiny Apps for Portfolio Analysis is a unique introduction to data science for investment management that explores the three major R/finance coding paradigms, emphasizes data visualization, and explains how to build a cohesive suite of functioning Shiny applications. The full source code, asset price data and live Shiny applications are available at reproduciblefinance.com. The ideal reader works in finance or wants to work in finance and has a desire to learn R code and Shiny through simple, yet practical real-world examples. The book begins with the first step in data science: importing and wrangling data, which in the investment context means importing asset prices, converting to returns, and constructing a portfolio. The next section covers risk and tackles descriptive statistics such as standard deviation, skewness, kurtosis, and their rolling histories. The third section focuses on portfolio theory, analyzing the Sharpe Ratio, CAPM, and Fama French models. The book concludes with applications for finding individual asset contribution to risk and for running Monte Carlo simulations. For each of these tasks, the three major coding paradigms are explored and the work is wrapped into interactive Shiny dashboards.
Compositional Data Analysis in Practice is a user-oriented practical guide to the analysis of data with the property of a constant sum, for example percentages adding up to 100%. Compositional data can give misleading results if regular statistical methods are applied, and are best analysed by first transforming them to logarithms of ratios. This book explains how this transformation affects the analysis, results and interpretation of this very special type of data. All aspects of compositional data analysis are considered: visualization, modelling, dimension-reduction, clustering and variable selection, with many examples in the fields of food science, archaeology, sociology and biochemistry, and a final chapter containing a complete case study using fatty acid compositions in ecology. The applicability of these methods extends to other fields such as linguistics, geochemistry, marketing, economics and finance. R software: data files and R scripts from the book are available in the repository https://github.com/michaelgreenacre/CODAinPractice. The R package easyCODA, which accompanies this book, is available on CRAN; note that you should have version 0.25 or higher. The latest version of the package will always be available on R-Forge and can be installed from R with this instruction: install.packages("easyCODA", repos="http://R-Forge.R-project.org").
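As a rough illustration of the log-ratio idea mentioned in the description above, the following minimal sketch applies one common log-ratio transform, the centered log-ratio (CLR), to a made-up three-part composition. It is written in Python with NumPy rather than with the book's easyCODA package, and the function name `clr` and the example values are assumptions chosen for illustration, not material from the book.

```python
import numpy as np


def clr(composition):
    """Centered log-ratio transform of one strictly positive composition.

    Each part is divided by the geometric mean of all parts and then logged,
    which is the same as subtracting the mean of the logs from each log.
    """
    x = np.asarray(composition, dtype=float)
    if np.any(x <= 0):
        raise ValueError("log-ratios need strictly positive parts")
    log_x = np.log(x)
    return log_x - log_x.mean()


# Hypothetical composition, expressed once as percentages and once as proportions.
percentages = np.array([60.0, 30.0, 10.0])
proportions = percentages / 100.0

clr_percent = clr(percentages)
clr_prop = clr(proportions)

# The constant-sum scale cancels out: both versions give the same coordinates.
print(clr_percent)
print(np.allclose(clr_percent, clr_prop))
```

The transformed values sum to zero and do not depend on whether the parts were recorded as percentages or proportions, which is one reason ratios are a natural starting point for data with a constant sum.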
Sufficient dimension reduction is a rapidly developing research field that has wide applications in regression diagnostics, data visualization, machine learning, genomics, image processing, pattern recognition, and medicine, all of which produce large datasets with a large number of variables. Sufficient Dimension Reduction: Methods and Applications with R introduces the basic theories and the main methodologies, provides practical and easy-to-use algorithms and computer codes to implement these methodologies, and surveys the recent advances at the frontiers of this field. Features: * Provides comprehensive coverage of this emerging research field. * Synthesizes a wide variety of dimension reduction methods under a few unifying principles such as projection in Hilbert spaces, kernel mapping, and von Mises expansion. * Reflects most recent advances such as nonlinear sufficient dimension reduction, dimension folding for tensorial data, as well as sufficient dimension reduction for functional data. * Includes a set of computer codes written in R that are easily implemented by the readers. * Uses real data sets available online to illustrate the usage and power of the described methods. Sufficient dimension reduction has undergone momentous development in recent years, partly due to the increased demands for techniques to process high-dimensional data, a hallmark of our age of Big Data. This book will serve as the perfect entry into the field for beginning researchers and a handy reference for advanced ones. The author Bing Li obtained his Ph.D. from the University of Chicago. He is currently a Professor of Statistics at the Pennsylvania State University. His research interests cover sufficient dimension reduction, statistical graphical models, functional data analysis, machine learning, estimating equations and quasilikelihood, and robust statistics. He is a fellow of the Institute of Mathematical Statistics and the American Statistical Association.
He is an Associate Editor for The Annals of Statistics and the Journal of the American Statistical Association.
After the fundamental volume and the advanced technique volume, this volume focuses on R applications in the quantitative investment area. Quantitative investment has been hot for some years, and there are more and more startups working on it, combined with many other internet communities and business models. R is widely used in this area, and can be a very powerful tool. The author introduces R applications with cases from his own startup, covering topics like portfolio optimization and risk management.
This book discusses all major topics in survey sampling and estimation. It covers traditional as well as advanced sampling methods related to spatial populations. The book presents real-world applications of major sampling methods and illustrates them with the R software. As a large sample size is not cost-efficient, this book introduces a new method that uses domain knowledge of the negative correlation between the variable of interest and the auxiliary variable in order to control the size of a sample. In addition, the book focuses on adaptive cluster sampling, ranked set sampling and their applications in real life. The advanced methods discussed in the book have tremendous applications in ecology, environmental science, health science, forestry, bio-sciences, and humanities. This book is intended as a text for undergraduate and graduate students of statistics, as well as researchers in various disciplines.
The Workflow of Data Analysis Using Stata, by J. Scott Long, is an essential productivity tool for data analysts. Long presents lessons gained from his experience and demonstrates how to design and implement efficient workflows for both one-person projects and team projects. After introducing workflows and explaining how a better workflow can make it easier to work with data, Long describes planning, organizing, and documenting your work. He then introduces how to write and debug Stata do-files and how to use local and global macros. After a discussion of conventions that greatly simplify data analysis, the author covers cleaning, analyzing, and protecting data.
This book shows the capabilities of Microsoft Excel in teaching marketing statistics effectively. It is a step-by-step, exercise-driven guide for students and practitioners who need to master Excel to solve practical marketing problems. If understanding statistics isn't your strongest suit, you are not especially mathematically inclined, or if you are wary of computers, this is the right book for you. Excel, a widely available computer program for students and managers, is also an effective teaching and learning tool for quantitative analyses in marketing courses. Its powerful computational ability and graphical functions make learning statistics much easier than in years past. Excel 2019 for Marketing Statistics: A Guide to Solving Practical Problems capitalizes on these improvements by teaching students and managers how to apply Excel to statistical techniques necessary in their courses and work. In this new edition, each chapter explains statistical formulas and directs the reader to use Excel commands to solve specific, easy-to-understand marketing problems. Practice problems are provided at the end of each chapter with their solutions in an appendix. Separately, there is a full practice test (with answers in an appendix) that allows readers to test what they have learned.
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, deep learning, survival analysis, multiple testing, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra. This Second Edition features new chapters on deep learning, survival analysis, and multiple testing, as well as expanded treatments of naive Bayes, generalized linear models, Bayesian additive regression trees, and matrix completion. R code has been updated throughout to ensure compatibility.
R is rapidly becoming the standard software for statistical analyses, graphical presentation of data, and programming in the natural, physical, social, and engineering sciences. Getting Started with R is now the go-to introductory guide for biologists wanting to learn how to use R in their research. It teaches readers how to import, explore, graph, and analyse data, while keeping them focused on their ultimate goals: clearly communicating their data in oral presentations, posters, papers, and reports. It provides a consistent workflow for using R that is simple, efficient, reliable, and reproducible. This second edition has been updated and expanded while retaining the concise and engaging nature of its predecessor, offering an accessible and fun introduction to the packages dplyr and ggplot2 for data manipulation and graphing. It expands the set of basic statistics considered in the first edition to include new examples of a simple regression, a one-way and a two-way ANOVA. Finally, it introduces a new chapter on the generalised linear model. Getting Started with R is suitable for undergraduates, graduate students, professional researchers, and practitioners in the biological sciences.
You may like...
Essential Java for Scientists and…
Brian Hahn, Katherine Malan
Paperback
R1,266
Discovery Miles 12 660
Mathematical Modeling for Smart…
Debabrata Samanta, Debabrata Singh
Hardcover
R11,427
Discovery Miles 114 270
The Little SAS Enterprise Guide Book
Susan J Slaughter, Lora D Delwiche
Hardcover
R1,790
Discovery Miles 17 900
Jump into JMP Scripting, Second Edition…
Wendy Murphrey, Rosemary Lucas
Hardcover
R1,530
Discovery Miles 15 300
Portfolio and Investment Analysis with…
John B. Guerard, Ziwei Wang, …
Hardcover
R2,322
Discovery Miles 23 220