Scale your data using your existing Python APIs and data structures with the help of Dask clusters.

Key Features
- Build and run your ETL pipeline with Dask delayed and analyze your data
- Translate a scikit-learn workflow to Dask and perform hyperparameter tuning
- Model a Dask cluster on major cloud providers such as AWS, Azure, and GCP

Book Description
Data scientists and machine learning engineers are used to building prototypes in pandas, NumPy, and scikit-learn, but this approach is likely to fail as the data grows or when the code moves to production. Machine Learning and Data Analysis with Dask shows you how Dask can help you tackle this challenge by using existing Python APIs and data structures, so you don't have to completely rewrite your code or retrain to scale up.

The book starts with an introduction to Dask and covers the fundamentals of distributed computation as well as the advantages and possible disadvantages of using Dask. You'll then discover how to build an extract, transform, and load (ETL) pipeline with Dask delayed and compare its flexibility to multithreading/multiprocessing when working on a single machine. The book further demonstrates how to analyze data with Dask arrays and DataFrames. Later, you'll explore how to distribute Python and R code with Dask and build a machine learning model with Dask-ML. You'll also learn how to run a parameter search a hundred times faster than on a single machine and get to grips with the basics of RAPIDS. Finally, you'll develop Dask clusters on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).

By the end of this book, you will have learned how to use Dask for both research and production.
What You Will Learn
- Distribute computation both locally and on a cluster
- Scale and analyze machine learning algorithms on a cluster
- Create and manage clusters on major cloud providers
- Explore distributed computation and translate the usual pandas/scikit-learn workflow to Dask for analytics
- Manage a massive amount of data effectively and keep cloud costs under control
- Build a machine learning model step by step using Dask to process a huge amount of data

Who This Book Is For
This data analysis and machine learning book is for data scientists, ML engineers, and Python users who want to distribute their code using Dask. Beginner-level experience with Python, pandas, and NumPy will help you get the best out of this book.