Accelerate computations and make the most of your data effectively
and efficiently on Databricks

Key Features
Understand Spark optimizations for big data workloads and maximize performance
Build efficient big data engineering pipelines with Databricks and Delta Lake
Efficiently manage Spark clusters for big data processing

Book Description
Databricks is an industry-leading,
cloud-based platform for data analytics, data science, and data
engineering supporting thousands of organizations across the world
in their data journey. It is a fast, easy, and collaborative Apache
Spark-based big data analytics platform for data science and data
engineering in the cloud. In Optimizing Databricks Workloads, you
will get started with a brief introduction to Azure Databricks and
quickly begin to understand the important optimization techniques.
The book covers how to select the optimal Spark cluster
configuration for running big data processing and workloads in
Databricks, some very useful optimization techniques for Spark
DataFrames, best practices for optimizing Delta Lake, and
techniques to optimize Spark jobs through Spark core. You will also
learn about real-world scenarios
where optimizing workloads in Databricks has helped organizations
increase performance and save costs across various domains. By the
end of this book, you will be prepared with the necessary toolkit
to speed up your Spark jobs and process your data more efficiently.

What you will learn
Get to grips with Spark fundamentals and the Databricks platform
Process big data using the Spark DataFrame API with Delta Lake
Analyze data using graph processing in Databricks
Use MLflow to manage machine learning life cycles in Databricks
Find out how to choose the right cluster configuration for your workloads
Explore file compaction and clustering methods to tune Delta tables
Discover advanced optimization techniques to speed up Spark jobs

Who this book is for
This book is for data engineers, data scientists, and cloud architects
who have a working knowledge of Spark/Databricks and a basic
understanding of data engineering principles. Readers will need a
working knowledge of Python, and some experience with SQL in PySpark
and Spark SQL is beneficial.