Leverage the power of MongoDB 4.x to build and administer
fault-tolerant database applications Key Features Master the new
features and capabilities of MongoDB 4.x Implement advanced data
modeling, querying, and administration techniques in MongoDB
Includes rich case studies and best practices followed by expert
MongoDB developers Book Description MongoDB is the best platform for
working with non-relational data and is considered to be the
smartest tool for organizing data in line with business needs. The
recently released MongoDB 4.x supports ACID transactions and makes
the technology an asset for enterprises across the IT and fintech
sectors. This book provides expertise in advanced and niche areas
of managing databases (such as modeling and querying databases)
along with various administration techniques in MongoDB, thereby
helping you become a successful MongoDB expert. The book helps you
understand how the newly added capabilities function with the help
of some interesting examples and large datasets. You will dive
deeper into niche areas such as high-performance configurations,
optimizing SQL statements, configuring large-scale sharded
clusters, and much more. You will also master best practices in
overcoming database failover, and master recovery and backup
procedures for database security. By the end of the book, you will
have gained a practical understanding of administering database
applications both on premises and on the cloud; you will also be
able to scale database applications across all servers. What you
will learn Perform advanced querying techniques such as indexing
and expressions Configure, monitor, and maintain a highly scalable
MongoDB environment Master replication and data sharding to
optimize read/write performance Administer MongoDB-based
applications on premises or on the cloud Integrate MongoDB with big
data sources to process huge amounts of data Deploy MongoDB on
Kubernetes containers Use MongoDB in IoT, mobile, and serverless
environments Who this book is for This book is ideal for MongoDB
developers and database administrators who wish to become
successful MongoDB experts and build scalable and fault-tolerant
applications using MongoDB. It will also be useful for database
professionals who wish to become certified MongoDB professionals.
Some understanding of MongoDB and basic database concepts is
required to get the most out of this book.
Develop modern solutions with Snowflake's unique architecture and
integration capabilities; process bulk and real-time data into a
data lake; and leverage time travel, cloning, and data-sharing
features to optimize data operations Key Features Build and scale
modern data solutions using the all-in-one Snowflake platform
Perform advanced cloud analytics for implementing big data and data
science solutions Make quicker and better-informed business
decisions by uncovering key insights from your data Book
Description Snowflake is a unique cloud-based data warehousing
platform built from scratch to perform data management on the
cloud. This book introduces you to Snowflake's unique architecture,
which places it at the forefront of cloud data warehouses. You'll
explore the compute model available with Snowflake, and find out
how Snowflake allows extensive scaling through the virtual
warehouses. You will then learn how to configure a virtual
warehouse for optimizing cost and performance. Moving on, you'll
get to grips with the data ecosystem and discover how Snowflake
integrates with other technologies for staging and loading data. As
you progress through the chapters, you will leverage Snowflake's
capabilities to process a series of SQL statements using tasks to
build data pipelines and find out how you can create modern data
solutions and pipelines designed to provide high performance and
scalability. You will also get to grips with creating role
hierarchies, adding custom roles, and setting default roles for
users before covering advanced topics such as data sharing,
cloning, and performance optimization. By the end of this Snowflake
book, you will be well-versed in Snowflake's architecture for
building modern analytical solutions and understand best practices
for solving commonly faced problems using practical recipes. What
you will learn Get to grips with data warehousing techniques
aligned with Snowflake's cloud architecture Broaden your skills as
a data warehouse designer to cover the Snowflake ecosystem Transfer
skills from on-premises data warehousing to the Snowflake cloud
analytics platform Optimize performance and costs associated with a
Snowflake solution Stage data on object stores and load it into
Snowflake Secure data and share it efficiently for access Manage
transactions and extend Snowflake using stored procedures Extend
cloud data applications using Spark Connector Who this book is
for This book is for data warehouse developers, data analysts,
database administrators, and anyone involved in designing,
implementing, and optimizing a Snowflake data warehouse. Knowledge
of data warehousing and database and cloud concepts will be useful.
Basic familiarity with Snowflake is beneficial, but not necessary.
Learn how to combine SQL Server's analytics with Azure's
flexibility and hybrid connectivity to achieve industry-leading
performance and manageability for your cloud database. Key Features
Understand platform availability for SQL Server in Azure Explore
the benefits and deployment choices offered by SQL IaaS Get to
grips with deploying SQL Server on the Linux development ecosystem
Book Description Deploying SQL Server on Azure virtual machines
allows you to work on full versions of SQL Server in the cloud
without having to maintain on-premises hardware. The book begins by
introducing you to the SQL portfolio in Azure and takes you through
SQL Server IaaS scenarios, before explaining the factors that you
need to consider while choosing an OS for SQL Server in Azure VMs.
As you progress through the book, you'll explore different VM
options and deployment choices for IaaS and understand platform
availability, migration tools, and best practices in Azure. In
later chapters, you'll learn how to configure storage to achieve
optimized performance. Finally, you'll get to grips with the
concept of Azure Hybrid Benefit and find out how you can use it to
maximize the value of your existing on-premises SQL Server. By the
end of this book, you'll be proficient in administering SQL Server
on Microsoft Azure and leveraging the tools required for its
deployment. What you will learn Choose an operating system for SQL
Server in Azure VMs Use the Azure Management Portal to facilitate
the deployment process Verify connectivity and network latency in the
cloud Configure storage for optimal performance and connectivity
Explore various disaster recovery options for SQL Server in Azure
Optimize SQL Server on Linux Discover how to back up databases to a
URL Who this book is for SQL Server on Azure VMs is for you if you
are a developer, data enthusiast, or anyone who wants to migrate
SQL Server databases to Azure virtual machines. Basic familiarity
with SQL Server and managed identities for Azure resources will be
a plus.
Get up to speed with the new features added to Microsoft SQL Server
2019 Analysis Services and create models to support your business
Key Features Explore tips and tricks to design, develop, and
optimize end-to-end data analytics solutions using Microsoft's
technologies Learn tabular modeling and multi-dimensional cube
design development using real-world examples Implement Analysis
Services to help you make productive business decisions Book
Description SQL Server Analysis Services (SSAS) continues to be a
leading enterprise-scale toolset, enabling customers to deliver
data and analytics across large datasets with great performance.
This book will help you understand MS SQL Server 2019's new
features and improvements, especially when it comes to SSAS. First,
you'll cover a quick overview of SQL Server 2019, learn how to
choose the right analytical model to use, and understand their key
differences. You'll then explore how to create a multi-dimensional
model with SSAS and expand on that model with MDX. Next, you'll
create and deploy a tabular model using Microsoft Visual Studio and
Management Studio. You'll learn when and how to use both tabular
and multi-dimensional model types, how to deploy and configure your
servers to support them, and design principles that are relevant to
each model. The book comes packed with tips and tricks to build
measures, optimize your design, and interact with models using
Excel and Power BI. All this will help you visualize data to gain
useful insights and make better decisions. Finally, you'll discover
practices and tools for securing and maintaining your models once
they are deployed. By the end of this MS SQL Server book, you'll be
able to choose the right model and build and deploy it to support
the analytical needs of your business. What you will learn
Determine the best analytical model using SSAS Cover the core
aspects involved in MDX, including writing your first query
Implement calculated tables and calculation groups (new in version
2019) in DAX Create and deploy tabular and multi-dimensional models
on SQL 2019 Connect and create data visualizations using Excel and
Power BI Implement row-level and other data security methods with
tabular and multi-dimensional models Explore essential concepts and
techniques to scale, manage, and optimize your SSAS solutions Who
this book is for This Microsoft SQL Server book is for BI
professionals and data analysts who are looking for a practical
guide to creating and maintaining tabular and multi-dimensional
models using SQL Server 2019 Analysis Services. A basic working
knowledge of BI solutions such as Power BI and database querying is
required.
Explore the latest Azure ETL techniques both on-premises and in the
cloud using Azure services such as SQL Server Integration Services
(SSIS), Azure Data Factory, and Azure Databricks Key Features
Understand the key components of an ETL solution using Azure
Integration Services Discover the common and not-so-common
challenges faced while creating modern and scalable ETL solutions
Program and extend your packages to develop efficient data
integration and data transformation solutions Book Description ETL
is one of the most common and tedious procedures for moving and
processing data from one database to another. With the help of this
book, you will be able to speed up the process by designing
effective ETL solutions using the Azure services available for
handling and transforming any data to suit your requirements. With
this cookbook, you'll become well versed in all the features of SQL
Server Integration Services (SSIS) to perform data migration and
ETL tasks that integrate with Azure. You'll learn how to transform
data in Azure and understand how legacy systems perform ETL
on-premises using SSIS. Later chapters will get you up to speed
with connecting and retrieving data from SQL Server 2019 Big Data
Clusters, and even show you how to extend and customize the SSIS
toolbox using custom-developed tasks and transforms. This ETL book
also contains practical recipes for moving and transforming data
with Azure services, such as Data Factory and Azure Databricks, and
lets you explore various options for migrating SSIS packages to
Azure. Toward the end, you'll find out how to profile data in the
cloud and automate service creation with Business Intelligence
Markup Language (BIML). By the end of this book, you'll have
developed the skills you need to create and automate ETL solutions
on-premises as well as in Azure. What you will learn Explore ETL
and how it is different from ELT Move and transform various data
sources with Azure ETL and ELT services Use SSIS 2019 with Azure
HDInsight clusters Discover how to query SQL Server 2019 Big Data
Clusters hosted in Azure Migrate SSIS solutions to Azure and solve
key challenges associated with it Understand why data profiling is
crucial and how to implement it in Azure Databricks Get to grips
with BIML and learn how it applies to SSIS and Azure Data Factory
solutions Who this book is for This book is for data warehouse
architects, ETL developers, or anyone who wants to build scalable
ETL applications in Azure. Those looking to extend their existing
on-premises ETL applications to use big data and a variety of Azure
services or others interested in migrating existing on-premises
solutions to the Azure cloud platform will also find the book
useful. Familiarity with SQL Server services is necessary to get
the most out of this book.
Simplify your ETL processes with these hands-on data hygiene tips,
tricks, and best practices. Key Features Focus on the basics of
data wrangling Study various ways to extract the most out of your
data in less time Boost your learning curve with bonus topics like
random data generation and data integrity checks Book
Description For data to be useful and meaningful, it must be curated
and refined. Data Wrangling with Python teaches you the core ideas
behind these processes and equips you with knowledge of the most
popular tools and techniques in the domain. The book starts with
the absolute basics of Python, focusing mainly on data structures.
It then delves into the fundamental tools of data wrangling like
NumPy and Pandas libraries. You'll explore useful insights into why
you should stay away from traditional ways of data cleaning, as
done in other languages, and take advantage of the specialized
pre-built routines in Python. This combination of Python tips and
tricks will also demonstrate how to use the same Python backend and
extract/transform data from an array of sources including the
Internet, large database vaults, and Excel financial tables. To
help you prepare for more challenging scenarios, you'll cover how
to handle missing or wrong data, and reformat it based on the
requirements from the downstream analytics tool. The book will
further help you grasp concepts through real-world examples and
datasets. By the end of this book, you will be confident in using a
diverse array of sources to extract, clean, transform, and format
your data efficiently. What you will learn Use and manipulate
complex and simple data structures Harness the full potential of
DataFrames and numpy.array at run time Perform web scraping with
BeautifulSoup4 and html5lib Execute advanced string search and
manipulation with RegEx Handle outliers and perform data imputation
with Pandas Use descriptive statistics and plotting techniques
Practice data wrangling and modeling using data generation
techniques Who this book is for Data Wrangling with Python is
designed for developers, data analysts, and business analysts who
are keen to pursue a career as a full-fledged data scientist or
analytics expert. Although this book is for beginners, prior
working knowledge of Python is necessary to easily grasp the
concepts covered here. It will also help to have rudimentary
knowledge of relational databases and SQL.
The concept of a big data warehouse appeared in order to store
moving data objects and temporal data information. Moving objects
are geometries that change their position and shape continuously
over time. To support spatio-temporal data, a data model and an
associated query language for moving objects are needed.
Emerging Perspectives in Big Data Warehousing is an
essential research publication that explores current innovative
activities focusing on the integration between data warehousing and
data mining with an emphasis on the applicability to real-world
problems. Featuring a wide range of topics such as index
structures, ontology, and user behavior, this book is ideally
designed for IT consultants, researchers, professionals, computer
scientists, academicians, and managers.
Build, monitor, and manage real-time data pipelines to create data
engineering infrastructure efficiently using open-source Apache
projects Key Features Become well-versed in data architectures,
data preparation, and data optimization skills with the help of
practical examples Design data models and learn how to extract,
transform, and load (ETL) data using Python Schedule, automate, and
monitor complex data pipelines in production Book Description Data
engineering provides the foundation for data science and analytics,
and forms an important part of all businesses. This book will help
you to explore various tools and methods that are used for
understanding the data engineering process using Python. The book
will show you how to tackle challenges commonly faced in different
aspects of data engineering. You'll start with an introduction to
the basics of data engineering, along with the technologies and
frameworks required to build data pipelines to work with large
datasets. You'll learn how to transform and clean data and perform
analytics to get the most out of your data. As you advance, you'll
discover how to work with big data of varying complexity and
production databases, and build data pipelines. Using real-world
examples, you'll build architectures on which you'll learn how to
deploy data pipelines. By the end of this Python book, you'll have
gained a clear understanding of data modeling techniques, and will
be able to confidently build data engineering pipelines for
tracking data, running quality checks, and making necessary changes
in production. What you will learn Understand how data engineering
supports data science workflows Discover how to extract data from
files and databases and then clean, transform, and enrich it
Configure processors for handling different file formats as well as
both relational and NoSQL databases Find out how to implement a
data pipeline and dashboard to visualize results Use staging and
validation to check data before landing in the warehouse Build
real-time pipelines with staging areas that perform validation and
handle failures Get to grips with deploying pipelines in the
production environment Who this book is for This book is for data
analysts, ETL developers, and anyone looking to get started with or
transition to the field of data engineering or refresh their
knowledge of data engineering using Python. This book will also be
useful for students planning to build a career in data engineering
or IT professionals preparing for a transition. No previous
knowledge of data engineering is required.
A comprehensive guide to understanding key techniques for
architecture and hardware planning, monitoring, replication,
backups, and decoupling Key Features Newly updated edition,
covering the latest PostgreSQL 12 features with hands-on
industry-driven recipes Create a PostgreSQL cluster that stays
online even when disaster strikes Learn how to avoid costly
downtime and data loss that can ruin your business Book
Description Databases are nothing without the data they store. In
the event of an outage or technical catastrophe, immediate recovery
is essential. This updated edition ensures that you will learn the
important concepts related to node architecture design, as well as
techniques such as using repmgr for failover automation. From
cluster layout and hardware selection to software stacks and
horizontal scalability, this PostgreSQL cookbook will help you
build a PostgreSQL cluster that will survive crashes, resist data
corruption, and grow smoothly with customer demand. You'll start by
understanding how to plan a PostgreSQL database architecture that
is resistant to outages and scalable, as it is the scaffolding on
which everything rests. With the bedrock established, you'll cover
the topics that PostgreSQL database administrators need to know to
manage a highly available cluster. This includes configuration,
troubleshooting, monitoring and alerting, backups through proxies,
failover automation, and other considerations that are essential
for a healthy PostgreSQL cluster. Next, you'll learn to use
multi-master replication to maximize server availability. Later
chapters will guide you through managing major version upgrades
without downtime. By the end of this book, you'll have learned how
to build an efficient and adaptive PostgreSQL 12 database cluster.
What you will learn Understand how to protect data with PostgreSQL
replication tools Focus on hardware planning to ensure that your
database runs efficiently Reduce database resource contention with
connection pooling Monitor and visualize cluster activity with
Nagios and the TIG (Telegraf, InfluxDB, Grafana) stack Construct a
robust software stack that can detect and avert outages Use
multi-master to achieve an enduring PostgreSQL cluster Who this
book is for This book is for Postgres administrators and developers
who are looking to build and maintain a highly reliable PostgreSQL
cluster. Although knowledge of the new features of PostgreSQL 12 is
not required, a basic understanding of PostgreSQL administration is
expected.
Kick-start your DevOps career by learning how to effectively deploy
Kubernetes on Azure in an easy, comprehensive, and fun way with
hands-on coding tasks Key Features Understand the fundamentals of
Docker and Kubernetes Learn to implement microservices architecture
using the Kubernetes platform Discover how you can scale your
application workloads in Azure Kubernetes Service (AKS) Book
Description From managing versioning efficiently to improving
security and portability, technologies such as Kubernetes and
Docker have greatly helped cloud deployments and application
development. Starting with an introduction to Docker, Kubernetes,
and Azure Kubernetes Service (AKS), this book will guide you
through deploying an AKS cluster in different ways. You'll then
explore the Azure portal by deploying a sample guestbook
application on AKS and installing complex Kubernetes apps using
Helm. With the help of real-world examples, you'll also get to
grips with scaling your application and cluster. As you advance,
you'll understand how to overcome common challenges in AKS and
secure your application with HTTPS and Azure AD (Active Directory).
Finally, you'll explore serverless functions such as HTTP-triggered
Azure Functions and queue-triggered functions. By the end of this
Kubernetes book, you'll be well versed in the fundamentals of
Azure Kubernetes Service and be able to deploy containerized
workloads on Microsoft Azure with minimal management overhead. What
you will learn Plan, configure, and run containerized applications
in production Use Docker to build apps in containers and deploy
them on Kubernetes Improve the configuration and deployment of apps
on the Azure Cloud Store your container images securely with Azure
Container Registry Install complex Kubernetes applications using
Helm Integrate Kubernetes with multiple Azure PaaS services, such
as databases, Event Hubs, and Functions Who this book is for This
book is for aspiring DevOps professionals, system administrators,
developers, and site reliability engineers looking to understand
test and deployment processes and improve their efficiency. If
you're new to working with containers and orchestration, you'll
find this book useful.
Learn how to migrate your SAP data to Azure simply and
successfully. Key Features Learn why Azure is suitable for
business-critical systems Understand how to migrate your SAP
infrastructure to Azure Use Lift & shift migration, Lift &
migrate, Lift & migrate to HANA, or Lift & transform to
S/4HANA Book Description Cloud technologies have now reached a level
where even the most critical business systems can run on them. For
most organizations, SAP is the key business system. If SAP is
unavailable for any reason, then potentially your business stops.
Because of this, it is understandable that you will be concerned
whether such a critical system can run in the public cloud.
However, the days when you truly ran your IT system on-premises
have long since gone. Most organizations have been getting rid of
their own data centers and increasingly moving to co-location
facilities. In this context, the public cloud is nothing more than
an additional virtual data center connected to your existing
network. There are typically two main reasons why you may consider
migrating SAP to Azure: You need to replace the infrastructure that
is currently running SAP, or you want to migrate SAP to a new
database. Depending on your goal, SAP offers different migration
paths. You can decide either to migrate the current workload to
Azure as-is, or to combine it with changing the database and
execute both activities as a single step. SAP on Azure
Implementation Guide covers the main migration options to lead you
through migrating your SAP data to Azure simply and successfully.
What you will learn Successfully migrate your SAP infrastructure to
Azure Understand the security benefits of Azure See how Azure can
scale to meet the most demanding of business needs Ensure your SAP
infrastructure maintains high availability Increase business
agility through cloud capabilities Leverage cloud-native
capabilities to enhance SAP Who this book is for SAP on Azure
Implementation Guide is designed to benefit existing SAP architects
looking to migrate their SAP infrastructure to Azure. Whether you
are an architect implementing the migration or an IT decision maker
evaluating the benefits of migration, this book is for you.
Leverage the power of Microsoft Azure Data Factory v2 to build
hybrid data solutions Key Features Combine the power of Azure Data
Factory v2 and SQL Server Integration Services Design and enhance
performance and scalability of a modern ETL hybrid solution
Interact with the loaded data in the data warehouse and data lake using
Power BI Book Description ETL is one of the essential techniques in
data processing. Given that data is everywhere, ETL will always be a
vital process for handling data from different sources. Hands-On Data
Warehousing with Azure Data Factory starts with the basic concepts
of data warehousing and the ETL process. You will learn how Azure Data
Factory and SSIS can be used to understand the key components of an
ETL solution. You will go through different services offered by
Azure that can be used by ADF and SSIS, such as Azure Data Lake
Analytics, Machine Learning, and Databricks Spark, with the help of
practical examples. You will explore how to design and implement
ETL hybrid solutions using different integration services with a
step-by-step approach. Once you get to grips with all this, you
will use Power BI to interact with data coming from different
sources in order to reveal valuable insights. By the end of this
book, you will not only learn how to build your own ETL solutions
but also address the key challenges that are faced while building
them. What you will learn Understand the key components of an ETL
solution using Azure Data Factory and Integration Services Design
the architecture of a modern ETL hybrid solution Implement ETL
solutions for both on-premises and Azure data Improve the
performance and scalability of your ETL solution Gain thorough
knowledge of new capabilities and features added to Azure Data
Factory and Integration Services Who this book is for This book is
for you if you are a software professional who develops and
implements ETL solutions using Microsoft SQL Server or Azure cloud.
It will be an added advantage if you are a software engineer,
DW/ETL architect, or ETL developer, and know how to create a new
ETL implementation or enhance an existing one with ADF or SSIS.
Build and design multiple types of applications that are
cross-language, cross-platform, and cost-effective by understanding core
Azure principles and foundational concepts Key Features Get
familiar with the different design patterns available in Microsoft
Azure Develop Azure cloud architecture and a pipeline management
system Get to know the security best practices for your Azure
deployment Book Description Thanks to its support for high
availability, scalability, security, performance, and disaster
recovery, Azure has been widely adopted to create and deploy
different types of application with ease. Updated for the latest
developments, this third edition of Azure for Architects helps you
get to grips with the core concepts of designing serverless
architecture, including containers, Kubernetes deployments, and big
data solutions. You'll learn how to architect solutions such as
serverless functions, you'll discover deployment patterns for
containers and Kubernetes, and you'll explore large-scale big data
processing using Spark and Databricks. As you advance, you'll
implement DevOps using Azure DevOps, work with intelligent
solutions using Azure Cognitive Services, and integrate security,
high availability, and scalability into each solution. Finally,
you'll delve into Azure security concepts such as OAuth,
OpenID Connect, and managed identities. By the end of this book,
you'll have gained the confidence to design intelligent Azure
solutions based on containers and serverless functions. What you
will learn Understand the components of the Azure cloud platform
Use cloud design patterns Use enterprise security guidelines for
your Azure deployment Design and implement serverless and
integration solutions Build efficient data solutions on Azure
Understand container services on Azure Who this book is for If you
are a cloud architect, DevOps engineer, or a developer looking to
learn about the key architectural aspects of the Azure cloud
platform, this book is for you. A basic understanding of the Azure
cloud platform will help you grasp the concepts covered in this
book more effectively.
This book examines how cloud-based services challenge the current
application of antitrust and privacy laws in the EU and the US. The
author looks at the elements of data centers, the way information
is organized, and how antitrust, competition and privacy laws in
the US and the EU regulate cloud-based services and their market
practices. She discusses how platform interoperability can be a
driver of incremental innovation and the consequences of not
promoting radical innovation. She evaluates applications of
predictive analysis based on big data as well as the resulting
privacy-invasive conduct. She looks at the way antitrust and
privacy laws approach consumer protection and how lawmakers can
reach more balanced outcomes by understanding the technical
background of cloud-based services.
A fast-paced guide that will help you learn about Apache Hadoop 3
and its ecosystem Key Features Set up, configure and get started
with Hadoop to get useful insights from large data sets Work with
the different components of Hadoop such as MapReduce, HDFS and YARN
Learn about the new features introduced in Hadoop 3 Book
Description Apache Hadoop is a widely used distributed data
platform. It enables large datasets to be processed efficiently
across a cluster of machines, instead of relying on one large
computer to store and process the data.
This book will get you started with the Hadoop ecosystem, and
introduce you to the main technical topics, including MapReduce,
YARN, and HDFS. The book begins with an overview of big data and
Apache Hadoop. Then, you will set up a pseudo Hadoop development
environment and a multi-node enterprise Hadoop cluster. You will
see how a parallel programming paradigm such as MapReduce can
solve many complex data processing problems. The book also covers
the important aspects of the big data software development
lifecycle, including quality assurance and control, performance,
administration, and monitoring. You will then learn about the
Hadoop ecosystem, and tools such as Kafka, Sqoop, Flume, Pig, Hive,
and HBase. Finally, you will look at advanced topics, including
real-time streaming using Apache Storm, and data analytics using
Apache Spark. By the end of the book, you will be well versed in
different configurations of the Hadoop 3 cluster. What you will
learn Store and analyze data at scale using HDFS, MapReduce and
YARN Install and configure Hadoop 3 in different modes Use YARN
effectively to run different applications on a Hadoop-based platform
Understand and monitor how a Hadoop cluster is managed Consume
streaming data using Storm, and then analyze it using Spark Explore
Apache Hadoop ecosystem components, such as Flume, Sqoop, HBase,
Hive, and Kafka Who this book is for Aspiring Big Data professionals
who want to learn the essentials of Hadoop 3 will find this book to
be useful. Existing Hadoop users who want to get up to speed with
the new features introduced in Hadoop 3 will also benefit from this
book. Having knowledge of Java programming will be an added
advantage.
Build, manage, and configure high-performing, reliable NoSQL
databases for your applications with Cassandra Key Features Write
programs more efficiently using Cassandra's features with the help
of examples Configure Cassandra and fine-tune its parameters
depending on your needs Integrate the Cassandra database with Apache
Spark and build a strong data analytics pipeline Book Description With
ever-increasing rates of data creation, storing data
quickly and reliably has become a necessity. Apache Cassandra is the perfect
choice for building fault-tolerant and scalable databases.
Mastering Apache Cassandra 3.x teaches you how to build and
architect your clusters, configure and work with your nodes, and
program in a high-throughput environment, helping you understand
the power of Cassandra's new features. Once you've covered
a brief recap of the basics, you'll move on to deploying and
monitoring a production setup and optimizing and integrating it
with other software. You'll work with the advanced features of CQL
and the new storage engine in order to understand how they function
on the server-side. You'll explore the integration and interaction
of Cassandra components, followed by discovering features such as
the token allocation algorithm, CQL3, vnodes, lightweight transactions,
and data modelling in detail. Last but not least, you will get to
grips with Apache Spark. By the end of this book, you'll be able to
analyse big data, and build and manage high-performance databases
for your application. What you will learn Write programs more
efficiently using Cassandra's features Exploit the
given infrastructure, improve performance, and tweak the Java
Virtual Machine (JVM) Use CQL3 in your application in order to
simplify working with Cassandra Configure Cassandra and fine-tune
its parameters depending on your needs Set up a cluster and learn
how to scale it Monitor a Cassandra cluster in different ways Use
Apache Spark and other big data processing tools Who this book is
for Mastering Apache Cassandra 3.x is for you if you are a big data
administrator, database administrator, architect, or developer who
wants to build a high-performing, scalable, and fault-tolerant
database. Prior knowledge of core concepts of databases is
required.
Understand data science concepts and methodologies to manage and
deliver top-notch solutions for your organization Key Features
Learn the basics of data science and explore its possibilities and
limitations Manage data science projects and assemble teams
effectively even in the most challenging situations Understand
management principles and approaches for data science projects to
streamline the innovation process Book Description Data science and
machine learning can transform any organization and unlock new
opportunities. However, employing the right management strategies
is crucial to guide the solution from prototype to production.
Traditional approaches often fail as they don't entirely meet the
conditions and requirements necessary for current data science
projects. In this book, you'll explore the right approach to data
science project management, along with useful tips and best
practices to guide you along the way. After understanding the
practical applications of data science and artificial intelligence,
you'll see how to incorporate them into your solutions. Next, you
will go through the data science project life cycle, explore the
common pitfalls encountered at each step, and learn how to avoid
them. Any data science project requires a skilled team, and this
book will offer the right advice for hiring and growing a data
science team for your organization. Later, you'll be shown how to
efficiently manage and improve your data science projects through
the use of DevOps and ModelOps. By the end of this book, you will
be well versed in various data science solutions and have gained
practical insights into tackling the different challenges that
you'll encounter on a daily basis. What you will learn Understand
the underlying problems of building a strong data science pipeline
Explore the different tools for building and deploying data science
solutions Hire, grow, and sustain a data science team Manage data
science projects through all stages, from prototype to production
Learn how to use ModelOps to improve your data science pipelines
Get up to speed with the model testing techniques used in both
development and production stages Who this book is for This book is
for data scientists, analysts, and program managers who want to use
data science for business productivity by incorporating data
science workflows efficiently. Some understanding of basic data
science concepts will be useful to get the most out of this book.