A beginner's guide to storing, managing, and analyzing data with the updated features of Elastic 7.0

Key Features
- Gain access to new features and updates introduced in Elastic Stack 7.0
- Grasp the fundamentals of Elastic Stack including Elasticsearch, Logstash, and Kibana
- Explore useful tips for using Elastic Cloud and deploying Elastic Stack in production environments

Book Description
The Elastic Stack is a powerful combination of tools for distributed search, analytics, logging, and visualization of data. Elastic Stack 7.0 encompasses new features and capabilities that will enable you to find unique insights into analytics using these techniques. This book will give you a fundamental understanding of what the stack is all about, and help you use it efficiently to build powerful real-time data processing applications. The first few sections of the book will help you understand how to set up the stack by installing tools and exploring their basic configurations. You'll then get up to speed with using Elasticsearch for distributed searching and analytics, Logstash for logging, and Kibana for data visualization. As you work through the book, you will discover the technique of creating custom plugins using Kibana and Beats. This is followed by coverage of the Elastic X-Pack, a useful extension for effective security and monitoring. You'll also find helpful tips on how to use Elastic Cloud and deploy Elastic Stack in production environments. By the end of this book, you'll be well versed with the fundamental Elastic Stack functionalities and the role of each component in the stack to solve different data processing problems.

What you will learn
- Install and configure an Elasticsearch architecture
- Solve the full-text search problem with Elasticsearch
- Discover powerful analytics capabilities through aggregations using Elasticsearch
- Build a data pipeline to transfer data from a variety of sources into Elasticsearch for analysis
- Create interactive dashboards for effective storytelling with your data using Kibana
- Learn how to secure, monitor and use Elastic Stack's alerting and reporting capabilities
- Take applications to an on-premise or cloud-based production environment with Elastic Stack

Who this book is for
This book is for entry-level data professionals, software engineers, e-commerce developers, and full-stack developers who want to learn about Elastic Stack and how the real-time processing and search engine works for business analytics and enterprise search applications. Previous experience with Elastic Stack is not required; however, knowledge of data warehousing and database concepts will be helpful.
Leverage the power of MongoDB 4.x to build and administer fault-tolerant database applications

Key Features
- Master the new features and capabilities of MongoDB 4.x
- Implement advanced data modeling, querying, and administration techniques in MongoDB
- Includes rich case studies and best practices followed by expert MongoDB developers

Book Description
MongoDB is the best platform for working with non-relational data and is considered to be the smartest tool for organizing data in line with business needs. The recently released MongoDB 4.x supports ACID transactions and makes the technology an asset for enterprises across the IT and fintech sectors. This book provides expertise in advanced and niche areas of managing databases (such as modeling and querying databases) along with various administration techniques in MongoDB, thereby helping you become a successful MongoDB expert. The book helps you understand how the newly added capabilities function with the help of some interesting examples and large datasets. You will dive deeper into niche areas such as high-performance configurations, optimizing SQL statements, configuring large-scale sharded clusters, and many more. You will also master best practices in overcoming database failover, and master recovery and backup procedures for database security. By the end of the book, you will have gained a practical understanding of administering database applications both on premises and on the cloud; you will also be able to scale database applications across all servers.

What you will learn
- Perform advanced querying techniques such as indexing and expressions
- Configure, monitor, and maintain a highly scalable MongoDB environment
- Master replication and data sharding to optimize read/write performance
- Administer MongoDB-based applications on premises or on the cloud
- Integrate MongoDB with big data sources to process huge amounts of data
- Deploy MongoDB on Kubernetes containers
- Use MongoDB in IoT, mobile, and serverless environments

Who this book is for
This book is ideal for MongoDB developers and database administrators who wish to become successful MongoDB experts and build scalable and fault-tolerant applications using MongoDB. It will also be useful for database professionals who wish to become certified MongoDB professionals. Some understanding of MongoDB and basic database concepts is required to get the most out of this book.
Develop modern solutions with Snowflake's unique architecture and integration capabilities; process bulk and real-time data into a data lake; and leverage time travel, cloning, and data-sharing features to optimize data operations

Key Features
- Build and scale modern data solutions using the all-in-one Snowflake platform
- Perform advanced cloud analytics for implementing big data and data science solutions
- Make quicker and better-informed business decisions by uncovering key insights from your data

Book Description
Snowflake is a unique cloud-based data warehousing platform built from scratch to perform data management on the cloud. This book introduces you to Snowflake's unique architecture, which places it at the forefront of cloud data warehouses. You'll explore the compute model available with Snowflake, and find out how Snowflake allows extensive scaling through the virtual warehouses. You will then learn how to configure a virtual warehouse for optimizing cost and performance. Moving on, you'll get to grips with the data ecosystem and discover how Snowflake integrates with other technologies for staging and loading data. As you progress through the chapters, you will leverage Snowflake's capabilities to process a series of SQL statements using tasks to build data pipelines and find out how you can create modern data solutions and pipelines designed to provide high performance and scalability. You will also get to grips with creating role hierarchies, adding custom roles, and setting default roles for users before covering advanced topics such as data sharing, cloning, and performance optimization. By the end of this Snowflake book, you will be well-versed in Snowflake's architecture for building modern analytical solutions and understand best practices for solving commonly faced problems using practical recipes.

What you will learn
- Get to grips with data warehousing techniques aligned with Snowflake's cloud architecture
- Broaden your skills as a data warehouse designer to cover the Snowflake ecosystem
- Transfer skills from on-premise data warehousing to the Snowflake cloud analytics platform
- Optimize performance and costs associated with a Snowflake solution
- Stage data on object stores and load it into Snowflake
- Secure data and share it efficiently for access
- Manage transactions and extend Snowflake using stored procedures
- Extend cloud data applications using Spark Connector

Who this book is for
This book is for data warehouse developers, data analysts, database administrators, and anyone involved in designing, implementing, and optimizing a Snowflake data warehouse. Knowledge of data warehousing and database and cloud concepts will be useful. Basic familiarity with Snowflake is beneficial, but not necessary.
Build and design multiple types of applications that are cross-language, platform, and cost-effective by understanding core Azure principles and foundational concepts

Key Features
- Get familiar with the different design patterns available in Microsoft Azure
- Develop Azure cloud architecture and a pipeline management system
- Get to know the security best practices for your Azure deployment

Book Description
Thanks to its support for high availability, scalability, security, performance, and disaster recovery, Azure has been widely adopted to create and deploy different types of applications with ease. Updated for the latest developments, this third edition of Azure for Architects helps you get to grips with the core concepts of designing serverless architecture, including containers, Kubernetes deployments, and big data solutions. You'll learn how to architect solutions such as serverless functions, you'll discover deployment patterns for containers and Kubernetes, and you'll explore large-scale big data processing using Spark and Databricks. As you advance, you'll implement DevOps using Azure DevOps, work with intelligent solutions using Azure Cognitive Services, and integrate security, high availability, and scalability into each solution. Finally, you'll delve into Azure security concepts such as OAuth, OpenID Connect, and managed identities. By the end of this book, you'll have gained the confidence to design intelligent Azure solutions based on containers and serverless functions.

What you will learn
- Understand the components of the Azure cloud platform
- Use cloud design patterns
- Use enterprise security guidelines for your Azure deployment
- Design and implement serverless and integration solutions
- Build efficient data solutions on Azure
- Understand container services on Azure

Who this book is for
If you are a cloud architect, DevOps engineer, or a developer looking to learn about the key architectural aspects of the Azure cloud platform, this book is for you. A basic understanding of the Azure cloud platform will help you grasp the concepts covered in this book more effectively.
Learn how to combine SQL Server's analytics with Azure's flexibility and hybrid connectivity to achieve industry-leading performance and manageability for your cloud database.

Key Features
- Understand platform availability for SQL Server in Azure
- Explore the benefits and deployment choices offered by SQL IaaS
- Get to grips with deploying SQL Server on the Linux development ecosystem

Book Description
Deploying SQL Server on Azure virtual machines allows you to work on full versions of SQL Server in the cloud without having to maintain on-premises hardware. The book begins by introducing you to the SQL portfolio in Azure and takes you through SQL Server IaaS scenarios, before explaining the factors that you need to consider while choosing an OS for SQL Server in Azure VMs. As you progress through the book, you'll explore different VM options and deployment choices for IaaS and understand platform availability, migration tools, and best practices in Azure. In later chapters, you'll learn how to configure storage to achieve optimized performance. Finally, you'll get to grips with the concept of Azure Hybrid Benefit and find out how you can use it to maximize the value of your existing on-premises SQL Server. By the end of this book, you'll be proficient in administering SQL Server on Microsoft Azure and leveraging the tools required for its deployment.

What you will learn
- Choose an operating system for SQL Server in Azure VMs
- Use the Azure Management Portal to facilitate the deployment process
- Verify connectivity and network latency in the cloud
- Configure storage for optimal performance and connectivity
- Explore various disaster recovery options for SQL Server in Azure
- Optimize SQL Server on Linux
- Discover how to back up databases to a URL

Who this book is for
SQL Server on Azure VMs is for you if you are a developer, data enthusiast, or anyone who wants to migrate SQL Server databases to Azure virtual machines. Basic familiarity with SQL Server and managed identities for Azure resources will be a plus.
Get up to speed with the new features added to Microsoft SQL Server 2019 Analysis Services and create models to support your business

Key Features
- Explore tips and tricks to design, develop, and optimize end-to-end data analytics solutions using Microsoft's technologies
- Learn tabular modeling and multi-dimensional cube design development using real-world examples
- Implement Analysis Services to help you make productive business decisions

Book Description
SQL Server Analysis Services (SSAS) continues to be a leading enterprise-scale toolset, enabling customers to deliver data and analytics across large datasets with great performance. This book will help you understand MS SQL Server 2019's new features and improvements, especially when it comes to SSAS. First, you'll cover a quick overview of SQL Server 2019, learn how to choose the right analytical model to use, and understand their key differences. You'll then explore how to create a multi-dimensional model with SSAS and expand on that model with MDX. Next, you'll create and deploy a tabular model using Microsoft Visual Studio and Management Studio. You'll learn when and how to use both tabular and multi-dimensional model types, how to deploy and configure your servers to support them, and design principles that are relevant to each model. The book comes packed with tips and tricks to build measures, optimize your design, and interact with models using Excel and Power BI. All this will help you visualize data to gain useful insights and make better decisions. Finally, you'll discover practices and tools for securing and maintaining your models once they are deployed. By the end of this MS SQL Server book, you'll be able to choose the right model and build and deploy it to support the analytical needs of your business.

What you will learn
- Determine the best analytical model using SSAS
- Cover the core aspects involved in MDX, including writing your first query
- Implement calculated tables and calculation groups (new in version 2019) in DAX
- Create and deploy tabular and multi-dimensional models on SQL 2019
- Connect and create data visualizations using Excel and Power BI
- Implement row-level and other data security methods with tabular and multi-dimensional models
- Explore essential concepts and techniques to scale, manage, and optimize your SSAS solutions

Who this book is for
This Microsoft SQL Server book is for BI professionals and data analysts who are looking for a practical guide to creating and maintaining tabular and multi-dimensional models using SQL Server 2019 Analysis Services. A basic working knowledge of BI solutions such as Power BI and database querying is required.
Explore the latest Azure ETL techniques both on-premises and in the cloud using Azure services such as SQL Server Integration Services (SSIS), Azure Data Factory, and Azure Databricks

Key Features
- Understand the key components of an ETL solution using Azure Integration Services
- Discover the common and not-so-common challenges faced while creating modern and scalable ETL solutions
- Program and extend your packages to develop efficient data integration and data transformation solutions

Book Description
ETL is one of the most common and tedious procedures for moving and processing data from one database to another. With the help of this book, you will be able to speed up the process by designing effective ETL solutions using the Azure services available for handling and transforming any data to suit your requirements. With this cookbook, you'll become well versed in all the features of SQL Server Integration Services (SSIS) to perform data migration and ETL tasks that integrate with Azure. You'll learn how to transform data in Azure and understand how legacy systems perform ETL on-premises using SSIS. Later chapters will get you up to speed with connecting and retrieving data from SQL Server 2019 Big Data Clusters, and even show you how to extend and customize the SSIS toolbox using custom-developed tasks and transforms. This ETL book also contains practical recipes for moving and transforming data with Azure services, such as Data Factory and Azure Databricks, and lets you explore various options for migrating SSIS packages to Azure. Toward the end, you'll find out how to profile data in the cloud and automate service creation with Business Intelligence Markup Language (BIML). By the end of this book, you'll have developed the skills you need to create and automate ETL solutions on-premises as well as in Azure.

What you will learn
- Explore ETL and how it is different from ELT
- Move and transform various data sources with Azure ETL and ELT services
- Use SSIS 2019 with Azure HDInsight clusters
- Discover how to query SQL Server 2019 Big Data Clusters hosted in Azure
- Migrate SSIS solutions to Azure and solve key challenges associated with it
- Understand why data profiling is crucial and how to implement it in Azure Databricks
- Get to grips with BIML and learn how it applies to SSIS and Azure Data Factory solutions

Who this book is for
This book is for data warehouse architects, ETL developers, or anyone who wants to build scalable ETL applications in Azure. Those looking to extend their existing on-premise ETL applications to use big data and a variety of Azure services or others interested in migrating existing on-premise solutions to the Azure cloud platform will also find the book useful. Familiarity with SQL Server services is necessary to get the most out of this book.
Build, monitor, and manage real-time data pipelines to create data engineering infrastructure efficiently using open-source Apache projects

Key Features
- Become well-versed in data architectures, data preparation, and data optimization skills with the help of practical examples
- Design data models and learn how to extract, transform, and load (ETL) data using Python
- Schedule, automate, and monitor complex data pipelines in production

Book Description
Data engineering provides the foundation for data science and analytics, and forms an important part of all businesses. This book will help you to explore various tools and methods that are used for understanding the data engineering process using Python. The book will show you how to tackle challenges commonly faced in different aspects of data engineering. You'll start with an introduction to the basics of data engineering, along with the technologies and frameworks required to build data pipelines to work with large datasets. You'll learn how to transform and clean data and perform analytics to get the most out of your data. As you advance, you'll discover how to work with big data of varying complexity and production databases, and build data pipelines. Using real-world examples, you'll build architectures on which you'll learn how to deploy data pipelines. By the end of this Python book, you'll have gained a clear understanding of data modeling techniques, and will be able to confidently build data engineering pipelines for tracking data, running quality checks, and making necessary changes in production.

What you will learn
- Understand how data engineering supports data science workflows
- Discover how to extract data from files and databases and then clean, transform, and enrich it
- Configure processors for handling different file formats as well as both relational and NoSQL databases
- Find out how to implement a data pipeline and dashboard to visualize results
- Use staging and validation to check data before landing in the warehouse
- Build real-time pipelines with staging areas that perform validation and handle failures
- Get to grips with deploying pipelines in the production environment

Who this book is for
This book is for data analysts, ETL developers, and anyone looking to get started with or transition to the field of data engineering or refresh their knowledge of data engineering using Python. This book will also be useful for students planning to build a career in data engineering or IT professionals preparing for a transition. No previous knowledge of data engineering is required.
Simplify your ETL processes with these hands-on data hygiene tips, tricks, and best practices.

Key Features
- Focus on the basics of data wrangling
- Study various ways to extract the most out of your data in less time
- Boost your learning curve with bonus topics like random data generation and data integrity checks

Book Description
For data to be useful and meaningful, it must be curated and refined. Data Wrangling with Python teaches you the core ideas behind these processes and equips you with knowledge of the most popular tools and techniques in the domain. The book starts with the absolute basics of Python, focusing mainly on data structures. It then delves into the fundamental tools of data wrangling, such as the NumPy and Pandas libraries. You'll explore useful insights into why you should stay away from traditional ways of data cleaning, as done in other languages, and take advantage of the specialized pre-built routines in Python. This combination of Python tips and tricks will also demonstrate how to use the same Python backend and extract/transform data from an array of sources including the Internet, large database vaults, and Excel financial tables. To help you prepare for more challenging scenarios, you'll cover how to handle missing or wrong data, and reformat it based on the requirements from the downstream analytics tool. The book will further help you grasp concepts through real-world examples and datasets. By the end of this book, you will be confident in using a diverse array of sources to extract, clean, transform, and format your data efficiently.

What you will learn
- Use and manipulate complex and simple data structures
- Harness the full potential of DataFrames and numpy.array at run time
- Perform web scraping with BeautifulSoup4 and html5lib
- Execute advanced string search and manipulation with RegEx
- Handle outliers and perform data imputation with Pandas
- Use descriptive statistics and plotting techniques
- Practice data wrangling and modeling using data generation techniques

Who this book is for
Data Wrangling with Python is designed for developers, data analysts, and business analysts who are keen to pursue a career as a full-fledged data scientist or analytics expert. Although this book is for beginners, prior working knowledge of Python is necessary to easily grasp the concepts covered here. It will also help to have rudimentary knowledge of relational databases and SQL.
The concept of a big data warehouse appeared in order to store moving data objects and temporal data information. Moving objects are geometries that change their position and shape continuously over time. To support spatio-temporal data, a data model and an associated query language for moving objects are needed. Emerging Perspectives in Big Data Warehousing is an essential research publication that explores current innovative activities focusing on the integration between data warehousing and data mining with an emphasis on the applicability to real-world problems. Featuring a wide range of topics such as index structures, ontology, and user behavior, this book is ideally designed for IT consultants, researchers, professionals, computer scientists, academicians, and managers.
A comprehensive guide to understanding key techniques for architecture and hardware planning, monitoring, replication, backups, and decoupling

Key Features
- Newly updated edition, covering the latest PostgreSQL 12 features with hands-on industry-driven recipes
- Create a PostgreSQL cluster that stays online even when disaster strikes
- Learn how to avoid costly downtime and data loss that can ruin your business

Book Description
Databases are nothing without the data they store. In the event of an outage or technical catastrophe, immediate recovery is essential. This updated edition ensures that you will learn the important concepts related to node architecture design, as well as techniques such as using repmgr for failover automation. From cluster layout and hardware selection to software stacks and horizontal scalability, this PostgreSQL cookbook will help you build a PostgreSQL cluster that will survive crashes, resist data corruption, and grow smoothly with customer demand. You'll start by understanding how to plan a PostgreSQL database architecture that is resistant to outages and scalable, as it is the scaffolding on which everything rests. With the bedrock established, you'll cover the topics that PostgreSQL database administrators need to know to manage a highly available cluster. This includes configuration, troubleshooting, monitoring and alerting, backups through proxies, failover automation, and other considerations that are essential for a healthy PostgreSQL cluster. Later, you'll learn to use multi-master replication to maximize server availability. Later chapters will guide you through managing major version upgrades without downtime. By the end of this book, you'll have learned how to build an efficient and adaptive PostgreSQL 12 database cluster.

What you will learn
- Understand how to protect data with PostgreSQL replication tools
- Focus on hardware planning to ensure that your database runs efficiently
- Reduce database resource contention with connection pooling
- Monitor and visualize cluster activity with Nagios and the TIG (Telegraf, InfluxDB, Grafana) stack
- Construct a robust software stack that can detect and avert outages
- Use multi-master to achieve an enduring PostgreSQL cluster

Who this book is for
This book is for Postgres administrators and developers who are looking to build and maintain a highly reliable PostgreSQL cluster. Although knowledge of the new features of PostgreSQL 12 is not required, a basic understanding of PostgreSQL administration is expected.
Kick-start your DevOps career by learning how to effectively deploy Kubernetes on Azure in an easy, comprehensive, and fun way with hands-on coding tasks

Key Features
- Understand the fundamentals of Docker and Kubernetes
- Learn to implement microservices architecture using the Kubernetes platform
- Discover how you can scale your application workloads in Azure Kubernetes Service (AKS)

Book Description
From managing versioning efficiently to improving security and portability, technologies such as Kubernetes and Docker have greatly helped cloud deployments and application development. Starting with an introduction to Docker, Kubernetes, and Azure Kubernetes Service (AKS), this book will guide you through deploying an AKS cluster in different ways. You'll then explore the Azure portal by deploying a sample guestbook application on AKS and installing complex Kubernetes apps using Helm. With the help of real-world examples, you'll also get to grips with scaling your application and cluster. As you advance, you'll understand how to overcome common challenges in AKS and secure your application with HTTPS and Azure AD (Active Directory). Finally, you'll explore serverless functions such as HTTP triggered Azure functions and queue triggered functions. By the end of this Kubernetes book, you'll be well-versed with the fundamentals of Azure Kubernetes Service and be able to deploy containerized workloads on Microsoft Azure with minimal management overhead.

What you will learn
- Plan, configure, and run containerized applications in production
- Use Docker to build apps in containers and deploy them on Kubernetes
- Improve the configuration and deployment of apps on the Azure Cloud
- Store your container images securely with Azure Container Registry
- Install complex Kubernetes applications using Helm
- Integrate Kubernetes with multiple Azure PaaS services, such as databases, Event Hubs and Functions

Who this book is for
This book is for aspiring DevOps professionals, system administrators, developers, and site reliability engineers looking to understand test and deployment processes and improve their efficiency. If you're new to working with containers and orchestration, you'll find this book useful.
Learn how to migrate your SAP data to Azure simply and successfully.

Key Features
- Learn why Azure is suitable for business-critical systems
- Understand how to migrate your SAP infrastructure to Azure
- Use Lift & shift migration, Lift & migrate, Lift & migrate to HANA, or Lift & transform to S/4HANA

Book Description
Cloud technologies have now reached a level where even the most critical business systems can run on them. For most organizations, SAP is the key business system. If SAP is unavailable for any reason, then potentially your business stops. Because of this, it is understandable that you will be concerned whether such a critical system can run in the public cloud. However, the days when you truly ran your IT system on-premises have long since gone. Most organizations have been getting rid of their own data centers and increasingly moving to co-location facilities. In this context, the public cloud is nothing more than an additional virtual data center connected to your existing network. There are typically two main reasons why you may consider migrating SAP to Azure: you need to replace the infrastructure that is currently running SAP, or you want to migrate SAP to a new database. Depending on your goal, SAP offers different migration paths. You can decide either to migrate the current workload to Azure as-is, or to combine it with changing the database and execute both activities as a single step. SAP on Azure Implementation Guide covers the main migration options to lead you through migrating your SAP data to Azure simply and successfully.

What you will learn
- Successfully migrate your SAP infrastructure to Azure
- Understand the security benefits of Azure
- See how Azure can scale to meet the most demanding of business needs
- Ensure your SAP infrastructure maintains high availability
- Increase business agility through cloud capabilities
- Leverage cloud-native capabilities to enhance SAP

Who this book is for
SAP on Azure Implementation Guide is designed to benefit existing SAP architects looking to migrate their SAP infrastructure to Azure. Whether you are an architect implementing the migration or an IT decision maker evaluating the benefits of migration, this book is for you.
Leverage the power of Microsoft Azure Data Factory v2 to build hybrid data solutions

Key Features
- Combine the power of Azure Data Factory v2 and SQL Server Integration Services
- Design and enhance performance and scalability of a modern ETL hybrid solution
- Interact with the loaded data in data warehouse and data lake using Power BI

Book Description
ETL is one of the essential techniques in data processing. Given that data is everywhere, ETL will always be a vital process for handling data from different sources. Hands-On Data Warehousing with Azure Data Factory starts with the basic concepts of data warehousing and the ETL process. You will learn how Azure Data Factory and SSIS can be used to understand the key components of an ETL solution. With the help of practical examples, you will go through different services offered by Azure that can be used by ADF and SSIS, such as Azure Data Lake Analytics, Machine Learning, and Spark on Databricks. You will explore how to design and implement ETL hybrid solutions using different integration services with a step-by-step approach. Once you get to grips with all this, you will use Power BI to interact with data coming from different sources in order to reveal valuable insights. By the end of this book, you will not only learn how to build your own ETL solutions but also address the key challenges that are faced while building them.

What you will learn
- Understand the key components of an ETL solution using Azure Data Factory and Integration Services
- Design the architecture of a modern ETL hybrid solution
- Implement ETL solutions for both on-premises and Azure data
- Improve the performance and scalability of your ETL solution
- Gain thorough knowledge of new capabilities and features added to Azure Data Factory and Integration Services

Who this book is for
This book is for you if you are a software professional who develops and implements ETL solutions using Microsoft SQL Server or Azure cloud. It will be an added advantage if you are a software engineer, DW/ETL architect, or ETL developer, and know how to create a new ETL implementation or enhance an existing one with ADF or SSIS.
A fast-paced guide that will help you learn about Apache Hadoop 3 and its ecosystem

Key Features
- Set up, configure and get started with Hadoop to get useful insights from large data sets
- Work with the different components of Hadoop such as MapReduce, HDFS and YARN
- Learn about the new features introduced in Hadoop 3

Book Description
Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across many machines instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how a parallel programming paradigm such as MapReduce can solve many complex data processing problems. The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring. You will then learn about the Hadoop ecosystem, and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm, and data analytics using Apache Spark. By the end of the book, you will be well versed with different configurations of the Hadoop 3 cluster.

What you will learn
- Store and analyze data at scale using HDFS, MapReduce and YARN
- Install and configure Hadoop 3 in different modes
- Use YARN effectively to run different applications on a Hadoop-based platform
- Understand and monitor how a Hadoop cluster is managed
- Consume streaming data using Storm, and then analyze it using Spark
- Explore Apache Hadoop ecosystem components, such as Flume, Sqoop, HBase, Hive, and Kafka

Who this book is for
Aspiring Big Data professionals who want to learn the essentials of Hadoop 3 will find this book to be useful. Existing Hadoop users who want to get up to speed with the new features introduced in Hadoop 3 will also benefit from this book. Having knowledge of Java programming will be an added advantage.