Although you don't need a large computing infrastructure to process massive amounts of data with Apache Hadoop, it can still be difficult to get started. This practical guide shows you how to quickly launch data analysis projects in the cloud by using Amazon Elastic MapReduce (EMR), the hosted Hadoop framework in Amazon Web Services (AWS). Authors Kevin Schmidt and Christopher Phillips demonstrate best practices for using EMR and various AWS and Apache technologies by walking you through the construction of a sample MapReduce log analysis application.

Using code samples and example configurations, you'll learn how to assemble the building blocks necessary to solve your biggest data analysis problems:

- Get an overview of the AWS and Apache software tools used in large-scale data analysis
- Go through the process of executing a Job Flow with a simple log analyzer
- Discover useful MapReduce patterns for filtering and analyzing data sets
- Use Apache Hive and Pig instead of Java to build a MapReduce Job Flow
- Learn the basics of using Amazon EMR to run machine learning algorithms
- Develop a project cost model for using Amazon EMR and other AWS tools
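As a taste of the filtering-and-counting MapReduce pattern the book covers, here is a minimal sketch in plain Python (the access-log format, regular expression, and function names are illustrative assumptions, not code from the book; on EMR the same mapper/reducer pair could run via Hadoop Streaming):

```python
import re
from collections import Counter

def mapper(line):
    """Map step: emit (status_code, 1) for each access-log line that matches."""
    m = re.search(r'" (\d{3}) ', line)  # assumed Apache-style log format
    if m:
        yield (m.group(1), 1)

def reducer(pairs):
    """Reduce step: sum the counts emitted for each key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

logs = [
    '1.2.3.4 - - [10/Oct/2013] "GET /index.html HTTP/1.1" 200 1043',
    '5.6.7.8 - - [10/Oct/2013] "GET /missing HTTP/1.1" 404 209',
    '1.2.3.4 - - [10/Oct/2013] "GET /about.html HTTP/1.1" 200 512',
]
pairs = [kv for line in logs for kv in mapper(line)]
print(reducer(pairs))  # -> {'200': 2, '404': 1}
```

In a real Job Flow, Hadoop runs the map step in parallel over input splits stored in S3 and groups the emitted keys before the reduce step; the local list comprehension above stands in for that shuffle.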