Deploy supervised and unsupervised machine learning algorithms using scikit-learn to perform classification, regression, and clustering.

Key Features
* Build your first machine learning model using scikit-learn
* Train supervised and unsupervised models using popular techniques such as classification, regression, and clustering
* Understand how scikit-learn can be applied to different types of machine learning problems

Book Description
Scikit-learn is a robust machine learning library for the Python programming language. It provides a set of supervised and unsupervised learning algorithms. This book is the easiest way to learn how to deploy, optimize, and evaluate all of the important machine learning algorithms that scikit-learn provides.

This book teaches you how to use scikit-learn for machine learning. You will start by setting up and configuring your machine learning environment with scikit-learn. To put scikit-learn to use, you will learn how to implement various supervised and unsupervised machine learning models. You will learn classification, regression, and clustering techniques to work with different types of datasets and train your models. Finally, you will learn about an effective pipeline to help you build a machine learning project from scratch. By the end of this book, you will be confident in building your own machine learning models for accurate predictions.

What you will learn
* Learn how to work with all of scikit-learn's machine learning algorithms
* Install and set up scikit-learn to build your first machine learning model
* Employ unsupervised machine learning algorithms to cluster unlabelled data into groups
* Perform classification and regression machine learning
* Use an effective pipeline to build a machine learning project from scratch

Who this book is for
This book is for aspiring machine learning developers who want to get started with scikit-learn. Intermediate knowledge of Python programming and some fundamental knowledge of linear algebra and probability will help.
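The classification workflow described above follows scikit-learn's usual fit/predict pattern. Here is a minimal sketch of that pattern, not code from the book; the iris dataset and logistic regression are illustrative choices.

```python
# A minimal scikit-learn classification sketch: split data, train a supervised
# model, and evaluate it on held-out examples. Illustrative only.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                      # features and labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)             # hold out a test set

model = LogisticRegression(max_iter=200)               # a simple supervised model
model.fit(X_train, y_train)                            # train on the labelled data
print(accuracy_score(y_test, model.predict(X_test)))   # evaluate on unseen data
```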
A fast-paced guide that will help you learn about Apache Hadoop 3 and its ecosystem.

Key Features
* Set up, configure, and get started with Hadoop to get useful insights from large data sets
* Work with the different components of Hadoop, such as MapReduce, HDFS, and YARN
* Learn about the new features introduced in Hadoop 3

Book Description
Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across a cluster instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem and introduce you to the main technical topics, including MapReduce, YARN, and HDFS.

The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo-distributed Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how a parallel programming paradigm, such as MapReduce, can solve many complex data processing problems. The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring. You will then learn about the Hadoop ecosystem and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm and data analytics using Apache Spark. By the end of the book, you will be well versed with different configurations of the Hadoop 3 cluster.

What you will learn
* Store and analyze data at scale using HDFS, MapReduce, and YARN
* Install and configure Hadoop 3 in different modes
* Use YARN effectively to run different applications on a Hadoop-based platform
* Understand and monitor how a Hadoop cluster is managed
* Consume streaming data using Storm, and then analyze it using Spark
* Explore Apache Hadoop ecosystem components, such as Flume, Sqoop, HBase, Hive, and Kafka

Who this book is for
Aspiring Big Data professionals who want to learn the essentials of Hadoop 3 will find this book useful. Existing Hadoop users who want to get up to speed with the new features introduced in Hadoop 3 will also benefit from this book. Having knowledge of Java programming will be an added advantage.
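The MapReduce paradigm mentioned above splits a job into a map phase that emits key-value pairs and a reduce phase that aggregates them. The sketch below shows the classic word count in the Hadoop Streaming style; it is not taken from the book, and the streaming jar name and HDFS paths in the docstring are hypothetical.

```python
#!/usr/bin/env python3
"""Word count in the Hadoop Streaming style: the mapper emits (word, 1) pairs
and the reducer sums the counts per key. Illustrative sketch; the jar name and
paths below are hypothetical:

  hadoop jar hadoop-streaming.jar \
      -mapper "wordcount.py map" -reducer "wordcount.py reduce" \
      -input /data/in -output /data/out
"""
import itertools
import sys


def mapper(stream=sys.stdin):
    for line in stream:
        for word in line.strip().split():
            print(f"{word}\t1")                 # emit (word, 1)


def reducer(stream=sys.stdin):
    # Streaming delivers reducer input sorted by key, so groupby works here.
    pairs = (line.rstrip("\n").split("\t", 1) for line in stream)
    for word, group in itertools.groupby(pairs, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")


if __name__ == "__main__":
    mapper() if "map" in sys.argv[1:] else reducer()
```

You can dry-run the same pipeline locally with `cat input.txt | python3 wordcount.py map | sort | python3 wordcount.py reduce`.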
Build a strong foundation of machine learning algorithms in 7 days.

Key Features
* Use Python and its wide array of machine learning libraries to build predictive models
* Learn the basics of the 7 most widely used machine learning algorithms within a week
* Know when and where to apply data science algorithms using this guide

Book Description
Machine learning applications are highly automated and self-modifying, and they continue to improve over time with minimal human intervention as they learn from the training data. To address the complex nature of various real-world data problems, specialized machine learning algorithms have been developed. Through algorithmic and statistical analysis, these models can also be leveraged to gain new knowledge from existing data.

Data Science Algorithms in a Week addresses problems related to accurate and efficient data classification and prediction. Over the course of seven days, you will be introduced to seven algorithms, along with exercises that will help you understand different aspects of machine learning. You will see how to pre-cluster your data to optimize and classify it for large datasets. The book also guides you in predicting data based on existing trends in your dataset. It covers algorithms such as k-nearest neighbors, Naive Bayes, decision trees, random forest, k-means, regression, and time-series analysis. By the end of this book, you will understand how to choose machine learning algorithms for clustering, classification, and regression, and know which is best suited for your problem.

What you will learn
* Understand how to identify a data science problem correctly
* Implement well-known machine learning algorithms efficiently using Python
* Classify your datasets using Naive Bayes, decision trees, and random forest with accuracy
* Devise an appropriate prediction solution using regression
* Work with time series data to identify relevant data events and trends
* Cluster your data using the k-means algorithm

Who this book is for
This book is for aspiring data science professionals who are familiar with Python and have a little background in statistics. You'll also find this book useful if you're currently working with data science algorithms in some capacity and want to expand your skill set.
Solve real-world statistical problems using the most popular R packages and techniques.

Key Features
* Learn how to apply statistical methods to your everyday research with handy recipes
* Foster your analytical skills and interpret research across industries and business verticals
* Perform t-tests, chi-squared tests, and regression analysis using modern statistical techniques

Book Description
R is a popular programming language for developing statistical software. This book will be a useful guide to solving common and not-so-common challenges in statistics. With this book, you'll be equipped to confidently perform essential statistical procedures across your organization with the help of cutting-edge statistical tools.

You'll start by implementing data modeling, data analysis, and machine learning to solve real-world problems. You'll then understand how to work with nonparametric methods, mixed effects models, and hidden Markov models. This book contains recipes that will guide you in performing univariate and multivariate hypothesis tests, several regression techniques, and the use of robust techniques to minimize the impact of outliers in data. You'll also learn how to use the caret package for performing machine learning in R. Furthermore, this book will help you understand how to interpret charts and plots to get insights for better decision making. By the end of this book, you will be able to apply your skills to statistical computations using R 3.5. You will also become well versed with a wide array of statistical techniques in R that are extensively used in the data science industry.

What you will learn
* Become well versed with recipes that will help you interpret plots with R
* Formulate advanced statistical models in R and understand the concepts behind them
* Perform Bayesian regression to predict models and impute missing data
* Use time series analysis for modelling and forecasting temporal data
* Implement a range of regression techniques for efficient data modelling
* Get to grips with robust statistics and hidden Markov models
* Explore ANOVA (Analysis of Variance) and perform hypothesis testing

Who this book is for
If you are a quantitative researcher, statistician, data analyst, or data scientist looking to tackle various challenges in statistics, this book is what you need! Proficiency in R programming and basic knowledge of linear algebra are necessary to follow along with the recipes covered in this book.
Build reporting applications and dashboards using the different MicroStrategy objects.

Key Features
* Learn the fundamentals of MicroStrategy
* Use MicroStrategy to get actionable insights from your business data
* Create visualizations and build intuitive dashboards and reports

Book Description
MicroStrategy is an enterprise business intelligence application. It turns data into reports for making and executing key organizational decisions. This book shows you how to implement Business Intelligence (BI) with MicroStrategy. It takes you from setting up and configuring MicroStrategy to security and administration.

The book starts by detailing the different components of the MicroStrategy platform and the key concepts of Metadata and Project Source. You will then install and configure MicroStrategy and lay down the foundations for building MicroStrategy BI solutions. By learning about objects and different object types, you will develop a strong understanding of the MicroStrategy Schema and Public Objects. With these MicroStrategy objects, you will enhance and scale your BI and analytics solutions. Finally, you will learn about the administration, security, and monitoring of your BI solution.

What you will learn
* Set up the MicroStrategy Intelligence Server and client tools
* Create a MicroStrategy metadata repository and your first Project
* Explore the main MicroStrategy object types and their dependencies
* Create, manipulate, and share Reports
* Create and share Dashboards
* Manage Users and Groups

Who this book is for
This book is for Business Intelligence professionals or data analysts who want to get started with MicroStrategy. Some basic understanding of BI and data analysis will be required to get the most from this book.
Data Analysis in Criminal Justice and Criminology: History, Concept, and Application breaks down various data analysis techniques to help students build their conceptual understanding of key methods and processes. The information in the text encourages discussion and consideration of how and why data analysis plays an important role in the fields of criminal justice and criminology. The book is divided into three units. Unit 1 discusses how data analysis is used in criminal justice and criminology, various methods of data collection, the importance of identifying the purpose of analysis and key data elements prior to analyzing information, and graphical representation of data. Unit 2 introduces students to samples, distributions, and the central limit theorem as it relates to data analysis. This section provides students with the essential knowledge and skills needed to understand statistical concepts and calculations. The final unit explains how to move beyond statistical description to statistical inference and how sample statistics can be used to estimate population parameters. Highly accessible in nature, Data Analysis in Criminal Justice and Criminology is ideal for undergraduate and graduate courses in criminal justice, criminology, and sociology, especially those with an emphasis on data analysis.
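The central limit theorem covered in Unit 2 can be illustrated with a short simulation: means of repeated samples from a skewed population cluster around the population mean and spread roughly like the theoretical standard error. The sketch below is purely illustrative and is not drawn from the book's exercises.

```python
# Simulating the central limit theorem: sample means of a skewed population
# are approximately normal around the population mean.
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)     # a skewed "population"
sample_means = [rng.choice(population, size=50).mean() for _ in range(2_000)]

print(f"population mean:      {population.mean():.3f}")
print(f"mean of sample means: {np.mean(sample_means):.3f}")
print(f"std of sample means:  {np.std(sample_means):.3f} "
      f"(theory ~ {population.std() / np.sqrt(50):.3f})")
```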
Progressively explore UI development with Shiny via practical examples.

Key Features
* Write a Shiny interface in pure HTML
* Explore powerful layout functions to make attractive dashboards and other intuitive interfaces
* Get to grips with Bootstrap and leverage it in your Shiny applications

Book Description
Although vanilla Shiny applications look attractive and offer some layout flexibility, you may still want more control over how the interface is laid out to produce a dashboard. Hands-On Dashboard Development with Shiny helps you incorporate this in your applications.

The book starts by guiding you in producing an application based on the diamonds dataset included in the ggplot2 package. You'll create a single application, but its interface will be reskinned and rebuilt throughout using different methods to illustrate their uses and functions, working with HTML, CSS, and JavaScript. You will also learn to develop an application that creates documents and reports using R Markdown. Furthermore, the book demonstrates the use of HTML templates and the Bootstrap framework. Moving along, you will learn how to produce dashboards using the Shiny command and dashboard package. Finally, you will learn how to lay out applications using a wide range of built-in functions. By the end of the book, you will have an understanding of the principles that underpin layout in Shiny applications, including sections of HTML added to a vanilla Shiny application, HTML interfaces written from scratch, dashboards, navigation bars, and interfaces.

What you will learn
* Add HTML to a Shiny application and write its interfaces from scratch in HTML
* Use built-in Shiny functions to produce attractive and flexible layouts
* Produce dashboards, adding icons and notifications
* Explore Bootstrap themes to lay out your applications
* Get insights into UI development with hands-on examples
* Use R Markdown to create and download reports

Who this book is for
If you have some experience writing Shiny applications and want to use HTML, CSS, and Bootstrap to make custom interfaces, then this book is for you.
Get unique insights from your data by combining the power of SQL Server, R, and Python.

Key Features
* Use the features of SQL Server 2017 to implement the data science project life cycle
* Leverage the power of R and Python to design and develop efficient data models
* Find unique insights from your data with powerful techniques for data preprocessing and analysis

Book Description
SQL Server only started to fully support data science with its two most recent editions. If you are a professional from both worlds, SQL Server and data science, and are interested in using SQL Server and Machine Learning (ML) Services for your projects, then this is the ideal book for you.

This book is the ideal introduction to data science with Microsoft SQL Server and In-Database ML Services. It covers all stages of a data science project, from business and data understanding, through data overview, data preparation, modeling and using algorithms, to model evaluation and deployment. You will learn to use the engines and languages that come with SQL Server, including ML Services with the R and Python languages and Transact-SQL. You will also learn how to choose which algorithm to use for which task, and understand how each algorithm works.

What you will learn
* Use the popular programming languages T-SQL, R, and Python for data science
* Understand your data with queries and introductory statistics
* Create and enhance datasets for ML
* Visualize and analyze data using basic and advanced graphs
* Explore ML using unsupervised and supervised models
* Deploy models in SQL Server and perform predictions

Who this book is for
SQL Server professionals who want to start with data science, and data scientists who would like to start using SQL Server in their projects, will find this book useful. Prior exposure to SQL Server will be helpful.
Get to grips with Kibana and its advanced functions to create interactive visualizations and dashboards.

Key Features
* Explore visualizations and perform histogram, stats, and map analytics
* Unleash X-Pack and Timelion, and learn about alerting, monitoring, and reporting features
* Manage dashboards with Beats and create machine learning jobs for faster analytics

Book Description
Kibana is one of the most popular tools among data enthusiasts for slicing and dicing large datasets and uncovering Business Intelligence (BI) with the help of its rich and powerful visualizations.

To begin with, Mastering Kibana 6.x quickly introduces you to the features of Kibana 6.x, before teaching you how to create smart dashboards in no time. You will explore metric analytics and graph exploration, followed by understanding how to quickly customize Kibana dashboards. In addition to this, you will learn advanced analytics such as maps, hits, and list analytics. All this will help you enhance your skills in running and comparing multiple queries and filters, influencing your data visualization skills at scale. With Kibana's Timelion feature, you can analyze time series data with histograms and stats analytics. By the end of this book, you will have created a speedy machine learning job using X-Pack capabilities.

What you will learn
* Create unique dashboards with various intuitive data visualizations
* Visualize Timelion expressions with added histograms and stats analytics
* Integrate X-Pack with your Elastic Stack in simple steps
* Extract data from Elasticsearch for advanced analysis and anomaly detection using dashboards
* Build dashboards from web applications for application logs
* Create monitoring and alerting dashboards using Beats

Who this book is for
Mastering Kibana 6.x is for you if you are a big data engineer, DevOps engineer, or data scientist aspiring to go beyond data visualization at scale and gain maximum insights from your large datasets. Basic knowledge of the Elastic Stack will be an added advantage, although not mandatory.
Explore powerful R packages to create predictive models using ensemble methods.

Key Features
* Implement machine learning algorithms to build efficient ensemble models
* Explore powerful R packages to create predictive models using ensemble methods
* Learn to build ensemble models on large datasets using a practical approach

Book Description
Ensemble techniques combine two or more similar or dissimilar machine learning algorithms to create a stronger model. Such a model delivers superior prediction power and can give your datasets a boost in accuracy.

Hands-On Ensemble Learning with R begins with the important statistical resampling methods. You will then walk through the central trilogy of ensemble techniques - bagging, random forest, and boosting - and learn how they can be used to provide greater accuracy on large datasets using popular R packages. You will learn how to combine model predictions using different machine learning algorithms to build ensemble models. In addition to this, you will explore how to improve the performance of your ensemble models. By the end of this book, you will have learned how machine learning algorithms can be combined to reduce common problems and build simple, efficient ensemble models with the help of real-world examples.

What you will learn
* Carry out an essential review of resampling methods, bootstrap, and jackknife
* Explore the key ensemble methods: bagging, random forests, and boosting
* Use multiple algorithms to make strong predictive models
* Enjoy a comprehensive treatment of boosting methods
* Supplement methods with statistical tests, such as ROC
* Walk through data structures in classification, regression, survival, and time series data
* Use the supplied R code to implement ensemble methods
* Learn the stacking method to combine heterogeneous machine learning models

Who this book is for
This book is for you if you are a data scientist or machine learning developer who wants to implement machine learning techniques by building ensemble models with the power of R. You will learn how to combine different machine learning algorithms to perform efficient data processing. Basic knowledge of machine learning techniques and programming knowledge of R would be an added advantage.
Leverage the capabilities of SAS to process and analyze Big Data.

About This Book
* Combine SAS with platforms such as Hadoop, SAP HANA, and Cloud Foundry-based platforms for efficient Big Data analytics
* Learn how to use the web browser-based SAS Studio and iPython Jupyter Notebook interfaces with SAS
* Practical, real-world examples on predictive modeling, forecasting, optimizing, and reporting your Big Data analysis with SAS

Who This Book Is For
SAS professionals and data analysts who wish to perform analytics on Big Data using SAS to gain actionable insights will find this book to be very useful. If you are a data science professional looking to perform large-scale analytics with SAS, this book will also help you. A basic understanding of SAS will be helpful, but is not mandatory.

What You Will Learn
* Configure a free version of SAS in order to do hands-on exercises dealing with data management, analysis, and reporting
* Understand the basic concepts of the SAS language, which consists of the data step (for data preparation) and procedures (or PROCs) for analysis
* Make use of the web browser-based SAS Studio and iPython Jupyter Notebook interfaces for coding in the SAS, DS2, and FedSQL programming languages
* Understand how the DS2 programming language plays an important role in Big Data preparation and analysis using SAS
* Integrate and work efficiently with Big Data platforms such as Hadoop, SAP HANA, and Cloud Foundry-based systems

In Detail
SAS has been recognized by Money Magazine and Payscale as one of the top business skills to learn in order to advance one's career. Through innovative data management, analytics, and business intelligence software and services, SAS helps customers solve their business problems by allowing them to make better decisions faster. This book introduces you to SAS and shows how you can use it to perform efficient analysis on data of any size, including Big Data. You will learn how to prepare data for analysis; perform predictive, forecasting, and optimization analysis; and then deploy or report on the results of these analyses. While working through the coding examples in this book, you will learn how to use the web browser-based SAS Studio and iPython Jupyter Notebook interfaces for working with SAS. Finally, you will learn how SAS's architecture is engineered and designed to scale up and/or out and to be combined with open source offerings such as Hadoop, Python, and R. By the end of this book, you will be able to clearly understand how you can efficiently analyze Big Data using SAS.

Style and approach
The book starts off by introducing you to SAS and the SAS programming language, which provides data management, analytical, and reporting capabilities. Most chapters include hands-on examples which highlight how SAS provides The Power to Know (c). You will learn that if you are looking to perform large-scale data analysis, SAS provides an open platform engineered and designed to scale both up and out, which allows the power of SAS to be combined with open source offerings such as Hadoop, Python, and R.
With Hands-On Recommendation Systems with Python, learn the tools and techniques required to build various kinds of powerful recommendation systems (collaborative, knowledge-based, and content-based) and deploy them to the web.

Key Features
* Build industry-standard recommender systems
* Only familiarity with Python is required
* No need to wade through complicated machine learning theory to use this book

Book Description
Recommendation systems are at the heart of almost every internet business today, from Facebook to Netflix to Amazon. Providing good recommendations, whether they're friends, movies, or groceries, goes a long way in defining user experience and enticing your customers to use your platform.

This book shows you how to do just that. You will learn about the different kinds of recommenders used in the industry and see how to build them from scratch using Python. There is no need to wade through tons of machine learning theory; you'll get started with building and learning about recommenders as quickly as possible. In this book, you will build an IMDB Top 250 clone and a content-based engine that works on movie metadata. You'll use collaborative filters to make use of customer behavior data, and a hybrid recommender that incorporates content-based and collaborative filtering techniques. With this book, all you need to get started with building recommendation systems is a familiarity with Python, and by the time you're finished, you will have a great grasp of how recommenders work and be in a strong position to apply the techniques you learn to your own problem domains.

What you will learn
* Get to grips with the different kinds of recommender systems
* Master data-wrangling techniques using the pandas library
* Build an IMDB Top 250 clone
* Build a content-based engine to recommend movies based on movie metadata
* Employ data-mining techniques used in building recommenders
* Build industry-standard collaborative filters using powerful algorithms
* Build hybrid recommenders that incorporate content-based and collaborative filtering

Who this book is for
If you are a Python developer and want to develop applications for social networking, news personalization, or smart advertising, this is the book for you. Basic knowledge of machine learning techniques will be helpful, but not mandatory.
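A content-based engine of the kind described above can be sketched with TF-IDF vectors over item text and cosine similarity. The toy movie data below is invented for illustration; the book itself works with IMDB metadata.

```python
# A tiny content-based recommender: TF-IDF over plot text, ranked by cosine
# similarity. Toy data only.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

movies = pd.DataFrame({
    "title": ["Space Saga", "Desert Quest", "Robot Dreams"],
    "overview": [
        "a crew explores deep space and alien worlds",
        "a lone traveller crosses a desert in search of water",
        "an android learns to dream about space travel",
    ],
})

tfidf = TfidfVectorizer(stop_words="english").fit_transform(movies["overview"])
similarity = cosine_similarity(tfidf)                 # pairwise similarity matrix

def recommend(title, top_n=2):
    idx = movies.index[movies["title"] == title][0]
    ranked = similarity[idx].argsort()[::-1][1:top_n + 1]   # skip the movie itself
    return movies["title"].iloc[ranked].tolist()

print(recommend("Space Saga"))
```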
Perform efficient, fast text representation and classification with Facebook's fastText library.

Key Features
* An introduction to Facebook's fastText library for NLP
* Perform efficient word representation, sentence classification, and vector representation
* Build better, more scalable solutions for text representation and classification

Book Description
Facebook's fastText library handles text representation and classification, which are used for Natural Language Processing (NLP). Most organizations have to deal with enormous amounts of text data on a daily basis, and gaining efficient data insights requires powerful NLP tools such as fastText.

This book is your ideal introduction to fastText. You will learn how to create fastText models from the command line, without the need for complicated code. You will explore the algorithms that fastText is built on and how to use them for word representation and text classification. Next, you will use fastText in conjunction with other popular libraries and frameworks such as Keras, TensorFlow, and PyTorch. Finally, you will deploy fastText models to mobile devices. By the end of this book, you will have all the required knowledge to use fastText in your own applications at work or in projects.

What you will learn
* Create models using the default command line options in fastText
* Understand the algorithms used in fastText to create word vectors
* Combine command line text transformation capabilities and the fastText library to implement a training, validation, and prediction pipeline
* Explore word representation and sentence classification using fastText
* Use Gensim and spaCy to load the vectors, transform, lemmatize, and perform other NLP tasks efficiently
* Develop a fastText NLP classifier using popular frameworks such as Keras, TensorFlow, and PyTorch

Who this book is for
This book is for data analysts, data scientists, and machine learning developers who want to perform efficient word representation and sentence classification using Facebook's fastText library. Basic knowledge of Python programming is required.
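For a feel of the library, here is a hedged sketch using fastText's official Python bindings. The corpus.txt and train.txt file names are hypothetical placeholders, and supervised training data is expected to carry __label__-prefixed labels, one example per line.

```python
# Minimal fastText usage: unsupervised word vectors and supervised text
# classification. File names are placeholders and must exist locally.
import fasttext

# Unsupervised word vectors (skipgram is one of the models fastText supports).
vec_model = fasttext.train_unsupervised("corpus.txt", model="skipgram")
print(vec_model.get_word_vector("data")[:5])        # first few vector components

# Supervised text classification on __label__-annotated lines.
clf = fasttext.train_supervised(input="train.txt", epoch=5, lr=0.5)
print(clf.predict("which label does this sentence get"))
```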
Turn your noisy data into relevant, insight-ready information by leveraging the data wrangling techniques in Python and R.

About This Book
* This easy-to-follow guide takes you through every step of the data wrangling process in the best possible way
* Work with different types of datasets, and reshape the layout of your data to make it easier for analysis
* Get simple examples and real-life data wrangling solutions for data pre-processing

Who This Book Is For
If you are a data scientist, data analyst, or a statistician who wants to learn how to wrangle your data for analysis in the best possible manner, this book is for you. As this book covers both R and Python, some understanding of them will be beneficial.

What You Will Learn
* Read a CSV file into Python and R, and print out some statistics on the data
* Gain knowledge of the data formats and programming structures involved in retrieving API data
* Make effective use of regular expressions in the data wrangling process
* Explore the tools and packages available to prepare numerical data for analysis
* Find out how to have better control over manipulating the structure of the data
* Develop the dexterity to programmatically read, audit, correct, and shape data
* Write and complete programs to take in, format, and output data sets

In Detail
Around 80% of time in data analysis is spent on cleaning and preparing data for analysis. This is, however, an important task, and is a prerequisite to the rest of the data analysis workflow, including visualization, analysis, and reporting. Python and R are popular choices of tools for data analysis, and have packages that can be best used to manipulate different kinds of data, as per your requirements. This book will show you the different data wrangling techniques, and how you can leverage the power of Python and R packages to implement them.

You'll start by understanding the data wrangling process and get a solid foundation to work with different types of data. You'll work with different data structures and acquire and parse data from various locations. You'll also see how to reshape the layout of data and manipulate, summarize, and join data sets. Finally, we conclude with a quick primer on accessing and processing data from databases, conducting data exploration, and storing and retrieving data quickly using databases. The book includes practical examples on each of these points using simple and real-world data sets to give you an easier understanding. By the end of the book, you'll have a thorough understanding of all the data wrangling concepts and how to implement them in the best possible way.

Style and approach
This is a practical book on data wrangling designed to give you an insight into the practical application of data wrangling. It takes you through complex concepts and tasks in an accessible way, featuring information on a wide range of data wrangling techniques with Python and R.
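As a flavour of the Python side of that workflow, the sketch below reads a CSV file, cleans a messy column with a regular expression, and prints summary statistics. The file and column names are hypothetical and are not taken from the book.

```python
# A small data-wrangling sketch: load, clean with a regex, and summarise.
import pandas as pd

df = pd.read_csv("measurements.csv")                      # raw data to clean

# Pull the numeric part out of strings such as "12.5 kg" and cast to float.
df["weight_kg"] = (
    df["weight"].astype(str).str.extract(r"([\d.]+)", expand=False).astype(float)
)

df = df.dropna(subset=["weight_kg"])                      # drop unparseable rows
print(df["weight_kg"].describe())                         # quick summary statistics
```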
This book takes you on a fantastic journey to discover the attributes of big data using Apache Hive.

Key Features
* Grasp the skills needed to write efficient Hive queries to analyze Big Data
* Discover how Hive can coexist and work with other tools within the Hadoop ecosystem
* Uses practical, example-oriented scenarios to cover all the newly released features of Apache Hive 2.3.3

Book Description
In this book, we prepare you for your journey into big data by first introducing you to the background of the big data domain, along with the process of setting up and getting familiar with your Hive working environment. Next, the book guides you through discovering and transforming the values of big data with the help of examples. It also hones your skills in using the Hive language in an efficient manner. Toward the end, the book focuses on advanced topics, such as performance, security, and extensions in Hive, which will guide you on exciting adventures on this worthwhile big data journey. By the end of the book, you will be familiar with Hive and able to work efficiently to find solutions to big data problems.

What you will learn
* Create and set up the Hive environment
* Discover how to use Hive's definition language to describe data
* Discover interesting data by joining and filtering datasets in Hive
* Transform data by using Hive sorting, ordering, and functions
* Aggregate and sample data in different ways
* Boost Hive query performance and enhance data security in Hive
* Customize Hive to your needs by using user-defined functions and integrate it with other tools

Who this book is for
If you are a data analyst, developer, or simply someone who wants to quickly get started with Hive to explore and analyze Big Data in Hadoop, this is the book for you. Since Hive is an SQL-like language, some previous experience with SQL will be useful to get the most out of this book.
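The book teaches Hive primarily through HiveQL itself, but one common way to run such queries programmatically is the PyHive client, sketched below. It assumes a HiveServer2 instance reachable on localhost and a web_logs table; neither assumption comes from the book.

```python
# Running a HiveQL aggregation from Python via PyHive (illustrative sketch).
from pyhive import hive

conn = hive.Connection(host="localhost", port=10000, database="default")
cursor = conn.cursor()
cursor.execute("""
    SELECT status, COUNT(*) AS hits
    FROM web_logs
    GROUP BY status
    ORDER BY hits DESC
""")
for status, hits in cursor.fetchall():
    print(status, hits)
conn.close()
```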
Combine the power of Apache Spark and Python to build effective big data applications.

Key Features
* Perform effective data processing, machine learning, and analytics using PySpark
* Overcome challenges in developing and deploying Spark solutions using Python
* Explore recipes for efficiently combining Python and Apache Spark to process data

Book Description
Apache Spark is an open source framework for efficient cluster computing with a strong interface for data parallelism and fault tolerance. The PySpark Cookbook presents effective and time-saving recipes for leveraging the power of Python and putting it to use in the Spark ecosystem.

You'll start by learning the Apache Spark architecture and how to set up a Python environment for Spark. You'll then get familiar with the modules available in PySpark and start using them effortlessly. In addition to this, you'll discover how to abstract data with RDDs and DataFrames, and understand the streaming capabilities of PySpark. You'll then move on to using ML and MLlib in order to solve any problems related to the machine learning capabilities of PySpark and use GraphFrames to solve graph-processing problems. Finally, you will explore how to deploy your applications to the cloud using the spark-submit command. By the end of this book, you will be able to use the Python API for Apache Spark to solve any problems associated with building data-intensive applications.

What you will learn
* Configure a local instance of PySpark in a virtual environment
* Install and configure Jupyter in local and multi-node environments
* Create DataFrames from JSON and a dictionary using pyspark.sql
* Explore regression and clustering models available in the ML module
* Use DataFrames to transform data used for modeling
* Connect to PubNub and perform aggregations on streams

Who this book is for
The PySpark Cookbook is for you if you are a Python developer looking for hands-on recipes for using the Apache Spark 2.x ecosystem in the best possible way. A thorough understanding of Python (and some familiarity with Spark) will help you get the best out of the book.
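A brief sketch of the kind of recipe described above: create a SparkSession, build a small DataFrame, and run a simple aggregation. It assumes a local Spark installation and uses made-up weather rows for illustration.

```python
# Minimal PySpark usage: SparkSession, DataFrame creation, and an aggregation.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cookbook-sketch").getOrCreate()

rows = [("Cape Town", 21.0), ("Cape Town", 24.0), ("Durban", 27.5)]
df = spark.createDataFrame(rows, ["city", "temp"])    # DataFrame from local rows

df.groupBy("city").agg(F.avg("temp").alias("avg_temp")).show()
spark.stop()
```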
Learn to view, edit, and analyse geospatial data using QGIS and Python 3.

Key Features
* Leverage the power of QGIS to add professionalism to your maps
* Explore and work with newly released features such as Python 3, GeoPackage, 3D views, and print layouts in QGIS 3.4
* Build your own plugins and customize maps using Qt Designer

Book Description
QGIS 3.4 is the first LTR (long-term release) of QGIS version 3. This is a giant leap forward for the project, with tons of new features and impactful changes. Learn QGIS is fully updated for QGIS 3.4, covering its processing engine update, Python 3 as the de facto coding environment, and the GeoPackage format.

This book will help you get started on your QGIS journey, guiding you to develop your own processing pathway. You will explore the user interface, load your data, and then edit and create data. QGIS often surprises new users with its mapping capabilities; you will discover how easily you can style and create your first map. But that's not all! In the final part of the book, you'll learn about spatial analysis and the powerful tools in QGIS, and conclude by looking at Python processing options. By the end of the book, you will have become proficient in geospatial analysis using QGIS and Python.

What you will learn
* Explore various ways to load data into QGIS
* Understand how to style data and present it in a map
* Create maps and explore ways to expand them
* Get acquainted with the new processing toolbox in QGIS 3.4
* Manipulate your geospatial data and gain quality insights
* Understand how to customize QGIS 3.4
* Work with QGIS 3.4 in 3D

Who this book is for
If you are a developer or consultant familiar with the basic functions and processes of GIS and want to learn how to use QGIS to analyze geospatial data and create rich mapping applications, this book is for you. You'll also find this book useful if you're new to QGIS and wish to grasp its fundamentals.
Build, manage, and configure high-performing, reliable NoSQL databases for your applications with Cassandra.

Key Features
* Write programs more efficiently using Cassandra's features with the help of examples
* Configure Cassandra and fine-tune its parameters depending on your needs
* Integrate the Cassandra database with Apache Spark and build a strong data analytics pipeline

Book Description
With ever-increasing rates of data creation comes the demand to store data as fast and reliably as possible. Apache Cassandra is the perfect choice for building fault-tolerant and scalable databases. Mastering Apache Cassandra 3.x teaches you how to build and architect your clusters, configure and work with your nodes, and program in a high-throughput environment, helping you understand the power of Cassandra as per the new features.

Once you've covered a brief recap of the basics, you'll move on to deploying and monitoring a production setup and optimizing and integrating it with other software. You'll work with the advanced features of CQL and the new storage engine in order to understand how they function on the server side. You'll explore the integration and interaction of Cassandra components, followed by discovering features such as the token allocation algorithm, CQL3, vnodes, lightweight transactions, and data modelling in detail. Last but not least, you will get to grips with Apache Spark. By the end of this book, you'll be able to analyse big data, and build and manage high-performance databases for your application.

What you will learn
* Write programs more efficiently using Cassandra's features
* Exploit the given infrastructure, improve performance, and tweak the Java Virtual Machine (JVM)
* Use CQL3 in your application in order to simplify working with Cassandra
* Configure Cassandra and fine-tune its parameters depending on your needs
* Set up a cluster and learn how to scale it
* Monitor a Cassandra cluster in different ways
* Use Apache Spark and other big data processing tools

Who this book is for
Mastering Apache Cassandra 3.x is for you if you are a big data administrator, database administrator, architect, or developer who wants to build a high-performing, scalable, and fault-tolerant database. Prior knowledge of core database concepts is required.
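For a feel of the programming side, here is a minimal sketch using the DataStax Python driver. The demo keyspace, users table, and SimpleStrategy replication settings are hypothetical examples, and a Cassandra node is assumed to be running locally.

```python
# Connecting to a local Cassandra node and running basic CQL statements.
from uuid import uuid4
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("CREATE TABLE IF NOT EXISTS demo.users (id uuid PRIMARY KEY, name text)")

session.execute("INSERT INTO demo.users (id, name) VALUES (%s, %s)", (uuid4(), "Ada"))
for row in session.execute("SELECT id, name FROM demo.users"):
    print(row.id, row.name)

cluster.shutdown()
```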
Learn how to create interactive and visually aesthetic plots using the Bokeh package in Python.

Key Features
* A step-by-step approach to creating interactive plots with Bokeh
* Go from installation all the way to deploying your very own Bokeh application
* Work with real-world datasets to practice and create your very own plots and applications

Book Description
Adding a layer of interactivity to your plots and converting these plots into applications holds immense value in the field of data science. The standard approach to adding interactivity would be to use paid software such as Tableau, but the Bokeh package in Python offers users a way to create both interactive and visually aesthetic plots for free. This book gets you up to speed with Bokeh, a popular Python library for interactive data visualization.

The book starts out by helping you understand how Bokeh works internally and how you can set up and install the package on your local machine. You then use a real-world dataset of stock data from Kaggle to create interactive and visually stunning plots. You will also learn how to leverage Bokeh using some advanced concepts, such as plotting with spatial and geo data. Finally, you will use all the concepts that you have learned in the previous chapters to create your very own Bokeh application from scratch. By the end of the book, you will be able to create your very own Bokeh application. You will have gone through a step-by-step process that starts with understanding what Bokeh actually is and ends with building your very own Bokeh application filled with interactive and visually aesthetic plots.

What you will learn
* Install Bokeh and understand its key concepts
* Create plots using glyphs, the fundamental building blocks of Bokeh
* Create plots using different data structures such as NumPy and pandas
* Use layouts and widgets to visually enhance your plots and add a layer of interactivity
* Build and host applications on the Bokeh server
* Create advanced plots using spatial data

Who this book is for
This book is well suited for data scientists and data analysts who want to perform interactive data visualization on their web browsers using Bokeh. Some exposure to Python programming will be helpful, but prior experience with Bokeh is not required.
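A short Bokeh example in the spirit of the book's opening chapters is sketched below: a line glyph and scatter markers on a figure, written to a standalone HTML file. The data is made up for illustration.

```python
# A minimal Bokeh plot saved as a standalone HTML document.
from bokeh.plotting import figure, output_file, save

x = [1, 2, 3, 4, 5]
y = [2.1, 3.4, 2.8, 5.0, 4.2]

p = figure(title="Closing price (toy data)", x_axis_label="day", y_axis_label="price")
p.line(x, y, line_width=2)          # a line glyph, one of Bokeh's building blocks
p.scatter(x, y, size=8)             # markers on the same figure

output_file("toy_plot.html")        # write a standalone HTML document
save(p)
```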
You may like...

Cloud-Based Big Data Analytics in… (Ram Shringar Rao, Nanhay Singh, …), Hardcover, R7,384, Discovery Miles 73 840
Big Data, IoT, and Machine Learning… (Rashmi Agrawal, Marcin Paprzycki, …), Paperback, R1,656, Discovery Miles 16 560
Cross-Cultural Analysis of Image-Based… (Lisa Keller, Robert Keller, …), Hardcover, R3,599, Discovery Miles 35 990
Data Analytics for Social Microblogging… (Soumi Dutta, Asit Kumar Das, …), Paperback, R3,454, Discovery Miles 34 540
Big Data - Concepts, Methodologies… (Information Reso Management Association), Hardcover, R19,596, Discovery Miles 195 960
Insightful Data Visualization with SAS… (Falko Schulz, Travis Murphy), Hardcover, R1,248, Discovery Miles 12 480