Over 60 recipes to model and handle real-life biological data using
modern libraries from the R ecosystem Key Features Apply modern R
packages to handle biological data using real-world examples
Represent biological data with advanced visualizations suitable for
research and publications Handle real-world problems in
bioinformatics such as next-generation sequencing, metagenomics,
and automating analyses Book Description Handling biological data
effectively requires an in-depth knowledge of machine learning
techniques and computational skills, along with an understanding of
how to use tools such as edgeR and DESeq. With the R Bioinformatics
Cookbook, you'll explore all this and more, tackling common and
not-so-common challenges in the bioinformatics domain using
real-world examples. This book will use a recipe-based approach to
show you how to perform practical research and analysis in
computational biology with R. You will learn how to effectively
analyze your data with the latest tools in Bioconductor, ggplot,
and tidyverse. The book will guide you through the essential tools
in Bioconductor to help you understand and carry out protocols in
RNAseq, phylogenetics, genomics, and sequence analysis. As you
progress, you will get up to speed with how machine learning
techniques can be used in the bioinformatics domain. You will
gradually develop key computational skills such as creating
reusable workflows in R Markdown and packages for code reuse. By
the end of this book, you'll have gained a solid understanding of
the most important and widely used techniques in bioinformatic
analysis and the tools you need to work with real biological data.
What you will learn Employ Bioconductor to determine differential
expression in RNAseq data Run SAMtools and develop pipelines to
find single nucleotide polymorphisms (SNPs) and Indels Use ggplot
to create and annotate a range of visualizations Query external
databases with Ensembl to find functional genomics information
Execute large-scale multiple sequence alignment with DECIPHER to
perform comparative genomics Use d3.js and Plotly to create dynamic
and interactive web graphics Use k-nearest neighbors, support
vector machines and random forests to find groups and classify data
Who this book is for This book is for bioinformaticians, data analysts, researchers, and R developers who want to address intermediate-to-advanced biological and bioinformatics problems by learning through a recipe-based approach. A working knowledge of the R programming language and basic knowledge of bioinformatics are prerequisites.
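The kNN, SVM, and random forest recipes in this book are implemented in R; purely as a language-neutral illustration of the first of those ideas, here is a minimal k-nearest-neighbours classifier sketched in Python (the toy data and function name are invented for the example):

```python
from collections import Counter
import math

def knn_predict(train, labels, point, k=3):
    """Classify `point` by majority vote among its k nearest training points.
    `train` is a list of numeric feature tuples, `labels` the matching classes."""
    dists = sorted(
        (math.dist(x, point), lab) for x, lab in zip(train, labels)
    )
    votes = Counter(lab for _, lab in dists[:k])
    return votes.most_common(1)[0][0]

# Toy expression-like data: two features per sample, two classes.
train = [(1.0, 1.1), (0.9, 1.0), (3.0, 3.2), (3.1, 2.9)]
labels = ["low", "low", "high", "high"]
print(knn_predict(train, labels, (2.9, 3.0)))  # nearest neighbours vote "high"
```

Each query point is labelled by a majority vote among its k closest training points; the distance metric and the value of k are the main tuning choices.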
Discover the power of location data to build effective, intelligent
data models with Geospatial ecosystems Key Features Manipulate
location-based data and create intelligent geospatial data models
Build effective location recommendation systems used by popular
companies such as Uber A hands-on guide to help you consume spatial
data and parallelize GIS operations effectively Book
Description Data scientists, who have access to vast data streams,
are a bit myopic when it comes to intrinsic and extrinsic
location-based data and are missing out on the intelligence it can
provide to their models. This book demonstrates effective
techniques for using the power of data science and geospatial
intelligence to build effective, intelligent data models that make
use of location-based data to give useful predictions and analyses.
This book begins with a quick overview of the fundamentals of
location-based data and how techniques such as Exploratory Data
Analysis can be applied to it. We then delve into spatial
operations such as computing distances, areas, extents, centroids,
buffer polygons, intersecting geometries, geocoding, and more,
which add additional context to location data. Moving ahead, you
will learn how to quickly build and deploy a geo-fencing system
using Python. Lastly, you will learn how to leverage geospatial
analysis techniques in popular recommendation systems such as
collaborative filtering and location-based recommendations, and
more. By the end of the book, you will be a rockstar when it comes
to performing geospatial analysis with ease. What you will learn
Learn how companies now use location data Set up your Python
environment and install Python geospatial packages Visualize
spatial data as graphs Extract geometry from spatial data Perform
spatial regression from scratch Build web applications that dynamically reference geospatial data Who this book is for Data
Scientists who would like to leverage location-based data and want
to use location-based intelligence in their data models will find
this book useful. This book is also for GIS developers who wish to
incorporate data analysis in their projects. Knowledge of Python
programming and some basic understanding of data analysis are all
you need to get the most out of this book.
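Among the spatial operations this book covers is computing distances between locations. As a minimal sketch of the idea, assuming a spherical Earth (real GIS libraries refine this with ellipsoid models), the haversine formula can be written in plain Python:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points,
    assuming a spherical Earth of radius 6371 km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# London to Paris, roughly 344 km as the crow flies.
print(round(haversine_km(51.5074, -0.1278, 48.8566, 2.3522)))
```

Distances like this feed directly into buffer, extent, and geo-fencing computations, which is why they appear so early in geospatial workflows.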
Build, manage, and configure a high-performing, reliable NoSQL database for your applications with Cassandra Key Features Write
programs more efficiently using Cassandra's features with the help
of examples Configure Cassandra and fine-tune its parameters
depending on your needs Integrate the Cassandra database with Apache Spark and build a strong data analytics pipeline Book Description With ever-increasing rates of data creation, storing data quickly and reliably has become a pressing need. Apache Cassandra is the perfect
choice for building fault-tolerant and scalable databases.
Mastering Apache Cassandra 3.x teaches you how to build and
architect your clusters, configure and work with your nodes, and
program in a high-throughput environment, helping you understand the power of Cassandra through its new features. Once you've covered
a brief recap of the basics, you'll move on to deploying and
monitoring a production setup and optimizing and integrating it
with other software. You'll work with the advanced features of CQL
and the new storage engine in order to understand how they function
on the server-side. You'll explore the integration and interaction
of Cassandra components, followed by discovering features such as the token allocation algorithm, CQL3, vnodes, lightweight transactions, and data modelling in detail. Last but not least, you will get to
grips with Apache Spark. By the end of this book, you'll be able to
analyse big data, and build and manage high-performance databases
for your application. What you will learn Write programs more efficiently using Cassandra's features Exploit the
given infrastructure, improve performance, and tweak the Java
Virtual Machine (JVM) Use CQL3 in your application in order to
simplify working with Cassandra Configure Cassandra and fine-tune
its parameters depending on your needs Set up a cluster and learn
how to scale it Monitor a Cassandra cluster in different ways Use
Apache Spark and other big data processing tools Who this book is
for Mastering Apache Cassandra 3.x is for you if you are a big data
administrator, database administrator, architect, or developer who
wants to build a high-performing, scalable, and fault-tolerant
database. Prior knowledge of core concepts of databases is
required.
Supervised and unsupervised machine learning made easy in Scala
with this quick-start guide. Key Features Construct and deploy
machine learning systems that learn from your data and give
accurate predictions Unleash the power of Spark ML along with
popular machine learning algorithms to solve complex tasks in
Scala. Solve hands-on problems by combining popular neural network
architectures such as LSTM and CNN using Scala with the DeepLearning4j library Book Description Scala is a highly scalable language that integrates object-oriented and functional programming concepts, making it easy to build scalable and complex big data applications.
This book is a handy guide for machine learning developers and data
scientists who want to develop and train effective machine learning
models in Scala. The book starts with an introduction to machine
learning, while covering deep learning and machine learning basics.
It then explains how to use Scala-based ML libraries to solve
classification and regression problems using linear regression,
generalized linear regression, logistic regression, support vector
machine, and Naive Bayes algorithms. It also covers tree-based
ensemble techniques for solving both classification and regression
problems. Moving ahead, it covers unsupervised learning techniques,
such as dimensionality reduction, clustering, and recommender
systems. Finally, it provides a brief overview of deep learning
using a real-life example in Scala. What you will learn Get
acquainted with JVM-based machine learning libraries for Scala such
as Spark ML and Deeplearning4j Learn RDDs, DataFrames, and Spark SQL
for analyzing structured and unstructured data Understand
supervised and unsupervised learning techniques with best practices
and pitfalls Learn classification and regression analysis with
linear regression, logistic regression, Naive Bayes, support vector
machine, and tree-based ensemble techniques Learn effective ways of
clustering analysis with dimensionality reduction techniques Learn
recommender systems with the collaborative filtering approach Delve
into deep learning and neural network architectures Who this book
is for This book is for machine learning developers looking to train
machine learning models in Scala without spending too much time and
effort. Some fundamental knowledge of Scala programming and some
basics of statistics and linear algebra is all you need to get
started with this book.
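The recommender-systems material here is presented in Scala with Spark ML; purely to illustrate the collaborative-filtering idea, a minimal user-based version can be sketched in Python (the ratings data and names are invented for the example):

```python
import math

def cosine(u, v):
    """Cosine similarity between two users, over co-rated items only."""
    common = [i for i in u if i in v]
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        s = cosine(ratings[user], r)
        num += s * r[item]
        den += abs(s)
    return num / den if den else None

ratings = {
    "ann": {"m1": 5, "m2": 4},
    "bob": {"m1": 5, "m2": 4, "m3": 1},
    "cat": {"m1": 1, "m2": 2, "m3": 5},
}
print(round(predict(ratings, "ann", "m3"), 2))
```

Real systems replace this brute-force loop with matrix factorization or approximate nearest-neighbour search, but the similarity-weighted average is the core idea.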
Understand data science concepts and methodologies to manage and
deliver top-notch solutions for your organization Key Features
Learn the basics of data science and explore its possibilities and
limitations Manage data science projects and assemble teams
effectively even in the most challenging situations Understand
management principles and approaches for data science projects to
streamline the innovation process Book Description Data science and
machine learning can transform any organization and unlock new
opportunities. However, employing the right management strategies
is crucial to guide the solution from prototype to production.
Traditional approaches often fail as they don't entirely meet the
conditions and requirements necessary for current data science
projects. In this book, you'll explore the right approach to data
science project management, along with useful tips and best
practices to guide you along the way. After understanding the
practical applications of data science and artificial intelligence,
you'll see how to incorporate them into your solutions. Next, you
will go through the data science project life cycle, explore the
common pitfalls encountered at each step, and learn how to avoid
them. Any data science project requires a skilled team, and this
book will offer the right advice for hiring and growing a data
science team for your organization. Later, you'll be shown how to
efficiently manage and improve your data science projects through
the use of DevOps and ModelOps. By the end of this book, you will
be well versed with various data science solutions and have gained
practical insights into tackling the different challenges that
you'll encounter on a daily basis. What you will learn Understand
the underlying problems of building a strong data science pipeline
Explore the different tools for building and deploying data science
solutions Hire, grow, and sustain a data science team Manage data
science projects through all stages, from prototype to production
Learn how to use ModelOps to improve your data science pipelines
Get up to speed with the model testing techniques used in both
development and production stages Who this book is for This book is
for data scientists, analysts, and program managers who want to use
data science for business productivity by incorporating data
science workflows efficiently. Some understanding of basic data
science concepts will be useful to get the most out of this book.
Learn how to use R to apply powerful machine learning methods and
gain insight into real-world applications using clustering,
logistic regression, random forests, support vector machines, and
more. Key Features Use R 3.5 to implement real-world examples in
machine learning Implement key machine learning algorithms to
understand the working mechanism of smart models Create end-to-end
machine learning pipelines using modern libraries from the R
ecosystem Book Description Machine Learning with R Quick Start Guide
takes you on a data-driven journey that starts with the very basics
of R and machine learning. It gradually builds upon core concepts
so you can handle the varied complexities of data and understand
each stage of the machine learning pipeline. From data collection
to implementing Natural Language Processing (NLP), this book covers
it all. You will implement key machine learning algorithms to
understand how they are used to build smart models. You will cover
tasks such as clustering, logistic regression, random forests,
support vector machines, and more. Furthermore, you will also look
at more advanced aspects such as training neural networks and topic
modeling. By the end of the book, you will be able to apply the
concepts of machine learning, deal with data-related problems, and
solve them using the powerful yet simple language that is R. What
you will learn Introduce yourself to the basics of machine learning
with R 3.5 Get to grips with R techniques for cleaning and
preparing your data for analysis and visualizing your results Learn
to build predictive models with the help of various machine
learning techniques Use R to visualize data spread across multiple
dimensions and extract useful features Use interactive data
analysis with R to get insights into data Implement supervised and
unsupervised learning, and NLP using R libraries Who this book is
for This book is for graduate students, aspiring data scientists,
and data analysts who wish to enter the field of machine learning
and are looking to implement machine learning techniques and
methodologies from scratch using R 3.5. A working knowledge of the
R programming language is expected.
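The predictive-modelling examples in this book are written in R; the least-squares idea underlying simple linear regression can nonetheless be sketched in a few lines of Python (the toy data is invented for the example):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, via the normal equations."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Noise-free data lying exactly on y = 2 + 3x, so OLS recovers it exactly.
a, b = fit_line([0, 1, 2, 3], [2, 5, 8, 11])
print(a, b)  # 2.0 3.0
```

With noisy data the recovered coefficients only approximate the generating line, which is where the model-evaluation techniques covered later come in.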
Solve business challenges with Microsoft Power BI's advanced
visualization and data analysis techniques Key Features Create
effective storytelling reports by implementing
simple-to-intermediate Power BI features Develop powerful
analytical models to extract key insights for changing business
needs Build, publish, and share impressive dashboards for your
organization Book Description To succeed in today's transforming
business world, organizations need business intelligence
capabilities to make smarter decisions faster than ever before.
This Power BI book is an entry-level guide that will get you up and
running with data modeling, visualization, and analytical
techniques from scratch. You'll find this book handy if you want to
get well-versed with the extensive Power BI ecosystem. You'll start
by covering the basics of business intelligence and installing
Power BI. You'll then learn the wide range of Power BI features to
unlock business insights. As you progress, the book will take you
through how to use Power Query to ingest, cleanse, and shape your
data, and use Power BI DAX to create simple to complex
calculations. You'll also be able to add a variety of interactive
visualizations to your reports to bring your data to life. Finally,
you'll gain hands-on experience in creating visually stunning
reports that speak to business decision makers, and see how you can
securely share these reports and collaborate with others. By the
end of this book, you'll be ready to create simple, yet effective,
BI reports and dashboards using the latest features of Power BI.
What you will learn Explore the different features of Power BI to
create interactive dashboards Use the Query Editor to import and
transform data Perform simple and complex DAX calculations to
enhance analysis Discover business insights and tell a story with
your data using Power BI Explore data and learn to manage datasets,
dataflows, and data gateways Use workspaces to collaborate with
others and publish your reports Who this book is for If you're an IT
manager, data analyst, or BI user new to using Power BI for solving
business intelligence problems, this book is for you. You'll also
find this book useful if you want to migrate from other BI tools to
create powerful and interactive dashboards. No experience of
working with Power BI is expected.
Document the architecture of your software easily with this highly
practical, open-source template. Key Features Get to grips with
leveraging the features of arc42 to create insightful documents
Learn the concepts of software architecture documentation through
real-world examples Discover techniques to create compact, helpful,
and easy-to-read documentation Book Description When developers
document the architecture of their systems, they often invent their
own specific ways of articulating structures, designs, concepts,
and decisions. What they need is a template that enables simple and
efficient software architecture documentation. arc42 by Example
shows how it's done through several real-world examples. Each
example in the book, whether it is a chess engine, a huge CRM
system, or a cool web system, starts with a brief description of
the problem domain and the quality requirements. Then, you'll
discover the system context with all the external interfaces.
You'll dive into an overview of the solution strategy to implement
the building blocks and runtime scenarios. The later chapters also
explain various cross-cutting concerns and how they affect other
aspects of a program. What you will learn Utilize arc42 to document
a system's physical infrastructure Learn how to identify a system's
scope and boundaries Break a system down into building blocks and
illustrate the relationships between them Discover how to describe
the runtime behavior of a system Know how to document design
decisions and their reasons Explore the risks and technical debt of
your system Who this book is for This book is for software
developers and solutions architects who are looking for an easy,
open-source tool to document their systems. It is a useful
reference for those who are already using arc42. If you are new to
arc42, this book is a great learning resource. Those of you who
want to write better technical documentation will benefit from the
general concepts covered in this book.
A fast-paced guide that will help you to create, read, update, and
delete data using MongoDB Key Features Create secure databases with
MongoDB Manipulate and maintain your database Model and use data in
a NoSQL environment with MongoDB Book Description MongoDB has grown
to become the de facto NoSQL database with millions of users, from
small start-ups to Fortune 500 companies. It can solve problems
that are considered difficult, if not impossible, for aging RDBMS
technologies. Written for version 4 of MongoDB, this book is the
easiest way to get started with MongoDB. You will start by getting
a MongoDB installation up and running in a safe and secure manner.
You will learn how to perform mission-critical create, read,
update, and delete operations, and set up database security. You
will also learn about advanced features of MongoDB such as the
aggregation pipeline, replication, and sharding. You will learn how
to build a simple web application that uses MongoDB to respond to
AJAX queries, and see how to make use of the MongoDB driver for PHP. The examples incorporate new features
available in MongoDB version 4 where appropriate. What you will
learn Get a standard MongoDB database up and running quickly
Perform simple CRUD operations on the database using the MongoDB
command shell Set up a simple aggregation pipeline to return
subsets of data grouped, sorted, and filtered Safeguard your data
via replication and handle massive amounts of data via sharding
Publish data from a web form to the database using a programming
language driver Explore the basic CRUD operations performed using
the PHP MongoDB driver Who this book is for Web developers, IT professionals, and Database Administrators (DBAs) who want to learn
how to create and manage MongoDB databases.
Develop, deploy, and streamline your data science projects with the
most popular end-to-end platform, Anaconda Key Features Use Anaconda to find solutions for clustering, classification, and linear regression Analyze your data efficiently with the most powerful data science stack Use the Anaconda Cloud to store, share, and discover projects and libraries Book Description Anaconda
is an open source platform that brings together the best tools for
data science professionals with more than 100 popular packages
supporting Python, Scala, and R languages. Hands-On Data Science
with Anaconda gets you started with Anaconda and demonstrates how
you can use it to perform data science operations in the real
world. The book begins with setting up the environment for the Anaconda platform in order to make it accessible for tools and frameworks
such as Jupyter, pandas, matplotlib, Python, R, Julia, and more.
You'll walk through the package manager Conda, through which you can
automatically manage all packages including cross-language
dependencies, and work across Linux, macOS, and Windows. You'll
explore all the essentials of data science and linear algebra to
perform data science tasks using packages such as SciPy,
contrastive, scikit-learn, Rattle, and Rmixmod. Once you're
accustomed to all this, you'll start with operations in data
science such as cleaning, sorting, and data classification. You'll
move on to learning how to perform tasks such as clustering,
regression, prediction, and building machine learning models and
optimizing them. In addition to this, you'll learn how to visualize
data using the packages available for Julia, Python, and R. What
you will learn Perform cleaning, sorting, classification,
clustering, regression, and dataset modeling using Anaconda Use the conda package manager to discover, install, and use functionally
efficient and scalable packages Get comfortable with heterogeneous
data exploration using multiple languages within a project Perform
distributed computing and use Anaconda Accelerate to optimize
computational powers Discover and share packages, notebooks, and
environments, and use shared project drives on Anaconda Cloud
Tackle advanced data prediction problems Who this book is
for Hands-On Data Science with Anaconda is for you if you are a
developer who is looking for the best tools in the market to
perform data science. It's also ideal for data analysts and data
science professionals who want to improve the efficiency of their
data science applications by using the best libraries in multiple
languages. Basic programming knowledge with R or Python and
introductory knowledge of linear algebra is expected.
Gain useful insights from your data using popular data science
tools Key Features A one-stop guide to Python libraries such as
pandas and NumPy Comprehensive coverage of data science operations
such as data cleaning and data manipulation Choose scalable
learning algorithms for your data science tasks Book
Description Fully expanded and upgraded, the latest edition of
Python Data Science Essentials will help you succeed in data
science operations using the most common Python libraries. This
book offers up-to-date insight into the core of Python, including
the latest versions of the Jupyter Notebook, NumPy, pandas, and
scikit-learn. The book covers detailed examples and large hybrid
datasets to help you grasp essential statistical techniques for
data collection, data munging and analysis, visualization, and
reporting activities. You will also gain an understanding of
advanced data science topics such as machine learning algorithms,
distributed computing, tuning predictive models, and natural
language processing. Furthermore, you'll be introduced to deep
learning and gradient boosting solutions such as XGBoost, LightGBM,
and CatBoost. By the end of the book, you will have gained a
complete overview of the principal machine learning algorithms,
graph analysis techniques, and all the visualization and deployment
instruments that make it easier to present your results to an
audience of both data science experts and business users. What you
will learn Set up your data science toolbox on Windows, Mac, and
Linux Use the core machine learning methods offered by the
scikit-learn library Manipulate, fix, and explore data to solve
data science problems Learn advanced explorative and manipulative
techniques to solve data operations Optimize your machine learning
models for improved performance Explore and cluster graphs, taking
advantage of interconnections and links in your data Who this book
is for If you're a data science entrant, data analyst, or data
engineer, this book will help you get ready to tackle real-world
data science problems without wasting any time. Basic knowledge of
probability/statistics and Python coding experience will assist you
in understanding the concepts covered in this book.
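Among the topics listed is exploring and clustering graphs. A minimal illustration of one such operation, grouping nodes into connected components, can be sketched in Python (the edge list is invented for the example):

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Group nodes of an undirected graph into components via breadth-first search."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            if node in comp:
                continue
            comp.add(node)
            queue.extend(adj[node] - comp)
        seen |= comp
        parts.append(comp)
    return parts

edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(connected_components(edges))  # two clusters: {a, b, c} and {d, e}
```

Libraries such as NetworkX provide the same operation at scale, along with the richer link-analysis techniques the book discusses.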
Explore and understand data with the powerful data visualization
techniques of Tableau, and then communicate insights in powerful
ways Key Features Apply best practices in data visualization and
chart types exploration Explore the latest version of Tableau
Desktop with hands-on examples Understand the fundamentals of
Tableau storytelling Book Description Graphical presentation of data
enables us to easily understand complex data sets. Tableau 10
Complete Reference provides easy-to-follow recipes with several use
cases and real-world business scenarios to get you up and running
with Tableau 10. This Learning Path begins with the history of data
visualization and its importance in today's businesses. You'll also
be introduced to Tableau - how to connect, clean, and analyze data
in this visual analytics software. Then, you'll learn how to apply
what you've learned by creating some simple calculations in Tableau
and using Table Calculations to help drive greater analysis from
your data. Next, you'll explore different advanced chart types in
Tableau. These chart types require you to have some understanding
of the Tableau interface and understand basic calculations. You'll
study dashboard techniques and best practices in detail. A
number of recipes specifically for geospatial visualization,
analytics, and data preparation are also covered. Last but not
least, you'll learn about the power of storytelling through the
creation of interactive dashboards in Tableau. Through this
Learning Path, you will gain confidence and competence to analyze
and communicate data and insights more efficiently and effectively
by creating compelling interactive charts, dashboards, and stories
in Tableau. This Learning Path includes content from the following
Packt products: Learning Tableau 10 - Second Edition by Joshua N.
Milligan Getting Started with Tableau 2018.x by Tristan Guillevin
What you will learn Build effective visualizations, dashboards, and
story points Build basic to more advanced charts with step-by-step
recipes Become familiar with row-level, aggregate, and table
calculations Dig deep into data with clustering and distribution
models Prepare and transform data for analysis Leverage Tableau's
mapping capabilities to visualize data Use data storytelling
techniques to aid decision-making Who this book is for Tableau 10 Complete Reference is designed for anyone who wants
to understand their data better and represent it in an effective
manner. It is also useful for BI professionals and data analysts who
want to do better at their jobs.
Combine advanced analytics including Machine Learning, Deep
Learning Neural Networks, and Natural Language Processing with
modern scalable technologies including Apache Spark to derive
actionable insights from Big Data in real-time Key Features Make a
hands-on start in the fields of Big Data, Distributed Technologies
and Machine Learning Learn how to design, develop and interpret the
results of common Machine Learning algorithms Uncover hidden
patterns in your data in order to derive real actionable insights
and business value Book Description Every person and every
organization in the world manages data, whether they realize it or
not. Data is used to describe the world around us and can be used
for almost any purpose, from analyzing consumer habits to fighting
disease and serious organized crime. Ultimately, we manage data in
order to derive value from it, and many organizations around the
world have traditionally invested in technology to help process
their data faster and more efficiently. But we now live in an
interconnected world driven by mass data creation and consumption
where data is no longer rows and columns restricted to a
spreadsheet, but an organic and evolving asset in its own right.
With this realization comes major challenges for organizations: how
do we manage the sheer size of data being created every second
(think not only spreadsheets and databases, but also social media
posts, images, videos, music, blogs and so on)? And once we can
manage all of this data, how do we derive real value from it? The
focus of Machine Learning with Apache Spark is to help us answer
these questions in a hands-on manner. We introduce the latest
scalable technologies to help us manage and process big data. We
then introduce advanced analytical algorithms applied to real-world
use cases in order to uncover patterns, derive actionable insights,
and learn from this big data. What you will learn Understand how
Spark fits in the context of the big data ecosystem Understand how
to deploy and configure a local development environment using
Apache Spark Understand how to design supervised and unsupervised
learning models Build models to perform NLP, deep learning, and
cognitive services using Spark ML libraries Design real-time
machine learning pipelines in Apache Spark Become familiar with
advanced techniques for processing a large volume of data by
applying machine learning algorithms Who this book is for This book
is aimed at Business Analysts, Data Analysts and Data Scientists
who wish to make a hands-on start in order to take advantage of
modern Big Data technologies combined with Advanced Analytics.
Stay updated with expert techniques for solving data analytics and
machine learning challenges and gain insights from complex projects
and power up your applications Key Features Build independent
machine learning (ML) systems leveraging the best features of R 3.5
Understand and apply different machine learning techniques using
real-world examples Use methods such as multi-class classification,
regression, and clustering Book Description Given the growing popularity of R, the zero-cost statistical programming environment, there has never been a better time to start applying ML to your data. This book will teach you advanced techniques in ML using
the latest code in R 3.5. You will delve into various complex
features of supervised learning, unsupervised learning, and
reinforcement learning algorithms to design efficient and powerful
ML models. This newly updated edition is packed with fresh examples
covering a range of tasks from different domains. Mastering Machine
Learning with R starts by showing you how to quickly manipulate
data and prepare it for analysis. You will explore simple and
complex models and understand how to compare them. You'll also
learn to use the latest library support, such as TensorFlow and
Keras-R, for performing advanced computations. Additionally, you'll
explore complex topics, such as natural language processing (NLP),
time series analysis, and clustering, which will further refine
your skills in developing applications. Each chapter will help you
implement advanced ML algorithms using real-world examples. You'll
even be introduced to reinforcement learning, along with its
various use cases and models. In the concluding chapters, you'll
get a glimpse into how some of these black-box models can be
diagnosed and understood. By the end of this book, you'll be
equipped with the skills to deploy ML techniques in your own
projects or at work. What you will learn Prepare data for machine
learning methods with ease Understand how to write production-ready
code and package it for use Produce simple and effective data
visualizations for improved insights Master advanced methods, such
as Boosted Trees and deep neural networks Use natural language
processing to extract insights in relation to text Implement
tree-based classifiers, including Random Forest and Boosted Tree
Who this book is for: This book is for data science professionals,
machine learning engineers, or anyone who is looking for the ideal
guide to help them implement advanced machine learning algorithms.
The book will help you take your skills to the next level and
advance further in this field. Working knowledge of machine
learning with R is mandatory.
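Boosted trees, named in the learning outcomes above, work by repeatedly fitting a very small tree to the current residuals and adding a shrunken copy of it to the model. A minimal pure-Python sketch of gradient boosting with depth-1 stumps (the toy data, learning rate, and function names are illustrative; the book itself does this with R packages):

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 tree: one threshold with a constant value per side,
    chosen to minimize squared error on the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # splitting above the max is useless
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]  # (threshold, left value, right value)

def boost(xs, ys, rounds=50, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the current
    residuals; the learning rate shrinks each stump's contribution."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + lr * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, lr, stumps

def predict(model, x):
    base, lr, stumps = model
    return base + sum(lr * (lm if x <= t else rm) for t, lm, rm in stumps)

# A step function the ensemble learns round by round.
model = boost([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])
print(round(predict(model, 2), 2), round(predict(model, 5), 2))
```

Each round the residuals shrink by a factor of (1 - lr), which is why many small steps with a low learning rate beat one greedy fit.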
Data Analysis in Criminal Justice and Criminology: History,
Concept, and Application breaks down various data analysis
techniques to help students build their conceptual understanding of
key methods and processes. The information in the text encourages
discussion and consideration of how and why data analysis plays an
important role in the fields of criminal justice and criminology.
The book is divided into three units. Unit 1 discusses how data
analysis is used in criminal justice and criminology, various
methods of data collection, the importance of identifying the
purpose of analysis and key data elements prior to analyzing
information, and graphical representation of data. Unit 2
introduces students to samples, distributions, and the central
limit theorem as it relates to data analysis. This section provides
students with the essential knowledge and skills needed to
understand statistical concepts and calculations. The final unit
explains how to move beyond statistical description to statistical
inference and how sample statistics can be used to estimate
population parameters. Highly accessible in nature, Data Analysis
in Criminal Justice and Criminology is ideal for undergraduate and
graduate courses in criminal justice, criminology, and sociology, especially those with an emphasis on data analysis.
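The central limit theorem introduced in Unit 2 can be demonstrated with a short simulation: means of samples drawn from a skewed population cluster around the population mean, with spread shrinking as 1/sqrt(n). A minimal pure-Python sketch (the exponential population and sample sizes are illustrative choices, not from the book):

```python
import random
import statistics

random.seed(42)

def sample_means(sample_size, n_samples=2000):
    """Draw n_samples samples from a skewed (exponential) population
    and return the mean of each sample."""
    return [
        statistics.mean(random.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

# The exponential population has mean 1.0 and standard deviation 1.0.
# By the CLT, sample means center on 1.0 with spread about 1/sqrt(n).
means = sample_means(sample_size=100)
print(round(statistics.mean(means), 2))   # close to the population mean 1.0
print(round(statistics.stdev(means), 2))  # close to 1/sqrt(100) = 0.1
```

This is the bridge from description to inference the final unit describes: the predictable spread of sample means is what lets a single sample statistic estimate a population parameter.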
Build efficient, high-performance, and scalable systems to process large volumes of data with Apache Ignite Key Features Understand Apache Ignite's in-memory technology Create high-performance app components with Ignite Build a real-time data streaming and complex event processing system Book Description: Apache Ignite is a distributed in-memory platform designed to scale and process large volumes of data. It can be integrated with microservices as well as
monolithic systems, and can be used as a scalable, highly available
and performant deployment platform for microservices. This book
will teach you to use Apache Ignite for building a
high-performance, scalable, highly available system architecture
with data integrity. The book takes you through the basics of
Apache Ignite and in-memory technologies. You will learn about installing and clustering Ignite nodes, caching topologies, and various caching strategies, such as cache-aside, read-through, write-through, and write-behind. Next, you will delve into detailed
aspects of Ignite's data grid: web session clustering and querying
data. You will learn how to process large volumes of data using
compute grid and Ignite's map-reduce and executor service. You will
learn about the memory architecture of Apache Ignite and monitoring
memory and caches. You will use Ignite for complex event
processing, event streaming, and the time-series predictions of
opportunities and threats. Additionally, you will go through
off-heap and on-heap caching, swapping, and native and Spring
framework integration with Apache Ignite. By the end of this book,
you will be confident with all the features of Apache Ignite 2.x
that can be used to build a high-performance system architecture.
What you will learn Use Apache Ignite's data grid and implement web
session clustering Gain high performance and linear scalability
with in-memory distributed data processing Create a microservice on
top of Apache Ignite that can scale and perform Perform
ACID-compliant CRUD operations on an Ignite cache Retrieve data from Apache Ignite's data grid using SQL, scan, and Lucene text queries Explore complex event processing concepts and event streaming
Integrate your Ignite app with the Spring framework Who this book is for: The book is for big data professionals who want to learn the essentials of Apache Ignite. Prior experience in Java is necessary.
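The caching strategies the book covers differ in who talks to the backing store on a miss. A minimal, language-agnostic sketch of the cache-aside pattern in Python (the dict-backed cache and database are stand-ins for illustration, not the Ignite API, which is Java):

```python
class CacheAside:
    """Cache-aside: the application checks the cache first and, on a
    miss, loads from the backing store and populates the cache itself.
    (In read-through/write-through, the cache layer does this instead;
    write-behind defers the store write asynchronously.)"""

    def __init__(self, database):
        self.cache = {}            # stand-in for an Ignite cache
        self.database = database   # stand-in for the system of record

    def get(self, key):
        if key in self.cache:          # cache hit
            return self.cache[key]
        value = self.database[key]     # cache miss: read the store
        self.cache[key] = value        # populate the cache ourselves
        return value

    def put(self, key, value):
        self.database[key] = value     # write the store...
        self.cache.pop(key, None)      # ...and invalidate, not update

store = CacheAside({"user:1": "alice"})
print(store.get("user:1"))  # miss: loaded from the database
print(store.get("user:1"))  # hit: served from the cache
```

Invalidating rather than updating on write is the usual cache-aside choice: it avoids caching values that may never be read again and sidesteps some write races.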
Use PySpark to easily crush messy data at scale and discover proven
techniques to create testable, immutable, and easily parallelizable
Spark jobs Key Features Work with large amounts of agile data using
distributed datasets and in-memory caching Source data from all
popular data hosting platforms, such as HDFS, Hive, JSON, and S3
Employ the easy-to-use PySpark API to deploy big data analytics for production Book Description: Apache Spark is an open source
parallel-processing framework that has been around for quite some
time now. One of the many uses of Apache Spark is for data
analytics applications across clustered computers. In this book,
you will not only learn how to use Spark and the Python API to
create high-performance analytics with big data, but also discover techniques for making Spark jobs testable, immutable, and easily parallelizable.
You will learn how to source data from all popular data hosting
platforms, including HDFS, Hive, JSON, and S3, and deal with large
datasets with PySpark to gain practical big data experience. This
book will help you work on prototypes on local machines and
subsequently go on to handle messy data in production and at scale.
This book covers installing and setting up PySpark, RDD operations,
big data cleaning and wrangling, and aggregating and summarizing
data into useful reports. You will also learn how to implement some
practical and proven techniques to improve certain aspects of
programming and administration in Apache Spark. By the end of the
book, you will be able to build big data analytical solutions using
the various PySpark offerings and also optimize them effectively.
What you will learn Get practical big data experience while working
on messy datasets Analyze patterns with Spark SQL to improve your
business intelligence Use PySpark's interactive shell to speed up
development time Create highly concurrent Spark programs by
leveraging immutability Discover ways to avoid the most expensive
operation in the Spark API: the shuffle operation Re-design your
jobs to use reduceByKey instead of groupBy Create robust processing
pipelines by testing Apache Spark jobs Who this book is for: This book is for developers, data scientists, business analysts, or
anyone who needs to reliably analyze large amounts of large-scale,
real-world data. Whether you're tasked with creating your company's
business intelligence function or creating great data platforms for
your machine learning models, or are looking to use code to magnify
the impact of your business, this book is for you.
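The reduceByKey-over-groupBy advice above comes down to where aggregation happens: reduceByKey combines values on each partition before the shuffle, so far fewer records cross the network. A minimal pure-Python sketch of that map-side combine (the two "partitions" are invented data; the real operations are PySpark RDD methods):

```python
from collections import defaultdict

# Two "partitions" of (key, value) pairs, as Spark would hold them.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5)],
]

def shuffle_size_groupby(parts):
    """groupByKey ships every record across the shuffle."""
    return sum(len(p) for p in parts)

def shuffle_size_reduceby(parts):
    """reduceByKey pre-aggregates per partition (map-side combine),
    so at most one record per key per partition is shuffled."""
    shuffled = 0
    for p in parts:
        combined = defaultdict(int)
        for k, v in p:
            combined[k] += v      # local combine before the shuffle
        shuffled += len(combined)
    return shuffled

print(shuffle_size_groupby(partitions))   # 5 records shuffled
print(shuffle_size_reduceby(partitions))  # 4 records shuffled
```

On this tiny example the saving is one record; on real data with many repeats per key per partition, the reduction is what makes the shuffle, the most expensive Spark operation, tractable.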
Learn exploratory data analysis concepts using powerful R packages
to enhance your R data analysis skills Key Features Speed up your
data analysis projects using powerful R packages and techniques
Create multiple hands-on data analysis projects using real-world
data Discover and practice graphical exploratory analysis
techniques across domains Book Description: Hands-On Exploratory Data
Analysis with R will help you build not just a foundation but also
expertise in the elementary ways to analyze data. You will learn
how to understand your data and summarize its main characteristics.
You'll also uncover the structure of your data, and you'll learn
graphical and numerical techniques using the R language. This book covers the entire exploratory data analysis (EDA) process: data collection, generating statistics, examining distributions, and invalidating hypotheses. As you progress through the book, you will learn
how to set up a data analysis environment with tools such as ggplot2, knitr, and R Markdown, and work with examples such as the DOE scatter plot and the SML2010 dataset for multifactor, optimization, and regression data problems. By the end of this book, you will be able to successfully
carry out a preliminary investigation on any dataset, identify
hidden insights, and present your results in a business context.
What you will learn Learn powerful R techniques to speed up your
data analysis projects Import, clean, and explore data using
powerful R packages Practice graphical exploratory analysis
techniques Create informative data analysis reports using ggplot2
Identify and clean missing and erroneous data Explore data analysis
techniques to analyze multi-factor datasets Who this book is for: Hands-On Exploratory Data Analysis with R is for data
enthusiasts who want to build a strong foundation for data
analysis. If you are a data analyst, data engineer, software
engineer, or product manager, this book will sharpen your skills in
the complete workflow of exploratory data analysis.
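Summarizing a dataset's main characteristics, the first EDA step described above, usually starts with location, spread, and quartiles. A minimal pure-Python sketch of such a numeric summary (the sample data is invented for illustration; the book itself works in R):

```python
import statistics

def summarize(values):
    """Return the basic numbers an EDA pass starts with."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values), "q1": q1, "median": median, "q3": q3,
        "max": max(values),
    }

# Invented sample: daily counts with one suspicious outlier to spot.
data = [12, 15, 14, 13, 90, 16, 14, 15, 13, 14]
summary = summarize(data)
print(summary["median"])  # 14.0: robust to the outlier
print(summary["mean"])    # 21.6: pulled up by it
```

Comparing mean against median like this is a classic first check for skew or erroneous values, the "identify and clean missing and erroneous data" outcome the blurb lists.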
Make sense of your data and predict the unpredictable About This Book * A unique book that centers around the six key practical skills needed to develop and implement predictive analytics * Apply
the principles and techniques of predictive analytics to
effectively interpret big data * Solve real-world analytical
problems with the help of practical case studies and real-world
scenarios taken from the world of healthcare, marketing, and other
business domains Who This Book Is For This book is for those with a
mathematical/statistics background who wish to understand the
concepts, techniques, and implementation of predictive analytics to
resolve complex analytical issues. Basic familiarity with the R programming language is expected. What You Will Learn * Master the core predictive analytics algorithms that are used today in business * Learn to implement the six steps for a successful
analytics project * Choose the right algorithm for your
requirements * Use and apply predictive analytics to research
problems in healthcare * Implement predictive analytics to retain
and acquire your customers * Use text mining to understand
unstructured data * Develop models on your own PC or in
Spark/Hadoop environments * Implement predictive analytics products
for customers In Detail This is the go-to book for anyone
interested in the steps needed to develop predictive analytics
solutions with examples from the world of marketing, healthcare,
and retail. We'll get started with a brief history of predictive
analytics and learn about different roles and functions people play
within a predictive analytics project. Then, we will learn about
various ways of installing R along with their pros and cons,
combined with a step-by-step installation of RStudio, and a
description of the best practices for organizing your projects. On
completing the installation, we will begin to acquire the skills
necessary to input, clean, and prepare your data for modeling. We
will learn the six specific steps needed to implement and
successfully deploy a predictive model starting from asking the
right questions through model development and ending with deploying
your predictive model into production. We will learn why
collaboration is important and how agile iterative modeling cycles
can increase your chances of developing and deploying a successful model. We will continue your journey in the cloud by
extending your skill set by learning about Databricks and SparkR,
which allow you to develop predictive models on vast gigabytes of
data. Style and Approach This book takes a practical hands-on
approach wherein the algorithms will be explained with the help of
real-world use cases. It is written in a well-researched academic
style which is a great mix of theoretical and practical
information. Code examples are supplied for both theoretical
concepts as well as for the case studies. Key references and
summaries will be provided at the end of each chapter so that you can explore those topics on your own.