Manage and Automate Data Analysis with Pandas in Python

Today, analysts must manage data characterized by extraordinary
variety, velocity, and volume. Using the open source Pandas library,
you can use Python to rapidly automate and perform virtually any data
analysis task, no matter how large or complex. Pandas can help you
ensure the veracity of your data, visualize it for effective
decision-making, and reliably reproduce analyses across multiple
data sets.

Pandas for Everyone, 2nd Edition, brings together
practical knowledge and insight for solving real problems with
Pandas, even if you're new to Python data analysis. Daniel Y. Chen
introduces key concepts through simple but practical examples,
incrementally building on them to solve more difficult, real-world
data science problems such as using regularization to prevent
overfitting, or deciding when to use unsupervised machine learning
methods to find the underlying structure in a data set.

New features in the second edition include:

Extended coverage of plotting and the seaborn data visualization library
Expanded examples and resources
Updated Python 3.9 code and packages coverage, including statsmodels and scikit-learn libraries
Online bonus material on geopandas, Dask, and creating interactive graphics with Altair

Chen
gives you a jumpstart on using Pandas with a realistic data set and
covers combining data sets, handling missing data, and structuring
data sets for easier analysis and visualization. He demonstrates
powerful data cleaning techniques, from basic string manipulation
to applying functions simultaneously across dataframes. Once your
data is ready, Chen guides you through fitting models for
prediction, clustering, inference, and exploration. He provides
tips on performance and scalability and introduces you to the wider
Python data analysis ecosystem.

Work with DataFrames and Series, and import or export data
Create plots with matplotlib, seaborn, and pandas
Combine data sets and handle missing data
Reshape, tidy, and clean data sets so they're easier to work with
Convert data types and manipulate text strings
Apply functions to scale data manipulations
Aggregate, transform, and filter large data sets with groupby
Leverage Pandas' advanced date and time capabilities
Fit linear models using statsmodels and scikit-learn libraries
Use generalized linear modeling to fit models with different response variables
Compare multiple models to select the "best" one
Regularize to overcome overfitting and improve performance
Use clustering in unsupervised machine learning