Gain hands-on experience of Python programming with industry-standard machine learning techniques using pandas, scikit-learn, and XGBoost

Key Features:
- Think critically about data and use it to form and test a hypothesis
- Choose an appropriate machine learning model and train it on your data
- Communicate data-driven insights with confidence and clarity

Book Description:
If data is the new oil, then machine learning is the drill. As companies gain access to ever-increasing quantities of raw data, the ability to deliver state-of-the-art predictive models that support business decision-making becomes more and more valuable. In this book, you'll work on an end-to-end project based around a realistic data set and split up into bite-sized practical exercises. This creates a case-study approach that simulates the working conditions you'll experience in real-world data science projects. You'll learn how to use key Python packages, including pandas, Matplotlib, and scikit-learn, and master the process of data exploration and data processing, before moving on to fitting, evaluating, and tuning algorithms such as regularized logistic regression and random forest. Now in its second edition, this book will take you through the end-to-end process of exploring data and delivering machine learning models. Updated for 2021, this edition includes brand new content on XGBoost, SHAP values, algorithmic fairness, and the ethical concerns of deploying a model in the real world. By the end of this data science book, you'll have the skills, understanding, and confidence to build your own machine learning models and gain insights from real data.

What you will learn:
- Load, explore, and process data using the pandas Python package
- Use Matplotlib to create compelling data visualizations
- Implement predictive machine learning models with scikit-learn
- Use lasso and ridge regression to reduce model overfitting
- Evaluate random forest and logistic regression model performance
- Deliver business insights by presenting clear, convincing conclusions

Who this book is for:
Data Science Projects with Python - Second Edition is for anyone who wants to get started with data science and machine learning. If you're keen to advance your career by using data analysis and predictive modeling to generate business insights, then this book is the perfect place to begin. To quickly grasp the concepts covered, it is recommended that you have basic experience of programming with Python or another similar language, and a general interest in statistics.

A practical guide to implementing a scalable and fast state-of-the-art analytical data estate

Key Features:
- Store and analyze data with enterprise-grade security and auditing
- Perform batch, streaming, and interactive analytics to optimize your big data solutions with ease
- Develop and run parallel data processing programs using real-world enterprise scenarios

Book Description:
Azure Data Lake, the modern data warehouse architecture, and related data services on Azure enable organizations to build their own customized analytical platform to fit any analytical requirements in terms of volume, speed, and quality. This book is your guide to learning all the features and capabilities of Azure data services for storing, processing, and analyzing data (structured, unstructured, and semi-structured) of any size. You will explore key techniques for ingesting and storing data and perform batch, streaming, and interactive analytics. The book also shows you how to overcome various challenges and complexities relating to productivity and scaling. Next, you will be able to develop and run massive data workloads to perform different actions. Using a cloud-based setup that combines big data, a modern data warehouse, and analytics, you will also be able to build secure, scalable data estates for enterprises. Finally, you will not only learn how to develop a data warehouse but also understand how to create big data programs with enterprise-grade security and auditing. By the end of this Azure book, you will have learned how to develop a powerful and efficient analytical platform to meet enterprise needs.

What you will learn:
- Implement data governance with Azure services
- Use integrated monitoring in the Azure Portal and integrate Azure Data Lake Storage into Azure Monitor
- Explore the serverless feature for ad-hoc data discovery, logical data warehousing, and data wrangling
- Implement networking with Synapse Analytics and Spark pools
- Create and run Spark jobs with Databricks clusters
- Implement streaming using Azure Functions, a serverless runtime environment on Azure
- Explore the predefined ML services in Azure and use them in your app

Who this book is for:
This book is for data architects, ETL developers, or anyone who wants to get well-versed with Azure data services to implement an analytical data estate for their enterprise. The book will also appeal to data scientists and data analysts who want to explore all the capabilities of Azure data services, which can be used to store, process, and analyze any kind of data. A beginner-level understanding of data analysis and streaming will be required.

Leverage the Azure analytics platform's key analytics services to deliver unmatched intelligence for your data

Key Features:
- Learn to ingest, prepare, manage, and serve data for immediate business requirements
- Bring enterprise data warehousing and big data analytics together to gain insights from your data
- Develop end-to-end analytics solutions using Azure Synapse

Book Description:
Azure Synapse Analytics, which Microsoft describes as the next evolution of Azure SQL Data Warehouse, is a limitless analytics service that brings enterprise data warehousing and big data analytics together. With this book, you'll learn how to discover insights from your data effectively using this platform. The book starts with an overview of Azure Synapse Analytics, its architecture, and how it can be used to improve business intelligence and machine learning capabilities. Next, you'll go on to choose and set up the correct environment for your business problem. You'll also learn a variety of ways to ingest data from various sources and orchestrate the data using transformation techniques offered by Azure Synapse. Later, you'll explore how to handle both relational and non-relational data using the SQL language. As you progress, you'll perform real-time streaming and execute data analysis operations on your data using various languages, before going on to apply ML techniques to derive accurate and granular insights from data. Finally, you'll discover how to protect sensitive data in real time by using security and privacy features. By the end of this Azure book, you'll be able to build end-to-end analytics solutions while focusing on data prep, data management, data warehousing, and AI tasks.

What you will learn:
- Explore the necessary considerations for data ingestion and orchestration while building analytical pipelines
- Understand pipelines and activities in Synapse pipelines and use them to construct end-to-end data-driven workflows
- Query data using various coding languages on Azure Synapse
- Focus on Synapse SQL and Synapse Spark
- Manage and monitor resource utilization and query activity in Azure Synapse
- Connect Power BI workspaces with Azure Synapse and create or modify reports directly from Synapse Studio
- Create and manage IP firewall rules in Azure Synapse

Who this book is for:
This book is for data architects, data scientists, data engineers, and business analysts who are looking to get up and running with the Azure Synapse Analytics platform. Basic knowledge of data warehousing will be beneficial to help you understand the concepts covered in this book more effectively.

Information society, information as a competitive factor, information overload: these keywords illustrate the importance of information to business and society. Yet it is not information alone that deserves attention, but also the systems that process, store, and transmit it, and the technologies on which those systems are based. The task of information management is to ensure the best possible use of information as a resource with regard to the company's goals. It is one of the essential components of modern corporate management. This textbook teaches the fundamentals of information management in 13 units. Alongside the management tasks relating to the information economy, information systems, and information technologies, it also covers selected leadership tasks of information management. Each unit begins with an overview of the topics covered and ends with a summary and review exercises. The book is therefore aimed in particular at bachelor's students of business informatics, business administration, and computer science.

Quickly build and deploy massive data pipelines and improve productivity using Azure Databricks

Key Features:
- Get to grips with the distributed training and deployment of machine learning and deep learning models
- Learn how ETLs are integrated with Azure Data Factory and Delta Lake
- Explore deep learning and machine learning models in a distributed computing infrastructure

Book Description:
Microsoft Azure Databricks helps you to harness the power of distributed computing and apply it to create robust data pipelines, along with training and deploying machine learning and deep learning models. Databricks' advanced features enable developers to process, transform, and explore data. Distributed Data Systems with Azure Databricks will help you to put your knowledge of Databricks to work to create big data pipelines. The book provides a hands-on approach to implementing Azure Databricks and its associated methodologies that will make you productive in no time. Complete with detailed explanations of essential concepts, practical examples, and self-assessment questions, you'll begin with a quick introduction to Databricks core functionalities, before performing distributed model training and inference using TensorFlow and Spark MLlib. As you advance, you'll explore MLflow Model Serving on Azure Databricks and implement distributed training pipelines using HorovodRunner in Databricks. Finally, you'll discover how to transform, use, and obtain insights from massive amounts of data to train predictive models and create entire fully working data pipelines. By the end of this MS Azure book, you'll have gained a solid understanding of how to work with Databricks to create and manage an entire big data pipeline.

What you will learn:
- Create ETLs for big data in Azure Databricks
- Train, manage, and deploy machine learning and deep learning models
- Integrate Databricks with Azure Data Factory for extract, transform, load (ETL) pipeline creation
- Discover how to use Horovod for distributed deep learning
- Find out how to use Delta Engine to query and process data from Delta Lake
- Understand how to use Data Factory in combination with Databricks
- Use Structured Streaming in a production-like environment

Who this book is for:
This book is for software engineers, machine learning engineers, data scientists, and data engineers who are new to Azure Databricks and want to build high-quality data pipelines without worrying about infrastructure. Knowledge of Azure Databricks basics is required to learn the concepts covered in this book more effectively. A basic understanding of machine learning concepts and beginner-level Python programming knowledge is also recommended.

People have described nature since the beginning of human history. They do so for various purposes, including communicating about economic, social, governmental, meteorological, sustainability-related, strategic, military, and survival issues, as well as for artistic expression. As a part of the whole world of living beings, we use various types of senses, known and unknown, labeled and not identified, to both communicate and create. Describing Nature Through Visual Data is a collection of impactful research that discusses issues related to the visualization of scientific concepts, picturing processes, and products, as well as the role of computing in advancing visual literacy skills. Organized into four sections, the book contains descriptions, theories, and examples of visual and music-based solutions concerning the selected natural or technological events that are shaping present-day reality. The chapters pertain to selected scientific fields, digital art, computer graphics, and new media, and consider the possible ways that visuals, visualization, simulation, and interactive knowledge presentation can help us to understand and share the content of scientific thought, research, artistic works, and practice. Featuring coverage on topics that include mathematical thinking, music theory, and visual communication, this reference is ideal for instructors, professionals, researchers, and students keen on comprehending and enhancing the role of knowledge visualization in computing, sciences, design, media communication, film, advertising, and marketing.

Get to grips with pandas by working with real datasets and master data discovery, data manipulation, data preparation, and handling data for analytical tasks

Key Features:
- Perform efficient data analysis and manipulation tasks using pandas 1.x
- Apply pandas to different real-world domains with the help of step-by-step examples
- Make the most of pandas as an effective data exploration tool

Book Description:
Extracting valuable business insights is no longer a 'nice-to-have', but an essential skill for anyone who handles data in their enterprise. Hands-On Data Analysis with Pandas is here to help beginners and those who are migrating their skills into data science get up to speed in no time. This book will show you how to analyze your data, get started with machine learning, and work effectively with the Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn. Using real-world datasets, you will learn how to use the pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will learn how to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns. In the concluding chapters, you will explore some applications of anomaly detection, regression, clustering, and classification using scikit-learn to make predictions based on past data. This updated edition will equip you with the skills you need to use pandas 1.x to efficiently perform various data manipulation tasks, reliably reproduce analyses, and visualize your data for effective decision making - valuable knowledge that can be applied across multiple domains.

What you will learn:
- Understand how data analysts and scientists gather and analyze data
- Perform data analysis and data wrangling using Python
- Combine, group, and aggregate data from multiple sources
- Create data visualizations with pandas, matplotlib, and seaborn
- Apply machine learning algorithms to identify patterns and make predictions
- Use Python data science libraries to analyze real-world datasets
- Solve common data representation and analysis problems using pandas
- Build Python scripts, modules, and packages for reusable analysis code

Who this book is for:
This book is for data science beginners, data analysts, and Python developers who want to explore each stage of data analysis and scientific computing using a wide range of datasets. Data scientists looking to implement pandas in their machine learning workflow will also find plenty of valuable know-how as they progress. You'll find it easier to follow along with this book if you have a working knowledge of the Python programming language, but a Python crash-course tutorial is provided in the code bundle for anyone who needs a refresher.

This book offers a historically oriented introduction to elementary number theory. It provides a solid foundation that can be extended in many directions. Its particular aims are an elementary and intuitive presentation, attention to historical development, and the motivation of concepts and methods through concrete, meaningful examples that draw on modern tools (computer algebra systems, the Internet). As supplementary media, computer- and Internet-based interaction and visualization tools are provided free of charge. The work is aimed at students (bachelor's and teacher-training programmes), teachers, and all readers interested in elementary mathematics.

Think about your data intelligently and ask the right questions

Key Features:
- Master data cleaning techniques necessary to perform real-world data science and machine learning tasks
- Spot common problems with dirty data and develop flexible solutions from first principles
- Test and refine your newly acquired skills through detailed exercises at the end of each chapter

Book Description:
Data cleaning is the all-important first step to successful data science, data analysis, and machine learning. If you work with any kind of data, this book is your go-to resource, arming you with the insights and heuristics experienced data scientists had to learn the hard way. In a light-hearted and engaging exploration of different tools, techniques, and datasets real and fictitious, Python veteran David Mertz teaches you the ins and outs of data preparation and the essential questions you should be asking of every piece of data you work with. Using a mixture of Python, R, and common command-line tools, Cleaning Data for Effective Data Science follows the data cleaning pipeline from start to end, focusing on helping you understand the principles underlying each step of the process. You'll look at data ingestion of a vast range of tabular, hierarchical, and other data formats, impute missing values, detect unreliable data and statistical anomalies, and generate synthetic features. The long-form exercises at the end of each chapter let you get hands-on with the skills you've acquired along the way, also providing a valuable resource for academic courses.

What you will learn:
- Ingest and work with common data formats like JSON, CSV, SQL and NoSQL databases, PDF, and binary serialized data structures
- Understand how and why we use tools such as pandas, SciPy, scikit-learn, Tidyverse, and Bash
- Apply useful rules and heuristics for assessing data quality and detecting bias, like Benford's law and the 68-95-99.7 rule
- Identify and handle unreliable data and outliers, examining z-score and other statistical properties
- Impute sensible values into missing data and use sampling to fix imbalances
- Use dimensionality reduction, quantization, one-hot encoding, and other feature engineering techniques to draw out patterns in your data
- Work carefully with time series data, performing de-trending and interpolation

Who this book is for:
This book is designed to benefit software developers, data scientists, aspiring data scientists, teachers, and students who work with data. If you want to improve your rigor in data hygiene or are looking for a refresher, this book is for you. Basic familiarity with statistics, general concepts in machine learning, knowledge of a programming language (Python or R), and some exposure to data science are helpful.

Get to grips with solving real-world NLP problems, such as dependency parsing, information extraction, topic modeling, and text data visualization

Key Features:
- Analyze varying complexities of text using popular Python packages such as NLTK, spaCy, sklearn, and gensim
- Implement common and not-so-common linguistic processing tasks using Python libraries
- Overcome the common challenges faced while implementing NLP pipelines

Book Description:
Python is the most widely used language for natural language processing (NLP) thanks to its extensive tools and libraries for analyzing text and extracting computer-usable data. This book will take you through a range of techniques for text processing, from basics such as parsing the parts of speech to complex topics such as topic modeling, text classification, and visualization. Starting with an overview of NLP, the book presents recipes for dividing text into sentences, stemming and lemmatization, removing stopwords, and parts of speech tagging to help you to prepare your data. You'll then learn ways of extracting and representing grammatical information, such as dependency parsing and anaphora resolution, discover different ways of representing the semantics using bag-of-words, TF-IDF, word embeddings, and BERT, and develop skills for text classification using keywords, SVMs, LSTMs, and other techniques. As you advance, you'll also see how to extract information from text, implement unsupervised and supervised techniques for topic modeling, and perform topic modeling of short texts, such as tweets. Additionally, the book shows you how to develop chatbots using NLTK and Rasa and visualize text data. By the end of this NLP book, you'll have developed the skills to use a powerful set of tools for text processing.

What you will learn:
- Become well-versed with basic and advanced NLP techniques in Python
- Represent grammatical information in text using spaCy, and semantic information using bag-of-words, TF-IDF, and word embeddings
- Perform text classification using different methods, including SVMs and LSTMs
- Explore different techniques for topic modeling such as K-means, LDA, NMF, and BERT
- Work with visualization techniques such as NER and word clouds for different NLP tools
- Build a basic chatbot using NLTK and Rasa
- Extract information from text using regular expression techniques and statistical and deep learning tools

Who this book is for:
This book is for data scientists and professionals who want to learn how to work with text. Intermediate knowledge of Python will help you to make the most out of this book. If you are an NLP practitioner, this book will serve as a code reference when working on your projects.

Get to grips with building reliable, scalable, and maintainable database solutions for enterprises and production databases

Key Features:
- Implement PostgreSQL 13 features to perform end-to-end modern database management
- Design, manage, and build enterprise database solutions using a unique recipe-based approach
- Solve common and not-so-common challenges faced while working to achieve optimal database performance

Book Description:
PostgreSQL has become the most advanced open source database on the market. This book follows a step-by-step approach, guiding you effectively in deploying PostgreSQL in production environments. The book starts with an introduction to PostgreSQL and its architecture. You'll cover common and not-so-common challenges faced while designing and managing the database. Next, the book focuses on backup and recovery strategies to ensure your database is steady and achieves optimal performance. Throughout the book, you'll address key challenges such as maintaining reliability, data integrity, a fault-tolerant environment, a robust feature set, extensibility, consistency, and authentication. Moving ahead, you'll learn how to manage a PostgreSQL cluster and explore replication features for high availability. Later chapters will assist you in building a secure PostgreSQL server, along with covering recipes for encrypting data in motion and data at rest. Finally, you'll not only discover how to tune your database for optimal performance but also understand ways to monitor and manage maintenance activities, before learning how to perform PostgreSQL upgrades during downtime. By the end of this book, you'll be well-versed with the essential PostgreSQL 13 features to build enterprise relational databases.

What you will learn:
- Understand logical and physical backups in Postgres
- Demonstrate the different types of replication methods possible with PostgreSQL today
- Set up a high availability cluster that provides seamless automatic failover for applications
- Secure a PostgreSQL server through encryption, authentication, authorization, and auditing
- Analyze the live and historic activity of a PostgreSQL server
- Understand how to monitor critical services in Postgres 13
- Manage maintenance activities and performance tuning of a PostgreSQL cluster

Who this book is for:
This PostgreSQL book is for database architects, database developers and administrators, or anyone who wants to become well-versed with PostgreSQL 13 features to plan, manage, and design efficient database solutions. Prior experience with the PostgreSQL database and SQL language is expected.

Get to grips with automated machine learning and adopt a hands-on approach to AutoML implementation and associated methodologies

Key Features:
- Get up to speed with AutoML using OSS, Azure, AWS, GCP, or any platform of your choice
- Eliminate mundane tasks in data engineering and reduce human errors in machine learning models
- Find out how you can make machine learning accessible for all users to promote decentralized processes

Book Description:
Every machine learning engineer deals with systems that have hyperparameters, and the most basic task in automated machine learning (AutoML) is to automatically set these hyperparameters to optimize performance. The latest deep neural networks have a wide range of hyperparameters for their architecture, regularization, and optimization, which can be customized effectively to save time and effort. This book reviews the underlying techniques of automated feature engineering, model and hyperparameter tuning, gradient-based approaches, and much more. You'll discover different ways of implementing these techniques in open source tools and then learn to use enterprise tools for implementing AutoML in three major cloud service providers: Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform. As you progress, you'll explore the features of cloud AutoML platforms by building machine learning models using AutoML. The book will also show you how to develop accurate models by automating time-consuming and repetitive tasks in the machine learning development lifecycle. By the end of this machine learning book, you'll be able to build and deploy AutoML models that are not only accurate, but also increase productivity, allow interoperability, and minimize feature engineering tasks.

What you will learn:
- Explore AutoML fundamentals, underlying methods, and techniques
- Assess AutoML aspects such as algorithm selection, auto featurization, and hyperparameter tuning in an applied scenario
- Find out the difference between cloud and open source software (OSS)
- Implement AutoML in enterprise cloud to deploy ML models and pipelines
- Build explainable AutoML pipelines with transparency
- Understand automated feature engineering and time series forecasting
- Automate data science modeling tasks to implement ML solutions easily and focus on more complex problems

Who this book is for:
Citizen data scientists, machine learning developers, artificial intelligence enthusiasts, or anyone looking to automatically build machine learning models using the features offered by open source tools, Microsoft Azure Machine Learning, AWS, and Google Cloud Platform will find this book useful. Beginner-level knowledge of building ML models is required to get the best out of this book. Prior experience in using enterprise cloud is beneficial.