Text is everywhere, and it is a fantastic resource for social scientists. However, because it is so abundant, and because language is so variable, it is often difficult to extract the information we want. There is a whole subfield of AI concerned with text analysis (natural language processing). Many of the basic analysis methods developed are now readily available as Python implementations. This Element will teach you when to use which method, the mathematical background of how it works, and the Python code to implement it.
Hybrid product-service bundles (Hybride Leistungsbündel, HLB) serve to establish an innovative, benefit-oriented understanding of products that combine goods and services. This integrated view of the goods and service components makes highly complex plants considerably easier to market. The volume provides an overview of the concept and presents methods and tools for developing goods and services. The authors consider the entire cycle: from planning and development through to delivery and use.
Images play a crucial role in shaping and reflecting political life. Digitization has vastly increased the presence of such images in daily life, creating valuable new research opportunities for social scientists. We show how recent innovations in computer vision methods can substantially lower the costs of using images as data. We introduce readers to the deep learning algorithms commonly used for object recognition, facial recognition, and visual sentiment analysis. We then provide guidance and specific instructions for scholars interested in using these methods in their own research.
Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incrementally faster. This book explains how the confluence of these pivotal technologies gives you enormous power, and cheaply, when it comes to huge datasets. You will begin by learning how cloud infrastructure makes it possible to scale your code to large amounts of processing units, without having to pay for the machinery in advance. From there you will learn how Apache Spark, an open source framework, can enable all those CPUs for data analytics use. Finally, you will see how services such as Databricks provide the power of Apache Spark, without you having to know anything about configuring hardware or software. By removing the need for expensive experts and hardware, your resources can instead be allocated to actually finding business value in the data. This book guides you through some advanced topics such as analytics in the cloud, data lakes, data ingestion, architecture, machine learning, and tools, including Apache Spark, Apache Hadoop, Apache Hive, Python, and SQL. Valuable exercises help reinforce what you have learned. 
What You Will Learn
- Discover the value of big data analytics that leverage the power of the cloud
- Get started with Databricks using SQL and Python in either Microsoft Azure or AWS
- Understand the underlying technology, and how the cloud and Apache Spark fit into the bigger picture
- See how these tools are used in the real world
- Run basic analytics, including machine learning, on billions of rows at a fraction of the cost, or for free
Who This Book Is For
Data engineers, data scientists, and cloud architects who want or need to run advanced analytics in the cloud. It is assumed that the reader has data experience, but perhaps minimal exposure to Apache Spark and Azure Databricks. The book is also recommended for people who want to get started in the analytics field, as it provides a strong foundation.
Focus on the most important and most often overlooked factor in a successful Tableau project: data. Without a reliable data source, you will not achieve the results you hope for in Tableau. This book does more than teach the mechanics of data preparation. It teaches you how to look at data in a new way, how to recognize the most common issues that hinder analytics, and how to mitigate those factors one by one. Tableau can change the course of business, but the old adage of "garbage in, garbage out" is the hard truth that hides behind every Tableau sales pitch. That amazing sales demo does not work as well with bad data. The unfortunate reality is that almost all data starts out in a less-than-perfect state. Data prep is hard. Traditionally, we were forced into the world of the database, where complex ETL (Extract, Transform, Load) operations created by the data team did all the heavy lifting for us. Fortunately, we have moved past those days. With the introduction of the Tableau Data Prep tool, you can now handle most of the common data prep and cleanup tasks on your own, at your desk, and without the help of the data team.
This essential book will guide you through:
- The layout and important parts of the Tableau Data Prep tool
- Connecting to data
- Data quality and consistency
- The shape of the data. Is the data oriented in columns or rows? How to decide? Why does it matter?
- What is the level of detail in the source data? Why is that important?
- Combining source data to bring in more fields and rows
- Saving the data flow and the results of our data prep work
- Common cleanup and setup tasks in Tableau Desktop
What You Will Learn
- Recognize data sources that are good candidates for analytics in Tableau
- Connect to local, server, and cloud-based data sources
- Profile data to better understand its content and structure
- Rename fields, adjust data types, group data points, and aggregate numeric data
- Pivot data
- Join data from local, server, and cloud-based sources for unified analytics
- Review the steps and results of each phase of the Data Prep process
- Output new data sources that can be reviewed in Tableau or any other analytics tool
Who This Book Is For
Tableau Desktop users who want to connect to data, profile the data to identify common issues, clean up those issues, join to additional data sources, and save the newly cleaned, joined data so that it can be used more effectively in Tableau.
This book presents fundamental new techniques for understanding and processing geospatial data. These "spatial gems" articulate and highlight insightful ideas that often remain unstated in graduate textbooks, and which are not the focus of research papers. They teach us how to do something useful with spatial data, in the form of algorithms, code, or equations. Unlike a research paper, Spatial Gems, Volume 1 does not focus on "Look what we have done!" but rather shows "Look what YOU can do!" With contributions from researchers at the forefront of the field, this volume occupies a unique position in the literature by serving graduate students, professional researchers, professors, and computer developers in the field alike.
This book constitutes the refereed joint proceedings of the 4th International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, LABELS 2019, the First International Workshop on Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention, HAL-MICCAI 2019, and the Second International Workshop on Correction of Brainshift with Intra-Operative Ultrasound, CuRIOUS 2019, held in conjunction with the 22nd International Conference on Medical Imaging and Computer-Assisted Intervention, MICCAI 2019, in Shenzhen, China, in October 2019. The 8 papers presented at LABELS 2019, the 5 papers presented at HAL-MICCAI 2019, and the 3 papers presented at CuRIOUS 2019 were carefully reviewed and selected from numerous submissions. The LABELS papers present a variety of approaches for dealing with a limited number of labels, from semi-supervised learning to crowdsourcing. The HAL-MICCAI papers cover a wide set of hardware applications in medical problems, including medical image segmentation, electron tomography, pneumonia detection, etc. The CuRIOUS papers provide a snapshot of the current progress in the field through extended discussions and provide researchers an opportunity to characterize their image registration methods on newly released standardized datasets of iUS-guided brain tumor resection.
This book constitutes the proceedings of the 7th International Conference on Analysis of Images, Social Networks and Texts, AIST 2018, held in Moscow, Russia, in July 2018. The 29 full papers were carefully reviewed and selected from 107 submissions (of which 26 papers were rejected without being reviewed). The papers are organized in topical sections on natural language processing; analysis of images and video; general topics of data analysis; analysis of dynamic behavior through event data; optimization problems on graphs and network structures; and innovative systems.
Understand the complexities of modern-day data engineering platforms and explore strategies to deal with them with the help of use case scenarios led by an industry expert in big data.
Key Features
- Become well-versed with the core concepts of Apache Spark and Delta Lake for building data platforms
- Learn how to ingest, process, and analyze data that can be later used for training machine learning models
- Understand how to operationalize data models in production using curated data
Book Description
In the world of ever-changing data and schemas, it is important to build data pipelines that can auto-adjust to changes. This book will help you build scalable data platforms that managers, data scientists, and data analysts can rely on. Starting with an introduction to data engineering, along with its key concepts and architectures, this book will show you how to use Microsoft Azure Cloud services effectively for data engineering. You'll cover data lake design patterns and the different stages through which the data needs to flow in a typical data lake. Once you've explored the main features of Delta Lake to build data lakes with fast performance and governance in mind, you'll advance to implementing the lambda architecture using Delta Lake. Packed with practical examples and code snippets, this book takes you through real-world examples based on production scenarios faced by the author in his 10 years of experience working with big data. Finally, you'll cover data lake deployment strategies that play an important role in provisioning the cloud resources and deploying the data pipelines in a repeatable and continuous way. By the end of this data engineering book, you'll know how to effectively deal with ever-changing data and create scalable data pipelines to streamline data science, ML, and artificial intelligence (AI) tasks.
What you will learn
- Discover the challenges you may face in the data engineering world
- Add ACID transactions to Apache Spark using Delta Lake
- Understand effective design strategies to build enterprise-grade data lakes
- Explore architectural and design patterns for building efficient data ingestion pipelines
- Orchestrate a data pipeline for preprocessing data using Apache Spark and Delta Lake APIs
- Automate deployment and monitoring of data pipelines in production
- Get to grips with securing, monitoring, and managing data pipelines and models efficiently
Who this book is for
This book is for aspiring data engineers and data analysts who are new to the world of data engineering and are looking for a practical guide to building scalable data platforms. If you already work with PySpark and want to use Delta Lake for data engineering, you'll find this book useful. Basic knowledge of Python, Spark, and SQL is expected.
A comprehensive compilation of new developments in data linkage methodology. The increasing availability of large administrative databases has led to a dramatic rise in the use of data linkage, yet the standard texts on linkage are still those which describe the seminal work from the 1950s and 1960s, with some updates. Linkage and analysis of data across sources remains problematic due to a lack of discriminatory and accurate identifiers, missing data, and regulatory issues. Recent developments in data linkage methodology have concentrated on bias and analysis of linked data, novel approaches to organising relationships between databases, and privacy-preserving linkage. Methodological Developments in Data Linkage brings together a collection of contributions from members of the international data linkage community, covering cutting-edge methodology in this field. It presents opportunities and challenges provided by linkage of large and often complex datasets, including analysis problems, legal and security aspects, models for data access, and the development of novel research areas. New methods for handling uncertainty in analysis of linked data, solutions for anonymised linkage, and alternative models for data collection are also discussed.
Key Features
- Presents cutting-edge methods for a topic of increasing importance to a wide range of research areas, with applications to data linkage systems internationally
- Covers the essential issues associated with data linkage today
- Includes examples based on real data linkage systems, highlighting the opportunities, successes and challenges that the increasing availability of linkage data provides
- Novel approach incorporates technical aspects of linkage, management and analysis of linked data
This book will be of core interest to academics, government employees, data holders, data managers, analysts and statisticians who use administrative data. It will also appeal to researchers in a variety of areas, including epidemiology, biostatistics, social statistics, informatics, policy and public health.
The social sciences are becoming datafied. The questions once considered the domain of sociologists are now answered by data scientists operating on large datasets and breaking with methodological tradition, for better or worse. The traditional social sciences, such as sociology or anthropology, are under the double threat of becoming marginalized or even irrelevant, both from new methods of research which require more computational skills and from increasing competition from the corporate world which gains an additional advantage based on data access. However, unlike data scientists, sociologists and anthropologists have a long history of doing qualitative research. The more quantified datasets we have, the more difficult it is to interpret them without adding layers of qualitative interpretation. Big Data therefore needs Thick Data. This book presents the available arsenal of new methods and tools for studying society both quantitatively and qualitatively, opening ground for the social sciences to take the lead in analysing digital behaviour. It shows that Big Data can and should be supplemented and interpreted through thick data as well as cultural analysis. Thick Big Data is critically important for students and researchers in the social sciences to understand the possibilities of digital analysis, both in the quantitative and qualitative area, and to successfully build mixed-methods approaches.
Get started using Python in data analysis with this compact practical guide. This book includes three exercises and a case study on getting data in and out of Python code in the right format. Learn Data Analysis with Python also helps you discover meaning in the data using analysis and shows you how to visualize it. Each lesson is, as much as possible, self-contained to allow you to dip in and out of the examples as your needs dictate. If you are already using Python for data analysis, you will find a number of things that you wish you knew how to do in Python. You can then take these techniques and apply them directly to your own projects. If you aren't using Python for data analysis, this book takes you through the basics at the beginning to give you a solid foundation in the topic. By the time you finish the book, you will have a better idea of how to use Python for data analysis.
What You Will Learn
- Get data into and out of Python code
- Prepare the data and its format
- Find the meaning of the data
- Visualize the data using IPython
Who This Book Is For
Those who want to learn data analysis using Python. Some experience with Python is recommended but not required, as is some prior experience with data analysis or data science.
This book constitutes the thoroughly refereed post-workshop proceedings of the 6th International Workshop on Big Data Benchmarking, WBDB 2015, held in Toronto, ON, Canada, in June 2015 and the 7th International Workshop, WBDB 2015, held in New Delhi, India, in December 2015. The 8 full papers presented in this book were carefully reviewed and selected from 22 submissions. They deal with recent trends in big data and HPC convergence, new proposals for big data benchmarking, as well as tooling and performance results.
This book constitutes revised selected papers from the 4th ECML PKDD Workshop on Data Analytics for Renewable Energy Integration, DARE 2016, held in Riva del Garda, Italy, in September 2016. The 11 papers presented in this volume were carefully reviewed and selected for inclusion in this book and handle topics such as time series forecasting, the detection of faults, cyber security, smart grid and smart cities, technology integration, demand response and many others.
People have described nature since the beginning of human history. They do it for various purposes, including to communicate about economic, social, governmental, meteorological, sustainability-related, strategic, military, and survival issues as well as artistic expression. As a part of the whole world of living beings, we use various types of senses, known and unknown, labeled and not identified, to both communicate and create. Describing Nature Through Visual Data is a collection of impactful research that discusses issues related to the visualization of scientific concepts, picturing processes, and products, as well as the role of computing in advancing visual literacy skills. Organized into four sections, the book contains descriptions, theories, and examples of visual and music-based solutions concerning the selected natural or technological events that are shaping present-day reality. The chapters pertain to selected scientific fields, digital art, computer graphics, and new media and confer the possible ways that visuals, visualization, simulation, and interactive knowledge presentation can help us to understand and share the content of scientific thought, research, artistic works, and practice. Featuring coverage on topics that include mathematical thinking, music theory, and visual communication, this reference is ideal for instructors, professionals, researchers, and students keen on comprehending and enhancing the role of knowledge visualization in computing, sciences, design, media communication, film, advertising, and marketing.
Machine learning has finally come of age. With H2O software, you can perform machine learning and data analysis using a simple open source framework that's easy to use, has a wide range of OS and language support, and scales for big data. This hands-on guide teaches you how to use H2O with only minimal math and theory behind the learning algorithms. If you're familiar with R or Python, know a bit of statistics, and have some experience manipulating data, author Darren Cook will take you through H2O basics and help you conduct machine-learning experiments on different sample data sets. You'll explore several modern machine-learning techniques such as deep learning, random forests, unsupervised learning, and ensemble learning.
- Learn how to import, manipulate, and export data with H2O
- Explore key machine-learning concepts, such as cross-validation and validation data sets
- Work with three diverse data sets, including a regression, a multinomial classification, and a binomial classification
- Use H2O to analyze each sample data set with four supervised machine-learning algorithms
- Understand how cluster analysis and other unsupervised machine-learning algorithms work
Ask questions of your data and gain insights to make better business decisions using the open source business intelligence tool, Metabase.
Key Features
- Deploy Metabase applications to let users across your organization interact with it
- Learn to create data visualizations, charts, reports, and dashboards with the help of a variety of examples
- Understand how to embed Metabase into your website and send out reports automatically using email and Slack
Book Description
Metabase is an open source business intelligence tool that helps you use data to answer questions about your business. This book will give you a detailed introduction to using Metabase in your organization to get the most value from your data. You'll start by installing and setting up Metabase on your local computer. You'll then progress to handling the administration aspect of Metabase by learning how to configure and deploy Metabase, manage accounts, and execute administrative tasks such as adding users and creating permissions and metadata. Complete with examples and detailed instructions, this book shows you how to create different visualizations, charts, and dashboards to gain insights from your data. As you advance, you'll learn how to share the results with peers in your organization and cover production-related aspects such as embedding Metabase and auditing performance. Throughout the book, you'll explore the entire data analytics process, from connecting your data sources, visualizing data, and creating dashboards through to daily reporting. By the end of this book, you'll be ready to implement Metabase as an integral tool in your organization.
What you will learn
- Explore different types of databases and find out how to connect them to Metabase
- Deploy and host Metabase securely using Amazon Web Services
- Use Metabase's user interface to filter and aggregate data on single and multiple tables
- Become a Metabase admin by learning how to add users and create permissions
- Answer critical questions for your organization by using the Notebook editor and writing SQL queries
- Use the search functionality to search through tables, dashboards, and metrics
Who this book is for
This book is for business analysts, data analysts, data scientists, and other professionals who want to become well-versed with business intelligence and analytics using Metabase. This book will also appeal to anyone who wants to understand their data to extract meaningful insights with the help of practical examples. A basic understanding of data handling and processing is necessary to get started with this book.