This new edition covers key topics in the latest versions of MS Office through Excel 2019, including the creation of custom ribbons by injecting XML code into Excel workbooks and the linking of Excel VBA macros to custom ribbon objects. It also provides examples of using ADO, DAO, and SQL queries to retrieve data from databases for analysis. Operations such as fully automated linear and non-linear curve fitting, linear and non-linear mapping, charting, plotting, sorting, and filtering of data have been updated to leverage the newest Excel VBA object models. The text provides examples of automated data analysis and the preparation of custom reports suitable for legal archiving and dissemination. Functionality demonstrated in this edition includes:
* Find and extract information from raw data files
* Format data in color (conditional formatting)
* Perform non-linear and linear regressions on data
* Create custom functions for specific applications
* Generate datasets for regressions and functions
* Create custom reports for regulatory agencies
* Leverage email to send generated reports
* Return data to Excel using ADO, DAO, and SQL queries
* Create database files for processed data
* Create tables, records, and fields in databases
* Add data to databases in fields or records
* Leverage external computational engines
* Call functions in MATLAB(R) and Origin(R) from Excel
Although a number of books on Big Data have already been published, most cover only basic concepts and societal impacts and ignore the internal implementation details, making them unsuitable for R&D practitioners. To fill that need, Big Data: Storage, Sharing, and Security examines Big Data management from an R&D perspective. It covers the 3S design areas (storage, sharing, and security) through detailed descriptions of Big Data concepts and implementations. Written by well-recognized Big Data experts from around the world, the book contains more than 450 pages of technical detail on the most important implementation aspects of Big Data. After reading this book, you will understand how to:
* Aggregate heterogeneous types of data from numerous sources, then use efficient database management technology to store the Big Data
* Use cloud computing to share Big Data among large groups of people
* Protect the privacy of Big Data during network sharing
With the goal of facilitating the scientific research and engineering design of Big Data systems, the book consists of two parts. Part I, Big Data Management, addresses the important topics of spatial management, data transfer, and data processing. Part II, Security and Privacy Issues, provides technical details on security, privacy, and accountability. Examining the state of the art of Big Data over clouds, the book presents a novel architecture for achieving reliability, availability, and security for services running on clouds. It supplies technical descriptions of Big Data models, algorithms, and implementations, and considers emerging developments in Big Data applications. Each chapter includes references for further study.
As today's organizations are capturing exponentially larger amounts of data than ever, now is the time to rethink how that data is digested. Through advanced algorithms and analytics techniques, organizations can harness this data, discover hidden patterns, and use the newly acquired knowledge to achieve competitive advantages. Presenting the contributions of leading experts in their respective fields, Big Data: Algorithms, Analytics, and Applications bridges the gap between the vastness of Big Data and the appropriate computational methods for scientific and social discovery. It covers fundamental issues in Big Data, including efficient algorithmic methods to process data, better analytical strategies to digest data, and representative applications in diverse fields such as medicine, science, and engineering. The book is organized into five main sections:
* Big Data Management considers research issues related to the management of Big Data, including indexing and scalability aspects
* Big Data Processing addresses the problem of processing Big Data across a wide range of resource-intensive computational settings
* Big Data Stream Techniques and Algorithms explores research issues in the management and mining of Big Data in streaming environments
* Big Data Privacy focuses on models, techniques, and algorithms for preserving Big Data privacy
* Big Data Applications illustrates practical applications of Big Data across several domains, including finance, multimedia tools, biometrics, and satellite Big Data processing
Overall, the book reports on state-of-the-art studies and achievements in algorithms, analytics, and applications of Big Data. It provides readers with the basis for further efforts in this challenging scientific field, which will play a leading role in next-generation database, data warehousing, data mining, and cloud computing research.
It also explores related applications in diverse sectors, covering technologies for media/data communication, elastic media/data storage, cross-network media/data fusion, and SaaS.
Data and its technologies now play a large and growing role in humanities research and teaching. This book addresses the needs of humanities scholars who seek deeper expertise in the area of data modeling and representation. The authors, all experts in digital humanities, offer a clear explanation of key technical principles, a grounded discussion of case studies, and an exploration of important theoretical concerns. The book opens with an orientation, giving the reader a history of data modeling in the humanities and a grounding in the technical concepts necessary to understand and engage with the second part of the book. The second part of the book is a wide-ranging exploration of topics central for a deeper understanding of data modeling in digital humanities. Chapters cover data modeling standards and the role they play in shaping digital humanities practice, traditional forms of modeling in the humanities and how they have been transformed by digital approaches, ontologies which seek to anchor meaning in digital humanities resources, and how data models inhabit the other analytical tools used in digital humanities research. It concludes with a glossary chapter that explains specific terms and concepts for data modeling in the digital humanities context. This book is a unique and invaluable resource for teaching and practising data modeling in a digital humanities context.
There is increasing pressure to protect computer networks against unauthorized intrusion, and some work in this area is concerned with engineering systems that are robust to attack. However, no system can be made invulnerable. Data Analysis for Network Cyber-Security focuses on monitoring and analyzing network traffic data, with the intention of preventing, or quickly identifying, malicious activity. Such work involves the intersection of statistics, data mining and computer science. Fundamentally, network traffic is relational, embodying a link between devices. As such, graph analysis approaches are a natural candidate. However, such methods do not scale well to the demands of real problems, and the critical aspect of the timing of communication events is not accounted for in these approaches. This book gathers papers from leading researchers to provide both background to the problems and a description of cutting-edge methodology. The contributors are from diverse institutions and areas of expertise and were brought together at a workshop held at the University of Bristol in March 2013 to address the issues of network cyber-security. The workshop was supported by the Heilbronn Institute for Mathematical Research.
A unique, integrated approach to exploratory data mining and data quality. Data analysts at information-intensive businesses are frequently asked to analyze new data sets that are often dirty: composed of numerous tables with unknown properties. Prior to analysis, this data must be cleaned and explored, often a long and arduous task. Ensuring data quality is a notoriously messy problem that can only be addressed by drawing on methods from many disciplines, including statistics, exploratory data mining, database management, and metadata coding. Where other books on data mining and analysis focus primarily on the last stage of the analysis procedure, Exploratory Data Mining and Data Cleaning uses a uniquely integrated approach to data exploration and data cleaning to develop a suitable modeling strategy that will help analysts more effectively determine and implement the final technique. The authors, both seasoned data analysts at a major corporation, draw on their own professional experience throughout.
A groundbreaking addition to the existing literature, Exploratory Data Mining and Data Cleaning serves as an important reference for data analysts who need to analyze large amounts of unfamiliar data, operations managers, and students in undergraduate or graduate-level courses dealing with data analysis and data mining.
This book focuses on computer-intensive statistical methods, such as validation, model selection, and the bootstrap, that help solve problems in areas such as economics, meteorology, and transportation which could not previously be addressed by methods such as regression and time series modelling.
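As a small illustration of the computer-intensive methods the blurb names, the sketch below implements a percentile bootstrap confidence interval in plain Python. It is not taken from the book; the function name, data, and parameter choices are illustrative assumptions.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic.

    Resamples the data with replacement many times, computes the
    statistic on each resample, and reads the interval off the
    sorted resample estimates.
    """
    rng = random.Random(seed)
    estimates = sorted(
        stat([rng.choice(data) for _ in range(len(data))])
        for _ in range(n_resamples)
    )
    lo = estimates[int((alpha / 2) * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

sample = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4, 4.7, 5.2, 5.5]
low, high = bootstrap_ci(sample)  # 95% CI for the mean of `sample`
```

The point of the technique is that the interval comes from resampling alone, with no distributional assumptions of the kind classical regression methods require.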
To lead a data science team, you need more than the typical set of business management skills: you must expertly articulate technology roadmaps, support a data-driven culture, and plan a data strategy that drives a competitive business plan. In this practical guide, you'll learn leadership techniques the authors developed while building multiple high-performance data teams. In How to Lead in Data Science you'll master techniques for leading data science at every level of seniority, from heading up a single project to overseeing a whole company's data strategy. You'll find advice on plotting your long-term career advancement, as well as quick wins you can put into practice right away. Throughout, carefully crafted assessments and interview scenarios encourage introspection, reveal personal blind spots, and show development areas to help advance your career. Whether you're looking to manage your team better or work towards a seat at your company's top leadership table, this book will show you how.
What happens when a researcher and a practitioner spend hours crammed in a Fiat discussing data visualization? Beyond creating beautiful charts, they found greater richness in the craft as an integrated whole. Drawing from their unconventional backgrounds, these two women take readers on a journey through perception, semantics, and intent as the triad that influences visualization. This visually engaging book blends ideas from theory, academia, and practice to craft beautiful yet meaningful visualizations and dashboards. How do you take your visualization skills to the next level? The book is perfect for analysts, research and data scientists, journalists, and business professionals. Functional Aesthetics for Data Visualization is also an indispensable resource for just about anyone curious about seeing and understanding data. Think of it as a coffee-table book for the data geek in you. https://www.functionalaestheticsbook.com
Rough Set Theory, introduced by Pawlak in the early 1980s, has become an important part of soft computing over the last 25 years. However, much of the focus has been on the theoretical understanding of Rough Sets, and a survey of Rough Sets and their applications in business and industry has long been needed. "Rough Sets: Selected Methods and Applications in Management and Engineering" provides context for Rough Set theory, with each chapter exploring a real-world application of Rough Sets. "Rough Sets" is relevant to managers striving to improve their businesses, industry researchers looking to improve the efficiency of their solutions, and university researchers wanting to apply Rough Sets to real-world problems.
This book presents an accessible introduction to data-driven storytelling. Resulting from unique discussions between data visualization researchers and data journalists, it offers an integrated definition of the topic, presents vivid examples and patterns for data storytelling, and calls out key challenges and new opportunities for researchers and practitioners.
Quantitative Intelligence Analysis describes the model-based method of intelligence analysis, which represents the analyst's mental models of a subject as well as the analyst's reasoning process, exposing what the analyst believes about the subject and how they arrived at those beliefs and converged on analytic judgments. It includes:
* specific methods for explicitly representing the analyst's mental models as computational models;
* dynamic simulations and interactive analytic games;
* the structure of an analyst's mental model and the theoretical basis for capturing and representing the tacit knowledge of these models explicitly as computational models;
* a detailed description of the use of these models in rigorous, structured analysis of difficult targets;
* model illustrations and simulation descriptions;
* the role of models in support of collection and operations;
* case studies that illustrate a wide range of intelligence problems;
* and a recommended curriculum for technical analysts.
Regarding the set of all feature attributes in a given database as the universal set, this monograph discusses various nonadditive set functions that describe the interaction among the contributions from feature attributes towards a considered target attribute. Then, the relevant nonlinear integrals are investigated. These integrals can be applied as aggregation tools in information fusion and data mining, such as synthetic evaluation, nonlinear multiregressions, and nonlinear classifications. Some methods of fuzzification are also introduced for nonlinear integrals such that fuzzy data can be treated and fuzzy information is retrievable. The book is suitable as a text for graduate courses in mathematics, computer science, and information science. It is also useful to researchers in the relevant area.
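One well-known nonlinear integral with respect to a nonadditive set function is the Choquet integral, which aggregates attribute scores while letting the measure capture interaction among attributes. The sketch below is a minimal pure-Python version; the attribute names and measure values are invented for illustration, and a measure in practice would be defined on all relevant subsets.

```python
def choquet_integral(values, measure):
    """Choquet integral of `values` (attribute -> score) with respect to a
    nonadditive set function `measure` (frozenset of attributes -> weight).

    Sorts attributes by ascending score and accumulates the score
    increments weighted by the measure of each upper level set.
    """
    attrs = sorted(values, key=values.get)        # x_(1) <= ... <= x_(n)
    total, prev = 0.0, 0.0
    for i, a in enumerate(attrs):
        level = frozenset(attrs[i:])              # A_(i) = {x_(i), ..., x_(n)}
        total += (values[a] - prev) * measure[level]
        prev = values[a]
    return total

# Two attributes whose joint weight exceeds the sum of their individual
# weights -- a positive interaction an additive measure cannot express.
mu = {
    frozenset({"math", "physics"}): 1.0,
    frozenset({"math"}): 0.3,
    frozenset({"physics"}): 0.4,
}
score = choquet_integral({"math": 0.6, "physics": 0.8}, mu)
```

With an additive measure the computation reduces to an ordinary weighted sum; the interaction effects described in the blurb appear only when the measure is genuinely nonadditive.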
This book explains how to perform data de-noising at large scale with a satisfactory level of accuracy. Three main issues are considered. First, how to eliminate the propagation of error from one stage to the next while developing a filtered model. Second, how to maintain the positional importance of data whilst purifying it. Finally, preservation of memory in the data is crucial to extracting smart data from noisy big data. If, after any form of smoothing or filtering, the memory of the corresponding data changes heavily, the final data may lose important information, which can lead to erroneous conclusions. Yet even when anticipating such loss of information, the process of de-noising cannot be avoided, since any analysis of big data in the presence of noise can be misleading. The entire process therefore demands very careful execution with efficient and smart models.
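To make the smoothing trade-off concrete, here is the simplest possible filter: a centered moving average. It is only an illustration of the general idea (not the book's models); note how it keeps output length equal to input length, preserving each point's position, while also flattening a genuine spike, which is exactly the loss-of-memory risk the blurb warns about.

```python
def moving_average(signal, window=3):
    """Centered moving average. Endpoints use a shrunken window so the
    output has the same length as the input (positional importance)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

# A mostly flat signal with one sharp spike at index 4.
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 1.0, 0.8, 1.1]
smooth = moving_average(noisy)
```

The spike's amplitude drops from 5.0 to about 2.37 after filtering: noise is reduced, but so is the evidence that the spike ever happened. A smarter model must decide whether that feature was noise or information before discarding it.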
This is the first comprehensive book dedicated entirely to the field of decision trees in data mining and covers all aspects of this important technique. Decision trees have become one of the most powerful and popular approaches in knowledge discovery and data mining, the science and technology of exploring large and complex bodies of data in order to discover useful patterns. The area is of great importance because it enables modeling and knowledge extraction from the abundance of data available. Both theoreticians and practitioners are continually seeking techniques to make the process more efficient, cost-effective and accurate. Decision trees, originally implemented in decision theory and statistics, are highly effective tools in other areas such as data mining, text mining, information extraction, machine learning, and pattern recognition. This book invites readers to explore the many benefits that decision trees offer in data mining.
Most biologists use nonlinear regression more than any other statistical technique, but there are very few places to learn about curve-fitting. This book, by the author of the very successful Intuitive Biostatistics, addresses this relatively focused need of an extraordinarily broad range of scientists.
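As a hint of what curve fitting involves, the sketch below fits the exponential model y = a·exp(b·x) by ordinary least squares on (x, ln y). This is not the book's method (dedicated software iterates on the original nonlinear scale), and the data are fabricated for the example, but the log transform shows how a nonlinear model can sometimes be reduced to a linear fit.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) via least squares on the log-transformed
    data: ln y = ln a + b * x is a straight line in x."""
    lys = [math.log(y) for y in ys]
    n = len(xs)
    mx, my = sum(xs) / n, sum(lys) / n
    b = (sum((x - mx) * (ly - my) for x, ly in zip(xs, lys))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - b * mx)
    return a, b

# Noise-free data generated from y = 2 * exp(0.5 * x):
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
```

With noisy data the log transform distorts the error structure, which is one reason biologists reach for true nonlinear regression of the kind this book teaches.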
S-PLUS is a powerful environment for the statistical and graphical analysis of data. It provides the tools to implement many statistical ideas which have been made possible by the widespread availability of workstations with good graphics and computational capabilities. This book is a guide to using S-PLUS to perform statistical analyses and provides both an introduction to the use of S-PLUS and a course in modern statistical methods. S-PLUS is available for both Windows and UNIX workstations, and both versions are covered in depth. The aim of the book is to show how to use S-PLUS as a powerful and graphical data analysis system. Readers are assumed to have a basic grounding in statistics, and so the book is intended for would-be users of S-PLUS, both students and researchers using statistics. Throughout, the emphasis is on presenting practical problems and full analyses of real data sets. Many of the methods discussed are state-of-the-art approaches to topics such as linear, nonlinear, and smooth regression models, tree-based methods, multivariate analysis and pattern recognition, survival analysis, time series and spatial statistics. Throughout, modern techniques such as robust methods, non-parametric smoothing, and bootstrapping are used where appropriate. This third edition is intended for users of S-PLUS 4.5, 5.0, 2000 or later, although S-PLUS 3.3/4 are also considered. The major change from the second edition is coverage of the current versions of S-PLUS. The material has been extensively rewritten using new examples and the latest computationally intensive methods. The companion volume on S Programming will provide an in-depth guide for those writing software in the S language. The authors have written several software libraries that enhance S-PLUS; these and all the datasets used are available on the Internet in versions for Windows and UNIX.
There are extensive on-line complements covering advanced material, user-contributed extensions, further exercises, and new features of S-PLUS as they are introduced. Dr. Venables is now Statistician with CSIRO in Queensland, having previously been at the Department of Statistics, University of Adelaide, for many years. He has given many short courses on S-PLUS in Australia, Europe, and the USA. Professor Ripley holds the Chair of Applied Statistics at the University of Oxford, and is the author of four other books on spatial statistics, simulation, pattern recognition, and neural networks.
This book provides a first-hand account of business analytics and its implementation, along with a brief theoretical framework underpinning each component of business analytics. The themes of the book include (1) learning the contours and boundaries of what lies within the scope of business analytics; (2) understanding the organizational design aspects of an analytical organization; (3) providing knowledge of the domain focus of developing business activities for financial impact in functional analysis; and (4) deriving a whole gamut of business use cases in a variety of situations in which to apply the techniques. The book gives a complete, insightful understanding of developing and implementing analytical solutions.
It is universally accepted today that parallel processing is here to stay but that software for parallel machines is still difficult to develop. However, there is little recognition of the fact that changes in processor architecture can significantly ease the development of software. In the seventies, the availability of processors that could address a large name space directly eliminated the problem of name management at one level and paved the way for the routine development of large programs. Similarly, today, processor architectures that can facilitate cheap synchronization and provide a global address space can simplify compiler development for parallel machines. If the cost of synchronization remains high, the programming of parallel machines will remain significantly less abstract than programming sequential machines. In this monograph Bob Iannucci presents the design and analysis of an architecture that can be a better building block for parallel machines than any von Neumann processor. There is another very interesting motivation behind this work. It is rooted in the long and venerable history of dataflow graphs as a formalism for expressing parallel computation. The field has bloomed since 1974, when Dennis and Misunas proposed a truly novel architecture using dataflow graphs as the parallel machine language. The novelty and elegance of dataflow architectures has, however, also kept us from asking the real question: "What can dataflow architectures buy us that von Neumann architectures can't?" In the following I explain in a roundabout way how Bob and I arrived at this question.
This book provides a comprehensive introduction on opinion analysis for online reviews. It offers the newest research on opinion mining, including theories, algorithms and datasets. A new feature presentation method is highlighted for sentiment classification. Then, a three-phase framework for sentiment classification is proposed, where a set of sentiment classifiers are selected automatically to make predictions. Such predictions are integrated via ensemble learning. Finally, to solve the problem of combination explosion encountered, a greedy algorithm is devised to select the base classifiers.
First Published in 2004. Learning how to analyze qualitative data by computer can be fun. That is one assumption underpinning this introduction to qualitative analysis, which takes account of how computing techniques have enhanced and transformed the field. The author provides a practical discussion of the main procedures for analyzing qualitative data by computer, with most of its examples taken from humour or everyday life. He examines ways in which computers can contribute to greater rigour and creativity, as well as greater efficiency in analysis. He discusses some of the pitfalls and paradoxes as well as the practicalities of computer-based qualitative analysis. The perspective of "Qualitative Data Analysis" is pragmatic rather than prescriptive, introducing different possibilities without advocating one particular approach. The result is a largely discipline-neutral text, which is suitable for arts and social science students and first-time qualitative analysts.
This book is the culmination of three years of research effort on a multidisciplinary project in which physicists, mathematicians, computer scientists and social scientists worked together to arrive at a unifying picture of complex networks. The contributed chapters form a reference for the various problems in data analysis, visualization and modeling of complex networks.
The authors provide an understanding of big data and MapReduce by clearly presenting the basic terminology and concepts. They employ over 100 illustrations and many worked-out examples to convey the concepts and methods used in big data, the inner workings of MapReduce, and single-node/multi-node installation on physical or virtual machines. The book covers almost all the information needed for most online Hadoop MapReduce certification exams. Upon completing it, readers will find it easy to understand other big data processing tools such as Spark and Storm. Ultimately, readers will be able to:
* understand what big data is and the factors involved
* understand the inner workings of MapReduce, which is essential for certification exams
* learn the features and weaknesses of MapReduce
* set up Hadoop clusters with hundreds of physical or virtual machines
* create a virtual machine in AWS
* write MapReduce programs with Eclipse in a simple way
* understand other big data processing tools and their applications
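The inner workings of MapReduce can be sketched with the classic word-count example: a map phase emits (key, value) pairs, a shuffle groups values by key, and a reduce phase combines each group. The simulation below is single-process Python, not Hadoop; in a real cluster each phase runs in parallel across machines.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word, 1) for word in record.split()]

def shuffle(pairs):
    """Shuffle: group all intermediate values by key, as the framework
    does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key (here, sum counts)."""
    return key, sum(values)

records = ["big data big ideas", "data beats ideas"]
intermediate = [pair for rec in records for pair in map_phase(rec)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
```

Because map calls are independent per record and reduce calls are independent per key, both phases parallelize trivially, which is the core idea behind MapReduce's scalability.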
The increasing availability of data in our current, information-overloaded society has led to the need for valid tools for its modelling and analysis. Data mining and applied statistical methods are the appropriate tools to extract knowledge from such data. This book provides an accessible introduction to data mining methods in a consistent and application-oriented statistical framework, using case studies drawn from real industry projects and highlighting the use of data mining methods in a variety of business applications. It:
* introduces data mining methods and applications;
* covers classical and Bayesian multivariate statistical methodology as well as machine learning and computational data mining methods;
* includes many recent developments such as association and sequence rules, graphical Markov models, lifetime value modelling, credit risk, operational risk and web mining;
* features detailed case studies based on applied projects within industry;
* incorporates discussion of data mining software, with case studies analysed using R;
* is accessible to anyone with a basic knowledge of statistics or data analysis;
* includes an extensive bibliography and pointers to further reading within the text.
"Applied Data Mining for Business and Industry, 2nd edition" is aimed at advanced undergraduate and graduate students of data mining, applied statistics, database management, computer science and economics. The case studies will provide guidance to professionals working in industry on projects involving large volumes of data, such as customer relationship management, web design, risk management, marketing, economics and finance.