Is your business looking out? The world today is drowning in data. There is a treasure trove of valuable and underutilized insights that can be gleaned from information companies and people leave behind on the internet - our 'digital breadcrumbs' - from job postings, to online news, social media, online ad spend, patent applications and more. As a result, we're at the cusp of a major shift in the way businesses are managed and governed - moving from a focus solely on lagging, internal data, toward analyses that also encompass industry-wide, external data to paint a more complete picture of a brand's opportunities and threats and uncover forward-looking insights, in real time. Tomorrow's most successful brands are already embracing Outside Insight, benefitting from an information advantage while their competition is left behind. Drawing on practical examples of transformative, data-led decisions made by brands like Apple, Facebook, Barack Obama and many more, in Outside Insight, Meltwater CEO Jorn Lyseggen illustrates the future of corporate decision-making and offers a detailed plan for business leaders to implement Outside Insight thinking into their company mindset and processes.
This book explores cognitive behavior among Internet of Things devices. Using a series of current and futuristic examples - appliances, personal assistants, robots, driverless cars, customer care, engineering, monetization, and many more - the book covers use cases, technology and communication aspects of how machines will support individuals and organizations. This book examines Cognitive Things, covering a number of important questions: * What are Cognitive Things? * What applications can be driven from Cognitive Things - today and tomorrow? * How will these Cognitive Things collaborate with each other, with individuals and with organizations? * What is the cognitive era? How is it different from the automation era? * How will the Cognitive Things support or accelerate human problem solving? * Which technical components make up cognitive behavior? * How does it redistribute the work-load between humans and machines? * What types of data can be collected from them and shared with external organizations? * How do they recognize and authenticate authorized users? How is the data safeguarded from potential theft? Who owns the data and how are the data ownership rights enforced? Overall, Sathi explores ways in which Cognitive Things bring value to individuals as well as organizations and how to integrate the use of the devices into changing organizational structures. Case studies are used throughout to illustrate how innovators are already benefiting from the initial explosion of devices and data. Business executives, operational managers, and IT professionals will understand the fundamental changes required to fully benefit from cognitive technologies and how to utilize them for their own success.
Delve into your data for the key to success Data mining is quickly becoming integral to creating value and business momentum. The ability to detect unseen patterns hidden in the numbers exhaustively generated by day-to-day operations allows savvy decision-makers to exploit every tool at their disposal in the pursuit of better business. By creating models and testing whether patterns hold up, it is possible to discover new intelligence that could change your business's entire paradigm for a more successful outcome. Data Mining for Dummies shows you why it doesn't take a data scientist to gain this advantage, and empowers average business people to start shaping a process relevant to their business's needs. In this book, you'll learn the hows and whys of mining to the depths of your data, and how to make the case for heavier investment into data mining capabilities. The book explains the details of the knowledge discovery process including: * Model creation, validity testing, and interpretation * Effective communication of findings * Available tools, both paid and open-source * Data selection, transformation, and evaluation Data Mining for Dummies takes you step-by-step through a real-world data-mining project using open-source tools that allow you to get immediate hands-on experience working with large amounts of data. You'll gain the confidence you need to start making data mining practices a routine part of your successful business. If you're serious about doing everything you can to push your company to the top, Data Mining for Dummies is your ticket to effective data mining.
Cluster analysis consists of methods for finding groups in data automatically. Most methods have been heuristic and leave open such central questions as: How many clusters are there? Which clustering method should I use? How should I handle outliers? This introduction frames cluster analysis in terms of statistical models, yielding principled estimation, testing and prediction methods, and soundly-based answers to central questions. It develops basic ideas of model-based clustering in an accessible but rigorous way, using extensive real-world data examples and providing R code for many methods, and describes modern developments for high-dimensional data and networks. It explains recent methodological advances, such as Bayesian regularization methods, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised classification, clustering of functional data, text and images, and co-clustering. Written for advanced undergraduates and beginning graduate students in data science, researchers, and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.
If you are a biologist and want to get the best out of the powerful methods of modern computational statistics, this is your book. You can visualize and analyze your own data, apply unsupervised and supervised learning, integrate datasets, apply hypothesis testing, and make publication-quality figures using the power of R/Bioconductor and ggplot2. This book will teach you 'cooking from scratch', from raw data to beautiful illuminating output, as you learn to write your own scripts in the R language and to use advanced statistics packages from CRAN and Bioconductor. It covers a broad range of basic and advanced topics important in the analysis of high-throughput biological data, including principal component analysis and multidimensional scaling, clustering, multiple testing, unsupervised and supervised learning, resampling, the pitfalls of experimental design, and power simulations using Monte Carlo, and it even reaches networks, trees, spatial statistics, image data, and microbial ecology. Using a minimum of mathematical notation, it builds understanding from well-chosen examples, simulation, visualization, and above all hands-on interaction with data and code.
An accessible primer on how to create effective graphics from data. This book provides students and researchers a hands-on introduction to the principles and practice of data visualization. It explains what makes some graphs succeed while others fail, how to make high-quality figures from data using powerful and reproducible methods, and how to think about data visualization in an honest and effective way. Data Visualization builds the reader's expertise in ggplot2, a versatile visualization library for the R programming language. Through a series of worked examples, this accessible primer then demonstrates how to create plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics include plotting continuous and categorical variables; layering information on graphics; producing effective "small multiple" plots; grouping, summarizing, and transforming data for plotting; creating maps; working with the output of statistical models; and refining plots to make them more comprehensible. Effective graphics are essential to communicating ideas and a great way to better understand data. This book provides the practical skills students and practitioners need to visualize quantitative data and get the most out of their research findings. * Provides hands-on instruction using R and ggplot2 * Shows how the "tidyverse" of data analysis tools makes working with R easier and more consistent * Includes a library of data sets, code, and functions
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they're also a good way to dive into the discipline without actually understanding data science. With this updated second edition, you'll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today's messy glut of data holds answers to questions no one's even thought to ask. This book provides you with the know-how to dig those answers out.
This book, geared toward academic researchers and graduate students, brings together research on all facets of how time and causality relate across the sciences. Time is fundamental to how we perceive and reason about causes. It lets us immediately rule out the sound of a car crash as the cause of the crash. That a cause happens before its effect has been a core, and often unquestioned, part of how we describe causality. Research across disciplines shows that the relationship is much more complex than that. This book explores what that means for both the metaphysics and epistemology of causes: what they are and how we can find them. Across psychology, biology, and the social sciences, common themes emerge, suggesting that time plays a critical role in our understanding. The increasing availability of large time series datasets allows us to ask new questions about causality, necessitating new methods for modeling dynamic systems and incorporating mechanistic information into causal models.
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book.
This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today. Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You'll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company's data science projects. You'll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making. * Understand how data science fits in your organization - and how you can use it for competitive advantage * Treat data as a business asset that requires careful investment if you're to gain real value * Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way * Learn general concepts for actually extracting knowledge from data * Apply data science principles when interviewing data science job candidates
In 25 concise steps, you will learn the basics of blockchain technology. No mathematical formulas, program code, or computer science jargon are used. No previous knowledge in computer science, mathematics, programming, or cryptography is required. Terminology is explained through pictures, analogies, and metaphors. This book bridges the gap that exists between purely technical books about the blockchain and purely business-focused books. It does so by explaining both the technical concepts that make up the blockchain and their role in business-relevant applications. What You'll Learn: * What the blockchain is * Why it is needed and what problem it solves * Why there is so much excitement about the blockchain and its potential * Major components and their purpose * How various components of the blockchain work and interact * Limitations, why they exist, and what has been done to overcome them * Major application scenarios Who This Book Is For: Everyone who wants to get a general idea of what blockchain technology is, how it works, and how it will potentially change the financial system as we know it
A practical guide to data mining using SQL and Excel. Data Analysis Using SQL and Excel, 2nd Edition shows you how to leverage the two most popular tools for data query and analysis, SQL and Excel, to perform sophisticated data analysis without the need for complex and expensive data mining tools. Written by a leading expert on business data mining, this book shows you how to extract useful business information from relational databases. You'll learn the fundamental techniques before moving into the "where" and "why" of each analysis, and then learn how to design and perform these analyses using SQL and Excel. Examples include SQL and Excel code, and the appendix shows how non-standard constructs are implemented in other major databases, including Oracle and IBM DB2/UDB. The companion website includes datasets and Excel spreadsheets, and the book provides hints, warnings, and technical asides to help you every step of the way. Data Analysis Using SQL and Excel, 2nd Edition shows you how to perform a wide range of sophisticated analyses using these simple tools, sparing you the significant expense of proprietary data mining tools like SAS. * Understand core analytic techniques that work with SQL and Excel * Ensure your analytic approach gets you the results you need * Design and perform your analysis using SQL and Excel Data Analysis Using SQL and Excel, 2nd Edition shows you how to best use the tools you already know to achieve expert results.
Harness the power of social media to predict customer behavior and improve sales. Social media is the biggest source of Big Data. Because of this, 90% of Fortune 500 companies are investing in Big Data initiatives that will help them predict consumer behavior to produce better sales results. Written by Dr. Gabor Szabo, a Senior Data Scientist at Twitter, and Dr. Oscar Boykin, a Software Engineer at Twitter, Social Media Data Mining and Analytics shows analysts how to use sophisticated techniques to mine social media data, obtaining the information they need to generate amazing results for their businesses. Social Media Data Mining and Analytics isn't just another book on the business case for social media. Rather, this book provides hands-on examples for applying state-of-the-art tools and technologies to mine social media - examples include Twitter, Facebook, Pinterest, Wikipedia, Reddit, Flickr, Web hyperlinks, and other rich data sources. In it, you will learn: * The four key characteristics of online services: users, social networks, actions, and content * The full data discovery lifecycle: data extraction, storage, analysis, and visualization * How to work with code and extract data to create solutions * How to use Big Data to make accurate customer predictions Szabo and Boykin wrote this book to provide businesses with the competitive advantage they need to harness the rich data that is available from social media platforms.
This book summarizes the most important findings of the Data for Refugees (D4R) Challenge, which was a non-profit project initiated to improve the conditions of the Syrian refugees in Turkey by providing a database for the scientific community to enable research on urgent problems concerning refugees. The database, based on anonymized mobile call detail records (CDRs) of phone calls and SMS messages of one million Turk Telekom customers, indicates the broad activity and mobility patterns of refugees and citizens in Turkey for the year 1 January to 31 December 2017. Over 100 teams from around the globe applied to take part in the challenge, and 61 teams were granted access to the data. This book describes the challenge, and presents selected and revised project reports on the five major themes: unemployment, health, education, social integration, and safety, respectively. These are complemented by additional invited chapters describing related projects, as well as ethical aspects. The book illustrates the possibilities of big data analytics in coping with refugee crises and humanitarian responses, by showcasing innovative approaches drawing on multiple data sources, information visualization, pattern analysis, and statistical analysis. After the start of the Syrian Civil War in 2011-12, increasing numbers of civilians sought refuge in neighboring countries. By May 2017, Turkey had received over 3 million refugees - the largest refugee population in the world. About 30% of them lived in government-run camps near the Syrian border. Many have moved to cities looking for work and better living conditions. They faced problems of integration, income, welfare, employment, health, education, language, social tension, and discrimination.
Data-driven discovery is revolutionizing the modeling, prediction, and control of complex systems. This textbook brings together machine learning, engineering mathematics, and mathematical physics to integrate modeling and control of dynamical systems with modern methods in data science. It highlights many of the recent advances in scientific computing that enable data-driven methods to be applied to a diverse range of complex systems, such as turbulence, the brain, climate, epidemiology, finance, robotics, and autonomy. Aimed at advanced undergraduate and beginning graduate students in the engineering and physical sciences, the text presents a range of topics and methods from introductory to state of the art.
Learn the art and science of predictive analytics techniques that get results. Predictive analytics is what translates big data into meaningful, usable business information. Written by a leading expert in the field, this guide examines the science of the underlying algorithms as well as the principles and best practices that govern the art of predictive analytics. It clearly explains the theory behind predictive analytics, teaches the methods, principles, and techniques for conducting predictive analytics projects, and offers tips and tricks that are essential for successful predictive modeling. Hands-on examples and case studies are included. * The ability to successfully apply predictive analytics enables businesses to effectively interpret big data, which is essential for competition today * This guide teaches not only the principles of predictive analytics, but also how to apply them to achieve real, pragmatic solutions * Explains methods, principles, and techniques for conducting predictive analytics projects from start to finish * Illustrates each technique with hands-on examples and includes a series of in-depth case studies that apply predictive analytics to common business scenarios * A companion website provides all the data sets used to generate the examples as well as a free trial version of software Applied Predictive Analytics arms data and business analysts and business managers with the tools they need to interpret and capitalize on big data.
Go beyond spreadsheets and tables and design a data presentation that really makes an impact. This practical guide shows you how to use Tableau Software to convert raw data into compelling data visualizations that provide insight or allow viewers to explore the data for themselves. Ideal for analysts, engineers, marketers, journalists, and researchers, this book describes the principles of communicating data and takes you on an in-depth tour of common visualization methods. You'll learn how to craft articulate and creative data visualizations with Tableau Desktop 8.1 and Tableau Public 8.1. * Present comparisons of how much and how many * Use blended data sources to create ratios and rates * Create charts to depict proportions and percentages * Visualize measures of mean, median, and mode * Learn how to deal with variation and uncertainty * Communicate multiple quantities in the same view * Show how quantities and events change over time * Use maps to communicate positional data * Build dashboards to combine several visualizations
Data Science is booming thanks to R and Python, but Java brings the robustness, convenience, and ability to scale critical to today's data science applications. With this practical book, Java software engineers looking to add data science skills will take a logical journey through the data science pipeline. Author Michael Brzustowicz explains the basic math theory behind each step of the data science process, as well as how to apply these concepts with Java. You'll learn the critical roles that data IO, linear algebra, statistics, data operations, learning and prediction, and Hadoop MapReduce play in the process. Throughout this book, you'll find code examples you can use in your applications. * Examine methods for obtaining, cleaning, and arranging data into its purest form * Understand the matrix structure that your data should take * Learn basic concepts for testing the origin and validity of data * Transform your data into stable and usable numerical values * Understand supervised and unsupervised learning algorithms, and methods for evaluating their success * Get up and running with MapReduce, using customized components suitable for data science algorithms
A timely overview of cutting edge technologies for multimedia retrieval with a special emphasis on scalability The amount of multimedia data available every day is enormous and is growing at an exponential rate, creating a great need for new and more efficient approaches for large scale multimedia search. This book addresses that need, covering the area of multimedia retrieval and placing a special emphasis on scalability. It reports the recent works in large scale multimedia search, including research methods and applications, and is structured so that readers with basic knowledge can grasp the core message while still allowing experts and specialists to drill further down into the analytical sections. Big Data Analytics for Large-Scale Multimedia Search covers: representation learning, concept and event-based video search in large collections; big data multimedia mining, large scale video understanding, big multimedia data fusion, large-scale social multimedia analysis, privacy and audiovisual content, data storage and management for big multimedia, large scale multimedia search, multimedia tagging using deep learning, interactive interfaces for big multimedia and medical decision support applications using large multimodal data. Addresses the area of multimedia retrieval and pays close attention to the issue of scalability Presents problem driven techniques with solutions that are demonstrated through realistic case studies and user scenarios Includes tables, illustrations, and figures Offers a Wiley-hosted BCS that features links to open source algorithms, data sets and tools Big Data Analytics for Large-Scale Multimedia Search is an excellent book for academics, industrial researchers, and developers interested in big multimedia data search retrieval. It will also appeal to consultants in computer science problems and professionals in the multimedia industry.
This practical guide provides nearly 200 self-contained recipes to help you solve machine learning challenges you may encounter in your daily work. If you're comfortable with Python and its libraries, including pandas and scikit-learn, you'll be able to address specific problems such as loading data, handling text or numerical data, model selection, dimensionality reduction, and many other topics. Each recipe includes code that you can copy and paste into a toy dataset to ensure that it actually works. From there, you can insert, combine, or adapt the code to help construct your application. Recipes also include a discussion that explains the solution and provides meaningful context. This cookbook takes you beyond theory and concepts by providing the nuts and bolts you need to construct working machine learning applications. You'll find recipes for: * Vectors, matrices, and arrays * Handling numerical and categorical data, text, images, and dates and times * Dimensionality reduction using feature extraction or feature selection * Model evaluation and selection * Linear and logistic regression, trees and forests, and k-nearest neighbors * Support vector machines (SVM), naive Bayes, clustering, and neural networks * Saving and loading trained models
Since long before computers were even thought of, data has been collected and organized by diverse cultures across the world. Once access to the Internet became a reality for large swathes of the world's population, the amount of data generated each day became huge, and continues to grow exponentially. It includes all our uploaded documents, video, and photos, all our social media traffic, our online shopping, even the GPS data from our cars. 'Big Data' represents a qualitative change, not simply a quantitative one. The term refers both to the new technologies involved, and to the way it can be used by business and government. Dawn E. Holmes uses a variety of case studies to explain how data is stored, analysed, and exploited by a variety of bodies from big companies to organizations concerned with disease control. Big data is transforming the way businesses operate, and the way medical research can be carried out. At the same time, it raises important ethical issues; Holmes discusses cases such as the Snowden affair, data security, and domestic smart devices which can be hijacked by hackers. ABOUT THE SERIES: The Very Short Introductions series from Oxford University Press contains hundreds of titles in almost every subject area. These pocket-sized books are the perfect way to get ahead in a new subject quickly. Our expert authors combine facts, analysis, perspective, new ideas, and enthusiasm to make interesting and challenging topics highly readable.
How can Twitter data be used to study individual-level human behavior and social interaction on a global scale? This book introduces readers to the methods, opportunities, and challenges of using Twitter data to analyze phenomena ranging from the number of people infected by the flu, to national elections, to tomorrow's stock prices. Each chapter, written by leading domain experts in clear and accessible language, takes the reader to the forefront of the newly emerging field of computational social science. An introductory chapter on Twitter data analysis provides an overview of key tools and skills, and gives pointers on how to get started, while the case studies demonstrate shortcomings, limitations, and pitfalls of Twitter data as well as its advantages. The book will be an excellent resource for social science students and researchers wanting to explore the use of online data.
This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues. It goes beyond the traditional focus on data mining problems to introduce advanced data types such as text, time series, discrete sequences, spatial data, graph data, and social networks. Until now, no single book has addressed all these topics in a comprehensive and integrated way. The chapters of this book fall into one of three categories: * Fundamental chapters: Data mining has four main problems, which correspond to clustering, classification, association pattern mining, and outlier analysis. These chapters comprehensively discuss a wide variety of methods for these problems. * Domain chapters: These chapters discuss the specific methods used for different domains of data such as text data, time-series data, sequence data, graph data, and spatial data. * Application chapters: These chapters study important applications such as stream mining, Web mining, ranking, recommendations, social networks, and privacy preservation. The domain chapters also have an applied flavor. Appropriate for both introductory and advanced data mining courses, Data Mining: The Textbook balances mathematical details and intuition. It contains the necessary mathematical details for professors and researchers, but it is presented in a simple and intuitive style to improve accessibility for students and industrial practitioners (including those with a limited mathematical background). Numerous illustrations, examples, and exercises are included, with an emphasis on semantically interpretable examples.
Praise for Data Mining: The Textbook: "As I read through this book, I have already decided to use it in my classes. This is a book written by an outstanding researcher who has made fundamental contributions to data mining, in a way that is both accessible and up to date. The book is complete with theory and practical use cases. It's a must-have for students and professors alike!" -- Qiang Yang, Chair of Computer Science and Engineering at Hong Kong University of Science and Technology "This is the most amazing and comprehensive text book on data mining. It covers not only the fundamental problems, such as clustering, classification, outliers and frequent patterns, and different data types, including text, time series, sequences, spatial data and graphs, but also various applications, such as recommenders, Web, social network and privacy. It is a great book for graduate students and researchers as well as practitioners." -- Philip S. Yu, UIC Distinguished Professor and Wexler Chair in Information Technology at University of Illinois at Chicago
This book constitutes the refereed proceedings of the 13th International Conference on Machine Learning and Cybernetics, Lanzhou, China, in July 2014. The 45 revised full papers presented were carefully reviewed and selected from 421 submissions. The papers are organized in topical sections on classification and semi-supervised learning; clustering and kernel; application to recognition; sampling and big data; application to detection; decision tree learning; learning and adaptation; similarity and decision making; learning with uncertainty; improved learning algorithms and applications.