This book presents a collection of representative and novel work in the field of data mining, knowledge discovery, clustering and classification, based on expanded and reworked versions of a selection of the best papers originally presented in French at the EGC 2014 and EGC 2015 conferences held in Rennes (France) in January 2014 and Luxembourg in January 2015. The book is in three parts: The first four chapters discuss optimization considerations in data mining. The second part explores specific quality measures, dissimilarities and ultrametrics. The final chapters focus on semantics, ontologies and social networks. Written for PhD and MSc students, as well as researchers working in the field, it addresses both theoretical and practical aspects of knowledge discovery and management.
This book introduces basic computing skills designed for industry professionals without a strong computer science background. Written in an easily accessible manner, and accompanied by a user-friendly website, it serves as a self-study guide to survey data science and data engineering for those who aspire to start a computing career, or expand on their current roles, in areas such as applied statistics, big data, machine learning, data mining, and informatics. The authors draw from their combined experience working at software and social network companies, on big data products at several major online retailers, as well as their experience building big data systems for an AI startup. Spanning from the basic inner workings of a computer to advanced data manipulation techniques, this book opens doors for readers to quickly explore and enhance their computing knowledge. Computing with Data comprises a wide range of computational topics essential for data scientists, analysts, and engineers, providing them with the necessary tools to be successful in any role that involves computing with data. The introduction is self-contained, and chapters progress from basic hardware concepts to operating systems, programming languages, graphing and processing data, testing and programming tools, big data frameworks, and cloud computing. The book is fashioned with several audiences in mind. Readers without a strong educational background in CS--or those who need a refresher--will find the chapters on hardware, operating systems, and programming languages particularly useful. Readers with a strong educational background in CS, but without significant industry background, will find the following chapters especially beneficial: learning R, testing, programming, visualizing and processing data in Python and R, system design for big data, data stores, and software craftsmanship.
This book not only discusses the important topics in the area of machine learning and combinatorial optimization, it also combines them into one. This was decisive for choosing the material to be included in the book and determining its order of presentation. Decision trees are a popular method of classification as well as of knowledge representation. At the same time, they are easy to implement as the building blocks of an ensemble of classifiers. Admittedly, however, the task of constructing a near-optimal decision tree is a very complex process. The good results typically achieved by the ant colony optimization algorithms when dealing with combinatorial optimization problems suggest the possibility of also using that approach for effectively constructing decision trees. The underlying rationale is that both problem classes can be presented as graphs. This fact leads to the option of considering a larger spectrum of solutions than those based on the heuristic. Moreover, ant colony optimization algorithms can be used to advantage when building ensembles of classifiers. This book is a combination of a research monograph and a textbook. It can be used in graduate courses, but is also of interest to researchers, both specialists in machine learning and those applying machine learning methods to cope with problems from any field of R&D.
This book contains the combined proceedings of the 4th International Conference on Ubiquitous Computing Application and Wireless Sensor Network (UCAWSN-15) and the 16th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT-15). The combined proceedings present peer-reviewed contributions from academic and industrial researchers in fields including ubiquitous and context-aware computing, context-awareness reasoning and representation, location awareness services, and architectures, protocols and algorithms, energy, management and control of wireless sensor networks. The book includes the latest research results, practical developments and applications in parallel/distributed architectures, wireless networks and mobile computing, formal methods and programming languages, network routing and communication algorithms, database applications and data mining, access control and authorization and privacy preserving computation.
Vast amounts of data are nowadays collected, stored and processed, in an effort to assist in making a variety of administrative and governmental decisions. These innovative steps considerably improve the speed, effectiveness and quality of decisions. Analyses are increasingly performed by data mining and profiling technologies that statistically and automatically determine patterns and trends. However, when such practices lead to unwanted or unjustified selections, they may result in unacceptable forms of discrimination. Processing vast amounts of data may lead to situations in which data controllers know many of the characteristics, behaviors and whereabouts of people. In some cases, analysts might know more about individuals than these individuals know about themselves. Judging people by their digital identities sheds a different light on our views of privacy and data protection. This book discusses discrimination and privacy issues related to data mining and profiling practices. It provides technological and regulatory solutions to problems which arise in these innovative contexts. The book explains that common measures for mitigating privacy and discrimination risks, such as access controls and anonymity, fail to properly resolve privacy and discrimination concerns. Therefore, new solutions, focusing on technology design, transparency and accountability, are called for and set forth.
This book covers diverse aspects of advanced computer and communication engineering, focusing specifically on industrial and manufacturing theory and applications of electronics, communications, computing and information technology. Experts in research, industry, and academia present the latest developments in technology, describe applications involving cutting-edge communication and computer systems and explore likely future directions. In addition, access is offered to numerous new algorithms that assist in solving computer and communication engineering problems. The book is based on presentations delivered at ICOCOE 2014, the 1st International Conference on Communication and Computer Engineering. It will appeal to a wide range of professionals in the field, including telecommunication engineers, computer engineers and scientists, researchers, academics and students.
The development of business intelligence has enhanced the visualization of data to inform and facilitate business management and strategizing. Implementing effective data-driven techniques allows advanced reporting tools to cater to company-specific issues and challenges. The Handbook of Research on Advanced Data Mining Techniques and Applications for Business Intelligence is a key resource on the latest advancements in business applications and the use of mining software solutions to achieve optimal decision-making and risk management results. Highlighting innovative studies on data warehousing, business activity monitoring, and text mining, this publication is an ideal reference source for research scholars, management faculty, and practitioners.
This book is intended to spark a discourse on, and contribute to finding a clear consensus in, the debate between conceptualizing a knowledge strategy and planning a knowledge strategy. It explores the complex relationship between the notions of knowledge and strategy in the business context, one that is of practical importance to companies. After reviewing the extant literature, the book shows how the concept of knowledge strategies can be seen as a new perspective for exploring business strategies. It proposes a new approach that clarifies how planned and emergent knowledge strategies allow companies to make projections into the uncertain and unpredictable future that dominates today's economy.
This book focuses on the development of a theory of info-dynamics to support the theory of info-statics in the general theory of information. It establishes the rational foundations of information dynamics and how these foundations relate to the general socio-natural dynamics from the primary to the derived categories in the universal existence and from the potential to the actual in the ontological space. It also shows how these foundations relate to the general socio-natural dynamics from the potential to the possible to give rise to the possibility space with possibilistic thinking; from the possible to the probable to give rise to possibility space with probabilistic thinking; and from the probable to the actual to give rise to the space of knowledge with paradigms of thought in the epistemological space. The theory is developed to explain the general dynamics through various transformations in quality-quantity space in relation to the nature of information flows at each variety transformation. The theory explains the past-present-future connectivity of the evolving information structure in a manner that illuminates the transformation problem and its solution in the never-ending information production within matter-energy space under socio-natural technologies to connect the theory of info-statics, which in turn presents explanations to the transformation problem and its solution. The theoretical framework is developed with analytical tools based on the principle of opposites, systems of actual-potential polarities, negative-positive dualities under different time-structures with the use of category theory, fuzzy paradigm of thought and game theory in the fuzzy-stochastic cost-benefit space. The rational foundations are enhanced with categorial analytics. 
The value of the theory of info-dynamics is demonstrated in the explanatory and prescriptive structures of the transformations of varieties and categorial varieties at each point of time and over time from parent-offspring sequences. It constitutes a general explanation of dynamics of information-knowledge production through info-processes and info-processors induced by a socio-natural infinite set of technologies in the construction-destruction space.
This book addresses the impacts of various types of services such as infrastructure, platforms, software, and business processes that cloud computing and Big Data have introduced into business. Featuring chapters which discuss effective and efficient approaches in dealing with the inherent complexity and increasing demands in data science, a variety of application domains are covered. Various case studies by data management and analysis experts are presented in these chapters. Covered applications include banking, social networks, bioinformatics, healthcare, transportation and criminology. Highlighting the Importance of Big Data Management and Analysis for Various Applications will provide the reader with an understanding of how data management and analysis are adapted to these applications. This book will appeal to researchers and professionals in the field.
This book provides an overview of crowdsourced data management. Covering all aspects including the workflow, algorithms and research potential, it particularly focuses on the latest techniques and recent advances. The authors identify three key aspects in determining the performance of crowdsourced data management: quality control, cost control and latency control. By surveying and synthesizing a wide spectrum of studies on crowdsourced data management, the book outlines important factors that need to be considered to improve crowdsourced data management. It also introduces a practical crowdsourced-database-system design and presents a number of crowdsourced operators. Self-contained and covering theory, algorithms, techniques and applications, it is a valuable reference resource for researchers and students new to crowdsourced data management with a basic knowledge of data structures and databases.
The rapid increase in computing power and communication speed, coupled with the availability of computer storage facilities, has led to a new age of multimedia applications. Multimedia is practically everywhere, and we can feel its presence in almost all applications, ranging from online video databases, IPTV and interactive multimedia to, more recently, multimedia-based social interaction. These new growing applications require high-quality data storage, easy access to multimedia content and reliable delivery. Moving ever closer to commercial deployment has also aroused a higher awareness of security and intellectual property management issues. All the aforementioned requirements resulted in higher demands on various areas of research (signal processing, image/video processing and analysis, communication protocols, content search, watermarking, etc.). This book covers the most prominent research issues in multimedia and is divided into four main sections: i) content based retrieval, ii) storage and remote access, iii) watermarking and copyright protection and iv) multimedia applications. Chapter 1 of the first section presents an analysis of how color is used and why it is crucial in today's multimedia applications. In chapter 2 the authors give an overview of the advances in video abstraction for fast content browsing, transmission, retrieval and skimming in large video databases, and chapter 3 extends the discussion on video summarization even further. The content retrieval problem is tackled in chapter 4 by describing a novel method for producing meaningful segments suitable for MPEG-7 description based on binary partition trees (BPTs).
This book discusses the psychological traits associated with drug consumption through the statistical analysis of a new database with information on 1885 respondents and use of 18 drugs. After reviewing published works on the psychological profiles of drug users and describing the data mining and machine learning methods used, it demonstrates that the personality traits (five factor model, impulsivity, and sensation seeking) together with simple demographic data make it possible to predict the risk of consumption of individual drugs with a sensitivity and specificity above 70% for most drugs. It also analyzes the correlations of use of different substances and describes the groups of drugs with correlated use, identifying significant differences in personality profiles for users of different drugs. The book is intended for advanced undergraduates and first-year PhD students, as well as researchers and practitioners. Although no previous knowledge of machine learning, advanced data mining concepts or modern psychology of personality is assumed, familiarity with basic statistics and some experience in the use of probabilities would be helpful. For a more detailed introduction to statistical methods, the book provides recommendations for undergraduate textbooks.
This book explores how PPPM, clinical practice, and basic research could be best served by information technology (IT). A use-case was developed for hepatocellular carcinoma (HCC). The subject was approached with four interrelated tasks: (1) review of clinical practices relating to HCC; (2) propose an IT system relating to HCC, including clinical decision support and research needs; (3) determine how a clinical liver cancer center can contribute; and, (4) examine the enhancements and impact that the first three tasks will have on the management of HCC. An IT System for Personalized Medicine (ITS-PM) for HCC will provide the means to identify and determine the relative value of the wide number of variables, including clinical assessment of the patient -- functional status, liver function, degree of cirrhosis, and comorbidities; tumor biology, at a molecular, genetic and anatomic level; tumor burden and individual patient response; medical and operative treatments and their outcomes.
This book presents recent machine learning paradigms and advances in learning analytics, an emerging research discipline concerned with the collection, advanced processing, and extraction of useful information from both educators' and learners' data with the goal of improving education and learning systems. In this context, internationally respected researchers present various aspects of learning analytics and selected application areas, including: * Using learning analytics to measure student engagement, to quantify the learning experience and to facilitate self-regulation; * Using learning analytics to predict student performance; * Using learning analytics to create learning materials and educational courses; and * Using learning analytics as a tool to support learners and educators in synchronous and asynchronous eLearning. The book offers a valuable asset for professors, researchers, scientists, engineers and students of all disciplines. Extensive bibliographies at the end of each chapter guide readers to probe further into their application areas of interest.
This book is the proceedings of the 3rd World Conference on Soft Computing (WCSC), which was held in San Antonio, TX, USA, on December 16-18, 2013. It presents state-of-the-art theory and applications of soft computing together with an in-depth discussion of current and future challenges in the field, providing readers with a 360-degree view on soft computing. Topics range from fuzzy sets, to fuzzy logic, fuzzy mathematics, neuro-fuzzy systems, fuzzy control, decision making in fuzzy environments, image processing and many more. The book is dedicated to Lotfi A. Zadeh, a renowned specialist in signal analysis and control systems research who proposed the idea of fuzzy sets, in which an element may have a partial membership, in the early 1960s, followed by the idea of fuzzy logic, in which a statement can be true only to a certain degree, with degrees described by numbers in the interval [0,1]. The performance of fuzzy systems can often be improved with the help of optimization techniques, e.g. evolutionary computation, and by endowing the corresponding system with the ability to learn, e.g. by combining fuzzy systems with neural networks. The resulting "consortium" of fuzzy, evolutionary, and neural techniques is known as soft computing and is the main focus of this book.
In a world increasingly awash in information, the field of information science has become an umbrella stretched so broadly as to threaten its own integrity. However, while traditional information science seeks to make sense of information systems against a social, cultural, and political backdrop, there exists a lack of current literature exploring how such transactions can exert force in the other direction, that is, how information systems mold the individuals who utilize them and society as a whole. The Handbook of Research on Innovations in Information Retrieval, Analysis, and Management explores new developments in the field of information and communication technologies and explores how complex information systems interact with and affect one another, woven into the fabric of an information-rich world. Touching on such topics as machine learning, research methodologies, and mobile data aggregation, this book targets an audience of researchers, developers, managers, strategic planners, and advanced-level students. This handbook contains chapters on topics including, but not limited to, customer experience management, information systems planning, cellular networking, public policy development, and knowledge governance.
This book offers an introduction to artificial adaptive systems and a general model of the relationships between the data and algorithms used to analyze them. It subsequently describes artificial neural networks as a subclass of artificial adaptive systems, and reports on the backpropagation algorithm, while also identifying an important connection between supervised and unsupervised artificial neural networks. The book's primary focus is on the auto contractive map, an unsupervised artificial neural network employing a fixed point method versus traditional energy minimization. This is a powerful tool for understanding, associating and transforming data, as demonstrated in the numerous examples presented here. A supervised version of the auto contracting map is also introduced as an outstanding method for recognizing digits and defects. In closing, the book walks the readers through the theory and examples of how the auto contracting map can be used in conjunction with another artificial neural network, the "spin-net," as a dynamic form of auto-associative memory.
This book presents a contemporary view of the role of information quality in information fusion and decision making, and provides a formal foundation and the implementation strategies required for dealing with insufficient information quality in building fusion systems for decision making. Information fusion is the process of gathering, processing, and combining large amounts of information from multiple and diverse sources, ranging from physical sensors to human intelligence reports and social media. That data and information may be unreliable, of low fidelity, of insufficient resolution, contradictory, fake and/or redundant. Sources may provide unverified reports obtained from other sources, resulting in correlations and biases. The success of the fusion processing depends on how well knowledge produced by the processing chain represents reality, which in turn depends on how adequate the data are, how good and adequate the models used are, and how accurate, appropriate or applicable prior and contextual knowledge is. By offering contributions by leading experts, this book provides an unparalleled understanding of the problem of information quality in information fusion and decision-making for researchers and professionals in the field.
The academic landscape has been significantly enhanced by the advent of new technology. These tools allow researchers easier information access to better increase their knowledge base. Research 2.0 and the Impact of Digital Technologies on Scholarly Inquiry is an authoritative reference source for the latest insights on the impact of web services and social technologies for conducting academic research. Highlighting international perspectives, emerging scholarly practices, and real-world contexts, this book is ideally designed for academicians, practitioners, upper-level students, and professionals interested in the growing field of digital scholarship.
This book provides a comprehensive overview of the field of pattern mining with evolutionary algorithms. To do so, it covers formal definitions of patterns, pattern mining, types of patterns and the usefulness of patterns in the knowledge discovery process. As described within the book, the discovery process suffers from both high runtime and memory requirements, especially when high dimensional datasets are analyzed. To solve this issue, many pruning strategies have been developed. Nevertheless, with the growing interest in the storage of information, more and more datasets comprise such a dimensionality that the discovery of interesting patterns becomes a challenging process. In this regard, the use of evolutionary algorithms for mining patterns enables the required computation capacity to be reduced, providing sufficiently good solutions. This book offers a survey on evolutionary computation with particular emphasis on genetic algorithms and genetic programming. Also included is an analysis of the set of quality measures most widely used in the field of pattern mining with evolutionary algorithms. This book serves as a review of the most important evolutionary algorithms for pattern mining. It considers the analysis of different algorithms for mining different types of patterns and relationships between patterns, such as frequent patterns, infrequent patterns, patterns defined in a continuous domain, or even positive and negative patterns. A completely new problem in the pattern mining field, mining of exceptional relationships between patterns, is discussed. In this problem the goal is to identify patterns whose distribution is exceptionally different from the distribution in the complete set of data records. Finally, the book deals with the subgroup discovery task, a method to identify a subgroup of interesting patterns that is related to a dependent variable or target attribute.
This subgroup of patterns satisfies two essential conditions: interpretability and interestingness.
R is a powerful and free software system for data analysis and graphics, with over 5,000 add-on packages available. This book introduces R using SAS and SPSS terms with which you are already familiar. It demonstrates which of the add-on packages are most like SAS and SPSS and compares them to R's built-in functions. It steps through over 30 programs written in all three packages, comparing and contrasting the packages' differing approaches. The programs and practice datasets are available for download. The glossary defines over 50 R terms using SAS/SPSS jargon and again using R jargon. The table of contents and the index allow you to find equivalent R functions by looking up both SAS statements and SPSS commands. When finished, you will be able to import data, manage and transform it, create publication quality graphics, and perform basic statistical analyses. This new edition has updated programming, an expanded index, and even more statistical methods covered in over 25 new sections.
Soft computing, as an engineering science, and statistics, as a classical branch of mathematics, emphasize different aspects of data analysis.