A groundbreaking, flexible approach to computer science and data science. The Deitels' Introduction to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud offers a unique approach to teaching introductory Python programming, appropriate for both computer-science and data-science audiences. Providing the most current coverage of topics and applications, the book is paired with extensive traditional supplements as well as Jupyter Notebooks supplements. Real-world datasets and artificial-intelligence technologies allow students to work on projects making a difference in business, industry, government and academia. Hundreds of examples, exercises, projects (EEPs) and implementation case studies give students an engaging, challenging and entertaining introduction to Python programming and hands-on data science. The book's modular architecture enables instructors to conveniently adapt the text to a wide range of computer-science and data-science courses offered to audiences drawn from many majors. Computer-science instructors can integrate as much or as little data-science and artificial-intelligence material as they'd like, and data-science instructors can integrate as much or as little Python as they'd like. The book aligns with the latest ACM/IEEE CS-and-related computing curriculum initiatives and with the Data Science Undergraduate Curriculum Proposal sponsored by the National Science Foundation.
Due to the tremendous amount of data generated daily in fields such as business, research, and the sciences, big data is everywhere, and alternative management and processing methods are needed to handle data of this complexity, lack of structure, and scale. Big Data Management, Technologies, and Applications discusses the exponential growth of information and innovative methods for data capture, storage, sharing, and analysis of big data. This collection of articles on big data methodologies and technologies is beneficial for IT workers, researchers, students, and practitioners in this timely field.
Technology has revolutionized the ways in which libraries store, share, and access information. As digital resources and tools continue to advance, so too do the opportunities for libraries to become more efficient and house more information. E-Discovery Tools and Applications in Modern Libraries presents critical research on the digitization of data and how this shift has impacted knowledge discovery, storage, and retrieval. This publication explores several emerging trends and concepts essential to electronic discovery, such as library portals, responsive websites, and federated search technology. The timely research presented within this publication is designed for use by librarians, graduate-level students, technology developers, and researchers in the field of library and information science.
This contributed volume discusses essential topics and fundamentals of Big Data Emergency Management, focusing primarily on the application of Big Data to Emergency Management. It walks the reader through the state of the art in different facets of the big disaster data field, including many elements that are important for these technologies to have real-world impact. The book brings together computational techniques from machine learning, communication network analysis, natural language processing, knowledge graphs, data mining, and information visualization, aiming at methods typically used for processing big emergency data. It also provides authoritative insights and highlights valuable lessons from distinguished authors who are leaders in this field. Emergencies are severe, large-scale, non-routine events that disrupt the normal functioning of a community or society, causing widespread and overwhelming losses and impacts. Emergency Management is the process of planning and taking actions to minimize the social and physical impact of emergencies and to reduce a community's vulnerability to their consequences. Information exchange before, during and after the disaster period can greatly reduce the losses caused by an emergency: it allows people to make better use of available resources, such as relief materials and medical supplies, and provides a channel through which reports on casualties and losses in each affected area can be delivered expeditiously. Big Data-Driven Emergency Management refers to applying advanced data collection and analysis technologies to achieve more effective and responsive decision-making during emergencies. Researchers, engineers and computer scientists working in Big Data Emergency Management who need to deal with large and complex sets of data will want to purchase this book, as will advanced-level students interested in data-driven emergency/crisis/disaster management, who can use it as a study guide.
This book provides readers the "big picture" and a comprehensive survey of the domain of big data processing systems. For the past decade, the Hadoop framework has dominated the world of big data processing, yet recently academia and industry have started to recognize its limitations in several application domains; it is now gradually being replaced by a collection of engines dedicated to specific verticals (e.g. structured data, graph data, and streaming data). The book explores this new wave of systems, which it refers to as Big Data 2.0 processing systems. After Chapter 1 presents the general background of the big data phenomenon, Chapter 2 provides an overview of various general-purpose big data processing systems that allow their users to develop big data processing jobs for different application domains. In turn, Chapter 3 examines various systems that have been introduced to support the SQL flavor on top of the Hadoop infrastructure and provide competitive, scalable performance in the processing of large-scale structured data. Chapter 4 discusses several systems designed to tackle the problem of large-scale graph processing, while the main focus of Chapter 5 is on systems designed to provide scalable solutions for processing big data streams, as well as systems introduced to support the development of data pipelines between various types of big data processing jobs and systems. Next, Chapter 6 covers the emerging frameworks and systems in the domain of scalable machine learning and deep learning processing. Lastly, Chapter 7 shares conclusions and an outlook on future research challenges. This new and considerably enlarged second edition not only contains the completely new Chapter 6, but also offers refreshed coverage of the state of the art across all domains of big data processing in recent years. Overall, the book offers a valuable reference guide for professionals, students, and researchers in the domain of big data processing systems. Further, its comprehensive content will hopefully encourage readers to pursue further research on the subject.
This two-volume set of IFIP AICT 583 and 584 constitutes the refereed proceedings of the 16th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2020, held in Neos Marmaras, Greece, in June 2020.* The 70 full papers and 5 short papers presented were carefully reviewed and selected from 149 submissions. They cover a broad range of topics related to technical, legal, and ethical aspects of artificial intelligence systems and their applications and are organized in the following sections: Part I: classification; clustering - unsupervised learning - analytics; image processing; learning algorithms; neural network modeling; object tracking - object detection systems; ontologies - AI; and sentiment analysis - recommender systems. Part II: AI ethics - law; AI constraints; deep learning - LSTM; fuzzy algebra - fuzzy systems; machine learning; medical - health systems; and natural language. *The conference was held virtually due to the COVID-19 pandemic.
Covering aspects from principles and limitations of statistical significance tests to topic set size design and power analysis, this book guides readers to statistically well-designed experiments. Although classical statistical significance tests are to some extent useful in information retrieval (IR) evaluation, they can harm research unless they are used appropriately with the right sample sizes and statistical power and unless the test results are reported properly. The first half of the book is mainly targeted at undergraduate students, and the second half is suitable for graduate students and researchers who regularly conduct laboratory experiments in IR, natural language processing, recommendations, and related fields. Chapters 1-5 review parametric significance tests for comparing system means, namely, t-tests and ANOVAs, and show how easily they can be conducted using Microsoft Excel or R. These chapters also discuss a few multiple comparison procedures for researchers who are interested in comparing every system pair, including a randomised version of Tukey's Honestly Significant Difference test. The chapters then deal with known limitations of classical significance testing and provide practical guidelines for reporting research results regarding comparison of means. Chapters 6 and 7 discuss statistical power. Chapter 6 introduces topic set size design to enable test collection builders to determine an appropriate number of topics to create. Readers can easily use the author's Excel tools for topic set size design based on the paired and two-sample t-tests, one-way ANOVA, and confidence intervals. Chapter 7 describes power-analysis-based methods for determining an appropriate sample size for a new experiment based on a similar experiment done in the past, detailing how to utilize the author's R tools for power analysis and how to interpret the results. Case studies from IR for both Excel-based topic set size design and R-based power analysis are also provided.
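The book itself works in Microsoft Excel and R; purely as a hedged illustration of the kind of comparison it discusses, the following Python sketch runs a paired t-test and a one-way ANOVA over per-topic scores. The score arrays are hypothetical nDCG values invented for the example, and scipy is an assumed substitute for the author's tools.

```python
# Minimal sketch (not the book's Excel/R tooling): compare two retrieval
# systems' per-topic scores with a paired t-test, then three systems with
# a one-way ANOVA. All score values below are hypothetical.
from scipy import stats

system_a = [0.42, 0.55, 0.38, 0.61, 0.47, 0.59, 0.50, 0.44]
system_b = [0.40, 0.58, 0.41, 0.66, 0.52, 0.63, 0.49, 0.48]

# Paired t-test: both systems are evaluated on the same topics.
t_stat, p_value = stats.ttest_rel(system_a, system_b)
print(f"paired t-test: t = {t_stat:.3f}, p = {p_value:.3f}")

# One-way ANOVA generalizes the comparison to three or more systems.
system_c = [0.39, 0.50, 0.37, 0.60, 0.45, 0.57, 0.51, 0.43]
f_stat, p_anova = stats.f_oneway(system_a, system_b, system_c)
print(f"one-way ANOVA: F = {f_stat:.3f}, p = {p_anova:.3f}")
```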
As new concepts such as virtualisation, cloud computing, and web applications continue to emerge, XML has begun to assume the role of a universal language for communication among the contrasting systems that grow throughout the internet. Innovations in XML Applications and Metadata Management: Advancing Technologies addresses how XML and its related technologies function together in application development, building on earlier concepts. The book highlights the variety of purposes XML applications serve and how developments in the technology bring together advancements in the virtual world.
This book explores the digitization of culture as a means of experiencing and understanding cultural heritage in Namibia and from international perspectives. It provides various views and perspectives on the digitization of culture, the goal being to stimulate further research, and to rapidly disseminate related discoveries. Aspects covered here include: virtual and augmented reality, audio and video technology, art, multimedia and digital media integration, cross-media technologies, modeling, visualization and interaction as a means of experiencing and grasping cultural heritage. Over the past few decades, digitization has profoundly changed our cultural experience, not only in terms of digital technology-based access, production and dissemination, but also in terms of participation and creation, and learning and partaking in a knowledge society. Computing researchers have developed a wealth of new digital systems for preserving, sharing and interacting with cultural resources. The book provides important information and tools for policy makers, knowledge experts, cultural and creative industries, communication scientists, professionals, educators, librarians and artists, as well as computing scientists and engineers conducting research on cultural topics.
This book introduces the fundamentals and trade-offs of data de-duplication techniques. It describes novel emerging de-duplication techniques that remove duplicate data both in storage and on the network in an efficient and effective manner. It explains where duplicate data originates and provides solutions that remove it. It classifies existing de-duplication techniques according to the size of the data unit being compared, the place of de-duplication, and the time of de-duplication. Chapter 3 considers redundancies in email servers and a de-duplication technique that increases reduction performance with low overhead by switching between chunk-based and file-based de-duplication. Chapter 4 develops a de-duplication technique for cloud-storage services in which the data units being compared are not in physical format but in logically structured format, reducing processing time efficiently. Chapter 5 presents network de-duplication, in which redundant data packets sent by clients are encoded (shrunk to a small payload) and decoded (restored to the original payload) in routers or switches on the way to remote servers. Chapter 6 introduces a mobile de-duplication technique for images (JPEG) and video (MPEG) that considers the performance and overhead of the encryption algorithm used for security on mobile devices.
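As a rough, hypothetical sketch of the chunk-based de-duplication idea mentioned above (not code from the book): data is split into fixed-size chunks, each chunk is identified by its SHA-256 digest, and only previously unseen chunks are stored. The chunk size and the in-memory chunk index are illustrative assumptions.

```python
# Hypothetical chunk-based de-duplication sketch: duplicate chunks are
# detected by hash and stored only once; a "recipe" of digests lets the
# original data be reassembled later.
import hashlib

CHUNK_SIZE = 4096        # fixed-size chunking; real systems may instead use
                         # content-defined (variable-size) chunking
chunk_store: dict[str, bytes] = {}   # digest -> chunk bytes (stands in for storage)

def dedup_write(data: bytes) -> list[str]:
    """Store data chunk by chunk, returning the list of chunk digests."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in chunk_store:   # new chunk: store it
            chunk_store[digest] = chunk
        recipe.append(digest)           # duplicate chunks add no new storage
    return recipe

def dedup_read(recipe: list[str]) -> bytes:
    """Reassemble the original data from its chunk recipe."""
    return b"".join(chunk_store[d] for d in recipe)
```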
Market Basket Analysis (MBA) provides the ability to continually monitor the affinities of a business and can help an organization achieve a key competitive advantage. Time Variant data enables data warehouses to directly associate events in the past with the participants in each individual event. In the past, however, the use of these powerful tools in tandem led to performance degradation and resulted in unactionable and even damaging information. Data Warehouse Designs: Achieving ROI with Market Basket Analysis and Time Variance presents an innovative, soup-to-nuts approach that successfully combines what was previously incompatible, without degradation, using the relational architecture already in place. Built around two main chapters, Market Basket Solution Definition and Time Variant Solution Definition, it provides a tangible how-to design that can be used to facilitate MBA within the context of a data warehouse. It presents a solution for creating home-grown MBA data marts, includes database design solutions in the context of the Oracle, DB2, SQL Server, and Teradata relational database management systems (RDBMS), and explains how to extract, transform, and load the data used in MBA and Time Variant solutions. The book uses standard RDBMS platforms, proven database structures, standard SQL and hardware, and software and practices already accepted and used in the data warehousing community to fill the gaps left by most conceptual discussions of MBA. It employs a form and language intended for a data warehousing audience to explain the practicality of how data is delivered, stored, and viewed. Offering a comprehensive explanation of the applications that provide, store, and use MBA data, Data Warehouse Designs provides you with the language and concepts needed to require and receive information that is relevant and actionable.
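The book implements its designs in SQL against standard RDBMS platforms; purely as a language-neutral illustration of the basket-affinity idea, the sketch below counts item-pair co-occurrences across baskets in Python. The baskets and item names are invented for the example.

```python
# Hypothetical market-basket affinity sketch: count how often each unordered
# item pair appears together in a basket, then report support (fraction of
# baskets containing the pair).
from collections import Counter
from itertools import combinations

baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

for (a, b), count in pair_counts.most_common(3):
    print(f"{a} & {b}: support = {count / len(baskets):.2f}")
```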
Recent technological advancements in data warehousing have been contributing to the emergence of business intelligence useful for managerial decision making. Progressive Methods in Data Warehousing and Business Intelligence: Concepts and Competitive Analytics presents the latest trends, studies, and developments in business intelligence and data warehousing contributed by experts from around the globe. Consisting of four main sections, this book covers crucial topics within the field such as OLAP and patterns, spatio-temporal data warehousing, and benchmarking.
The concept of a big data warehouse appeared in order to store moving data objects and temporal data. Moving objects are geometries that change their position and shape continuously over time; supporting such spatio-temporal data requires a data model and an associated query language for moving objects. Emerging Perspectives in Big Data Warehousing is an essential research publication that explores current innovative activities focusing on the integration between data warehousing and data mining, with an emphasis on applicability to real-world problems. Featuring a wide range of topics such as index structures, ontology, and user behavior, this book is ideally designed for IT consultants, researchers, professionals, computer scientists, academicians, and managers.
With this textbook, Vaisman and Zimanyi deliver excellent coverage of data warehousing and business intelligence technologies ranging from the most basic principles to recent findings and applications. To this end, their work is structured into three parts. Part I describes "Fundamental Concepts" including conceptual and logical data warehouse design, as well as querying using MDX, DAX and SQL/OLAP. This part also covers data analytics using Power BI and Analysis Services. Part II details "Implementation and Deployment," including physical design, ETL and data warehouse design methodologies. Part III covers "Advanced Topics" and it is almost completely new in this second edition. This part includes chapters with an in-depth coverage of temporal, spatial, and mobility data warehousing. Graph data warehouses are also covered in detail using Neo4j. The last chapter extensively studies big data management and the usage of Hadoop, Spark, distributed, in-memory, columnar, NoSQL and NewSQL database systems, and data lakes in the context of analytical data processing. As a key characteristic of the book, most of the topics are presented and illustrated using application tools. Specifically, a case study based on the well-known Northwind database illustrates how the concepts presented in the book can be implemented using Microsoft Analysis Services and Power BI. All chapters have been revised and updated to the latest versions of the software tools used. KPIs and Dashboards are now also developed using DAX and Power BI, and the chapter on ETL has been expanded with the implementation of ETL processes in PostgreSQL. Review questions and exercises complement each chapter to support comprehensive student learning. Supplemental material to assist instructors using this book as a course text is available online and includes electronic versions of the figures, solutions to all exercises, and a set of slides accompanying each chapter. Overall, students, practitioners and researchers alike will find this book the most comprehensive reference work on data warehouses, with key topics described in a clear and educational style. "I can only invite you to dive into the contents of the book, feeling certain that once you have completed its reading (or maybe, targeted parts of it), you will join me in expressing our gratitude to Alejandro and Esteban, for providing such a comprehensive textbook for the field of data warehousing in the first place, and for keeping it up to date with the recent developments, in this current second edition." From the foreword by Panos Vassiliadis, University of Ioannina, Greece.
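The textbook expresses its analytical queries in MDX, DAX and SQL/OLAP over the Northwind case study; as a small, hedged stand-in for that style of roll-up, the Python sketch below uses pandas to aggregate a sales measure along two dimensions. The Northwind-like column names and figures are illustrative assumptions, not the book's data.

```python
# Hypothetical OLAP-style roll-up: sum a sales measure by category and year,
# with margins=True adding the grand totals a cube query would also return.
import pandas as pd

sales = pd.DataFrame({
    "category": ["Beverages", "Beverages", "Condiments", "Condiments"],
    "year":     [1997, 1998, 1997, 1998],
    "amount":   [1200.0, 1500.0, 800.0, 950.0],
})

cube = sales.pivot_table(index="category", columns="year",
                         values="amount", aggfunc="sum", margins=True)
print(cube)
```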
Sensor network data management poses new challenges outside the scope of conventional systems where data is represented and regulated. Intelligent Techniques for Warehousing and Mining Sensor Network Data presents fundamental and theoretical issues pertaining to data management. Covering a broad range of topics on warehousing and mining sensor networks, this advanced title provides significant industry solutions to those in database, data warehousing, and data mining research communities.
As the first to focus on the issue of Data Warehouse Requirements Engineering, this book introduces a model-driven requirements process used to identify requirements granules and incrementally develop data warehouse fragments. In addition, it presents an approach to the pair-wise integration of requirements granules for consolidating multiple data warehouse fragments. The process is systematic and does away with the fuzziness associated with existing techniques. Thus, consolidation is treated as a requirements engineering issue. The notion of a decision occupies a central position in the decision-based approach. On one hand, information relevant to a decision must be elicited from stakeholders; modeled; and transformed into multi-dimensional form. On the other, decisions themselves are to be obtained from decision applications. For the former, the authors introduce a suite of information elicitation techniques specific to data warehousing. This information is subsequently converted into multi-dimensional form. For the latter, not only are decisions obtained from decision applications for managing operational businesses, but also from applications for formulating business policies and for defining rules for enforcing policies, respectively. In this context, the book presents a broad range of models, tools and techniques. For readers from academia, the book identifies the scientific/technological problems it addresses and provides cogent arguments for the proposed solutions; for readers from industry, it presents an approach for ensuring that the product meets its requirements while ensuring low lead times in delivery.
Organizations rely on data mining and warehousing technologies to store, integrate, query, and analyze essential data. Strategic Advancements in Utilizing Data Mining and Warehousing Technologies: New Concepts and Developments discusses developments in data mining and warehousing as well as techniques for successful implementation. Contributions investigate theoretical queries along with real-world applications, providing a useful foundation for academicians and practitioners to research new techniques and methodologies.
This book includes high-quality papers presented at the Second International Symposium on Computer Vision and Machine Intelligence in Medical Image Analysis (ISCMM 2021), organized by the Computer Applications Department, SMIT, in collaboration with the Department of Pathology, SMIMS, Sikkim, India, funded by the Indian Council of Medical Research, and held during 11-12 November 2021. It discusses common research problems and challenges in medical image analysis, such as deep learning methods, and how these can be applied to a broad range of application areas, including lung and chest x-ray, breast CAD, microscopy and pathology. The studies included mainly focus on the detection of events from biomedical signals.
There is growing recognition of the need to address the fragility of digital information, on which our society heavily depends for smooth operation in all aspects of daily life. This has been discussed in many books and articles on digital preservation, so why is there a need for yet one more? Because, for the most part, those other publications focus on documents, images and webpages - objects that are normally rendered to be simply displayed by software to a human viewer. Yet there are clearly many more types of digital objects that may need to be preserved, such as databases, scientific data and software itself. David Giaretta, Director of the Alliance for Permanent Access, and his contributors explain why the tools and techniques used for preserving rendered objects are inadequate for all these other types of digital objects, and they provide the concepts, techniques and tools that are needed. The book is structured in three parts. The first part is on theory, i.e., the concepts and techniques that are essential for preserving digitally encoded information. The second part then shows practice, i.e., the use and validation of these tools and techniques. Finally, the third part concludes by addressing how to judge whether money is being well spent, in terms of effectiveness and cost sharing. Various examples of digital objects from many sources are used to explain the tools and techniques presented. The presentation style mainly aims at practitioners in libraries, archives and industry who are either directly responsible for preservation or who need to prepare for audits of their archives. Researchers in digital preservation and developers of preservation tools and techniques will also find valuable practical information here. Researchers creating digitally encoded information of all kinds will also need to be aware of these topics so that they can help to ensure that their data is usable and can be valued by others now and in the future. To further assist the reader, the book is supported by many hours of videos and presentations from the CASPAR project and by a set of open source software.