This book covers the essential elements of parallel processing and parallel algorithms. It is unique in being a self-contained book that covers everything fundamental to parallel processing, from computer architecture to parallel programming and parallel algorithms. It is designed to function as a text for an undergraduate course in parallel processing, but also works well as a comprehensive reference for professionals interested in all phases of parallel processing and parallel programming.
Computer vision falls short of human vision in two respects: execution time and intelligent interpretation. This book addresses the question of execution time. It is based on a workshop on specialized processors for real-time image analysis, held as part of the activities of an ESPRIT Basic Research Action, the Working Group on Vision. The aim of the book is to examine the state of the art in vision-oriented computers. Two approaches are distinguished: multiprocessor systems and fine-grain massively parallel computers. The development of fine-grain machines has become more important over the last decade, but one of the main conclusions of the workshop is that this does not imply the replacement of multiprocessor machines. The book is divided into four parts. Part 1 introduces different architectures for vision: associative and pyramid processors as examples of fine-grain machines and a workstation with bus-oriented network topology as an example of a multiprocessor system. Parts 2 and 3 deal with the design and development of dedicated and specialized architectures. Part 4 is mainly devoted to applications, including road segmentation, mobile robot guidance and navigation, reconstruction and identification of 3D objects, and motion estimation.
This book provides an introduction to decision making in a distributed computational framework. Classical detection theory assumes a centralized configuration. All observations are processed by a central processor to produce the decision. In the decentralized detection system, distributed detectors generate decisions based on locally available observations; these decisions are then conveyed to the fusion center that makes the global decision. Using numerous examples throughout the book, the author discusses such distributed detection processes under several different formulations and in a wide variety of network topologies.
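The two-stage structure described above is easy to picture with a toy example. The sketch below is not from the book; the threshold, noise level, and fusion rule are illustrative assumptions, majority voting being one of the simplest rules such formulations cover.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_decision(observation, threshold=0.5):
    """Each distributed detector thresholds its own noisy observation."""
    return int(observation > threshold)

def fusion_center(decisions):
    """Global decision by majority vote over the local binary decisions."""
    return int(sum(decisions) > len(decisions) / 2)

# Signal present (mean 1.0) observed by 5 detectors with independent noise.
observations = 1.0 + rng.normal(0.0, 0.8, size=5)
decisions = [local_decision(x) for x in observations]
print("local decisions:", decisions, "-> global:", fusion_center(decisions))
```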
Distributed and Parallel Systems: Cluster and Grid Computing is the proceedings of the fourth Austrian-Hungarian Workshop on Distributed and Parallel Systems, organized jointly by Johannes Kepler University, Linz, Austria and the MTA SZTAKI Computer and Automation Research Institute. The papers in this volume cover a broad range of research topics, presented in four groups. The first group introduces cluster tools and techniques, especially the issues of load balancing and migration. Six further papers deal with grid and global computing, including grid infrastructure, tools, applications and mobile computing. The next nine papers present general questions of distributed development and applications. The last four papers address a crucial issue in distributed computing: fault tolerance and dependable systems. This volume will be useful to researchers and scholars interested in all areas related to parallel and distributed computing systems.
Dependence Analysis may be considered to be the second edition of the author's 1988 book, Dependence Analysis for Supercomputing. It is, however, a completely new work that subsumes the material of the 1988 publication. This book is the third volume in the series Loop Transformations for Restructuring Compilers. This series has been designed to provide a complete mathematical theory of transformations that can be used to automatically change a sequential program containing FORTRAN-like do loops into an equivalent parallel form. In Dependence Analysis, the author extends the model to a program consisting of do loops and assignment statements, where the loops need not be sequentially nested and are allowed to have arbitrary strides. In the context of such a program, the author studies, in detail, dependence between statements of the program caused by program variables that are elements of arrays. Dependence Analysis is directed toward graduate and undergraduate students, and professional writers of restructuring compilers. The prerequisite for the book consists of some knowledge of programming languages, and familiarity with calculus and graph theory. No knowledge of linear programming is required.
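As a minimal illustration of the kind of array-element dependence such an analysis must detect (in Python rather than the FORTRAN-like do loops the book treats; the arrays and loop bodies are invented for the example):

```python
import numpy as np

n = 8
a = np.arange(n, dtype=float)
b = np.ones(n)

# Loop-independent: every iteration writes a distinct element using
# inputs no other iteration writes, so the iterations may run in parallel.
c = np.empty(n)
for i in range(n):
    c[i] = a[i] + b[i]

# Loop-carried flow dependence: iteration i reads a[i-1], which
# iteration i-1 wrote. A dependence analyzer must detect this and
# refuse to run the iterations in parallel as written.
for i in range(1, n):
    a[i] = a[i - 1] + b[i]
```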
Automatic Performance Prediction of Parallel Programs presents a unified approach to the problem of automatically estimating the performance of parallel computer programs. The author focuses primarily on distributed memory multiprocessor systems, although large portions of the analysis can be applied to shared memory architectures as well. The author introduces a novel and very practical approach for predicting some of the most important performance parameters of parallel programs, including work distribution, number of transfers, amount of data transferred, network contention, transfer time, computation time and number of cache misses. This approach is based on advanced compiler analysis that carefully examines loop iteration spaces, procedure calls, array subscript expressions, communication patterns, data distributions and optimizing code transformations at the program level, together with the most important machine-specific parameters, including cache characteristics, communication network indices, and benchmark data for computational operations, at the machine level. The material has been fully implemented as part of P3T, which is an integrated automatic performance estimator of the Vienna Fortran Compilation System (VFCS), a state-of-the-art parallelizing compiler for Fortran77, Vienna Fortran and a subset of High Performance Fortran (HPF) programs. A large number of experiments using realistic HPF and Vienna Fortran code examples demonstrate highly accurate performance estimates, and the ability of the described performance prediction approach to successfully guide both programmer and compiler in parallelizing and optimizing parallel programs. A graphical user interface is described and illustrated that visualizes each program source line together with the corresponding parameter values. P3T uses color-coded performance visualization to immediately identify hot spots in the parallel program. Performance data can be filtered and displayed at various levels of detail. In the book's figures, the colors displayed by the graphical user interface are reproduced in greyscale. Automatic Performance Prediction of Parallel Programs also includes coverage of fundamental problems of automatic parallelization for distributed memory multicomputers, a description of the basic parallelization strategy and a large variety of optimizing code transformations as included under VFCS.
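P3T's actual models are derived by compiler analysis as described above; purely as an illustration of how machine-level parameters enter such an estimate, the sketch below applies the standard latency/bandwidth model of transfer time, with made-up machine constants.

```python
def transfer_time(num_transfers, bytes_per_transfer,
                  latency_s=5e-6, bandwidth_bytes_per_s=1e9):
    """Classic linear communication-cost model: each transfer pays a
    fixed startup latency plus a size-proportional bandwidth term."""
    return num_transfers * (latency_s
                            + bytes_per_transfer / bandwidth_bytes_per_s)

# 1000 messages of 8 KB each under the assumed machine parameters.
print(f"{transfer_time(1000, 8192):.4f} s")
```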
The use of parallel processing technology in the next generation of Database Management Systems (DBMSs) makes it possible to meet new and challenging requirements. Database technology in rapidly expanding new application areas brings unique challenges such as increased functionality and efficient handling of very large heterogeneous databases.
Parallel Programming: Concepts and Practice provides an upper level introduction to parallel programming. In addition to covering general parallelism concepts, this text teaches practical programming skills for both shared memory and distributed memory architectures. The authors' open-source system for automated code evaluation provides easy access to parallel computing resources, making the book particularly suitable for classroom settings.
This text for students and professionals in computer science provides a valuable overview of current knowledge concerning parallel algorithms. These algorithms have recently acquired increased importance due to their ability to enhance computing power by permitting multiple processors to work on different parts of a problem independently and simultaneously. This approach has led to solutions of difficult problems in a number of vital fields, including artificial intelligence, image processing, and differential equations. As the first up-to-date summary of the topic, this book will be sought after by researchers, computer science professionals, and advanced students involved in parallel computing and parallel algorithms.
Exploring new trends in computer technology, Corporaal introduces an innovative and exciting concept: Transport Triggered Architectures (TTAs). Unlike most traditional architectures, where programmed operations trigger internal data transports, TTAs function by programming the data transports themselves. As a result, the new architecture alleviates bottlenecks, allows for new code-generation optimizations, and exploits hardware more efficiently. Founded on the author's recent research, this book evaluates the attributes of different classes of architectures. It demonstrates how TTAs can be used as a template for the automatic generation of application-specific processors and highlights their suitability for embedded system design. Several commercial TTA implementations have proven the concept and its advantages. Features include:
Microprocessor Architectures is a cutting-edge text which will prove invaluable both to industrial hardware and software engineers involved in embedded system design and to postgraduate electrical engineering and computer science students. This clearly structured reference demonstrates the versatility of TTAs and explores their influential role in the next generation of computer architecture.
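The contrast between operation-triggered and transport-triggered execution can be sketched in a few lines of code. The toy interpreter below is an illustration only and does not follow Corporaal's instruction format: the program is nothing but a list of data transports, and writing to the adder's trigger port is what fires the operation.

```python
# Toy transport-triggered machine: the program is a list of moves.
# Writing to the adder's trigger port ("add.t") fires the operation.
regs = {"r1": 2, "r2": 3, "r3": 0}
fu = {"add.o": 0, "add.r": 0}          # function-unit operand and result ports

def run(moves):
    for src, dst in moves:
        value = regs.get(src, fu.get(src))
        if dst == "add.t":             # trigger port: starts the add
            fu["add.r"] = fu["add.o"] + value
        elif dst in fu:
            fu[dst] = value
        else:
            regs[dst] = value

# Transport-level equivalent of the conventional instruction "add r3, r1, r2":
run([("r1", "add.o"), ("r2", "add.t"), ("add.r", "r3")])
print(regs["r3"])                      # -> 5
```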
Parallel Algorithms Made Easy. The complexity of today's applications, coupled with the widespread use of parallel computing, has made the design and analysis of parallel algorithms topics of growing interest. This volume fills a need in the field for an introductory treatment of parallel algorithms, one appropriate even at the undergraduate level, where no other textbooks on the subject exist. It features a systematic approach to the latest design techniques, providing analysis and implementation details for each parallel algorithm described in the book. Introduction to Parallel Algorithms covers the foundations of parallel computing; parallel algorithms for trees and graphs; parallel algorithms for sorting, searching, and merging; and numerical algorithms. This remarkable book:
This book enables universities to offer parallel algorithm courses at the senior undergraduate level in computer science and engineering. It is also an invaluable text/reference for graduate students, scientists, and engineers in computer science, mathematics, and engineering.
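To give a flavor of the design techniques such a course covers, here is a sketch of the classic data-parallel prefix-sum algorithm (an illustration of the genre, not an excerpt from the book), with numpy's vectorized operations standing in for the simultaneous work of PRAM processors:

```python
import numpy as np

def prefix_sum_data_parallel(x):
    """Hillis-Steele inclusive scan: O(log n) data-parallel steps.

    At step d, every position adds the value 2**(step) places to its
    left; the vectorized update models all positions acting at once.
    """
    x = x.astype(np.int64).copy()
    d = 1
    while d < len(x):
        x[d:] = x[d:] + x[:-d]   # all positions updated "simultaneously"
        d *= 2
    return x

print(prefix_sum_data_parallel(np.arange(1, 9)))  # [ 1  3  6 10 15 21 28 36]
```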
This publication provides a comprehensive overview of the current evolution of research in algorithms, architectures and compilation for parallel systems. The contributions focus specifically on domains where embedded systems are required, oriented either to application-specific or to programmable realisations. These are crucial in domains such as audio, telecom, instrumentation, speech, robotics, medical and automotive processing, image and video processing, TV, multimedia, radar and sonar. The book will be of particular interest to the academic community because of the detailed descriptions of research results presented. In addition, many contributions feature the "real-life" applications that drive this research, and the impact of their specific characteristics on the methodologies is assessed. The publication will also be of considerable value to senior design engineers and CAD managers in industry who wish either to anticipate the evolution of commercially available design tools or to utilize the presented concepts in their own R&D programmes.
With the evolution of technology and the rapid growth in the number of smart vehicles, traditional Vehicular Ad hoc NETworks (VANETs) face several technical challenges in deployment and management due to limited flexibility and scalability, poor connectivity, and inadequate intelligence. VANETs have attracted increasing attention from both academia and industry because of their important role in driving-assistance systems. Vehicular Ad Hoc Networks focuses on recent advanced technologies and applications that address network protocol design, low-latency networking, context-aware interaction, energy efficiency, resource management, security, human-robot interaction, assistive technology and robots, application development, and the integration of multiple systems that support vehicular networks and smart interactions. Simulation is a key tool for the design and evaluation of Intelligent Transport Systems (ITS) that take advantage of communication-capable vehicles in order to provide valuable safety, traffic management, and infotainment services. It is widely recognized that simulation results are only significant when realistic models are considered within the simulation tool chain, yet research on the subject is quite often based on simplistic models unable to capture the unique characteristics of vehicular communication networks. The book discusses the support that different simulation tools offer for such models, as well as the steps that must be undertaken to fine-tune the model parameters in order to gather realistic results. Moreover, it provides handy hints and references to help determine the most appropriate tools and models, promoting best simulation practices in order to obtain accurate results.
An all-inclusive survey of the fundamentals of parallel and distributed computing. The use of parallel and distributed computing has increased dramatically over the past few years, giving rise to a variety of projects, implementations, and buzzwords surrounding the subject. Although the areas of parallel and distributed computing have traditionally evolved separately, these models have overlapping goals and characteristics. Parallel and Distributed Computing surveys the models and paradigms in this converging area of parallel and distributed computing and considers the diverse approaches within a common text. Covering a comprehensive set of models and paradigms, the material also skims lightly over more specific details and serves as both an introduction and a survey. Novice readers will be able to quickly grasp a balanced overview with the review of central concepts, problems, and ideas, while the more experienced researcher will appreciate the specific comparisons between models, the coherency of the parallel and distributed computing field, and the discussion of less well-known proposals. Other topics covered include:
Parallel and Distributed Computing is a perfect tool for students and can be used as a foundation for parallel and distributed computing courses. Application developers will find this book helpful to get an overview before choosing a particular programming style to study in depth, and researchers and programmers will appreciate the wealth of information concerning the various areas of parallel and distributed computing.
If you need to learn CUDA but don't have experience with parallel computing, CUDA Programming: A Developer's Introduction offers a detailed guide to CUDA with a grounding in parallel fundamentals. It starts by introducing CUDA and bringing you up to speed on GPU parallelism and hardware, then moves on to CUDA installation. Chapters on core concepts, including threads, blocks, grids, and memory, focus on both parallel and CUDA-specific issues. Later, the book demonstrates CUDA in practice for optimizing applications, adjusting to new hardware, and solving common problems.
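As a taste of the thread, block, and grid model those core chapters cover, the sketch below launches a minimal vector-addition kernel. It uses Python's Numba CUDA bindings rather than the CUDA C the book teaches, and it assumes a CUDA-capable GPU with numba installed.

```python
import numpy as np
from numba import cuda

@cuda.jit
def vec_add(a, b, out):
    i = cuda.grid(1)            # global thread index across the whole grid
    if i < out.size:            # guard: the grid may overshoot the array
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads_per_block = 256
blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
vec_add[blocks_per_grid, threads_per_block](a, b, out)   # launch the grid
assert np.allclose(out, a + b)
```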
The organization of data is clearly of great importance in the design of high performance algorithms and architectures. Although there are several landmark papers on this subject, no comprehensive treatment has appeared. This monograph is intended to fill that gap. We introduce a model of computation for parallel computer architectures, by which we are able to express the intrinsic complexity of data organization for specific architectures. We apply this model of computation to several existing parallel computer architectures, e.g., the CDC 205 and CRAY vector-computers, and the MPP binary array processor. The study of data organization in parallel computations was introduced as early as 1970. During the development of the ILLIAC IV system there was a need for a theory of possible data arrangements in interleaved memory systems. The resulting theory dealt primarily with storage schemes, also called skewing schemes, for 2-dimensional matrices, i.e., mappings from a 2-dimensional array to a number of memory banks. By means of the model of computation we are able to apply the theory of skewing schemes to various kinds of parallel computer architectures. This results in a number of consequences for both the design of parallel computer architectures and for applications of parallel processing.
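The core idea of a skewing scheme fits in a few lines. In the sketch below (an illustrative scheme chosen for the example, not one of the book's results), element (i, j) is stored in bank (i + j) mod M, so any row and any column of an M x M matrix each touch every bank exactly once and can be fetched without conflicts.

```python
M = 4  # number of memory banks

def bank(i, j):
    """Skewing scheme: map matrix element (i, j) to a memory bank."""
    return (i + j) % M

# Row 2 and column 1 of a 4x4 matrix each hit all four banks once,
# so either can be fetched in a single conflict-free parallel access.
print([bank(2, j) for j in range(M)])  # [2, 3, 0, 1]
print([bank(i, 1) for i in range(M)])  # [1, 2, 3, 0]
```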
NB-IoT is the Internet of Things (IoT) technology used for cellular communication. NB-IoT devices deliver much better capability and performance, such as: increased area coverage of up to one kilometer; a massive number of devices (up to 200,000) per single base-station area; battery lifetimes of up to ten years; and better indoor and outdoor coverage for areas with weak signal, such as underground garages. The cellular NB-IoT technology is challenging to use and understand. With more than 30 projects covering many use cases and scenarios, this book provides hands-on, practical experience of how to use cellular NB-IoT for smart applications using Arduino(TM), Amazon Cloud, Google Maps, and charts. The book starts by explaining the AT commands used to configure the NB-IoT modem; data serialization and deserialization; how to set up the cloud for connecting NB-IoT devices; setting up rules, policies, security certificates, and a NoSQL database on the cloud; how to store and read data in the cloud; how to use Google Maps to visualize NB-IoT device geo-location; and how to use charts to visualize sensor datasets. Projects for Arduino are presented in four parts. The first part explains how to connect the device to the mobile operator and cellular network; how to perform communication using different network protocols, such as TCP, HTTP, SSL, or MQTT; how to use GPS for geo-location applications; and how to upgrade the NB-IoT modem firmware over the air. The second part explains the microcontroller unit and how to build and run projects such as a 7-segment display or a real-time clock. The third part explains how NB-IoT can be used with sensor devices, such as ultrasonic and environmental sensors. Finally, the fourth part explains how NB-IoT can be used to control actuators, such as stepper motors and relays. This book is a unique resource for understanding practical uses of NB-IoT technology and serves as a handbook for technical and non-technical readers looking for hands-on practice with cellular NB-IoT. It can be used by engineers, students, researchers, system integrators, mobile operators' technical staff, and electronics enthusiasts. To download the software which can be used with the book, go to: https://github.com/5ghub/NB-IoT

About the Author: Hossam Fattah is a technology expert in 4G/5G wireless systems and networking. He received his Ph.D. in Electrical and Computer Engineering from the University of British Columbia, Vancouver, Canada in 2003. He received his Master of Applied Science in Electrical and Computer Engineering from the University of Victoria, Victoria, Canada in 2000. He completed his B.Sc. degree in Computers and Systems Engineering at Al-Azhar University, Cairo, Egypt in 1995. Between 2003 and 2011, he worked in academia and industry, including at Texas A&M University. Between 2011 and 2013, he was with Spirent Communications, NJ, USA. Since 2013, he has been with Microsoft, USA. He is also an affiliate associate professor at the University of Washington, Tacoma, WA, USA, teaching graduate courses on IoT and distributed systems and collaborating on 5G research and innovations. He holds many patents and has technical publications in conferences and journals. He is a registered Professional Engineer with the Association of Professional Engineers, British Columbia, Canada. He is the author of the recent book 5G LTE Narrowband Internet of Things (NB-IoT).
His research interests include wireless communications, radio networks and protocols, cellular quality of service, radio resource management, traffic and packet scheduling, network analytics, and mobility.
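To make the AT-command workflow concrete, here is a minimal sketch of driving a modem over a serial port. It uses Python with pyserial instead of the book's Arduino code; the port name is a placeholder, and the exact command set accepted varies by modem (AT, AT+CFUN and AT+CGATT are standard 3GPP commands).

```python
import serial  # pyserial

def send_at(port, command, timeout=2.0):
    """Send one AT command and return the modem's raw reply."""
    port.write((command + "\r\n").encode("ascii"))
    port.timeout = timeout
    return port.read_until(b"OK\r\n").decode("ascii", errors="replace")

with serial.Serial("/dev/ttyUSB0", 115200) as modem:  # port name is a placeholder
    print(send_at(modem, "AT"))          # basic liveness check
    print(send_at(modem, "AT+CFUN=1"))   # enable full modem functionality
    print(send_at(modem, "AT+CGATT?"))   # query packet-domain attach status
```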
Computing systems are becoming highly complex, harder to understand, and therefore more prone to failure. Where such systems control aircraft, for example, system failure could have disastrous consequences. It is therefore important that we are able to employ mathematical techniques to specify the behavior of critical systems. This thesis uses the theory of Communicating Sequential Processes (CSP) to show how a real-time system (a system that maintains a continuous interaction with its environment) may be specified. Included is a case study in which a local area network protocol is described at two levels of abstraction, and a general method for structuring CSP descriptions of layered protocols is given. The research contained here represents the very latest work on the specification and verification of real-time systems.
In view of the growing presence and popularity of multicore and manycore processors, accelerators, and coprocessors, as well as clusters using such computing devices, the development of efficient parallel applications has become a key challenge for exploiting the performance of such systems. This book covers the scope of parallel programming for modern high performance computing systems. It first discusses selected popular state-of-the-art computing devices and systems available today. These include multicore CPUs, manycore (co)processors such as Intel Xeon Phi, accelerators such as GPUs, and clusters, as well as the programming models supported on these platforms. It next introduces parallelization through important programming paradigms, such as master-slave, geometric Single Program Multiple Data (SPMD) and divide-and-conquer. The practical and useful elements of the most popular and important APIs for programming parallel HPC systems are discussed, including MPI, OpenMP, Pthreads, CUDA, OpenCL, and OpenACC. It also demonstrates, through selected code listings, how these APIs can be used to implement important programming paradigms, and shows how the codes can be compiled and executed in a Linux environment. The book also presents hybrid codes that integrate selected APIs for potentially multi-level parallelization and utilization of heterogeneous resources, and it shows how to use modern elements of these APIs. Selected optimization techniques are also included, such as overlapping communication and computation implemented using various APIs. Features:
- Discusses the popular and currently available computing devices and cluster systems
- Includes typical paradigms used in parallel programs
- Explores popular APIs for programming parallel applications
- Provides code templates that can be used for implementation of paradigms
- Provides hybrid code examples allowing multi-level parallelization
- Covers the optimization of parallel programs
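As a small taste of the SPMD paradigm mentioned above, the sketch below uses mpi4py (a Python stand-in for the book's C-based MPI listings): every rank runs the same program, and the data are partitioned by rank.

```python
# Run with e.g.: mpiexec -n 4 python spmd_sum.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n = 1_000_000
chunk = n // size                       # assumes size divides n evenly
local = np.arange(rank * chunk, (rank + 1) * chunk, dtype=np.float64)

local_sum = local.sum()                 # every rank computes on its own slice
total = comm.reduce(local_sum, op=MPI.SUM, root=0)

if rank == 0:
    print("sum =", total)               # only the root prints the result
```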
GPU Parallel Program Development using CUDA teaches GPU programming by showing the differences among different families of GPUs. This approach prepares the reader for the next generation and future generations of GPUs. The book emphasizes concepts that will remain relevant for a long time, rather than concepts that are platform-specific, while also providing platform-dependent explanations that are as valuable as the generalized GPU concepts. The book consists of three separate parts. It starts by explaining parallelism using CPU multi-threading in Part I. A few simple programs are used to demonstrate the concept of dividing a large task into multiple parallel sub-tasks and mapping them to CPU threads. Multiple ways of parallelizing the same task are analyzed, and their pros and cons are studied in terms of both core and memory operation. Part II of the book introduces GPU massive parallelism. The same programs are parallelized on multiple Nvidia GPU platforms and the same performance analysis is repeated. Because the core and memory structures of CPUs and GPUs are different, the results differ in interesting ways. The end goal is to make programmers aware of all the good ideas, as well as the bad ideas, so readers can apply the good ideas and avoid the bad ideas in their own programs. Part III of the book provides pointers for readers who want to expand their horizons. It provides a brief introduction to popular CUDA libraries (such as cuBLAS, cuFFT, NPP, and Thrust), the OpenCL programming language, an overview of GPU programming using other programming languages and API libraries (such as Python, OpenCV, OpenGL, and Apple's Swift and Metal), and the deep learning library cuDNN.
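The Part I exercise of dividing a large task into parallel sub-tasks looks like this in miniature; the sketch is a Python stand-in for the book's CPU multi-threading examples, with the chunk count chosen arbitrarily.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

data = np.random.rand(8_000_000)

def partial_sum(chunk):
    """One sub-task: sum a contiguous slice of the input."""
    return chunk.sum()

# Divide the array into 8 sub-tasks and map them onto a thread pool.
# numpy releases the GIL inside sum(), so the threads genuinely overlap.
chunks = np.array_split(data, 8)
with ThreadPoolExecutor(max_workers=8) as pool:
    total = sum(pool.map(partial_sum, chunks))

assert np.isclose(total, data.sum())
```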
It is universally accepted today that parallel processing is here to stay but that software for parallel machines is still difficult to develop. However, there is little recognition of the fact that changes in processor architecture can significantly ease the development of software. In the seventies, the availability of processors that could address a large name space directly eliminated the problem of name management at one level and paved the way for the routine development of large programs. Similarly, today, processor architectures that can facilitate cheap synchronization and provide a global address space can simplify compiler development for parallel machines. If the cost of synchronization remains high, the programming of parallel machines will remain significantly less abstract than programming sequential machines. In this monograph Bob Iannucci presents the design and analysis of an architecture that can be a better building block for parallel machines than any von Neumann processor. There is another very interesting motivation behind this work. It is rooted in the long and venerable history of dataflow graphs as a formalism for expressing parallel computation. The field has bloomed since 1974, when Dennis and Misunas proposed a truly novel architecture using dataflow graphs as the parallel machine language. The novelty and elegance of dataflow architectures has, however, also kept us from asking the real question: "What can dataflow architectures buy us that von Neumann architectures can't?" In the following I explain in a roundabout way how Bob and I arrived at this question.
Shared Memory Application Programming presents the key concepts and applications of parallel programming, in an accessible and engaging style applicable to developers across many domains. Multithreaded programming is today a core technology, underlying all software development projects in any branch of applied computer science. This book guides readers to develop insights about threaded programming and introduces two popular platforms for multicore development: OpenMP and Intel Threading Building Blocks (TBB). Author Victor Alessandrini leverages his rich experience to explain each platform's design strategies, analyzing the focus and strengths underlying their often complementary capabilities, as well as their interoperability. The book is divided into two parts: the first develops the essential concepts of thread management and synchronization, discussing the way they are implemented in native multithreading libraries (Windows threads, Pthreads) as well as in the modern C++11 threads standard. The second provides an in-depth discussion of TBB and OpenMP, including the latest features in the OpenMP 4.0 extensions, to ensure readers' skills are fully up to date. Focus progressively shifts from traditional thread parallelism to modern task parallelism deployed by modern programming environments. Several chapters include examples drawn from a variety of disciplines, including molecular dynamics and image processing, with full source code and a software library incorporating a number of utilities that readers can adapt into their own projects.
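The book's examples are in C++, OpenMP, and TBB; as a language-neutral illustration of the mutual-exclusion concept its first part develops, here is the canonical shared-counter example, sketched in Python.

```python
import threading

counter = 0
lock = threading.Lock()

def deposit(times):
    global counter
    for _ in range(times):
        with lock:            # critical section: without the lock, the
            counter += 1      # read-modify-write interleaves and loses updates

threads = [threading.Thread(target=deposit, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                # 400000, deterministic because of the mutex
```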
Companies are spending billions on machine learning projects, but it's money wasted if the models can't be deployed effectively. In this practical guide, Hannes Hapke and Catherine Nelson walk you through the steps of automating a machine learning pipeline using the TensorFlow ecosystem. You'll learn the techniques and tools that will cut deployment time from days to minutes, so that you can focus on developing new models rather than maintaining legacy systems. Data scientists, machine learning engineers, and DevOps engineers will discover how to go beyond model development to successfully productize their data science projects, while managers will better understand the role they play in helping to accelerate these projects.
- Understand the steps to build a machine learning pipeline
- Build your pipeline using components from TensorFlow Extended
- Orchestrate your machine learning pipeline with Apache Beam, Apache Airflow, and Kubeflow Pipelines
- Work with data using TensorFlow Data Validation and TensorFlow Transform
- Analyze a model in detail using TensorFlow Model Analysis
- Examine fairness and bias in your model performance
- Deploy models with TensorFlow Serving or TensorFlow Lite for mobile devices
- Learn privacy-preserving machine learning techniques
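To make the component vocabulary above concrete, here is a minimal pipeline sketch based on TFX's public v1 API; the paths are placeholders and the choice of components is illustrative, not the authors' own example.

```python
from tfx import v1 as tfx

def make_pipeline(data_root, pipeline_root, metadata_path):
    # Ingest CSVs, compute statistics, and infer a schema: the first
    # three stages of a typical TFX pipeline.
    example_gen = tfx.components.CsvExampleGen(input_base=data_root)
    statistics_gen = tfx.components.StatisticsGen(
        examples=example_gen.outputs["examples"])
    schema_gen = tfx.components.SchemaGen(
        statistics=statistics_gen.outputs["statistics"])
    return tfx.dsl.Pipeline(
        pipeline_name="toy_pipeline",
        pipeline_root=pipeline_root,
        metadata_connection_config=(
            tfx.orchestration.metadata.sqlite_metadata_connection_config(
                metadata_path)),
        components=[example_gen, statistics_gen, schema_gen],
    )

# Run locally; Airflow or Kubeflow runners swap in for production.
tfx.orchestration.LocalDagRunner().run(
    make_pipeline("data/", "pipelines/toy/", "metadata.db"))
```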