This book offers readers a set of new approaches, tools, and techniques for addressing the challenges of parallelization in the design of embedded systems. It provides an advanced parallel simulation infrastructure for efficient and effective system-level model validation and development, so as to build better products in less time. Since parallel discrete event simulation (PDES) has the potential to exploit the underlying parallel computational capability of today's multi-core simulation hosts, the author begins by reviewing the parallelization of discrete event simulation, identifying problems and solutions. She then describes out-of-order parallel discrete event simulation (OoO PDES), a novel approach for efficient validation of system-level designs that aggressively exploits the parallel capabilities of today's multi-core PCs. This approach enables readers to design simulators that fully exploit the parallel processing capability of the multi-core system to achieve fast simulation without loss of simulation or timing accuracy. Based on this parallel simulation infrastructure, the author further describes automatic approaches that help the designer quickly narrow down the debugging targets in faulty ESL models with parallelism.
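To make the technique concrete, the sketch below shows the sequential event loop that PDES approaches set out to parallelize: a priority queue of timestamped events processed in order. It is only a minimal illustration, not code from the book, and the names (`schedule`, `run`, `tick`) are assumptions; the out-of-order scheduling the author describes goes well beyond this baseline.

```python
import heapq
import itertools

# Minimal sequential discrete event simulation (DES) loop -- the baseline that
# PDES and OoO PDES aim to speed up. Illustrative only; not the book's code.

_counter = itertools.count()  # tie-breaker so equal-time events never compare handlers

def schedule(queue, time, handler):
    """Insert a (time, seq, handler) event into the priority queue."""
    heapq.heappush(queue, (time, next(_counter), handler))

def run(initial_events, end_time):
    """Process events in timestamp order until the queue empties or time runs out."""
    queue = []
    for time, handler in initial_events:
        schedule(queue, time, handler)
    while queue:
        now, _, handler = heapq.heappop(queue)
        if now > end_time:
            break
        handler(now, queue)  # a handler may schedule further events

def tick(now, queue):
    """Toy process: prints the time and reschedules itself a few times."""
    print(f"tick at t={now}")
    if now < 3.0:
        schedule(queue, now + 1.0, tick)

run([(0.0, tick)], end_time=10.0)
```

A sequential simulator must process these events strictly in timestamp order; the OoO PDES approach described in the book relaxes that ordering across cores while preserving simulation and timing accuracy.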
The book discusses some key scientific and technological developments in high performance computing, identifies significant trends, and defines desirable research objectives. It covers general concepts and emerging systems, software technology, algorithms and applications. Coverage includes hardware, software tools, networks and numerical methods, new computer architectures, and a discussion of future trends. Beyond purely scientific/engineering computing, the book extends to coverage of enterprise-wide, commercial applications, including papers on the performance and scalability of database servers and Oracle DBMS systems. Audience: Most papers are research level, but some are suitable for computer-literate managers and technicians, making the book useful to users of commercial parallel computers.
Distributed and Parallel Systems: From Instruction Parallelism to Cluster Computing is the proceedings of the third Austrian-Hungarian Workshop on Distributed and Parallel Systems organized jointly by the Austrian Computer Society and the MTA SZTAKI Computer and Automation Research Institute. This book contains 18 full papers and 12 short papers from 14 countries around the world, including Japan, Korea and Brazil. The paper sessions cover a broad range of research topics in the area of parallel and distributed systems, including software development environments, performance evaluation, architectures, languages, algorithms, web and cluster computing. This volume will be useful to researchers and scholars interested in all areas related to parallel and distributed computing systems.
Advances in optical technologies have made it possible to implement optical interconnections in future massively parallel processing systems. Photons are non-charged particles, and do not naturally interact. Consequently, there are many desirable characteristics of optical interconnects, e.g. high speed (speed of light), increased fanout, high bandwidth, high reliability, longer interconnection lengths, low power requirements, and immunity to EMI with reduced crosstalk. Optics can utilize free-space interconnects as well as guided wave technology, neither of which suffers from the electrical limitations of VLSI interconnect technology. Optical interconnections can be built at various levels, providing chip-to-chip, module-to-module, board-to-board, and node-to-node communications. Massively parallel processing using optical interconnections poses new challenges: new system configurations need to be designed, scheduling and data communication schemes based on new resource metrics need to be investigated, algorithms for a wide variety of applications need to be developed under the novel computation models that optical interconnections permit, and so on. Parallel Computing Using Optical Interconnections is a collection of survey articles written by leading and active scientists in the area of parallel computing using optical interconnections. This is the first book which provides current and comprehensive coverage of the field, reflects the state of the art from high-level architecture design and algorithmic points of view, and points out directions for further research and development.
Cellular automata can be viewed both as computational models and as systems for modelling real processes. This volume emphasises the first aspect. In articles written by leading researchers, sophisticated massively parallel algorithms (firing squad, life, Fischer's primes recognition) are treated. Their computational power and the specific complexity classes they determine are surveyed, while some recent results in relation to chaos from a new dynamic systems point of view are also presented. Audience: This book will be of interest to specialists in theoretical computer science and the parallelism challenge.
Parallel Numerical Computations with Applications contains selected edited papers presented at the 1998 Frontiers of Parallel Numerical Computations and Applications Workshop, along with invited papers from leading researchers around the world. These papers cover a broad spectrum of topics on parallel numerical computation with applications, such as advanced parallel numerical and computational optimization methods, novel parallel computing techniques, numerical fluid mechanics, and other applications related to material sciences, signal and image processing, semiconductor technology, and electronic circuits and systems design. This state-of-the-art volume will be an up-to-date resource for researchers in the areas of parallel and distributed computing.
Parallel and Distributed Information Systems brings together in one place important contributions and up-to-date research results in this fast moving area. Parallel and Distributed Information Systems serves as an excellent reference, providing insight into some of the most challenging research issues in the field.
Mathematics is playing an ever more important role in the physical and biological sciences, provoking a blurring of boundaries between scientific disciplines and a resurgence of interest in the modern as well as the classical techniques of applied mathematics. This renewal of interest, both in research and teaching, has led to the establishment of the series: Texts in Applied Mathematics (TAM). The development of new courses is a natural consequence of a high level of excitement on the research frontier as newer techniques, such as numerical and symbolic computer systems, dynamical systems, and chaos, mix with and reinforce the traditional methods of applied mathematics. Thus, the purpose of this textbook series is to meet the current and future needs of these advances and encourage the teaching of new courses. TAM will publish textbooks suitable for use in advanced undergraduate and beginning graduate courses, and will complement the Applied Mathematical Sciences (AMS) series, which will focus on advanced textbooks and research-level monographs. Preface: A successful concurrent numerical simulation requires physics and mathematics to develop and analyze the model, numerical analysis to develop solution methods, and computer science to develop a concurrent implementation. No single course can or should cover all these disciplines. Instead, this course on concurrent scientific computing focuses on a topic that is not covered or is insufficiently covered by other disciplines: the algorithmic structure of numerical methods.
Advances in microelectronic technology have made massively parallel computing a reality and triggered an outburst of research activity in parallel processing architectures and algorithms. Distributed memory multiprocessors - parallel computers that consist of microprocessors connected in a regular topology - are increasingly being used to solve large problems in many application areas. In order to use these computers for a specific application, existing algorithms need to be restructured for the architecture and new algorithms developed. The performance of a computation on a distributed memory multiprocessor is affected by the node and communication architecture, the interconnection network topology, the I/O subsystem, and the parallel algorithm and communication protocols. Each of these parameters is a complex problem, and solutions require an understanding of the interactions among them. This book is based on the papers presented at the NATO Advanced Study Institute held at Bilkent University, Turkey, in July 1991. The book is organized in five parts: Parallel computing structures and communication, Parallel numerical algorithms, Parallel programming, Fault tolerance, and Applications and algorithms.
This IMA Volume in Mathematics and its Applications, ALGORITHMS FOR PARALLEL PROCESSING, is based on the proceedings of a workshop that was an integral part of the 1996-97 IMA program on "MATHEMATICS IN HIGH-PERFORMANCE COMPUTING." The workshop brought together algorithm developers from theory, combinatorics, and scientific computing. The topics ranged over models, linear algebra, sorting, randomization, and graph algorithms and their analysis. We thank Michael T. Heath of the University of Illinois at Urbana (Computer Science), Abhiram Ranade of the Indian Institute of Technology (Computer Science and Engineering), and Robert S. Schreiber of Hewlett Packard Laboratories for their excellent work in organizing the workshop and editing the proceedings. We also take this opportunity to thank the National Science Foundation (NSF) and the Army Research Office (ARO), whose financial support made the workshop possible. Avner Friedman and Robert Gulliver. The Workshop on Algorithms for Parallel Processing was held at the IMA September 16-20, 1996; it was the first workshop of the IMA year dedicated to the mathematics of high performance computing. The workshop organizers were Abhiram Ranade of The Indian Institute of Technology, Bombay, Michael Heath of the University of Illinois, and Robert Schreiber of Hewlett Packard Laboratories. Our idea was to bring together researchers who do innovative, exciting, parallel algorithms research on a wide range of topics, and by sharing insights, problems, tools, and methods to learn something of value from one another.
During the last three decades, breakthroughs in computer technology have made a tremendous impact on optimization. In particular, parallel computing has made it possible to solve larger and computationally more difficult problems. The book covers recent developments in novel programming and algorithmic aspects of parallel computing as well as technical advances in parallel optimization. Each contribution is essentially expository in nature but scholarly in treatment. In addition, each chapter includes a collection of carefully selected problems. The first two chapters discuss theoretical models for parallel algorithm design and their complexity. The next chapter gives the perspective of the programmer practicing parallel algorithm development on real world platforms. Solving systems of linear equations efficiently is of great importance not only because they arise in many scientific and engineering applications but also because algorithms for solving many optimization problems need to call system solvers and subroutines (chapters four and five). Chapters six through thirteen are dedicated to optimization problems and methods. They include parallel algorithms for network problems, parallel branch and bound techniques, parallel heuristics for discrete and continuous problems, decomposition methods, parallel algorithms for variational inequality problems, parallel algorithms for stochastic programming, and neural networks. Audience: Parallel Computing in Optimization is addressed not only to researchers of mathematical programming, but to all scientists in various disciplines who use optimization methods in parallel and multiprocessing environments to model and solve problems.
This book presents the theory behind software-implemented hardware fault tolerance, as well as the practical aspects needed to put it to work on real examples. By evaluating accurately the advantages and disadvantages of the already available approaches, the book provides a guide to developers willing to adopt software-implemented hardware fault tolerance in their applications. Moreover, the book identifies open issues for researchers willing to improve the already available techniques.
Multiple processor systems are an important class of parallel systems. Over the years, several architectures have been proposed to build such systems to satisfy the requirements of high performance computing. These architectures span a wide variety of system types. At the low end of the spectrum, we can build a small, shared-memory parallel system with tens of processors. These systems typically use a bus to interconnect the processors and memory. Such systems, for example, are becoming commonplace in high-performance graphics workstations. These systems are called uniform memory access (UMA) multiprocessors because they provide uniform access to memory for all processors. These systems provide a single address space, which is preferred by programmers. This architecture, however, cannot be extended even to medium systems with hundreds of processors due to bus bandwidth limitations. To scale systems to the medium range, i.e., to hundreds of processors, non-bus interconnection networks have been proposed. These systems, for example, use a multistage dynamic interconnection network. Such systems also provide global, shared memory like the UMA systems. However, they introduce local and remote memories, which lead to non-uniform memory access (NUMA) architecture. Distributed-memory architecture is used for systems with thousands of processors. These systems differ from the shared-memory architectures in that there is no globally accessible shared memory. Instead, they use message passing to facilitate communication among the processors. As a result, they do not provide a single address space.
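As a hedged illustration of the two communication styles contrasted above (not an example from the book), the sketch below performs the same parallel count once through a shared address space and once through explicit message passing; all names in it are invented for the sketch.

```python
from multiprocessing import Process, Queue, Value

# Shared-memory style: workers update one counter in a single address space.
def shared_worker(counter, n):
    for _ in range(n):
        with counter.get_lock():   # synchronization is needed on the shared location
            counter.value += 1

# Message-passing style: no shared state; each worker sends its partial result.
def message_worker(outbox, n):
    outbox.put(n)

if __name__ == "__main__":
    # Shared-memory version.
    counter = Value("i", 0)
    workers = [Process(target=shared_worker, args=(counter, 1000)) for _ in range(4)]
    for w in workers: w.start()
    for w in workers: w.join()
    print("shared-memory total:", counter.value)

    # Message-passing version.
    outbox = Queue()
    workers = [Process(target=message_worker, args=(outbox, 1000)) for _ in range(4)]
    for w in workers: w.start()
    total = sum(outbox.get() for _ in range(4))
    for w in workers: w.join()
    print("message-passing total:", total)
```

The shared-memory version relies on a single address space with explicit synchronization, which is exactly what bus bandwidth and NUMA effects make expensive at scale; the message-passing version mirrors how distributed-memory machines with thousands of processors communicate.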
THE CONTEXT OF PARALLEL PROCESSING The field of digital computer architecture has grown explosively in the past two decades. Through a steady stream of experimental research, tool-building efforts, and theoretical studies, the design of an instruction-set architecture, once considered an art, has been transformed into one of the most quantitative branches of computer technology. At the same time, better understanding of various forms of concurrency, from standard pipelining to massive parallelism, and the invention of architectural structures to support a reasonably efficient and user-friendly programming model for such systems, have allowed hardware performance to continue its exponential growth. This trend is expected to continue in the near future. This explosive growth, linked with the expectation that performance will continue its exponential rise with each new generation of hardware and that (in stark contrast to software) computer hardware will function correctly as soon as it comes off the assembly line, has its down side. It has led to unprecedented hardware complexity and almost intolerable development costs. The challenge facing current and future computer designers is to institute simplicity where we now have complexity; to use fundamental theories being developed in this area to gain performance and ease-of-use benefits from simpler circuits; to understand the interplay between technological capabilities and limitations, on the one hand, and design decisions based on user and application requirements on the other.
Developing correct and efficient software is far more complex for parallel and distributed systems than it is for sequential processors. Some of the reasons for this added complexity are: the lack of a universally acceptable parallel and distributed programming paradigm, the criticality of achieving high performance, and the difficulty of writing correct parallel and distributed programs. These factors collectively influence the current status of parallel and distributed software development tool efforts. Tools and Environments for Parallel and Distributed Systems addresses the above issues by describing working tools and environments, and gives a solid overview of some of the fundamental research being done worldwide. Topics covered in this collection are: mainstream program development tools, performance prediction tools and studies; debugging tools and research; and nontraditional tools. Audience: Suitable as a secondary text for graduate level courses in software engineering and parallel and distributed systems, and as a reference for researchers and practitioners in industry.
Multithreaded computer architecture has emerged as one of the most promising and exciting avenues for the exploitation of parallelism. This new field represents the confluence of several independent research directions which have united over a common set of issues and techniques. Multithreading draws on recent advances in dataflow, RISC, compiling for fine-grained parallel execution, and dynamic resource management. It offers the hope of dramatic performance increases through parallel execution for a broad spectrum of significant applications based on extensions to traditional approaches. Multithreaded Computer Architecture is divided into four parts, reflecting four major perspectives on the topic. Part I provides the reader with basic background information, definitions, and surveys of work which have in one way or another been pivotal in defining and shaping multithreading as an architectural discipline. Part II examines key elements of multithreading, highlighting the fundamental nature of latency and synchronization. This section presents clever techniques for hiding latency and supporting large synchronization name spaces. Part III looks at three major multithreaded systems, considering issues of machine organization and compilation strategy. Part IV concludes the volume with an analysis of multithreaded architectures, showcasing methodologies and actual measurements. Multithreaded Computer Architecture: A Summary of the State of the Art is an excellent reference source and may be used as a text for advanced courses on the subject.
The most important use of computing in the future will be in the context of the global "digital convergence" where everything becomes digital and everything is inter-networked. The applications will be dominated by storage, search, retrieval, analysis, exchange and updating of information in a wide variety of forms. Heavy demands will be placed on systems by many simultaneous requests. And, fundamentally, all this shall be delivered at much higher levels of dependability, integrity and security. Increasingly, large parallel computing systems and networks are providing unique challenges to industry and academia in dependable computing, especially because of the higher failure rates intrinsic to these systems. The challenge in the last part of this decade is to build a system that is both inexpensive and highly available. A machine cluster built of commodity hardware parts, with each node running an OS instance and a set of applications extended to be fault resilient, can satisfy the new stringent high-availability requirements. The focus of this book is to present recent techniques and methods for implementing fault-tolerant parallel and distributed computing systems. Section I, Fault-Tolerant Protocols, considers basic techniques for achieving fault-tolerance in communication protocols for distributed systems, including synchronous and asynchronous group communication, static total causal ordering protocols, and a fail-aware datagram service that supports communications by time.
Performance Evaluation, Prediction and Visualization in Parallel Systems presents a comprehensive and systematic discussion of the theory, methods, techniques and tools for performance evaluation, prediction and visualization of parallel systems. Chapter 1 gives a short overview of performance degradation of parallel systems, and presents a general discussion on the importance of performance evaluation, prediction and visualization of parallel systems. Chapter 2 analyzes and defines several kinds of serial and parallel runtime, points out some of the weaknesses of parallel speedup metrics, and discusses how to improve and generalize them. Chapter 3 describes formal definitions of scalability, addresses the basic metrics affecting the scalability of parallel systems, discusses the scalability of parallel systems from three aspects: parallel architecture, parallel algorithm and parallel algorithm-architecture combinations, and analyzes the relations of scalability and speedup. Chapter 4 discusses the methodology of performance measurement, describes benchmark-oriented performance testing and analysis, and shows how to measure speedup and scalability in practice. Chapter 5 analyzes the difficulties in performance prediction, and discusses application-oriented and architecture-oriented performance prediction and how to predict speedup and scalability in practice. Chapter 6 discusses performance visualization techniques and tools for parallel systems in three stages: performance data collection, performance data filtering and performance data visualization, and classifies the existing performance visualization tools. Chapter 7 describes parallel compiling-based, search-based and knowledge-based performance debugging, which assists programmers in optimizing the strategy or algorithm in their parallel programs, and presents visual programming-based performance debugging to help programmers identify the location and cause of performance problems. It also provides concrete suggestions on how to modify parallel programs to improve their performance. Chapter 8 gives an overview of current interconnection networks for parallel systems, analyzes the scalability of interconnection networks, and discusses how to measure and improve network performance. Performance Evaluation, Prediction and Visualization in Parallel Systems serves as an excellent reference for researchers, and may be used as a text for advanced courses on the topic.
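For readers who want the basic metrics from these chapters in concrete form, the sketch below computes the standard speedup and efficiency figures, S(p) = T_serial / T_parallel(p) and E(p) = S(p) / p. The runtimes are invented sample values, not measurements from the book.

```python
# Standard speedup and efficiency metrics; the sample runtimes are hypothetical.

def speedup(t_serial, t_parallel):
    """S(p) = T_serial / T_parallel(p)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """E(p) = S(p) / p, i.e. how well the p processors are utilized."""
    return speedup(t_serial, t_parallel) / p

t_serial = 120.0                              # serial runtime in seconds (made up)
t_parallel = {2: 65.0, 4: 36.0, 8: 22.0}      # parallel runtimes per processor count (made up)

for p, tp in sorted(t_parallel.items()):
    print(f"p={p}: speedup={speedup(t_serial, tp):.2f}, "
          f"efficiency={efficiency(t_serial, tp, p):.2f}")
```

Falling efficiency as p grows is exactly the kind of performance degradation whose causes the book's measurement, prediction and visualization chapters aim to expose.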
Multiprocessing: Trade-Offs in Computation and Communication presents an in-depth analysis of several commonly observed regular and irregular computations for multiprocessor systems. This book includes techniques which enable researchers and application developers to quantitatively determine the effects of algorithm data dependencies on execution time, on communication requirements, on processor utilization and on the speedups possible. Starting with simple, two-dimensional, diamond-shaped directed acyclic graphs, the analysis is extended to more complex and higher dimensional directed acyclic graphs. The analysis allows for the quantification of the computation and communication costs and their interdependencies. The practical significance of these results on the performance of various data distribution schemes is clearly explained. Using these results, the performance of the parallel computations is formulated in an architecture-independent fashion. These formulations allow for the parameterization of architecture-specific entities such as the computation and communication rates. This type of parameterized performance analysis can be used at compile time or at run time so as to achieve the optimal distribution of the computations. The material in Multiprocessing: Trade-Offs in Computation and Communication connects theory with practice, so that the inherent performance limitations in many computations can be understood, and practical methods can be devised that would assist in the development of software for scalable high performance systems.
The communication complexity of two-party protocols is a complexity measure only about 15 years old, but it is already considered one of the fundamental complexity measures of recent complexity theory. Similarly to Kolmogorov complexity in the theory of sequential computations, communication complexity is used as a method for the study of the complexity of concrete computing problems in parallel information processing. Especially, it is applied to prove lower bounds that say what computer resources (time, hardware, memory size) are necessary to compute the given task. Besides the estimation of the computational difficulty of computing problems, the proved lower bounds are useful for proving the optimality of algorithms that are already designed. In some cases the knowledge about the communication complexity of a given problem may even be helpful in searching for efficient algorithms for this problem. The study of communication complexity has become a well-defined independent area of complexity theory. In addition to a strong relation to several fundamental complexity measures (and so to several fundamental problems of complexity theory), communication complexity has contributed to the study and to the understanding of the nature of determinism, nondeterminism, and randomness in algorithmics. There already exists a non-trivial mathematical machinery to handle the communication complexity of concrete computing problems, which gives hope that the approach based on communication complexity will be instrumental in the study of several central open problems of recent complexity theory.
Nonlinear Assignment Problems (NAPs) are natural extensions of the classic Linear Assignment Problem, and despite the efforts of many researchers over the past three decades, they still remain some of the hardest combinatorial optimization problems to solve exactly. The purpose of this book is to provide, in a single volume, the major algorithmic aspects and applications of NAPs as contributed by leading international experts. The chapters included in this book are concerned with major applications and the latest algorithmic solution approaches for NAPs. Approximation algorithms, polyhedral methods, semidefinite programming approaches and heuristic procedures for NAPs are included, while applications of this problem class in the areas of multiple-target tracking in the context of military surveillance systems, of experimental high energy physics, and of parallel processing are presented. Audience: Researchers and graduate students in the areas of combinatorial optimization, mathematical programming, operations research, physics, and computer science.
Despite five decades of research, parallel computing remains an exotic, frontier technology on the fringes of mainstream computing. Its much-heralded triumph over sequential computing has yet to materialize. This is in spite of the fact that the processing needs of many signal processing applications continue to eclipse the capabilities of sequential computing. The culprit is largely the software development environment. Fundamental shortcomings in the development environment of many parallel computer architectures thwart the adoption of parallel computing. Foremost, parallel computing has no unifying model to accurately predict the execution time of algorithms on parallel architectures. Cost and scarce programming resources prohibit deploying multiple algorithms and partitioning strategies in an attempt to find the fastest solution. As a consequence, algorithm design is largely an intuitive art form dominated by practitioners who specialize in a particular computer architecture. This, coupled with the fact that parallel computer architectures rarely last more than a couple of years, makes for a complex and challenging design environment. To navigate this environment, algorithm designers need a road map, a detailed procedure they can use to efficiently develop high performance, portable parallel algorithms. The focus of this book is to draw such a road map. The Parallel Algorithm Synthesis Procedure can be used to design reusable building blocks of adaptable, scalable software modules from which high performance signal processing applications can be constructed. The hallmark of the procedure is a semi-systematic process for introducing parameters to control the partitioning and scheduling of computation and communication. This facilitates the tailoring of software modules to exploit different configurations of multiple processors, multiple floating-point units, and hierarchical memories. To showcase the efficacy of this procedure, the book presents three case studies requiring various degrees of optimization for parallel execution. This book can be used as a reference for algorithm designers or as a text for an advanced course on parallel programming.
Automatic transformation of a sequential program into a parallel form is a subject that presents a great intellectual challenge and promises a great practical reward. There is a tremendous investment in existing sequential programs, and scientists and engineers continue to write their application programs in sequential languages (primarily in Fortran), while the demand for higher speedups increases. The job of a restructuring compiler is to discover the dependence structure and the characteristics of the given machine. Much attention has been focused on the Fortran do loop. This is where one expects to find major chunks of computation that need to be performed repeatedly for different values of the index variable. Many loop transformations have been designed over the years, and several of them can be found in any parallelizing compiler currently in use in industry or at a university research facility. The book series on Loop Transformations for Restructuring Compilers provides a rigorous theory of loop transformations and dependence analysis. We want to develop the transformations in a consistent mathematical framework using objects like directed graphs, matrices, and linear equations. Then, the algorithms that implement the transformations can be precisely described in terms of certain abstract mathematical algorithms. The first volume, Loop Transformations for Restructuring Compilers: The Foundations, provided the general mathematical background needed for loop transformations (including those basic mathematical algorithms), discussed data dependence, and introduced the major transformations. The current volume, Loop Parallelization, builds a detailed theory of iteration-level loop transformations based on the material developed in the previous book.
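As a small, hedged illustration of iteration-level parallelization (not code from the book series), the sketch below shows a loop whose iterations carry no data dependence and can therefore be distributed across processes, together with a comment on a dependence that would block this transformation.

```python
from multiprocessing import Pool

# Illustrative loop body with no cross-iteration dependence: each iteration
# reads only its own index, so the iterations may execute in any order.
def body(i):
    return i * i

if __name__ == "__main__":
    n = 16

    # Sequential form of the loop.
    sequential = [body(i) for i in range(n)]

    # Parallel form: the independent iterations are mapped onto worker processes.
    with Pool(processes=4) as pool:
        parallel = pool.map(body, range(n))

    assert sequential == parallel
    print(parallel)

    # By contrast, a loop such as  a[i] = a[i-1] + 1  carries a dependence from
    # iteration i-1 to iteration i, and a restructuring compiler must transform
    # or reject it before parallelizing.
```

Dependence analysis, as the volume describes, is precisely the machinery that decides which of these two situations a given loop nest is in.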
Mining Very Large Databases with Parallel Processing addresses the problem of large-scale data mining. It is an interdisciplinary text, describing advances in the integration of three computer science areas, namely 'intelligent' (machine learning-based) data mining techniques, relational databases and parallel processing. The basic idea is to use concepts and techniques of the latter two areas - particularly parallel processing - to speed up and scale up data mining algorithms. The book is divided into three parts. The first part presents a comprehensive review of intelligent data mining techniques such as rule induction, instance-based learning, neural networks and genetic algorithms. Likewise, the second part presents a comprehensive review of parallel processing and parallel databases. Each of these parts includes an overview of commercially available, state-of-the-art tools. The third part deals with the application of parallel processing to data mining. The emphasis is on finding generic, cost-effective solutions for realistic data volumes. Two parallel computational environments are discussed, the first excluding the use of a commercial-strength DBMS, and the second using parallel DBMS servers. It is assumed that the reader has knowledge roughly equivalent to a first degree (BSc) in the exact sciences, so that (s)he is reasonably familiar with basic concepts of statistics and computer science. The primary audience for Mining Very Large Databases with Parallel Processing is industry data miners and practitioners in general, who would like to apply intelligent data mining techniques to large amounts of data. The book will also be of interest to academic researchers and postgraduate students, particularly database researchers interested in advanced, intelligent database applications, and artificial intelligence researchers interested in industrial, real-world applications of machine learning.