Descriptor: "Concurrent computing" / Publisher: ieee comput. soc. press - Searchworks@Jio Institute Digital Library Search Results

1. Working Sets, Cache Sizes, And Node Granularity Issues For Large-scale Multiprocessors

Author: Edward Rothberg, Anoop Gupta, and Jaswinder Pal Singh
Subjects: Computer science, CPU cache, Distributed computing, Cache-only memory architecture, Multiprocessing, Parallel computing, General Medicine, Cache-oblivious algorithm, Cache pollution, Supercomputer, Concurrent computing, Distributed memory, Cache, Cache hierarchy
Abstract: The distribution of resources among processors, memory and caches is a crucial question faced by designers of large-scale parallel machines. If a machine is to solve problems with a certain data set size, should it be built with a large number of processors each with a small amount of memory, or a smaller number of processors each with a large amount of memory? How much cache memory should be provided per processor for cost-effectiveness? And how do these decisions change as larger problems are run on larger machines? In this paper, we explore the above questions based on the characteristics of five important classes of large-scale parallel scientific applications. We first show that all the applications have a hierarchy of well-defined per-processor working sets, whose size, performance impact and scaling characteristics can help determine how large different levels of a multiprocessor's cache hierarchy should be. Then, we use these working sets together with certain other important characteristics of the applications—such as communication to computation ratios, concurrency, and load balancing behavior—to reflect upon the broader question of the granularity of processing nodes in high-performance multiprocessors. We find that very small caches whose sizes do not increase with the problem or machine size are adequate for all but two of the application classes. Even in the two exceptions, the working sets scale quite slowly with problem size, and the cache sizes needed for problems that will be run in the foreseeable future are small. We also find that relatively fine-grained machines, with large numbers of processors and quite small amounts of memory per processor, are appropriate for all the applications.
Published: 2005

2. Panel 4: What Types Of Research Papers Should We Be Writing?

Author: Lionel M. Ni
Subjects: Concurrent data structure, business.industry, Computer science, Operating system, Concurrent computing, Software engineering, business, computer.software_genre, computer
Published: 2005

3. Panel 5: Parallel Processing: What Have We Done Wrong?

Author: L.M. Ni and Kuo-Wei Wu
Subjects: Parallel processing (DSP implementation), Computer science, Order (business), Programming language, Embarrassingly parallel, Concurrent computing, Supercomputer, computer.software_genre, Massively parallel, Data science, computer, Field (computer science)
Abstract: Parallel processing has been a subject of extensive research for over 20 years, especially in the last 10 years, with many commercial parallel machines becoming available, from small scale parallel machines to massively parallel machines. At one time, it was claimed that parallel machines would become the mainstream computers. However, more recently, some parallel computer vendors have gone out of business and some others are struggling. Some pessimists even claimed that this is a dying field. So, what’s wrong? Five distinguished panelists are invited to share their views on this issue. The panelists are also expected to address what could be done and should be done in order to make parallel computers truly mainstream computers.
Published: 2005

4. Parallel And Distributed Algorithms An Introduction To The Minitrack

Author: S. Olariu, M.A. Langston, and J.L. Schwing
Subjects: Distributed design patterns, Concurrency control, Computer science, Distributed algorithm, Distributed computing, Distributed concurrency control, Parallel algorithm, Concurrent computing, Self-stabilization, Algorithm design, Parallel computing
Published: 2005

5. Main frame diagnosis support system

Author: H. Nishine, K. Ohga, T. Nishida, Y. Tsubuku, H. Shiga, and M. Kaneko
Subjects: Automatic test equipment, Logic synthesis, Computer engineering, Computer science, Frame (networking), Data file, Concurrent computing, Hardware_PERFORMANCEANDRELIABILITY, Fault (power engineering), Error detection and correction, Fault detection and isolation
Abstract: The authors have developed an automatic system called CONDOR (concurrent error checker diagnosability analyzer) which quantitatively evaluates the effectiveness of the arrangement of error detection circuits and automatically generates a fault dictionary based on the computer logic design data file. The CONDOR diagnosis support was applied to a large-scale logic computer with more than a million gates. It designed the diagnosis facility for the large-scale computer quickly by simplifying the logic design data. The labor required for generating a fault dictionary can be reduced by 90% or more. Also, the fault-locating accuracy of the fault dictionary was increased from 97% to 100%.
Published: 2003

6. A priori execution time analysis for parallel processes

Author: W.A. Halang
Subjects: Spectrum analyzer, Programming language, Computer science, High-level programming language, A priori and a posteriori, Concurrent computing, Parallel computing, computer.software_genre, computer, Compiled language, Upper and lower bounds, Language construct, Scheduling (computing)
Abstract: A method of knowing a priori the time required by parallel processes to complete their execution is described, which allows for the automatic estimation of an upper bound for a task's execution time. The method is discussed within the framework of the high-level real-time programming language Pearl. Several language extensions are defined to enable the execution-time estimations for all language constructs. The practical implementation of the method is based on a combination of a control-flow analyzer with procedures determining the execution times of compiled code and carrying out the developed estimation rules, respectively. The importance of the method for the utilization of deadline-driven scheduling is pointed out.
Published: 2003

7. Parallel automated test pattern generation on the Connection Machine

Author: P. Mayor, Vijay Pitchumani, and V. Narayanan
Subjects: Automatic test equipment, Parallel processing (DSP implementation), Data parallelism, Computer science, Parallel algorithm, Concurrent computing, Algorithm design, SIMD, Parallel computing, Automatic test pattern generation
Abstract: The authors present an SIMD (single-instruction multiple-data) algorithm for automated test pattern generation. An effort was made to parallelize the individual steps of FAN by employing the massive parallelism of the Connection Machine. The algorithm considers one fault at a time and generates a test for it. Fine-grain parallelism is achieved by several gates within a level simultaneously doing multiple backtrace or forward simulation.
Published: 2003

8. Designing time critical systems with TACT

Author: H.S.M. Zedan and R.F. Stone
Subjects: Theoretical computer science, Syntax (programming languages), Computer science, Semantics (computer science), Programming language, occam, Tact, computer.software_genre, Small set, Simple (abstract algebra), Concurrent computing, Design paradigm, computer, computer.programming_language
Abstract: The authors propose a small set of extensions with language and run-time support, based on reliable communication and atomic actions, to the concurrent programming language OCCAM2. A description of the syntax and a formal definition of the semantics of the constructs, which are based on labeled transition systems, are given. Provisions for backward error recovery are described that allow well-established fault-tolerant strategies to be constructed. A simple example of the design paradigm is described. Some details of implementation issues are also given.
Published: 2003

9. A protocol for timed atomic commitment

Author: Susan B. Davidson, Victor Fay Wolfe, and Insup Lee
Subjects: Theoretical computer science, Correctness, Computer science, Process (engineering), Distributed computing, Clock drift, Concurrent computing, Commit, State (computer science), Outcome (game theory), Execution time, Protocol (object-oriented programming)
Abstract: A model and correctness criteria for timed atomic commitment (TAC) are presented which require the processes to be functionally consistent, but allow the outcome to include an exceptional state, indicating that timing constraints have been violated. Correct TAC behavior is defined by presenting an abstract description of the processes involved in the commitment and minimal correctness criteria for their behavior. The correctness criteria capture the intuitive notion that an exception outcome should only occur in the presence of faults, and an aborted outcome should only occur if faults occur or some process votes no. A centralized two-phase commit protocol was modified to meet the correctness criteria by introducing deadlines on the various stages the participants go through (voting and performing), and on the decision phase for the coordinator. The deadlines are derived using several system parameters: maximum message delay, clock drift, and execution time. The protocol is then shown to be correct.
Published: 2003
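
The deadline-based decision rule described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' protocol: the function names `voting_deadline` and `decide`, the shape of the `votes` mapping, and in particular the deadline formula are assumptions made for the example. The abstract only states that deadlines are derived from maximum message delay, clock drift, and execution time, without giving the derivation.

```python
def voting_deadline(max_msg_delay, exec_time, clock_drift):
    """Hypothetical bound on when all votes must have arrived.

    Assumed formula: one request message, local execution, one reply,
    stretched by the relative clock drift. The real derivation is not
    given in the abstract.
    """
    return (2 * max_msg_delay + exec_time) * (1 + clock_drift)


def decide(votes, deadline):
    """Coordinator decision for the voting phase.

    `votes` maps a participant name to (vote, arrival_time), or to None
    when no reply arrived at all. A missing or late vote is treated as a
    timing fault and yields the exceptional outcome; a timely "no" vote
    yields abort; otherwise the outcome is commit.
    """
    replies = list(votes.values())
    if any(r is None or r[1] > deadline for r in replies):
        return "exception"
    if any(r[0] == "no" for r in replies):
        return "abort"
    return "commit"
```

With `voting_deadline(10, 5, 0.1)` as the deadline, two timely "yes" votes give a commit, a timely "no" gives an abort, and a reply that arrives after the deadline (or never) gives the exceptional outcome, matching the criterion that an exception should only occur in the presence of faults.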

10. An enhanced high performance combinational fault simulator using two-way parallelism

Author: S.P. Smith
Subjects: Memory management, Parallel processing (DSP implementation), Computer science, Parallelism (grammar), Preprocessor, Concurrent computing, Fault Simulator, Parallel computing, Fault (power engineering), Fault detection and isolation
Abstract: A combinational fault simulator using two-way parallelism and a number of refinements to reduce memory usage is presented. The results show the approach to be generally superior to the basic parallel patterns single fault propagation (PPSFP) algorithm. One of the refinements, processing to encourage the sharing of fault machine indices by independent faults, appears to require substantially more processing time than it saves during simulation. The remaining preprocessing steps are comparable to those used for basic PPSFP, and are highly justified by their run-time savings. The concept of adjusting the parallelism factor to account for increasingly random-resistant faults seems to work quite well.
Published: 2003

11. Concurrent programming support for a multimanipulator experiment on RIPS

Author: Yulun Wang, S.E. Butner, and A. Mangaser
Subjects: Unix, business.industry, Computer science, Programming language, Hardware_PERFORMANCEANDRELIABILITY, Software prototyping, computer.software_genre, Robot control, Software, TheoryofComputation_ANALYSISOFALGORITHMSANDPROBLEMCOMPLEXITY, Embedded system, Synchronization (computer science), Hardware_INTEGRATEDCIRCUITS, Robot, Concurrent computing, Compiler, business, computer
Abstract: The authors discuss a concurrent programming environment and its application to a two-arm cooperative manipulation experiment on RIPS (robot instruction processing system). RIPS is a hierarchical multiprocessor architecture in which various custom and general-purpose processors are applied to a partitioning of the robot control problem. The system provides hardware support for synchronization and communication primitives, making it easier to write concurrent programs for RIPS' heterogeneous processors. The experiment demonstrated the viability of RIPS in supporting computationally intensive robot control methodologies, and as a byproduct has helped to develop a parallel programming environment for RIPS, called USE RIPS (user software environment for RIPS). By building USE RIPS on Unix and using a layered approach, it is possible to adapt or make use of various existing programs and utilities, such as the GNU and C++ compilers.
Published: 2003

12. Language constructs for timed atomic commitment

Author: Susan B. Davidson, Insup Lee, and Victor Fay Wolfe
Subjects: Computer science, Transaction processing, High-level programming language, Programming language, Concurrent computing, State (computer science), Control (linguistics), computer.software_genre, Outcome (game theory), computer, Language construct, Task (project management)
Abstract: In a large class of hard-real-time control applications, components execute concurrently on distributed nodes and must coordinate, under timing constraints, to perform the control task. As such, they perform a type of atomic commitment. In traditional atomic commitment there are no timing constraints; agreement is eventual. The authors present a definition of timed atomic commitment (TAC) which requires the processes to be functionally consistent, but allows the outcome to include an exceptional state, indicating that faults have caused timing constraints to be violated. The authors also present a high-level language construct that facilitates the use of TAC in distributed real-time programming and discuss its behavior when faults occur.
Published: 2003

13. Towards a consistent view of the design tools and process in distributed problem solving environment

Author: Nino Vidovic, Daniel P. Siewiorek, Zary Segall, and D. Vrsalovic
Subjects: Unix, Integrated design, Computer science, Programming language, Instrumentation, Problem solving environment, Workbench, Concurrent computing, Process design, Electronic design automation, Instrumentation (computer programming), computer.software_genre, computer
Abstract: A description is given of the issues encountered in generating an integrated design environment (IDE) based on the DEMETER workbench (DWB) and PIE (a parallel programming and instrumentation environment for UNIX machines). Some of the reasons for using a general integration methodology are explained. DEMETER, which supports complexity reduction, CAD tools management and manipulation, and distributed/parallel problem-solving, is presented. DWB is described. It is shown how the IDE can be built using the DWB and the basic concepts developed in PIE.
Published: 2003

14. Efficient shared memory for testing parallel algorithms on distributed systems

Author: M.S. Atkins
Subjects: Workstation, Shared memory, law, Computer science, Distributed computing, Node (networking), Parallel algorithm, Local area network, Concurrent computing, Parallel computing, Load balancing (computing), Data structure, law.invention
Abstract: A distributed data structure called a MOOSE (modifiable object structure), which is both efficient enough and general enough to be used by a wide variety of parallel algorithms, is outlined. The MOOSE structure is aimed at a loosely coupled distributed system in which several processors are connected over a local area network. It is implemented in the high-level distributed programming language SR on several Sun-2 and Sun-3 workstations running the Unix operating system and connected by an Ethernet. The MOOSE shared memory has been designed with customizable features for efficiency of implementation in such an environment. This enables the communication and computation performance of parallel algorithms on non-shared-memory hardware to be studied. If the application is run in the background on several network nodes, automatic load balancing is achieved and the programs may be tolerant of node failure during the computation.
Published: 2003

15. Broadcast-bus elimination without any loss of time efficiency in iterative (cellular or systolic) arrays

Author: Roland Vollmar, Hiroshi Umeo, and Thomas Worsch
Subjects: Very-large-scale integration, business.industry, Computer science, Parallel algorithm, Time efficiency, Concurrent computing, Parallel computing, Broadcasting, business, Application software, computer.software_genre, computer, Computer hardware
Abstract: The authors study the effects of broadcasting bus systems augmented with a mesh-connected computer. They develop a direct-proof technique for the elimination of broadcasting buses. As an application of the technique, they show that a rich variety of broadcasting bus systems on one- and two-dimensional arrays can be eliminated without any loss of time efficiency. No-time-loss elimination of broadcasting buses on one-dimensional arrays has been achieved using the technique of O.H. Ibarra et al. (1985) but, without the present technique, it would be more difficult, although not impossible, to get the same results.
Published: 2003

16. A floating communication processor architecture in a distributed real-time system

Author: Y. Muthuswamy and Kang G. Shin
Subjects: business.industry, Computer science, Node (networking), Embedded system, Distributed computing, Message passing, Concurrent computing, Fault tolerance, Data structure, business, Real-time operating system, Microarchitecture
Abstract: The issues involved in providing hardware communication support at each node in a distributed real-time system are studied. First, a general architecture for each node of the system is described. An algorithm for message handling by dedicated hardware called a communication processor (CP) is proposed, to maximize the number of requests handled under the various constraints. A floating CP architecture is proposed to maximize the utilization of the processors at a node and provide greater fault-tolerance in the system.
Published: 2003

17. Linking consistency with object/thread semantics: an approach to robust computation

Author: R. Chen and Partha Dasgupta
Subjects: Atomicity, Data consistency, Sequential consistency, Computer science, Programming language, Distributed computing, Release consistency, Consistency model, Thread (computing), computer.software_genre, Data type, Concurrency control, Concurrent computing, computer
Abstract: An object/thread based paradigm is presented that links data consistency with object/thread semantics. The paradigm can be used to achieve a wide range of consistency semantics from strict atomic transactions to standard process semantics. The paradigm supports three types of data consistency. Object programmers indicate the type of consistency desired on a per-operation basis, and the system performs automatic concurrency control and recovery management to ensure that those consistency requirements are met. This allows programmers to customize consistency and recovery on a per-application basis without having to supply complicated, custom recovery management schemes. The paradigm allows robust and nonrobust computation to operate concurrently on the same data in a well-defined manner. The operating system need support only one vehicle of computation: the thread.
Published: 2003

18. Performability analysis of parallel and distributed algorithms

Author: Hany H. Ammar, S.M.R. Islam, and Su Deng
Subjects: Computer science, Distributed algorithm, Distributed computing, Component (UML), Parallel algorithm, Stochastic Petri net, Concurrent computing, Hypercube, Parallel computing, Petri net, Supercomputer
Abstract: A generalized stochastic Petri net (GSPN) performability model of a parallel and distributed computation is developed. The performance-related activities such as computations and communications are orders of magnitude faster than the component failure and repair activities. Based on the notion of time-scale decomposition, a hierarchy of two levels is defined. At the lower level the performance submodel describes the activities in the application program, while at the higher level the component failure and repair submodel for the underlying architecture defines the current configuration of processors and communication links available for the computation. These two submodels define the reward model needed for performability analysis. Two parallel FFT (fast Fourier transform) algorithms on a hypercube architecture are presented to illustrate the above modeling technique. A general and extended reliability model of the hypercube is also developed. Various performability measures are presented to demonstrate the importance of performability evaluation for mission-critical parallel applications.
Published: 2003

19. Efficient algorithms for resource allocation in distributed and parallel query processing environments

Author: T. Masuda, Yasushi Kiyoki, and P. Liu
Subjects: Scheme (programming language), Parallel processing (DSP implementation), Distributed database, Computational complexity theory, Computer science, Distributed computing, Concurrent computing, Resource allocation, Resource management, Algorithm design, computer, computer.programming_language
Abstract: Several effective algorithms are presented for the optimal allocation of computer resources in a proposed stream-oriented parallel-processing scheme for database operations. These algorithms can be utilized to obtain the optimal allocation of memory resources for every type of query in sequential-processing environments, parallel-processing environments with shared-memory multiprocessors, and distributed-processing environments. The computation complexities of the proposed algorithms are analyzed and used to clarify the effectiveness of those algorithms.
Published: 2003

20. CHIMERA: a real-time programming environment for manipulator control

Author: R. Hoffman, Takeo Kanade, D.E. Schmitz, and Pradeep K. Khosla
Subjects: Unix, Workstation, business.industry, Computer science, Real-time computing, Modular design, Porting, law.invention, Robot control, Software, law, Embedded system, Robot, Concurrent computing, business
Abstract: CHIMERA is a real-time computing environment used in the Reconfigurable Modular Manipulator System project. CHIMERA, which is both a hardware and software environment, allows rapid development and implementation of real-time control programs. It provides a C/Unix-flavored concurrent programming environment for a Motorola 68020 multiprocessor hardware configuration connected to a Sun workstation. CHIMERA has been implemented using commercial hardware in conjunction with a sophisticated, locally developed software package, resulting in a reliable, reasonably priced, and easily duplicated system. CHIMERA is currently being ported for real-time control of the CMU Direct Drive Arm II. The authors describe the implementation and capabilities of the CHIMERA environment and illustrate how these features are used in robot control applications.
Published: 2003

21. Qlisp: parallel processing in Lisp

Author: R. Goldman and R.P. Gabriel
Subjects: Programming language, Computer science, Multiprocessing, Fexpr, Parallel computing, Data structure, computer.software_genre, Data type, Spawn (computing), Parallel processing (DSP implementation), Synchronization (computer science), Concurrent computing, Preprocessor, Common Lisp, Lisp, computer, Software, computer.programming_language
Abstract: One of the major problems in converting serial programs to take advantage of parallel processing has been the lack of a multiprocessing language that is both powerful and understandable to programmers. The authors describe multiprocessing extensions to Common Lisp designed to be suitable for studying styles of parallel programming at the medium-grain level in a shared-memory architecture. The resulting language is called Qlisp. Two features for addressing synchronization problems are included in Qlisp. The first is the concept of heavyweight futures, and the second is a novel type of function called a partially multiply invoked function. An initial implementation of Qlisp has been carried out, and various experiments performed. Results to date indicate that its performance is about as good as expected.
Published: 2003

22. A parallel architecture for large scale production systems

Author: Jaideep Srivastava, K.W. Hwang, J. Tan, J.H. Wang, and W.T. Tsai
Subjects: Job shop scheduling, Abort, Computer science, Concurrency, Distributed computing, Two-phase locking, Concurrent computing, Parallel computing, computer.software_genre, computer, Synchronization, Expert system, Scheduling (computing)
Abstract: The authors present an architecture, suitable for implementation on a shared memory multiprocessor system, in which all the phases can run in parallel. Running multiple match, execution, and select phases causes subtle synchronization problems, which if not resolved can lead to altered semantics. The proposed architecture uses a lock and interference manager and a scheduler to resolve the possible synchronization conflicts. A new lock which provides concurrency beyond the standard two-phase locking in databases is used. The conflict resolution phase has been formalized as a scheduling problem. The approach taken is conservative in the sense that the scheduler performs careful analysis (interference avoidance and abort avoidance tests) to prevent interference, abort, and blocking.
Published: 2003

23. Using checkpoints to localize the effects of faults in distributed systems

Author: L. Lin and Mustaque Ahamad
Subjects: Client–server model, Self-certifying File System, File server, Computer science, business.industry, Server, Distributed computing, Concurrent computing, Stable storage, Distributed File System, business, Shared resource, Computer network
Abstract: A checkpointing scheme can be used to ensure forward progress of a computation (program) even when failures occur. In a distributed system, many autonomous programs can execute concurrently and obtain services from a set of shared servers. In such a system, it is desirable to restrict a checkpoint or rollback operation to a single program to localize the effects of failures, even when processes of different programs communicate with servers. This can be achieved by a scheme based on message logging and consistent checkpoints when the system is deterministic. When the system (communication network or programs) is nondeterministic, the semantics of the server functions should be exploited to reduce the additional synchronization that needs to be introduced to ensure locality. The authors illustrate this by presenting efficient algorithms for a file server that do not require the logging of messages on stable storage.
Published: 2003

24. Collecting unused processing capacity: an analysis of transient distributed systems

Author: Leonard Kleinrock and W. Korfhage
Subjects: Computer science, Distributed computing, Probability density function, Parallel computing, Execution time, Power (physics), Normal distribution, Idle, Computational Theory and Mathematics, Hardware and Architecture, Signal Processing, Resource allocation, Resource allocation (computer), Concurrent computing, Transient (computer programming), Computer Science::Operating Systems, Central limit theorem
Abstract: Distributed systems having large numbers of idle computers and workstations are analyzed using a very simple model of a distributed program (a fixed amount of work) to see how the use of transient processors affects the program's service time. The probability density of the length of time it takes to finish a fixed amount of work is determined. An equation is given for the main result for an M-processor network. Simulations confirm that Brownian motion with drift is an accurate model of system performance. With large programs that run for a long time relative to the length of available and nonavailable periods, the central limit theorem applies, and the Brownian-motion-with-drift model remains good regardless of the distributions of the available and the nonavailable periods. Under these assumptions, the distribution of finishing time is very tight about its mean and well approximated by a normal distribution.
Published: 2003

25. Parallel-concurrent fault simulation

Author: D.G. Saab, J.T. Rahmeh, and Ibrahim N. Hajj
Subjects: Stuck-at fault, Computer Science::Hardware Architecture, Computer science, Group (mathematics), Logic gate, Fault coverage, Concurrent computing, Hardware_PERFORMANCEANDRELIABILITY, Parallel computing, Fault (power engineering), Computer Science::Operating Systems, Computer Science::Distributed, Parallel, and Cluster Computing, Word (computer architecture)
Abstract: A fault simulation algorithm based on the partitioning of faults into groups, with the group size equal to the number of bits in the host computer word, is presented. The fault effects of a particular group are evaluated using parallel fault simulation techniques and propagated using concurrent fault simulation techniques. The speed of the algorithm depends on the circuit and on the fault-grouping criterion. Three static grouping criteria are examined and compared in terms of speed and memory requirements. A dynamic regrouping technique is developed and is shown to improve the performance of static grouping.
Published: 2003
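
The word-sized fault grouping described in the abstract above can be illustrated with a small bit-parallel sketch. It shows only the parallel evaluation of one group, with bit i of each signal word carrying the value seen by the machine containing fault i; the concurrent propagation step and the grouping criteria from the paper are not modeled. The circuit, the 4-bit `WORD_BITS`, and all function names are assumptions chosen for the example.

```python
WORD_BITS = 4  # assumed host word size, kept tiny for illustration

# Hypothetical circuit: n1 = a AND b ; y = n1 OR c
INPUTS = ["a", "b", "c"]
GATES = [("n1", "and", "a", "b"), ("y", "or", "n1", "c")]


def inject(signal, word, group):
    """Force bit i of `word` to the stuck value if fault i sits on `signal`."""
    for i, (sig, stuck) in enumerate(group):
        if sig == signal:
            word = (word | (1 << i)) if stuck else (word & ~(1 << i))
    return word


def simulate_group(pattern, group):
    """Evaluate all faulty machines of one group with bitwise gate operations."""
    mask = (1 << len(group)) - 1
    values = {}
    for name in INPUTS:
        good_word = mask if pattern[name] else 0  # replicate the input value
        values[name] = inject(name, good_word, group)
    for out, op, x, y in GATES:
        word = values[x] & values[y] if op == "and" else values[x] | values[y]
        values[out] = inject(out, word, group)
    return values["y"]


def good_output(pattern):
    """Fault-free output, computed with ordinary 0/1 values."""
    v = dict(pattern)
    for out, op, x, y in GATES:
        v[out] = v[x] & v[y] if op == "and" else v[x] | v[y]
    return v["y"]


def detected_faults(pattern, faults):
    """Return the faults whose output differs from the fault-free output."""
    good = good_output(pattern)
    detected = []
    for start in range(0, len(faults), WORD_BITS):
        group = faults[start:start + WORD_BITS]
        mask = (1 << len(group)) - 1
        diff = simulate_group(pattern, group) ^ (mask if good else 0)
        detected += [f for i, f in enumerate(group) if (diff >> i) & 1]
    return detected
```

Each fault is written as a `(signal, stuck_value)` pair, so `("a", 0)` means signal `a` stuck at 0. Passing a fault list longer than `WORD_BITS` simply produces several groups, each simulated in one pass over the gates.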

26. The temporal specification technique for operating systems mechanisms

Author: A. Hoppe
Subjects: Processor sharing, Global information, Propositional temporal logic, Computer science, Programming language, Distributed computing, Liveness, Operating system, Concurrent computing, computer.software_genre, Semaphore, computer, Logic programming
Abstract: A technique is presented for specifying properties of concurrent programs at different levels of detail. The author models the concurrent execution of several programs by interleaved execution sequences of system states. These states can contain additional global information about programs being suspended, or about current processor allocation. Then the author uses linear, propositional temporal logic for characterizing the properties of such sequences. This enables him to specify the safety and liveness properties of concurrent programming constructs such as semaphores and communication commands of CSP. He applies the technique for specifying the functions implemented by the lowest layers of an operating system, such as processor sharing and input/output, and the low-level mechanisms used for implementing these functions, such as interrupts.
Published: 2003

27. CODE: the Computation Oriented Display Environment

Author: J.C. Browne
Subjects: MIMD, Parallel processing (DSP implementation), Computer science, Component (UML), Computation, Parallel programming model, Software construction, Parallel algorithm, Concurrent computing, Parallel computing
Abstract: The goals for the Computation Oriented Display Environment (CODE) are to provide a representation power sufficient for facile expression of a wide class of parallel algorithms while at the same time permitting compilation to reasonably efficient programs on a wide spectrum of parallel execution environments and to provide a hierarchical approach to development of parallel programs. CODE is based on a formally specified model of parallel computation which covers most conventional MIMD models of parallel computation. The model is formulated at a higher level of abstraction than conventional MIMD shared-name-space and partitioned-name-space models of parallel computation. The conceptual foundation of CODE, in particular basing the language on an abstract model of parallel computation, has led to two significant capabilities which had not been anticipated: a calculus of composition which may be exploitable for automated or semiautomated program construction and a natural basis for highly effective component reuse.
Published: 2003

28. Implementing fault-tolerant replicated objects using Psync

Author: Richard D. Schlichting, Shivakant Mishra, and Larry L. Peterson
Subjects: Set (abstract data type), Concurrency control, Broadcasting (networking), Computer science, Concurrency, Distributed computing, Concurrent computing, Fault tolerance, Protocol (object-oriented programming), Host (network)
Abstract: Psync is an IPC protocol that explicitly preserves the partial order of messages exchanged among a set of processes. A description is given of how Psync can be used to implement replicated objects in the presence of network and host failures. Unlike conventional algorithms that depend on an underlying mechanism that totally orders messages for implementing replicated objects, the authors' approach exploits the partial order provided by Psync to achieve additional concurrency.
Published: 2003

29. A design for a fault-tolerant, distributed implementation of Linda

Author: A. Xu and Barbara Liskov
Subjects: Out-of-order execution, Parallel processing (DSP implementation), Computer science, Semantics (computer science), Concurrency, Distributed computing, Concurrent computing, Tuple space, ComputerApplications_COMPUTERSINOTHERSYSTEMS, Fault tolerance, Parallel computing, Replication (computing)
Abstract: A distributed implementation of a parallel system is of interest because it can provide an economical source of concurrency, can be scaled easily to match the needs of particular computations, and can be fault-tolerant. A design is described for such an implementation for the Linda parallel programming system, in which processes share a memory called the tuple space. Fault tolerance is achieved by replication: by having more than one copy of the tuple space, some replicas can provide information when others are not accessible due to failures. The replication technique takes advantage of the semantics of Linda so that processes encounter little delay in accessing the tuple space. In addition to providing an efficient implementation for Linda, the study extends work on replication techniques by showing what can be done when semantics are taken into account.
Published: 2003
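Illustration for entry 29: the tuple-space model mentioned in the abstract can be sketched as below. This is a minimal single-copy toy, not the replicated design the paper describes. The class name TupleSpace, the use of None as a wildcard, and the non-blocking behaviour (returning None instead of waiting) are assumptions made for illustration; Linda's own in and rd operations block until a match exists.

```python
class TupleSpace:
    """Toy, single-copy tuple space (no replication, no concurrency)."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Deposit a tuple into the space."""
        self._tuples.append(tup)

    @staticmethod
    def _matches(pattern, tup):
        # None in a pattern is a wildcard (hypothetical stand-in for a formal field).
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def rd(self, pattern):
        """Return a matching tuple without removing it, or None if absent."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def in_(self, pattern):
        """Remove and return a matching tuple, or None if absent."""
        for i, tup in enumerate(self._tuples):
            if self._matches(pattern, tup):
                return self._tuples.pop(i)
        return None


space = TupleSpace()
space.out(("count", 1))
seen = space.rd(("count", None))    # reads the tuple, leaves it in place
taken = space.in_(("count", None))  # removes the tuple
```

Because rd leaves the tuple in place while in_ removes it, a replicated version has to treat the two differently, which is one place where operation semantics could matter.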

30. Performance modeling of the modified mesh-connected parallel computer

Author: C.H. Wu, V.P. Nelson, and C.J. Wang
Subjects: Wafer-scale integration, Computer science, Fast Fourier transform, Message passing, Stochastic Petri net, Overhead (computing), Concurrent computing, Parallel computing, Petri net
Abstract: A message-passing computer architecture called the modified mesh-connected parallel computer (MMCPC) is proposed and studied. The MMCPC is designed to be a general-purpose parallel architecture suitable for wafer-scale integration. Generalized stochastic Petri nets (GSPNs) are used to model the behavior of the MMCPC. The GSPN performance modeling results show a need for a new processing element (PE). A PE architecture, able to handle data processing and message passing concurrently, is proposed, and the silicon overhead is estimated in comparison with transputer-like PEs. Based on the proposed PE, optimum sizes of the MMCPC for different program structures are derived. A two-dimensional fast Fourier transform problem is used as an example to demonstrate that the MMCPC is a cost-effective performance-enhancement architecture for a real problem.
Published: 2003

31. Time-critical database scheduling: a framework for integrating real-time scheduling and concurrency control

Author: U. Dayal, Alejandro Buchmann, D. R. McCarthy, and M. Hsu
Subjects: Rate-monotonic scheduling, Job shop scheduling, Database, Computer science, Distributed computing, Dynamic priority scheduling, Round-robin scheduling, computer.software_genre, Fair-share scheduling, Scheduling (computing), Concurrency control, Two-level scheduling, Concurrent computing, computer
Abstract: A framework is presented for analysis of time-critical scheduling algorithms. The main assumptions behind real-time scheduling and concurrency control algorithms are analyzed, and a unified approach is proposed. Two main classes of schedulers are identified according to the availability of information about resource requirements and execution times: conflict-resolving schedulers resolve conflicts at run-time, and hence can only produce a sequence of operations satisfying task priorities and resource constraints; and conflict-avoiding schedulers determine resource requirements and expected execution times through offline transaction-class preanalysis and produce a complete time-critical schedule satisfying both timing and resource constraints. For the latter case, the resolution of overload is essential. Examples are given to illustrate the framework and the main classes of scheduling algorithms.
Published: 2003

32. Use of a functional programming model in fault tolerant parallel processing

Author: Richard E. Harper, G. Nagle, and M.A. Serrano
Subjects: Load management, Functional programming, Computer science, Distributed computing, Redundancy (engineering), Concurrent computing, Fault tolerance, Load balancing (computing), User interface, Fault detection and isolation
Abstract: In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper fault-tolerant parallel processor (FTPP). When used in conjunction with the FTPP's fault-detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence, and recovery. This user interface is described and its use demonstrated.
Published: 2003

33. Performance comparison of concurrency control protocols for transaction processing systems with regional locality

Author: Bruno Ciciani, Daniel M. Dias, and Philip S. Yu
Subjects: Concurrency control, Distributed database, Computer science, Transaction processing, Serializability, Distributed computing, Distributed concurrency control, Concurrent computing, Transaction processing system, computer.software_genre, computer, Replication (computing)
Abstract: An examination is made of a system structure and protocols to improve the performance and availability of a distributed transaction processing (TP) system when there is some regional locality of data reference. Several TP applications, such as reservation systems, insurance, and banking, belong to this category. While maintaining a distributed system at each region, a central system is introduced with a replication of all databases at the distributed sites. Specialized protocols can be designed to keep the copies at the distributed and centralized systems consistent without incurring the overhead and delay of generalized protocols for fully replicated databases. The authors study the advantages of this system structure and the tradeoffs between protocols for concurrency and coherency control of the duplicate copies of the databases. An approximate analytic model is used to estimate the system performance and the method is validated through simulations.
Published: 2003

34. The algorithm of a synthesis technique for concurrent systems

Author: Y. Yaw, Fuin-Law, and W.-D. Ju
Subjects: Set (abstract data type), Matrix (mathematics), Polynomial, Theoretical computer science, Parallel processing (DSP implementation), Computer science, Systems design, Concurrent computing, Process design, Petri net, Algorithm
Abstract: A synthesis technique is presented that sidesteps the complexity problems encountered in verifying concurrent systems by avoiding verification altogether. With this technique, Petri nets are used for modeling concurrent systems. A temporal matrix (T-matrix) is used to record relationships (concurrent, exclusive, serial, etc.) among processes and to detect rule violations as new generations are produced. A set of synthesis rules was developed for incrementally generating new processes without incurring logical incorrectness. The authors develop an algorithm to detect rule violations and to update the T-matrix. The complexity of the algorithm is polynomial, O(N^3), where N is the total number of pseudo-processes when the final system design is completed.
Published: 2003

35. An environment for the conversion of sequential programs into parallel forms

Author: W. Eventoff
Subjects: Diagrammatic reasoning, business.industry, Computer science, Programming language, Graph (abstract data type), Concurrent computing, Software engineering, business, Application software, computer.software_genre, computer, Graph, Visual programming language
Abstract: A new program development environment that is based on visual programming techniques and avoids the problems associated with the use of existing methods, fostering parallel thinking, is presented. The environment, which reflects the intelligent apprentice paradigm (i.e. an assistant that helps the user but depends on the user for advice), provides an integrated set of tools that allows the user both to understand the dependencies within the program readily and to manipulate the dependence graph directly to maximize the amount of parallelism that can be exploited. To enter this environment, the user simply compiles the program under study. The tools analyze the dependences within the program and present the user with a diagrammatic, hierarchical representation of the program's dependence graph, reformatted to reflect available parallelism and highlighting constructs that inhibit the parallelization process (e.g. runtime-determined array bounds, I/O operations, computed GOTOs, etc.).
Published: 2003

36. A 3D HDI ASP: a cost-effective alternative to WSI signal processors

Author: J.H. Reche, K.D. Warren, R.M. Lea, and W.J. Jacobi
Subjects: Very-large-scale integration, Signal processing, Wafer-scale integration, Computer architecture, Parallel processing (DSP implementation), Computer science, business.industry, Embedded system, String (computer science), SIGNAL (programming language), Concurrent computing, business, Massively parallel
Abstract: A research project is discussed in which various high-density interconnect (HDI) and wafer-scale integration (WSI) implementation variants of a common computer architecture, the associative string processor (ASP), are compared. The ASP is a fault-tolerant and highly versatile massively parallel processor capable of sustaining high performance over a wide range of computationally intensive tasks and, unlike most other computer architectures, the ASP has been designed to exploit state-of-the-art microelectronic technology. The study indicates that, until the feasibility of WSI ASP technology has been proven, a 3-D HDI ASP seems to offer a cost-effective alternative technology for the development of highly compact massively parallel processors for aerospace and automotive applications.
Published: 2003

37. A single chip processor architecture for video rate two-dimensional digital filtering

Author: Seong-Mo Park and W.E. Alexander
Subjects: Very-large-scale integration, Cellular architecture, Computer science, business.industry, Media processor, Concurrent computing, Throughput, business, Digital filter, Computer hardware, Dataflow architecture, Microarchitecture
Abstract: A VLSI single-chip processor architecture for real-time 2-D digital signal-processing applications is presented. This architecture extends the concept of using a single processing unit to the use of multiple processing units. The advantage of this architecture is that the complexity and the number of computations per unit of input do not increase as the size of 2-D input data increases. Thus, it can process a very large amount of 2-D data efficiently and nearly in real time. The processor architecture yields a simple and efficient system configuration. It is especially suited for a large class of digital signal-processing algorithms classified as discrete linear shift-invariant systems.
Published: 2003

38. Parallel computing with distributed shared data

Author: Meichun Hsu
Subjects: Distributed shared memory, Distributed database, Computer science, Shared disk architecture, Distributed computing, Degree of parallelism, Parallel computing, Thread (computing), Concurrency control, Memory management, Virtual address space, Concurrent computing, Data-intensive computing, Distributed memory, Resource management, Data diffusion machine
Abstract: Summary form only given. The issue of ease of using shared data in a data-intensive parallel computing environment is discussed. An approach is investigated for transparently supporting data sharing in a loosely coupled parallel computing environment, where a moderate to a large number of individual computing elements are connected via a high-bandwidth network without necessarily physically sharing memory. A system called VOYAGER is discussed which serves as the underlying system facility that supervises the distributed shared virtual memory. VOYAGER allows shared-data parallel applications to take advantage of parallel and distributed processing with relative ease. The application program merely maps the shared data onto its virtual address space, replicates itself on distributed machines, and spawns appropriate execution threads; the threads would automatically be given coordinated access to the shared data distributed in the network. Multiple computation threads migrate and populate the processors of a number of computing elements, making use of the multiple processors to achieve a high degree of parallelism. The low-level resource management chores are made available once and for all in the underlying facility VOYAGER, usable by many different data-intensive applications.
Published: 2003

39. Legal firing sequence and related problems of Petri nets

Author: Kenji Onaga, Toshimasa Watanabe, and Y. Mizobata
Subjects: Sequence, TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES, Theoretical computer science, Development (topology), Computational complexity theory, Computer science, Concurrent computing, Process architecture, Petri net, Computer Science::Formal Languages and Automata Theory
Abstract: Development of computational tools and techniques dealing with large-scale Petri nets will provide a firm foundation of Petri net theory. A discussion is presented of the computational complexity aspect of the legal firing sequence problem (LFS) and some related problems of Petri nets, each having applications to practical problems. Their NP-completeness and polynomial-time solvability are presented.
Published: 2003
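Illustration for entry 39: the notion of a legal firing sequence can be made concrete with a small checker. This sketch only verifies a sequence that is already given, using the standard Petri net firing rule; it does not address the complexity questions the abstract refers to. The function name fire_sequence and the dictionary encoding of the net (pre and post map each transition to its input and output arc weights) are assumptions made for illustration.

```python
def fire_sequence(marking, pre, post, sequence):
    """Return the final marking if `sequence` is legal from `marking`, else None.

    marking:  dict place -> token count
    pre/post: dict transition -> dict place -> arc weight (hypothetical encoding)
    """
    m = dict(marking)  # work on a copy so the caller's marking is untouched
    for t in sequence:
        # A transition is enabled only if every input place holds enough tokens.
        if any(m.get(p, 0) < w for p, w in pre[t].items()):
            return None
        for p, w in pre[t].items():
            m[p] = m.get(p, 0) - w   # consume input tokens
        for p, w in post[t].items():
            m[p] = m.get(p, 0) + w   # produce output tokens
    return m


# Two places and two transitions forming a cycle: t1 moves a token p1 -> p2, t2 moves it back.
pre = {"t1": {"p1": 1}, "t2": {"p2": 1}}
post = {"t1": {"p2": 1}, "t2": {"p1": 1}}
start = {"p1": 1, "p2": 0}

legal = fire_sequence(start, pre, post, ["t1", "t2"])  # token returns to p1
illegal = fire_sequence(start, pre, post, ["t2"])      # t2 is not enabled at the start
```

Checking one sequence takes time linear in its length; the work grows when a sequence has to be searched for rather than checked.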

40. Time complexity modeling and comparison of parallel architectures for Fourier transform oriented algorithms

Author: Veljko Milutinovic, C. Gimarc, and O.K. Ersoy
Subjects: business.industry, Computer science, symbols.namesake, Fourier transform, Parallel processing (DSP implementation), Worst-case complexity, symbols, Concurrent computing, Probabilistic analysis of algorithms, Algorithm design, business, Time complexity, Algorithm, Digital signal processing
Abstract: A technique for modeling the time-domain complexity of the implementation of an algorithm is described. The model includes algorithm-, architecture-, and technology-related parameters. The model is used here to compare architectures for various Fourier-transform-oriented algorithms; however, use of the model can point to possible changes in algorithm or architecture that will increase performance. The development of the model is discussed, and an analysis of five different Fourier-transform algorithms is given.
Published: 2003

41. A game-theoretic modeling of concurrency

Author: Yiannis N. Moschovakis
Subjects: Nondeterministic algorithm, Naturalness, Recursion, Theoretical computer science, Computer science, Asynchronous communication, Robustness (computer science), Concurrency, Concurrent computing, Game theory
Abstract: A model is introduced for asynchronous concurrent communication, where each agent's perception of the system is represented by a game of interaction. The model combines strict fair merge with full recursion, and the main mathematical results provide evidence for the robustness and naturalness of his interpretation of recursive definitions of nondeterministic processes. The approach is closest to that of D. Park (1980, 1983), whose ideas are starting points for this work.
Published: 2003

42. CSP-based object-oriented description of parallel reconfiguration architectures

Author: W.K. Fuchs and D.K. Hwang
Subjects: Object-oriented programming, Computer science, business.industry, Parallel algorithm, Control reconfiguration, Communicating sequential processes, Computer architecture, Concurrent computing, Algorithm design, business, computer, Visual programming language, Graphical user interface, computer.programming_language
Abstract: An approach to object-oriented description of reconfigurable parallel architectures based on an extended communicating sequential processes (CSP) model of communication is presented. A workbench called OODRA (object-oriented design of reconfigurable architectures) has been designed and implemented based on this approach, which is suitable for the development of highly concurrent, special-purpose, reconfigurable architectures. Intended uses of OODRA include parallel algorithm/architecture functional simulation and reconfiguration algorithm simulation, with an interactive graphical interface for parallel architecture design. Visual programming and parameterized architecture family approaches to design are supported.
Published: 2003

43. Summary of a distributed control algorithm for a dynamically reconfigurable array architecture

Author: T.S. White and F.G. Gray
Subjects: Interconnection, Wafer-scale integration, Hyperplane, Plane (geometry), Computer science, Control reconfiguration, Concurrent computing, Process control, Parallel computing, State (computer science), Computational science
Abstract: An array architecture composed of two hyperplanes is described. The execution plane is constructed using polymorphic processing elements and a programmable, switch-controlled, reconfigurable interconnection network. The control of the processing plane is accomplished by using a smaller hyperplane composed of small processing elements and a two-dimensional mesh interconnection scheme. The control plane cells have the capability of determining from a single seed the global configuration of the array, and of determining the size of the fault-free area surrounding each cell. Once the size of this area is determined, the size necessary for the executing algorithm can be determined, and the state pattern generated in an area with sufficient size. Reconfiguration elements in the control plane make it possible to have local reconfiguration around faulty cells, and to increase the fault-free area adjacent to a cell by isolating faulty cells. The discussion is limited to the local reconfiguration algorithm.
Published: 2003

44. Redundancy for yield enhancement in the 3-D computer

Author: M.W. Yung, M. J. Little, R.D. Etchells, and J.G. Nash
Subjects: Triple modular redundancy, Very-large-scale integration, business.industry, Computer science, Integrated circuit, law.invention, Vector processor, law, Redundancy (engineering), Electronic engineering, Concurrent computing, Dual modular redundancy, business, Testability, Computer hardware
Abstract: A prototype 3-D Computer which demonstrates the feasibility of the stacked wafer approach is discussed. The wafer-scale integrated circuits of the 3-D Computer have been carefully partitioned to enable redundancy to be used to ensure high yields. Redundancy approaches, implementation issues, and testability both for the schemes used in the prototype and in future generations of the 3-D Computer are discussed. The circuits of the 32 × 32 array processor used 100% interstitial redundancy using one-way connectivity. The 128 × 128 array processor now under construction, as well as a future larger array processor, use a mixture of redundancy schemes: a 50% redundancy with four-way connectivity for the array logic and 100% redundancy for the nearest-neighbor communication and control circuits. The 50% redundancy minimizes the area and test overhead for sparing while improving yields.
Published: 2003

45. Low overhead distributed diagnostic algorithms for very large multiple processor systems

Author: S.H. Hosseini
Subjects: Very-large-scale integration, Low overhead, Computer science, Distributed algorithm, Distributed computing, Small number, Control reconfiguration, Concurrent computing, System testing, Overhead (computing)
Abstract: The increasing need for the design of high-performance, highly reliable systems has led to the design of very large systems made up of hundreds of thousands of processors. The author proposes distributed algorithms for testing and reconfiguration of these systems. In these algorithms the number of tests and the amount of testing message overhead are reduced by making testing assignment dynamic. Initially a small number of processors, ideally one, is assigned to test every processor, and when some of the processors or communication channels fail, a new testing assignment is made to assign again a small number of testers to every processor.
Published: 2003

46. An object-method programming language for data parallel computation

Author: Pearl Y. Wang, Stephen B. Seidman, Michael D. Rice, and T.E. Gerasch
Subjects: Object-oriented programming, Theoretical computer science, Programming language, Computer science, computer.software_genre, Object (computer science), Data structure, Set (abstract data type), High-level programming language, Concurrent computing, Object type, SIMD, Programmer, computer
Abstract: DAPL is a data-parallel programming language that allows the programmer to define geometric organizations of virtual processors, called objects, that are machine-independent. These organizations can be built up from members of a collection of fundamental geometric types provided by the language. Each fundamental type has a set of associated primitives that may be invoked for data movement within objects. Alternatively, object types can be defined that have nonregular data communication patterns, and objects or virtual processors can be allocated dynamically. Information can also be transferred between objects. Typical SIMD operations such as broadcasting, reduction, processor selection, data aggregation, and parallel input/output are supported by DAPL. Several application programs are presented to illustrate the flexibility and power of the language.
Published: 2003

47. A general performance modeling technique for degradable computer communication networks with blocking

Author: J. Hsieh and Donald R. Ucci
Subjects: Computer science, Concurrency, Node (networking), Distributed computing, Stochastic Petri net, Local area network, Concurrent computing, Throughput, Petri net, Blocking (statistics), Telecommunications network
Abstract: A new model is presented for evaluating the performance of a degradable computer communication network with blocking, wherein network nodes are modeled with the generalized stochastic Petri net (GSPN) method. The authors introduce into the GSPN a concept called the blocking arc to enable proper simulation of the blocking phenomenon. The advantages of GSPN, such as concurrency and parallelism, are exploited. Performance analyses are given for a degradable five-server blocking system which exactly match and extend previous results.
Published: 2003

48. A distributed, real-time programming language for robotics

Author: G. Pocock
Subjects: Object-oriented programming, Programming language, Computer science, business.industry, Robotics, computer.software_genre, Very high-level programming language, Schema (genetic algorithms), High-level programming language, Robot, Concurrent computing, Artificial intelligence, First-generation programming language, business, computer
Abstract: The author describes the Robot Schema Programming language (RSPL), a concurrent, object-oriented, real-time language designed for sensory-based robotics. The language is based on a formal model of computation originally developed by D. Lyons and M.A. Arbib (1989) but extended to accommodate real-time requirements. The major features of the RSPL are highlighted. The schema definition and the system function calls are examined.
Published: 2003

49. Transaction synchronization in distributed shared virtual memory systems

Author: M. Hsu and V.-O. Tam
Subjects: Memory coherence, Distributed shared memory, Memory management, Computer science, Transaction processing, Distributed computing, Synchronization (computer science), Concurrent computing, Data synchronization, Distributed memory, Data diffusion machine, Database transaction, Synchronization
Abstract: Synchronization in DSVM (distributed shared virtual memory) can be approached top-down by first understanding the synchronization needs at the process level instead of only at the memory access level. The authors demonstrate this idea in the context of transaction synchronization, devising two-phase locking-based algorithms under two DSVM scenarios: with and without an underlying memory coherence system. They compare the performances of the two algorithms and argue that significant performance gain can potentially result from bypassing memory coherence and supporting process synchronization directly on distributed memory. They also study the role of the optimistic algorithms in transaction synchronization in DSVM and show that some optimistic policy appears promising under the scenarios studied.
Published: 2003
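Illustration for entry 49: the two-phase locking rule on which the abstract's algorithms are based can be sketched as below. This is generic textbook two-phase locking with exclusive locks only, run in a single thread over a plain dictionary; it is not the DSVM algorithms from the paper. The class name TwoPhaseTransaction and the shared lock_table dictionary are assumptions made for illustration.

```python
class TwoPhaseTransaction:
    """Generic two-phase locking: all locks are acquired before any is released."""

    def __init__(self, lock_table, name):
        self.lock_table = lock_table  # shared dict: item -> name of owning transaction
        self.name = name
        self.held = set()
        self.shrinking = False        # becomes True after the first unlock

    def lock(self, item):
        """Try to take an exclusive lock; return False on conflict."""
        if self.shrinking:
            raise RuntimeError("two-phase rule violated: lock requested after an unlock")
        owner = self.lock_table.get(item)
        if owner not in (None, self.name):
            return False              # held by another transaction; caller waits or aborts
        self.lock_table[item] = self.name
        self.held.add(item)
        return True

    def unlock(self, item):
        """Release a lock and enter the shrinking phase."""
        self.shrinking = True
        if item in self.held:
            self.held.discard(item)
            del self.lock_table[item]


table = {}
t1 = TwoPhaseTransaction(table, "T1")
t2 = TwoPhaseTransaction(table, "T2")
first = t1.lock("x")     # T1 acquires x
blocked = t2.lock("x")   # T2 conflicts with T1
t1.unlock("x")           # T1 enters its shrinking phase
retry = t2.lock("x")     # x is free again, so T2 succeeds
```

Once a transaction has released any lock it may not acquire another, which is the property that gives two-phase locking its serializability guarantee.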

50. VLSI pipelined trees and pyramids for image processing

Author: A. Antola and R. Negrini
Subjects: Very-large-scale integration, Parallel processing (DSP implementation), Pixel, Computer science, law, Parallelism (grammar), Concurrent computing, Commutator (electric), Image processing, Parallel computing, Image resolution, law.invention
Abstract: The authors consider architectures directly implementing algorithms for real-time image processing (low-level processing or image coding). Well-known real-time architectures, capable of generating and processing pyramids and of granting the required performance by using fine-grained parallelism, adopt large quantities of mesh- or pyramid-connected small processing elements (PEs), each PE executing the same basic algorithmic steps. New architectures are presented that adopt one-dimension pipelines, constituted by a linear array of stages, each stage consisting of two cascaded modules: one PE and one commutator module (to modify the ordering of pixel data flowing from stage to stage). Compared with mesh- and pyramid-connected structures, these pipelines are easier to implement, and techniques for overcoming production defects or failures can be applied in simpler and more reliable ways.
Published: 2003


1,025 results on '"Concurrent computing"'
