91 results for "Serial computer"
Search Results
2. Goal
- Author
-
Haken, Hermann, editor
- Published
- 2004
- Full Text
- View/download PDF
3. Networks
- Author
-
Mender, Donald
- Published
- 1994
- Full Text
- View/download PDF
4. Characteristics, Predictors And Prognostic Value Of Coronary Artery Plaque Progression Using Serial Computer Tomography Imaging
- Author
-
B Merkely, Borbála Vattay, Pál Maurovich-Horvat, Bálint Szilveszter, Judit Simon, Melinda Boussoussou, and Márton Kolossváry
- Subjects
medicine.medical_specialty, medicine.anatomical_structure, business.industry, Plaque progression, Medicine, Radiology, Nuclear Medicine and imaging, Radiology, Tomography, Cardiology and Cardiovascular Medicine, business, Value (mathematics), Serial computer, Artery
- Published
- 2020
- Full Text
- View/download PDF
5. serial computer
- Author
-
Weik, Martin H.
- Published
- 2001
- Full Text
- View/download PDF
6. Parallel peridynamics–SPH simulation of explosion induced soil fragmentation by using OpenMP
- Author
-
Houfu Fan and Shaofan Li
- Subjects
Fluid Flow and Transfer Processes, Scheme (programming language), Numerical Analysis, Peridynamics, Computer science, Computation, Computational Mechanics, Parallel algorithm, 02 engineering and technology, Parallel computing, 01 natural sciences, Market fragmentation, 010101 applied mathematics, Smoothed-particle hydrodynamics, Computational Mathematics, 020303 mechanical engineering & transports, 0203 mechanical engineering, Modeling and Simulation, State (computer science), 0101 mathematics, computer, Civil and Structural Engineering, Serial computer, computer.programming_language
- Abstract
In this work, we use OpenMP-based shared-memory parallel programming to implement the recently developed coupling method of state-based peridynamics and smoothed particle hydrodynamics (PD-SPH), and we then employ the program to simulate dynamic soil fragmentation induced by the explosion of buried explosives. The paper offers a detailed technical description and discussion of the PD-SPH coupling algorithm and of how to use OpenMP shared-memory programming to implement such large-scale computation in a desktop environment, with an example to illustrate the basic computing principle and the parallel algorithm structure. In particular, the paper provides a complete OpenMP parallel algorithm for the PD-SPH scheme with the programming and parallelization details. Numerical examples of soil fragmentation caused by buried explosives are also presented. Results show that the simulation carried out by the OpenMP parallel code is much faster than that by the corresponding serial computer code.
- Published
- 2016
- Full Text
- View/download PDF
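The OpenMP work-sharing idea in entry 6's abstract can be sketched language-neutrally: split the outer loop of a pairwise particle interaction across workers. The following Python fragment is illustrative only (the 1/r interaction and all names are assumptions, not from the paper), and Python threads merely mirror the decomposition of an OpenMP `parallel for`; CPython's GIL prevents a real CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def pairwise_potential(positions):
    """Serial reference: sum of 1/r over all particle pairs (1-D positions)."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            total += 1.0 / abs(positions[i] - positions[j])
    return total

def pairwise_potential_parallel(positions, workers=4):
    """Work-shared version: each worker handles a subset of outer indices,
    mirroring an OpenMP 'parallel for' over i."""
    n = len(positions)

    def chunk(rows):
        s = 0.0
        for i in rows:
            for j in range(i + 1, n):
                s += 1.0 / abs(positions[i] - positions[j])
        return s

    rows = list(range(n))
    chunks = [rows[k::workers] for k in range(workers)]  # interleaved for load balance
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return sum(ex.map(chunk, chunks))
```

The interleaved assignment of outer indices is one common way to balance the triangular loop's uneven per-row work, analogous to OpenMP's cyclic `schedule(static, 1)`.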
7. Comparative Study of the Parallelization of the Smith-Waterman Algorithm on OpenMP and Cuda C
- Author
-
Oumarou Sie and Amadou Chaibou
- Subjects
Smith–Waterman algorithm, CUDA, Matrix (mathematics), Parallelizable manifold, Computer science, Structure (category theory), Parallel computing, ENCODE, Representation (mathematics), Computational science, Serial computer
- Abstract
In this paper, we present parallel programming approaches to calculate the values of the cells in the scoring matrix used in the Smith-Waterman algorithm for sequence alignment. This algorithm, well known in bioinformatics for its applications, is unfortunately time-consuming on a serial computer. We use a formulation based on the anti-diagonal structure of the data. This representation focuses on the parallelizable parts of the algorithm without changing its initial formulation. Approaching the data in this way gives us a more flexible formulation. To examine this approach, we encode it in OpenMP and Cuda C. The performance obtained shows the value of our approach.
- Published
- 2015
- Full Text
- View/download PDF
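The anti-diagonal formulation in entry 7's abstract works because every cell with index sum i + j = d depends only on cells from diagonals d-1 and d-2, so all cells on one anti-diagonal can be computed in parallel. A minimal serial Python sketch of that traversal (simple match/mismatch/gap scores are assumed here; the paper's exact parameters are not given):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Fill the Smith-Waterman local-alignment scoring matrix by anti-diagonals
    and return the best score. Cells on diagonal d = i + j depend only on
    diagonals d-1 and d-2, so the inner loop over i is the parallelizable part."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for d in range(2, m + n + 1):                      # anti-diagonal index i + j
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                           # local alignment floor
                          H[i - 1][j - 1] + s,         # diagonal (match/mismatch)
                          H[i - 1][j] + gap,           # gap in b
                          H[i][j - 1] + gap)           # gap in a
            best = max(best, H[i][j])
    return best
```

In an OpenMP or CUDA version, the loop over i within each diagonal becomes the parallel loop; the outer loop over d stays sequential.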
8. Parallelizing Affinity Propagation Using Graphics Processing Units for Spatial Cluster Analysis over Big Geospatial Data
- Author
-
Xuan Shi
- Subjects
Multi-core processor, Geospatial analysis, Computer science, 05 social sciences, 0211 other engineering and technologies, 0507 social and economic geography, Graphics processing unit, 02 engineering and technology, Parallel computing, computer.software_genre, Bottleneck, Article, Scalability, Affinity propagation, Graphics, 050703 geography, computer, 021101 geological & geomatics engineering, Serial computer
- Abstract
Introduced in 2007, affinity propagation (AP) is a relatively new machine learning algorithm for unsupervised classification that has seldom been applied in geospatial applications. One bottleneck is that AP can hardly handle large data sets, and a serial computer program would take a long time to complete an AP calculation. New multicore and manycore computer architectures, combined with application accelerators, show promise for achieving scalable geocomputation by exploiting task and data levels of parallelism. This chapter introduces our recent progress in parallelizing the AP algorithm on a graphics processing unit (GPU) for spatial cluster analysis, the potential of the proposed solution to process big geospatial data, and its broader impact for the GIScience community.
- Published
- 2017
9. Identification and robustness analysis of nonlinear multi-stage enzyme-catalytic dynamical system in batch culture
- Author
-
Xu Zhang, Zhilong Xiu, Hongchao Yin, Bing Tan, Jinlong Yuan, Xi Zhu, and Enmin Feng
- Subjects
Parameter identification problem, Multi stage, Equilibrium point, Computational Mathematics, Nonlinear system, Mathematical optimization, Control theory, Approximation error, Computer science, Applied Mathematics, Computation, Serial computer, Stable state
- Abstract
In this paper, based on biological phenomena of different characters at different stages, we propose a nonlinear multi-stage enzyme-catalytic dynamic system with unknown time and system parameters. Such a system starts at different initial conditions for formulating the batch culture process of glycerol bio-dissimilation to 1,3-propanediol. Some properties of the nonlinear system are discussed. In view of the difficulty in accurately measuring the concentration of intracellular substances and the absence of equilibrium points for the nonlinear system, we quantitatively define biological robustness for the entire process of batch culture instead of one for the approximately stable state of continuous culture. Taking the biological robustness of the intracellular substances together with the relative error between the experimental data and the computational values of the extracellular substances as the cost function, we formulate an identification problem subject to the nonlinear system, continuous state inequality constraints and parameter constraints. An analytical solution to the system is not available; therefore, the huge number of numerical computations required by the proposed system and the proposed biological robustness measure makes solving the identification problem on a serial computer a very complicated task. To improve computational efficiency, we develop an effective parallelized optimization algorithm, based on the constraint transcription and smoothing approximation techniques, for seeking the optimal time and system parameters. Compared with previous work, we assert that the optimal time and system parameters together with the corresponding nonlinear multi-stage dynamical system can reasonably describe batch fermentation at different initial conditions.
- Published
- 2014
- Full Text
- View/download PDF
10. Massively Parallel Multi-Computer Hardware/Software Structures for Learning
- Author
-
Uhr, L. and Haken, Hermann, editor
- Published
- 1985
- Full Text
- View/download PDF
11. Process-structured architectures to transform information flowing through
- Author
-
Uhr, Leonard, Goos, G., editor, Hartmanis, J., editor, Barstow, D., editor, Brauer, W., editor, Brinch Hansen, P., editor, Gries, D., editor, Luckham, D., editor, Moler, C., editor, Pnueli, A., editor, Seegmüller, G., editor, Stoer, J., editor, Wirth, N., editor, Wolf, Gottfried, editor, Legendi, Tamás, editor, and Schendel, Udo, editor
- Published
- 1989
- Full Text
- View/download PDF
12. Time of searching for similar binary vectors in associative memory
- Author
-
Alexander A. Frolov, Dušan Húsek, and D. A. Rachkovskii
- Subjects
Hopfield network, Theoretical computer science, General Computer Science, Computer science, Hash function, Entropy (information theory), Binary number, Bidirectional associative memory, Content-addressable memory, Algorithm, Associative property, Serial computer
- Abstract
Times of searching for similar binary vectors in neural-net and traditional associative memories are investigated and compared. The neural-net approach is demonstrated to surpass the traditional ones, even when implemented on a serial computer, when the entropy of the space of signals is of the order of several hundred and the number of stored vectors is vastly larger than the entropy.
- Published
- 2006
- Full Text
- View/download PDF
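The "traditional" baseline that entry 12 compares against is, in its simplest form, a linear scan of the stored vectors by Hamming distance. A minimal Python sketch of that baseline (the paper's neural-net method is not reproduced here; the representation as tuples of 0/1 is an assumption):

```python
def hamming(u, v):
    """Hamming distance between two equal-length binary vectors."""
    return sum(x != y for x, y in zip(u, v))

def nearest_stored(query, memory):
    """Traditional associative lookup: linear scan for the stored vector
    closest to the query in Hamming distance -- one full comparison per
    stored vector, the serial cost the neural-net approach competes with."""
    return min(memory, key=lambda v: hamming(query, v))
```

The scan costs O(M·L) bit comparisons for M stored vectors of length L, which is the figure of merit the entry's timing comparison is about.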
13. Configuration and Performance of a Beowulf Cluster for Large-Scale Scientific Simulations
- Author
-
Matthias K. Gobbert
- Subjects
Configuration management, Partial differential equation, General Computer Science, Computer science, Computer cluster, Numerical analysis, General Engineering, Cluster (physics), Disk storage, Parallel computing, Finite element method, Serial computer, Computational science
- Abstract
To achieve optimal performance on a Beowulf cluster for large-scale scientific simulations, it's necessary to combine the right numerical method with its efficient implementation to exploit the cluster's critical high-performance components. This process is demonstrated using a simple but prototypical problem of solving a time-dependent partial differential equation. Beowulf clusters in virtually every price range are readily available today for purchase in fully integrated form from a large variety of vendors. At the University of Maryland, Baltimore County (UMBC), a medium-sized 64-processor cluster with high-performance interconnect and extended disk storage was bought from IBM. The cluster has several critical components, and this article demonstrates their roles using a prototype problem from the numerical solution of time-dependent partial differential equations (PDEs). The problem was selected to show how judiciously combining a numerical algorithm and its efficient implementation with the right hardware (in this case, the Beowulf cluster) can achieve parallel computing's two fundamental goals: to solve problems faster and to solve larger problems than we can on a serial computer.
- Published
- 2005
- Full Text
- View/download PDF
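Entry 13's prototype problem is a time-dependent PDE solved by time stepping. As a hedged illustration of the kind of kernel such a cluster benchmark repeats many times (the article's actual equation and discretization are not given here), one explicit finite-difference step for the 1-D heat equation:

```python
def heat_step(u, alpha=0.1):
    """One explicit finite-difference step for the 1-D heat equation
    u_t = u_xx with fixed (Dirichlet) boundary values.
    alpha = dt/dx^2 must be <= 0.5 for this scheme to be stable."""
    return [u[0]] + [
        u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])   # second-difference update
        for i in range(1, len(u) - 1)
    ] + [u[-1]]
```

On a cluster, the spatial grid is partitioned across processors and only the chunk boundaries ("ghost cells") are exchanged each step, which is why interconnect latency is one of the critical components the article discusses.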
14. A New Approximation for 3D Electromagnetic Scattering in the Presence of Anisotropic Conductive Media
- Author
-
Sheng Fang, Guozhong Gao, and Carlos Torres-Verdín
- Subjects
Physics, Smoothness, business.industry, Scattering, General Engineering, CPU time, Integral equation, Computational physics, Optics, Computer Science::Multimedia, Benchmark (computing), business, Anisotropy, Electrical conductor, Serial computer
- Abstract
Accurate and efficient modeling of three-dimensional (3D) electromagnetic (EM) scattering remains an open challenge in the presence of anisotropic conductive media. Numerical algorithms used to simulate the response of dipping and anisotropic rock formations can easily exceed standard computer resources as EM fields become fully coupled in general. In the past, several scattering approximations have been developed to efficiently simulate complex EM problems arising in the probing of subsurface rock formations. These approximations include the Born, Rytov, Extended Born (ExBorn) and quasi-linear (QL) methods, among others. However, so far none of these approximations have been adapted to simulate scattering in the presence of anisotropic conductive media. In this paper, we describe and benchmark a novel EM scattering approximation that remains accurate and efficient in the presence of 3D anisotropic conductive media. The approximation is based on the integral formulation of EM scattering and takes advantage of the spatial smoothness and general vectorial properties of EM fields internal to scatterers. A general vectorial formulation is used to properly account for complex EM coupling due to anisotropy. Several numerical examples borrowed from borehole induction logging are used to describe and assess the accuracy and efficiency of the new EM scattering approximation. The approximation allows one to accurately simulate the EM response of more than 1 million cells within a few minutes of CPU time on a serial computer with standard memory and speed resources.
- Published
- 2003
- Full Text
- View/download PDF
15. Evaluation of the Length and Isometric Pattern of the Anterolateral Ligament With Serial Computer Tomography
- Author
-
Paulo Victor Partezani Helito, Marco Kawamura Demange, José Ricardo Pécora, Roberto Freire da Mota e Albuquerque, Marcelo Bordalo-Rodrigues, Marcelo Batista Bonadio, Gilberto Luis Camanho, and Camilo Partezani Helito
- Subjects
Anterolateral ligament, medicine.medical_specialty, rotatory instability, anatomy, business.industry, Pivot shift, anterolateral ligament, Isometric exercise, Anatomy, tomography, musculoskeletal system, Surgery, medicine.anatomical_structure, Rotatory instability, medicine, Orthopedics and Sports Medicine, Tomography, business, Serial computer
- Abstract
Background: Recent anatomical studies have identified the anterolateral ligament (ALL). Injury to this structure may lead to the presence of residual pivot shift in some reconstructions of the anterior cruciate ligament. The behavior of the length of this structure and its tension during range of motion has not been established and is essential when planning reconstruction. Purpose: To establish differences in the ALL length during range of knee motion. Study Design: Descriptive laboratory study. Methods: Ten unpaired cadavers were dissected. The attachments of the ALL were isolated. Its origin and insertion were marked with a 2 mm–diameter metallic sphere. Computed tomography scans were performed on the dissected parts under extension and 30°, 60°, and 90° of flexion; measurements of the distance between the 2 markers were taken at all mentioned degrees of flexion. The distances between the points were compared. Results: The mean ALL length increased with knee flexion. Its mean length at full extension and at 30°, 60°, and 90° of flexion was 37.9 ± 5.3, 39.3 ± 5.4, 40.9 ± 5.4, and 44.1 ± 6.4 mm, respectively. The mean increase in length from 0° to 30° was 3.99% ± 4.7%, from 30° to 60° was 4.20% ± 3.2%, and from 60° to 90° was 7.45% ± 4.8%. From full extension to 90° of flexion, the ligament length increased on average 16.7% ± 12.1%. From 60° to 90° of flexion, there was a significantly higher increase in the mean distance between the points compared with the flexion from 0° to 30° and from 30° to 60°. Conclusion: The ALL shows no isometric behavior during the range of motion of the knee. The ALL increases in length from full extension to 90° of flexion by 16.7%, on average. The increase in length was greater from 60° to 90° than from 0° to 30° and from 30° to 60°. The increase in length at higher degrees of flexion suggests greater tension with increasing flexion. 
Clinical Relevance: Knowledge of ALL behavior during the range of motion of the knee will allow for fixation (during its reconstruction) to be performed with a higher or lower tension, depending on the chosen degree of flexion.
- Published
- 2014
16. Stochastic optimization models in forest planning: a progressive hedging solution approach
- Author
-
Roger J.-B. Wets, Fernando Badilla Veliz, Jean-Paul Watson, Andrés Weintraub, and David L. Woodruff
- Subjects
Forest planning, Mathematical optimization, Road construction, Theory of computation, Economics, General Decision Sciences, Stochastic optimization, Decomposition method (constraint satisfaction), Management Science and Operations Research, Stochastic programming, Serial computer, Extensive-form game
- Abstract
We consider the important problem of medium term forest planning with an integrated approach considering both harvesting and road construction decisions in the presence of uncertainty modeled as a multi-stage problem. We give strengthening methods that enable the solution of problems with many more scenarios than previously reported in the literature. Furthermore, we demonstrate that a scenario-based decomposition method (Progressive Hedging) is competitive with direct solution of the extensive form, even on a serial computer. Computational results based on a real-world example are presented.
- Published
- 2014
- Full Text
- View/download PDF
17. A Higher-Order Compact Method in Space and Time Based on Parallel Implementation of the Thomas Algorithm
- Author
-
Philip J. Morris and Alex Povitsky
- Subjects
Numerical Analysis, Schedule, Physics and Astronomy (miscellaneous), Computer science, Applied Mathematics, Computation, Tridiagonal matrix algorithm, Computer Science Applications, Numerical integration, Computational Mathematics, symbols.namesake, Gaussian elimination, Ramer–Douglas–Peucker algorithm, Modeling and Simulation, symbols, Time domain, Algorithm, Serial computer
- Abstract
In this study we propose a novel method to parallelize high-order compact numerical algorithms for the solution of three-dimensional PDEs in a space-time domain. For such a numerical integration most of the computer time is spent in computation of spatial derivatives at each stage of the Runge-Kutta temporal update. The most efficient direct method to compute spatial derivatives on a serial computer is a version of Gaussian elimination for narrow linear banded systems known as the Thomas algorithm. In a straightforward pipelined implementation of the Thomas algorithm, processors are idle due to the forward and backward recurrences of the algorithm. To utilize processors during this time, we propose to use them for either nonlocal data-independent computations, solving lines in the next spatial direction, or local data-dependent computations by the Runge-Kutta method. To achieve this goal, control of processor communication and computations by a static schedule is adopted. Thus, our parallel code is driven by a communication and computation schedule instead of the usual "creative programming" approach. The obtained parallelization speed-up of the novel algorithm is about twice as much as that for the basic pipelined algorithm and close to that for the explicit DRP algorithm. Use of the algorithm is demonstrated and comparisons with other schemes are given.
- Published
- 2000
- Full Text
- View/download PDF
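The serial Thomas algorithm referenced in entry 17 consists of a forward elimination sweep followed by back substitution; both are recurrences, which is exactly why a naive pipelined parallelization leaves processors idle. A standard serial sketch in Python (array conventions are an assumption, not from the paper):

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.
    a: sub-diagonal (len n, a[0] unused), b: diagonal (len n),
    c: super-diagonal (len n, c[-1] unused), d: right-hand side.
    Both sweeps are inherently sequential recurrences, which is what
    the pipelined parallelization has to work around."""
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                         # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):                # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

Each sweep runs in O(n), but step i needs the result of step i-1, so the only parallelism available is across independent lines of the 3-D grid, as the abstract describes.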
18. Optimal Decomposition of the Domain in Spectral Methods for Wave-Like Phenomena
- Author
-
Carl Erik Wasberg and David Gottlieb
- Subjects
Quantitative Biology::Biomolecules ,ComputerSystemsOrganization_COMPUTERSYSTEMIMPLEMENTATION ,Mathematics::Complex Variables ,Applied Mathematics ,Mathematical analysis ,MathematicsofComputing_NUMERICALANALYSIS ,Domain decomposition methods ,ComputerSystemsOrganization_PROCESSORARCHITECTURES ,Grid ,Computer Science::Numerical Analysis ,Upper and lower bounds ,Mathematics::Numerical Analysis ,Computational Mathematics ,Error function ,Function approximation ,Collocation method ,Applied mathematics ,Spectral method ,Serial computer ,Mathematics - Abstract
A strategy for determining the optimal number of grid points and subdomains in a spectral method with domain decomposition on a serial computer is presented. The rapidly growing computational cost for large numbers of grid points in each subdomain is balanced against the exponential convergence for spectral approximation of smooth functions, and the optimum is found as the number of grid points and subdomains that gives the minimal computational cost for a given accuracy. The typical length scale of the problem is found to influence the number of subdomains but not the number of grid points within each subdomain.
- Published
- 2000
- Full Text
- View/download PDF
19. Fast parallel algorithms for vandermonde determinants
- Author
-
Lei Li and Tadao Nakamura
- Subjects
Algebra, Computational Theory and Mathematics, Applied Mathematics, Parallel algorithm, SIMD, Arithmetic, Type (model theory), Supercomputer, Vandermonde matrix, Computer Science Applications, Serial computer, Mathematics
- Abstract
We know that evaluating an n×n Vandermonde determinant usually needs O(n²) arithmetic operations. This paper presents a few fast algorithms for Vandermonde determinants, confluent Vandermonde determinants, and generalized Vandermonde determinants. These algorithms need only a reduced number of arithmetic operations on a serial computer, or at most a small number of parallel steps on a SIMD-type supercomputer with n processors.
- Published
- 2000
- Full Text
- View/download PDF
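For context on entry 19: the Vandermonde determinant has the closed form det V = ∏_{i<j} (x_j − x_i), which already takes O(n²) multiplications on a serial computer; the paper's faster algorithms improve on this and are not reproduced here. A minimal Python sketch of the baseline:

```python
from math import prod

def vandermonde_det(xs):
    """Vandermonde determinant via the closed-form product
    prod over i < j of (x_j - x_i): O(n^2) arithmetic operations,
    the serial baseline the paper's fast algorithms improve on."""
    n = len(xs)
    return prod(xs[j] - xs[i] for i in range(n) for j in range(i + 1, n))
```

For xs = [1, 2, 4], the product is (2-1)(4-1)(4-2) = 6.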
20. Majority-Vote Cellular Automata, Ising Dynamics, and P-Completeness
- Author
-
Cristopher Moore
- Subjects
Discrete mathematics, Majority rule, Statistical Mechanics (cond-mat.stat-mech), Computer science, Cellular Automata and Lattice Gases (nlin.CG), Boolean circuit, FOS: Physical sciences, Statistical and Nonlinear Physics, Nonlinear Sciences - Adaptation and Self-Organizing Systems, Cellular automaton, Ising model, State (computer science), Completeness (statistics), Adaptation and Self-Organizing Systems (nlin.AO), Nonlinear Sciences - Cellular Automata and Lattice Gases, Time complexity, Condensed Matter - Statistical Mechanics, Mathematical Physics, Serial computer
- Abstract
We study cellular automata where the state at each site is decided by a majority vote of the sites in its neighborhood. These are equivalent, for a restricted set of initial conditions, to non-zero probability transitions in single spin-flip dynamics of the Ising model at zero temperature. We show that in three or more dimensions these systems can simulate Boolean circuits of AND and OR gates, and are therefore P-complete. That is, predicting their state t time-steps in the future is at least as hard as any other problem that takes polynomial time on a serial computer. Therefore, unless a widely believed conjecture in computer science is false, it is impossible even with parallel computation to predict majority-vote cellular automata, or zero-temperature single spin-flip Ising dynamics, qualitatively faster than by explicit simulation.
- Published
- 1997
- Full Text
- View/download PDF
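The explicit simulation that entry 20's abstract says cannot be qualitatively beaten is just the synchronous update rule itself. A minimal 2-D Python sketch (the von Neumann neighborhood and boundary clipping are illustrative assumptions; the paper considers general dimensions and neighborhoods):

```python
def majority_vote_step(grid):
    """One synchronous update of a majority-vote cellular automaton on a
    2-D 0/1 grid. Each cell takes the majority value over itself and its
    four orthogonal neighbors, clipped at the boundary."""
    rows, cols = len(grid), len(grid[0])

    def vote(r, c):
        cells = [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        vals = [grid[i][j] for i, j in cells if 0 <= i < rows and 0 <= j < cols]
        return 1 if sum(vals) * 2 > len(vals) else 0

    return [[vote(r, c) for c in range(cols)] for r in range(rows)]
```

A uniform grid is a fixed point, and an isolated dissenting cell flips to the majority in one step.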
21. Parallelization in a spatially explicit individual-based ecological model—1. Spatial data interpolation
- Author
-
Michael W. Berry, John C. Dempsey, E. Jane Comiskey, Catherine A. Abbott, Hang-Kwang Luh, and Louis J. Gross
- Subjects
MIMD, Parallel processing (DSP implementation), Computer science, Parallel algorithm, Bilinear interpolation, Parallel computing, SIMD, Computers in Earth Sciences, Spatial analysis, Information Systems, Serial computer, Interpolation
- Abstract
In this paper, we compare parallel implementations for spatial data interpolation on two parallel computers, MasPar MP-2 and CM-5, with the sequential implementations on a serial computer, SUN SPARCStation 20/61. Performance statistics indicate that: (1) parallel implementations can attain significant speed improvements over sequential implementations in a long-term simulation (above 100 time steps); (2) a data parallel algorithm using the local map method performs better than the partitioned map method on the MasPar MP-2, but there is no significant difference between them on the CM-5; and (3) for the particular problem of spatial data interpolation, the SIMD parallel computer (the MasPar MP-2 with 4096 processor elements) performs better than the MIMD parallel computer (the CM-5 with 32 parallel processing nodes).
- Published
- 1997
- Full Text
- View/download PDF
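Bilinear interpolation, one of entry 21's subject terms, is a natural data-parallel kernel: every interpolated point depends only on its own grid cell, so points can be distributed freely across processors. A minimal per-cell Python sketch (the function name and argument layout are illustrative, not from the paper):

```python
def bilinear(f00, f10, f01, f11, tx, ty):
    """Bilinear interpolation inside one grid cell: corner values
    f00 (lower-left), f10, f01, f11 (upper-right), with fractional
    offsets tx, ty in [0, 1]. Each interpolated point is independent
    of all others, which is why the problem maps well onto SIMD machines."""
    a = f00 * (1 - tx) + f10 * tx      # interpolate along x on the bottom edge
    b = f01 * (1 - tx) + f11 * tx      # interpolate along x on the top edge
    return a * (1 - ty) + b * ty       # then interpolate along y
```

At the corners (tx, ty ∈ {0, 1}) the formula reproduces the corner values exactly.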
22. Cataloging Computer Files as Serials
- Author
-
Elizabeth Allerton and Cathy Kellum
- Subjects
World Wide Web, Computer science, Interpretation (philosophy), Computer file, Cataloging, Library and Information Sciences, Serial computer
- Abstract
Due to the nature of serials, cataloging machine-readable records that address the "serial" aspect of any material can be challenging as well as frustrating. The cataloging of computer files also presents unique difficulties. A serials cataloger discusses the advantages of treating computer files as serials for cataloging and processing purposes, and outlines methodologies for doing so. She examines general concerns and addresses issues pertaining to both direct and remote access to the materials. Giving hints from her own experience, she suggests tools and resources available that can assist both catalogers and non-catalogers in the interpretation of serial computer file records.
- Published
- 1996
- Full Text
- View/download PDF
23. PVM-AMBER: A parallel implementation of the AMBER molecular mechanics package for workstation clusters
- Author
-
Eric Swanson and Terry P. Lybrand
- Subjects
Ethernet, Speedup, ComputerSystemsOrganization_COMPUTERSYSTEMIMPLEMENTATION, business.industry, Computer science, Computation, General Chemistry, Parallel computing, computer.software_genre, DEC Alpha, Computational Mathematics, Software, Virtual machine, Graphics, business, computer, Serial computer
- Abstract
A parallel version of the popular molecular mechanics package AMBER suitable for execution on workstation clusters has been developed. Computer-intensive portions of molecular dynamics or free-energy perturbation computations, such as nonbonded pair list generation or calculation of nonbonded energies and forces, are distributed across a collection of Unix workstations linked by Ethernet or FDDI connections. This parallel implementation utilizes the message-passing software PVM (Parallel Virtual Machine) from Oak Ridge National Laboratory to coordinate data exchange and processor synchronization. Test simulations performed for solvated peptide, protein, and lipid bilayer systems indicate that reasonable parallel efficiency (70–90%) and computational speedup (2–5 × serial computer runtimes) can be achieved with small workstation clusters (typically six to eight machines) for typical biomolecular simulation problems. PVM-AMBER is also easily and rapidly portable to different hardware platforms due to the availability of PVM for numerous computers. The current version of PVM-AMBER has been tested successfully on Silicon Graphics, IBM RS6000, DEC ALPHA, and HP 735 workstation clusters and heterogeneous clusters of these machines, as well as on CRAY T3D and Kendall Square KSR2 parallel supercomputers. Thus, PVM-AMBER provides a simple and cost-effective mechanism for parallel molecular dynamics simulations on readily available hardware platforms. Factors that affect the efficiency of this approach are discussed. © 1995 by John Wiley & Sons, Inc.
- Published
- 1995
- Full Text
- View/download PDF
24. A rigorous comparison of the Ewald method and the fast multipole method in two dimensions
- Author
-
John W. Perram, Dorthe Sølvason, Jiří Kolafa, and Henrik Gordon Petersen
- Subjects
Dipole, Hardware and Architecture, Fast multipole method, Transputer, Range (statistics), General Physics and Astronomy, Value (computer science), Charge (physics), Multipole expansion, Algorithm, Serial computer, Mathematics
- Abstract
The most efficient standard method for simulating charged or dipolar systems is the Ewald method, which asymptotically scales as N^(3/2), where N is the number of charges. However, recently the "fast multipole method" (FMM), which scales linearly with N, has been developed. The break-even of the two methods (that is, the value of N below which Ewald is faster and above which FMM is faster) is very sensitive to the way the methods are optimized and implemented and to the required simulation accuracy. In this paper we use theoretical estimates and simulation results for the accuracies to carefully compare the two methods with respect to speed. We have developed and implemented highly efficient algorithms for both methods for a serial computer (a SPARCstation ELC) as well as a parallel computer (a T800 transputer based MEIKO computer). Break-evens in the range between N = 10 000 and N = 30 000 were found for reasonable values of the average accuracies found in our simulations. Furthermore, we illustrate how huge but rare single charge pair errors in the FMM inflate the error for some of the charges.
- Published
- 1995
- Full Text
- View/download PDF
25. Parallel three-dimensional finite difference beam propagation methods
- Author
-
John M. Arnold and H.M. Masoudi
- Subjects
Computer science, Modeling and Simulation, Transputer, Finite difference, Electronic engineering, Power dividers and directional couplers, Rib waveguides, Electrical and Electronic Engineering, Beam (structure), Computer Science Applications, Computational science, Connection (mathematics), Serial computer
- Abstract
In this work, we show the implementation of two explicit three-dimensional finite-difference beam propagation methods (FD-BPM) on two different parallel computers, namely a transputer array and a Connection Machine (CM). To assess the performance of using parallel computers, serial computer codes of the two methods have been implemented and a comparison between the speed of the serial and parallel codes has been made. Large gains in speed for the parallel FD-BPMs have been obtained compared to the serial implementations. In addition, a comparison between the performance of the transputer array and the CM in executing the two FD-BPMs is discussed. Finally, to assess and compare the two methods, three different rib waveguides and three different directional couplers have been analysed and the results compared with published results.
- Published
- 1995
- Full Text
- View/download PDF
26. REAL-TIME COMPUTATION OF ELEMENT STIFFNESS MATRIX BASED ON BP NEURAL NETWORKS
- Author
-
Zhonglai Wang, Jian Xiong, Huanwei Xu, Haiqing Li, and H-Z Huang
- Subjects
Engineering, Artificial neural network, business.industry, Computation, Fracture (geology), Foundation (engineering), Direct stiffness method, Structural engineering, Condensed Matter Physics, business, Finite element method, Stiffness matrix, Serial computer
- Abstract
Elastoplasticity computation is the analysis foundation of many complex mechanical behaviours such as fatigue, damage, and fracture. It is impossible to realize real-time computation of elastoplasticity on a serial computer. The form of the element stiffness matrix is analyzed, and a method that may realize real-time computation of the element stiffness matrix based on BP neural networks is proposed. The element stiffness matrix is computed using finite element analysis and BP neural networks, respectively. DOI: http://dx.doi.org/10.5755/j01.mech.18.1.1279
- Published
- 2012
- Full Text
- View/download PDF
27. The benefits of parallel multibody simulation
- Author
-
A. Eichberger, R. Schwertassek, and C. Führer
- Subjects
Numerical Analysis, Exploit, Dynamical systems theory, Computer science, Applied Mathematics, General Engineering, Multibody simulation, Parallel computing, Multibody system, Inverse problem, computer.software_genre, Numerical integration, Computer Aided Design, Algorithm, computer, Serial computer
- Abstract
To exploit the benefits of parallel computer architectures for multibody system simulation, an interdisciplinary approach has been pursued, combining knowledge of the three disciplines of dynamics, numerical mathematics and computer science. An analysis of the options available for the formulation and numerical solution of the dynamical system equations yielded a surprising result. A method initially proposed to solve the inverse problem of dynamics is the best choice to generate the system equations required for solving the simulation problem, when relying on implicit integration routines. Such routines have the particular advantage of handling stiff systems, too. The new O(N)-residual formalism, generating the system equations in a form required for implicit numerical integration, has a high potential to benefit from parallel computer architectures. Two strategies of medium and coarse grain parallelization have been implemented on a Transputer network to obtain a package for parallel multibody simulation. An analysis of the performance of this package demonstrates for typical multibody simulation problems that the new code is five times faster than existing codes when implemented on a serial computer. An additional speed-up by the same order of magnitude is obtained when the code is implemented on a Transputer network.
- Published
- 1994
- Full Text
- View/download PDF
28. Geometric algorithms for digitized pictures on a mesh-connected computer
- Author
-
Quentin F. Stout and Russ Miller
- Subjects
Convex hull ,Pixel ,Applied Mathematics ,Image processing ,Computational geometry ,Computational Theory and Mathematics ,Artificial Intelligence ,Component (UML) ,Computer Vision and Pattern Recognition ,Extreme point ,Algorithm ,Software ,Linear separability ,Mathematics ,Serial computer - Abstract
Although mesh-connected computers are used almost exclusively for low-level local image processing, they are also suitable for higher level image processing tasks. We illustrate this by presenting new optimal (in the O-notational sense) algorithms for computing several geometric properties of figures. For example, given a black/white picture stored one pixel per processing element in an n × n mesh-connected computer, we give Θ(n) time algorithms for determining the extreme points of the convex hull of each component, for deciding if the convex hull of each component contains pixels that are not members of the component, for deciding if two sets of processors are linearly separable, for deciding if each component is convex, for determining the distance to the nearest neighboring component of each component, for determining internal distances in each component, for counting and marking minimal internal paths in each component, for computing the external diameter of each component, for solving the largest empty circle problem, for determining internal diameters of components without holes, and for solving the all-points farthest point problem. Previous mesh-connected computer algorithms for these problems were either nonexistent or had worst case times of Ω(n²). Since any serial computer has a best case time of Ω(n²) when processing an n × n image, our algorithms show that the mesh-connected computer provides significantly better solutions to these problems.
- Published
- 2011
29. Parallel Computer Processing Systems Are Better Than Serial Computer Processing Systems
- Author
-
Zvi Retchkiman Konigsberg
- Subjects
Lyapunov stability ,Theoretical computer science ,Computer science ,Event (computing) ,Parallel computing ,Algebra over a field ,Petri net ,Max-plus algebra ,Serial computer - Abstract
The main objective and contribution of this paper is to use a formal mathematical approach to prove that parallel computer processing systems are better than serial computer processing systems, in the sense of saving time and/or money and being able to solve larger problems. This is achieved thanks to the theory of Lyapunov stability and max-plus algebra applied to discrete event systems modeled with timed Petri nets.
- Published
- 2011
- Full Text
- View/download PDF
30. Massively parallel computational methods for finite element analysis of transient structural responses
- Author
-
R.C. Shieh
- Subjects
Runge–Kutta methods ,Computer science ,Conjugate gradient method ,Diagonal matrix ,General Engineering ,Finite difference ,System of linear equations ,Massively parallel ,Algorithm ,Finite element method ,Serial computer ,Computational science - Abstract
With the emphasis on the finitely damped system (e.g. control structure interaction) case, two fully implicit and two semi-implicit sets of finite element method-based numerical algorithms are formulated for transient response analysis of space frame and truss structures in a massively parallel processing (MPP) environment. All algorithm sets use an implicit force calculation/vector equation of motion assembly procedure. The semi-implicit algorithms are based on the explicit central difference (CD) and the fourth-order Runge-Kutta (RK4) schemes, respectively, in conjunction with the use of mass lumping techniques so that solution of the recurrence equations for unknown displacements is reduced to a trivial diagonal matrix inversion operation. The fully implicit algorithm sets are based on the Newmark Beta (NB) and CD schemes, respectively, in conjunction with use of the (iterative) preconditioned conjugate gradient (PCG) method for solving the linear algebraic recurrence equations. The semi-implicit algorithm sets are fully implemented and assessed on an MPP CM-2 computer. A preliminary assessment of the fully implicit sets of algorithms is made on a Sun Workstation. These numerical study results show that the newly formulated MPP algorithms are, to a varying degree, superefficient (or potentially superefficient) on the CM-2 compared with, and even highly competitive against, the conventional sequential algorithms on an advanced serial computer.
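A minimal Python sketch (generic, not the paper's MPP code) of why mass lumping makes the explicit recurrence trivial: with a diagonal mass matrix, the central-difference update needs only elementwise division rather than a linear solve.

```python
# One explicit central-difference step with a lumped (diagonal) mass matrix.
# Solving M a = f - K u reduces to dividing by the diagonal entries, the
# "trivial diagonal matrix inversion" mentioned in the abstract.

def central_difference_step(u, u_prev, m_diag, f, K, dt):
    """Return u at the next time step for M u'' + K u = f, M diagonal."""
    n = len(u)
    Ku = [sum(K[i][j] * u[j] for j in range(n)) for i in range(n)]
    a = [(f[i] - Ku[i]) / m_diag[i] for i in range(n)]  # elementwise "inversion"
    return [2 * u[i] - u_prev[i] + dt * dt * a[i] for i in range(n)]
```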
- Published
- 1993
- Full Text
- View/download PDF
31. Self-replicating sequences of binary numbers. Foundations II: Strings of length N=4
- Author
-
Wolfgang Banzhaf
- Subjects
Discrete mathematics ,education.field_of_study ,General Computer Science ,String (computer science) ,Population ,Binary number ,Folding (DSP implementation) ,Pseudorandom binary sequence ,Nonlinear system ,Complementary sequences ,education ,Algorithm ,Biotechnology ,Mathematics ,Serial computer - Abstract
We study an algorithm which allows sequences of binary numbers (strings) to interact with each other. The simplest system of this kind with a population of 4-bit sequences is considered here. Previously proposed folding methods are used to generate alternative two-dimensional forms of the binary sequences. The interaction of two-dimensional and one-dimensional forms of strings is simulated in a serial computer. The reaction network for the N = 4 system is established. Development of string populations initially generated randomly is observed. Nonlinear rate equations are proposed which provide a model for this simplest system.
- Published
- 1993
- Full Text
- View/download PDF
32. A Parallel Algorithm for Variational Assimilation in Oceanography and Meteorology
- Author
-
Jerome R. Baugh and Andrew F. Bennett
- Subjects
Atmospheric Science ,Speedup ,Oceanography ,Circulation (fluid dynamics) ,Intel iPSC ,Meteorology ,Atmospheric models ,Parallel algorithm ,Ocean tide ,Ocean Engineering ,Variational assimilation ,Serial computer ,Mathematics - Abstract
A parallel algorithm is described for variational assimilation of observations into oceanic and atmospheric models. The algorithm may be coded first for execution on a serial computer and then trivially modified for execution on a parallel computer such as the Intel iPSC/860. The speedup factor for parallel execution is roughly P(2M + 3)/(2M + 3P), where P is the number of processors and M is the number of observations (M ≥ P). The speedup factor approaches P from below as M → ∞. The algorithm has been applied in serial form to ocean tides (Bennett and McIntosh 1982; McIntosh and Bennett 1984; Bennett 1985) and oceanic equatorial interannual variability (Bennett 1990). It has been applied in parallel form to oceanic synoptic-scale circulation (Bennett and Thorburn 1992); a parallel application to operational forecasting of tropical cyclones is in progress (Bennett et al. 1992). For the sake of simplicity, the parallel algorithm is described here for a model consisting of a linear, first-order wave equa...
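The quoted speedup factor can be checked numerically; a short Python sketch with hypothetical values of P and M:

```python
# Speedup factor S = P(2M + 3) / (2M + 3P) from the abstract, which
# approaches P from below as the number of observations M grows.

def speedup(P: int, M: int) -> float:
    """Parallel speedup for P processors and M observations (M >= P)."""
    return P * (2 * M + 3) / (2 * M + 3 * P)

for M in (10, 100, 10000):
    print(f"P=8, M={M:>5}: speedup = {speedup(8, M):.3f}")
```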
- Published
- 1992
- Full Text
- View/download PDF
33. The re-analysis of linear structures on serial and parallel computers
- Author
-
J.J. Modi and R.K. Livesley
- Subjects
Rank (linear algebra) ,Computer science ,General Engineering ,Parallel computing ,Coefficient matrix ,System of linear equations ,Field (computer science) ,Serial computer - Abstract
This paper discusses the problem of re-computing the solution of a large sparse and banded system of linear equations when a change of low rank is made to the coefficient matrix, using examples taken from the field of structural design. It describes a number of algorithms and gives operation counts and times for their implementation on a serial computer. The paper concludes with a discussion of ways in which the algorithms can be implemented on two types of parallel computer.
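The classic tool behind such low-rank re-analysis is the Sherman-Morrison update; the sketch below (a generic rank-one version, not the authors' algorithms) recomputes the solution from two solves with the original, unmodified matrix.

```python
# Sherman-Morrison re-analysis sketch: if x solves A x = b and the matrix
# changes to A + u v^T, the new solution needs only two solves against the
# original A, so an existing factorization can be reused.

def sherman_morrison_resolve(solve, b, u, v):
    """solve(rhs) solves A y = rhs with the original factorization.
    Returns x' with (A + u v^T) x' = b."""
    x = solve(b)                                   # A x = b
    w = solve(u)                                   # A w = u
    vx = sum(vi * xi for vi, xi in zip(v, x))
    vw = sum(vi * wi for vi, wi in zip(v, w))
    factor = vx / (1.0 + vw)
    return [xi - factor * wi for xi, wi in zip(x, w)]

# With A = I (solve is the identity), A + u v^T = [[1, 1], [0, 1]]:
print(sherman_morrison_resolve(lambda r: list(r),
                               [3.0, 2.0], [1.0, 0.0], [0.0, 1.0]))  # -> [1.0, 2.0]
```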
- Published
- 1991
- Full Text
- View/download PDF
34. A parallel frontal solver on the alliant FX/80
- Author
-
Eric M. Lui and Weiping Zhang
- Subjects
Computer program ,Mechanical Engineering ,Parallel algorithm ,Multiprocessing ,Parallel computing ,Finite element method ,Computer Science Applications ,MIMD ,Modeling and Simulation ,General Materials Science ,Algorithm ,Civil and Structural Engineering ,Mathematics ,Serial computer ,Stiffness matrix ,Frontal solver - Abstract
A methodology for large-scale structural analysis using a parallel frontal solution algorithm is presented. The algorithm is well suited for concurrent processing by a multiple instructions, multiple data (MIMD) machine. The method is very adaptive to a parallel environment because of the inherent parallelism of the approach, which does not require the formation of the global structure stiffness matrix. Thus large memory and storage space are not required. The method can be applied to large-scale structures which normally cannot be analyzed efficiently by a conventional serial computer. Numerical studies using the Alliant FX/80 multiprocessor computer at the Northeast Parallel Architectures Center (NPAC) of Syracuse University indicate that significant speed-up can be achieved by applying the method to large-scale problems. The method is general and can be extended to enhance its applicability and versatility.
- Published
- 1991
- Full Text
- View/download PDF
35. Parallelization of expert systems with recursive applications
- Author
-
Ece Yaprak and L. Anneberg
- Subjects
Recursion ,Computer science ,End user ,business.industry ,Computation ,Local area network ,Parallel computing ,Condensed Matter Physics ,computer.software_genre ,Partition (database) ,Atomic and Molecular Physics, and Optics ,Expert system ,Surfaces, Coatings and Films ,Electronic, Optical and Magnetic Materials ,Knowledge base ,Electrical and Electronic Engineering ,Safety, Risk, Reliability and Quality ,business ,computer ,Serial computer - Abstract
During the last decade, the number of local area networks (LANs) has increased [1]. LANs provide a path for computers to communicate with each other by sending messages, and by sharing the same databases, programs, expensive memories, and I/O equipment. Parallel programs can also be run on LANs [2]. Expert systems are a subtopic within the artificial intelligence area. A database which is associated with the expert system normally interfaces with it after a question is answered by the end user, to find out if any recommendation(s) can be made. A search of the entire list of rules (knowledge base or properly organized database) is then made. A parallel processing application of an expert system with an associated database on a LAN was discussed in [3]. It was shown that if the database is segmented and searched in smaller, but parallel, sections, a significant speed-up (computation time on a serial computer/computation time on a parallel computer) can be achieved. In the hierarchically organized knowledge base, intermediate goals may be generated which will lead to the final recommendation. The recursion develops when an intermediate result in one parallel node is needed to make the final recommendation on another parallel node. The knowledge engineer will organize the knowledge base hierarchically, based on the particular rules, goals, and factors. Recursive techniques may be necessary when the knowledge base does not partition easily. In this paper, an algorithm will be developed for expert systems to be run on LANs when recursive applications are needed. The proposed speed-up will be developed using sessions in LANs.
- Published
- 1990
- Full Text
- View/download PDF
36. Random Number Generation in the Parallel Environment
- Author
-
H.F. Sharp and C.H. Still
- Subjects
Lavarand ,Convolution random number generator ,Pseudorandom number generator ,Sequence ,Generator (computer programming) ,Random number generation ,Computer science ,Random seed ,Parallel computing ,Serial computer - Abstract
By randomly seeding individual processors in a parallel environment with unique random number generators it is possible to take full advantage of the economies of scale present in the parallel environment to achieve more accurate simulations. While a single random number generator is sufficient for a serial computer, the same is not true for a parallel computer. Multiple copies of the same generator do not improve the quality of the simulation, as the period may be insufficient to prevent exhaustion or 'banding' of the variates. Our approach is to provide each processor with its own unique random number generator and use a common seed value. This ensures each simulation is unique, as each generator is different due to random assignment by the front-end computer. The linear congruential method was chosen due to widespread familiarity and acceptance of the technique. By using a sequence of random numbers generated on the front-end computer, prime numbers are selected from a predefined array of 2048 primes and assigned to processors. To provide the maximum possible period to the generators, all 2048 primes in the array are six digits in size. This gives the researcher the ability to run simulations involving up to a million random numbers with a high degree of certainty that each processor is running a different simulation. By taking advantage of the large periods, the economies of scale available on a parallel machine can then be exploited to run large-scale simulations involving millions of numbers which would be prohibitive on a serial machine.
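A minimal Python sketch of the per-processor generator scheme described above; the prime table, seed value, and modulus below are illustrative assumptions, not the authors' actual parameters.

```python
# Hypothetical sketch: each processor gets its own linear congruential
# generator, parameterized by a distinct six-digit prime multiplier from a
# shared table, while all processors share one common seed.

SIX_DIGIT_PRIMES = [100003, 100019, 100043, 100049, 100057, 100069]  # illustrative table
COMMON_SEED = 12345
MODULUS = 2**31 - 1  # a large prime modulus (assumption)

def make_lcg(processor_id: int):
    """Return a distinct multiplicative-LCG stream for one processor."""
    a = SIX_DIGIT_PRIMES[processor_id % len(SIX_DIGIT_PRIMES)]  # unique multiplier
    state = COMMON_SEED
    def next_uniform() -> float:
        nonlocal state
        state = (a * state) % MODULUS  # multiplicative congruential step
        return state / MODULUS
    return next_uniform

# Two processors with the same seed still produce different streams.
g0, g1 = make_lcg(0), make_lcg(1)
print(g0(), g1())
```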
- Published
- 2005
- Full Text
- View/download PDF
37. Implementing (nondeterministic) parallel assignments
- Author
-
Piercarlo Grandi
- Subjects
Sequence ,Computer science ,Parallel algorithm ,Parallel computing ,computer.software_genre ,Computer Science Applications ,Theoretical Computer Science ,Nondeterministic algorithm ,Signal Processing ,Compiler ,Central processing unit ,Computer Science::Operating Systems ,computer ,Algorithm ,Information Systems ,Serial computer - Abstract
We present an algorithm that implements a parallel assignment as an optimal sequence of single assignments on a serial computer. The algorithm is optimal in the sense that it generates the minimum number of single assignments. Further, if the CPU is capable of executing multiple instruction threads, the algorithm can generate the minimal sequence that takes advantage of them.
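A plausible reconstruction in Python (not Grandi's published algorithm): emit assignments whose targets are no longer read by any pending assignment, and break cycles with a temporary.

```python
# Serialize a parallel assignment (e.g. x, y <- y, x) into single
# assignments. Acyclic dependencies need no extra storage; each cycle of
# length k costs one temporary and k + 1 assignments, which is minimal.

def serialize(par):
    """par maps target -> source for a simultaneous parallel assignment.
    Returns an ordered list of (target, source) single assignments."""
    pending = {t: s for t, s in par.items() if t != s}  # drop identity x <- x
    order, temps = [], 0
    while pending:
        # A target is "safe" if no other pending assignment still reads it.
        safe = next((t for t in pending
                     if all(s != t for u, s in pending.items() if u != t)),
                    None)
        if safe is not None:
            order.append((safe, pending.pop(safe)))
        else:
            # Every target is still read: a cycle. Save one value and redirect.
            t = next(iter(pending))
            tmp = f"_t{temps}"; temps += 1
            order.append((tmp, t))
            pending = {u: (tmp if s == t else s) for u, s in pending.items()}
    return order

print(serialize({"x": "y", "y": "x"}))  # a swap needs one temporary
```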
- Published
- 1996
- Full Text
- View/download PDF
38. Can optical interconnects lead to cheaper high-performance multiprocessors?
- Author
-
Uzi Vishkin
- Subjects
Cost reduction ,Interconnection ,Engineering ,Parallel processing (DSP implementation) ,business.industry ,CPU cache ,Embedded system ,Optical interconnect ,Multiprocessing ,Cache ,business ,Serial computer - Abstract
A new paradigm for an all-to-all optical interconnect is presented. It could be part of an interconnection fabric between parallel processing elements and the first level of the cache in a computer system. Parallel processing has traditionally aspired to improve performance of such systems. An optical interconnect raises a new possibility: obtain both improved performance and significant cost reduction with respect to standard serial computer system models.
- Published
- 2004
- Full Text
- View/download PDF
39. Parallel networking and visualization on the Connection Machine CM-5
- Author
-
Gary C. Oberbrunner
- Subjects
Ethernet ,business.industry ,Computer science ,HIPPI ,computer.software_genre ,Visualization ,Inter-process communication ,Data visualization ,Broadcasting (networking) ,Fiber Distributed Data Interface ,business ,computer ,Serial computer ,Computer network - Abstract
The Connection Machine CM-5 supports a Unix-compatible, data-parallel, transparent networking architecture to send parallel data over HIPPI, FDDI and Ethernet. CM-5 and serial computer processes can now all communicate simply and efficiently with each other. This interprocess communication system has been proven useful for parallel X Window drawing support and parallel data visualization systems such as AVS.
- Published
- 2003
- Full Text
- View/download PDF
40. Standard Hardware Interfaces
- Author
-
Howard Austerlitz
- Subjects
Scheme (programming language) ,Clock signal ,business.industry ,Computer science ,Interface (computing) ,RS-232 ,Serial port ,Multiplexing ,Synchronous Serial Interface ,Embedded system ,business ,Parallel port ,computer ,Computer hardware ,Serial computer ,computer.programming_language - Abstract
This chapter discusses several parallel and serial computer interfaces. The serial interface consists of only one data line and one or more control lines. In this scheme, the data is time multiplexed. Control lines are used to indicate when the receiving end is ready to get the data, along with other functions. The digital value of the data line represents a different bit at a different time. This requires a timing reference for the receiving end to decode the data accurately. When an external timing reference is used, this becomes a synchronous serial interface, with a control line carrying the required clock signal. Even though a parallel interface is inherently faster than an equivalent serial interface, it has drawbacks. Most parallel interfaces use standard digital logic voltage levels, usually TTL compatible. The parallel printer interface, sometimes called the Centronics interface, is available for nearly all personal computers and is supported by most printers.
- Published
- 2003
- Full Text
- View/download PDF
41. Variable tracking technique: a single-pass method to determine data dependence
- Author
-
K. Forward and D. Wo
- Subjects
Schedule ,Source code ,Computer science ,media_common.quotation_subject ,Process (computing) ,Parallel computing ,computer.software_genre ,Programmable logic array ,Logic synthesis ,Computer engineering ,Compiler ,Field-programmable gate array ,computer ,Serial computer ,media_common - Abstract
This paper presents a new data dependence checking technique called the variable tracking technique (VTT). It is a single-pass data dependence checking method which locates dependent statements in a serial computer program. VTT produces a schedule which lists the operations in the source code in groups. The list of operations in a particular group can be executed concurrently. The user is not required to provide a profile of the program to the compiler, hence VTT is suitable for applications which automate the process of exploiting parallelism. Here we describe the use of this technique in gacc, a parallelising compiler, which compiles C functions to field programmable gate array (FPGA) circuits. The results presented in this paper show that VTT has been instrumental in gaining improved performance from a parallelising compiler which automates the process of executing the computationally intensive portion of the program in hardware.
- Published
- 2002
- Full Text
- View/download PDF
42. Cataloging Serial Computer Files
- Author
-
Margaret Mering, Colleen Thorburn, and Rebecca Ringler
- Subjects
Multimedia ,Computer science ,Cataloging ,Library and Information Sciences ,computer.software_genre ,computer ,Serial computer - Published
- 1993
- Full Text
- View/download PDF
43. The growth factor and efficiency of Gaussian elimination with rook pivoting
- Author
-
Leslie V. Foster
- Subjects
Numerical analysis ,Applied Mathematics ,Growth factor ,Computer Science::Numerical Analysis ,Exponential function ,Exponential error ,symbols.namesake ,Computational Mathematics ,Gaussian elimination ,symbols ,Applied mathematics ,Rook pivoting ,Almost surely ,Gauss–Seidel method ,Algorithm ,Mathematics ,Pivot element ,Serial computer - Abstract
Gaussian elimination is among the most widely used tools in scientific computing. Gaussian elimination with partial pivoting requires only O(n2) comparisons beyond the work required in Gaussian elimination with no pivoting but can, in principle, have error growth that is exponential in the matrix size n. Gaussian elimination with complete pivoting, on the other hand, cannot have exponential error growth but requires O(n3) comparisons beyond the work required by Gaussian elimination with no pivoting. Numerical experiments suggest that Gaussian elimination with rook pivoting is between partial pivoting and complete pivoting in terms of efficiency and accuracy. In this paper we prove that rook pivoting cannot have exponential error growth. We also introduce a combination of partial pivoting and rook pivoting, which we call Gaussian elimination with partial rook pivoting, and we prove that partial rook pivoting cannot have exponential error growth. We include numerical experiments showing that on a serial computer the run times for rook pivoting are almost always close to those of partial pivoting, and the run times for partial rook pivoting appear to be the same as those of partial pivoting.
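The rook-pivot search itself is simple to sketch: starting from the largest entry of the first column, alternate row and column searches until the candidate is maximal in magnitude in both its row and its column. A pure-Python illustration (not the paper's code):

```python
# Rook-pivot search sketch. Each move strictly increases the candidate's
# magnitude, so the alternation terminates; on average it costs only a few
# column/row scans, which is why rook pivoting is close to partial pivoting
# in run time.

def rook_pivot(A):
    """Return (i, j) of a rook pivot: an entry of largest magnitude
    in both its row and its column."""
    n = len(A)
    j = 0
    i = max(range(n), key=lambda r: abs(A[r][j]))       # best row in column 0
    while True:
        jj = max(range(len(A[i])), key=lambda c: abs(A[i][c]))  # best column in row i
        if abs(A[i][jj]) <= abs(A[i][j]):
            return i, j                                 # maximal in row and column
        j = jj
        ii = max(range(n), key=lambda r: abs(A[r][j]))
        if abs(A[ii][j]) <= abs(A[i][j]):
            return i, j
        i = ii

A = [[1, 5, 3], [2, 8, 4], [7, 6, 9]]
print(rook_pivot(A))  # -> (2, 2): the entry 9 dominates its row and column
```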
- Published
- 1998
- Full Text
- View/download PDF
44. Efficient morphological processing of maps and line drawings based on directional interval coding
- Author
-
Gady Agam, Its'hak Dinstein, Javier Frydman, and Oren Amiram
- Subjects
Morphological processing ,Logical operations ,Computer science ,Binary image ,Line drawings ,Image processing ,Mathematical morphology ,Algorithm ,Serial computer ,Coding (social sciences) - Abstract
In previous work, we presented algorithms for the analysis of maps and line-drawing images which are based on the processing of directional edge planes by directional morphological operations. This paper discusses the problem of efficient morphological processing of directional edge planes on a serial computer, where it is assumed that arbitrary kernels may be used. The proposed approach is based on a compact representation of the edge planes, which is obtained by using directional interval coding, where the direction of the interval is adapted individually in each directional edge plane. In a broader sense, the proposed approach provides a general framework for efficient processing of binary images which is based on directional interval coding. This framework supports basic morphological operations with arbitrary kernels and basic logical operations between any number of images.
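The idea of interval coding can be sketched in Python: each row of a binary image is stored as runs of foreground pixels, and a dilation by a horizontal segment becomes simple interval arithmetic. This is an illustration of the principle, not the authors' exact representation:

```python
# Interval (run-length) coding along one direction. Morphological dilation
# by a horizontal segment then works directly on the compact run list.

def encode_row(row):
    """Return closed intervals [(start, end)] of 1-pixels in a row."""
    runs, start = [], None
    for i, v in enumerate(row):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))
    return runs

def dilate_runs(runs, r, width):
    """Dilate by a horizontal segment of half-length r, merging overlaps."""
    out = []
    for s, e in runs:
        s, e = max(0, s - r), min(width - 1, e + r)
        if out and s <= out[-1][1] + 1:                 # runs touch: merge
            out[-1] = (out[-1][0], max(out[-1][1], e))
        else:
            out.append((s, e))
    return out

print(dilate_runs(encode_row([0, 1, 1, 0, 0, 0, 1, 0]), 1, 8))  # -> [(0, 3), (5, 7)]
```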
- Published
- 1997
- Full Text
- View/download PDF
45. Parallel Multigrid Methods
- Author
-
Jim E. Jones and Stephen F. McCormick
- Subjects
Class (computer programming) ,Multigrid method ,Theoretical computer science ,Partial differential equation ,Computer science ,Numerical analysis ,Residual ,Field (computer science) ,Serial computer ,Variety (cybernetics) - Abstract
Multigrid methods have proved to be among the fastest numerical methods for solving a broad class of problems, from many types of partial differential equations to problems with no continuous origin. On a serial computer, multigrid methods are able to solve a widening class of problems with work equivalent to a few evaluations of the discrete residual (i.e., a few relaxations). Many research projects have been conducted on parallel multigrid methods, and they have addressed a variety of subjects: from proposed new algorithms to theoretical studies to questions about practical implementation. The aim here is to provide a brief overview of this active and abundant field of research, offering some guidance to those who contemplate entering it.
- Published
- 1997
- Full Text
- View/download PDF
46. Chapter 3 Introduction to the bidirectional associative memory model: Implications for psychopathology, treatment, and research
- Author
-
Warren W. Tryon
- Subjects
Communication ,Distributed shared memory ,Recall ,business.industry ,Analogy ,Bidirectional associative memory ,Distributed memory ,Arithmetic ,business ,Gradient descent ,Psychology ,Memory map ,Serial computer - Abstract
Summary We have formed a simple three memory BAM connecting three pairs of 8-bit stimuli and 5-bit response patterns. The three separate associations have been combined into a single memory matrix whose elements can be interpreted as synaptic weights between a set of stimulus “neurons” and a set of response “neurons”. It is crucial to note that all elements of the memory matrix participate in storing each of the three S-R associations (memories). This is the essence of a parallel distributed memory system and differs fundamentally from the serial computer analogy of storing information in discrete places. An energy value is associated with every memory resulting in a funnel shaped memory well with a basin of attraction. Memory retrieval entails descending to the bottom of the memory well through a process known as gradient descent. Our example took two cycles to reach an energy minimum and correctly recall R1 when given S1. We now apply these fundamental concepts to psychopathology.
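A bipolar BAM in the spirit of the chapter's example can be sketched in a few lines of Python; the 8-bit stimulus and 5-bit response patterns below are invented for illustration, not taken from the chapter:

```python
# Bidirectional associative memory sketch: associations are stored by
# summing outer products s^T r, so every element of the weight matrix M
# participates in storing every pair (distributed storage). Recall
# alternates s -> r -> s updates, descending to a stored memory.

def outer_sum(pairs):
    """Build the BAM weight matrix M = sum of outer products over (s, r) pairs."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    M = [[0] * m for _ in range(n)]
    for s, r in pairs:
        for i in range(n):
            for j in range(m):
                M[i][j] += s[i] * r[j]
    return M

def sign(x, prev):
    """Bipolar threshold; keep the previous value on a tie (x == 0)."""
    return 1 if x > 0 else (-1 if x < 0 else prev)

def recall(M, s, steps=5):
    """Bidirectional recall: propagate s -> r -> s until stable."""
    n, m = len(M), len(M[0])
    r = [1] * m
    for _ in range(steps):
        r = [sign(sum(s[i] * M[i][j] for i in range(n)), r[j]) for j in range(m)]
        s = [sign(sum(M[i][j] * r[j] for j in range(m)), s[i]) for i in range(n)]
    return r

# Two invented stimulus/response pairs in bipolar (+1/-1) coding.
S1 = [1, -1, 1, 1, -1, -1, 1, -1]; R1 = [1, 1, -1, -1, 1]
S2 = [1, 1, -1, 1, 1, -1, -1, 1]; R2 = [-1, 1, 1, -1, -1]
M = outer_sum([(S1, R1), (S2, R2)])
print(recall(M, S1))  # recovers R1
```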
- Published
- 1997
- Full Text
- View/download PDF
47. Medical image processing utilizing neural networks trained on a massively parallel computer
- Author
-
J.P. Kerr and E.B. Bartlett
- Subjects
Workstation ,Computer science ,Computer Science::Neural and Evolutionary Computation ,Health Informatics ,Image processing ,law.invention ,Microcomputers ,law ,Computer Systems ,Software Design ,Image Processing, Computer-Assisted ,Humans ,Computer vision ,SIMD ,Massively parallel ,Serial computer ,Tomography, Emission-Computed, Single-Photon ,Tomographic reconstruction ,Artificial neural network ,business.industry ,Reproducibility of Results ,Computers, Mainframe ,Computer Science Applications ,Software design ,Artificial intelligence ,Neural Networks, Computer ,business - Abstract
While finding many applications in science, engineering, and medicine, artificial neural networks (ANNs) have typically been limited to small architectures. In this paper, we demonstrate how very large architecture neural networks can be trained for medical image processing utilizing a massively parallel, single-instruction multiple-data (SIMD) computer. The two to three orders of magnitude improvement in processing time attainable using a parallel computer makes it practical to train very large architecture ANNs. As an example, we have trained several ANNs to demonstrate the tomographic reconstruction of 64 × 64 single photon emission computed tomography (SPECT) images from 64 planar views of the images. The potential for these large architecture ANNs lies in the fact that once the neural network is properly trained on the parallel computer, the corresponding interconnection weight file can be loaded on a serial computer. Subsequently, relatively fast processing of all novel images can be performed on a PC or workstation.
- Published
- 1995
48. Parallel algorithms for the circuit value update problem
- Author
-
Charles E. Leiserson and Keith H. Randall
- Subjects
Combinational logic ,Computer science ,Parallel algorithm ,Value (computer science) ,Hardware_PERFORMANCEANDRELIABILITY ,Circuit extraction ,Theoretical Computer Science ,Glitch ,Computer Science::Emerging Technologies ,Computational Theory and Mathematics ,Asynchronous communication ,Bounded function ,Theory of computation ,Hardware_INTEGRATEDCIRCUITS ,Equivalent circuit ,Constant (mathematics) ,Algorithm ,Hardware_LOGICDESIGN ,Serial computer ,Mathematics - Abstract
The circuit value update problem is the problem of updating values in a representation of a combinational circuit when some of the inputs are changed. We assume for simplicity that each combinational element has bounded fan-in and fan-out and can be evaluated in constant time. This problem is easily solved on an ordinary serial computer in O(W+D) time, where W is the number of elements in the altered subcircuit and D is the subcircuit's embedded depth (its depth measured in the original circuit). In this paper we show how to solve the circuit value update problem efficiently on a P-processor parallel computer. We give a straightforward synchronous, parallel algorithm that runs in $O(W/P + D\lg P)$ expected time. Our main contribution, however, is an optimistic, asynchronous, parallel algorithm that runs in $O(W/P+D+\lg W + \lg P)$ expected time, where W and D are the size and embedded depth, respectively, of the "volatile" subcircuit, the subcircuit of elements that have inputs which either change or glitch as a result of the update. To our knowledge, our analysis provides the first analytical bounds on the running time of an optimistic, asynchronous, parallel algorithm.
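The serial O(W+D) update can be sketched in Python: sweep the gates in topological order and re-evaluate only those with a changed input. The gate representation and names below are illustrative assumptions:

```python
# Serial circuit value update sketch: after some primary inputs change,
# re-evaluate affected gates in topological order, propagating only actual
# value changes downstream.

def update(values, gates, topo, changed):
    """values: node -> bool; gates: node -> (fn, input node list);
    topo: gate nodes in topological order; changed: input nodes whose
    entries in `values` were just altered. Mutates and returns `values`."""
    dirty = set(changed)
    for node in topo:                          # topological sweep
        fn, ins = gates[node]
        if dirty.intersection(ins):            # some input changed
            new = fn(*(values[i] for i in ins))
            if new != values[node]:
                values[node] = new
                dirty.add(node)                # propagate the change
    return values

# Tiny example circuit: g1 = a AND b, g2 = g1 OR c.
gates = {
    "g1": (lambda x, y: x and y, ["a", "b"]),
    "g2": (lambda x, y: x or y, ["g1", "c"]),
}
values = {"a": True, "b": False, "c": False, "g1": False, "g2": False}
values["b"] = True                             # flip one input
update(values, gates, ["g1", "g2"], {"b"})
print(values["g2"])  # -> True
```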
- Published
- 1995
- Full Text
- View/download PDF
49. Parallel Implementation of the GTH Algorithm for Markov Chains
- Author
-
Daniel P. Heyman, David M. Cohen, Asya Rabinovitch, and Danit Brown
- Subjects
Markov chain ,Tridiagonal matrix ,Computer science ,Stochastic matrix ,Parallel computing ,State (computer science) ,Algorithm ,Massively parallel ,Block size ,Serial computer ,Block (data storage) - Abstract
We are concerned with computing the steady-state distribution π of a finite irreducible Markov chain with transition matrix P. We will use the GTH algorithm [1], which has excellent numerical properties that have been demonstrated empirically [2] and mathematically [3]. We have at our disposal workstations and a massively parallel computer; we want to see how execution times on the latter compare to execution times on the former. Embedded in this endeavor is an exploration of how to harness the massively parallel computer to work on the GTH algorithm. Our main conclusions are: Our massively parallel computer can solve a problem with one thousand states one hundred times as fast as a serial computer. Extrapolation of our experience using one-eighth of the available memory indicates that for a problem with eleven thousand states, the serial machine would require 10^4 times as much time as the parallel machine. Having enough memory to store the transition matrix is the limiting factor for our parallel computer. When the transition matrix has block tridiagonal form, our parallel computer can store many thousands of states (depending on the block size; 16 × 10^6 states can be stored when the blocks are 2 × 2) and compute the steady-state distribution in a few hours. The 16 × 10^6 state example can be done in 24 hours.
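For reference, the GTH elimination itself fits in a short Python sketch; this is a textbook serial version (state-by-state censoring with no subtractions, then back-substitution), not the parallel implementation studied in the paper:

```python
# GTH (Grassmann-Taksar-Heyman) algorithm sketch: eliminate the highest
# state, censor the chain onto the remaining states, then back-substitute.
# All updates are additions/multiplications, which underlies the method's
# good numerical behavior.

def gth_stationary(P):
    """Stationary distribution of an irreducible stochastic matrix P
    (list of lists), via subtraction-free GTH elimination."""
    n = len(P)
    P = [row[:] for row in P]                    # work on a copy
    for k in range(n - 1, 0, -1):
        s = sum(P[k][:k])                        # total mass from k to lower states
        for j in range(k):
            P[k][j] /= s                         # Pr(next lower state = j | leave downward)
        for i in range(k):
            for j in range(k):
                P[i][j] += P[i][k] * P[k][j]     # censor state k out of the chain
            P[i][k] /= s                         # keep scaled column for back-substitution
    x = [1.0] + [0.0] * (n - 1)
    for k in range(1, n):
        x[k] = sum(x[i] * P[i][k] for i in range(k))
    total = sum(x)
    return [v / total for v in x]

print(gth_stationary([[0.9, 0.1], [0.2, 0.8]]))  # stationary distribution (2/3, 1/3)
```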
- Published
- 1995
- Full Text
- View/download PDF
50. Implementation of destination-address block-location using an SIMD machine
- Author
-
Joe Zheng and Tianhao Ding
- Subjects
business.industry ,Computer science ,law.invention ,MIMD ,Microprocessor ,Software ,Parallel processing (DSP implementation) ,law ,Embedded system ,Central processing unit ,SIMD ,business ,Computer hardware ,Block (data storage) ,Serial computer - Abstract
The implementation of destination address block location using an SIMD machine is described. Both system architecture and software considerations are presented. The results and the processing times for regular envelopes are included to demonstrate that implementation on an SIMD machine is very cost-effective in image-based applications with real-time requirements. The conclusion is made that a mixed mode of SIMD and MIMD is the best approach to the efficient implementation of destination address block location. To automate sorting and routing of mail pieces in postal applications, there have been many efforts devoted to automatically locating the destination address block (DAB) on envelopes and packages. Most methodologies are image-based and conventionally perform in a manner dealing with bit-represented pixels. To operate on a large number of pixels at a high throughput rate, it can be shown that conventional serial computer architectures cannot provide sufficient data processing power. For example, the CPU instruction execution rate for a 386 DX microprocessor (33 MHz) is eight MIPS (millions of instructions per second).
- Published
- 1993
- Full Text
- View/download PDF