132 results on '"Explicitly parallel instruction computing"'
Search Results
2. Coexistence of lazy frogs on ${\mathbb{Z}}$.
- Author
-
Holmes, Mark and Kious, Daniel
- Subjects
PROBABILITY theory, GROWTH rate, MATHEMATICS theorems, EXPLICITLY parallel instruction computing, EQUALITY
- Abstract
We study the so-called frog model on ${\mathbb{Z}}$ with two types of lazy frogs, with parameters $p_1,p_2\in (0,1]$ respectively, and a finite expected number of dormant frogs per site. We show that for any such $p_1$ and $p_2$ there is positive probability that the two types coexist (i.e. that both types activate infinitely many frogs). This answers a question of Deijfen, Hirscher, and Lopes in dimension one. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
3. Balancing Competences and the Margin of Appreciation: Structuring Deference at the ECtHR.
- Author
-
Chagas, Carolina A
- Subjects
JUDICIAL deference, BALANCING machines, APPRECIATION (Accounting), EXPLICITLY parallel instruction computing, ETCHERS
- Abstract
The margin of appreciation is an important argumentative framework employed by the ECtHR. Through its application, the Court may establish a balanced relationship with the member states. This is why the margin is one of the main sources for the ECtHR's exercise of deference. Deference happens when low intensity of review is applied – or a wide margin. Therefore, to properly know when to act deferentially demands a clear procedure to determine the intensity of review. However, the application of the margin still presents some weak points and lacks consistency. In this paper, I defend the possibility of applying formal balancing to provide a clearer structure for the exercise of the margin of appreciation and, thus, a way to improve deferential practices by the ECtHR. With the clear structure of balancing, factors are employed in a more organized manner and the relationships behind the idea of determining the intensity of review are explicitly justified. Hence, the notion and structure of balancing competences organize the margin of appreciation in a way to free it from its main criticisms and fulfill the argumentative potential it has. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
4. Sufficient conditions for unique global solutions in optimal control of semilinear equations with C¹-nonlinearity.
- Author
-
Ali, Ahmad Ahmad, Deckelnick, Klaus, and Hinze, Michael
- Subjects
OPTIMAL control theory, SEMILINEAR elliptic equations, NUMERICAL analysis, EXPLICITLY parallel instruction computing
- Abstract
We consider a semilinear elliptic optimal control problem possibly subject to control and/or state constraints. Generalizing previous work presented in Ahmad Ali, Deckelnick and Hinze (2016), we provide a condition which guarantees that a solution of the necessary first order conditions is a global minimum. A similar result also holds at the discrete level where the corresponding condition can be evaluated explicitly. Our investigations are motivated by Günter Leugering, who raised the question whether the problem class considered in Ahmad Ali, Deckelnick and Hinze (2016) can be extended to the nonlinearity φ(s) = s|s|. We develop a corresponding analysis and present several numerical test examples demonstrating its usefulness in practice. [ABSTRACT FROM AUTHOR]
- Published
- 2019
5. The constitutive relations of initially stressed incompressible Mooney-Rivlin materials.
- Author
-
Agosti, Abramo, Gower, Artur L., and Ciarletta, Pasquale
- Subjects
*STRAIN energy, *RESIDUAL stresses, *EXPLICITLY parallel instruction computing, *DEFORMATION potential, *ISOTROPIC properties
- Abstract
Highlights
• A strain energy function for initially stressed Mooney-Rivlin materials is derived.
• The strain energy functions for initially stressed Mooney and neo-Hookean materials are given explicitly.
• Our constitutive model does not require the a priori knowledge of a virtual stress-free configuration.
• The derived strain energy functions can be used to design non-destructive methods to measure the initial stresses.
Abstract
Initial stresses originate in soft materials by the occurrence of misfits in the undeformed microstructure. Since the reference configuration is not stress-free, the effects of initial stresses on the hyperelastic behavior must be constitutively addressed. Notably, the free energy of an initially stressed material may not possess the same symmetry group as the one of the same material deforming from a naturally unstressed configuration. This work assumes that the hyperelastic strain energy density is characterized only by the deformation gradient and the initial stress tensor, using an explicit functional dependence on their independent invariants. In particular, we consider a subclass of constitutive behaviors in which the material constants do not depend on the choice of the reference configuration. Within this theoretical framework, a constitutive equation is derived for an initially stressed body that naturally behaves as an incompressible Mooney-Rivlin material. The strain energy densities for initially stressed neo-Hookean and Mooney materials are derived as special sub-cases. By assuming the existence of a virtual state that is naturally stress-free, the resulting strain energy functions are proved to fulfill the required frame-independence constraints for this special class of constitutive models. In the case of plane strain, great simplifications arise in the expression of the constitutive relations. Finally, the resulting constitutive relations provide useful guidelines for designing non-destructive methods for the quantification of the underlying initial stresses in naturally isotropic materials. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
6. Unifying local-global type properties in vector optimization.
- Author
-
Bagdasar, Ovidiu and Popovici, Nicolae
- Subjects
EXPLICITLY parallel instruction computing, GLOBAL optimization, CONVEX domains, GENERALIZED coordinates, ALGEBRAIC functions
- Abstract
It is well-known that all local minimum points of a semistrictly quasiconvex real-valued function are global minimum points. Also, any local maximum point of an explicitly quasiconvex real-valued function is a global minimum point, provided that it belongs to the intrinsic core of the function’s domain. The aim of this paper is to show that these “local min-global min” and “local max-global min” type properties can be extended and unified by a single general local-global extremality principle for certain generalized convex vector-valued functions with respect to two proper subsets of the outcome space. For particular choices of these two sets, we recover and refine several local-global properties known in the literature, concerning unified vector optimization (where optimality is defined with respect to an arbitrary set, not necessarily a convex cone) and, in particular, classical vector/multicriteria optimization. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
7. Optimizing a linear fractional function over the integer efficient set.
- Author
-
Drici, Wassila, Ouail, Fatma Zohra, and Moulaï, Mustapha
- Subjects
*LINEAR programming, *INTEGER programming, *BRANCHING processes, *MATHEMATICAL programming, *EXPLICITLY parallel instruction computing
- Abstract
In this article, a new exact method is proposed to solve a problem, say $(\mathrm{ILFP})_E$, of maximizing a linear fractional function over the integer efficient set of a multi-objective integer linear programming problem (MOILP). The method is developed through the branch and cut technique and the continuous linear fractional programming, to come up with an integer optimal solution for problem $(\mathrm{ILFP})_E$ without having to explicitly list all efficient solutions of problem (MOILP). The branching process is strengthened by an efficient cut as well as an efficiency test so that a large number of non-efficient feasible solutions can be avoided. An illustrative example and an experimental study are reported to show the merit of this new approach. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
8. Research on the characteristics of evolution in knowledge flow networks of strategic alliance under different resource allocation.
- Author
-
Jianyu, Zhao, Baizhou, Li, Xi, Xi, Guangdong, Wu, and Tienan, Wang
- Subjects
*RESOURCE allocation, *EXPLICITLY parallel instruction computing, *BIFURCATION theory, *CLOUD computing, *BUSINESS networks
- Abstract
This paper takes the four types of resource allocation (randomly oriented, relationship-oriented, cooperation-oriented, and knowledge-embedded) as its premise and investigates the complex characteristics of knowledge flow network evolution in strategic alliances, taking into account the mutual variance effects of the evolution mechanism. Existing research has neglected the differences in resource allocation types, by and large employed statistical analysis methods, and identified only the linear relationships among experimental variances of cross-sectional data. The present study differs from existing research in the following ways: First, we thoroughly consider the multi-faceted nature of resource allocation. Second, we use the method of multi-agent imitation according to the perspective of dynamic system evolution and the principle of phase theory, allowing the explicit analysis of nonlinear functional logic, forms and patterns in the variance. Finally, we analyze the appropriateness of different resource allocation models. Our paper features several significant findings: (1) The evolution of the knowledge flow network of a strategic alliance can produce a bifurcation phenomenon composed of saddle-node bifurcation and transcritical bifurcation. (2) The number of nodes exhibits a logarithmic growth distribution, the connection intensity and the network gain exhibit exponential growth distributions, and the connectivity and knowledge flow frequency are mutually influential in the form of a power function. (3) Knowledge-embedded resource allocation is most effective for improving the knowledge flow rate of networks and can further supply ample impetus for evolution. (4) Cooperation-oriented resource allocation is most beneficial for quickly propelling the network into the evolution realm. (5) Relationship-oriented resource allocation can aid the network in capturing more profit.
Furthermore, this research is beneficial for understanding the key problems of each resource allocation model and the evolution of strategic alliance in knowledge flow networks. Our proposed methods and framework can be more widely applied to the fields of complex networks, knowledge management, and strategic innovation. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
9. On the Langevin equation with variable friction.
- Author
-
Ishii, Hitoshi, Souganidis, Panagiotis, and Tran, Hung
- Subjects
LANGEVIN equations, MATHEMATICAL variables, FRICTION, EXPLICITLY parallel instruction computing, ASYMPTOTIC efficiencies
- Abstract
We study two asymptotic problems for the Langevin equation with variable friction coefficient. The first is the small mass asymptotic behavior, known as the Smoluchowski-Kramers approximation, of the Langevin equation with strictly positive variable friction. The second result is about the limiting behavior of the solution when the friction vanishes in regions of the domain. Previous works on this subject considered one dimensional settings with the conclusions based on explicit computations. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
10. Variations on parallel explicit emptiness checks for generalized Büchi automata.
- Author
-
Renault, E., Duret-Lutz, A., Kordon, F., and Poitrenaud, D.
- Subjects
*EXPLICITLY parallel instruction computing, *MACHINE theory, *LINEAR statistical models, *SYNCHRONIZATION, *INFORMATION sharing
- Abstract
We present new parallel explicit emptiness checks for LTL model checking. Unlike existing parallel emptiness checks, these are based on a strongly connected component (SCC) enumeration and support generalized Büchi acceptance, and require no synchronization points or recomputing procedures. A salient feature of our algorithms is the use of a global union-find data structure in which multiple threads share structural information about the automaton checked. Besides these basic algorithms, we present one architectural variant isolating threads that write to the union-find, and one extension that decomposes the automaton based on the strength of its SCCs to use more optimized emptiness checks. The results from an extensive experimentation of our algorithms and their variations show encouraging performances, especially when the decomposition technique is used. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
11. Codoping and Interstitial Deactivation in the Control of Amphoteric Li Dopant in ZnO for the Realization of p-Type TCOs.
- Author
-
Catellani, Alessandra and Calzolari, Arrigo
- Subjects
*DOPING agents (Chemistry), *HALOGENS, *CONDUCTING polymer composites, *OXYGEN compounds, *EXPLICITLY parallel instruction computing
- Abstract
We report on first principle investigations about the electrical character of Li-X codoped ZnO transparent conductive oxides (TCOs). We studied a set of possible X codopants including either unintentional dopants typically present in the system (e.g., H, O) or monovalent acceptor groups, based on nitrogen and halogens (F, Cl, I). The interplay between dopants and structural point defects in the host (such as vacancies) is also taken explicitly into account, demonstrating the crucial effect that zinc and oxygen vacancies have on the final properties of TCOs. Our results show that Li-ZnO has a p-type character, when Li is included as Zn substitutional dopant, but it turns into an n-type when Li is in interstitial sites. The inclusion of X-codopants is considered to deactivate the n-type character of interstitial Li atoms: the total Li-X compensation effect and the corresponding electrical character of the doped compounds selectively depend on the presence of vacancies in the host. We prove that LiF-doped ZnO is the only codoped system that exhibits a p-type character in the presence of Zn vacancies. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
12. Critical Oscillation Constant for Euler Type Half-Linear Differential Equation Having Multi-Different Periodic Coefficients.
- Author
-
Misir, Adil and Mermerkaya, Banu
- Subjects
*OSCILLATION theory of difference equations, *COEFFICIENTS (Statistics), *EULER equations (Rigid dynamics), *EXPLICITLY parallel instruction computing, *MATHEMATICAL constants
- Abstract
We compute explicitly the oscillation constant for Euler type half-linear second-order differential equation having multi-different periodic coefficients. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
13. A Non-Convex Partition of Unity and Stress Analysis of a Cracked Elastic Medium.
- Author
-
Hong, Won-Tak
- Subjects
MECHANICAL stress analysis, PARTITIONS (Mathematics), COORDINATES, LINEAR elastic fracture, EXPLICITLY parallel instruction computing
- Abstract
A stress analysis using a mesh-free method on a cracked elastic medium needs a partition of unity for a non-convex domain, whether it is defined explicitly or implicitly. Constructing such a partition of unity is a nontrivial task when we choose to create it explicitly. We further extend the idea of the almost everywhere partition of unity and apply it to a linear elasticity problem. We use a special mapping to build a partition of unity on a non-convex domain. The partition of unity that we use has a unique feature: the mapped partition of unity has a curved shape in the physical coordinate system. This novel feature is especially useful when the enrichment function has polar form, $f(r,\theta)=r^{\lambda}g(\theta)$, because we can partition the physical domain in radial and angular directions to perform a highly accurate numerical integration to deal with edge-cracked singularity. The numerical test shows that we obtain a highly accurate result without refining the background mesh. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
14. No Telescoping Effect with Dual Tendon Vibration.
- Author
-
Bellan, Valeria, Wallwork, Sarah B., Stanton, Tasha R., Reverberi, Carlo, Gallace, Alberto, and Moseley, G. Lorimer
- Subjects
*TENDONS, *EXPLICITLY parallel instruction computing
- Abstract
The tendon vibration illusion has been extensively used to manipulate the perceived position of one’s own body part. However, findings from previous research do not seem conclusive regarding the perceptual effect of the concurrent stimulation of both agonist and antagonist tendons over one joint. On the basis of recent data, it has been suggested that this paired stimulation generates an inconsistent signal about the limb position, which leads to a perceived shrinkage of the limb. However, this interesting effect has never been replicated. The aim of the present study was to clarify the effect of a simultaneous and equal vibration of the biceps and triceps tendons on the perceived location of the hand. Experiment 1 replicated and extended the previous findings. We compared a dual tendon stimulation condition with single tendon stimulation conditions and with a control condition (no vibration) on both ‘upward-downward’ and ‘towards-away from the elbow’ planes. Our results show a mislocalisation towards the elbow of the position of the vibrated arm during dual vibration, in line with previous results; however, this did not clarify whether the effect was due to arm representation contraction (i.e., a ‘telescoping’ effect). Therefore, in Experiment 2 we investigated explicitly and implicitly the perceived arm length during the same conditions. Our results clearly suggest that in all the vibration conditions there was a mislocalisation of the entire arm (including the elbow), but no evidence of a contraction of the perceived arm length. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
15. Fourth-Moment Analysis for Wave Propagation in the White-Noise Paraxial Regime.
- Author
-
Garnier, Josselin and Sølna, Knut
- Subjects
*THEORY of wave motion, *GAUSSIAN sums, *WHITE noise theory, *EXPLICITLY parallel instruction computing, *STATISTICAL standards
- Abstract
In this paper we consider the Itô-Schrödinger model for wave propagation in random media in the paraxial regime. We solve the equation for the fourth-order moment of the field in the regime where the correlation length of the medium is smaller than the initial beam width. In terms of applications we prove that the centered fourth-order moments of the field satisfy the Gaussian summation rule, we derive the covariance function of the intensity of the transmitted beam, and the variance of the smoothed Wigner transform of the transmitted field. The second application is used to explicitly quantify the scintillation of the transmitted beam and the third application to quantify the statistical stability of the Wigner transform. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
16. Beyond Relevance.
- Author
-
Belém, Fabiano M., Batista, Carolina S., Santos, Rodrygo L. T., Almeida, Jussara M., and Gonçalves, Marcos A.
- Subjects
*RELEVANCE ranking (Information science), *TAGS (Metadata), *RANDOM forest algorithms, *EXPLICITLY parallel instruction computing, *INFORMATION retrieval
- Abstract
The design and evaluation of tag recommendation methods has historically focused on maximizing the relevance of the suggested tags for a given object, such as a movie or a song. However, relevance by itself may not be enough to guarantee recommendation usefulness. Promoting novelty and diversity in tag recommendation not only increases the chances that the user will select “some” of the recommended tags but also promotes complementary information (i.e., tags), which helps to cover multiple aspects or topics related to the target object. Previous work has addressed the tag recommendation problem by exploiting at most two of the following aspects: (1) relevance, (2) explicit topic diversity, and (3) novelty. In contrast, here we tackle these three aspects conjointly, by introducing two new tag recommendation methods that cover all three aspects of the problem at different levels. Our first method, called Random Forest with topic-related attributes, or RFt, extends a relevance-driven tag recommender based on the Random Forest (RF) learning-to-rank method by including new tag attributes to capture the extent to which a candidate tag is related to the topics of the target object. This solution captures topic diversity as well as novelty at the attribute level while aiming at maximizing relevance in its objective function. Our second method, called Explicit Tag Recommendation Diversifier with Novelty Promotion, or xTReND, reranks the recommendations provided by any tag recommender to jointly promote relevance, novelty, and topic diversity. We use RFt as a basic recommender applied before the reranking, thus building a solution that addresses the problem at both attribute and objective levels. Furthermore, to enable the use of our solutions on applications in which category information is unavailable, we investigate the suitability of using latent Dirichlet allocation (LDA) to automatically generate topics for objects. 
We evaluate all tag recommendation approaches using real data from five popular Web 2.0 applications. Our results show that RFt greatly outperforms the relevance-driven RF baseline in diversity while producing gains in relevance as well. We also find that our new xTReND reranker obtains considerable gains in both novelty and relevance when compared to that same baseline while keeping the same relevance levels. Furthermore, compared to our previous reranker method, xTReD, which does not consider novelty, xTReND is also quite effective, improving the novelty of the recommended tags while keeping similar relevance and diversity levels in most datasets and scenarios. Comparing our two new proposals, we find that xTReND considerably outperforms RFt in terms of novelty and diversity with only small losses (under 4%) in relevance. Overall, considering the trade-off among relevance, novelty, and diversity, our results demonstrate the superiority of xTReND over the baselines and the proposed alternative, RFt. Finally, the use of automatically generated latent topics as an alternative to manually labeled categories also provides significant improvements, which greatly enhances the applicability of our solutions to applications where the latter is not available. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
17. A design of EPIC type processor based on MIPS architecture
- Author
-
Akinori Kanasugi and Takahito Hayashi
- Subjects
Bubble sort, Computer science, Pipeline (computing), Byte, Parallel computing, ComputerSystemsOrganization_PROCESSORARCHITECTURES, General Biochemistry, Genetics and Molecular Biology, Bit field, Artificial Intelligence, Very long instruction word, Explicitly parallel instruction computing, VHDL, Field-programmable gate array, computer, computer.programming_language
- Abstract
This paper proposes an EPIC (Explicitly Parallel Instruction Computing Architecture) type processor based on MIPS. VLIW processors can execute multiple instructions simultaneously, but due to dependencies between instructions, it is often impossible to achieve maximum parallel execution. As a result, the program contains many NOP instructions. An EPIC processor can reduce NOP instructions by changing the number of instructions to be executed simultaneously. To implement the EPIC type processor, a five-bit field is embedded in the machine instruction code. For comparison, a 5-stage pipeline processor (basic processor) and a Very Long Instruction Word (VLIW) processor are designed. The proposed processors are described in a hardware description language (VHDL) and implemented using an FPGA. Operations are confirmed with the software Tera Term. Processors are evaluated for instruction parallelism and program size using a bubble sort program. It is confirmed that the proposed processor is 1.9 times faster than the basic processor. In addition, the program size of the proposed processor is 64 bytes, the basic processor is 56 bytes, and the VLIW processor is 80 bytes.
- Published
- 2019
- Full Text
- View/download PDF
18. FAST AND BACKWARD STABLE COMPUTATION OF ROOTS OF POLYNOMIALS.
- Author
-
AURENTZ, JARED L., MACH, THOMAS, VANDEBRIL, RAF, and WATKINS, DAVID S.
- Subjects
*POLYNOMIALS, *ALGORITHMS, *EIGENVALUES, *EXPLICITLY parallel instruction computing, *ACCURACY
- Abstract
A stable algorithm to compute the roots of polynomials is presented. The roots are found by computing the eigenvalues of the associated companion matrix by Francis's implicitly shifted QR algorithm. A companion matrix is an upper Hessenberg matrix that is unitary-plus-rank-one, that is, it is the sum of a unitary matrix and a rank-one matrix. These properties are preserved by iterations of Francis's algorithm, and it is these properties that are exploited here. The matrix is represented as a product of 3n − 1 Givens rotators plus the rank-one part, so only O(n) storage space is required. In fact, the information about the rank-one part is also encoded in the rotators, so it is not necessary to store the rank-one part explicitly. Francis's algorithm implemented on this representation requires only O(n) flops per iteration and thus O(n²) flops overall. The algorithm is described, normwise backward stability is proved, and an extensive set of numerical experiments is presented. The algorithm is shown to be about as accurate as the (slow) Francis QR algorithm applied to the companion matrix without exploiting the structure. It is faster than other fast methods that have been proposed, and its accuracy is comparable or better. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
19. On break-even correlation: the way to price structured credit derivatives by replication.
- Author
-
Fermanian, Jean-David and Vigneron, Olivier
- Subjects
*BREAK-even analysis, *CREDIT derivatives, *DERIVATIVE securities, *COPULA functions, *EXPLICITLY parallel instruction computing
- Abstract
We consider the pricing of European-style structured credit pay-off under the Gaussian Copula Model (GCM). When no sudden jump-to-default events occur, the perfect replication of these pay-offs under the GCM is obtained if and only if the underlying single-name credit spreads follow a particular family of dynamics and if the pricing parameters are given by so-called ‘break-even’ correlations. We exhibit a class of Merton-style models that are consistent with this result. We calculate break-even correlations explicitly to price nth-to-default baskets under the GCM. Finally, we illustrate the usefulness of this concept as a relative-value tool. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
20. The Medieval French Lexicon of Translation.
- Author
-
Stoll, Jessica
- Subjects
VOCABULARY, TRANSLATING & interpreting, EXPLICITLY parallel instruction computing, CONTENT analysis, NATIVE language instruction, CANON (Literature)
- Abstract
This article examines the medieval French vocabulary for translation. Only translater has been previously widely thought to designate translation explicitly. It is accepted that other words, such as trover/controver, traire/retraire, and metre en escrit/romanz, signify processes of textual production, but this research investigates how medieval usage indicates that in addition to these recognised meanings, they may also connote translation. The article subsequently brings into focus three Latin writers whose understanding of translatio was influential upon vernacular writers, thus placing old French concepts of translation within their wider linguistic and literary context. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
21. A bilevel programming approach to double optimal stopping.
- Author
-
Makasu, Cloud
- Subjects
*BILEVEL programming, *OPTIMAL stopping (Mathematical statistics), *PROBLEM solving, *EXPLICITLY parallel instruction computing, *MATHEMATICAL proofs
- Abstract
This paper treats a class of double optimal stopping problems arising in the pricing of integral options. Under certain conditions, we give an explicit form of the double stopping time for this type of optimal stopping problem. The present results are essentially derived by solving a certain nonlinear bilevel programming problem explicitly. [Copyright Elsevier]
- Published
- 2014
- Full Text
- View/download PDF
22. Complete set of observables for photoproduction of two pseudoscalars on a nucleon.
- Author
-
Arenhövel, H. and Fix, A.
- Subjects
*PARTICLES (Nuclear physics), *SCATTERING (Mathematics), *HERMITIAN forms, *EXPLICITLY parallel instruction computing, *POLARIZATION (Nuclear physics)
- Abstract
The problem of determining completely the spin amplitudes of photoproduction of two pseudoscalar mesons on a nucleon from observables is studied. The procedure of reconstruction of the scattering matrix elements from a complete set of observables is based on the expressions of all observables as quadratic Hermitian forms in the reaction matrix elements, which are derived explicitly. Their inversion allows one to find explicit solutions for the reaction matrix elements in terms of observables. Two methods for finding a complete set of observables are presented. In particular, one set was found that does not contain a triple polarization observable. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
23. Generating Alternative Modules for a Plant Alarm System based on First-out Alarm Alternative Signals.
- Author
-
Hamaguchi, T., Mondori, B., Takeda, K., Kimura, N., and Noda, M.
- Subjects
ALARMS, CHEMICAL plants, EXPLICITLY parallel instruction computing, OPERATOR theory, MATHEMATICAL models
- Abstract
Support is required for operator activity in correcting abnormalities in chemical plants. A plant alarm system must provide useful information to operators as the third layer of the Independent Protection Layers. Therefore, a method for designing a plant alarm system is important for plant safety. Because plants are modified throughout their lifecycles, alarm systems need to be properly managed throughout the plant lifecycle. To manage the changes, the design rationale of the alarm system should be explained explicitly. Takeda et al. (2013) [3] proposed a logical and systematic alarm system design method that explicitly explains design rationales from know-why information for appropriate management of change throughout the plant lifecycle. For the combined or branched component of a cause-effect (CE) model, multiple alternative modules have been proposed. We propose a method of generating alternative modules for a plant alarm system based on first-out alarm alternative signals for the combined or branched component of a CE model. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
24. About the Heisenberg's uncertainty principle and the determination of effective optical indices in integrated photonics at high sub-wavelength regime.
- Author
-
Bêche, B. and Gaviot, E.
- Subjects
*HEISENBERG uncertainty principle, *EXPLICITLY parallel instruction computing, *EIGENVALUE equations, *GEOMETRIC analysis, *INTEGRATED optics, *PHOTONICS
- Abstract
Within the framework of Heisenberg's uncertainty principle, the impact of these inequalities on the theory of integrated photonics in the sub-wavelength regime is explicitly discussed. More specifically, the uncertainty of the effective index values in nanophotonics in the sub-wavelength regime, where the effective index is defined as the eigenvalue of the overall opto-geometric problem in integrated photonics, appears to stem directly from Heisenberg's uncertainty. An apt formula is obtained, allowing us to assume that the uncertainty in the eigenvalue, called the effective optical index or propagation constant, is inversely proportional to the spatial dimensions of a given nanostructure, so that the notion of eigenvalue becomes fuzzy below a specific limiting volume. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
25. Modeling instruction cache and instruction buffer for performance estimation of VLIW architectures using native simulation
- Author
-
Omayma Matoussi, Frédéric Pétrot, Techniques of Informatics and Microelectronics for integrated systems Architecture (TIMA), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA), BEN TITO, Laurence, Techniques de l'Informatique et de la Microélectronique pour l'Architecture des systèmes intégrés (TIMA), and Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
- Subjects
010302 applied physics ,[INFO.INFO-AR]Computer Science [cs]/Hardware Architecture [cs.AR] ,Orthogonal instruction set ,[INFO.INFO-AR] Computer Science [cs]/Hardware Architecture [cs.AR] ,Speedup ,Computer science ,Cycles per instruction ,Instruction scheduling ,02 engineering and technology ,Parallel computing ,01 natural sciences ,020202 computer hardware & architecture ,Instruction set ,Instruction set simulator ,Minimal instruction set computer ,Computer architecture ,Very long instruction word ,PACS 85.42 ,0103 physical sciences ,Explicitly parallel instruction computing ,0202 electrical engineering, electronic engineering, information engineering ,Cache ,Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING ,ComputingMilieux_MISCELLANEOUS - Abstract
In this work, we propose an icache performance estimation approach that focuses on a component necessary to handle the instruction parallelism in a very long instruction word (VLIW) processor: the instruction buffer (IB). Our annotation approach is founded on an intermediate level native-simulation framework. It is evaluated with reference to a cycle accurate instruction set simulator leading to an average cycle count error of 9.3% and an average speedup of 10.
- Published
- 2017
26. Explicit Parallel Instruction Computing
- Author
-
Pranjal Mathur
- Subjects
Computer science ,Explicitly parallel instruction computing ,Parallel computing - Published
- 2016
- Full Text
- View/download PDF
27. What is Itanium Memory Consistency from the Programmer's Point of View?
- Author
-
Jalal Kawash, LillAnne Jackson, and Lisa Higham
- Subjects
Distributed shared memory ,Flat memory model ,General Computer Science ,Programming language ,Computer science ,Uniform memory access ,Consistency model ,computer.software_genre ,Memory map ,Theoretical Computer Science ,Programmer-centric memory consistency ,Non-uniform memory access ,itanium multiprocessor ,Shared memory ,Memory ordering ,Explicitly parallel instruction computing ,Itanium ,Distributed memory ,Memory model ,computer ,Computer Science(all) ,Memory protection - Abstract
A programmer-centric model describes the memory consistency rules of a multiprocessor as a collection, one for each processor, of 'views' of instructions and some agreements between these views. It also requires the natural notion of validity: the value read from a shared memory location is the one that was most recently stored, according to a given view. This allows reasoning about programs at a non-operational level in the natural way, not obscured by the implementation details of the underlying architecture. In this paper, we formulate a programmer-centric description of the memory consistency model provided by the Itanium architecture. However, our definition is not tight. We provide two very similar definitions and show that the specification of the Itanium memory model lies between the two. These two definitions are motivated by slightly different implementations of load-acquire instructions. A further entertainment of a handful of other load-acquire rules leads us to question whether the specification of the Itanium memory order [Intel Corporation. A formal specification of the Intel Itanium processor family memory ordering. http://www.intel.com/, Oct 2002] is indeed faithful to the Itanium architecture intentions.
- Published
- 2007
- Full Text
- View/download PDF
28. Montecito: A Dual-Core, Dual-Thread Itanium Processor
- Author
-
C. McNairy and Rohit Bhatia
- Subjects
Power management ,Hardware_MEMORYSTRUCTURES ,ComputerSystemsOrganization_COMPUTERSYSTEMIMPLEMENTATION ,Computer science ,Hyper-threading ,Parallel computing ,ComputerSystemsOrganization_PROCESSORARCHITECTURES ,computer.software_genre ,Instruction set ,Smart Cache ,Hardware and Architecture ,Multithreading ,Explicitly parallel instruction computing ,Operating system ,Itanium ,Cache ,Electrical and Electronic Engineering ,Cache hierarchy ,computer ,Software - Abstract
Intel's Montecito is the first Itanium processor to feature duplicate, dual-thread cores and cache hierarchies on a single die. It features a landmark 1.72 billion transistors and server-focused technologies, and it requires only 100 watts of power. Intel's Itanium 2 processor series has regularly delivered additional performance through the increased frequency and cache as evidenced by the 6-Mbyte and 9-Mbyte versions.
- Published
- 2005
- Full Text
- View/download PDF
29. Predicated switching - optimizing speculation on EPIC machines
- Author
-
M.F. Jacome and S. Pillai
- Subjects
Computer science ,Parallel computing ,computer.software_genre ,Computer Graphics and Computer-Aided Design ,Set (abstract data type) ,Parallel processing (DSP implementation) ,Explicitly parallel instruction computing ,Code (cryptography) ,Code generation ,Compiler ,Electrical and Electronic Engineering ,Instruction-level parallelism ,computer ,Software - Abstract
Explicitly parallel instruction computing (EPIC) processors are a very attractive platform for many of today's multimedia and communications applications. In particular, clustered EPIC machines can take aggressive advantage of the available instruction-level parallelism, while maintaining high energy-delay efficiency. However, multicluster machines are more challenging to compile to than centralized machines. In this paper, we propose a novel compiler-directed speculation technique called predicated switching (PS) that can be applied to both centralized and multicluster EPIC machines. The two novel contributions in PS are: 1) a compiler transformation, denoted static single assignment-predicated switching, that leverages required data transfers between clusters for performance gains and 2) a static speculation algorithm to decide which specific kernel operations should actually be speculated in the final code, so as to maximize execution performance on the target processor. Experimental results performed on a representative set of time critical kernels compiled for a number of target machines show that, when compared to "resource-unaware" speculation techniques, PS improves performance with respect to at least one of the baselines in 80% of the cases by up to 38%. Moreover, we show that code size and register pressure are not adversely affected by our technique.
- Published
- 2005
- Full Text
- View/download PDF
30. Unpredication, unscheduling, unspeculation: reverse engineering Itanium executables
- Author
-
Saumya K. Debray, Gregory R. Andrews, and N. Snavely
- Subjects
Reverse engineering ,Programming language ,Computer science ,Optimizing compiler ,Parallel computing ,computer.file_format ,Program optimization ,Undo ,computer.software_genre ,Instruction set ,Explicitly parallel instruction computing ,Itanium ,Executable ,Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING ,computer ,Software - Abstract
EPIC (explicitly parallel instruction computing) architectures, exemplified by the Intel Itanium, support a number of advanced architectural features, such as explicit instruction-level parallelism, instruction predication, and speculative loads from memory. However, compiler optimizations that take advantage of these features can profoundly restructure the program's code, making it potentially difficult to reconstruct the original program logic from an optimized Itanium executable. This paper describes techniques to undo some of the effects of such optimizations and thereby improve the quality of reverse engineering such executables.
- Published
- 2005
- Full Text
- View/download PDF
31. One Instruction Set Computers for Image Processing
- Author
-
William F. Gilreath and Phillip A. Laplante
- Subjects
Reduced instruction set computing ,Computer science ,One instruction set computer ,Reconfigurable computing ,Instruction set ,Minimal instruction set computer ,Computer architecture ,Signal Processing ,Explicitly parallel instruction computing ,Electrical and Electronic Engineering ,Unconventional computing ,Field-programmable gate array ,History of computing ,Information Systems - Abstract
Over the fifty-year history of computing computer engineers have sporadically sought to construct minimal computers using only a single simple instruction. While it might appear to be a simple academic exercise, remarkably, a rich computation paradigm can be developed using this approach, with important applications and implications in reconfigurable, chemical, optical, biological (DNA) and quantum computing and in the study of computer architecture. More recently, the widespread use of the Field Programmable Gate Array (FPGA) technology has made such an approach not only desirable, but also practical. To understand the motivation behind single instruction or one instruction computing (OISC), the history of it is reviewed. It is then shown how the paradigm can be used to implement a variety of imaging operations efficiently with a one instruction processor. Finally, a practical application and future work in languages and tools are discussed.
- Published
- 2004
- Full Text
- View/download PDF
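To make the one-instruction idea in entry 31 concrete, here is a minimal sketch of an interpreter for subleq ("subtract and branch if less than or equal to zero"), one well-known single-instruction design. The abstract does not say which instruction the authors build on, so the choice of subleq, the memory layout, and the name `run_subleq` are all assumptions made for illustration.

```python
def run_subleq(memory, pc=0, max_steps=10_000):
    """Run a subleq program in place and return the memory list.

    Each instruction occupies three cells (a, b, c):
    memory[b] -= memory[a]; jump to c if the result is <= 0,
    otherwise fall through to the next instruction.
    A negative jump target halts the machine (an assumed convention).
    """
    steps = 0
    while 0 <= pc and pc + 2 < len(memory) and steps < max_steps:
        a, b, c = memory[pc], memory[pc + 1], memory[pc + 2]
        memory[b] -= memory[a]
        pc = c if memory[b] <= 0 else pc + 3
        steps += 1
    return memory


# Addition built from subtraction only: Y = Y + X, using a scratch cell Z.
# Cells 9, 10, 11 hold X, Y, Z; cells 0..8 hold three instructions.
program = [
    9, 11, 3,    # Z -= X   (Z becomes -X)
    11, 10, 6,   # Y -= Z   (Y becomes Y + X)
    11, 11, -1,  # Z -= Z   (Z becomes 0, then halt)
    7, 5, 0,     # X = 7, Y = 5, Z = 0
]
result = run_subleq(program)
print(result[10])  # the sum is left in cell 10
```

The example shows how a richer operation (addition) is synthesized from repeated uses of the single instruction, which is the property that makes such machines attractive for small reconfigurable targets.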
32. Itanium 2 processor 6M: higher frequency and larger L3 cache
- Author
-
Harry Muljono, B. Cherkauer, and Stefan Rusu
- Subjects
Workstation ,CPU cache ,Computer science ,Pipeline burst cache ,Parallel computing ,computer.software_genre ,law.invention ,Hardware and Architecture ,law ,Explicitly parallel instruction computing ,Operating system ,Itanium ,Electrical and Electronic Engineering ,Cache algorithms ,computer ,Software - Abstract
The third-generation Itanium processor targets the high-performance server and workstation market. To do so, the design team sought to provide higher performance through increased frequency and a larger L3 cache. At the same time, we had to limit the power dissipation to fit into the existing platform envelope. These considerations led to what we now call the Itanium 2 processor 6M: the latest generation of Itanium 2, which features a 6-Mbyte, 24-way set-associative on-die L3 cache. The design implements a 2-bundle 64-bit explicitly parallel instruction computing (EPIC) architecture and is fully compatible with previous implementations. Although this processor's frequency is 50 percent higher than that of the previous generation, the maximum power dissipation holds flat at 130 W to ensure the platform's backward compatibility.
- Published
- 2004
- Full Text
- View/download PDF
33. A 1.5-GHz 130-nm Itanium 2 processor with 6-MB on-die L3 cache
- Author
-
Simon M. Tam, Justin Leung, Stefan Rusu, B. Cherkauer, J. Stinson, and Harry Muljono
- Subjects
CPU cache ,Computer science ,Circuit design ,Design for testing ,Mixed-signal integrated circuit ,Hardware_PERFORMANCEANDRELIABILITY ,Integrated circuit design ,Parallel computing ,Circuit extraction ,Explicitly parallel instruction computing ,Hardware_INTEGRATEDCIRCUITS ,Itanium ,Cache ,Electrical and Electronic Engineering ,Physical design - Abstract
This 130-nm Itanium 2 processor implements the explicitly parallel instruction computing (EPIC) architecture and features an on-die 6-MB 24-way set-associative level-3 cache. The 374-mm² die contains 410 M transistors and is implemented in a dual-V_t process with six Cu interconnect layers and FSG dielectric. The processor runs at 1.5 GHz at 1.3 V and dissipates a maximum of 130 W. This paper reviews circuit design and package details, power delivery, the reliability, availability, and serviceability (RAS) features, design for test (DFT), and design for manufacturability (DFM) features, as well as an overview of the design and verification methodology. The fuse-based clock deskew circuit achieves 24-ps skew across the entire die, while the scan-based skew control further reduces it to 7 ps. The 128-bit front-side bus has a bandwidth of 6.4 GB/s and supports up to four processors on a single bus.
- Published
- 2003
- Full Text
- View/download PDF
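As a quick check on the bus figures quoted in entry 33: a 128-bit bus carries 16 bytes per transfer, so a peak of 6.4 GB/s implies 400 million transfers per second. The sketch below only rearranges the two numbers from the abstract. It assumes GB means 10^9 bytes, and the function name is illustrative.

```python
def implied_transfer_rate(bus_width_bits: int, bandwidth_bytes_per_s: float) -> float:
    """Return the transfers per second implied by a bus width and a peak bandwidth."""
    bytes_per_transfer = bus_width_bits // 8  # 128 bits -> 16 bytes
    return bandwidth_bytes_per_s / bytes_per_transfer


# Figures from the abstract: 128-bit front-side bus, 6.4 GB/s (assumed 6.4e9 bytes/s).
rate = implied_transfer_rate(128, 6.4e9)
print(f"{rate / 1e6:.0f} million transfers per second")
```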
34. Itanium 2 processor microarchitecture
- Author
-
D. Soltis and C. McNairy
- Subjects
Hardware_MEMORYSTRUCTURES ,ComputerSystemsOrganization_COMPUTERSYSTEMIMPLEMENTATION ,Computer science ,Parallel computing ,ComputerSystemsOrganization_PROCESSORARCHITECTURES ,computer.software_genre ,Microarchitecture ,Binary code compatibility ,Hardware and Architecture ,Explicitly parallel instruction computing ,Operating system ,Itanium ,Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING ,Electrical and Electronic Engineering ,Hardware_REGISTER-TRANSFER-LEVELIMPLEMENTATION ,computer ,Software - Abstract
The Itanium 2 processor extends the processing power of the Itanium processor family with a capable and balanced microarchitecture. Executing up to six instructions at a time, it provides both performance and binary compatibility for Itanium-based applications and operating systems.
- Published
- 2003
- Full Text
- View/download PDF
35. The implementation of the Itanium 2 microprocessor
- Author
-
Glenn T. Colon-Bonet, Thomas J Sullivan, T. Grutkowski, S.D. Naffziger, T. Fischer, and Reid James Riedlinger
- Subjects
Computer science ,business.industry ,Pipeline (computing) ,Processor design ,Integrated circuit design ,law.invention ,Microarchitecture ,Microprocessor ,Computer architecture ,law ,Embedded system ,Explicitly parallel instruction computing ,Itanium ,Cache ,Electrical and Electronic Engineering ,business - Abstract
This 64-b microprocessor is the second-generation design of the new Itanium architecture, termed explicitly parallel instruction computing (EPIC). The design seeks to extract maximum performance from EPIC by optimizing the memory system and execution resources for a combination of high bandwidth and low latency. This is achieved by tightly coupling microarchitecture choices to innovative circuit designs and the capabilities of the transistors and wires in the 0.18-μm bulk Al metal process. The key features of this design are: a short eight-stage pipeline, 11 sustainable issue ports (six integer, four floating point, half-cycle access level-1 caches, 64-GB/s level-2 cache and 3-MB level-3 cache), all integrated on a 421 mm² die. The chip operates at over 1 GHz and is built on significant advances in CMOS circuits and methodologies. After providing an overview of the processor microarchitecture and design, this paper describes a few of these key enabling circuits and design techniques.
- Published
- 2002
- Full Text
- View/download PDF
36. Compiler-assisted multiple instruction word retry for VLIW architectures
- Author
-
Shyh-Kwei Chen and W.K. Fuchs
- Subjects
Instruction register ,Computer science ,Instruction scheduling ,Parallel computing ,computer.software_genre ,Instruction set ,Computational Theory and Mathematics ,Computer architecture ,Hardware and Architecture ,Very long instruction word ,Signal Processing ,Explicitly parallel instruction computing ,Benchmark (computing) ,Compiler ,Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING ,Instruction-level parallelism ,computer - Abstract
Very Long Instruction Word (VLIW) architectures can enhance performance by exploiting fine-grained instruction level parallelism. In this paper, we describe a compiler assisted multiple instruction word retry scheme for VLIW architectures. A read buffer is used to resolve the more frequent on-path hazards, while the compiler resolves the remaining branch hazards. Performance evaluation is described for 11 benchmark programs based on the IBM VLIW research compiler, Chameleon. Experimental results indicate that, for a VLIW machine with P functional units to rollback N instruction words, a read buffer of 2NP entries with the compiler assist can be an effective approach in producing low overhead runtime performance and small code growth, for P = 4, 8, 12, and 16 and N ≤ 3.
- Published
- 2001
- Full Text
- View/download PDF
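The read-buffer sizing rule in entry 36 (2NP entries for P functional units and a rollback distance of N instruction words) is easy to tabulate. The sketch below simply evaluates that expression over the configurations the abstract lists. Treating N as the integers 1 to 3 is an assumption, since the abstract only states N ≤ 3.

```python
def read_buffer_entries(n_words: int, p_units: int) -> int:
    """Read-buffer size 2*N*P, as quoted in the abstract."""
    return 2 * n_words * p_units


# P values come from the abstract; N = 1, 2, 3 is an assumed reading of "N <= 3".
sizes = {
    (n, p): read_buffer_entries(n, p)
    for n in (1, 2, 3)
    for p in (4, 8, 12, 16)
}

for (n, p), entries in sorted(sizes.items()):
    print(f"N={n}, P={p:2d}: {entries} entries")
```

Under these assumptions the buffer ranges from 8 entries (N = 1, P = 4) to 96 entries (N = 3, P = 16).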
37. The first IA-64 microprocessor
- Author
-
G. Singer and Stefan Rusu
- Subjects
CPU cache ,Computer science ,business.industry ,Integrated circuit design ,law.invention ,Instruction set ,Microprocessor ,law ,Embedded system ,Explicitly parallel instruction computing ,Hardware_INTEGRATEDCIRCUITS ,IA-64 ,Electrical and Electronic Engineering ,business - Abstract
The first implementation of the IA-64 architecture achieves high performance by using a highly parallel execution core, while maintaining binary compatibility with the IA-32 instruction set. Explicitly parallel instruction computing (EPIC) design maximizes performance through hardware and software synergy. The processor contains 25.4 million transistors and operates at 800 MHz. The chip is fabricated in a 0.18-μm CMOS process with six metal layers and packaged in a 1012-pad organic land grid array using C4 (flip chip) assembly technology. A core speed back-side bus connects the processor to a 4-MB L3 cache.
- Published
- 2000
- Full Text
- View/download PDF
38. The Intel IA-64 compiler code generator
- Author
-
Jay Bharadwaj, W. Chuang, Kalyan Muthukumar, K. Menezes, Gerolf Hoflehner, J. Pierce, and W.Y. Chen
- Subjects
Intermediate language ,Computer science ,Programming language ,Loop-invariant code motion ,Instruction scheduling ,Register file ,Parallel computing ,computer.software_genre ,Code bloat ,Dead code elimination ,Branch predication ,Hardware and Architecture ,Explicitly parallel instruction computing ,Code generation ,Compiler ,IA-64 ,Electrical and Electronic Engineering ,computer ,Software ,Compiler correctness - Abstract
In planning the new EPIC (Explicitly Parallel Instruction Computing) architecture, Intel designers wanted to exploit the high level of instruction-level parallelism (ILP) found in application code. To accomplish this goal, they incorporated a powerful set of features such as control and data speculation, predication, register rotation, loop branches, and a large register file. By using these features, the compiler plays a crucial role in achieving the overall performance of an IA-64 platform. This paper describes the electron code generator (ECG), the component of Intel's IA-64 production compiler that maximizes the benefits of these features. The ECG consists of multiple phases. The first phase, translation, converts the optimizer's intermediate representation (ILO) of the program into the ECG IR. Predicate region formation, if conversion, and compare generation occur in the predication phase. The ECG contains two schedulers: the software pipeliner for targeted cyclic regions and the global code scheduler for all remaining regions. Both schedulers make use of control and data speculation. The software pipeliner also uses rotating registers, predication, and loop branches to generate efficient schedules for integer as well as floating-point loops.
- Published
- 2000
- Full Text
- View/download PDF
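Entry 38 mentions if-conversion and compare generation in the predication phase. As a toy illustration of the general idea (not the ECG's actual algorithm, which the abstract does not detail), the sketch below rewrites a two-armed branch into straight-line code in which one compare produces a pair of complementary predicates and each guarded operation takes effect only when its predicate is set. All names are hypothetical.

```python
def abs_diff_branchy(a: int, b: int) -> int:
    """Original control flow: one arm runs, chosen by a branch."""
    if a > b:
        r = a - b
    else:
        r = b - a
    return r


def abs_diff_predicated(a: int, b: int) -> int:
    """If-converted form: no branch, every operation is guarded by a predicate."""
    # Compare generation: a single compare defines both predicates.
    p_true = a > b
    p_false = not p_true

    r = 0
    # A guarded operation updates r only when its predicate holds;
    # otherwise r keeps its previous value.
    r = (a - b) if p_true else r
    r = (b - a) if p_false else r
    return r


print(abs_diff_branchy(3, 10), abs_diff_predicated(3, 10))
```

Python's conditional expressions stand in for hardware predicate guards here; on a predicated machine both operations would be issued and the one with a false predicate would simply have no effect.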
39. EPIC: Explicitly Parallel Instruction Computing
- Author
-
B.R. Rau and M.S. Schlansker
- Subjects
General Computer Science ,Computer science ,Programming language ,business.industry ,computer.software_genre ,law.invention ,Instruction set ,Microprocessor ,Software ,Parallel processing (DSP implementation) ,law ,Very long instruction word ,Explicitly parallel instruction computing ,Concurrent computing ,Software engineering ,business ,computer - Abstract
Over the past two and a half decades, the computer industry has grown accustomed to the spectacular rate of increase in microprocessor performance. The industry accomplished this without fundamentally rewriting programs in parallel form, without changing algorithms or languages, and often without even recompiling programs. Instruction level parallel processing achieves high performance without major changes to software. However, computers have thus far achieved this goal at the expense of tremendous hardware complexity, a complexity that has grown so large as to challenge the industry's ability to deliver ever-higher performance. The authors developed the Explicitly Parallel Instruction Computing (EPIC) style of architecture to enable higher levels of instruction-level-parallelism without unacceptable hardware complexity. They focus on the broader concept of EPIC as embodied by HPL-PD (formerly known as HPL PlayDoh) architecture, which encompasses a large space of possible EPIC ISAs (instruction set architectures). In this article, the authors focus on HPL-PD because it represents the essence of the EPIC philosophy while avoiding the idiosyncrasies of a specific ISA.
- Published
- 2000
- Full Text
- View/download PDF
40. Itanium processor microarchitecture
- Author
-
H. Sharangpani and H. Arora
- Subjects
Orthogonal instruction set ,Reduced instruction set computing ,Computer architecture simulator ,Computer science ,Application-specific instruction-set processor ,Program counter ,CAS latency ,Microarchitecture ,Instruction set ,Memory address ,Computer architecture ,Hardware and Architecture ,Very long instruction word ,Explicitly parallel instruction computing ,Itanium ,Electrical and Electronic Engineering ,Software - Abstract
The Itanium processor is the first implementation of the IA-64 instruction set architecture (ISA). The design team optimized the processor to meet a wide range of requirements: high performance on Internet servers and workstations, support for 64-bit addressing, reliability for mission-critical applications, full IA-32 instruction set compatibility in hardware, and scalability across a range of operating systems and platforms. The processor employs EPIC (explicitly parallel instruction computing) design concepts for a tighter coupling between hardware and software. In this design style the hardware-software interface lets the software exploit all available compilation time information and efficiently deliver this information to the hardware. It addresses several fundamental performance bottlenecks in modern computers, such as memory latency, memory address disambiguation, and control flow dependencies.
- Published
- 2000
- Full Text
- View/download PDF
41. The limits of instruction level parallelism in SPEC95 applications
- Author
-
Trevor Mudge, David Greene, Gary Tyson, and Matthew Postiff
- Subjects
Instruction set ,Minimal instruction set computer ,Computer science ,Data parallelism ,Explicitly parallel instruction computing ,Task parallelism ,General Medicine ,Parallel computing ,Implicit parallelism ,Instruction-level parallelism - Published
- 1999
- Full Text
- View/download PDF
42. [Untitled]
- Author
-
B. Ramakrishna Rau, Shail Aditya, and Vinod Kathail
- Subjects
Computer science ,Opcode ,Parallel computing ,Chip ,computer.software_genre ,Computer architecture ,Hardware and Architecture ,Very long instruction word ,Explicitly parallel instruction computing ,Code generation ,Compiler ,Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING ,Instruction-level parallelism ,Hardware_REGISTER-TRANSFER-LEVELIMPLEMENTATION ,computer ,Software ,Register allocation - Abstract
In the past, due to the restricted gate count available on an inexpensive chip, embedded DSPs have had limited parallelism, few registers and irregular, incomplete interconnectivity. More recently, with increasing levels of integration, embedded VLIW processors have started to appear. Such processors typically have higher levels of instruction-level parallelism, more registers, and a relatively regular interconnect between the registers and the functional units. The central challenges faced by a code generator for an EPIC (Explicitly Parallel Instruction Computing) or VLIW processor are quite different from those for the earlier DSPs and, consequently, so is the structure of a code generator that is designed to be easily retargetable. In this paper, we explain the nature of the challenges faced by an EPIC or VLIW compiler and present a strategy for performing code generation in an incremental fashion that is best suited to generating high-quality code efficiently. We also describe the Operation Binding Lattice, a formal model for incrementally binding the opcodes and register assignments in an EPIC code generator. As we show, this reflects the phase structure of the EPIC code generator. It also defines the structure of the machine-description database, which is queried by the code generator for the information that it needs about the target processor. Lastly, we discuss our implementation of these ideas and techniques in Elcor, our EPIC compiler research infrastructure.
- Published
- 1999
- Full Text
- View/download PDF
43. An Overview of the EPIC Architecture for Cognition and Performance With Application to Human-Computer Interaction
- Author
-
David E. Meyer and David E. Kieras
- Subjects
Human-Computer Interaction ,Computational model ,Computer science ,Computer Applications ,Human–computer interaction ,Explicitly parallel instruction computing ,Information processing ,Cognition ,Construct (python library) ,Cognitive architecture ,EPIC ,Applied Psychology ,Variety (cybernetics) - Abstract
EPIC (Executive Process-Interactive Control) is a cognitive architecture especially suited for modeling human multimodal and multiple-task performance. The EPIC architecture includes peripheral sensory-motor processors surrounding a production-rule cognitive processor and is being used to construct precise computational models for a variety of human-computer interaction situations. We briefly describe some of these models to demonstrate how EPIC clarifies basic properties of human performance and provides usefully precise accounts of performance speed.
- Published
- 1997
- Full Text
- View/download PDF
44. Predictive engineering models based on the EPIC architecture for a multimodal high-performance human-computer interaction task
- Author
-
David E. Meyer, Scott D. Wood, and David E. Kieras
- Subjects
Human-Computer Interaction ,Computer science ,business.industry ,Human–computer interaction ,GOMS ,Interface (computing) ,Explicitly parallel instruction computing ,Usability engineering ,Task analysis ,Information processing ,Usability ,business ,Task (project management) - Abstract
Engineering models of human performance permit some aspects of usability of interface designs to be predicted from an analysis of the task, and thus they can replace to some extent expensive user-testing data. We successfully predicted human performance in telephone operator tasks with engineering models constructed in the EPIC (Executive Process-Interactive Control) architecture for human information processing, which is especially suited for modeling multimodal, complex tasks, and has demonstrated success in other task domains. Several models were constructed on an a priori basis to represent different hypotheses about how operators coordinate their activities to produce rapid task performance. The models predicted the total time with useful accuracy and clarified some important properties of the task. The best model was based directly on the GOMS analysis of the task and made simple assumptions about the operator's task strategy, suggesting that EPIC models are a feasible approach to predicting performance in multimodal high-performance tasks.
- Published
- 1997
- Full Text
- View/download PDF
45. Superscalar instruction issue
- Author
-
D. Sima
- Subjects
Orthogonal instruction set ,Out-of-order execution ,Reduced instruction set computing ,Programming language ,Computer science ,Instruction scheduling ,computer.software_genre ,Microarchitecture ,Instruction set ,Addressing mode ,Minimal instruction set computer ,Hardware and Architecture ,Very long instruction word ,Superscalar ,Explicitly parallel instruction computing ,Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING ,Electrical and Electronic Engineering ,computer ,Software - Abstract
Clearly, instruction issue and execution are closely related: The more parallel the instruction execution, the higher the requirements for the parallelism of instruction issue. Thus, we see the continuous and harmonized increase of parallelism in instruction issue and execution. This article focuses on superscalar instruction issue, tracing the way parallel instruction execution and issue have increased performance. It also spans the design space of instruction issue, identifying important design aspects and available design choices. The article also demonstrates a concise way to represent the design space using DS trees, reviews the most frequently used issue schemes, and highlights trends for each design aspect of instruction issue.
- Published
- 1997
- Full Text
- View/download PDF
46. Compilers for instruction-level parallelism
- Author
-
C.L. Thompson, J.Z. Fang, Kemal Ebcioglu, M.S. Schlansker, Thomas M. Conte, and J. Dehnert
- Subjects
General Computer Science ,Programming language ,Data parallelism ,Computer science ,Instruction scheduling ,Task parallelism ,Parallel computing ,computer.software_genre ,Instruction set ,Explicitly parallel instruction computing ,Parallelism (grammar) ,Compiler ,Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING ,Implicit parallelism ,Instruction-level parallelism ,computer - Abstract
Discovering and exploiting instruction level parallelism in code will be key to future increases in microprocessor performance. What technical challenges must compiler writers meet to better use ILP? Instruction level parallelism allows a sequence of instructions derived from a sequential program to be parallelized for execution on multiple pipelined functional units. If industry acceptance is a measure of importance, ILP has blossomed. It now profoundly influences the design of almost all leading edge microprocessors and their compilers. Yet the development of ILP is far from complete, as research continues to find better ways to use more hardware parallelism over a broader class of applications.
- Published
- 1997
- Full Text
- View/download PDF
47. PARALLELISING COMPILERS AND SYSTEMS
- Author
-
Boleslaw K. Szymanski and Balaram Sinharoy
- Subjects
General Computer Science ,Memory hierarchy ,Computer science ,Data parallelism ,Programming language ,Embarrassingly parallel ,Task parallelism ,Parallel computing ,computer.software_genre ,Supercomputer ,Explicitly parallel instruction computing ,Implicit parallelism ,Instruction-level parallelism ,computer - Abstract
In recent years, high performance computing underwent a deep transformation. In this paper, we review the state of parallel computation with detailed discussion of the current and future research issues in the area of parallel architectures and compilation methods, instruction level parallelism and optimization methods to improve the performance of the memory hierarchy.
- Published
- 1997
- Full Text
- View/download PDF
48. A configurable multi-core processor for teaching parallel processing
- Author
-
L.S.K. Udugama, W. V. Kuruppuarachchi, and Janath Geeganage
- Subjects
Instruction set ,SISD ,Multi-core processor ,MIMD ,Computer architecture ,Parallel processing (DSP implementation) ,Computer science ,Explicitly parallel instruction computing ,Central processing unit ,SIMD - Abstract
Parallel processing is a complex topic found in computing education and has become an essential topic in the curricula owing to the recent developments in both software and hardware. Ensuring access to parallel computers in order to provide a better education at universities is not guaranteed due to the high cost of these devices. Alternatively, parallel processing can be taught using simulators. Accordingly, a multi-core processor, MCSEP, was developed as a tool for teaching parallel computing and architectures. MCSEP consists of 16 SEP (Students' Experimental Processor) cores connected via a 2D mesh. It can be configured to implement the following parallel architectures found in Flynn's taxonomy: Single Instruction Single Data (SISD), Single Instruction Multiple Data (SIMD), and Multiple Instructions Multiple Data (MIMD). In addition, Multiple-SIMD and Multiple-MIMD are also implemented. The salient feature of MCSEP is its ability to configure each core using any of the six instruction set architectures (ISAs) available in SEP. MCSEP is designed and modeled using VHDL. Therefore, it enables the implementation on FPGAs.
- Published
- 2013
49. The history and use of pipelining computer architecture: MIPS pipelining implementation
- Author
-
Iro Pantazi-Mytarelli
- Subjects
Instruction set ,Branch predication ,Software pipelining ,Reduced instruction set computing ,Computer architecture ,Cycles per instruction ,Computer science ,Explicitly parallel instruction computing ,Classic RISC pipeline ,Parallel computing ,Central processing unit - Abstract
Pipelining is an implementation technique whereby multiple instructions are overlapped in execution; it takes advantage of the parallelism that exists among the actions needed to execute an instruction. Today, pipelining is the key implementation technique used to make fast CPUs. However, most of the time there are data dependencies that create problems during execution and need to be resolved. In this paper, we implemented pipelining in the MIPS architecture and observed how our system handled data dependencies.
- Published
- 2013
50. Extending and Applying the EPIC Architecture for Human Cognition and Performance: Auditory and Spatial Components
- Author
-
David E. Kieras and Gregory H. Wakefield
- Subjects
Auditory perception ,Navy ,SIMPLE (military communications protocol) ,Human–computer interaction ,Computer science ,Speech recognition ,Explicitly parallel instruction computing ,Cognition ,Cognitive architecture ,Architecture ,Task (project management) - Abstract
This is the final report for a project that was in a series of projects on the development and validation of the EPIC cognitive architecture for modeling human cognition and performance. This project focussed on extending the architecture to account for sound and speech phenomena, with emphasis on multichannel speech comprehension in a simple command-and-control task for which considerable empirical data is available. Additional work concerned application of the EPIC architecture to Navy research problems.
- Published
- 2013