721 results
Search Results
102. Service Rate Control of Tandem Queues With Power Constraints.
- Author
-
Xia, Li, Miller, Daniel, Zhou, Zhengyuan, and Bambos, Nicholas
- Subjects
-
*CLIENT/SERVER computing, *COST functions, *RESOURCE management, *FEEDBACK control systems, *CONVEX functions, *MATHEMATICAL optimization
- Abstract
In this paper, we study the optimal control of service rates of a tandem queue with power constraints. The service rate of a server is determined by the power allocated to that server. The total power of the system is fixed. The system cost comprises two parts: the holding cost reflecting the congestion of queues and the operating cost reflecting the power consumed at servers. The optimization objective is to find the optimal power allocation policy among servers that minimizes the average system cost. We formulate this problem as a Markov decision process with a constrained action space. Sensitivity-based optimization theory is applied to study this problem. The necessary and sufficient condition for optimal service rates, and the optimality of the vertices of the feasible domain, are derived when the operating cost has a linear or concave form. An iterative algorithm is further developed to find the optimal service rates. This algorithm may work well even when the cost function has a general form. The extension to general tandem queues with many servers is also studied. Finally, we conduct numerical experiments under different parameter settings to demonstrate the main idea of this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
103. An Efficient Methodology for Mapping Quantum Circuits to the IBM QX Architectures.
- Author
-
Zulehner, Alwin, Paler, Alexandru, and Wille, Robert
- Subjects
-
*QUANTUM computers, *QUANTUM gates, *QUANTUM computing, *SOFTWARE development tools, *COMPUTER architecture, *LOGIC circuits
- Abstract
In the past years, quantum computers have more and more evolved from an academic idea to an upcoming reality. IBM's project IBM Q can be seen as evidence of this progress. Launched in March 2017 with the goal of providing access to quantum computers for a broad audience, the project has allowed users to conduct quantum experiments on a 5-qubit and, since June 2017, also on a 16-qubit quantum computer (called IBM QX2 and IBM QX3, respectively). Revised versions of these 5- and 16-qubit quantum computers (named IBM QX4 and IBM QX5, respectively) have been available since September 2017. In order to use these, the desired quantum functionality (e.g., provided in terms of a quantum circuit) has to be properly mapped so that the underlying physical constraints are satisfied—a complex task. This demands solutions to automatically and efficiently conduct this mapping process. In this paper, we propose a methodology which addresses this problem, i.e., maps the given quantum functionality to a realization which satisfies all constraints given by the architecture and, at the same time, keeps the overhead in terms of additionally required quantum gates minimal. The proposed methodology is generic, can easily be configured for similar future architectures, and is fully integrated into IBM's SDK. Experimental evaluations show that the proposed approach clearly outperforms IBM's own mapping solution. In fact, for many quantum circuits, the proposed approach determines a mapping to the IBM architecture within minutes, while IBM's solution suffers from long runtimes and runs into a timeout of 1 h in several cases. As an additional benefit, the proposed approach yields mapped circuits with smaller costs (i.e., fewer additional gates are required). All implementations of the proposed methodology are publicly available at http://iic.jku.at/eda/research/ibm_qx_mapping. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
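The core difficulty entry 103 addresses is that each two-qubit gate must act on physically adjacent qubits of the target architecture. As a point of orientation only (this is not the authors' methodology), here is a minimal greedy router in Python that inserts SWAPs along a BFS shortest path; the 5-qubit linear coupling graph is hypothetical.

```python
# A hypothetical 5-qubit linear coupling graph; real IBM QX devices restrict
# two-qubit gates to specific physical pairs in a similar (directed) way.
from collections import deque

COUPLING = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

def shortest_path(src, dst):
    """BFS shortest path between two physical qubits."""
    prev, frontier = {src: None}, deque([src])
    while frontier:
        q = frontier.popleft()
        if q == dst:
            break
        for nb in COUPLING[q]:
            if nb not in prev:
                prev[nb] = q
                frontier.append(nb)
    path = [dst]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1]

def map_circuit(gates, layout):
    """gates: (control, target) logical pairs; layout: logical -> physical."""
    out = []
    for c, t in gates:
        path = shortest_path(layout[c], layout[t])
        # swap the control along the path until it is adjacent to the target
        for a, b in list(zip(path, path[1:]))[:-1]:
            out.append(("SWAP", a, b))
            inv = {p: l for l, p in layout.items()}
            layout[inv[a]], layout[inv[b]] = b, a   # keep the mapping current
        out.append(("CNOT", layout[c], layout[t]))
    return out

print(map_circuit([(0, 4), (1, 2)], {i: i for i in range(5)}))
```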
104. A Reconfigurable Memory-Based Fast VLSI Architecture for Computation of the Histogram.
- Author
-
Mondal, Pulak and Banerjee, Swapna
- Subjects
-
*HISTOGRAMS, *VERY large scale circuit integration, *DIGITAL image processing, *COMPUTERS, *FIELD programmable gate arrays
- Abstract
Histogram computation is a fundamental task often encountered in image processing systems. It plays an important role in various applications such as image registration and show-through correction in scanned images of duplex-mode printed documents. Mutual information (MI) is one of the best metrics for intensity-based multimodality image registration. Similarity measurement between the images consumes a significant amount of the execution time in image registration. Computation of MI requires obtaining the individual and joint histograms of two images. Typical joint histogram sizes range between 32 × 32 and 256 × 256. The demand for hardware resources in histogram computation increases substantially with histogram size. The array-based method may not be a suitable candidate in this application because of the large histogram size. Computation of the histogram is inherently sequential in nature, but a parallel computation would reduce the processing time, which would benefit various imaging systems. In this paper, a memory-based parallel algorithm for histogram computation and its possible VLSI architecture are presented. The architecture is mapped onto a field programmable gate array. The proposed architecture utilizes 99.66% less hardware than the latest architecture available in the literature and consumes 8.78 mW of power. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
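Entry 104's use case is mutual information for image registration, computed from the individual and joint histograms of two images. A minimal software sketch of that computation, which the paper's VLSI architecture parallelizes in hardware, follows; the bin count and image sizes are arbitrary choices.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=64):
    """MI (in bits) between two equally sized 8-bit grayscale images."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(),
                                 bins=bins, range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()                  # joint distribution
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)  # marginals (individual histograms)
    nz = pxy > 0                               # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])))

rng = np.random.default_rng(0)
a = rng.integers(0, 256, (256, 256))
b = rng.integers(0, 256, (256, 256))
print(mutual_information(a, a))   # ~6 bits: an image against itself
print(mutual_information(a, b))   # ~0: independent images
```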
105. A Coarse-to-Fine Model for Rail Surface Defect Detection.
- Author
-
Yu, Haomin, Li, Qingyong, Tan, Yunqiang, Gan, Jinrui, Wang, Jianzhu, Geng, Yangli-ao, and Jia, Lei
- Subjects
-
*COMPUTERS, *DETECTORS, *RAILROADS, *MECHANICAL abrasion, *ULTRASONIC imaging
- Abstract
Computer vision systems have attracted much attention in recent years for use in detecting surface defects on rails; however, accurate and efficient recognition of possible defects remains challenging due to the variations shown by defects and also noise. This paper proposes a coarse-to-fine model (CTFM) to identify defects at different scales. The model works on three scales from coarse to fine: subimage level, region level, and pixel level. At the subimage level, the background subtraction model exploits row consistency in the longitudinal direction, and strongly filters the defect-free range, leaving roughly identified subimages within which defects may exist. At the next level, the region extraction model, inspired by visual saliency models, locates definite defect regions using phase-only Fourier transforms. At the finest level, the pixel subtraction model uses pixel consistency to refine the shape of each defect. The proposed method is evaluated using Type-I and Type-II rail surface defect detection data sets and an actual rail line. The experimental results show that CTFM outperforms state-of-the-art methods according to both the pixel-level index and the defect-level index. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
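The region level of entry 105 relies on a phase-only Fourier transform, a standard visual-saliency device: discarding the magnitude spectrum makes small anomalies stand out against a repetitive background. Below is a minimal sketch of that single ingredient (not the full CTFM pipeline), with an illustrative test image and smoothing kernel.

```python
import numpy as np

def phase_only_saliency(image):
    """Saliency map from the phase spectrum of a 2-D float array."""
    spectrum = np.fft.fft2(image)
    phase = spectrum / (np.abs(spectrum) + 1e-12)   # unit-magnitude spectrum
    recon = np.abs(np.fft.ifft2(phase)) ** 2        # back-transform energy
    pad = np.pad(recon, 1, mode="edge")             # 3x3 box smoothing
    h, w = recon.shape
    return sum(pad[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

img = np.ones((64, 64))
img[30:34, 40:44] = 5.0                 # a small "defect" on a flat background
sal = phase_only_saliency(img)
print(np.unravel_index(sal.argmax(), sal.shape))   # peak falls in/near the block
```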
106. Content Aware Refresh: Exploiting the Asymmetry of DRAM Retention Errors to Reduce the Refresh Frequency of Less Vulnerable Data.
- Author
-
Wang, Shibo, Bojnordi, Mahdi Nazm, Guo, Xiaochen, and Ipek, Engin
- Subjects
-
*ERROR correction (Information theory), *DYNAMIC random access memory, *MICROPROCESSORS, *COMPUTER systems, *COMPUTER architecture
- Abstract
DRAM refresh is responsible for significant performance and energy overheads in a wide range of computer systems, from mobile platforms to datacenters. With the growing demand for DRAM capacity and the worsening retention time characteristics of deeply scaled DRAM, refresh is expected to become an even more pronounced problem in future technology generations. This paper examines content aware refresh, a new technique that reduces the refresh frequency by exploiting the unidirectional nature of DRAM retention errors: assuming that a logical 1 and 0 respectively are represented by the presence and absence of charge, 1-to-0 failures are much more likely than 0-to-1 failures. As a result, in a DRAM system that uses a block error correcting code (ECC) to protect memory, blocks with fewer 1s can attain a specified reliability target (i.e., mean time to failure) with a refresh rate lower than that which is required for a block with all 1s. Leveraging this key insight, and without compromising memory reliability, the proposed content aware refresh mechanism refreshes memory blocks with fewer 1s less frequently. To keep the overhead of tracking multiple refresh rates manageable, refresh groups—groups of DRAM rows refreshed together—are dynamically arranged into one of a predefined number of refresh bins and refreshed at the rate determined by the ECC block with the greatest number of 1s in that bin. By tailoring the refresh rate to the actual content of a memory block rather than assuming a worst case data pattern, content aware refresh respectively outperforms DRAM systems that employ RAS-only Refresh, all-bank Auto Refresh, and per-bank Auto Refresh mechanisms by 12, 8, and 13 percent. It also reduces DRAM system energy by 15, 13, and 16 percent as compared to these systems. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
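The binning scheme of entry 106 is easy to picture in code: a refresh group is assigned to a bin by the largest ones-count among its ECC blocks, and the bin fixes the refresh interval. The thresholds and intervals below are invented for illustration; the paper derives actual rates from a reliability target.

```python
BITS_PER_BLOCK = 64

# hypothetical bin table: fewer 1s tolerate a longer refresh interval (ms)
INTERVAL_MS = {0: 256, 1: 128, 2: 64}

def refresh_bin(blocks):
    """Bin a refresh group by the largest ones-count over its ECC blocks."""
    worst = max(bin(b).count("1") for b in blocks)
    if worst <= BITS_PER_BLOCK // 4:
        return 0            # mostly 0s: refresh least often
    if worst <= BITS_PER_BLOCK // 2:
        return 1
    return 2                # many 1s: conventional worst-case rate

group = [0x0F0F, 0xFFFF_FFFF, 0x1]      # contents of one refresh group
b = refresh_bin(group)
print(f"bin {b}: refresh every {INTERVAL_MS[b]} ms")
```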
107. Sensitivity of the Projected Subtraction Approach to Mesh Degeneracies and Its Impact on the Forward Problem in EEG.
- Author
-
Beltrachini, Leandro
- Subjects
-
*ELECTROENCEPHALOGRAPHY, *TETRAHEDRA, *TESSELLATIONS (Mathematics), *COMPUTERS, *ERRORS
- Abstract
Objective: Subtraction-based techniques are known for being theoretically rigorous and accurate methods for solving the forward problem in electroencephalography (EEG-FP) by means of the finite-element method. Within them, the projected subtraction (PS) approach is generally adopted because of its computational efficiency. Although this technique has received the attention of the community, its sensitivity to degenerated elements is still poorly understood. In this paper, we investigate the impact of low-quality tetrahedra on the results computed with the PS approach. Methods: We derived upper bounds on the relative error of the element source vector as a function of geometrical features describing the tetrahedral discretization of the domain. These error bounds were then utilized for showing the instability of the PS method with regard to the mesh quality. To overcome this issue, we proposed an alternative technique, coined the projected gradient subtraction (PGS) approach, that exploits the stability of the corresponding bounds. Results: Computer simulations showed that the PS method is extremely sensitive to the mesh shape and size, leading to unacceptable solutions of the EEG-FP when suboptimal tessellations are used. This was not the case for the PGS approach, which led to stable and accurate results in a comparable amount of time. Conclusion: Solutions of the EEG-FP computed with the PS method are highly sensitive to degenerated elements. Such errors can be mitigated by the PGS approach, which showed better performance than the PS technique. Significance: The PGS is an efficient method for computing high-quality lead field matrices even in the presence of degenerated elements. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
108. Optimizing Polynomial Convolution for NTRUEncrypt.
- Author
-
Dai, Wei, Whyte, William, and Zhang, Zhenfei
- Subjects
-
*COMPUTER network security, *ENCRYPTION protocols, *COMPUTER security standards, *ALGORITHMS, *PUBLIC key cryptography
- Abstract
NTRUEncrypt is one of the most promising candidates for quantum-safe cryptography. In this paper, we focus on the NTRU743 parameter set. We give a report on all known attacks against this parameter set and show that it delivers 256 bits of security against classical attackers and 128 bits of security against quantum attackers. We then present a parameter-dependent optimization using a tailored hierarchy of multiplication algorithms as well as the Intel AVX2 instructions, and show that this optimization is constant-time. Our implementation is two to three times faster than the reference implementation of NTRUEncrypt. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
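The operation entry 108 optimizes is polynomial convolution in the NTRU ring Z_q[x]/(x^N − 1). For orientation, a plain schoolbook version of that "star" multiplication is sketched below; the paper's contribution, a tailored hierarchy of multiplication algorithms with constant-time AVX2 code, is precisely what this naive version is not.

```python
def star_multiply(a, b, N, q):
    """Coefficient lists a, b of length N; returns a*b mod (x^N - 1, q)."""
    c = [0] * N
    for i, ai in enumerate(a):
        if ai == 0:
            continue                      # NTRU operands are sparse/ternary
        for j, bj in enumerate(b):
            c[(i + j) % N] = (c[(i + j) % N] + ai * bj) % q
    return c

# toy parameters; the real NTRU743 set uses N = 743 with q = 2048
print(star_multiply([1, 2, 0, 1], [3, 0, 1, 0], N=4, q=16))  # [3, 7, 1, 5]
```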
109. Toward Optimal Secure Distributed Storage Systems With Exact Repair.
- Author
-
Tandon, Ravi, Amuru, SaiDhiraj, Clancy, Thomas Charles, and Buehrer, Richard Michael
- Subjects
-
*OPTIMAL control theory, *DISTRIBUTED computing, *INFORMATION retrieval, *BANDWIDTHS, *DISTRIBUTED databases
- Abstract
Distributed storage systems (DSSs) in the presence of an external wiretapper are considered. A DSS is parameterized by (n, k, d), in which the data are stored across n nodes (each with storage capacity α), and must be recoverable by accessing the contents stored on any k out of n nodes. If a node fails, any d ≥ k out of (n − 1) nodes help in the repair (regeneration) of the failed node (by sending dβ units of repair data, where β ≤ α), so that the data can still be recovered from the DSS. For such an (n, k, d)-DSS, security from the two types of wiretappers is investigated: 1) the Type-I (node data) wiretapper, which can read the data stored on any ℓ
- Published
- 2016
- Full Text
- View/download PDF
110. Efficient Search of Girth-Optimal QC-LDPC Codes.
- Author
-
Tasdighi, Alireza, Banihashemi, Amir H., and Sadeghi, Mohammad-Reza
- Subjects
-
*ISOMORPHISM (Mathematics), *TANNER graphs, *COMPUTATIONAL complexity, *ELECTRONIC data processing, *MACHINE theory
- Abstract
In this paper, we study the cycle structure of quasi-cyclic (QC) low-density parity-check (LDPC) codes with the goal of obtaining the shortest code with a given degree distribution and girth. We focus on QC-LDPC codes, whose Tanner graphs are cyclic liftings of fully connected base graphs of size 3 × n, n ≥ 4, and obtain minimal lifting degrees that result in girths 6 and 8. This is performed through an efficient exhaustive search, and as a result, we also find all the possible non-isomorphic codes with the same minimum block length, girth, and degree distribution. The exhaustive search, which is ordinarily a formidable task, is made possible by pruning the search space of many codes that are isomorphic to those previously examined in the search process. Many of the pruning techniques proposed in this paper are also applicable to QC-LDPC codes with base graphs other than the 3 × n fully connected ones discussed here, as well as to codes with a larger girth. To further demonstrate the effectiveness of the pruning techniques, we use them to search for QC-LDPC codes with girths 10 and 12, and find a number of such codes that have a shorter block length compared with the best known similar codes in the literature. In addition, motivated by the exhaustive search results, we tighten the lower bound on the block length of QC-LDPC codes of girth 6 constructed from fully connected 3 × n base graphs, and construct codes that achieve the lower bound for an arbitrary value of n ≥ 4. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
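The object entry 110 searches over is the cyclic lifting of a 3 × n base graph, judged by its girth. The sketch below shows the underlying check, building the lifted Tanner graph from an exponent matrix and measuring girth by BFS; the exponent matrix is an arbitrary array-code-style example, and the isomorphism pruning that makes the paper's exhaustive search feasible is omitted.

```python
from collections import deque

def girth(P, L):
    """Girth of the Tanner graph lifted from exponent matrix P with degree L."""
    rows, cols = len(P), len(P[0])

    def neighbors(node):
        kind, a, s = node
        if kind == "v":     # variable node (column a, shift s) -> check nodes
            return [("c", i, (s + P[i][a]) % L) for i in range(rows)]
        return [("v", j, (s - P[a][j]) % L) for j in range(cols)]

    best = float("inf")
    for j in range(cols):                    # BFS from every variable node
        for shift in range(L):
            root = ("v", j, shift)
            dist, frontier = {root: 0}, deque([root])
            while frontier:
                u = frontier.popleft()
                for w in neighbors(u):
                    if w not in dist:
                        dist[w] = dist[u] + 1
                        frontier.append(w)
                    elif dist[w] >= dist[u]:   # non-tree edge closes a cycle
                        best = min(best, dist[u] + dist[w] + 1)
    return best

P = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 4, 6]]   # P[i][j] = i*j mod 7
print(girth(P, L=7))                             # 6: this lifting has no 4-cycles
```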
111. IEMI Threats for Information Security: Remote Command Injection on Modern Smartphones.
- Author
-
Kasmi, Chaouki and Lopes Esteves, Jose
- Subjects
-
*ELECTROMAGNETIC interference, *INFORMATION technology security, *SMARTPHONES, *ELECTRONIC equipment, *CYBERTERRORISM
- Abstract
Numerous papers dealing with the analysis of electromagnetic attacks against critical electronic devices have been made publicly available. In this paper, we exploit the principle of front-door coupling on smartphones' headphone cables with specific electromagnetic waveforms. We present a smart use of intentional electromagnetic interference, resulting in finer impacts on an information system than a classical denial of service effect. As an outcome, we introduce a new silent remote voice command injection technique on modern smartphones. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
112. Design of Hybrid Second-Level Caches.
- Author
-
Valero, Alejandro, Sahuquillo, Julio, Petit, Salvador, Lopez, Pedro, and Duato, Jose
- Subjects
-
*CACHE memory, *HYBRID systems, *SYSTEMS design, *EMBEDDED computer systems, *ENERGY consumption
- Abstract
In recent years, embedded dynamic random-access memory (eDRAM) technology has been implemented in last-level caches due to its low leakage energy consumption and high density. However, the fact that eDRAM presents slower access time than static RAM (SRAM) technology has prevented its inclusion in higher levels of the cache hierarchy. This paper proposes to mingle SRAM and eDRAM banks within the data array of second-level (L2) caches. The main goal is to achieve the best trade-off among performance, energy, and area. To this end, two main directions have been followed. First, this paper explores the optimal percentage of banks for each technology. Second, the cache controller is redesigned to deal with performance and energy. Performance is addressed by keeping the blocks most likely to be accessed in fast SRAM banks. In addition, energy savings are further enhanced by avoiding unnecessary destructive reads of eDRAM blocks. Experimental results show that, compared to a conventional SRAM L2 cache, a hybrid approach requiring similar or even lower area speeds up performance on average by 5.9 percent, while total energy savings reach 32 percent. For a 45 nm technology node, the energy-delay-area product confirms that a hybrid cache is a better design than the conventional SRAM cache regardless of the number of eDRAM banks, and also better than a conventional eDRAM cache when the number of SRAM banks is an eighth of the total number of cache banks. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
113. Analysis of Introducing Active Learning Methodologies in a Basic Computer Architecture Course.
- Author
-
Arbelaitz, Olatz, Martin, Jose I., and Muguerza, Javier
- Subjects
-
*ACTIVE learning, *COMPUTER architecture, *INTERDISCIPLINARY education, *ACADEMIC workload of students, *PROJECT method in teaching, *STUDENT interests, *TEACHING methods
- Abstract
This paper presents an analysis of introducing active methodologies in the Computer Architecture course taught in the second year of the Computer Engineering Bachelor's degree program at the University of the Basque Country (UPV/EHU), Spain. The paper reports the experience from three academic years, 2011–2012, 2012–2013, and 2013–2014, in which three types of data were considered for analysis: students' dedication, as measured by time spent on the project, their marks, and their level of satisfaction. The study shows that about 86% of students are satisfied with the teaching methodology and are willing to continue using it in future courses. The study also shows that the active methodologies used contribute to the students' cross-curricular training and do not generate any great increase in student workload. Finally, a statistical analysis of the evolution of student performance showed that marks improved to a statistically significant extent after introducing active methodologies. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
114. Parallel H.264/AVC Fast Rate-Distortion Optimized Motion Estimation by Using a Graphics Processing Unit and Dedicated Hardware.
- Author
-
Shahid, Muhammad Usman, Ahmed, Ashfaq, Martina, Maurizio, Masera, Guido, and Magli, Enrico
- Subjects
-
*ESTIMATION theory, *GRAPHICS processing units, *COMPUTERS, *FIELD programmable gate arrays, *INTEGRATED circuits, *MOTION estimation (Signal processing), *RATE distortion theory
- Abstract
Heterogeneous systems on a single chip composed of a central processing unit, graphics processing unit (GPU), and field-programmable gate array (FPGA) are expected to emerge in the near future. In this context, the system on chip can be dynamically adapted to employ different architectures for execution of data-intensive applications. Motion estimation (ME) is one such task that can be accelerated using FPGA and GPU for high-performance H.264/Advanced Video Coding encoder implementation. This paper presents an inherently parallel low-complexity rate-distortion (RD) optimized fast ME algorithm well suited for parallel implementations, eliminating various data dependencies caused by a reliance on spatial predictions. In addition, this paper provides details of the GPU and FPGA implementations of the parallel algorithm by using OpenCL and Very High Speed Integrated Circuits (VHSIC) Hardware Description Language (VHDL), respectively, and presents a practical performance comparison between the two implementations. The experimental results show that the proposed scheme achieves significant speedup on GPU and FPGA, and has comparable RD performance with respect to the sequential fast ME algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
115. Conceptual Design of 3-D FDTD Dedicated Computer With Dataflow Architecture for High Performance Microwave Simulation.
- Author
-
Kawaguchi, Hideki and Matsuoka, Shun-Suke
- Subjects
-
*CONCEPTUAL design, *FINITE difference time domain method, *DATA flow computing, *COMPUTER architecture, *MICROWAVES
- Abstract
For practical use of microwave simulations in industry applications such as high-frequency product design, this paper presents a conceptual design of a 3-D finite difference time domain (FDTD) dedicated computer with dataflow architecture as one of the portable high performance computing technologies. A basic concept of the dataflow architecture for the FDTD dedicated computer itself was already presented in 2003 for 2-D microwave simulations. A detailed design of the 3-D FDTD dataflow machine is considered in this paper. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
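The kernel that entry 115's dataflow machine pipelines is the 3-D FDTD (Yee) update, in which each field component advances from spatial differences of the other fields. A minimal vacuum-grid sketch with illustrative constants:

```python
import numpy as np

n, steps, c = 32, 50, 0.5          # grid size, time steps, Courant number
Ex, Ey, Ez = (np.zeros((n, n, n)) for _ in range(3))
Hx, Hy, Hz = (np.zeros((n, n, n)) for _ in range(3))

for t in range(steps):
    # H update from the curl of E
    Hx[:, :-1, :-1] += c * ((Ey[:, :-1, 1:] - Ey[:, :-1, :-1])
                            - (Ez[:, 1:, :-1] - Ez[:, :-1, :-1]))
    Hy[:-1, :, :-1] += c * ((Ez[1:, :, :-1] - Ez[:-1, :, :-1])
                            - (Ex[:-1, :, 1:] - Ex[:-1, :, :-1]))
    Hz[:-1, :-1, :] += c * ((Ex[:-1, 1:, :] - Ex[:-1, :-1, :])
                            - (Ey[1:, :-1, :] - Ey[:-1, :-1, :]))
    # E update from the curl of H
    Ex[:, 1:, 1:] += c * ((Hz[:, 1:, 1:] - Hz[:, :-1, 1:])
                          - (Hy[:, 1:, 1:] - Hy[:, 1:, :-1]))
    Ey[1:, :, 1:] += c * ((Hx[1:, :, 1:] - Hx[1:, :, :-1])
                          - (Hz[1:, :, 1:] - Hz[:-1, :, 1:]))
    Ez[1:, 1:, :] += c * ((Hy[1:, 1:, :] - Hy[:-1, 1:, :])
                          - (Hx[1:, 1:, :] - Hx[1:, :-1, :]))
    Ez[n // 2, n // 2, n // 2] += np.sin(0.3 * t)   # soft point source

# stays bounded because c is below the 3-D Courant limit (~0.577)
print(float(np.abs(Ez).max()))
```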
116. Symmetric Property and Reliability of Balanced Hypercube.
- Author
-
Zhou, Jin-Xin, Wu, Zhen-Lin, Yang, Shi-Chen, and Yuan, Kui-Wu
- Subjects
-
*HYPERCUBE networks (Computer networks), *RELIABILITY in engineering, *INTEGRATED circuit interconnections, *CAYLEY graphs, *PROBLEM solving
- Abstract
Huang and Wu in [IEEE Transactions on Computers 46 (1997) 484-490] introduced the balanced hypercube BH_n as an interconnection network topology for computing systems, and they proved that BH_n is vertex-transitive. However, some other symmetric properties, say edge-transitivity and arc-transitivity, of BH_n remained unknown. In this paper, we solve this problem and prove that BH_n is an arc-transitive Cayley graph. Using this, we also investigate some reliability measures, including super-connectivity, cyclic connectivity, etc., in BH_n. First, we prove that every minimum edge-cut of BH_n (n ≥ 2) isolates a vertex, and every minimum vertex-cut of BH_n (n ≥ 3) isolates a vertex. This is stronger than the result obtained by Wu and Huang, which shows that the connectivity and edge-connectivity of BH_n are 2n. Second, Yang [Applied Mathematics and Computation 219 (2012) 970-975] proved that for n ≥ 2, the super-connectivity of BH_n is 4n − 4 and the super edge-connectivity of BH_n is 4n − 2. In this paper, we prove that BH_n (n ≥ 2) is super-λ′ but not super-κ′. That is, every minimum super edge-cut of BH_n (n ≥ 2) isolates an edge, but the minimum super vertex-cut of BH_n (n ≥ 2) does not isolate an edge. Third, we also obtain that for n ≥ 2, the cyclic connectivity of BH_n is 4n − 4 and the cyclic edge-connectivity of BH_n is 4(2n − 2). That is, to disconnect BH_n (n ≥ 2) into at least two components each containing a cycle, we need to remove at least 4n − 4 vertices (resp. 4(2n − 2) edges). [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
117. A Stochastic Approach to Analysis of Energy-Aware DVS-Enabled Cloud Datacenters.
- Author
-
Xia, YunNi, Zhou, MengChu, Luo, Xin, Pang, ShanChen, and Zhu, QingSheng
- Subjects
-
*CLOUD computing, *ELECTRIC power consumption, *QUEUEING networks, *STOCHASTIC analysis, *VIRTUAL machine systems, *MARKOV processes
- Abstract
With the increasing call for green cloud, reducing energy consumption has been an important requirement for cloud resource providers not only to reduce operating costs, but also to improve system reliability. Dynamic voltage scaling (DVS) has been a key technique in exploiting the hardware characteristics of cloud datacenters to save energy by lowering the supply voltage and operating frequency. This paper presents a novel stochastic framework for energy efficiency and performance analysis of DVS-enabled cloud. This framework uses virtual machine request arrival rate, failure rate, repair rate, and service rate of datacenter servers as model inputs. Based on a queuing-network-based analysis, this paper gives analytic solutions of three metrics. The proposed framework can be used to help the design and optimization of energy-aware high performance cloud systems. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
118. Robotic Adherent Cell Injection for Characterizing Cell–Cell Communication.
- Author
-
Liu, Jun, Siragam, Vinayakumar, Gong, Zheng, Chen, Jun, Fridman, Michael D., Leung, Clement, Lu, Zhe, Ru, Changhai, Xie, Shaorong, Luo, Jun, Hamilton, Robert M., and Sun, Yu
- Subjects
-
*MEDICAL robotics, *CELL membranes, *MUSCLE cells, *CELL lines, *BIOLOGICAL membranes
- Abstract
Compared to robotic injection of suspended cells (e.g., embryos and oocytes), fewer attempts were made to automate the injection of adherent cells (e.g., cancer cells and cardiomyocytes) due to their smaller size, highly irregular morphology, small thickness (a few micrometers thick), and large variations in thickness across cells. This paper presents a robotic system for automated microinjection of adherent cells. The system is embedded with several new capabilities: automatically locating micropipette tips; robustly detecting the contact of micropipette tip with cell culturing surface and directly with cell membrane; and precisely compensating for accumulative positioning errors. These new capabilities make it practical to perform adherent cell microinjection truly via computer mouse clicking in front of a computer monitor, on hundreds and thousands of cells per experiment (versus a few to tens of cells as state of the art). System operation speed, success rate, and cell viability rate were quantitatively evaluated based on robotic microinjection of over 4000 cells. This paper also reports the use of the new robotic system to perform cell–cell communication studies using large sample sizes. The gap junction function in a cardiac muscle cell line (HL-1 cells), for the first time, was quantified with the system. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
119. Statistics of the MLE and Approximate Upper and Lower Bounds–Part II: Threshold Computation and Optimal Pulse Design for TOA Estimation.
- Author
-
Mallat, Achraf, Gezici, Sinan, Dardari, Davide, and Vandendorpe, Luc
- Subjects
-
*DATA distribution, *ASYMPTOTIC distribution, *SIGNAL-to-noise ratio, *COMPUTERS, *SIGNAL processing
- Abstract
Threshold and ambiguity phenomena are studied in Part I of this paper, where approximations for the mean-squared error (MSE) of the maximum-likelihood estimator are proposed using the method of interval estimation (MIE), and where approximate upper and lower bounds are derived. In this part, we consider time-of-arrival estimation and we employ the MIE to derive closed-form expressions of the begin-ambiguity, end-ambiguity and asymptotic signal-to-noise ratio (SNR) thresholds with respect to some features of the transmitted signal. Both baseband and passband pulses are considered. We prove that the begin-ambiguity threshold depends only on the shape of the envelope of the ACR, whereas the end-ambiguity and asymptotic thresholds depend only on the shape of the ACR. We exploit the results on the begin-ambiguity and asymptotic thresholds to optimize, with respect to the available SNR, the pulse that achieves the minimum attainable MSE. The results of this paper are valid for various estimation problems. [ABSTRACT FROM PUBLISHER]
- Published
- 2014
- Full Text
- View/download PDF
120. On Characterization of Elementary Trapping Sets of Variable-Regular LDPC Codes.
- Author
-
Karimi, Mehdi and Banihashemi, Amir H.
- Subjects
-
*LOW density parity check codes, *TANNER graphs, *ITERATIVE decoding, *CHARGE carriers, *COMPUTER algorithms, *ERROR correction (Information theory)
- Abstract
In this paper, we study the graphical structure of elementary trapping sets (ETSs) of variable-regular low-density parity-check (LDPC) codes. ETSs are known to be the main cause of error floor in LDPC coding schemes. For the set of LDPC codes with a given variable node degree d_l and girth g, we identify all the nonisomorphic structures of an arbitrary class of (a, b) ETSs, where a is the number of variable nodes and b is the number of odd-degree check nodes in the induced subgraph of the ETS. This paper leads to a simple characterization of dominant classes of ETSs (those with relatively small values of a and b) based on short cycles in the Tanner graph of the code. For such classes of ETSs, we prove that any set S in the class is a layered superset (LSS) of a short cycle, where the term layered is used to indicate that there is a nested sequence of ETSs that starts from the cycle and grows, one variable node at a time, to generate S. This characterization corresponds to a simple search algorithm that starts from the short cycles of the graph and finds all the ETSs with LSS property in a guaranteed fashion. Specific results on the structure of ETSs are presented for d_l = 3, 4, 5, 6, g = 6, 8, and a, b ≤ 10 in this paper. The results of this paper can be used for the error floor analysis and for the design of LDPC codes with low error floors. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
121. Hardware-Based Trusted Computing Architectures for Isolation and Attestation.
- Author
-
Maene, Pieter, Gotzfried, Johannes, de Clercq, Ruan, Muller, Tilo, Freiling, Felix, and Verbauwhede, Ingrid
- Subjects
-
*COMPUTER architecture, *INTERNET of things, *EMBEDDED computer systems, *COMPUTER input-output equipment, *MALWARE, *INDUSTRIAL controls manufacturing
- Abstract
Attackers target many different types of computer systems in use today, exploiting software vulnerabilities to take over the device and make it act maliciously. Reports of numerous attacks have been published, against the constrained embedded devices of the Internet of Things, mobile devices like smartphones and tablets, high-performance desktop and server environments, as well as complex industrial control systems. Trusted computing architectures give users and remote parties like software vendors guarantees about the behaviour of the software they run, protecting them against software-level attackers. This paper defines the security properties offered by them, and presents detailed descriptions of twelve hardware-based attestation and isolation architectures from academia and industry. We compare all twelve designs with respect to the security properties and architectural features they offer. The presented architectures have been designed for a wide range of devices, supporting different security properties. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
122. A Controllable Bidirectional Battery Charger for Electric Vehicles with Vehicle-to-Grid Capability.
- Author
-
de Melo, Hugo Neves, Trovao, Joao Pedro F., Pereirinha, Paulo G., Jorge, Humberto M., and Antunes, Carlos Henggeler
- Subjects
-
*ELECTRIC vehicles, *ELECTRIC vehicle batteries, *ELECTRIC vehicle charging stations, *BATTERY chargers, *ENERGY management
- Abstract
This paper proposes a comprehensive methodology for the design of a controllable electric vehicle charger capable of making the most of the interaction with an autonomous smart energy management system (EMS) in a residential setting. Autonomous EMSs aim at achieving the potential benefits associated with energy exchanges between consumers and the grid, using bidirectional and power-controllable electric vehicle chargers. A suitable design for a controllable charger is presented, including the sizing of passive elements and controllers. This charger has been implemented using an experimental setup with a digital signal processor to validate its operation. The experimental results obtained suggest an adequate interaction between the proposed charger and a compatible autonomous EMS in a typical residential setting. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
123. A Fully Pipelined Hardware Architecture for Intra Prediction of HEVC.
- Author
-
Min, Biao, Xu, Zhe, and Cheung, Ray C. C.
- Subjects
-
*FIELD programmable gate arrays, *COMPUTERS, *COMPUTER architecture, *OPTICAL resolution, *VIDEO coding
- Abstract
Ultrahigh definition (UHD), such as 4K/8K, is becoming the mainstream video resolution nowadays. High Efficiency Video Coding (HEVC) is the emerging video coding standard for encoding and decoding UHD video. This paper first develops multiple techniques that allow the proposed hardware architecture for intra prediction of HEVC to work in full pipeline. The proposed techniques include: 1) a novel buffer structure for reference samples; 2) a mode-dependent scanning order; and 3) an inverse method for reference sample extension. The size of the buffer is 3 Kb for the luma component and 3 Kb for the chroma components, providing sufficient access to the reference samples. Since the data dependency between two neighboring blocks is addressed by the mode-dependent scanning order, the proposed fully pipelined design can produce 4 pixels/clock cycle. As a result, the throughput of the proposed architecture is capable of supporting 3840 × 2160 video at 30 frames/s. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
124. Investigation of Unintentional Video Emanations From a VGA Connector in the Desktop Computers.
- Author
-
Zhang, Nan, Lu, Yinghua, Cui, Qiang, and Wang, Yiying
- Subjects
-
*PERSONAL computers, *ELECTRONIC data processing, *VIDEO compression, *VIDEO graphics array, *DISPLAY systems
- Abstract
This paper focuses on the compromising video emanations from a desktop computer with a liquid-crystal display (LCD). Near-field tests are performed in a common unshielded office environment to investigate the video information leakage, and the tests show that the video graphics adapter (VGA) connector is a leaking source. To illustrate the mechanism of the video leakage, a wire antenna model is developed according to the physical dimensions of the connector. The radiated fields of the connector are calculated in the time domain from this model, and the results are verified by the tests. Combining the expressions of the radiated fields and the transfer function of the test probe, the functional relationship between the original red/green/blue signal and the intercepted signal is established. Finally, a readable text is reconstructed from the intercepted signal when a Chinese document is displayed on the LCD. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
125. Exponential Sums and Correctly-Rounded Functions.
- Author
-
Brisebarre, Nicolas, Hanrot, Guillaume, and Robert, Olivier
- Subjects
-
*EXPONENTIAL sums, *FLOATING-point arithmetic, *HEURISTIC algorithms, *DISTRIBUTION (Probability theory), *BINARY number system
- Abstract
The 2008 revision of the IEEE-754 standard, which governs floating-point arithmetic, recommends that a certain set of elementary functions should be correctly rounded. Successful attempts for solving the Table Maker's Dilemma in binary64 made it possible to design CRlibm, a library which offers correctly rounded evaluation in binary64 of some functions of the usual libm. It evaluates functions using a two-step strategy, which relies on a folklore heuristic that is well spread in the community of mathematical function designers. Under this heuristic, one can compute the distribution of the lengths of runs of zeros/ones after the rounding bit of the value of the function at a given floating-point number. The goal of this paper is to change, whenever possible, this heuristic into a rigorous statement. The underlying mathematical problem amounts to counting integer points in the neighborhood of a curve, which we tackle using so-called exponential sums techniques, a tool from analytic number theory. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
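The folklore heuristic entry 125 makes rigorous concerns the run of identical bits immediately after the rounding bit of f(x). A minimal sketch of how that distribution can be observed empirically, using mpmath for extra precision (f = exp, the sample range, and the precision are arbitrary choices):

```python
import random
import mpmath

def tail_run_length(x, func=mpmath.exp, p=53, prec=120):
    """Run of identical bits just after the rounding bit (bit p + 1) of func(x)."""
    with mpmath.workprec(prec):
        y = func(mpmath.mpf(x))
        frac = abs(mpmath.frexp(y)[0])        # significand scaled into [0.5, 1)
        bits = []
        for _ in range(prec - 10):            # extract the leading bits
            frac *= 2
            b = int(frac)
            bits.append(b)
            frac -= b
    tail = bits[p + 1:]                       # bits after the rounding bit
    run = 1
    while run < len(tail) and tail[run] == tail[0]:
        run += 1
    return run

random.seed(1)
runs = [tail_run_length(random.uniform(0.5, 1.0)) for _ in range(500)]
# If the tail bits behaved like fair coin flips, P(run >= k) ~ 2**(1 - k),
# so roughly 6% of samples should show a run of length 5 or more.
print(sum(r >= 5 for r in runs) / len(runs))
```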
126. Gamifying Video Object Segmentation.
- Author
-
Spampinato, Concetto, Palazzo, Simone, and Giordano, Daniela
- Subjects
-
*DIGITAL image processing, *IMAGE segmentation, *COMPUTER vision, *ENERGY function, *DATA mining
- Abstract
Video object segmentation can be considered as one of the most challenging computer vision problems. Indeed, so far, no existing solution is able to effectively deal with the peculiarities of real-world videos, especially in cases of articulated motion and object occlusions; limitations that appear more evident when we compare the performance of automated methods with the human one. However, manually segmenting objects in videos is largely impractical as it requires a lot of time and concentration. To address this problem, in this paper we propose an interactive video object segmentation method, which exploits, on one hand, the capability of humans to identify correctly objects in visual scenes, and on the other hand, the collective human brainpower to solve challenging and large-scale tasks. In particular, our method relies on a game with a purpose to collect human inputs on object locations, followed by an accurate segmentation phase achieved by optimizing an energy function encoding spatial and temporal constraints between object regions as well as human-provided location priors. Performance analysis carried out on complex video benchmarks, and exploiting data provided by over 60 users, demonstrated that our method shows a better trade-off between annotation times and segmentation accuracy than interactive video annotation and automated video object segmentation approaches. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
127. Variable Inductor Based Bidirectional DC–DC Converter for Electric Vehicles.
- Author
-
Beraki, Mebrahtom Woldelibanos, Trovao, Joao Pedro F., Perdigao, Marina S., and Dubois, Maxime R.
- Subjects
-
*DC-to-DC converters, *ELECTRIC vehicles, *MAGNETIC cores, *TRACTION motor protection, *ELECTRIC inductors, *ELECTRIC inductance
- Abstract
This paper presents the feasibility study of a variable inductor (VI)-based bidirectional dc–dc converter for applications with a wide range of load variations, such as electric vehicles. An additional winding is introduced to the conventional power inductor to inject a control current for adjusting the permeability of the magnetic cores. This has significant merits in controlling the current ripple and enhancing the current handling capability of power inductors, thereby reducing the size of magnetic components and improving the performance. For currents of twice and three times the rated value, the current ripple is reduced by 40.90% and 36.10%, respectively. Nonetheless, this device requires a precisely controlled dc current. As such, a small current-controlled, low-power and low-cost buck converter is built to power up the auxiliary winding. To improve the reliability and robustness of the VI, an integrated closed-loop control that enables the control of the main converter and the auxiliary converter is also implemented and tested in real time to verify the viability of the VI. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
128. Fault Propagation Reasoning and Diagnosis for Computer Networks Using Cyclic Temporal Constraint Network Model.
- Author
-
Cui, Yiqian, Shi, Junyou, and Wang, Zili
- Subjects
-
*COMPUTER networks, *DEBUGGING, *INFORMATION processing
- Abstract
Fault diagnosis, including fault detection and isolation, is a critical task for computer networks. Among the various techniques used for online system-level diagnosis, we are interested in the approach based on temporal information processing. The delays of computer networks are inevitable, and the fault localization process has to take into account bounded delays or the temporal constraints. Temporal information is fundamental in model-based diagnosis. There can be cycles or loops in a computer network, but fault reasoning methods for such cases are seldom considered in the literature. This paper provides an analytic model based on the cyclic temporal constraint network (CTCN), which aims at the fault diagnosis of cyclic computer networks using temporal information. The goal of the proposed framework is twofold: given the network structures and the predetermined candidate fault causes, the CTCN model corresponding to the computer network under test is formulated; based on the CTCN model, given the alarm sequences with timestamps, the fault diagnosis process is executed to determine the most likely fault cause(s) with its/their time interval(s) of occurrence(s). The reasoning method is dependent on time point and time distance information, with which the fault motivators (i.e., actors) and fault responders (i.e., victims) can be identified. The calculation process consists of three steps: 1) establishment of the objective function; 2) determination of the fault propagation paths; and 3) determination of the expected states with a given fault hypothesis. Finally, the proposed method is demonstrated via an application study, and the effectiveness of our proposed method is verified. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
129. Dynamic Checkpointing Policy in Heterogeneous Real-Time Standby Systems.
- Author
-
Levitin, Gregory, Xing, Liudong, Dai, Yuanshun, and Vokkarane, Vinod M.
- Subjects
-
*REAL-time computing, *COMPUTER systems, *ALGORITHMS, *SOFTWARE failures, *TASKS, *COMPUTER software
- Abstract
This paper models 1-out-of-N standby computing systems with a dynamic checkpointing policy. The system performs a real-time mission task that has to be accomplished within an allowed mission time. During the mission, to facilitate an effective failure recovery, the system undergoes checkpointing procedures according to a policy that dynamically determines the checkpointing frequency based on the activated element and the remaining work for completing the mission. System elements are heterogeneous; they can follow different, arbitrary types of time-to-failure distributions, have different performance, and wait in different standby modes before their activation. A new numerical algorithm based on state space event transitions is first proposed to evaluate the mission success probability of the real-time standby systems considered in this work. Additional new contributions are made by formulating and solving optimal dynamic checkpointing policy problems, as well as an integrated optimization problem that finds the optimal combination of checkpointing policy and element activation sequence maximizing the mission success probability. Advantages of using the dynamic checkpointing policy over fixed, evenly spaced checkpoints are demonstrated through examples. Examples and results are also provided to illustrate the effects of different mission and element parameters on the mission success probability as well as on the optimal dynamic checkpointing policy. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
130. Cooper: Expedite Batch Data Dissemination in Computer Clusters with Coded Gossips.
- Author
-
Liu, Yan, Niu, Di, and Khabbazian, Majid
- Subjects
-
*COMPUTER workstation clusters, *CODING theory, *PARALLEL computer software, *DATA transmission systems, *COMPUTATIONAL complexity, *KNOWLEDGE transfer
- Abstract
Data transfers happen frequently in server clusters for software and application deployment, and in parallel computing clusters to transmit intermediate results in batches among servers between computation stages. This paper presents Cooper, an optimized prototype system to speed up multi-batch data transfers among a cluster of servers, leveraging a theoretically proven optimal algorithm called “coded permutation gossip,” which employs a simple random topology control scheme to best utilize bandwidth and decentralized random linear network coding to maximize the useful information transmitted. On a process-level coding-transfer pipeline, we investigate the best block division, batch division and inter-batch scheduling strategies to minimize the broadcast finish time in a realistic setting. For batch-based transfers, we propose a scheduling algorithm with low overhead that overlaps the transfers of consecutive batches and temporarily prioritizes later batches, to further reduce the broadcast finish time. We describe an asynchronous and distributed implementation of Cooper and have deployed it on Amazon EC2 for evaluation. Based on results from real experiments, we show that Cooper can almost double the speed of data transfers in computing clusters, as compared to state-of-the-art content distribution tools like BitTorrent, at a low CPU overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
131. On the Universality of Memcomputing Machines.
- Author
-
Pei, Yan Ru, Traversa, Fabio L., and Di Ventra, Massimiliano
- Subjects
-
*TURING machines, *QUANTUM computing, *MACHINING, *SET theory, *INFORMATION processing
- Abstract
Universal memcomputing machines (UMMs) represent a novel computational model in which memory (time nonlocality) accomplishes both tasks of storing and processing of information. UMMs have been shown to be Turing-complete, namely, they can simulate any Turing machine. In this paper, we first introduce a novel set theory approach to compare different computational models and use it to recover the previous results on Turing-completeness of UMMs. We then relate UMMs directly to liquid-state machines (or “reservoir-computing”) and quantum machines (“quantum computing”). We show that UMMs can simulate both types of machines, hence they are both “liquid-” or “reservoir-complete” and “quantum-complete.” Of course, these statements pertain only to the type of problems these machines can solve and not to the amount of resources required for such simulations. Nonetheless, the set-theoretic method presented here provides a general framework which describes the relationship between any computational models. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
132. Comparing Online to Face-to-Face Delivery of Undergraduate Digital Circuits Content.
- Author
-
LaMeres, Brock J. and Plumb, Carolyn
- Subjects
-
*FACE-to-face communication, *DIGITAL electronics, *UNDERGRADUATES, *LOGIC circuits, *STUDENTS, *MICROPROCESSORS
- Abstract
This paper presents a comparison of online to traditional face-to-face delivery of undergraduate digital systems material. Two specific components of digital content were compared and evaluated: a sophomore logic circuits course with no laboratory, and a microprocessor laboratory component of a junior-level computer systems course. For each of these, a baseline level of student understanding was evaluated when they were being taught using traditional, face-to-face delivery. The course and lab component were then converted to being fully online, and the level of student understanding was again measured. In both cases, the same purpose-developed assessment tools were used to carry out the measurement of understanding. This paper presents the details of how the course components were converted to online delivery, including a discussion of the technology used to accomplish remote access of the electronic test equipment used in the laboratory. A comparison is then presented between the control and the experimental groups, including a statistical analysis of whether the delivery approach impacted student learning. Finally, student satisfaction is discussed, and instructor observations are given for the successful remote delivery of this type of class and laboratory. [ABSTRACT FROM PUBLISHER]
- Published
- 2014
- Full Text
- View/download PDF
133. HRT-PLRU: A New Paging Scheme for Executing Hard Real-Time Programs on NAND Flash Memory.
- Author
-
We, Kyoung-Soo, Lee, Chang-Gun, Yi, Kyongsu, Lin, Kwei-Jay, and Lee, Yun Sang
- Subjects
-
*REAL-time computing, *COMPUTER software execution, *NAND gates, *FLASH memory, *FEATURE extraction, *EMBEDDED computer systems, *RANDOM access memory
- Abstract
For the advanced features of next-generation vehicles, the number of real-time programs in automotive embedded systems is dramatically increasing. For such a large volume of program code, this paper proposes a novel framework to use high-density and low-cost nonvolatile memory, i.e., NAND flash memory, as a low-cost means of storing and executing hard real-time programs. Regarding this, one challenge is that NAND flash memory allows only 2-KB page-based read operations, not per-byte random accesses, which requires RAM as working storage for code execution. This paper proposes two solutions, i.e., a partitioned RAM solution and a shared RAM solution, that minimize the RAM size required to deterministically guarantee the deadlines of all the hard real-time tasks. The proposed solutions are verified with actual real-time programs for unmanned autonomous driving. To the best of our knowledge, this is the first work that allows us to use NAND flash memory for hard real-time program executions with minimal usage of RAM. [ABSTRACT FROM PUBLISHER]
- Published
- 2014
- Full Text
- View/download PDF
134. Characterization of Intermodulation and Memory Effects Using Offset Multisine Excitation.
- Author
-
Farsi, Saeed, Draxler, Paul, Gheidi, Hamed, Nauwelaers, Bart K. J. C., Asbeck, Peter, and Schreurs, Dominique
- Subjects
-
*INTERMODULATION, *ELECTRONIC excitation, *ELECTRIC circuits, *RADIO frequency measurement, *MATHEMATICAL models
- Abstract
This paper proposes a new class of multisine excitations that allows efficient characterization of nonlinear circuits. By offsetting the frequency of tones, one can distinguish between different intermodulation products in a multisine response. This property leads to many applications for nonlinear circuit characterization, such as in-band distortion measurements, memory effects characterization, and model performance assessment. Some applications are highlighted in this paper, focusing especially on the characterization of memory effects. The effectiveness of the approach is demonstrated with a series of measurement results. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
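The key ingredient of entry 134 is that each tone of the multisine is offset by a small, distinct amount, so intermodulation products that would otherwise collide land on distinct frequencies. A minimal numerical sketch with arbitrary values:

```python
import numpy as np

base = np.array([100e3, 110e3, 120e3])     # nominal tone grid (Hz), arbitrary
offsets = np.array([0.0, 300.0, 700.0])    # small, distinct per-tone offsets
f1, f2, f3 = base + offsets

fs, T = 1e6, 1e-2                          # 1 MHz sampling, 10 ms record
t = np.arange(0, T, 1 / fs)
x = sum(np.cos(2 * np.pi * f * t) for f in (f1, f2, f3))
y = x + 0.1 * x**3                         # memoryless 3rd-order "device"

# On the uniform grid both products below would fall at 90 kHz and overlap;
# the offsets move them to distinct, separately measurable bins.
spec = np.abs(np.fft.rfft(y)) / len(t)
for name, f in (("2f1-f2", 2 * f1 - f2), ("f1+f2-f3", f1 + f2 - f3)):
    k = int(round(f * T))                  # bin index at 1/T = 100 Hz spacing
    print(name, f, spec[k])
```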
135. Real-Time Call Admission Control for Packet-Switched Networking by Cellular Neural Networks.
- Author
-
Levendovszky, János and Fancsali, Alpár
- Subjects
-
*NEURAL computers, *COMPUTERS, *ARTIFICIAL intelligence, *SWITCHING circuits, *ELECTRONIC circuits, *ALGORITHMS
- Abstract
In this paper, novel call admission control (CAC) algorithms are developed based on cellular neural networks (CNNs). These algorithms can achieve high network utilization by performing CAC in real-time, which is imperative in supporting quality of service (QoS) communication over packet-switched networks. The proposed solutions are of basic significance in access technology where a subscriber population (connected to the Internet via an access module) needs to receive services. In this case, QoS can only be preserved by admitting those user configurations which will not overload the access module. The paper treats CAC as a set separation problem where the separation surface is approximated based on a training set. This casts CAC as an image processing task in which a complex admission pattern is to be recognized from a couple of initial points belonging to the training set. Since CNNs can implement any propagation models to explore complex patterns, CAC can then be carried out by a CNN. The major challenge is to find the proper template matrix which yields high network utilization. On the other hand, the proposed method is also capable of handling three-dimensional separation surfaces, as in a typical access scenario there are three traffic classes (e.g., two types of Internet access and one voice over asymmetric digital subscriber line). [ABSTRACT FROM AUTHOR]
- Published
- 2004
- Full Text
- View/download PDF
136. Memristor Crossbar Arrays Performing Quantum Algorithms.
- Author
-
Fyrigos, Iosif-Angelos, Ntinas, Vasileios, Vasileiadis, Nikolaos, Sirakoulis, Georgios Ch., Dimitrakis, Panagiotis, Zhang, Yue, and Karafyllidis, Ioannis G.
- Subjects
-
*QUANTUM computers, *QUANTUM computing, *ALGORITHMS, *COMPUTER algorithms, *QUBITS, *ERROR rates
- Abstract
There is a growing interest in quantum computers and quantum algorithm development. It has been proved that ideal quantum computers, with zero error rates and large decoherence times, can solve problems that are intractable for today’s classical computers. Quantum computers use two resources, superposition and entanglement, that have no classical analog. Since the quantum computer platforms that are currently available comprise only a few dozen qubits, the use of quantum simulators is essential in developing and testing new quantum algorithms. We present a novel quantum simulator based on memristor crossbar circuits and use them to simulate well-known quantum algorithms, namely the Deutsch and Grover quantum algorithms. In quantum computing, the dominant algebraic operations are matrix-vector multiplications. The execution time grows exponentially with the simulated number of qubits, causing an exponential slowdown in quantum algorithm execution using classical computers. In this work, we show that the inherent characteristics of memristor arrays can be used to overcome this problem and that memristor arrays can be used not only as independent quantum simulators but also as part of a quantum computer stack to which classical-computer accelerators are connected. Our memristive crossbar circuits are re-configurable and can be programmed to simulate any quantum algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
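Entry 136 notes that simulating quantum algorithms reduces to repeated matrix-vector multiplications, which is exactly the operation a memristor crossbar computes in the analog domain. As a software reference for what is being accelerated, here is Grover search on 3 qubits expressed as two matrix-vector products per iteration; the marked item is an arbitrary choice.

```python
import numpy as np

n_states, marked = 8, 5                     # 3 qubits; item 5 is hypothetical
oracle = np.eye(n_states)
oracle[marked, marked] = -1                 # flip the marked amplitude

s = np.full(n_states, 1 / np.sqrt(n_states))       # uniform superposition
diffusion = 2 * np.outer(s, s) - np.eye(n_states)  # inversion about the mean

state = s.copy()
for _ in range(2):                          # ~(pi/4)*sqrt(8) ~ 2 iterations
    state = diffusion @ (oracle @ state)    # two matrix-vector products

probs = state**2
print(probs.argmax(), round(probs[marked], 3))   # 5, probability ~0.945
```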
137. High-Throughput Cognitive-Amplification Detector for LDPC Decoders.
- Author
-
Lim, Melvin Heng Li and Goh, Wang Ling
- Subjects
-
*DETECTORS, *ELECTRONIC amplifiers, *DECODERS (Electronics), *COMPUTERS, *ARITHMETIC
- Abstract
With the advent of technology over recent years, the low-density parity-check (LDPC) codes, which were once seen as an impractical concept, are now poised to be the next big thing in today's communication standards for their near-capacity performance. Nonetheless, the physical implementation of LDPC decoders is more often than not encumbered by the arithmetic of the decoding algorithm. Entangled by numerous computations of minima, LDPC decoders not only require a considerable amount of resources to implement the cascaded pair-wise comparators, but also yield low decoding throughputs. In this paper, we propound a novel design for the computation of the minimum and subminimum in LDPC decoding, known as the cognitive-amplification detector (CAD). By leveraging the finite precision of fixed-point binary representation in actual hardware, our CAD proposition renders significant gains in decoding throughput and savings in resource consumption of up to 20% and 15%, respectively, not to mention negligible trade-offs in error-correcting capabilities. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
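The arithmetic entry 137 targets is the min/submin search at each check node of a min-sum LDPC decoder, conventionally built from cascaded pair-wise comparators. A minimal software rendering of that computation (not the CAD design itself):

```python
def min_and_submin(values):
    """One pass over the magnitudes; returns (min, second_min, argmin)."""
    m1 = m2 = float("inf")
    idx = -1
    for i, v in enumerate(values):
        if v < m1:
            m1, m2, idx = v, m1, i      # old minimum becomes the subminimum
        elif v < m2:
            m2 = v
    return m1, m2, idx

# in min-sum decoding, each edge gets m2 if it supplied the minimum, else m1
mags = [7, 3, 9, 2, 8, 4]
m1, m2, idx = min_and_submin(mags)
print([m2 if i == idx else m1 for i in range(len(mags))])   # [2, 2, 2, 3, 2, 2]
```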
138. Universal Hardware for Systems With Acceptable Representations as Low Order Polynomials.
- Author
-
Burg, Ariel and Keren, Osnat
- Subjects
-
*COMPUTERS, *POLYNOMIALS, *ALGEBRA, *COEFFICIENTS (Statistics), *MATHEMATICAL variables
- Abstract
This paper presents a novel hardware architecture for adaptive systems whose exact specification is unknown. The architecture is suitable for linear and nonlinear systems whose inputs are real or complex signals (variables), and that have an acceptable representation as low order polynomials in these variables. The implementation is based on using an a priori selected subset of Walsh spectral coefficients. The proposed architecture can acquire its target functionality and adapt itself to changing environments even if the number of variables, their order and precision are unknown in advance. This is in contrast to conventional multiply-and-accumulate (MAC) based architectures, where this information must be determined before the design and implementation of the system. In this context (of systems whose functionality is unknown), the delay and the implementation cost of the proposed architecture are significantly lower than those of MAC-based solutions. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
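A small sketch of how Walsh spectral coefficients can be computed in software via the fast Walsh-Hadamard transform; the normalization convention and the AND-gate truth table are illustrative assumptions, and the paper's a priori selection of a coefficient subset is not modeled here.

```python
import numpy as np

def fwht(a):
    """Fast Walsh-Hadamard transform (input length must be a power of 2).
    Returns the Walsh spectral coefficients of the function whose truth
    table (or sampled values) is given in a."""
    a = np.array(a, dtype=float)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y   # butterfly step
        h *= 2
    return a / len(a)   # normalize by the vector length

# Truth table of f(x1, x2) = x1 AND x2 in {0, 1} encoding
print(fwht([0, 0, 0, 1]))   # -> [ 0.25 -0.25 -0.25  0.25]
```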
139. Memristor Crossbar-Based Neuromorphic Computing System: A Case Study.
- Author
-
Hu, Miao, Li, Hai, Chen, Yiran, Wu, Qing, Rose, Garrett S., and Linderman, Richard W.
- Subjects
- *
BIOLOGICAL neural networks , *COMPUTER systems , *VON Neumann architecture (Computers) , *COMPUTERS , *SYNAPSES , *FLUCTUATIONS (Physics) - Abstract
By mimicking highly parallel biological systems, neuromorphic hardware provides the capability of information processing within a compact and energy-efficient platform. However, the traditional Von Neumann architecture and limited signal connections have severely constrained the scalability and performance of such hardware implementations. Recently, much research effort has been invested in utilizing memristors in neuromorphic systems, owing to the similarity of memristors to biological synapses. In this paper, we explore the potential of a memristor crossbar array that functions as an autoassociative memory and apply it to brain-state-in-a-box (BSB) neural networks. In particular, the recall and training functions of a multianswer character recognition process based on the BSB model are studied. The robustness of the BSB circuit is analyzed and evaluated through extensive Monte Carlo simulations, considering input defects, process variations, and electrical fluctuations. The results show that the hardware-based training scheme proposed in this paper can alleviate, and even cancel out, most of the noise issues. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
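A minimal sketch of the BSB recall dynamics referenced in entry 139: the state is repeatedly amplified through the weight matrix and clipped back to the [-1, 1] hypercube until it settles on a stored pattern. The Hebbian weight construction and the gains alpha and lam are illustrative assumptions, not the paper's memristor crossbar implementation.

```python
import numpy as np

def bsb_recall(A, x, alpha=0.3, lam=1.0, steps=50):
    """Brain-state-in-a-box recall: x <- clip(alpha * A @ x + lam * x).
    The clipping to [-1, 1] is the 'box' that traps the trajectory at a
    stored corner pattern."""
    for _ in range(steps):
        x = np.clip(alpha * (A @ x) + lam * x, -1.0, 1.0)
    return x

# Store pattern p via a simple outer-product (Hebbian) weight matrix
p = np.array([1.0, -1.0, 1.0, -1.0])
A = np.outer(p, p) / len(p)
noisy = np.clip(p + 0.6 * np.random.randn(4), -1, 1)   # corrupted probe
print(bsb_recall(A, noisy))   # converges toward p (or its negation -p)
```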
140. Asymptotic Analysis of Complex LASSO via Complex Approximate Message Passing (CAMP).
- Author
-
Maleki, Arian, Anitori, Laura, Yang, Zai, and Baraniuk, Richard G.
- Subjects
- *
CONFERENCES & conventions , *COMPUTATIONAL complexity , *MESSAGE passing (Computer science) , *LINEAR systems , *EMAIL systems - Abstract
Recovering a sparse signal from an undersampled set of random linear measurements is the main problem of interest in compressed sensing. In this paper, we consider the case where both the signal and the measurements are complex-valued. We study the popular recovery method of $\ell_1$-regularized least squares, or LASSO. While several studies have shown that LASSO provides desirable solutions under certain conditions, the precise asymptotic performance of this algorithm in the complex setting is not yet known. In this paper, we extend the approximate message passing (AMP) algorithm to solve the complex-valued LASSO problem and obtain the complex approximate message passing algorithm (CAMP). We then generalize the state evolution framework recently introduced for the analysis of AMP to the complex setting. Using the state evolution, we derive accurate formulas for the phase transition and noise sensitivity of both LASSO and CAMP. Our theoretical results are concerned with the case of i.i.d. Gaussian sensing matrices. Simulations confirm that our results hold for a larger class of random matrices. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
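For intuition, a sketch of the complex soft-thresholding nonlinearity that CAMP applies at each iteration, shrinking magnitudes while preserving phase; the full CAMP recursion with its Onsager correction term, and the state evolution analysis, are not reproduced here.

```python
import numpy as np

def complex_soft_threshold(x, tau):
    """Complex soft thresholding: shrink |x| by tau, keep the phase.
    Entries with magnitude below tau are set exactly to zero, which is
    what promotes sparsity in the LASSO solution."""
    mag = np.abs(x)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0)
    return scale * x

x = np.array([3 + 4j, 0.2 - 0.1j, -1j])
print(complex_soft_threshold(x, 1.0))   # -> [2.4+3.2j, 0, 0]
```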
141. SDLDS—System for Digital Logic Design and Simulation.
- Author
-
Stanisavljevic, Zarko, Pavlovic, Vladimir, Nikolic, Bosko, and Djordjevic, Jovan
- Subjects
- *
LOGIC design , *DIGITAL electronics , *SIMULATION methods & models , *SWITCHING circuits , *SCHOOLS , *EYE , *ENGINEERING education - Abstract
This paper presents the basic features of a software system developed to support the teaching of digital logic, as well as the experience of using it in the Digital Logic course taught at the School of Electrical Engineering, University of Belgrade, Serbia. The system has been used for several years, both by students for self-learning and laboratory work, and by teachers to automate the assessment and verification of students' work. The system allows users to design and simulate a switching circuit. It also collects data on all student activities and transfers these to the school's information system. Finally, the paper gives figures demonstrating the overall benefits of the system. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
142. Underdesigned and Opportunistic Computing in Presence of Hardware Variability.
- Author
-
Gupta, Puneet, Agarwal, Yuvraj, Dolecek, Lara, Dutt, Nikil, Gupta, Rajesh K., Kumar, Rakesh, Mitra, Subhasish, Nicolau, Alexandru, Rosing, Tajana Simunic, Srivastava, Mani B., Swanson, Steven, and Sylvester, Dennis
- Subjects
- *
MICROELECTRONICS , *ENERGY consumption , *ELECTRONIC systems , *MINIATURE electronic equipment , *INFORMATION technology , *COMPUTERS - Abstract
Microelectronic circuits exhibit increasing variations in performance, power consumption, and reliability parameters across the manufactured parts and across use of these parts over time in the field. These variations have led to increasing use of overdesign and guardbands in design and test to ensure yield and reliability with respect to a rigid set of datasheet specifications. This paper explores the possibility of constructing computing machines that purposely expose hardware variations to various layers of the system stack including software. This leads to the vision of underdesigned hardware that utilizes a software stack that opportunistically adapts to a sensed or modeled hardware. The envisioned underdesigned and opportunistic computing (UnO) machines face a number of challenges related to the sensing infrastructure and software interfaces that can effectively utilize the sensory data. In this paper, we outline specific sensing mechanisms that we have developed and their potential use in building UnO machines. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
143. Mapping a Jacobi Iterative Solver onto a High-Performance Heterogeneous Computer.
- Author
-
Morris, Gerald R. and Abed, Khalid H.
- Subjects
- *
COMPUTER performance , *JACOBIAN matrices , *ITERATIVE methods (Mathematics) , *FIELD programmable gate arrays , *COMPUTATIONAL complexity , *MICROPROCESSORS , *COMPUTER software - Abstract
High-performance heterogeneous computers that employ field programmable gate arrays (FPGAs) as computational elements are known as high-performance reconfigurable computers (HPRCs). For floating-point applications, these FPGA-based processors must satisfy a variety of heuristics and rules of thumb to achieve a speedup compared with their software counterparts. By way of a simple sparse-matrix Jacobi iterative solver, this paper illustrates some of the issues associated with mapping floating-point kernels onto HPRCs. The Jacobi method was chosen based on heuristics developed in earlier research; furthermore, Jacobi is relatively easy to understand, yet complex enough to illustrate the mapping issues. This paper does not aim to demonstrate the speedup of a particular application, nor does it suggest that Jacobi is the best way to solve systems of equations. The results demonstrate a nearly threefold wall-clock runtime speedup compared with a software implementation. A formal analysis shows that these results are reasonable. The purpose of this paper is to illuminate the challenging floating-point mapping process while simultaneously showing that such mappings can result in significant speedups. The ideas revealed by research such as this have already been, and should continue to be, used to facilitate a more automated mapping process. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
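A plain-software sketch of the Jacobi iteration that entry 143 maps onto FPGAs, x_{k+1} = D^{-1}(b - (A - D)x_k); the small dense test matrix is an illustrative stand-in for the paper's sparse systems, and no FPGA specifics are modeled.

```python
import numpy as np

def jacobi(A, b, tol=1e-8, max_iter=1000):
    """Jacobi iterative solver for A x = b: split A into its diagonal D
    and off-diagonal remainder R, then iterate x <- (b - R x) / D."""
    D = np.diag(A)
    R = A - np.diagflat(D)              # off-diagonal part of A
    x = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        x_new = (b - R @ x) / D
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x

# Diagonally dominant system, for which Jacobi is guaranteed to converge
A = np.array([[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
print(jacobi(A, b))
```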
144. Space-Time Network Coding With Multiple AF Relays Over Nakagami-$m$ Fading Channels.
- Author
-
Zhang, Yu, Xiong, Ke, Fan, Pingyi, Yang, Hong-Chuan, and Zhou, Xianwei
- Subjects
- *
LINEAR network coding , *NAKAGAMI channels , *SPACETIME , *SYMBOL error rate , *SIGNAL-to-noise ratio , *ELECTRIC relays - Abstract
This paper first analyzes the symbol error rate (SER) performance of space-time network coding (STNC) over independent but not necessarily identically distributed (i.n.i.d.) Nakagami-$m$ fading channels, where multiple sources transmit information symbols to a destination through multiple helping amplify-and-forward (AF) relays. Exact expressions for the overall end-to-end received signal-to-noise ratio (SNR) via multiple STNC-AF relays and its moment generating function (MGF) are derived. Based on these results, closed-form SER expressions for STNC-AF with $M$-ary phase-shift keying and $M$-ary quadrature-amplitude modulation (QAM) are then presented by adopting the unified MGF method. Furthermore, an approximate SER expression with low computational complexity is also given. To observe the performance limit, the diversity order and the nonorthogonality of STNC codes are discussed. Simulation results corroborate the analysis and illustrate that the diversity order of STNC with multiple AF relay nodes is the sum of the fading index of the direct link and the minimal fading indices of the multiple two-hop links. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
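A sketch of the unified MGF method used for SER results of this kind: for M-PSK, the SER is a single finite integral of the MGF of the received SNR, SER = (1/pi) * integral from 0 to pi(M-1)/M of MGF(g / sin^2(theta)) d(theta) with g = sin^2(pi/M). The Nakagami-$m$ MGF below is the textbook single-link form, standing in for the paper's derived end-to-end STNC-AF MGF; the parameter values are illustrative.

```python
import numpy as np
from scipy.integrate import quad

def ser_mpsk_from_mgf(mgf, M):
    """Unified MGF method for the M-PSK symbol error rate: numerically
    integrate the supplied SNR MGF over the finite angle range."""
    g = np.sin(np.pi / M) ** 2
    val, _ = quad(lambda t: mgf(g / np.sin(t) ** 2), 1e-9, np.pi * (M - 1) / M)
    return val / np.pi

# Illustrative MGF of a Nakagami-m faded SNR with mean gamma_bar:
# M_gamma(s) = (1 + s * gamma_bar / m) ** (-m)
m, gamma_bar = 2.0, 10.0
print(ser_mpsk_from_mgf(lambda s: (1 + s * gamma_bar / m) ** (-m), M=4))
```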
145. Improved Lower Bounds for Coded Caching.
- Author
-
Ghasemi, Hooshang and Ramamoorthy, Aditya
- Subjects
- *
CACHE memory , *CODING theory , *CONTENT delivery networks , *SIGNAL processing , *MATHEMATICAL bounds , *TREE codes (Coding theory) - Abstract
Caching is often used in content delivery networks as a mechanism for reducing network traffic. Recently, the technique of coded caching was introduced, whereby coding in the caches and coded transmission signals from the central server are considered. Prior results in this area demonstrate that carefully designing the placement of content in the caches and designing appropriate coded delivery signals from the server allow for a system whose delivery rates can be significantly smaller than those of conventional schemes. However, matching upper and lower bounds on the transmission rate have not yet been obtained. In this paper, we derive tighter lower bounds on the coded caching rate than were known previously. We demonstrate that this problem can equivalently be posed as a combinatorial problem of optimally labeling the leaves of a directed tree. Our proposed labeling algorithm allows for significantly improved lower bounds on the coded caching rate. Furthermore, we study certain structural properties of our algorithm that allow us to analytically quantify improvements on the rate lower bound for general values of the problem parameters. This allows us to obtain a multiplicative gap of at most four between the achievable rate and our lower bound. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
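For context, a sketch contrasting the standard Maddah-Ali/Niesen achievable coded caching rate with the classical cut-set lower bound; entry 145's labeling-based bound is tighter than the cut-set bound and is not reproduced here. K is the number of users, N the number of files, and M the per-user cache size in files.

```python
from math import floor

def achievable_rate(K, N, M):
    """Standard Maddah-Ali/Niesen achievable rate with coded delivery,
    R = K (1 - M/N) / (1 + K M / N), shown as the upper-bound baseline."""
    return K * (1 - M / N) / (1 + K * M / N)

def cutset_lower_bound(K, N, M):
    """Classical cut-set bound: R >= max over s of s - s*M / floor(N/s),
    with s ranging over 1..min(N, K)."""
    return max(s - s * M / floor(N / s) for s in range(1, min(N, K) + 1))

K, N, M = 4, 4, 1   # 4 users, 4 files, cache size of 1 file each
print(achievable_rate(K, N, M), cutset_lower_bound(K, N, M))   # 1.5 vs 1.0
```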
146. Cost-Aware Region-Level Data Placement in Multi-Tiered Parallel I/O Systems.
- Author
-
He, Shuibing, Wang, Yang, Li, Zheng, Sun, Xian-He, and Xu, Chenzhong
- Subjects
- *
PARALLEL computers , *HIGH performance computing , *HARD disks , *COMPUTER input-output equipment , *CLIENT/SERVER computing - Abstract
Multi-tiered parallel I/O systems that combine traditional HDDs with emerging SSDs mitigate the cost burden of SSDs while benefiting from their superior I/O performance. While a multi-tiered parallel I/O system is promising for data-intensive applications in high-performance computing (HPC) domains, placing data on each tier of the system to achieve high I/O performance remains a challenge. In this paper, we propose a cost-aware region-level (CARL) data placement scheme for multi-tiered parallel I/O systems. CARL divides a large file into several small regions, and then places regions on different types of servers based on region access costs. CARL includes a static policy, S-CARL, and a dynamic policy, D-CARL. For applications whose I/O access patterns are completely known, S-CARL calculates the region costs over the entire workload duration and uses a static data placement scheme to selectively place regions on the proper servers. To adapt to applications whose access patterns are unknown in advance, D-CARL uses a dynamic data placement scheme that migrates data among different servers within each time window. We have implemented CARL under the MPI-IO library and the OrangeFS parallel file system. Our evaluation with representative benchmarks and an application shows that CARL is both feasible and able to improve I/O performance significantly. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
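A toy sketch in the spirit of entry 146's cost-aware placement: score each region by its estimated HDD-versus-SSD access-cost gap and place the highest scorers on the SSD tier. The per-access cost constants and region statistics are invented for illustration and are not CARL's actual cost model.

```python
def place_regions(regions, ssd_capacity):
    """Greedy cost-aware placement: regions whose accesses would save the
    most cost on SSD (versus HDD) are assigned to the SSD tier, up to its
    capacity; everything else stays on HDD."""
    HDD_COST, SSD_COST = 10.0, 1.0   # hypothetical per-access costs
    scored = sorted(
        regions,
        key=lambda r: (r["reads"] + r["writes"]) * (HDD_COST - SSD_COST),
        reverse=True,
    )
    ssd = {r["id"] for r in scored[:ssd_capacity]}
    return {r["id"]: ("SSD" if r["id"] in ssd else "HDD") for r in regions}

regions = [{"id": i, "reads": rd, "writes": wr}
           for i, (rd, wr) in enumerate([(900, 10), (5, 5), (300, 200), (2, 1)])]
print(place_regions(regions, ssd_capacity=2))   # hot regions 0 and 2 go to SSD
```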
147. Symbolic Analysis of Higher-Order Side Channel Countermeasures.
- Author
-
Bisi, Elia, Melzani, Filippo, and Zaccaria, Vittorio
- Subjects
- *
SYMBOLIC circuit analysis , *EMBEDDED computer systems , *ELECTRONIC countermeasures , *CRYPTOGRAPHY , *MATHEMATICAL notation , *UNIVARIATE analysis - Abstract
In this paper, we deal with the problem of efficiently assessing the higher order vulnerability of a hardware cryptographic circuit. Our main concern is to provide methods that allow a circuit designer to detect, early in the design cycle, whether the implementation of a Boolean-additive masking countermeasure holds up to the required protection order. To achieve this goal, we promote the search for vulnerabilities from a statistical problem to a purely symbolic one and then provide a method for reasoning about this new symbolic interpretation. Finally, we show, with a synthetic example, how the proposed conceptual tool can be used to explore the vulnerability space of a cryptographic primitive. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
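A minimal sketch of the Boolean-additive masking scheme whose protection order entry 147's symbolic method verifies: a secret is split into shares whose XOR recombines it, and any subset of at most `order` shares is statistically independent of the secret. The 8-bit width is an illustrative assumption.

```python
import secrets

def mask(secret, order=2):
    """Split an 8-bit secret into (order + 1) Boolean-additive shares:
    secret = s0 XOR s1 XOR ... XOR s_order. The first `order` shares are
    uniformly random; the last one completes the XOR."""
    shares = [secrets.randbits(8) for _ in range(order)]
    last = secret
    for s in shares:
        last ^= s
    return shares + [last]

shares = mask(0xA7, order=2)
acc = 0
for s in shares:
    acc ^= s
assert acc == 0xA7          # all shares recombine to the secret
print([hex(s) for s in shares])
```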
148. Ephemeral Content Popularity at the Edge and Implications for On-Demand Caching.
- Author
-
Carlsson, Niklas and Eager, Derek
- Subjects
- *
CACHE memory , *INTERNET , *COMPUTER networks , *METADATA - Abstract
The ephemeral content popularity seen with many content delivery applications can make indiscriminate on-demand caching in edge networks highly inefficient, since many of the content items that are added to the cache will not be requested again from that network. In this paper, we address the problem of designing and evaluating more selective edge-network caching policies. The need for such policies is demonstrated through an analysis of a dataset recording YouTube video requests from users on an edge network over a 20-month period. We then develop a novel workload modelling approach for such applications and apply it to study the performance of alternative edge caching policies, including indiscriminate caching and cache on $k$th request for different values of $k$. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
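A minimal sketch of the cache-on-$k$th-request policy evaluated in entry 148: an item is admitted to the cache only on its $k$th request, filtering out the one-hit wonders that make indiscriminate on-demand caching inefficient. Eviction and counter aging, which a real edge cache needs, are omitted for brevity.

```python
from collections import defaultdict

class CacheOnKthRequest:
    """Selective admission: count requests per item and admit an item to
    the cache only once it has been requested k times."""
    def __init__(self, k):
        self.k = k
        self.counts = defaultdict(int)
        self.cache = set()

    def request(self, item):
        if item in self.cache:
            return "hit"
        self.counts[item] += 1
        if self.counts[item] >= self.k:
            self.cache.add(item)   # admit on the k-th request
        return "miss"

c = CacheOnKthRequest(k=2)
print([c.request(v) for v in ["a", "a", "a", "b", "a"]])
# -> ['miss', 'miss', 'hit', 'miss', 'hit']; one-hit wonder 'b' never cached
```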
149. The Effects of a Robot Game Environment on Computer Programming Education for Elementary School Students.
- Author
-
Shim, Jaekwoun, Kwon, Daiyoung, and Lee, Wongyu
- Subjects
- *
ELEMENTARY education , *COMPUTER programming education , *PROBLEM solving , *SCHOOL environment , *SCHOOL children , *USER-centered system design - Abstract
In the past, computer programming was perceived as a task carried out only by computer scientists; in the 21st century, however, computer programming is viewed as a critical and necessary skill that everyone should learn. To improve the teaching of problem-solving abilities in a computing environment, extensive research is being done on teaching–learning methods, types of teaching software, the educational environment, and related tools. This paper, based on diverse experimental results, proposes an environment in which elementary students can easily learn and practice computer programming. The proposed robot game environment uses a tangible programming tool with which students can easily create robot programs, without learning syntax, and then validate their programming results; it also provides various game activities to spark students' interest. Observation of elementary school students placed in the robot game environment confirmed the tool's usability and entertainment aspects, and students' attitudes toward programming and their understanding of programming concepts improved. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
150. Color Image-Guided Boundary-Inconsistent Region Refinement for Stereo Matching.
- Author
-
Jiao, Jianbo, Wang, Ronggang, Wang, Wenmin, Li, Dagang, and Gao, Wen
- Subjects
- *
KINECT (Motion sensor) , *COST analysis , *COMPUTER vision , *EDGE detection (Image processing) , *PIXELS - Abstract
Cost computation, cost aggregation, disparity optimization, and disparity refinement are the four main steps of stereo matching. While the first three steps have been widely investigated, little effort has been devoted to disparity refinement. In this paper, we propose a color image-guided disparity refinement method to further remove boundary-inconsistent regions from the disparity map. First, the origins of boundary-inconsistent regions are analyzed. Then, these regions are detected with the proposed hybrid-superpixel-based strategy. Finally, the detected boundary-inconsistent regions are refined by a modified weighted median filtering method. Experimental results under various stereo matching conditions validate the effectiveness of the proposed method. Furthermore, depth maps obtained by active depth acquisition devices such as Kinect can also be well refined with the proposed method. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
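A sketch of the weighted median filtering step used in refinement methods like entry 150's, with weights derived from color similarity to the center pixel; the Gaussian bandwidth and neighborhood values are illustrative assumptions, and the paper's hybrid-superpixel detection stage is not modeled.

```python
import numpy as np

def weighted_median(values, weights):
    """Weighted median: the value at which the cumulative weight first
    reaches half of the total weight. Large-weight (color-similar)
    neighbors dominate; outliers with small weight are suppressed."""
    order = np.argsort(values)
    cum = np.cumsum(np.asarray(weights)[order])
    idx = np.searchsorted(cum, cum[-1] / 2.0)
    return np.asarray(values)[order][idx]

# Neighborhood disparities and color distances to the center pixel
disparities = np.array([12, 12, 13, 40, 12, 11])        # 40 is a boundary outlier
color_dist = np.array([2.0, 1.0, 3.0, 25.0, 2.5, 4.0])  # outlier is color-dissimilar
weights = np.exp(-(color_dist ** 2) / (2 * 10.0 ** 2))  # Gaussian color weighting
print(weighted_median(disparities, weights))             # -> 12; outlier suppressed
```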