52 results on '"Mingsong Lv"'
Search Results
2. An Energy-Efficient Mixed-Bit CNN Accelerator With Column Parallel Readout for ReRAM-Based In-Memory Computing
- Author
- Dingbang Liu, Haoxiang Zhou, Wei Mao, Jun Liu, Yuliang Han, Changhai Man, Qiuping Wu, Zhiru Guo, Mingqiang Huang, Shaobo Luo, Mingsong Lv, Quan Chen, and Hao Yu
- Subjects
- Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
3. Toward the Predictability of Dynamic Real-Time DNN Inference
- Author
- Mingsong Lv, Weiguang Pang, Di Liu, Wang Yi, Xu Jiang, and Teng Gao
- Subjects
- Computer science; business.industry; Inference; Machine learning; computer.software_genre; Computer Graphics and Computer-Aided Design; Execution time; Constraint (information theory); Embedded applications; Deep neural networks; Artificial intelligence; Electrical and Electronic Engineering; Predictability; Adaptation (computer science); business; computer; Software
- Abstract
- Deep neural networks (DNNs) have been widely used in many Cyber-Physical Systems (CPS). However, it is still challenging to deploy DNNs in real-time systems. In particular, the execution time of DNN inference must be predictable, so that it can be known whether the run-time inference can complete within a required timing constraint. Moreover, the timing constraints may change dynamically with the run-time environment in many embedded applications, such as autonomous cars. A possible way to meet such dynamic real-time requirements is to execute different sub-networks of a DNN at run-time. However, improper construction of sub-networks may not only introduce unpredictable inference time, so that the real-time constraints could be violated unexpectedly, but also have poor compatibility with well-optimized machine learning frameworks (e.g., TensorFlow). In this paper, we study the predictability when executing different sub-networks of a DNN. In particular, we present a feature-wise run-time adaptation framework for DNN inference, which is implemented and validated on NVIDIA Jetson TX2 and Nano with TensorFlow. The experimental results show that our method can achieve predictable inference time in comparison with the state-of-the-art methods.
- Published
- 2022
- Full Text
- View/download PDF
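The idea in the abstract of result 3, running different sub-networks as the timing constraint changes, can be illustrated with a minimal Python sketch. This is not the paper's feature-wise adaptation framework: the `SubNetwork` type, the `select_subnetwork` helper, and all timing and accuracy numbers are hypothetical and only show how a deadline-aware selection step might look, assuming each sub-network has a profiled worst-case inference time.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SubNetwork:
    name: str
    wcet_ms: float   # hypothetical profiled worst-case inference time
    accuracy: float  # hypothetical validation accuracy


def select_subnetwork(candidates: Sequence[SubNetwork],
                      deadline_ms: float) -> Optional[SubNetwork]:
    """Return the most accurate candidate whose worst-case time fits the deadline."""
    feasible = [c for c in candidates if c.wcet_ms <= deadline_ms]
    # default=None covers the case where no sub-network meets the deadline.
    return max(feasible, key=lambda c: c.accuracy, default=None)


# Illustrative values only; they are not measurements from the paper.
candidates = [
    SubNetwork("quarter", 8.0, 0.71),
    SubNetwork("half", 15.0, 0.78),
    SubNetwork("full", 31.0, 0.83),
]
```

Calling `select_subnetwork(candidates, 20.0)` picks the "half" network here. The selection is only as trustworthy as the timing bounds, which is why the abstract stresses predictable inference time.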
4. The shape of a DAG: bounding the response time using long paths
- Author
- Qingqiang He, Nan Guan, Mingsong Lv, Xu Jiang, and Wanli Chang
- Subjects
- Control and Optimization; Computer Networks and Communications; Control and Systems Engineering; Modeling and Simulation; Electrical and Electronic Engineering; Computer Science Applications
- Published
- 2023
- Full Text
- View/download PDF
5. Dual BN quantum dot/Ag co-catalysts synergistically promote electron-hole separation on g-C3N4 nanosheets for efficient antibiotics oxidation and Cr(VI) reduction
- Author
- Haifeng Shi, C. L. Zhang, Mingsong Lv, Kaixu Ren, and Qijing Xie
- Subjects
- Materials science; business.industry; General Chemistry; Electron hole; Photochemistry; Redox; Catalysis; Semiconductor; Quantum dot; Oxidizing agent; Photocatalysis; General Materials Science; business; Ternary operation
- Abstract
- The urgent challenge of semiconductor photocatalysis technology is to prevent the rapid recombination of photogenerated electron-hole pairs while making full use of solar energy. Fortunately, co-catalysts usually play a non-negligible role in achieving high photocatalytic performance. Herein, BN quantum dots (BNQDs) and Ag are introduced as novel dual co-catalysts on g-C3N4 (CN) nanosheets, which can transfer carriers rapidly over a large area, boosting the photocatalytic performance of CN. Specifically, Ag is a good choice for improving solar energy utilization and serving as an electron sink, while BNQDs act as superior photoinduced-hole extractors. The photogenerated electron-hole pairs are finally pulled apart by the synergistic effect of the dual co-catalysts, stimulating a large number of photogenerated electrons and holes to participate efficiently in their respective redox reactions. As a consequence, the CN/Ag/BNQDs(3) ternary composites exhibit stronger oxidizing and reducing properties, reflected in the oxidative degradation efficiency of TC (80.54%) and the Cr(VI) reduction efficiency (88.93%) within 60 min, which are 3.04 and 10.03 times those of pure CN, respectively. This research paves a path for the design of photocatalysts with high-efficiency carrier separation capabilities, and broadens the way for the application of co-catalysts in the field of photocatalysis.
- Published
- 2022
- Full Text
- View/download PDF
6. Efficient DNN Execution on Intermittently-Powered IoT Devices With Depth-First Inference
- Author
- Mingsong Lv and Enyu Xu
- Subjects
- General Computer Science; General Engineering; General Materials Science; Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
7. Deep Learning on Energy Harvesting IoT Devices: Survey and Future Challenges
- Author
- Mingsong Lv and Enyu Xu
- Subjects
- General Computer Science; General Engineering; General Materials Science; Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
8. Worst-Case Time Disparity Analysis of Message Synchronization in ROS
- Author
- Ruoxiang Li, Nan Guan, Xu Jiang, Zhishan Guo, Zheng Dong, and Mingsong Lv
- Published
- 2022
- Full Text
- View/download PDF
9. Precise and scalable shared cache contention analysis for WCET estimation
- Author
- Wei Zhang, Mingsong Lv, Wanli Chang, and Lei Ju
- Published
- 2022
- Full Text
- View/download PDF
10. Scheduling and analysis of real-time tasks with parallel critical sections
- Author
- Yang Wang, Xu Jiang, Nan Guan, Mingsong Lv, Dong Ji, and Wang Yi
- Published
- 2022
- Full Text
- View/download PDF
11. LATICS: A Low-Overhead Adaptive Task-Based Intermittent Computing System
- Author
- Mingsong Lv, Wei Zhang, Songran Liu, Qiulin Chen, and Nan Guan
- Subjects
- Computer science; Distributed computing; 02 engineering and technology; Computer Graphics and Computer-Aided Design; 020202 computer hardware & architecture; Task (project management); Adaptive system; 0202 electrical engineering, electronic engineering, information engineering; Task analysis; Overhead (computing); Energy supply; State (computer science); Electrical and Electronic Engineering; Energy harvesting; Software
- Abstract
- Energy harvesting promises to power billions of Internet-of-Things devices without being restricted by battery life. The energy output of harvesters is typically tiny and highly unstable, so the computing system must store program states into nonvolatile memory frequently to preserve the execution progress in the presence of frequent power failures. Task-based intermittent computing is a promising paradigm to provide such capability, where each task executes atomically and only states across task boundaries need to be saved. This article presents LATICS, a low-overhead adaptive task-based intermittent computing system, which dynamically decides the granularity of atomic execution to avoid unnecessarily frequent state saving when energy supply is sufficient. The novel feature of LATICS is that it drastically reduces the amount of state to be saved at task boundaries compared with existing solutions. Notably, we disclose that skipping state saving at some task boundary may cause the system to store more states at other places, and thus lead to higher overall overhead. Therefore, LATICS enforces mandatory state saving at certain task boundaries regardless of the current energy condition to reduce state saving overhead. We implement LATICS on a real energy-harvesting platform based on MSP430 and experimentally compare against the state-of-the-art under different settings. The experimental results show that LATICS significantly reduces state saving overhead and improves execution efficiency compared to existing solutions.
- Published
- 2020
- Full Text
- View/download PDF
12. PRUID: Practical User Interface Distribution for Multi-surface Computing
- Author
- Qingqiang He, Mingsong Lv, Chuancai Gu, Menglong Cui, Tao Yang, Caiqi Zhang, and Nan Guan
- Subjects
- Source code; business.industry; Computer science; Distributed computing; media_common.quotation_subject; Overhead (engineering); computer.software_genre; Surface computing; Information extraction; User experience design; Use case; User interface; business; Mobile device; computer; media_common
- Abstract
- It is becoming increasingly common for people to have multiple mobile devices. This opens the opportunity of multi-surface computing, in which users interact with an app using multiple devices simultaneously. Recently, a system called FLUID was developed, which can distribute User Interface (UI) elements of an app to multiple devices to support multi-surface computing. FLUID enables general, flexible and transparent multi-device interaction, which cannot be achieved by previous approaches such as screen mirroring, app migration, and customized app development on multiple devices. However, the practicality of FLUID is still severely limited because it requires that (1) the app source code is available and (2) the same app is pre-installed on all devices. This paper presents PRUID, a UI distribution system that is free from the above-mentioned limitations of FLUID. PRUID captures and extracts relevant information about UI elements to be distributed completely at run time, without requiring the app source code. An app-independent UI agent is designed to dock and render the UI components distributed to the guest device, so pre-installation of the app on guest devices is not required. We developed representative use cases to demonstrate the usage and evaluate the performance of PRUID. The evaluation results show that the extra overhead incurred due to the UI information extraction at run time is marginal and PRUID provides a smooth user experience.
- Published
- 2021
- Full Text
- View/download PDF
13. Real-Time Scheduling of Conditional DAG Tasks with Intra-Task Priority Assignment
- Author
- Qingqiang He, Jinghao Sun, Nan Guan, Mingsong Lv, and Zhenyu Sun
- Subjects
- Electrical and Electronic Engineering; Computer Graphics and Computer-Aided Design; Software
- Published
- 2023
- Full Text
- View/download PDF
14. Selective detection of trace carbon monoxide at room temperature based on CuO nanosheets exposed to (111) crystal facets
- Author
- Yuanyuan Wu, Ji Li, Mingsong Lv, Xianfa Zhang, Rui Gao, Chuanyu Guo, Xiaoli Cheng, Xin Zhou, Yingming Xu, Shan Gao, Zoltán Major, and Lihua Huo
- Subjects
- Oxygen; Carbon Monoxide; Environmental Engineering; Health, Toxicology and Mutagenesis; Temperature; Humans; Reproducibility of Results; Environmental Chemistry; Pollution; Waste Management and Disposal
- Abstract
- In recent years, carbon monoxide (CO) intoxication incidents have occurred frequently, so the sensitive detection of CO is particularly significant. At present, most reported CO sensors suffer from a high working temperature, and sensitive detection of CO at room temperature remains a challenge. In this study, CuO nanosheets with more exposed (111) active crystal facets and oxygen vacancy defects were synthesized by a simple and environmentally friendly one-step hydrothermal method. The sensor has good comprehensive gas sensing performance compared with other sensors that can detect CO at room temperature. The response value to 100 ppm CO at room temperature is as high as 39.6. In addition, it has excellent selectivity, a low detection limit (100 ppb), good reproducibility, moisture resistance and long-term stability (60 days). This excellent gas sensing performance is attributed to the special structural characteristics of 2D materials and the synergistic effect of the more active crystal facets exposed on the crystal surface and the oxygen vacancy defects. Therefore, it is expected to become a promising sensitive material for rapid and accurate detection of trace CO gas under low energy consumption, reducing the risk of poisoning and helping to protect human life.
- Published
- 2023
- Full Text
- View/download PDF
15. Integrated plasmonic full adder based on cascaded rectangular ring resonators for optical computing
- Author
- Yichen Ye, Yiyuan Xie, Tingting Song, Nan Guan, Mingsong Lv, and Chuandong Li
- Subjects
- Electrical and Electronic Engineering; Atomic and Molecular Physics, and Optics; Electronic, Optical and Magnetic Materials
- Published
- 2022
- Full Text
- View/download PDF
16. Optimizing the Locations and Sizes of Solar Assisted Electric Vehicle Charging Stations in an Urban Area
- Author
- Jiayu Yang, Wang Yi, Dong Ji, and Mingsong Lv
- Subjects
- business.product_category; General Computer Science; solar energy; ComputerApplications_COMPUTERSINOTHERSYSTEMS; metaheuristic; 010501 environmental sciences; User requirements document; Urban area; 01 natural sciences; Automotive engineering; Computer Science::Systems and Control; Range (aeronautics); Charging station; 0502 economics and business; Electric vehicle; General Materials Science; Air quality index; 0105 earth and related environmental sciences; geography; geography.geographical_feature_category; business.industry; 05 social sciences; electric vehicle; General Engineering; Solar energy; Work (electrical); Greenhouse gas; Environmental science; lcsh:Electrical engineering. Electronics. Nuclear engineering; business; lcsh:TK1-9971; 050203 business & management
- Abstract
- With the widespread adoption of electric vehicles (EVs), introducing solar energy in building EV charging stations is promising as it can reduce carbon emissions and improve air quality. The main challenges are how to decide where to build solar assisted charging stations in a city and how to size the charging stations, as the decision is affected by a broad range of factors, such as construction cost, solar energy fluctuation, and user requirements. This paper proposes an approach to efficiently decide the locations and sizes of solar energy assisted charging stations for an urban area. Experiments are conducted on real EV history data from 297 users of an EV leasing company. The results show that the proposed method can produce high quality decisions within reasonable computation time. The work of this paper will provide important information for decision makers to integrate solar energy into the EV charging infrastructure.
- Published
- 2020
- Full Text
- View/download PDF
17. Surviving Transient Power Failures with SRAM Data Retention
- Author
- Nan Guan, Qiulin Chen, Wei Zhang, Songran Liu, and Mingsong Lv
- Subjects
- Non-volatile memory; Computer science; Backup; Cyclic redundancy check; Transient (computer programming); Central processing unit; Static random-access memory; Data retention; Reliability engineering; Power (physics)
- Abstract
- Many computing systems, such as those powered by energy harvesting or deployed in harsh working environments, may experience unpredictable and frequent transient power failures in their lifetime. The systems may fail to deliver correct computation results or never progress, as computation is frequently interrupted by the power failures. A possible solution could be frequently saving program states to non-volatile memory (NVM), such as using checkpoints, so that the system can incrementally progress. However, this approach is too costly, since frequent NVM writes are time and energy consuming, and may wear out the NVM device. In this work, we propose an approach to enable a system to use volatile SRAM to correctly progress in the presence of transient power failures, since SRAM is capable of retaining its data for seconds or minutes with the charge remaining in the battery/capacitor after the CPU core stops at its brown-out voltage. The main problem is to validate whether the data in SRAM are actually retained during power failures. In our approach, we validate only a subset of the program states with Cyclic Redundancy Check for efficiency. The validation technique requires maintaining a backup version of the program states, which additionally provides the system with the ability to progress incrementally. We implement a run-time system with the proposed approach. Experimental results on an MSP430 platform show that the system can correctly progress on SRAM in the presence of transient power failures with low overhead.
- Published
- 2021
- Full Text
- View/download PDF
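The validation step described in the abstract of result 17, checking with a Cyclic Redundancy Check whether SRAM contents survived a power failure and falling back to a backup copy otherwise, can be sketched in Python as follows. This is a simplified illustration, not the paper's run-time system: the `checkpoint` and `recover` helpers are hypothetical names, the whole state is checked rather than a subset, and byte strings stand in for SRAM and the backup region.

```python
import zlib


def checkpoint(state: bytes) -> tuple[bytes, int]:
    """Return a backup copy of the state together with its CRC-32."""
    return bytes(state), zlib.crc32(state)


def recover(sram: bytes, backup: bytes, saved_crc: int) -> bytes:
    """Keep the SRAM contents if their CRC-32 matches; otherwise use the backup."""
    if zlib.crc32(sram) == saved_crc:
        return sram   # data were retained across the power failure
    return backup     # data decayed, so roll back to the last backup


# Illustrative use: one intact and one corrupted SRAM image.
state = b"counter=41"
backup, crc = checkpoint(state)
intact = recover(state, backup, crc)
rolled_back = recover(b"counter=4\x00", backup, crc)
```

In this sketch the backup doubles as the rollback point, which mirrors the abstract's remark that maintaining a backup version also lets the system progress incrementally.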
18. Intermittent Computing with Efficient State Backup by Asynchronous DMA
- Author
- Nan Guan, Wei Zhang, Songran Liu, Mingsong Lv, and Qiulin Chen
- Subjects
- Correctness; Backup; business.industry; Computer science; Asynchronous communication; Embedded system; Data_FILES; Process (computing); State (computer science); business; Energy harvesting; Energy (signal processing); Power (physics)
- Abstract
- Energy harvesting promises to power billions of Internet-of-Things devices without being restricted by battery life. The energy output of harvesters is typically weak and highly unstable, so computing systems must frequently back up program states into non-volatile memory to ensure a program will progress in the presence of frequent power failures. However, state backup is a time-consuming process. In existing solutions for this problem, state backup is conducted sequentially with program execution, which considerably impacts system performance. This paper proposes techniques to parallelize state backup and program execution with asynchronous DMA. The challenge is that program states can be incorrectly backed up, which may further cause the program to deliver incorrect computation. Our main idea is to allow errors to occur in parallel state backup and program execution, and detect the errors at the end of the state backup. Moreover, we propose a technique that allows the system to tolerate backup errors during execution without harming logical correctness. We designed a run-time system to implement the proposed approach. Experimental results on an STM32F7-based platform show that execution performance can be considerably improved by parallelizing state backup and program execution.
- Published
- 2021
- Full Text
- View/download PDF
19. Predicting Performance Degradation on Adaptive Cache Replacement Policy
- Author
- Qingxu Deng, Ran Cui, Yi Zhang, Mingsong Lv, and Chuanwen Li
- Subjects
- 010302 applied physics; Multi-core processor; Hardware_MEMORYSTRUCTURES; Computer science; Distributed computing; 02 engineering and technology; 01 natural sciences; 020202 computer hardware & architecture; Moment (mathematics); 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Performance prediction; Cache; Sensitivity (control systems); Adaptation (computer science)
- Abstract
- Adaptive Cache Replacement Policy (ACRP) has been implemented in recently proposed commercial multi-core processors. ACRP consists of two candidate cache replacement policies and dynamically employs the policy that currently incurs fewer cache misses. ACRP can diminish the overall cache misses, but at the same time it increases the performance interference between co-running applications and makes performance prediction much harder. Unfortunately, very little work has focused on the performance impact of this mechanism. In this paper, we first expose the performance variation problem due to adaptive cache replacement policies. Second, we present Bubble-Bound, a low-overhead measurement-based method to estimate a program's performance variation caused by the dynamic adaptation of cache replacement policies. By using a stress program to characterize the pressure and sensitivity, our method can predict a bound for the performance degradation between co-located applications and enable “safe” co-locations on processors with ACRP.
- Published
- 2020
- Full Text
- View/download PDF
20. Real-Time Scheduling and Analysis of OpenMP Programs with Spin Locks
- Author
- Xu Jiang, Mingsong Lv, Wang Yi, Tao Yang, and He Du
- Subjects
- 020203 distributed computing; Computer science; Distributed computing; Workload; 02 engineering and technology; Blocking (computing); 020202 computer hardware & architecture; Scheduling (computing); Resource (project management); Multithreading; Component (UML); 0202 electrical engineering, electronic engineering, information engineering; Resource allocation; Resource management; Protocol (object-oriented programming)
- Abstract
- Locking protocols are an essential component in the resource management of real-time systems, coordinating mutually exclusive accesses to shared resources from different tasks. OpenMP is a promising framework for multi-core real-time embedded systems and provides spin locks to protect shared resources. In this paper, we propose a resource model for analyzing OpenMP programs with spin locks. Based on our resource model, we also develop a technique for analyzing the blocking time, which impacts the total workload. Notably, the resource model provides detailed resource access behavior of the programs, making our blocking analysis more accurate. Further, we derive the schedulability analysis for real-time OpenMP tasks with spin locks protecting shared resources. Experiments with realistic OpenMP programs are conducted to evaluate the performance of our method.
- Published
- 2020
- Full Text
- View/download PDF
21. A Gaussian Set Sampling Model for Efficient Shared Cache Profiling on Multi-Cores
- Author
- Mingsong Lv, Zhanwei Ling, Yi Zhang, and Nan Guan
- Subjects
- General Computer Science; Computer science; Gaussian; Gaussian distribution; shared cache; 02 engineering and technology; Parallel computing; 01 natural sciences; symbols.namesake; multi-core; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Gaussian function; Overhead (computing); General Materials Science; set sampling; 010302 applied physics; Profiling (computer programming); Multi-core processor; Hardware_MEMORYSTRUCTURES; General Engineering; Sampling (statistics); 020202 computer hardware & architecture; Shared memory; symbols; lcsh:Electrical engineering. Electronics. Nuclear engineering; Cache; lcsh:TK1-9971
- Abstract
- The last level cache (LLC) has a significant impact on system performance on modern multi-core processors. But as cache sizes reach several megabytes and more, the overhead of exploring performance on the LLC greatly increases as well. To improve the efficiency of performance analysis, we propose a set-sampling-based cache profiling model for performance analysis on multi-core LLC. We first explore the memory access distributions on the LLC by developing a low-overhead stress-application-based method. The results show that memory access distributions can be approximated by a Gaussian distribution function. Based on this observation, a Gaussian-distribution-based set sampling model is proposed which can predict program performance with limited representative samples. We evaluate our model on a contemporary multi-core machine and show that 1) the proposed method can precisely predict program performance on the LLC under different contention intensities and 2) our method can achieve similar precision with fewer samples compared to widely adopted set sampling methods such as random sampling and continuous address sampling.
- Published
- 2019
- Full Text
- View/download PDF
22. An Efficient UAV Hijacking Detection Method Using Onboard Inertial Measurement Unit
- Author
- Mingsong Lv, Nan Guan, Wang Yi, Zhiwei Feng, Xue Liu, Weichen Liu, and Qingxu Deng
- Subjects
- 010504 meteorology & atmospheric sciences; Computational complexity theory; Computer science; Real-time computing; Cyber-physical system; 020206 networking & telecommunications; Gyroscope; 02 engineering and technology; Accelerometer; 01 natural sciences; Drone; law.invention; Acceleration; Hardware and Architecture; law; Inertial measurement unit; 0202 electrical engineering, electronic engineering, information engineering; Trajectory; Computer science and engineering [Engineering]; Software; Detection Methods; Drones; 0105 earth and related environmental sciences
- Abstract
- With the fast growth of civil drones, their security faces significant challenges. A commercial drone may be hijacked by a GPS-spoofing attack for illegal activities, such as terrorist attacks. The target of this article is to develop a technique that only uses onboard gyroscopes to determine whether a drone has been hijacked. Ideally, GPS data and the angular velocities measured by gyroscopes can be used to estimate the acceleration of a drone, which can be further compared with the measurement of the accelerometer to detect whether a drone has been hijacked. However, the detection results may not always be accurate due to some calculation and measurement errors, especially when no hijacking occurs in curved-trajectory situations. To overcome this, in this article, we propose a novel and simple method to detect hijacking based only on gyroscopes’ measurements and GPS data, without using any accelerometer in the detection procedure. The computational complexity of our method is very low, which makes it suitable for implementation in drones with micro-controllers. On the other hand, the proposed method does not rely on any accelerometer to detect attacks, which means it receives less information in the detection procedure, and this may reduce the accuracy of the results in some special situations. The previous method can compensate for this flaw, so high detection accuracy can also be guaranteed by combining the two methods. Experiments with a quad-rotor drone are conducted to show the effectiveness of the proposed method and the combination method.
- Published
- 2018
- Full Text
- View/download PDF
23. Integrating Cyber-Attack Defense Techniques into Real-Time Cyber-Physical Systems
- Author
- Xiaochen Hao, Mingsong Lv, Jiesheng Zheng, Zhengkui Zhang, and Wang Yi
- Subjects
- 0209 industrial biotechnology; Computer science; business.industry; Cyber-physical system; 02 engineering and technology; Execution time; 020901 industrial engineering & automation; Control flow; 020204 information systems; Embedded system; Data integrity; 0202 electrical engineering, electronic engineering, information engineering; Cyber-attack; business; Return-oriented programming; Control-flow integrity; Data-flow analysis; Vulnerability (computing); Buffer overflow
- Abstract
- With the rapid deployment of Cyber-Physical Systems (CPS), security has become a more critical problem than ever before, as such devices are interconnected and have access to a broad range of critical data. A well-known attack is Return-Oriented Programming (ROP), which can divert the control flow of a program by exploiting a buffer overflow vulnerability. To protect a program from ROP attacks, a useful method is to instrument code into the protected program to do runtime control flow checking (known as Control Flow Integrity, CFI). However, instrumented code brings extra execution time, which has to be properly handled, as most CPS need to behave in a real-time manner. In this paper, we present a technique to efficiently compute an execution plan, which maximizes the number of executions of instrumented code to achieve maximal defense effect, and at the same time guarantees real-time schedulability of the protected task system with a new response time analysis. Simulation-based experimental results show that the proposed method yields good quality execution plans while running orders of magnitude faster than exhaustive search. We also built a prototype in which a small auto-drive car is defended against ROP attacks by the proposed method implemented in FreeRTOS. The prototype demonstrates the effectiveness of our method in real-life scenarios.
- Published
- 2019
- Full Text
- View/download PDF
24. Detecting and Predicting Performance Degradation Caused by Impaired Cache Isolation
- Author
- Ran Cui, Zhanwei Ling, Qingxu Deng, Mingsong Lv, Yi Zhang, and Nan Guan
- Subjects
- 010302 applied physics; Multi-core processor; Hardware_MEMORYSTRUCTURES; Computer science; business.industry; Quality of service; Temporal isolation among virtual machines; Throughput; 02 engineering and technology; 01 natural sciences; 020202 computer hardware & architecture; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Isolation (database systems); Cache; business; Computer network
- Abstract
- As the shared last level cache (LLC) in multicore processors has been shown to be a critical resource for system performance, much work has been proposed for improving the quality of service (QoS) and throughput on the LLC. Cache Allocation Technology (CAT) and Adaptive Cache Replacement Policies (ACRP) are two of the techniques that are featured in recent Intel processors. CAT implements way partitioning and provides the ability to control the cache space allocation among cores. ACRP works with multiple replacement policies and enables the cache to adapt to the cache replacement policy with fewer cache misses. In this paper, we first show an interesting finding that the ACRP technique can violate the performance isolation provided by CAT. We find the cause of this problem is that ACRP chooses the cache replacement policy based on global information even though cache space partitioning is enabled by CAT. As a result, the cache/performance isolation can be impaired by the interference on the cache replacement policy. To deal with this problem, we propose a low-overhead method to predict the worst execution time degradation caused by the replacement policy adaptation. Thus, in the partitioned cache space, if the worst execution time estimated by our method is not beyond the response time required for this program, the QoS for this program can be guaranteed no matter how the cache replacement policies alternate.
- Published
- 2019
- Full Text
- View/download PDF
25. Scheduling and analysis of real-time task graph models with nested locks
- Author
- He Du, Mingsong Lv, Tao Yang, Wang Yi, and Xu Jiang
- Subjects
- Speedup; Hardware and Architecture; Computer science; Distributed computing; Graph (abstract data type); Digraph; Software; Scheduling (computing); Shared resource
- Abstract
- Locking protocols are a crucial component in the scheduling of real-time systems. The digraph real-time task model (DRT) is the state-of-the-art graph-based task model, which is a generalization of most previous real-time task models. To the best of our knowledge, the only work addressing the resource sharing problem in the DRT task model proposes a resource sharing protocol, called ACP, as well as a scheduling strategy EDF+ACP. Although EDF+ACP is optimal for scheduling DRT tasks with non-nested resource accesses, it cannot handle nested resource accesses. In this paper, we propose a new protocol, called N-ACP, by modifying ACP to manage nested resource accesses in task graph models. We apply N-ACP to EDF scheduling to obtain a new scheduling strategy EDF+N-ACP. We develop schedulability analysis techniques for EDF+N-ACP and evaluate its performance using a widely used quantitative metric, the speedup factor. We derive its speedup factor as a function of the maximal nesting level of resource accesses in the system.
- Published
- 2021
- Full Text
- View/download PDF
26. Multi-feature fusion for thermal face recognition
- Author
- Yangjie Wei, Mingsong Lv, Yin Bi, Nan Guan, and Wang Yi
- Subjects
Local binary patterns ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,02 engineering and technology ,Sparse approximation ,Condensed Matter Physics ,01 natural sciences ,Facial recognition system ,Atomic and Molecular Physics, and Optics ,Electronic, Optical and Magnetic Materials ,010309 optics ,Face (geometry) ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,Three-dimensional face recognition ,020201 artificial intelligence & image processing ,Artificial intelligence ,Noise (video) ,Face detection ,business - Abstract
Human face recognition has been researched for the last three decades. Face recognition with thermal images now attracts significant attention since they can be used in low- or non-illuminated environments. However, thermal face recognition performance is still insufficient for practical applications. One main reason is that most existing work leverages only a single feature to characterize a face in a thermal image. To solve the problem, we propose multi-feature fusion, a technique that combines multiple features in thermal face characterization and recognition. In this work, we designed a systematic way to combine four features: local binary pattern, Gabor jet descriptor, Weber local descriptor, and down-sampling feature. Experimental results show that our approach outperforms methods that leverage only a single feature and is robust to noise, occlusion, expression, low resolution, and different ℓ1-minimization methods.
- Published
- 2016
- Full Text
- View/download PDF
27. AutoDietary: A Wearable Acoustic Sensor System for Food Intake Recognition in Daily Life
- Author
-
Wang Yi, Chen Song, Wenyao Xu, Nan Guan, Mingsong Lv, and Yin Bi
- Subjects
Engineering ,Food intake ,business.industry ,Speech recognition ,digestive, oral, and skin physiology ,010401 analytical chemistry ,0206 medical engineering ,Embedded hardware ,Feature extraction ,Acoustic sensor ,Wearable computer ,02 engineering and technology ,020601 biomedical engineering ,01 natural sciences ,0104 chemical sciences ,law.invention ,Bluetooth ,User experience design ,law ,Human–computer interaction ,Electrical and Electronic Engineering ,Hidden Markov model ,business ,Instrumentation - Abstract
Nutrition-related diseases are nowadays a main threat to human health and pose great challenges to medical care. A crucial step toward solving these problems is to monitor the daily food intake of a person precisely and conveniently. For this purpose, we present AutoDietary, a wearable system to monitor and recognize food intakes in daily life. An embedded hardware prototype is developed to collect food intake sensor data; its key component is a high-fidelity microphone worn on the subject’s neck to precisely record acoustic signals during eating in a noninvasive manner. The acoustic data are preprocessed and then sent to a smartphone via Bluetooth, where food types are recognized. In particular, we use hidden Markov models to identify chewing or swallowing events, which are then processed to extract their time/frequency-domain and nonlinear features. A lightweight decision-tree-based algorithm is adopted to recognize the type of food. We also developed an application on the smartphone, which aggregates the food intake recognition results in a user-friendly way and provides suggestions on healthier eating, such as better eating habits or nutrition balance. Experiments show that the accuracy of food-type recognition by AutoDietary is 84.9%, and the accuracies of classifying liquid and solid food intakes are up to 97.6% and 99.7%, respectively. To evaluate real-life user experience, we conducted a survey that collected ratings from 53 participants on the wear comfort and functionalities of AutoDietary. Results show that the current design is acceptable to most of the users.
- Published
- 2016
- Full Text
- View/download PDF
28. Efficient drone hijacking detection using two-step GA-XGBoost
- Author
-
Zhiwei Feng, Qingxu Deng, Mingsong Lv, Nan Guan, Xue Liu, Wang Yi, and Wenchen Liu
- Subjects
010302 applied physics ,Correctness ,060102 archaeology ,Computer science ,business.industry ,Real-time computing ,Cyber-physical system ,06 humanities and the arts ,01 natural sciences ,Drone ,Upload ,Hardware and Architecture ,Inertial measurement unit ,0103 physical sciences ,Genetic algorithm ,Global Positioning System ,0601 history and archaeology ,business ,Software ,Inertial navigation system - Abstract
With the fast growth of civilian drones, their security faces significant challenges. A commercial drone may be hijacked by Global Positioning System (GPS)-spoofing attacks for illegal activities, such as terrorist attacks. Ideally, comparing the positions estimated by GPS and by the Inertial Navigation System (INS) can detect such attacks, but the results may be wrong because of the errors that accumulate over time in the INS. Therefore, in this paper, we propose a two-step GA-XGBoost method to detect GPS-spoofing attacks that uses only GPS and Inertial Measurement Unit (IMU) data. However, tuning the XGBoost parameters directly on the drone to achieve high prediction accuracy consumes substantial resources, which would affect the real-time performance of the drone. The proposed method therefore separates the training phase into an offboard step and an onboard step. In the offboard step, the model is first trained on flight logs, and the training parameter values are automatically tuned by a Genetic Algorithm (GA). Once the offboard model is trained, it can be uploaded to drones. To adapt our method to drones with different types of sensors and improve the correctness of prediction results, in the onboard step the model is further trained when a drone starts a mission. After onboard training finishes, the proposed method switches to prediction mode. Moreover, our method does not require any extra onboard hardware. Experiments with a real quadrotor drone show that the detection correctness is 96.3% and 100% in hijacked and non-hijacked cases, respectively, at each sampling time. Furthermore, our method can achieve 100% detection correctness within 1 s after the attacks start.
- Published
- 2020
- Full Text
- View/download PDF
29. On the Consensus Mechanisms of Blockchain/DLT for Internet of Things
- Author
-
Wang Yi, Nan Guan, Mingsong Lv, and Qingqiang He
- Subjects
business.industry ,Computer science ,Distributed ledger ,0202 electrical engineering, electronic engineering, information engineering ,020206 networking & telecommunications ,020201 artificial intelligence & image processing ,02 engineering and technology ,Convergence (relationship) ,Business model ,Internet of Things ,business ,Data science ,Theme (computing) - Abstract
Internet of Things (IoT) has been experiencing exponential growth in recent years, but still faces many serious challenges. The distributed ledger technology (DLT), e.g., Blockchain, not only appears to be promising to address these technical challenges, but also brings tremendous opportunities for new application and business models. However, the convergence of IoT and DLT is yet a goal far beyond our reach today. Among many problems that have not been sufficiently understood, a fundamental one is how to design appropriate consensus mechanisms for DLT applied to IoT, which is the theme of this paper. We first discuss the potential benefits of applying DLT to IoT, and identify major challenges posed to DLT by IoT. Then we make a survey of existing DLT consensus mechanisms, to summarize major principles and discuss their pros and cons when applied in IoT.
- Published
- 2018
- Full Text
- View/download PDF
30. A Spatial-Temporal Model for Locating Electric Vehicle Charging Stations
- Author
-
Mingyang Zhao, Lei Yang, Xinyang Dong, Dong Ji, Gang Chen, Yingnan Zhao, and Mingsong Lv
- Subjects
High energy ,geography ,geography.geographical_feature_category ,business.product_category ,Computer science ,Real-time computing ,Optimal cost ,Urban area ,Data-driven ,Charging station ,Computer Science::Systems and Control ,Electric vehicle ,Key (cryptography) ,business - Abstract
A perfect charging station network plays a key role in Electric Vehicle (EV) adoption. To find the optimal construction cost of the network and maximize usage satisfaction, we propose a data-driven framework for solving the problem of locating charging stations. Spatial-temporal models are built to analyze EV usage behavior in the urban area. Features such as charging demand, high energy consumption areas, and highly traveled paths are captured. We evaluate the proposed models on a real-world EV dataset. The results clearly demonstrate the efficiency and accuracy of our models in locating EV charging stations.
- Published
- 2018
- Full Text
- View/download PDF
31. Benchmarking OpenMP programs for real-time scheduling
- Author
-
Yang Wang, Mingsong Lv, TianZhang He, Qingqiang He, Wang Yi, Jinghao Sun, and Nan Guan
- Subjects
Metrical task system ,020203 distributed computing ,Computer science ,business.industry ,Suite ,Response time ,Workload ,02 engineering and technology ,Parallel computing ,Benchmarking ,020202 computer hardware & architecture ,Scheduling (computing) ,Software ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,business - Abstract
Real-time systems are shifting from single-core to multi-core processors. Software must be parallelized to fully utilize the computation power of multi-core architectures. OpenMP is a popular parallel programming framework in general and high-performance computing, and has recently drawn a lot of interest in embedded and real-time computing. Much recent work has been done on real-time scheduling of OpenMP-based parallel workloads. However, these studies conduct evaluations with randomly generated task systems, which cannot represent the structural features of OpenMP workloads well. This paper presents a benchmark suite, ompTGB, to support research on real-time scheduling of OpenMP-based parallel tasks. ompTGB not only collects realistic OpenMP programs, but also models them as task graphs so that real-time scheduling researchers can easily understand and use them. We also present a new response time bound for a subset of OpenMP programs and use it to demonstrate the usage of ompTGB.
- Published
- 2017
- Full Text
- View/download PDF
32. Efficient drone hijacking detection using onboard motion sensors
- Author
-
Mingsong Lv, Nan Guan, Xue Liu, Wang Yi, Weichen Liu, Qingxu Deng, and Zhiwei Feng
- Subjects
0209 industrial biotechnology ,Engineering ,Computational complexity theory ,business.industry ,Real-time computing ,020206 networking & telecommunications ,Angular velocity ,Gyroscope ,02 engineering and technology ,Accelerometer ,Drone ,law.invention ,Acceleration ,020901 industrial engineering & automation ,Position (vector) ,law ,0202 electrical engineering, electronic engineering, information engineering ,Global Positioning System ,business ,Simulation - Abstract
The fast growth of civil drones raises significant security challenges. A legitimate drone may be hijacked by GPS spoofing for illegal activities, such as terrorist attacks. The goal of this paper is to develop techniques that let drones detect whether they have been hijacked using onboard motion sensors (accelerometers and gyroscopes). Ideally, the linear acceleration and angular velocity measured by motion sensors can be used to estimate the position of a drone, which can be compared with the position reported by GPS to detect whether the drone has been hijacked. However, position estimation by motion sensors is very inaccurate due to significant error accumulation over time. In this paper, we propose a novel method to detect hijacking based on motion sensor measurements and GPS, which overcomes the accumulative error problem. The computational complexity of our method is very low, so it is suitable for implementation in the micro-controllers of drones. Experiments with a quad-rotor drone are conducted to show the effectiveness of the proposed method.
- Published
- 2017
- Full Text
- View/download PDF
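The error accumulation mentioned in the abstract of entry 32 can be illustrated with a minimal sketch. It is not the authors' detection method: it only shows, under the assumption of a constant accelerometer bias and simple Euler integration, how a small bias turns into a position error that grows roughly quadratically with time. The function name and the bias, step, and duration values are hypothetical.

```python
def dead_reckoning_error(bias, dt, steps):
    """Position error (m) after double-integrating a constant
    accelerometer bias (m/s^2) for `steps` samples of length dt (s)."""
    velocity_error = 0.0
    position_error = 0.0
    for _ in range(steps):
        velocity_error += bias * dt            # first integration
        position_error += velocity_error * dt  # second integration
    return position_error


# Hypothetical values: 0.05 m/s^2 bias sampled at 100 Hz.
after_10s = dead_reckoning_error(0.05, 0.01, 1000)
after_60s = dead_reckoning_error(0.05, 0.01, 6000)
print(f"error after 10 s: {after_10s:.2f} m, after 60 s: {after_60s:.2f} m")
```

The result stays close to the closed form 0.5 * bias * t^2, which is why a direct comparison between an inertial position estimate and GPS becomes unreliable on longer flights.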
33. Efficient and Effective Dimension Control in Automotive Applications
- Author
-
Ye Ma, Gang Chen, Wang Yi, Mingsong Lv, Hao Chen, Xue Liu, and Bo Zhu
- Subjects
Production line ,business.industry ,Computer science ,Industrial production ,020208 electrical & electronic engineering ,Process (computing) ,Automotive industry ,02 engineering and technology ,Effective dimension ,Industrial engineering ,Computer Science Applications ,Control and Systems Engineering ,0202 electrical engineering, electronic engineering, information engineering ,Constraint programming ,Process control ,Electrical and Electronic Engineering ,Dimension (data warehouse) ,business ,Information Systems - Abstract
In the automotive industry, the production line for assembling mechanical parts of vehicles must place and weld hundreds of components at the right positions on the platform. The accuracy of deploying the components has great impact on the quality and performance of the produced vehicle. To ensure the assembly accuracy, a critical task in the production process is the so-called dimension quality control. The current state of practice in the automotive industry is mainly based on a manual process in which experienced engineers use production data to identify accuracy problems and suggest corrections to fixture adjustment in the assembly line. It is an extremely inefficient process, which typically takes the engineers around ten days for one batch of vehicles and a year to achieve the required assembly accuracy for final production. In this article, we present an automatic technique for dimension control. We formulate the dimension control problem as a constraint programming problem and present a refinement method to prune the exploration space. Our technique can not only identify the wrongly deployed parts leading to dimensional defects, but also provide high-quality fixture adjustment decisions. Experiments conducted on industrial production data from BMW Brilliance Automotive demonstrate that our approach significantly improves the efficiency and effectiveness of dimension control in the automotive industry.
- Published
- 2020
- Full Text
- View/download PDF
34. WCET analysis with MRU cache
- Author
-
Wang Yi, Ge Yu, Mingsong Lv, and Nan Guan
- Subjects
Hardware_MEMORYSTRUCTURES ,Hardware and Architecture ,Power consumption ,Computer science ,Adaptive replacement cache ,Feature (machine learning) ,Contrast (statistics) ,Cache ,Parallel computing ,Predictability ,Abstract interpretation ,Software - Abstract
Most previous work on cache analysis for WCET estimation assumes a particular replacement policy called LRU. In contrast, much less work has been done for non-LRU policies, since they are generally considered to be very unpredictable. However, most commercial processors are actually equipped with these non-LRU policies, since they are more efficient in terms of hardware cost, power consumption and thermal output, while still maintaining almost as good average-case performance as LRU. In this work, we study the analysis of MRU, a non-LRU replacement policy employed in mainstream processor architectures like Intel Nehalem. Our work shows that the predictability of MRU has been significantly underestimated before, mainly because the existing cache analysis techniques and metrics do not match MRU well. As our main technical contribution, we propose a new cache hit/miss classification, k-Miss, to better capture the MRU behavior, and develop formal conditions and efficient techniques to decide k-Miss memory accesses. A remarkable feature of our analysis is that the k-Miss classifications under MRU are derived from the analysis result of the same program under LRU. Therefore, our approach inherits the advantages in efficiency and precision of the state-of-the-art LRU analysis techniques based on abstract interpretation. Experiments with instruction caches show that our proposed MRU analysis has both good precision and high efficiency, and the estimated WCET is rather close to (typically 1%∼8% more than) that obtained by the state-of-the-art LRU analysis, which indicates that MRU is also a good candidate for cache replacement policies in real-time systems.
- Published
- 2014
- Full Text
- View/download PDF
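To make the replacement policy studied in entry 34 more concrete, the following is a minimal sketch of one cache set under a bit-based MRU policy. The abstract does not spell out the policy, so the rules used here are an assumption: each way has one MRU bit, an access sets the bit, setting the last remaining 0 bit clears all other bits, and a miss replaces the first way whose bit is 0. The class and method names are hypothetical, and the sketch does not implement the k-Miss analysis itself.

```python
class MRUCacheSet:
    """One cache set under an assumed bit-based MRU replacement policy."""

    def __init__(self, ways):
        if ways < 2:
            raise ValueError("this sketch needs at least two ways")
        self.tags = [None] * ways  # memory block held by each way
        self.mru = [0] * ways      # one MRU bit per way

    def _touch(self, way):
        self.mru[way] = 1
        if all(self.mru):
            # The last 0 bit was just set: clear every other bit.
            self.mru = [0] * len(self.mru)
            self.mru[way] = 1

    def access(self, tag):
        """Access a block; return True on a hit, False on a miss."""
        if tag in self.tags:
            self._touch(self.tags.index(tag))
            return True
        victim = self.mru.index(0)  # first way whose MRU bit is 0
        self.tags[victim] = tag
        self._touch(victim)
        return False


cache_set = MRUCacheSet(ways=4)
for block in "abcdeacb":
    outcome = "hit" if cache_set.access(block) else "miss"
    print(block, outcome)
```

Running the same access sequence through an LRU set of equal size is a simple way to see where the two policies diverge, which is the kind of relation the abstract says the analysis exploits.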
35. Speed planning for solar-powered electric vehicles
- Author
-
Xue Liu, Wang Yi, Ye Ma, Nan Guan, Dong Ji, Erwin Knippel, and Mingsong Lv
- Subjects
0209 industrial biotechnology ,Engineering ,business.product_category ,Range anxiety ,business.industry ,020209 energy ,Real-time computing ,02 engineering and technology ,Solver ,Solar energy ,Dynamic programming ,020901 industrial engineering & automation ,Obstacle ,Electric vehicle ,0202 electrical engineering, electronic engineering, information engineering ,Overhead (computing) ,business ,Energy harvesting ,Simulation - Abstract
Electric vehicles (EVs) are the trend for future transportation. The major obstacle is range anxiety due to poor availability of charging stations and long charging time. Solar-powered EVs, which mostly rely on solar energy, are free of charging limitations. However, the range anxiety problem is more severe because it depends on the availability of sunlight. For example, shading from buildings or trees may cause a solar-powered EV to stop halfway through a trip. In this paper, we show that by optimally planning the speed on different road segments and thus balancing energy harvesting and consumption, we can enable a solar-powered EV to successfully reach the destination using the shortest travel time. The speed planning problem is essentially a constrained non-linear programming problem, which is generally difficult to solve. We have identified an optimality property that allows us to compute an optimal speed assignment for a partition of the path; then, a dynamic programming method is developed to efficiently compute the optimal speed assignment for the whole trip with significantly lower computation overhead than the state-of-the-art non-linear programming solver. To evaluate the usability of the proposed method, we have also developed a solar-powered EV prototype. Experiments show that the predictions by the proposed technique match well with the data collected from the physical EV. Issues of practical implementation are also discussed.
- Published
- 2016
- Full Text
- View/download PDF
36. Improving the Performance of Shared Memory Communication in Impulse C
- Author
-
Qingxu Deng, Xi Jin, Mingsong Lv, and Nan Guan
- Subjects
General Computer Science ,Computer science ,business.industry ,Impulse C ,Memory management ,Software ,Shared memory ,Computer architecture ,Control and Systems Engineering ,High-level programming language ,Embedded system ,Process control ,business ,Field-programmable gate array ,Scope (computer science) - Abstract
With the evolution of field-programmable gate arrays (FPGAs) to the million-gate scale, high-level languages are gaining popularity in electronic system design, where they greatly improve design and verification efficiency. Impulse C is a high-level language widely used in software/hardware (SW/HW) codesign and provides users with various SW/HW communication mechanisms. However, the communication mechanisms of Impulse C are mainly designed for versatility, and the resources within the FPGA chip are not fully utilized. In this letter, we present an improved implementation of the shared memory communication in Impulse C by utilizing both ports of the dual-port BRAM. Experimental results show that the improved implementation can greatly improve the performance of shared memory communication, and further improve the execution efficiency of hardware processes.
- Published
- 2010
- Full Text
- View/download PDF
37. Static worst-case execution time analysis of the μC/OS-II real-time kernel
- Author
-
Mingsong Lv, Yi Wang, Ge Yu, Nan Guan, and Qingxu Deng
- Subjects
General Computer Science ,Computer science ,business.industry ,Complex system ,Static timing analysis ,Static analysis ,Execution time ,Theoretical Computer Science ,Worst-case execution time ,Kernel (statistics) ,Embedded system ,ComputerSystemsOrganization_SPECIAL-PURPOSEANDAPPLICATION-BASEDSYSTEMS ,Interrupt ,business ,Real-time operating system - Abstract
Worst-case execution time (WCET) analysis is one of the major tasks in timing validation of hard real-time systems. In complex systems with real-time operating systems (RTOS), the timing properties of the system are decided by both the applications and RTOS. Traditionally, WCET analysis mainly deals with application programs, while it is crucial to know whether RTOS also behaves in a timely predictable manner. In this paper, static analysis techniques are used to predict the WCET of the system calls and the Disable Interrupt regions of the μC/OS-II real-time kernel, which presents a quantitative evaluation of the real-time performance of μC/OS-II. The precision of applying existing WCET analysis techniques on RTOS is evaluated, and the practical difficulties in using static methods in timing analysis of RTOS are also discussed.
- Published
- 2010
- Full Text
- View/download PDF
38. WCET Analysis with MRU Caches: Challenging LRU for Predictability
- Author
-
Wang Yi, Mingsong Lv, Nan Guan, and Ge Yu
- Subjects
Hardware_MEMORYSTRUCTURES ,Computer science ,Contrast (statistics) ,02 engineering and technology ,Parallel computing ,Abstract interpretation ,020202 computer hardware & architecture ,Composability ,Power consumption ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,020201 artificial intelligence & image processing ,Cache ,Predictability ,Real-time operating system - Abstract
Most previous work in cache analysis for WCET estimation assumes a particular replacement policy called LRU. In contrast, much less work has been done for non-LRU policies, since they are generally considered to be very "unpredictable". However, most commercial processors are actually equipped with these non-LRU policies, since they are more efficient in terms of hardware cost, power consumption and thermal output, while still maintaining almost as good average-case performance as LRU. In this work, we study the analysis of MRU, a non-LRU replacement policy employed in mainstream processor architectures like Intel Nehalem. Our work shows that the predictability of MRU has been significantly underestimated before, mainly because the existing cache analysis techniques and metrics, originally designed for LRU, do not match MRU well. As our main technical contribution, we propose a new cache hit/miss classification, k-Miss, to better capture the MRU behavior, and develop formal conditions and efficient techniques to decide the k-Miss memory accesses. A remarkable feature of our analysis is that the k-Miss classifications under MRU are derived from the analysis result of the same program under LRU. Therefore, our approach inherits all the advantages in efficiency, precision and composability of the state-of-the-art LRU analysis techniques based on abstract interpretation. Experiments with benchmarks show that the WCET estimated by our proposed MRU analysis is rather close to (5%∼20% more than) that obtained by the state-of-the-art LRU analysis, which indicates that MRU is also a good candidate for the cache replacement policy in real-time systems.
- Published
- 2012
- Full Text
- View/download PDF
39. McAiT – A Timing Analyzer for Multicore Real-Time Software
- Author
-
Ge Yu, Wang Yi, Nan Guan, Mingsong Lv, and Qingxu Deng
- Subjects
Multi-core processor ,Task (computing) ,Computer science ,Real-time computing ,Synchronization (computer science) ,Cache ,Abstract interpretation ,Automaton ,Jitter ,Shared resource - Abstract
We present McAiT, a tool for estimating the Worst-Case Execution Times (WCET) of programs running on multicore processors. The highlight of McAiT is that it leverages timed automata to model both the timing behavior of a program's interaction with its environment (based on the results of local cache analysis by abstract interpretation) and a broad range of on-chip shared resources, such as shared buses and shared caches. McAiT also allows for modeling complex task features, such as synchronization and jitter. The McAiT approach achieves high analysis precision, which is demonstrated by extensive experiments. The tool also supports the classical Implicit Path Enumeration Technique (IPET) combined with worst-case shared resource access delay for WCET estimation, giving users the flexibility to trade analysis precision for efficiency.
- Published
- 2011
- Full Text
- View/download PDF
40. Combining Abstract Interpretation with Model Checking for Timing Analysis of Multicore Software
- Author
-
Wang Yi, Ge Yu, Nan Guan, and Mingsong Lv
- Subjects
Multi-core processor ,Hardware_MEMORYSTRUCTURES ,Computer science ,CPU cache ,business.industry ,Timed automaton ,Memory bus ,Parallel computing ,Shared memory ,Bus sniffing ,Embedded system ,Cache ,Local bus ,business - Abstract
It is predicted that multicores will be increasingly used in future embedded real-time systems for high performance and low energy consumption. The major obstacle is that we may not be able to predict and provide any guarantee on real-time properties of software on such platforms. The shared memory bus is among the most critical resources, and it severely degrades the timing predictability of multicore software due to the access contention between cores. In this paper, we study a multicore architecture where each core has a local L1 cache and all cores use a shared bus to access the off-chip memory. We use Abstract Interpretation (AI) to analyze the local cache behavior of a program running on a dedicated core. Based on the cache analysis, we construct a Timed Automaton (TA) to model when the programs access the memory bus. Then we model the shared bus, also using timed automata. The TA models for the bus and programs are explored using the UPPAAL model checker to find the WCETs for the respective programs. Based on the presented techniques, we have developed a tool for multicore timing analysis, which allows automatic generation of the TA models from binary code and WCET estimation for any given TA model of the shared bus. Extensive experiments have been conducted, showing that the combined approach can significantly tighten the estimations. As examples, we have studied the TDMA and FCFS buses, of which the WCET bounds can be tightened by up to 240% and 82% respectively, compared with the worst-case bounds estimated based on worst-case bus access delay.
- Published
- 2010
- Full Text
- View/download PDF
41. A Survey of WCET Analysis of Real-Time Operating Systems
- Author
-
Yi Zhang, Nan Guan, Mingsong Lv, Qingxu Deng, Jianming Zhang, and Ge Yu
- Subjects
Correctness ,Computer science ,business.industry ,Static timing analysis ,Static analysis ,Scheduling (computing) ,Reliability engineering ,Embedded software ,Program analysis ,Systems analysis ,Embedded system ,ComputerSystemsOrganization_SPECIAL-PURPOSEANDAPPLICATION-BASEDSYSTEMS ,business ,Real-time operating system - Abstract
Timing correctness of hard real-time systems is guaranteed by schedulability analysis and worst-case execution time (WCET) analysis of programs. Traditional WCET analysis mainly deals with application programs and has achieved success in industry. Timing analysis of application programs alone cannot guarantee the correctness of complete systems that include an RTOS. WCET tools designed for application program analysis have been applied to analyze RTOS routines by several research groups, but poor WCET estimations have been reported. Timing analysis of real-time systems considering both applications and the RTOS has not been fully studied. We therefore give a survey of related work on WCET analysis of RTOS. By summarizing previous work, we present the challenges of WCET analysis of complete real-time systems and outline possible directions for further research.
- Published
- 2009
- Full Text
- View/download PDF
42. WCET Analysis of the μC/OS-II Real-Time Kernel
- Author
-
Ge Yu, Yi Zhang, Mingsong Lv, Nan Guan, Qingxu Deng, Rui Chen, and Wang Yi
- Subjects
business.industry ,Computer science ,Embedded system ,Kernel (statistics) ,Code (cryptography) ,Static timing analysis ,ComputerSystemsOrganization_SPECIAL-PURPOSEANDAPPLICATION-BASEDSYSTEMS ,Context (language use) ,Static analysis ,business ,Real-time operating system ,Execution time - Abstract
Worst-case execution time (WCET) analysis is one of the major tasks in timing validation of hard real-time systems. In complex systems with real-time operating systems (RTOS), the timing properties of the system are decided by both the applications and the RTOS. Traditionally, WCET analysis mainly deals with application programs, while it is crucial to know whether the RTOS also behaves in a timely predictable manner. In this paper, we present a case study where static analysis is used to predict the WCET of the system calls of the uC/OS-II real-time kernel. To our knowledge, this paper is the first to present quantitative results on the real-time performance of uC/OS-II. The precision of applying existing WCET analysis techniques on RTOS code is evaluated, and the practical difficulties in using static methods in timing analysis of RTOS are also reported.
- Published
- 2009
- Full Text
- View/download PDF
43. Performance Comparison of Techniques on Static Path Analysis of WCET
- Author
-
Mingsong Lv, Qingxu Deng, Zonghua Gu, Nan Guan, and Ge Yu
- Subjects
Model checking ,Worst-case execution time ,Computer science ,Scalability ,Parallel computing ,Path analysis (computing) ,Critical path method ,Upper and lower bounds ,Automaton ,Data modeling - Abstract
Static path analysis is a key process of Worst Case Execution Time (WCET) estimation, the objective of which is to find the execution path that has the largest execution time. Currently, there is a debate in the research community over whether model checking is another good solution for WCET analysis, besides ILP. To our knowledge, no paper so far has addressed this debate with real performance data. In this paper, we implement both ILP and model checking for static path analysis of WCET, and the experimental results show that ILP yields very good performance, while model checking only works well for simple programs and is prone to scalability problems when dealing with programs that have complex structures and large loop counts.
- Published
- 2008
- Full Text
- View/download PDF
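The objective stated in the abstract of entry 43, finding the execution path with the largest execution time, can be sketched on a toy example. The code below is neither the ILP formulation nor the model-checking approach compared in the paper; it is a plain longest-path search that assumes an acyclic control-flow graph (loops already unrolled or bounded) and fixed per-block costs. The function name, block names, and costs are hypothetical.

```python
from functools import lru_cache


def wcet_longest_path(cost, succ, entry):
    """Largest total cost of any path starting at `entry` in an acyclic CFG.

    cost: dict mapping basic block -> execution time of that block
    succ: dict mapping basic block -> list of successor blocks
    """
    @lru_cache(maxsize=None)
    def longest(block):
        # Worst continuation among the successors; 0 for an exit block.
        tail = max((longest(s) for s in succ.get(block, ())), default=0)
        return cost[block] + tail

    return longest(entry)


# Hypothetical if/else diamond: entry -> (then | else) -> exit.
cost = {"entry": 2, "then": 10, "else": 4, "exit": 1}
succ = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"]}
print(wcet_longest_path(cost, succ, "entry"))  # prints 13
```

Loops and infeasible paths are what make the real problem harder than this sketch suggests, and they are what the ILP and model-checking techniques in the abstract are meant to handle.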
44. On-Line Placement of Real-Time Tasks on 2D Partially Run-Time Reconfigurable FPGAs
- Author
-
Qingxu Deng, Fanxin Kong, Mingsong Lv, Nan Guan, and Wang Yi
- Subjects
Task (computing) ,business.industry ,Computer science ,Embedded system ,Line (geometry) ,Algorithm complexity ,Metric (mathematics) ,Software system ,Complex network ,Space (commercial competition) ,Field-programmable gate array ,business - Abstract
Partially Runtime-Reconfigurable (PRTR) FPGAs allow hardware tasks to be placed and removed dynamically at runtime in multi-tasking systems. Such systems need not only to support sharing of the resources in space, but also to guarantee timely execution of the tasks. We present a novel online task placement algorithm under real-time constraints. The proposed algorithm uses a new metric that places tasks densely, so that a larger contiguous free area remains. Simulation experiments indicate that our approach gives better results and sometimes has lower algorithm complexity than existing techniques.
- Published
- 2008
- Full Text
- View/download PDF
45. Schedulability Analysis of Global Fixed-Priority or EDF Multiprocessor Scheduling with Symbolic Model-Checking
- Author
-
Zonghua Gu, Nan Guan, Mingsong Lv, Ge Yu, and Qingxu Deng
- Subjects
Model checking ,Statistical classification ,Important research ,Object-oriented modeling ,Computer science ,Distributed computing ,Processor scheduling ,Parallel computing ,Formal verification ,Multiprocessor scheduling ,Scheduling (computing) - Abstract
As Moore's law comes to an end, multi-processor (MP) systems are becoming increasingly important in embedded systems design, hence real-time schedulability analysis for MP systems has become an important research topic. In this paper, we present an exact method for schedulability analysis of global multiprocessor scheduling with either fixed-priority (FP) or earliest-deadline-first (EDF) algorithms using the model-checker NuSMV. Compared to safe but pessimistic schedulability tests based on processor utilization bounds, model-checking can provide an exact answer to the schedulability of a taskset, as well as quantitative information on each task's best-case and worst-case response times.
- Published
- 2008
- Full Text
- View/download PDF
46. RTNoC: A Simulation Tool for Real-Time Communication Scheduling on Networks-on-Chips
- Author
-
Mingsong Lv, Qingxu Deng, Nan Guan, and Ying Guo
- Subjects
Fixed-priority pre-emptive scheduling ,Network on a chip ,Computer science ,Real-time communication ,Two-level scheduling ,Distributed computing ,Processor scheduling ,Dynamic priority scheduling ,ComputerSystemsOrganization_PROCESSORARCHITECTURES ,Round-robin scheduling ,Bottleneck ,Fair-share scheduling ,Scheduling (computing) - Abstract
Networks-on-chips (NoC) is accepted as the most promising on-chip communication infrastructure to solve the communication bottleneck of MPSoCs. Currently, research on real-time communication scheduling on NoCs is immature, and no existing simulation tool can satisfy the requirements of simulating real-time communication scheduling on NoCs. Such a simulator is highly desirable for the research on NoC-based real-time systems. In this paper, we designed RTNoC, a real-time communication scheduling simulator on wormhole-switched NoCs. We also use this simulator to evaluate the real-time performance of different scheduling algorithms with periodic task sets. Experiment results show that RTNoC is a good tool to motivate research on real-time communication scheduling.
- Published
- 2008
- Full Text
- View/download PDF
47. ARMISS: An Instruction Set Simulator for the ARM Architecture
- Author
-
Mingsong Lv, Yaming Xie, Ge Yu, Qingxu Deng, and Nan Guan
- Subjects
Instruction set ,ARM architecture ,Orthogonal instruction set ,Instruction set simulator ,Computer architecture simulator ,Computer architecture ,Reduced instruction set computing ,Computer science ,business.industry ,Embedded system ,Software design ,Interrupt ,business - Abstract
The development efficiency of embedded systems is highly pressured due to the pursuit of short time-to-market of embedded products. In traditional design flow, although software can be developed in parallel with the hardware platform, it can only be tested and verified after the platform is fabricated. ARMISS, an Instruction Set Simulator for the ARM architecture, is developed to enable early software testing and verification. The ARM instruction set, MMU and interrupt handling are emulated in this tool. An instruction caching technique is designed to accelerate the interpretation-based instruction emulation. ARMISS is implemented in the C programming language, thus is highly portable across various host operating systems for embedded system design.
- Published
- 2008
- Full Text
- View/download PDF
48. Static Scheduling and Software Synthesis for Dataflow Graphs with Symbolic Model-Checking
- Author
-
Zonghua Gu, Mingxuan Yuan, Nan Guan, Mingsong Lv, Xiuqiang He, Qingxu Deng, and Ge Yu
- Subjects
Front-side bus ,Computer science ,business.industry ,media_common.quotation_subject ,Computation ,Upper and lower bounds ,Time variance ,Task (computing) ,Embedded system ,Cache ,Function (engineering) ,business ,media_common ,TRACE (psycholinguistics) - Abstract
The integration phase of real-time COTS-based systems is often problematic because when multiple tasks run concurrently, the interference at the bus level between cache fetching activities and I/O peripheral transactions is significant and causes unpredictable behaviors: experimentally, tasks can have computation time variance up to 50%. In this work, we present a theoretical framework able to model the interaction between CPU and peripherals contending for shared main memory through the front side bus (FSB). We first show how to compute worst case execution times for a task given a trace of its cache activity and given an upper bound function that models peripheral activities; then, we introduce the novel idea of "hardware server" as a means of controlling the unpredictable behavior of COTS peripheral components.
- Published
- 2007
- Full Text
- View/download PDF
49. A Real-Time Scheduling Algorithm with Buffer Optimization for Embedded Signal Processing Systems
- Author
-
Nan Guan, Mingsong Lv, Ge Yu, and Qingxu Deng
- Subjects
Signal processing ,Computer science ,Hybrid system ,Real-time computing ,Buffer (optical fiber) ,Fair-share scheduling ,Scheduling (computing) - Abstract
Embedded signal processing systems are a typical type of application in the embedded domain. Such systems typically have requirements on high real-time responsiveness and large buffer capacity. In this paper a new scheduling policy for embedded signal processing systems is advanced, which can provide static scheduling for such hybrid systems that contain both data stream processing and independent periodic tasks, and at the same time can minimize buffer consumption.
- Published
- 2007
- Full Text
- View/download PDF
50. Composing Functional and State-Based Performance Models for Analyzing Heterogeneous Real-Time Systems
- Author
-
Mingxuan Yuan, Xiuqiang He, Mingsong Lv, Ge Yu, Qingxu Deng, Zonghua Gu, and Nan Guan
- Subjects
Model checking ,Computer science ,business.industry ,Dataflow ,event count automata ,functional-based performance models ,heterogeneous real-time systems ,real-time calculus ,state-based performance models ,distributed processing ,systems analysis ,Parallel computing ,Data buffer ,Scheduling (computing) ,ddc ,Systems analysis ,Formal specification ,business ,Digital signal processing ,Data-flow analysis - Abstract
In this paper, we address the problem of static scheduling and software synthesis for dataflow graphs with the symbolic model-checker NuSMV using a two-step process: first use model-checking to obtain a static schedule with the objective of minimizing the data buffer size, then synthesize efficient code from the static schedule with the objective of minimizing code size and performance overheads due to runtime dynamic decisions. We show the effectiveness of these techniques using a number of digital signal processing examples.
- Published
- 2006