224 results on "Jean-Luc Gaudiot"
Search Results
2. Autonomous Vehicles Digital Twin: A Practical Paradigm for Autonomous Driving System Development
- Author
-
Bo Yu, Chongyu Chen, Jie Tang, Shaoshan Liu, and Jean-Luc Gaudiot
- Subjects
General Computer Science
- Published
- 2022
- Full Text
- View/download PDF
3. Anomaly-Based Detection of Microarchitectural Attacks for IoT Devices
- Author
-
Congmiao Li and Jean-Luc Gaudiot
- Abstract
With the rapid growth of Internet of Things (IoT) technology, the number of connected devices and information processed through the Internet has significantly risen. As a result, cyber-attacks targeting vulnerable IoT devices have also dramatically increased. Microarchitectural attacks pose a serious threat to IoT security because they can not only leak confidential information to adversaries through shared processor resources such as caches, branch predictors and various functional units but also fully compromise the embedded system devices themselves. Traditional signature-based antivirus software cannot effectively detect microarchitectural attacks, particularly zero-day attacks. In this paper, an anomaly-based detector for IoT devices is developed to demonstrate the feasibility of detecting unknown microarchitectural attacks for resource-constrained devices using features collected from hardware performance counters with unsupervised machine learning and feature selection methods. Our experiments show promising detection results for the tested devices. The methodology will be used to guide the development of such detectors for similar IoT devices.
- Published
- 2023
- Full Text
- View/download PDF
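A hedged illustration of the detection approach described in entry 3 above: an unsupervised model is trained on hardware-performance-counter (HPC) samples from benign runs and then flags deviating samples at run time. The counter names, the synthetic data, and the choice of IsolationForest are assumptions made for this sketch, not details taken from the paper.

```python
# Illustrative sketch only: fit an unsupervised anomaly detector on HPC
# samples from benign runs, then flag deviating samples. Feature names and
# data are hypothetical; the paper's counters, models, and feature-selection
# procedure may differ.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

FEATURES = ["llc_misses", "branch_mispredictions", "instructions", "cache_references"]

rng = np.random.default_rng(0)
benign = rng.normal(loc=[2e4, 1e3, 5e6, 8e4], scale=[3e3, 2e2, 5e5, 1e4], size=(500, 4))

scaler = StandardScaler().fit(benign)
detector = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
detector.fit(scaler.transform(benign))

def is_anomalous(sample):
    """Return True if an HPC sample deviates from the benign profile."""
    label = detector.predict(scaler.transform(np.asarray(sample).reshape(1, -1)))
    return label[0] == -1  # IsolationForest labels outliers as -1

# Example: a sample with an unusually high LLC-miss count is flagged.
print(is_anomalous([9e4, 1.1e3, 5e6, 8.5e4]))
```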
4. Programming Autonomous Machines: Special Session Paper
- Author
-
Shaoshan Liu, Xiaoming Li, Tongsheng Geng, Stephane Zuckerman, and Jean-Luc Gaudiot
- Published
- 2022
- Full Text
- View/download PDF
5. Π-RT: A Runtime Framework to Enable Energy-Efficient Real-Time Robotic Vision Applications on Heterogeneous Architectures
- Author
-
Yuan Xie, Liu Liu, Jean-Luc Gaudiot, Shaoshan Liu, Jie Tang, and Bo Yu
- Subjects
General Computer Science, Computer science, Cloud computing, Embedded system, Task analysis, Robot, Efficient energy use
- Abstract
We propose Π-RT, the first robotic vision runtime framework to efficiently manage dynamic task execution on mobile systems with multiple accelerators as well as on the cloud to achieve better performance and energy savings. With Π-RT, we enable a robot to simultaneously perform autonomous navigation.
- Published
- 2021
- Full Text
- View/download PDF
6. Concept drift detection for distributed multi-model machine learning systems
- Author
-
Beverly Abadines Quon and Jean-Luc Gaudiot
- Published
- 2022
- Full Text
- View/download PDF
7. Secure Data Storage and Recovery in Industrial Blockchain Network Environments
- Author
-
Wei Liang, Yongkai Fan, Jean-Luc Gaudiot, Kuan-Ching Li, and Dafang Zhang
- Subjects
Blockchain, Distributed database, Smart contract, Computer science, Cloud computing, Computer Science Applications, Control and Systems Engineering, Distributed data store, Computer data storage, Electrical and Electronic Engineering, Information Systems, Computer network
- Abstract
Massive redundant data storage and communication in network 4.0 environments suffer from low integrity, high cost, and easy tampering. To address these issues, this article proposes a secure data storage and recovery scheme for blockchain-based networks that improves the decentralization, tamper resistance, real-time monitoring, and management of storage systems; this design supports the dynamic storage, fast repair, and update of distributed data in the data storage systems of industrial nodes. A local regenerative code technique is used to repair and store data across failed nodes while preserving the privacy of user data. That is, when stored data are found to be damaged, multiple local repair groups constructed from vector codes can simultaneously and efficiently repair multiple distributed storage nodes. Building on the chain storage structure, including the data consensus mechanism and smart contracts, the blockchain distributed-coding storage structure not only quickly repairs nearby local regenerative codes in the blockchain but also reduces the resource overhead of the data storage process at industrial nodes. Experimental results show that the proposed scheme improves the multinode data repair rate by 9% and the data storage rate by 8.6%, indicating good security and real-time performance.
- Published
- 2020
- Full Text
- View/download PDF
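As a minimal illustration of the "local repair group" idea mentioned in entry 7 above, the sketch below uses one XOR parity block per group so that any single lost data block can be rebuilt from the group's surviving blocks, and several groups can repair failures in parallel. This is a toy stand-in, not the paper's vector/regenerative code.

```python
# Minimal local-repair-group illustration (toy code, not the paper's scheme):
# one XOR parity block per group lets any single lost block be rebuilt from
# the surviving blocks of the same group.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def make_group(data_blocks):
    """Return the data blocks plus one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def repair(group, lost_index):
    """Rebuild the block at lost_index from the other blocks in the group."""
    survivors = [b for i, b in enumerate(group) if i != lost_index]
    return xor_blocks(survivors)

group = make_group([b"node-A-data!", b"node-B-data!", b"node-C-data!"])
assert repair(group, 1) == b"node-B-data!"   # lost data block recovered locally
```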
8. A novel data representation framework based on nonnegative manifold regularisation
- Author
-
Kuan-Ching Li, Yan Jiang, Hongbo Zhou, Jintian Tang, Jean-Luc Gaudiot, and Wei Liang
- Subjects
Theoretical computer science, Computer science, Matrix factorisation, Data representation, Human-Computer Interaction, Artificial Intelligence, Multimedia content analysis, Cognitive learning, Cluster analysis, Feature learning, Software
- Abstract
Representation learning techniques have been frequently applied in multimedia content analysis and retrieval. In this study, an efficient multimedia data clustering method is presented, which consists of two independent algorithms. First, we propose a new representation framework that incorporates sparse coding and manifold regularisation in an optimisation objective function; the cluster indicator matrix is coarsely estimated by introducing an ℓ1 sparsity norm. Second, we refine the estimated cluster indicator matrix by performing spectral rotation so that an optimal clustering assignment can be learned. Compared with existing methods, ours has the following merit: it takes into account global matrix reconstruction information and local manifold information simultaneously, so both global and local structure are respected. Additionally, theoretical justification for the novel representation method is presented in this study. Comprehensive experiments demonstrate the effectiveness and efficiency of our method in comparison with state-of-the-art clustering methods on six real-world image datasets.
- Published
- 2020
- Full Text
- View/download PDF
9. Autonomous vehicles lite: self-driving technologies should start small, go slow
- Author
-
Jean-Luc Gaudiot and Shaoshan Liu
- Subjects
Transport engineering, Industrial park, Business, Electricity, Electrical and Electronic Engineering, China, Tourism
- Abstract
Many young urbanites don't want to own a car, and unlike earlier generations, they don't have to rely on mass transit. Instead they treat mobility as a service: When they need to travel significant distances, say, more than 5 miles (8 kilometers), they use their phones to summon an Uber (or a car from a similar ride-sharing company). If they have less than a mile or so to go, they either walk or use various “micromobility” services, such as the increasingly ubiquitous Lime and Bird scooters or, in some cities, bike sharing. The problem is that today's mobility-as-a-service ecosystem often doesn't do a good job covering intermediate distances, say a few miles. Hiring an Uber or Lyft for such short trips proves frustratingly expensive, and riding a scooter or bike more than a mile or so can be taxing to many people. So getting yourself to a destination that is from 1 to 5 miles away can be a challenge. Yet such trips account for about half of the total passenger miles traveled. Many of these intermediate-distance trips take place in environments with limited traffic, such as university campuses and industrial parks, where it is now both economically reasonable and technologically possible to deploy small, low-speed autonomous vehicles powered by electricity. We've been involved with a startup that intends to make this form of transportation popular. The company, PerceptIn, has autonomous vehicles operating at tourist sites in Nara and Fukuoka, Japan; at an industrial park in Shenzhen, China; and is just now arranging for its vehicles to shuttle people around Fishers, Ind., the location of the company's headquarters.
- Published
- 2020
- Full Text
- View/download PDF
10. Cybersecurity and High-Performance Computing Environments
- Author
-
Kuan-Ching Li, Nitin Sukhija, Elizabeth Bautista, and Jean-Luc Gaudiot
- Published
- 2022
- Full Text
- View/download PDF
11. TransMigrator: A Transformer-Based Predictive Page Migration Mechanism for Heterogeneous Memory
- Author
-
Songwen Pei, Jianan Li, Yihuan Qian, Jie Tang, and Jean-Luc Gaudiot
- Published
- 2022
- Full Text
- View/download PDF
12. A profile-based AI-assisted dynamic scheduling approach for heterogeneous architectures
- Author
-
Stéphane Zuckerman, Alfredo Goldman, Guang R. Gao, Tongsheng Geng, Jean-Luc Gaudiot, and Marcos Amaris
- Subjects
Computer science, Distributed computing, Dynamic priority scheduling, Supercomputer, Scheduling, Application profile, Stencil, Theoretical Computer Science, Runtime system, Theory of computation, Software, Information Systems
- Abstract
While heterogeneous architectures are increasingly popular in High Performance Computing systems, their effectiveness depends on how efficiently the scheduler allocates workloads onto appropriate computing devices and how well communication and computation can be overlapped. With different types of resources integrated into one system, the complexity of the scheduler increases correspondingly. Moreover, for applications with varying problem sizes on different heterogeneous resources, the optimal scheduling approach may vary accordingly. Thus, we introduce a Profile-based AI-assisted Dynamic Scheduling approach (PDAWL) to dynamically and adaptively adjust workloads and efficiently utilize heterogeneous resources. It combines online scheduling, application profile information, hardware mathematical modeling, and offline machine learning estimation modeling to implement automatic application- and device-specific scheduling for heterogeneous architectures. The hardware mathematical model provides coarse-grain computing resource selection, the profile information and offline machine learning model estimate the performance of fine-grain workloads, and the online scheduling approach dynamically and adaptively distributes the workload. Our scheduling approach is tested on control-regular applications, 2D and 3D Stencil kernels (based on a Jacobi algorithm), and a data-irregular application, Sparse Matrix-Vector Multiplication, in an event-driven runtime system. Experimental results show that PDAWL either matches or far outperforms whichever of CPU-only or GPU-only execution yields the best results.
- Published
- 2022
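A hedged sketch of the general idea behind the profile-guided CPU/GPU splitting described in entry 12 above: simple per-device time models are fitted offline from profiling data, and the split fraction is chosen to balance the two completion times. The linear models and coefficients below are invented for illustration; PDAWL's actual models and online adjustment are more elaborate.

```python
# Sketch only: fit per-device time models from (assumed) offline profiles,
# then pick the CPU/GPU split that minimizes the parallel completion time.
import numpy as np

# Offline profiles: (work fraction, measured seconds) per device (hypothetical).
cpu_profile = [(0.2, 1.1), (0.5, 2.6), (0.8, 4.2), (1.0, 5.3)]
gpu_profile = [(0.2, 0.6), (0.5, 0.9), (0.8, 1.3), (1.0, 1.6)]

def fit_linear(profile):
    x, y = zip(*profile)
    slope, intercept = np.polyfit(x, y, 1)   # time ≈ slope * fraction + intercept
    return lambda frac: slope * frac + intercept

cpu_time, gpu_time = fit_linear(cpu_profile), fit_linear(gpu_profile)

def best_split(steps=100):
    """CPU work fraction that minimizes max(CPU time, GPU time)."""
    candidates = np.linspace(0.0, 1.0, steps + 1)
    return min(candidates, key=lambda f: max(cpu_time(f), gpu_time(1.0 - f)))

frac = best_split()
print(f"give {frac:.2f} of the work to the CPU, {1 - frac:.2f} to the GPU")
```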
13. Challenges in Detecting an 'Evasive Spectre'
- Author
-
Congmiao Li and Jean-Luc Gaudiot
- Subjects
Exploit, Computer science, Bandwidth, Computer security, Microarchitecture, Conditional execution, Hardware and Architecture, Malware
- Abstract
Spectre attacks exploit serious vulnerabilities in modern CPU design to extract sensitive data through side channels. Completely fixing the problem would require a redesign of the architecture for conditional execution, which cannot be backported. Researchers have proposed to detect Spectre with promising accuracy by monitoring deviations in microarchitectural events using existing hardware performance counters. However, the attacker may attempt to evade detection by reshaping the microarchitectural profile of Spectre so as to mimic benign programs. This letter thus identifies the challenges in detecting “Evasive Spectre” attacks by showing that the detection accuracy drops significantly when the attacker inserts carefully chosen instructions in the middle of an attack, or periodically puts the attack to sleep at a frequency higher than the victim's sampling rate while operating the attack at a lower bandwidth, yet with a reasonable success rate.
- Published
- 2020
- Full Text
- View/download PDF
14. π-Hub: Large-scale video learning, storage, and retrieval on heterogeneous hardware platforms
- Author
-
Shaoshan Liu, Jie Cao, Jean-Luc Gaudiot, Dawei Sun, Bolin Ding, Weisong Shi, and Jie Tang
- Subjects
Computer Networks and Communications, Computer science, Symmetric multiprocessor system, Cloud computing, Hardware and Architecture, Software deployment, Server, Scalability, Mobile device, Intelligent transportation system, Software, Computer hardware
- Abstract
The burgeoning of Internet of Things (IoT) and camera-equipped mobile devices generates a tremendous amount of video data at the edge of the network. At the same time, we have witnessed the fast deployment of many video-based application services, such as plate recognition for public safety, intelligent transportation, Industry 4.0, and so on. The success of these services, in turn, requires that large-scale video data be learned, stored, and retrieved more efficiently. A generic software and hardware framework for large-scale IoT video analysis and service support is still missing. To address this challenge, we present π-Hub, PerceptIn’s robotic cloud solution which supports large-scale video data analysis, storage, and query by implementing the learn-store-retrieve paradigm. Interestingly, we found that the learning, storage, and retrieval services each stress a different type of resource on heterogeneous computing servers (GPU, CPU, and memory, respectively); therefore, it is extremely cost-efficient to co-locate these services to fully utilize the resources. In addition, several optimization techniques for data writing, reading, and data reduction are proposed and evaluated. The evaluation results show that these techniques significantly improve the performance of the learning, storage, and retrieval services and notably reduce the cost of the system. We also verify π-Hub’s scalability by reliably running a 1000-machine deployment to support up to one million users. Finally, we conclude the paper by discussing several lessons learned from this study and future work.
- Published
- 2020
- Full Text
- View/download PDF
15. Panel Discussion: Career Pointers from Computer Society Leadership
- Author
-
Thomas M. Conte, Hironori Kasahara, Dejan Milojicic, Jean-Luc Gaudiot, and Roger U. Fujii
- Subjects
Sociology, Public relations, Computer society, Panel discussion
- Published
- 2021
- Full Text
- View/download PDF
16. Streaming Data Priority Scheduling Framework for Autonomous Driving by Edge
- Author
-
Shaoshan Liu, Jean-Luc Gaudiot, Lingbing Yao, Hang Zhao, and Jie Tang
- Subjects
Computer science, Data stream mining, Distributed computing, Cloud computing, Differentiated service, Scheduling
- Abstract
In recent years, intelligent vehicles such as autonomous vehicles have been generating huge amounts of sensing data continuously. The computations on those data streams are far beyond the processing capacity of on-board computing. To handle streaming data processing in real time, deploying a streaming data processing system at the edge has become the first choice in terms of performance. However, existing frameworks cannot satisfy the complicated demands of autonomous driving tasks and lack support for task priority scheduling. In this paper, we propose a streaming data priority scheduling framework for autonomous driving at the edge, built on Spark Streaming and implemented on Spark 2.3.0. The proposed framework identifies the priorities of different data processing tasks and implements task scheduling based on non-preemptive priority queuing theory. To meet differentiated service-level requirements, the proposed non-preemptive priority queuing scheduling mechanism considers the priority category of tasks, the distance between vehicles and edge nodes, and the priority weight of vehicles. Experiments show that this mechanism can effectively identify the priority information of different tasks from different vehicles and reduce the end-to-end latency of high-priority tasks by up to 46% compared to low-priority tasks.
- Published
- 2021
- Full Text
- View/download PDF
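A minimal sketch of a non-preemptive priority queue in the spirit of the framework in entry 16 above: a running task is never interrupted, and the next task is chosen by a priority score. The scoring weights combining task class, vehicle-to-edge distance, and vehicle weight are invented for illustration; the paper's queuing model is more detailed.

```python
# Sketch only: non-preemptive priority scheduling with heapq. The priority
# score and its weights are hypothetical.
import heapq
import itertools

class PriorityScheduler:
    def __init__(self):
        self._heap, self._counter = [], itertools.count()

    def submit(self, task, task_class, distance_km, vehicle_weight):
        # Lower score = served earlier; FIFO among equal scores via counter.
        score = 10 * task_class + distance_km - vehicle_weight
        heapq.heappush(self._heap, (score, next(self._counter), task))

    def run_all(self):
        while self._heap:
            _, _, task = heapq.heappop(self._heap)
            task()          # non-preemptive: the task runs to completion

sched = PriorityScheduler()
sched.submit(lambda: print("obstacle detection"), task_class=0, distance_km=0.2, vehicle_weight=2)
sched.submit(lambda: print("log upload"), task_class=3, distance_km=1.5, vehicle_weight=1)
sched.run_all()   # the high-priority (class 0) task is served first
```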
17. COMPSAC 2021 President’s Panel
- Author
-
Leila de Floriani, Hironori Kasahara, Forrest Shull, Jean-Luc Gaudiot, Steve Diamond, and Cecilia Metra
- Published
- 2021
- Full Text
- View/download PDF
18. Computer Education in the Age of COVID-19
- Author
-
Jean-Luc Gaudiot and Hironori Kasahara
- Subjects
Coronavirus disease 2019 (COVID-19), General Computer Science, Computer science, The Internet, Public relations, Computer education, Computer-aided instruction
- Abstract
COVID-19 has been devastating across the globe, forcing profound changes in most human interactions. Through an informal survey of numerous educators worldwide, we explore some of the disease's effects on the education community and how the online delivery of educational materials can meet these challenges.
- Published
- 2020
- Full Text
- View/download PDF
19. A Unified Cloud Platform for Autonomous Driving
- Author
-
Jie Tang, Jean-Luc Gaudiot, Quan Wang, Chao Wang, and Shaoshan Liu
- Subjects
General Computer Science, Computer science, Distributed computing, Cloud computing, Distributed data store, Efficient energy use
- Abstract
Tailoring cloud support for each autonomous-driving application would require maintaining multiple infrastructures, potentially resulting in low resource utilization, low performance, and high management overhead. To address this problem, the authors present a unified cloud infrastructure with Spark for distributed computing, Alluxio for distributed storage, and OpenCL to exploit heterogeneous computing resources for enhanced performance and energy efficiency.
- Published
- 2017
- Full Text
- View/download PDF
20. An effective pre-store/pre-load method exploiting intra-request idle time of NAND flash-based storage devices
- Author
-
Jin-Young Kim, Eui-Young Chung, Hyeokjun Seo, Sungroh Yoon, Tae Hee You, and Jean-Luc Gaudiot
- Subjects
Computer Networks and Communications, Computer science, NAND flash memory, Write buffer, Disk buffer, Artificial Intelligence, Hardware and Architecture, Operating system, Page cache, Cache, Latency, Cache algorithms, Software, DRAM
- Abstract
NAND flash-based storage devices (NFSDs) are widely employed owing to their superior characteristics when compared to hard disk drives. However, NAND flash memory (NFM) still exhibits drawbacks, such as a limited lifetime and an erase-before-write requirement. Along with effective software management, the implementation of a cache buffer is one of the most common solutions to overcome these limitations. However, the read/write performance becomes saturated primarily because the eviction overhead caused by limited DRAM capacity significantly impacts overall NFSD performance. This paper therefore proposes a method that hides the eviction overhead and overcomes the saturation of the read/write performance. The proposed method exploits the new intra-request idle time (IRIT) in NFSD and employs a new data management scheme. In addition, the new pre-store eviction scheme stores dirty page data in the cache to NFMs in advance. This reduces the eviction overhead by maintaining a sufficient number of clean pages in the cache. Further, the new pre-load insertion scheme improves the read performance by frequently loading data that needs to be read into the cache in advance. Unlike previous methods with large migration overhead, our scheme does not cause any eviction/insertion overhead because it actually exploits the IRIT to its advantage. We verified the effectiveness of our method, by integrating it into two cache management strategies which were then compared. Our proposed method reduced read latency by 43% in read-intensive traces, reduced write latency by 40% in write-intensive traces, and reduced read/write latency by 21% and 20%, respectively, on average compared to NFSD with a conventional write cache buffer.
- Published
- 2017
- Full Text
- View/download PDF
21. Prediction and Routing
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2020
- Full Text
- View/download PDF
22. Creating Autonomous Vehicle Systems
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2020
- Full Text
- View/download PDF
23. Introduction to Autonomous Driving
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2020
- Full Text
- View/download PDF
24. Client Systems for Autonomous Driving
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2020
- Full Text
- View/download PDF
25. PerceptIn’s Autonomous Vehicles Lite
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2020
- Full Text
- View/download PDF
26. Autonomous Last-mile Delivery Vehicles in Complex Traffic Environments
- Author
-
Qi Kong, Bai Li, Shaoshan Liu, Jean-Luc Gaudiot, Liangliang Zhang, and Jie Tang
- Subjects
Computer and information sciences, Road traffic control, General Computer Science, Computer science, Mobile robot, Transport engineering, Computer Science - Robotics, Last mile
- Abstract
E-commerce has evolved with the digital technology revolution over the years. Last-mile logistics services contribute a significant part of the e-commerce experience. In contrast to traditional last-mile logistics services, smart logistics service with autonomous driving technologies provides a promising solution to reduce the delivery cost and to improve efficiency. However, the traffic conditions in complex traffic environments, such as those in China, are more challenging compared to those in well-developed countries. Many types of moving objects (pedestrians, bicycles, electric bicycles, motorcycles, etc.) share the road with autonomous vehicles, and their behaviors are not easy to track and predict. This paper introduces a technical solution from JD.com, a leading e-commerce company in China, for autonomous last-mile delivery in complex traffic environments. Concretely, the methodologies in each module of our autonomous vehicles are presented, together with safety guarantee strategies. Up to this point, JD.com has deployed more than 300 self-driving vehicles for trial operations in tens of provinces of China, with an accumulated 715,819 miles and up to millions of on-road testing hours. (Comment: 6 pages, 6 figures, submitted to IEEE Computer.)
- Published
- 2020
- Full Text
- View/download PDF
27. Cloud Platform for Autonomous Driving
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2020
- Full Text
- View/download PDF
28. Autonomous Vehicle Localization
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2020
- Full Text
- View/download PDF
29. Data Shepherding: A Last Level Cache Design for Large Scale Chips
- Author
-
Jean-Luc Gaudiot and Ganghee Jang
- Subjects
Computer science, CPU cache, Parallel computing, Memory management, Bandwidth (computing), Cache, General-purpose computing on graphics processing units, Resource management (computing)
- Abstract
Newer chips include cache memories as large as 128 MB to sustain the bandwidth for the GPGPU module. Since 128 MB was a reasonable main memory size a decade ago, we examine the design impact of a larger granularity in the management of caches. We thus propose a cache memory design called the Data Shepherding Cache for large last-level caches. Even with a management granularity as large as a page for the last-level cache, our Data Shepherding Cache can achieve reasonable performance with a smaller area footprint than a same-sized set-associative cache.
- Published
- 2019
- Full Text
- View/download PDF
30. Detecting Malicious Attacks Exploiting Hardware Vulnerabilities Using Performance Counters
- Author
-
Congmiao Li and Jean-Luc Gaudiot
- Subjects
Computer performance, Exploit, Computer science, Microarchitecture, Software, Overhead (computing), Cache, Branch misprediction, Computer hardware, DRAM
- Abstract
Over the past decades, the major objectives of computer design have been to improve performance and to reduce cost, energy consumption, and size, while security has remained a secondary concern. Meanwhile, malicious attacks have rapidly grown as the number of Internet-connected devices, ranging from personal smart embedded systems to large cloud servers, has been increasing. Traditional antivirus software cannot keep up with the increasing incidence of these attacks, especially for exploits targeting hardware design vulnerabilities. For example, as DRAM process technology scales down, it becomes easier for DRAM cells to electrically interact with each other; in Rowhammer attacks, it is possible to corrupt data in nearby rows by repeatedly reading the same row in DRAM. As Rowhammer exploits a computer hardware weakness, no software patch can completely fix the problem. Similarly, there is no efficient software mitigation for the recently reported Spectre attack, which exploits microarchitectural design vulnerabilities to leak protected data through side channels. In general, completely fixing hardware-level vulnerabilities would require a redesign of the hardware which cannot be backported. In this paper, we demonstrate that by monitoring deviations in microarchitectural events such as cache misses and branch mispredictions from existing CPU performance counters, hardware-level attacks such as Rowhammer and Spectre can be efficiently detected during runtime with promising accuracy and reasonable performance overhead using various machine learning classifiers.
- Published
- 2019
- Full Text
- View/download PDF
31. Enabling Deep Learning on IoT Devices
- Author
-
Shaoshan Liu, Dawei Sun, Jie Tang, and Jean-Luc Gaudiot
- Subjects
General Computer Science, Multimedia, Computer science, Deep learning, Cloud computing, World Wide Web, Artificial intelligence, Internet of Things
- Abstract
Deep learning can enable Internet of Things (IoT) devices to interpret unstructured multimedia data and intelligently react to both user and environmental events but has demanding performance and power requirements. The authors explore two ways to successfully integrate deep learning with low-power IoT products.
- Published
- 2017
- Full Text
- View/download PDF
32. Computer Architectures for Autonomous Driving
- Author
-
Zhe Zhang, Shaoshan Liu, Jie Tang, and Jean-Luc Gaudiot
- Subjects
General Computer Science, Computer science, Workload, Embedded system, Global Positioning System, Process control, Architecture
- Abstract
To enable autonomous driving, a computing stack must simultaneously ensure high performance, consume minimal power, and have low thermal dissipation—all at an acceptable cost. An architecture that matches workload to computing units and implements task time-sharing can meet these requirements.
- Published
- 2017
- Full Text
- View/download PDF
33. Extending Amdahl’s Law for Heterogeneous Multicore Processor with Consideration of the Overhead of Data Preparation
- Author
-
Myoung-Seo Kim, Jean-Luc Gaudiot, and Songwen Pei
- Subjects
Heterogeneous System Architecture, Multi-core processor, Amdahl's law, General Computer Science, Computer science, Parallel computing, Data preparation, Control and Systems Engineering, Multicore systems, Overhead (computing)
- Abstract
We extend Amdahl’s law by considering the overhead of data preparation (ODP) for multicore systems, and apply it to three “traditional” multicore system scenarios (homogeneous symmetric multicore, asymmetric multicore, and dynamic multicore) and two new scenarios (heterogeneous CPU-GPU multicore and dynamic CPU-GPU multicore). It demonstrates that potential innovations in heterogeneous system architecture are indispensable to decrease ODP.
- Published
- 2016
- Full Text
- View/download PDF
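To make the "overhead of data preparation" idea in entry 33 above concrete, here is one simple way such a term can be folded into Amdahl's law. This is a sketch under assumed notation, not the paper's exact formulation for its homogeneous, asymmetric, dynamic, and CPU-GPU scenarios.

```latex
% Sketch: Amdahl's law with a data-preparation term (assumed notation).
% f: serial fraction, n: number of cores, d: data-preparation overhead
% expressed as a fraction of the original single-core execution time.
\[
  S(n) \;=\; \frac{1}{\, f \;+\; \dfrac{1 - f}{n} \;+\; d \,}
\]
% As n grows, the speedup saturates at 1/(f + d), so reducing the overhead
% of data preparation (e.g., CPU-GPU transfers) matters as much as adding
% cores -- the motivation for heterogeneous system architecture innovations.
```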
34. Detecting Spectre Attacks Using Hardware Performance Counters
- Author
-
Congmiao Li and Jean-Luc Gaudiot
- Subjects
Exploit, Computer science, Detector, Evasion (network security), Workload, Theoretical Computer Science, Software, Computational Theory and Mathematics, Hardware and Architecture, Overhead (computing), Computer hardware
- Abstract
Spectre attacks can be catastrophic and widespread because they exploit common design flaws caused by the speculative capabilities in modern processors to leak sensitive data through side channels. Completely fixing the problem would require a redesign of the architecture for transient execution or the implementation of a new design on re-configurable hardware. However, such fixes cannot be backported to old machines with fixed hardware design. Completely replacing those machines will take a long time. Moreover, existing software patches may cause significant performance overhead. This paper proposes to detect Spectre by monitoring deviations in microarchitectural events using hardware performance counters with promising accuracy above 90% under a variety of workload conditions. However, the attacker may attempt to evade detection by slowing down the attack or mimicking benign programs. This paper thus compares different evasion strategies quantitatively and demonstrates that it is possible for the attacker to avoid detection when operating the attacks at a lower speed while maintaining a reasonable attack success rate. Then, we show that, in order to resist evasion, the original detector must be enhanced by randomly switching between a set of detectors using different features and sampling periods so we can keep the detection accuracy above 80%.
- Published
- 2021
- Full Text
- View/download PDF
35. Embracing Changes
- Author
-
Jean-Luc Gaudiot
- Subjects
General Computer Science
- Published
- 2017
- Full Text
- View/download PDF
36. Creating Autonomous Vehicle Systems, Second Edition
- Author
-
Liyun Li, Shuang Wu, Jean-Luc Gaudiot, Jie Tang, and Shaoshan Liu
- Subjects
General Computer Science, Multimedia, Computer science, Perception
- Abstract
This book is one of the first technical overviews of autonomous vehicles written for a general computing and engineering audience. The authors share their practical experiences designing a...
- Published
- 2020
- Full Text
- View/download PDF
37. Online Detection of Spectre Attacks Using Microarchitectural Traces from Performance Counters
- Author
-
Congmiao Li and Jean-Luc Gaudiot
- Subjects
Computer performance, Computer science, Speculative execution, Software, Embedded system, Overhead (computing), Cache, Branch misprediction, Data cache
- Abstract
To improve processor performance, computer architects have adopted such acceleration techniques as speculative execution and caching. However, researchers have recently discovered that this approach implies inherent security flaws, as exploited by Meltdown and Spectre. Attacks targeting these vulnerabilities can leak protected data through side channels such as data cache timing by exploiting mis-speculated executions. The flaws can be catastrophic because they are fundamental and widespread and they affect many modern processors. Mitigating the effect of Meltdown is relatively straightforward in that it entails a software-based fix which has already been deployed by major OS vendors. However, to this day, there is no effective mitigation for Spectre. Fixing the problem may require a redesign of the architecture for conditional execution in future processors. In addition, a Spectre attack is hard to detect using traditional software-based antivirus techniques because it does not leave traces in traditional log files. In this paper, we propose to monitor microarchitectural events such as cache misses and branch mispredictions from existing CPU performance counters to detect Spectre at runtime. Our detector was able to achieve 0% false negatives with less than 1% false positives using various machine learning classifiers with a reasonable performance overhead.
- Published
- 2018
- Full Text
- View/download PDF
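A hedged sketch of the detection idea in entry 37 above: a supervised classifier is trained on labeled hardware-performance-counter samples (benign vs. attack) and then classifies new samples at run time. The data and feature names are synthetic placeholders, and RandomForest is just one of the "various machine learning classifiers" a detector might use; none of these specifics are taken from the paper.

```python
# Sketch only: supervised classification of HPC samples with synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
benign = rng.normal([0.02, 0.01], [0.005, 0.003], size=(400, 2))  # [cache-miss rate, branch-miss rate]
attack = rng.normal([0.12, 0.05], [0.02, 0.01], size=(400, 2))    # Spectre-like profile (synthetic)

X = np.vstack([benign, attack])
y = np.array([0] * len(benign) + [1] * len(attack))
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
print("sample flagged as attack:", bool(clf.predict([[0.11, 0.04]])[0]))
```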
38. Network Variation and Fault Tolerant Performance Acceleration in Mobile Devices with Simultaneous Remote Execution
- Author
-
Benjamin Y. Cho, Won Woo Ro, Keunsoo Kim, and Jean-Luc Gaudiot
- Subjects
Computer science, End user, Distributed computing, Fault tolerance, Energy consumption, Theoretical Computer Science, Computational Theory and Mathematics, Hardware and Architecture, Server, Computation offloading, Wireless, Mobile device, Software, Computer network
- Abstract
As mobile applications provide increasingly richer features to end users, it has become imperative to overcome the constraints of a resource-limited mobile hardware. Remote execution is one promising technique to resolve this important problem. Using this technique, the computation intensive part of the workload is migrated to resource-rich servers, and then once the computation is completed, the results can be returned to the client devices. To enable this operation, strong wireless connectivity is required. However, unstable wireless connections are the staple of real-life. This makes performance unpredictable, sometimes offsetting the benefits brought by this technique and leading to performance degradation. To address this problem, in this paper, we present a Simultaneous Remote Execution (SRE) model for mobile devices. Our SRE model performs concurrent executions both locally and remotely. Therefore, the worst-case execution time on fluctuating network condition is significantly reduced. In addition, SRE provides inherent tolerance for abrupt network failure. We designed and implemented an SRE-based offloading system consisting of a real smartphone and a remote server connected via 3G and Wifi networks. The experimental results under various real-life network variation scenarios show that SRE outperforms the alternative schemes in highly fluctuating network environments.
- Published
- 2015
- Full Text
- View/download PDF
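A minimal sketch of the Simultaneous Remote Execution idea from entry 38 above: the same task is launched locally and remotely, the first result to arrive wins, and a failed network call is tolerated transparently. The remote call here is a placeholder with a simulated delay; a real system would offload over 3G/WiFi to a server.

```python
# Sketch only: race local and remote execution, take the first success.
import concurrent.futures
import time

def run_locally(x):
    time.sleep(0.5)            # slow, but always available
    return ("local", x * x)

def run_remotely(x):
    time.sleep(0.1)            # fast when the network cooperates
    # raise OSError("network down")  # uncomment to simulate an abrupt failure
    return ("remote", x * x)

def simultaneous_execution(x):
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(run_locally, x), pool.submit(run_remotely, x)]
        for fut in concurrent.futures.as_completed(futures):
            try:
                return fut.result()      # first successful result wins
            except OSError:
                continue                 # fall back to the other execution
    raise RuntimeError("both executions failed")

print(simultaneous_execution(7))         # typically ('remote', 49)
```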
39. Guest Editorial: SBAC-PAD 2013
- Author
-
José Nelson Amaral, Derek Chiou, Chita R. Das, Manish Parashar, Jean-Luc Gaudiot, and Guido Araujo
- Subjects
Graduate students, Computer science, Library science, Technical committee, Computer society, Software, Information Systems, Theoretical Computer Science
- Abstract
Welcome to this special issue, a showcase of extended versions of some of the most notable papers presented at SBAC-PAD 2013 in Porto de Galinhas, Brazil, in October 2013. SBAC-PAD is an international annual conference, started in 1987, which has continuously presented an overview of new developments, applications, and trends in parallel and distributed computing technologies. SBAC-PAD is open to faculty members, researchers, practitioners, and graduate students around the world. Last year, it was promoted by the Brazilian Computer Society and organized in cooperation with the IEEE Computer Society Technical Committee on Computer Architecture.
- Published
- 2015
- Full Text
- View/download PDF
40. A Performance-Energy Model to Evaluate Single Thread Execution Acceleration
- Author
-
Won Seob Jeong, Seung Hun Kim, Won Woo Ro, Dohoon Kim, Changmin Lee, and Jean-Luc Gaudiot
- Subjects
Power model, Multi-core processor, Power demand, Hardware and Architecture, Computer science, Operating frequency, Parallel computing, Thread (computing), Efficient energy use
- Abstract
It is well known that the cost of executing the sequential portion of a program will limit and sometimes even eclipse the gains brought by processing in parallel the rest of the program. This means that serious consideration should be brought to bear on accelerating the execution of this unavoidable sequential part. Such acceleration can be done by boosting the operating frequency in a symmetric multicore processor. In this paper, we derive a performance and power model to describe the implications of this approach. From our model, we show that the ratio of performance over energy during the sequential part improves with an increase in the number of cores. In addition, we demonstrate how to determine with the proposed model the optimal frequency boosting ratio which maximizes energy efficiency.
- Published
- 2015
- Full Text
- View/download PDF
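A simplified sketch in the spirit of the performance-energy model in entry 40 above; the functional forms below are assumptions for illustration, and the paper's actual derivation and constants differ.

```latex
% Assumed notation: f is the serial fraction, n the number of cores, and b
% the frequency boosting ratio applied to the core running the serial part.
% Normalized execution time:
\[
  T(n, b) \;=\; \frac{f}{b} \;+\; \frac{1 - f}{n}
\]
% If dynamic power grows roughly as the cube of frequency, the serial phase
% consumes energy proportional to b^{3} \cdot \frac{f}{b} = f\, b^{2}.
% The optimal boosting ratio balances this extra energy against the time it
% saves; this is the kind of trade-off the proposed model captures when
% determining the frequency boosting ratio that maximizes energy efficiency.
```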
41. Kasahara Voted 2017 Computer Society President-Elect
- Author
-
Hironori Kasahara and Jean-Luc Gaudiot
- Subjects
General Computer Science, Political science, Computer society, Management
- Published
- 2016
- Full Text
- View/download PDF
42. Engineering the New Boundaries of AI
- Author
-
Amir Banifatemi and Jean-Luc Gaudiot
- Subjects
Ubiquitous computing, General Computer Science, Computer science, Intelligent decision support system, Engineering ethics, Applications of artificial intelligence, Software engineering, Artificial intelligence
- Abstract
Recognizing the increasingly critical role that AI plays in all aspects of modern society, the XPRIZE Foundation launched the IBM Watson AI XPRIZE. Interdisciplinary teams will advance current AI technologies as they compete for the grand prize.
- Published
- 2016
- Full Text
- View/download PDF
43. Decision, Planning, and Control
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2018
- Full Text
- View/download PDF
44. Deep Learning in Autonomous Driving Perception
- Author
-
Shaoshan Liu, Liyun Li, Jie Tang, Shuang Wu, and Jean-Luc Gaudiot
- Published
- 2018
- Full Text
- View/download PDF
45. Return of experience on the mean-shift clustering for heterogeneous architecture use case
- Author
-
Mustapha Lebbah, Fouste Yuehgoh, Jean-Luc Gaudiot, and Christophe Cérin
- Subjects
Software, Memory management, Computer science, Distributed computing, Big data, Hardware acceleration, Algorithm design, Architecture, Field-programmable gate array
- Abstract
The exponential increase in data size poses new challenges for computer scientists, giving rise to a new set of methodologies under the term Big Data. Many efficient algorithms for machine learning have been proposed to address time and memory requirements. Nevertheless, with hardware acceleration, multiple software instructions can be integrated and executed on a single hardware die. Current research aims at eliminating the burden on the user of using multiple processor types. In this paper we report our return of experience on a new way of implementing machine learning algorithms on heterogeneous hardware. To explore our vision, we use a parallel Mean-shift algorithm, developed at LIPN, as our case study to investigate issues in building efficient Machine Learning libraries for heterogeneous systems. The ultimate goal is to provide a core set of building blocks for Machine Learning programming that could serve either to build new applications on heterogeneous architectures or to control the evolution of the underlying platform. We thus examine the difficulties encountered during the implementation of the algorithm with the aim of discovering methodologies for building systems based on heterogeneous hardware. We also identify issues and building blocks for solving concrete machine learning (ML) problems on the Chisel software stack we use for this purpose.
- Published
- 2017
- Full Text
- View/download PDF
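A minimal mean-shift sketch with a flat kernel, using NumPy only. This is not the parallel LIPN implementation discussed in entry 45 above; it just shows the building block that the paper ports to heterogeneous hardware: repeatedly shifting each point to the mean of its neighbours within a bandwidth until convergence.

```python
# Sketch only: flat-kernel mean-shift; bandwidth and data are synthetic.
import numpy as np

def mean_shift(points, bandwidth=1.0, iters=50, tol=1e-4):
    shifted = points.copy()
    for _ in range(iters):
        moved = 0.0
        for i, p in enumerate(shifted):
            dists = np.linalg.norm(points - p, axis=1)
            neighbours = points[dists <= bandwidth]
            new_p = neighbours.mean(axis=0)
            moved = max(moved, np.linalg.norm(new_p - p))
            shifted[i] = new_p
        if moved < tol:
            break
    return shifted   # points collapse onto their cluster modes

data = np.vstack([np.random.randn(50, 2) * 0.3 + [0, 0],
                  np.random.randn(50, 2) * 0.3 + [5, 5]])
modes = mean_shift(data, bandwidth=1.5)
print(np.unique(np.round(modes), axis=0))   # roughly two cluster centres
```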
46. Accelerating Lattice Quantum Chromodynamics Simulations with Value Prediction
- Author
-
Christine Eisenbeis, Chen Liu, Shaoshan Liu, Jie Tang, and Jean-Luc Gaudiot
- Subjects
Instruction set, Floating point, Software, Computer engineering, Computer science, Computation, Big data, Latency, Quantum, Bottleneck
- Abstract
Communication latency problems are universal and have become a major performance bottleneck as we scale in big data infrastructure and many-core architectures. Specifically, research institutes around the world have built specialized supercomputers with powerful computation units in order to accelerate scientific computation. However, the problem often comes from the communication side instead of the computation side. In this paper we first demonstrate the severity of communication latency problems. Then we use Lattice Quantum Chromodynamics (LQCD) simulations as a case study to show how value prediction techniques can reduce the communication overheads, thus leading to higher performance without adding more expensive hardware. In detail, we first implement a software value predictor on LQCD simulations: our results indicate that 22.15% of the predictions result in performance gain and only 2.65% of the predictions lead to rollbacks. Next we explore the hardware value predictor design, which results in a 20-fold reduction of the prediction latency. In addition, based on the observation that the full range of floating point accuracy may not always be needed, we propose and implement an initial design of the tolerance value predictor: as the tolerance range increases, the prediction accuracy also increases dramatically.
- Published
- 2017
- Full Text
- View/download PDF
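A toy sketch of a tolerance-based value predictor, illustrating the observation in entry 46 above that widening the tolerance raises prediction accuracy. Predictions here come from a simple last-value policy on a synthetic value stream; the paper's software and hardware predictor designs are more sophisticated.

```python
# Sketch only: last-value prediction accepted within a relative tolerance.
def prediction_accuracy(stream, tolerance):
    last, hits, total = None, 0, 0
    for value in stream:
        if last is not None:
            total += 1
            if abs(value - last) <= tolerance * abs(last):
                hits += 1          # prediction accepted, communication avoided
        last = value
    return hits / total

# Slowly drifting values, as might appear between simulation iterations (synthetic).
values = [1.000 + 0.002 * i for i in range(200)]
for tol in (1e-4, 1e-3, 1e-2):
    print(f"tolerance {tol:>6}: accuracy {prediction_accuracy(values, tol):.2f}")
```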
47. A Runtime Workload Distribution with Resource Allocation for CPU-GPU Heterogeneous Systems
- Author
-
Shouq Alsubaihi and Jean-Luc Gaudiot
- Subjects
Multi-core processor, Speedup, Computer science, Distributed computing, Workload, Parallel computing, Energy consumption, Thread (computing), Execution time, Idle, Resource allocation, Central processing unit
- Abstract
Nowadays, Graphics Processing Units (GPUs) have become popular as general-purpose processors; they have been used as co-processors with CPUs forming heterogeneous systems. CPUs and GPUs have different execution capabilities, energy consumption and thermal characteristics. Typically, the role of the GPU is to execute the parallel parts of the job and the role of the CPU (i.e., host) is to execute the sequential parts and manage the CPU-GPU data transfer. The host remains idle, waiting for the GPU's execution and data transfer to complete. This classic workload distribution does not fully utilize the CPU and the GPU. Thus, there is a need for an adaptive workload distributor that fully exploits the potential of the CPU and the GPU. Allocating resources (i.e., core scaling, thread allocation) is also a challenge since different sets of resources exhibit different behaviors in terms of performance, energy consumption, peak power and peak CPU temperature. Several studies have been conducted on workload distribution with an eye on performance improvement. However, few of them consider both performance and energy consumption. We thus propose our novel Workload Distributor with a Resource Allocator (WDRA) which combines workload distribution, core scaling, and thread allocation into a multi-objective optimization problem. Since resource allocation is known to be an NP-hard problem, WDRA utilizes Particle Swarm Optimization (PSO). The goal is to find an efficient workload distribution in terms of both execution time and energy consumption, under peak power and peak CPU temperature constraints. To evaluate WDRA, experiments were conducted on an actual system equipped with a multicore CPU and a GPU. Compared to performance-based and other workload distributors, on average, WDRA can achieve up to a 1.47x speedup and energy savings of up to 82%. WDRA is a suitable runtime algorithm for distributing a job's workload since the algorithm only takes up to 1.7% of the job's execution time.
- Published
- 2017
- Full Text
- View/download PDF
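A tiny particle-swarm sketch for the kind of search WDRA (entry 47 above) performs: choosing a CPU work fraction that minimizes a combined time/energy objective. The objective function and its constants are invented for illustration; WDRA additionally searches core and thread counts and enforces peak-power and peak-temperature constraints.

```python
# Sketch only: 1-D PSO over the CPU work fraction with a hypothetical objective.
import random

def objective(cpu_frac):
    time_cost = max(5.0 * cpu_frac, 1.6 * (1.0 - cpu_frac))   # hypothetical device times
    energy_cost = 3.0 * cpu_frac + 6.0 * (1.0 - cpu_frac)     # hypothetical energies
    return time_cost + 0.5 * energy_cost

def pso(n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5):
    pos = [random.random() for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_p = pos[:]                                   # personal bests
    best_g = min(pos, key=objective)                  # global best
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            vel[i] = (w * vel[i] + c1 * r1 * (best_p[i] - pos[i])
                                 + c2 * r2 * (best_g - pos[i]))
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))
            if objective(pos[i]) < objective(best_p[i]):
                best_p[i] = pos[i]
        best_g = min(best_p, key=objective)
    return best_g

frac = pso()
print(f"chosen CPU fraction: {frac:.2f}, objective: {objective(frac):.2f}")
```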
48. C-Lock: Energy Efficient Synchronization for Embedded Multicore Systems
- Author
-
Eui-Young Chung, Jean-Luc Gaudiot, Won Woo Ro, Seung Hun Kim, Sang Hyong Lee, Minje Jun, and Byunghoon Lee
- Subjects
Multi-core processor, Record locking, Computer science, Transactional memory, Parallel computing, Synchronization, Lock (computer science), Theoretical Computer Science, Computational Theory and Mathematics, Hardware and Architecture, Data synchronization, Software
- Abstract
Data synchronization among multiple cores has been one of the critical issues which must be resolved in order to optimize the parallelism of multicore architectures. Data synchronization schemes can be classified as lock-based methods (“pessimistic”) and lock-free methods (“optimistic”). However, none of these methods consider the nature of embedded systems which have demanding and sometimes conflicting requirements not only for high performance, but also for low power consumption. As an answer to these problems, we propose C-Lock, an energy- and performance-efficient data synchronization method for multicore embedded systems. C-Lock achieves balanced energy- and performance-efficiency by combining the advantages of lock-based methods and transactional memory (TM) approaches; in C-Lock, the core is blocked only when true conflicts exist (advantage of TM), while avoiding roll-back operations which can cause huge overhead with regard to both performance and energy (this is an advantage of locks). Also, in order to save more energy, C-Lock disables the clocks of the cores which are blocked for the access to the shared data until the shared data become available. We compared our C-Lock approach against traditional locks and transactional memory systems and found that C-Lock can reduce the energy-delay product by up to 1.94 times and 13.78 times compared to the baseline and TM, respectively.
- Published
- 2014
- Full Text
- View/download PDF
49. How many cores do we need to run a parallel workload: A test drive of the Intel SCC platform?
- Author
-
Pollawat Thanarungroj, Chen Liu, and Jean-Luc Gaudiot
- Subjects
Computer Networks and Communications, Computer science, Processor design, Clock rate, Workload, Energy consumption, Single-chip Cloud Computer, Bottleneck, Theoretical Computer Science, Artificial Intelligence, Hardware and Architecture, Embedded system, Scalability, Operating system, Software, Efficient energy use
- Abstract
As semiconductor manufacturing technology continues to improve, it is possible to integrate more and more transistors onto a single processor. Many-core processor design has resulted in part from the search to utilize this enormous transistor real estate. The Single-Chip Cloud Computer (SCC) is an experimental many-core processor created by Intel Labs. In this paper we present a study in which we analyze this innovative many-core system by running several workloads with distinctive parallelism characteristics. We investigate the effect on system performance by monitoring specific hardware performance counters. Then, we experiment on varying different hardware configuration parameters such as number of cores, clock frequency and voltage levels. We execute the chosen workloads and collect the timing, power consumption and energy consumption information on such a many-core research platform. Thus, we can comprehensively analyze the behavior and scalability of the Intel SCC system with the introduced workload in terms of performance and energy consumption. Our results show that the profiled parallel workload execution has a communication bottleneck on the Intel SCC system. Moreover, our results indicate that we should carefully choose the number of cores to execute different workloads in order to yield a balance between execution performance and energy efficiency for different applications.
- Published
- 2014
- Full Text
- View/download PDF
50. Complexity-Effective Contention Management with Dynamic Backoff for Transactional Memory Systems
- Author
-
Won Woo Ro, Jean-Luc Gaudiot, Seung Hun Kim, and Dongmin Choi
- Subjects
Computer science, Distributed computing, Transactional memory, Thread (computing), Theoretical Computer Science, Concurrency control, Memory management, Computational Theory and Mathematics, Hardware and Architecture, Distributed algorithm, Benchmark (computing), Algorithm design, Rollback, Computer network
- Abstract
Reducing memory access conflicts is a crucial part of the design of Transactional Memory (TM) systems as the number of running threads increases and long-latency transactions gradually appear: without efficient contention management, there will be repeated aborts and wasteful rollback operations. In this paper, we present a dynamic backoff control algorithm developed for complexity-effective and distributed contention management in Hardware Transactional Memory (HTM) systems. Our approach aims at controlling the restarting intervals of aborted transactions, and can be easily applied to various TM systems. To this end, we have profiled the applications of the STAMP benchmark suite and have identified those “problem” transactions which repeatedly cause aborts in the applications with the attendant high contention rate. The proposed algorithm alleviates the impact of these repeated aborts by dynamically adjusting the initial exponent value of the traditional backoff approach. In addition, the proposed scheme decreases the number of wasted cycles down to 82% on average compared to the baseline TM system. Our design has been integrated into LogTM-SE, where we observed an average performance improvement of 18%.
- Published
- 2014
- Full Text
- View/download PDF
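A sketch of the dynamic-backoff idea from entry 50 above: an aborted transaction waits for a randomized exponential delay before restarting, and transactions observed to abort repeatedly start from a larger initial exponent. The policy and constants below are illustrative software only; the paper integrates its algorithm into the LogTM-SE hardware TM system.

```python
# Sketch only: randomized exponential backoff with a per-transaction
# dynamically adjusted initial exponent (constants are hypothetical).
import random

class DynamicBackoff:
    def __init__(self, base_ns=100, max_exponent=10):
        self.base_ns = base_ns
        self.max_exponent = max_exponent
        self.initial_exponent = {}     # per-transaction starting exponent

    def delay_ns(self, txn_id, abort_count):
        start = self.initial_exponent.get(txn_id, 0)
        exponent = min(start + abort_count, self.max_exponent)
        return random.randint(0, (1 << exponent) - 1) * self.base_ns

    def record_problem_txn(self, txn_id, bump=3):
        # Transactions that repeatedly cause aborts restart less aggressively.
        self.initial_exponent[txn_id] = min(
            self.initial_exponent.get(txn_id, 0) + bump, self.max_exponent)

backoff = DynamicBackoff()
backoff.record_problem_txn("txn-42")
print(backoff.delay_ns("txn-42", abort_count=2), "ns before retrying txn-42")
```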