2,164 results for "multicore"
Search Results
2. Core allocation to minimize total flow time in a multicore system in the presence of a processing time constraint.
- Author
-
Vesilo, Rein
- Subjects
-
*PROCESS capability, *USER experience, *COMMUNICATION infrastructure, *PRODUCTION scheduling, *DEADLINES, *LAGRANGE multiplier
- Abstract
Data centers are a vital and fundamental infrastructure component of the cloud. The requirement to execute a large number of demanding jobs places a premium on processing capacity. Parallelizing jobs to run on multiple cores reduces execution time. However, there is a decreasing marginal benefit to using more cores, with the speedup function quantifying the achievable gains. A critical performance metric is flow time. Previous results in the literature derived closed-form expressions for the optimal allocation of cores to minimize total flow time for a power-law speedup function if all jobs are present at time 0. However, this work did not place a constraint on the makespan. For many diverse applications, fast response times are essential, and latency targets are specified to avoid adverse impacts on user experience. This paper is the first to determine the optimal core allocations for a multicore system to minimize total flow time in the presence of a completion deadline (where all jobs have the same deadline). The allocation problem is formulated as a nonlinear optimization program that is solved using the Lagrange multiplier technique. Closed-form expressions are derived for the optimal core allocations, total flow time, and makespan, which can be fitted to a specified deadline by adjusting the value of a single Lagrange multiplier. Compared to the unconstrained problem, the shortest job first property for optimal allocation is maintained; however, a number of other properties require revising and other properties are only retained in a modified form (such as the scale-free and size-dependence properties). It is found that with a completion deadline the optimal solution may contain groups of simultaneous completions. In general, all possible patterns of single- and group-completion need to be considered, producing an exponential search space. 
However, the paper determines analytically that the optimal completion pattern consists of a sequence of single completions followed by a single group of simultaneous completions at the end, which reduces the search space dimension to being linear. The paper validates the Lagrange multiplier approach by verifying constraint qualifications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
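The abstract of entry 2 above reasons about total flow time and makespan under a power-law speedup function. The listing does not reproduce the paper's closed-form allocations, so the following is only a minimal sketch of the setting: it assumes a speedup of the form s(k) = k**p with 0 < p < 1, perfectly divisible cores, and all jobs present at time 0. The function and policy names (`simulate`, `equal_share`, `all_to_shortest`) are hypothetical, and neither policy is claimed to be the optimal allocation derived in the paper.

```python
def equal_share(remaining, cores):
    """Hypothetical policy: split the cores evenly among unfinished jobs."""
    return [cores / len(remaining)] * len(remaining)


def all_to_shortest(remaining, cores):
    """Hypothetical policy: give every core to the shortest unfinished job."""
    return [cores] + [0] * (len(remaining) - 1)


def simulate(sizes, cores, p, policy):
    """Return (total_flow_time, makespan) for jobs that all arrive at time 0.

    sizes  -- work per job
    cores  -- total number of cores (treated as divisible, an idealization)
    p      -- exponent of the assumed speedup function s(k) = k**p
    policy -- maps (sorted remaining sizes, cores) to a list of core shares
    """
    remaining = sorted(sizes)
    now = 0.0
    total_flow = 0.0
    while remaining:
        shares = policy(remaining, cores)
        rates = [k ** p for k in shares]  # processing rate of each job
        # Advance time to the next completion under the current allocation.
        dt = min(r / s for r, s in zip(remaining, rates) if s > 0)
        now += dt
        remaining = [r - s * dt for r, s in zip(remaining, rates)]
        finished = sum(1 for r in remaining if r <= 1e-9)
        total_flow += now * finished  # each finished job contributes its flow time
        remaining = [r for r in remaining if r > 1e-9]
    return total_flow, now


flow_sjf, span_sjf = simulate([1, 2, 4], cores=4, p=0.5, policy=all_to_shortest)
flow_eq, span_eq = simulate([1, 2, 4], cores=4, p=0.5, policy=equal_share)
```

With these example inputs, sharing the cores gives a lower total flow time and a shorter makespan than running the jobs one at a time on all cores, which illustrates the decreasing marginal benefit of adding cores that the abstract mentions.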
3. Enhanced OpenMP Algorithm to Compute All-Pairs Shortest Path on X86 Architectures
- Author
-
Calderón, Sergio, Rucci, Enzo, Chichizola, Franco, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Pesado, Patricia, editor, Panessi, Walter, editor, and Fernández, Juan Manuel, editor
- Published
- 2024
- Full Text
- View/download PDF
4. Design of Processors
- Author
-
Yan, Xiaolang, Meng, Jianyi, Chen, Zhijian, Wang, Yangyuan, editor, Chi, Min-Hwa, editor, Lou, Jesse Jen-Chung, editor, and Chen, Chun-Zhang, editor
- Published
- 2024
- Full Text
- View/download PDF
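5. The Concept of the Genetic Core in the Structural Organization and Evolution of Complex Electromechanical Systems [КОНЦЕПЦІЯ ГЕНЕТИЧНОГО ЯДРА В СТРУКТУРНІЙ ОРГАНІЗАЦІЇ І ЕВОЛЮЦІЇ СКЛАДНИХ ЕЛЕКТРОМЕХАНІЧНИХ СИСТЕМ].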
- Author
-
Шинкаренко, Василь
- Abstract
The interdisciplinary aspect in the study of the core concept in complex genetically organized systems of natural and anthropogenic origin is analyzed. The purpose of the article is scientific substantiation of the nature and essence of the genetic core of the electromechanical system. Based on the provisions of the theory of genetic evolution of electromechanical systems, the information essence of the genetic core as a carrier of genetic information of the primary source of the electromagnetic field is revealed. The analysis of the invariant relationships of the genetic nucleus with the structure of groups, subgroups and small periods, the generative system was carried out. An invariant connection between the core concept and the processes of evolutionary speciation and the main taxonomic categories of electromechanical systems has been established. Based on the results of the system-genetic analysis, a definition of the concept of the genetic core is proposed. The relationship between genetic and energy cores in the hierarchy of levels of complexity of electromechanical systems is studied. The principles of genetic structuring of complex systems of the multinuclear type are revealed. According to the results of the system-genetic analysis of the conception of the genetic core of electro-mechanical systems, its main properties are summarized. The importance of the obtained research results for the spread of genetic prediction technology and interdisciplinary synthesis to complex electromechanical complexes with nuclei of different physical nature is emphasized. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Reinforcement Learning-Based Cache Replacement Policies for Multicore Processors
- Author
-
Matheus A. Souza and Henrique C. Freitas
- Subjects
Cache replacement, coherence, multicore, reinforcement learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
High-performance computing (HPC) systems need to handle ever-increasing data sizes for fast processing and quick response times. However, modern processors’ caches are unable to handle massive amounts of data, leading to significant cache miss penalties that affect performance. In this context, selecting an effective cache replacement policy is crucial to improving HPC performance. Existing cache replacement policies fall short of Bélády’s optimal algorithm, and we propose a new approach that leverages the coherence state and sharers’ bit-vector of a cache block to make better decisions. We suggest a reinforcement learning-based strategy that learns from past eviction decisions and applies this knowledge to make better decisions in the future. Our approach uses a next-attempt method that combines the results from classic cache replacement algorithms with reinforcement learning. We evaluated our approach using the Sniper simulator and seven kernels from CAP Benchmarks. Our results show that our approach can significantly reduce the cache miss rate by 41.20% and 27.30% in L1 and L2 caches, respectively. In addition, our approach can improve the IPC by 27.33% in the best case and reduce energy consumption by 20.36% compared to an unmodified policy.
- Published
- 2024
- Full Text
- View/download PDF
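The abstract of entry 6 above measures replacement policies against Bélády's optimal algorithm. As a minimal sketch of that baseline gap only (the reinforcement learning policy itself is not described in enough detail here to reproduce), the following counts misses for LRU and for Bélády's algorithm on a single fully associative cache. The function names and the example trace are illustrative assumptions, not taken from the paper.

```python
from collections import OrderedDict


def lru_misses(trace, capacity):
    """Count misses for least-recently-used replacement."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # refresh recency on a hit
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return misses


def belady_misses(trace, capacity):
    """Count misses for Belady's algorithm, which needs the full future trace."""
    cache = set()
    misses = 0
    for i, block in enumerate(trace):
        if block in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            def next_use(b):
                # Position of the next reference to b, or infinity if none.
                try:
                    return trace.index(b, i + 1)
                except ValueError:
                    return float("inf")
            # Evict the block whose next use lies farthest in the future.
            cache.remove(max(cache, key=next_use))
        cache.add(block)
    return misses


trace = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4]  # illustrative block addresses
lru = lru_misses(trace, capacity=3)
opt = belady_misses(trace, capacity=3)
```

Because Bélády's algorithm requires knowledge of future accesses, it serves as a lower bound for comparison rather than an implementable policy, which is why practical policies are described as falling short of it.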
7. RTOS Schedulers for Periodic and Aperiodic Taskset
- Author
-
Rinku, Dhruva R., Asha Rani, M., Krishna Suhruth, Y., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Tuba, Milan, editor, Akashe, Shyam, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
8. Lock-Free Bucketized Cuckoo Hashing
- Author
-
Li, Wenhai, Cheng, Zhiling, Chen, Yuan, Li, Ao, Deng, Lingfeng, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Cano, José, editor, Dikaiakos, Marios D., editor, Papadopoulos, George A., editor, Pericàs, Miquel, editor, and Sakellariou, Rizos, editor
- Published
- 2023
- Full Text
- View/download PDF
9. Programming Heterogeneous Architectures Using Hierarchical Tasks
- Author
-
Faverge, Mathieu, Furmento, Nathalie, Guermouche, Abdou, Lucas, Gwenolé, Namyst, Raymond, Thibault, Samuel, Wacrenier, Pierre-André, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Singer, Jeremy, editor, Elkhatib, Yehia, editor, Blanco Heras, Dora, editor, Diehl, Patrick, editor, Brown, Nick, editor, and Ilic, Aleksandar, editor
- Published
- 2023
- Full Text
- View/download PDF
10. Characterization of SPEC2006 Benchmarks Under Multicore Platform to Identify Critical Architectural Aspects
- Author
-
Shukla, Surendra Kumar, Pant, Bhaskar, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Agrawal, Rajeev, editor, Mitra, Pabitra, editor, Pal, Arindam, editor, and Sharma Gaur, Madhu, editor
- Published
- 2023
- Full Text
- View/download PDF
11. Exploring the Scheduling Techniques for the RTOS
- Author
-
Rinku, Dhruva R., Asha Rani, M., Suhruth Krishna, Y., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Tuba, Milan, editor, Akashe, Shyam, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
12. A Multi-core Based Real-time Scheduler Supporting Periodic and Sporadic Threads and Processes.
- Author
-
Kim, Sanggyu and Park, Hong Seong
- Abstract
This paper proposes, implements, and verifies a multicore real-time scheduler (MCRT scheduler) for periodic and sporadic threads and processes, and non-real-time processes where periodic and sporadic (or event-driven) processes are processed according to real-time characteristics such as limited periods and deadlines. Using the Xenomai and Linux operating systems, the proposed MCRT scheduler was implemented and verified through various test cases designed for multicore operations. The proposed MCRT scheduler generates scheduling tables for periodic and sporadic threads and processes, based on which they are executed during the basic period. The MCRT scheduler was verified using several examples. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. Preparation of Multicore Millimeter-Sized Spherical Alginate Capsules to Specifically and Sustainedly Release Fish Oil
- Author
-
Lina Tao, Panpan Wang, Ting Zhang, Mengzhen Ding, Lijie Liu, Ningping Tao, Xichang Wang, and Jian Zhong
- Subjects
Ionotropic gelation, Millimeter-sized spherical capsule, Monoaxial dispersion electrospraying, Multicore, Specific and sustained release, Nutrition. Foods and food supply, TX341-641
- Abstract
Specific and sustained release of nutrients from capsules to the gastrointestinal tract has attracted much attention in the field of food and drug delivery. In this work, we report a monoaxial dispersion electrospraying-ionotropic gelation technique to prepare multicore millimeter-sized spherical capsules for specific and sustained release of fish oil. The diameter of the spherical capsules decreased from 2.05 mm to 0.35 mm as the applied voltage increased. The capsules consisted of uniform (at applied voltages of ≤ 10 kV) or nonuniform (at applied voltages of > 10 kV) multicores. The obtained capsules had reasonable loading ratios (9.7% to 6.3%) due to the multicore structure. In addition, the obtained capsules showed specific and sustained release of fish oil into the small intestinal phase of in vitro gastrointestinal tract and small intestinal tract models. The simple monoaxial dispersion electrospraying-ionotropic gelation technique does not involve complicated preparation formulations or polymer modification, which gives it potential for fish oil preparations and for the encapsulation of functional active substances in the food and drug industries.
- Published
- 2023
- Full Text
- View/download PDF
14. Performance evaluation on work-stealing featured parallel programs on asymmetric performance multicore processors
- Author
-
Adnan
- Subjects
Amdahl’s law, Speedup factor, Asymmetric performance, Multicore, Work stealing, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
- Abstract
The speed difference between high-performance CPUs and energy-efficient CPUs, which are found in asymmetric performance multicore processors, affects the current form of Amdahl’s law equation. This paper proposes two updates to that equation based on the performance evaluation results of a simple parallel pi program written with OpenCilk. Performance evaluation was done by measuring execution time and instructions per cycle (IPC). The performance evaluation of the parallel program executed on the Intel Core i5 1240P processor did not indicate decreased performance due to asymmetric performance. Instead, the program with efficient work-stealing advantages from OpenCilk performed well. In the case of using the execution time of the P-CPU as a reference to obtain speedup, the evaluation results in a sublinear speedup. Conversely, in the case of using the execution time of the E-CPU as a reference, the evaluation results in a superlinear speedup. This paper proposes two updates to Amdahl’s law equation based on these two evaluation results.
- Published
- 2023
- Full Text
- View/download PDF
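The abstract of entry 14 above reports sublinear speedup when a P-CPU run is the reference and superlinear speedup when an E-CPU run is the reference. The two updated equations the paper proposes are not given in this listing, so the sketch below is not those equations. It uses the classic Amdahl's law plus a simple assumed model: the serial part runs on one P-core, the parallel part is perfectly balanced across all cores (an idealization of work stealing), and each E-core runs at a fixed fraction `e_speed` of a P-core. The core counts and `e_speed = 0.5` are illustrative assumptions.

```python
def amdahl_speedup(f, n):
    """Classic Amdahl's law: parallel fraction f on n identical cores."""
    return 1.0 / ((1.0 - f) + f / n)


def asymmetric_speedups(f, n_p, n_e, e_speed):
    """Return (speedup vs. one P-core, speedup vs. one E-core).

    Assumed model, not the paper's: time is normalized so that one P-core
    runs the whole program in 1.0, and one E-core takes 1.0 / e_speed.
    """
    capacity = n_p + n_e * e_speed        # aggregate throughput in P-core units
    t_parallel = (1.0 - f) + f / capacity  # serial part on a P-core
    t_p_alone = 1.0
    t_e_alone = 1.0 / e_speed
    return t_p_alone / t_parallel, t_e_alone / t_parallel


# Illustrative values: 4 P-cores, 8 E-cores, E-core at half P-core speed.
s_vs_p, s_vs_e = asymmetric_speedups(f=0.99, n_p=4, n_e=8, e_speed=0.5)
total_cores = 4 + 8
```

Under these assumptions the speedup measured against the P-core run stays below the core count, while the speedup measured against the E-core run exceeds it. The only point of the sketch is that the choice of reference time can move the same measurement from sublinear to superlinear.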
15. Parallel Message Authentication Algorithm Implemented Over Multicore CPU.
- Author
-
Alamari, Yamamh Alaa, Fanfakh, Ahmed, and Hadi, Esraa
- Subjects
PARALLEL algorithms, INTERNET fraud, ALGORITHMS, MULTICORE processors, SOCIAL networks, SECURITY systems
- Abstract
Currently, there are around 4.95 billion people who use the internet, which creates a large audience with an increasing demand for activities that can be done online. Some examples of these activities include social networking, information sharing, and online shopping. Therefore, there is an urgent need for improved levels of secrecy and privacy. It has recently come to light that one of the most serious challenges to the mainstream adoption of business apps is that of online fraud. As a direct consequence of this, issues pertaining to authentication, authorization, and identification have emerged as critical considerations in today's open and accessible society. The act of recognizing an entity, whether it be a human, a computer, or a software program, is referred to as the identification process. Authentication and authorization are complementary processes that are used in security systems to decide which persons are permitted to access information resources over a network. Authentication is performed by the user's device, while authorization is performed by the security system. However, a variety of potential solutions have been suggested, one of which is the use of parallelism in order to boost the efficiency of authentication methods. This work proposes a new message authentication algorithm that is applied in parallel over a multicore processor. The proposed message authentication algorithm uses two PNGRs and two substitution boxes to encrypt and authenticate the plain message. The encryption process has a one-round operation, making it a fast technique to encrypt and decrypt blocks of messages. When executed over a multicore CPU, the proposed authentication method is on average 3.27 times faster than the existing parallel Speck-based authentication method. The average speedup of the parallel implementation over the sequential version of the proposed algorithm is 2.99.
The proposed method passes the most difficult randomness test, and the obtained MAC values are tested further to meet other security measurements. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. FlyOS: rethinking integrated modular avionics for autonomous multicopters.
- Author
-
Farrukh, Anam and West, Richard
- Abstract
Autonomous multicopters often feature federated architectures, which incur relatively high communication costs between separate hardware components. These costs limit the ability to react quickly to new mission objectives. Additionally, federated architectures are not easily upgraded without introducing new hardware that impacts size, weight, power and cost constraints. In turn, such constraints restrict the use of redundant hardware to handle faults. In response to these challenges, we propose FlyOS, an Integrated Modular Avionics approach to consolidate mixed-criticality flight functions in software on heterogeneous multicore aerial platforms. FlyOS is based on a separation kernel that statically partitions resources among virtualized sandboxed OSes. We present a dual-sandbox prototype configuration, where timing- and safety-critical flight control tasks execute in a real-time OS alongside mission-critical vision-based navigation tasks in a Linux sandbox. Low latency shared memory communication allows flight commands and data to be relayed in real-time between sandboxes. A hypervisor-based fault-tolerance mechanism is also deployed to ensure failover flight control in case of critical function or timing failures. We validate FlyOS's performance and showcase its benefits when compared against traditional architectures in terms of predictable, extensible and efficient flight control. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
17. A Comprehensive Survey on the Use of Hypervisors in Safety-Critical Systems
- Author
-
Santiago Lozano, Tamara Lugo, and Jesus Carretero
- Subjects
Aerospace, automotive, aviation, embedded, hypervisor, multicore, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Virtualization has become one of the main tools for making efficient use of the resources offered by multicore embedded platforms. In recent years, even sectors such as space, aviation, and automotive, traditionally wary of adopting this type of technology due to the impact it could have on the safety of their systems, have been forced to introduce it into their day-to-day work, as their applications are becoming increasingly complex and demanding. This article provides a comprehensive review of the research work that uses or considers the use of a hypervisor as the basis for building a virtualized safety-critical embedded system. Once the hypervisors developed or adapted for this type of system have been identified, an exhaustive qualitative comparison is made between them. To the best of our knowledge, this is the first time that all this information is collected in a single article. Therefore, the main contribution of this article is that it collects and categorizes the information of each hypervisor and compares them with each other, so that this article can be used as a starting point for future researchers in this area, who will be able to quickly check which hypervisor is best suited to their research needs.
- Published
- 2023
- Full Text
- View/download PDF
18. An Efficient Authenticated Elliptic Curve Cryptography Scheme for Multicore Wireless Sensor Networks
- Author
-
Esau Taiwo Oladipupo, Oluwakemi Christiana Abikoye, Agbotiname Lucky Imoize, Joseph Bamidele Awotunde, Ting-Yi Chang, Cheng-Chi Lee, and Dinh-Thuan Do
- Subjects
Multiprocessor, multicore, wireless sensor, encryption, chosen plaintext attack, chosen ciphertext attack, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The need to ensure the longevity of Wireless Sensor Networks (WSNs) and secure their communication has spurred various researchers to come up with various WSN models. Prime among the methods for extending the life span of WSNs is the clustering of Wireless Sensors (WS), which reduces the workload of WS and thereby reduces its power consumption. However, a drastic reduction in the power consumption of the sensors when multicore sensors are used in combination with sensor clustering has not been well explored. Therefore, this work proposes a WSN model that employs clustering of multicore WS. The existing Elliptic Curve Cryptographic (ECC) algorithm is optimized for parallel execution of the encryption/decryption processes and security against primitive attacks. The Elliptic Curve Diffie-Hellman (ECDH) was used for the key exchange algorithm, and the Elliptic Curve Digital Signature Algorithm (ECDSA) was used to authenticate the communicating nodes. Security analysis of the model and comparative performance analysis with the existing ones were demonstrated. The security analysis results reveal that the proposed model meets the security requirements and resists various security attacks. Additionally, the projected model is scalable, energy-conservative, and supports data freshness. The results of comparative performance analysis show that the proposed WSN model can efficiently leverage multiprocessors and/or many cores for quicker execution and conserves power usage.
- Published
- 2023
- Full Text
- View/download PDF
19. Fast Parallel Bellman-Ford-Moore Algorithm Implementation for Small Graphs
- Author
-
Vezolainen, Alexei, Salnikov, Alexey, Klyuchikov, Artem, Komech, Sergey, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Voevodin, Vladimir, editor, Sobolev, Sergey, editor, Yakobovskiy, Mikhail, editor, and Shagaliev, Rashit, editor
- Published
- 2022
- Full Text
- View/download PDF
20. Observing the Impact of Multicore Execution Platform for TSP Systems Under Schedulability, Security and Safety Constraints
- Author
-
Atchadam, Ill-ham, Lemarchand, Laurent, Singhoff, Frank, Tran, Hai Nam, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Trapp, Mario, editor, Schoitsch, Erwin, editor, Guiochet, Jérémie, editor, and Bitsch, Friedemann, editor
- Published
- 2022
- Full Text
- View/download PDF
21. Heterogeneous Voltage Frequency Scaling of Data-Parallel Applications for Energy Saving on Homogeneous Multicore Platforms
- Author
-
Bratek, Pawel, Szustak, Lukasz, Wyrzykowski, Roman, Olas, Tomasz, Chmiel, Tomasz, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Chaves, Ricardo, editor, B. Heras, Dora, editor, Ilic, Aleksandar, editor, Unat, Didem, editor, Badia, Rosa M., editor, Bracciali, Andrea, editor, Diehl, Patrick, editor, Dubey, Anshu, editor, Sangyoon, Oh, editor, L. Scott, Stephen, editor, and Ricci, Laura, editor
- Published
- 2022
- Full Text
- View/download PDF
22. Improving Performance of Long Short-Term Memory Networks for Sentiment Analysis Using Multicore and GPU Architectures
- Author
-
Künas, Cristiano A., Serpa, Matheus S., Padoin, Edson Luiz, Navaux, Philippe O. A., Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Gitler, Isidoro, editor, Barrios Hernández, Carlos Jaime, editor, and Meneses, Esteban, editor
- Published
- 2022
- Full Text
- View/download PDF
23. Performance Evaluation of OSCAR Multi-target Automatic Parallelizing Compiler on Intel, AMD, Arm and RISC-V Multicores
- Author
-
Magnussen, Birk Martin, Kawasumi, Tohma, Mikami, Hiroki, Kimura, Keiji, Kasahara, Hironori, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Li, Xiaoming, editor, and Chandrasekaran, Sunita, editor
- Published
- 2022
- Full Text
- View/download PDF
24. Multicore Embedded Worst-Case Task Design Issues and Analysis Using Machine Learning Logic
- Author
-
Aradhya, Sumalatha, Thejaswini, S., Nagaveni, V., Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Senjyu, Tomonobu, editor, Mahalle, Parakshit, editor, Perumal, Thinagaran, editor, and Joshi, Amit, editor
- Published
- 2022
- Full Text
- View/download PDF
25. Performance Analysis of Genetic Algorithm for Function Optimization in Multicore Platform Using DEAP
- Author
-
Harini, D. N., Karthi, R., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Reddy, V. Sivakumar, editor, Prasad, V. Kamakshi, editor, Wang, Jiacun, editor, and Reddy, K. T. V., editor
- Published
- 2022
- Full Text
- View/download PDF
26. A hybrid CUDA, OpenMP, and MPI parallel TCA-based domain adaptation for classification of very high-resolution remote sensing images.
- Author
-
Garea, Alberto S., Heras, Dora B., Argüello, Francisco, and Demir, Begüm
- Subjects
-
*DEEP learning, *MULTISPECTRAL imaging, *REMOTE sensing, *MESSAGE passing (Computer science), *CLASSIFICATION
- Abstract
Domain Adaptation (DA) is a technique that aims at extracting information from a labeled remote sensing image to allow classifying a different image obtained by the same sensor but at a different geographical location. This is a very complex problem from the computational point of view, especially due to the very high resolution of multispectral images. TCANet is a deep learning neural network for DA classification problems that has been proven to be very accurate for solving them. TCANet consists of several stages based on the application of convolutional filters obtained through Transfer Component Analysis (TCA) computed over the input images. It does not require backpropagation training, in contrast to the usual CNN-based networks, as the convolutional filters are directly computed based on the TCA transform applied over the training samples. In this paper, a hybrid parallel TCA-based domain adaptation technique for solving the classification of very high-resolution multispectral images is presented. It is designed for efficient execution on a multi-node computer by using Message Passing Interface (MPI), exploiting the available Graphical Processing Units (GPUs), and making efficient use of each multicore node by using Open Multi-Processing (OpenMP). As a result, an accurate DA technique from the point of view of classification and with high speedup values over the sequential version is obtained, increasing the applicability of the technique to real problems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
27. Reducing energy consumption using heterogeneous voltage frequency scaling of data-parallel applications for multicore systems.
- Author
-
Bratek, Pawel, Szustak, Lukasz, Wyrzykowski, Roman, and Olas, Tomasz
- Subjects
-
*VOLTAGE, *FLUID dynamics, *MULTICORE processors, *PARALLEL algorithms
- Abstract
This paper investigates the exploitation of heterogeneous DVFS (dynamic voltage frequency scaling) control for improving the energy efficiency of data-parallel applications on ccNUMA shared-memory systems. We propose to adjust the clock frequency individually for the appropriately selected groups of cores, taking into account the diversified costs of parallel computation. This paper aims to evaluate the proposed approach using two different data-parallel applications: solving the 3D diffusion problem, and MPDATA fluid dynamics application. As a result, we observe the energy-savings gains of up to 20 percentage points over the traditional homogeneous frequency scaling approach on the server with two 18-core Intel Xeon Gold 6240. Additionally, we confirm the effectiveness of our strategy using two 64-core AMD EPYC 7773X. This paper also introduces two pruning algorithms that help select the optimal heterogeneous DVFS setups taking into account the energy or performance profile of studied applications. Finally, the cost and efficiency of developed algorithms are verified and compared experimentally against the brute-force search.
• Heterogeneous DVFS method for energy efficiency of regular data-parallel applications.
• Individually adjusting clock frequency for cores based on workload distribution.
• Pruning algorithms for selecting optimal heterogeneous DVFS setups.
[ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. DAG Hierarchical Schedulability Analysis for Avionics Hypervisor in Multicore Processors.
- Author
-
Yang, Huan, Zhao, Shuai, Shi, Xiangnan, Zhang, Shuang, and Guo, Yangming
- Subjects
AVIONICS, HYPERVISOR (Computer software), MULTICORE processors, VIRTUAL machine systems, DIRECTED acyclic graphs
- Abstract
Parallel hierarchical scheduling of multicore processors in avionics hypervisor is being studied. Parallel hierarchical scheduling utilizes modular reasoning about the temporal behavior of the upper Virtual Machine (VM) by partitioning CPU time. Directed Acyclic Graphs (DAGs) are used for modeling functional dependencies. However, the existing DAG scheduling algorithm wastes resources and is inaccurate. Decreasing the completion time (CT) of DAG and offering a tight and secure boundary makes use of joint-level parallelism and inter-joint dependency, which are two key factors of DAG topology. Firstly, Concurrent Parent and Child Model (CPCM) is researched, which accurately captures the above two factors and can be applied recursively when parsing DAG. Based on CPCM, the paper puts forward a hierarchical scheduling algorithm, which focuses on decreasing the maximum CT of joints. Secondly, the new Response Time Analysis (RTA) algorithm is proposed, which offers a general limit for other execution sequences of Noncritical joints (NC-joints) and a specific limit for a fixed execution sequence. Finally, research results show that the parallel hierarchical scheduling algorithm has higher performance than other algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Enhanced Multicore Performance Using Novel Thread-Aware Cache Coherence and Prefetch-Control Mechanism.
- Author
-
Ghosh, Soma Niloy, Sahula, Vineet, and Bhargava, Lava
- Abstract
We propose a hardware technique for cache coherence over the existing approaches that ensure that shared and less frequently used cache blocks bypass private caches of multiple cores. Furthermore, this manuscript proposes a mechanism to tune the aggressiveness of a data prefetcher. Increased cache hit rate and improved performance have been observed since coherence management and prefetching delays are avoided using the proposed bypassing and thread progress-aware prefetch controlling mechanism. Our approach shows around 19% improvement in cache hit rate and 29% average performance improvement over existing state-of-the-art techniques for Parsec & Splash-2 multithreaded benchmarks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Fine-grain data classification to filter token coherence traffic.
- Author
-
Upadhyay, Bhargavi R., Ros, Alberto, and M., Supriya
- Subjects
- *
OPTICAL disks , *CLASSIFICATION - Abstract
Snoop-based cache coherence protocols perform well in small-scale systems by enabling low latency cache-to-cache data transfers in just two-hop coherence transactions. However, they are not a scalable alternative as they require frequent broadcast of coherence requests. Token coherence protocols were proposed to improve the scalability of snoop-based protocols by removing a large amount of traffic due to broadcast responses. Still, broadcasting coherence requests on every cache miss represents a scalability issue for medium and large-scale systems. In this paper, we propose to reduce the number of broadcast operations in Token coherence protocols by performing an efficient fine-grain private-shared data classification and disabling broadcasts for misses to data classified as private. Our fine-grain classification is orchestrated and stored by the Translation Look-aside Buffers (TLBs), where entries are kept for a longer time than in local caches. We explore different classification granularity accounting for different storage overheads and their impact on filtering coherence traffic. We evaluate our proposals on a set of parallel benchmarks through full-system cycle-accurate simulation and show that a subpage-grain classification offers the best trade-off when accounting for storage, traffic, and performance. When running a 16-core configuration, our subpage-grain classification eliminates 40.1% of broadcast operations compared to not performing any classification and 13.7% of broadcast operations more than a page-grain data classification. This reduction translates into less network traffic (16.0%), and finally, performance improvements of 12.0% compared to not having a classification mechanism.
• Evaluation of TLB-based private/shared classification with varying granularities.
• Proposal of a new TLB-based sub-page classification mechanism.
• Integration of classification techniques to filter Token coherence traffic.
• Reduction in traffic by 16% and performance by 20% with a low storage cost. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
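To illustrate why classification granularity matters in the abstract above, here is a toy sketch, not the authors' TLB-based mechanism. It assumes every access is a miss, classifies regions offline over a whole trace, and uses hypothetical names (`classify`, `broadcasts_needed`) and region sizes.

```python
def classify(accesses, region_bytes):
    """Label each region 'private' (one core touches it) or 'shared'."""
    owner = {}
    shared = set()
    for core, addr in accesses:
        region = addr // region_bytes
        first = owner.setdefault(region, core)  # first core to touch the region
        if first != core:
            shared.add(region)
    return {r: ("shared" if r in shared else "private") for r in owner}

def broadcasts_needed(accesses, region_bytes):
    """Count accesses that still broadcast, i.e. those to shared regions."""
    labels = classify(accesses, region_bytes)
    return sum(labels[addr // region_bytes] == "shared" for _, addr in accesses)

# Hypothetical trace of (core, byte address): two cores use disjoint halves of one page.
trace = [(0, 0), (1, 2048), (0, 64), (1, 2112)]
print(broadcasts_needed(trace, 4096))  # page-grain regions (assumed 4 KiB)
print(broadcasts_needed(trace, 1024))  # subpage-grain regions (assumed 1 KiB)
```

With page-grain regions the whole page is marked shared and all four accesses broadcast; with the finer regions each core's data stays private and none do.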
31. Intelligent Security Strategy Based on the Selection of the Computer and Neural Network Architecture.
- Author
-
Ruchkin, V. N., Kostrov, B. V., and Fulin, V. A.
- Abstract
The article analyzes information security strategies, such as a strategic cooperative game of chicken, balancing business incentives and striving for brinkmanship, and ensuring sufficient security with minimal effort for customers and consumers while not impairing—and in some cases improving—the privacy of their infrastructure and the Internet of Things (IoT) security maturity model (SMM). The benefits of the latter strategy are estimated by selecting an architecture with the core of the IoT SMM in the form of a hierarchy of security practices. Algorithms ensuring privacy and protection from threats are analyzed. A methodology for analyzing and selecting the best architecture for multicore hierarchical clustering of computer systems is proposed. An expert system based on the on-chip MCNPAoC SBIS 1879BM8Y instrument module MS 127.05 with the proposed user interface is implemented. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
32. Automatic Differentiation of C++ Codes on Emerging Manycore Architectures with Sacado.
- Author
-
Phipps, Eric, Pawlowski, Roger, and Trott, Christian
- Subjects
- *
AUTOMATIC differentiation , *C++ , *PARTIAL differential equations , *SOFTWARE development tools , *INTEGRATED software - Abstract
Automatic differentiation (AD) is a well-known technique for evaluating analytic derivatives of calculations implemented on a computer, with numerous software tools available for incorporating AD technology into complex applications. However, a growing challenge for AD is the efficient differentiation of parallel computations implemented on emerging manycore computing architectures such as multicore CPUs, GPUs, and accelerators as these devices become more pervasive. In this work, we explore forward mode, operator overloading-based differentiation of C++ codes on these architectures using the widely available Sacado AD software package. In particular, we leverage Kokkos, a C++ tool providing APIs for implementing parallel computations that is portable to a wide variety of emerging architectures. We describe the challenges that arise when differentiating code for these architectures using Kokkos, and two approaches for overcoming them that ensure optimal memory access patterns as well as expose additional dimensions of fine-grained parallelism in the derivative calculation. We describe the results of several computational experiments that demonstrate the performance of the approach on a few contemporary CPU and GPU architectures. We then conclude with applications of these techniques to the simulation of discretized systems of partial differential equations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
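The forward-mode, operator-overloading technique named in the abstract above can be sketched with dual numbers. This is a conceptual Python illustration only; Sacado is a C++ library and this is not its API. The names `Dual`, `sin`, and `derivative` are assumptions made for the example.

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))

def sin(x: Dual) -> Dual:
    # Chain rule: d/dt sin(u) = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def derivative(f, x: float) -> float:
    """Seed the input with derivative 1 and read the derivative off the result."""
    return f(Dual(x, 1.0)).dot

print(derivative(lambda x: x * x + 3 * x, 2.0))
```

Each overloaded operation propagates the derivative alongside the value, so the differentiated code keeps the structure of the original computation.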
33. FPGA-based programmable embedded platform for image processing applications
- Author
-
Siddiqui, Fahad Manzoor, Woods, Roger, and Rafferty, Karen
- Subjects
621.36 ,FPGA ,Dataflow ,Multicore ,Zynq ,Parallel computing ,Hardware acceleration ,Image Processing ,Programmable - Abstract
A vast majority of electronic systems including medical, surveillance and critical infrastructure employs image processing to provide intelligent analysis. They use onboard pre-processing to reduce data bandwidth and memory requirements before sending information to the central system. Field Programmable Gate Arrays (FPGAs) represent a strong platform as they permit reconfigurability and pipelining for streaming applications. However, rapid advances and changes in these application use cases crave adaptable hardware architectures that can process dynamic data workloads and be easily programmed to achieve efficient solutions in terms of area, time and power. FPGA-based development needs iterative design cycles, hardware synthesis and place-and-route times which are alien to the software developers. This work proposes an FPGA-based programmable hardware acceleration approach to reduce design effort and time. This allows developers to use FPGAs to profile, optimise and quickly prototype algorithms using a more familiar software-centric, edit-compile-run design flow that enables the programming of the platform by software rather than high-level synthesis (HLS) engineering principles. Central to the work has been the development of an optimised FPGA-based processor called Image Processing Processor (IPPro) which efficiently uses the underlying resources and presents a programmable environment to the programmer using a dataflow design principle. This gives superior performance when compared to competing alternatives. From this, a three-layered platform has been created which enables the realisation of parallel computing skeletons on FPGA which are used to efficiently express designs in high-level programming languages. From bottom-up, these layers represent programming (actor, multiple actors and parallel skeletons) and hardware (IPPro core, multicore IPPro, system infrastructure) abstraction. The platform allows acceleration of parallel and non-parallel dataflow applications. 
A set of point and area image pre-processing functions are implemented on Avnet Zedboard platform which allows the evaluation of the performance. The point function achieved 2.53 times better performance than the area functions and point and area functions achieved performance improvements of 7.80 and 5.27 times over single core IPPro by exploiting data parallelism. The pipelined execution of multiple stages revealed that a dataflow graph can be decomposed into balanced actors to deliver maximum performance by hiding data transfer and processing time through exploiting task parallelism; otherwise, the maximum achievable performance is limited by the slowest actor due to the ripple effect caused by unbalanced actors. The platform delivered better performance in terms of fps/Watt/Area than Embedded Graphic Processing Unit (GPU) considering both technologies allows a software-centric design flow.
- Published
- 2018
34. Preliminary Performance and Programmability Comparison of the Thick Control Flow Architecture and Current Multicore CPUs
- Author
-
Forsell, Martti, Nikula, Sara, Roivainen, Jussi, Arabnia, Hamid, Series Editor, Arabnia, Hamid R., editor, Deligiannidis, Leonidas, editor, Grimaila, Michael R., editor, Hodson, Douglas D., editor, Joe, Kazuki, editor, Sekijima, Masakazu, editor, and Tinetti, Fernando G., editor
- Published
- 2021
- Full Text
- View/download PDF
35. L2 Cache Robust Partitioning in Multicore Processors
- Author
-
de Oliveira Duarte, Thiago Silva, Saotome, Osamu, Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Iano, Yuzo, editor, Saotome, Osamu, editor, Kemper, Guillermo, editor, Mendes de Seixas, Ana Claudia, editor, and Gomes de Oliveira, Gabriel, editor
- Published
- 2021
- Full Text
- View/download PDF
36. SPORTS: A Semi-partitioned Real-Time Scheduler for Heterogeneous Multicore Platforms
- Author
-
Sharma, Yanshul, Das, Zinea, Moulik, Sanjay, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Ning, Li, editor, Chau, Vincent, editor, and Lau, Francis, editor
- Published
- 2021
- Full Text
- View/download PDF
37. RAT: A Lightweight Architecture Independent System-Level Soft Error Mitigation Technique
- Author
-
Gava, Jonas, Reis, Ricardo, Ost, Luciano, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Goedicke, Michael, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Tröltzsch, Fredi, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Reis, Ricardo, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Calimera, Andrea, editor, Gaillardon, Pierre-Emmanuel, editor, Korgaonkar, Kunal, editor, and Kvatinsky, Shahar, editor
- Published
- 2021
- Full Text
- View/download PDF
38. Parallel CPU-Based Processing for Automatic Crop Row Detection in Corn Fields
- Author
-
Pusdá-Chulde, Marco, De Giusti, Armando, Herrera-Granda, Erick, García-Santillán, Iván, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Botto-Tobar, Miguel, editor, Cruz, Henry, editor, and Díaz Cadena, Angela, editor
- Published
- 2021
- Full Text
- View/download PDF
39. Minits-AllOcc: An Efficient Algorithm for Mining Timed Sequential Patterns
- Author
-
Karsoum, Somayah, Barrus, Clark, Gruenwald, Le, Leal, Eleazar, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Karlapalem, Kamal, editor, Cheng, Hong, editor, Ramakrishnan, Naren, editor, Agrawal, R. K., editor, Reddy, P. Krishna, editor, Srivastava, Jaideep, editor, and Chakraborty, Tanmoy, editor
- Published
- 2021
- Full Text
- View/download PDF
40. Performance of Static and Dynamic Task Scheduling for Real-Time Engine Control System on Embedded Multicore Processor
- Author
-
Oki, Yoshitake, Mikami, Hiroki, Nishida, Hikaru, Umeda, Dan, Kimura, Keiji, Kasahara, Hironori, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pande, Santosh, editor, and Sarkar, Vivek, editor
- Published
- 2021
- Full Text
- View/download PDF
41. A Survey of Techniques for Reducing Interference in Real-Time Applications on Multicore Platforms
- Author
-
Tamara Lugo, Santiago Lozano, Javier Fernandez, and Jesus Carretero
- Subjects
Real-time systems ,architecture ,multicore ,timing analysis ,schedulability analysis ,WCET ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contentions in main memory, cache memory, a memory bus, and the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.
- Published
- 2022
- Full Text
- View/download PDF
42. Cache memory structural design for handling big data usages
- Author
-
Kumar, Satendra
- Published
- 2021
- Full Text
- View/download PDF
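43. Pengujian Multicore Pada Processor Terhadap Performansi Server Virtualisasi Menggunakan Metode Load Testing [Testing the Effect of Processor Multicore on Virtualization Server Performance Using the Load Testing Method]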
- Author
-
Doddy Ferdiansyah, Aliev Riaunanda Kamal, Sali Alas Majapahit, and Ferry Mulyanto
- Subjects
security ,information security ,laboratory ,multicore ,performance testing ,Mathematics ,QA1-939 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
When building an information security laboratory, the devices, applications, and environment all need to be considered. The purpose of the laboratory is to test the security level of an application that will be or has already been built. The main problem, however, is the difficulty of choosing hardware that fits the requirements. Several parameters must be considered when determining the right hardware components, namely Random Access Memory (RAM), processor, and Network Interface Card (NIC). This study focuses only on testing the effect of multiple cores in a processor. The processor is the most important device in a computer; figuratively, it is the brain of the computer that will be used to test the security level of an application that will be or has already been built. The choice of processor type also strongly affects how the test computer processes its tasks in the design of this information security laboratory, so choosing the processor is very important. The final outcome of this study is a processor recommendation that fits the needs of the test computer in the blueprint of the information security laboratory.
- Published
- 2021
- Full Text
- View/download PDF
44. RT-SEAT: A hybrid approach based real-time scheduler for energy and temperature efficient heterogeneous multicore platforms
- Author
-
Yanshul Sharma and Sanjay Moulik
- Subjects
Multicore ,Deadline ,Energy ,Heterogeneous ,Temperature ,Technology - Abstract
The demand for heterogeneous multicore platforms is growing at a rapid pace in modern gadgets. Such platforms help cater to various types of applications and thus provide high resource utilization. As each task has a different execution time on different types of cores, it is very challenging to schedule tasks on such platforms. With the advancement in technology, it has become imperative to manage energy consumption and temperatures of cores in multicore platforms. Hence, in this work, we propose RT-SEAT, a hybrid real-time scheduler for energy and temperature-efficient heterogeneous multicore systems. Through extensive experimental analysis, we have found that RT-SEAT is able to schedule more tasks (up to 46.75%), save more energy (up to 6.89%), and reduce the average temperature of the cores by 20.45%, when the system workload varies from 50% to 100%, with respect to the state-of-the-art.
- Published
- 2022
- Full Text
- View/download PDF
45. 30-GHz Low-Phase-Noise Scalable Multicore Class-F Voltage-Controlled Oscillators Using Coupled-Line-Based Synchronization Topology.
- Author
-
Wan, Jiayue, Li, Xiao, Fei, Zesong, Han, Fang, Li, Xiaoran, Wang, Xinghua, and Chen, Zhiming
- Abstract
In this letter, low-phase-noise multicore class-F voltage-controlled oscillators (VCOs) using coupled-line-based synchronization topology are proposed. Compared to traditional resistance-coupled multicore VCOs, the proposed coupled-line-based topology improves the $Q$ of the small inductors in the millimeter-wave frequency range. Mode ambiguity is eliminated for a robust oscillation startup. Quad-core and oct-core VCO prototypes are designed and implemented in a 65-nm CMOS process, which exhibit a measured frequency tuning range of 20.5% centered at 31.32 GHz. The quad-core VCO has a measured phase noise (PN) of −134.33 dBc/Hz and a corresponding FoM of 191.32 dBc/Hz at 10-MHz offset from 28.28 GHz. The oct-core VCO has a measured PN of −137.23 dBc/Hz and a corresponding FoM of 191.08 dBc/Hz at 10-MHz offset from 28.16 GHz. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
46. Speeding up wheel factoring method.
- Author
-
Bahig, Hazem M., Nassr, Dieaa I., Mahdi, Mohammed A., Hazber, Mohamed A. G., Al-Utaibi, Khaled, and Bahig, Hatem M.
- Subjects
- *
PUBLIC key cryptography , *PARALLEL algorithms , *CRYPTOSYSTEMS , *POLYNOMIAL time algorithms , *COMPUTER algorithms , *WHEELS - Abstract
The security of many public key cryptosystems that are used today depends on the difficulty of factoring an integer into its prime factors. Although there is a polynomial time quantum-based algorithm for integer factorization, there is no polynomial time algorithm on a classical computer. In this paper, we study how to improve the wheel factoring method using two approaches. The first approach is introducing two sequential modifications on the wheel factoring method. The second approach is parallelizing the modified algorithms on a parallel system. The experimental studies on composite integers n that are a product of two primes of equal size show the following results. (1) The percentages of improvements for the two modified sequential methods compared to the wheel factoring method are almost 47% and 90%. (2) The percentage of improvement for the two proposed parallel methods compared to the two modified sequential algorithms is 90% on the average. (3) The maximum speedup achieved by the best parallel proposed algorithm using 24 threads is almost 336 times the wheel factoring method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
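For readers unfamiliar with the baseline the abstract above refers to, the following is a minimal sketch of plain wheel factoring with the basis {2, 3, 5}. It is the textbook method, not the authors' modified or parallel algorithms; the function name `wheel_factor` is an assumption.

```python
from itertools import cycle

def wheel_factor(n: int) -> list[int]:
    """Prime factors of n (with multiplicity) by trial division on a 2-3-5 wheel."""
    if n < 2:
        return []
    factors = []
    # Strip the wheel's basis primes first.
    for p in (2, 3, 5):
        while n % p == 0:
            factors.append(p)
            n //= p
    # Gaps between consecutive integers coprime to 30, starting from 7:
    # 7, 11, 13, 17, 19, 23, 29, 31, 37, ...
    gaps = cycle((4, 2, 4, 2, 4, 6, 2, 6))
    d = 7
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += next(gaps)
    if n > 1:
        factors.append(n)  # whatever remains is prime
    return factors

# A small product of two primes of similar size, as in the abstract's test setting.
print(wheel_factor(10403))
```

The wheel skips every candidate divisible by 2, 3, or 5, so only 8 of every 30 integers are tried as divisors.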
47. Memory-Aware Denial-of-Service Attacks on Shared Cache in Multicore Real-Time Systems.
- Author
-
Bechtel, Michael and Yun, Heechul
- Subjects
- *
DENIAL of service attacks , *SHARED workspaces , *RANDOM access memory , *MULTICORE processors , *MICROELECTROMECHANICAL systems - Abstract
In this paper, we identify that memory performance plays a crucial role in the feasibility and effectiveness for performing denial-of-service attacks on shared cache. Based on this insight, we introduce new cache DoS attacks, which can be mounted from the user-space and can cause extreme worst-case execution time (WCET) impacts to cross-core victims—even if the shared cache is partitioned—by taking advantage of the platform’s memory address mapping information and HugePage support. We deploy these enhanced attacks on two popular embedded out-of-order multicore platforms using both synthetic and real-world benchmarks. The proposed DoS attacks achieve up to 111X WCET increases on the tested platforms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
48. Real-Time System Benchmarking with Embedded Linux and RT Linux on a Multi-Core Hardware Platform
- Author
-
Hosseini, Kian
- Abstract
To catch up with the growing trend of parallelism, this thesis work focuses on the adaption of embedded real-time systems to a multicore platform. We use the embedded system of Xilinx ZCU-102, a multicore board, as an example of an embedded system without getting deep into its architecture. First, we deal with the tasks required to be able to make an embedded system operational and discuss why they are different from those for normal computer systems. The processes it takes to make a custom operating system for the given Xilinx embedded system are examined and patching the custom operating system along with customizing it is studied. We then take a look at related work in the field of benchmarking real-time systems and embedded systems and with a good understanding of related work propose a design similar to the related work for benchmarking embedded systems. The benchmarks we use run on multiple cores and aim at challenging the Xilinx board’s capabilities of running real-time tasks when the other cores on the board are occupied with performing independent tasks. We test the designed benchmarks on different conditions under two different operating systems of RT-Linux and Embedded Linux to study the differences between them. We then note how the RT-Linux would be a real upgrade for real-time systems if multicore operations are considered. The final result we have obtained is that core idling might decrease the performance of real-time tasks and RT-Linux might experience more interrupts but it is also better at recovering from interrupts.
- Published
- 2024
49. Reachability-Based Response-Time Analysis of Preemptive Tasks Under Global Scheduling
- Author
-
Gohari, Pourya, Voeten, Jeroen, and Nasri, Mitra
- Abstract
Global scheduling reduces the average response times as it can use the available computing cores more efficiently for scheduling ready tasks. However, this flexibility poses challenges in accurately quantifying interference scenarios, often resulting in either conservative response-time analyses or scalability issues. In this paper, we present a new response-time analysis for preemptive periodic tasks (or job sets) subject to release jitter under global job-level fixed-priority (JLFP) scheduling. Our analysis relies on the notion of schedule-abstraction graph (SAG), a reachability-based response-time analysis known for its potential accuracy and efficiency. Up to this point, SAG was limited to non-preemptive tasks due to the complexity of handling preemption when the number of preemptions and the moments they occur are not known beforehand. In this paper, we introduce the concept of time partitions and demonstrate how it facilitates the extension of SAG for preemptive tasks. Moreover, our paper provides the first response-time analysis for the global EDF(k) policy - a JLFP scheduling policy introduced in 2003 to address the Dhall’s effect. Our experiments show that our analysis is significantly more accurate compared to the state-of-the-art analyses. For example, we identify 12 times more schedulable task sets than existing tests for the global EDF policy (e.g., for systems with 6 to 16 tasks, 70% utilization, and 4 cores) with an average runtime of 30 minutes. We show that EDF(k) outperforms global RM and EDF by scheduling on average 24.9% more task sets (e.g., for systems with 2 to 10 cores and 70% utilization). Moreover, for the first time, we show that global JLFP scheduling policies (particularly, global EDF(k)) are able to schedule task sets that are not schedulable using well-known partitioning heuristics.
- Published
- 2024
- Full Text
- View/download PDF
50. Tracking Multicore Contention in Memory Controllers and DRAM
- Author
-
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors, Barcelona Supercomputing Center, Moretó Planas, Miquel, Cazorla Almeida, Francisco Javier, and Fernández de Lecea Navarro, Asier
- Abstract
The main memory subsystem has traditionally been one of the more complex resources to analyze in multicore real-time embedded systems, with memory controller considerations and JEDEC timing constraints being the more prominent factors contributing to such complexity. One of the main challenges in multicore real-time systems is the production of the necessary evidence regarding the management of contention for the certification of multicore platforms in safety-relevant sectors. As current MPSoC platforms provide little information on how tasks may be interacting and delaying each other at large, it still remains a tall order to provide evidence about the correctness of hardware and software mechanisms deployed specifically to mitigate and manage contention on shared resources. This work attempts to bridge this gap by proposing a low-overhead hardware mechanism to tightly track inter-core contention within the main memory subsystem. The proposed technique enhances the quality of timing- and contention-related evidence, increasing the explainability and management of multicore contention in the main memory subsystem for multicore real-time systems in relation to applicable safety standards regulating their usage.
- Published
- 2024