639 results for "DATA pipelining"
Search Results
2. Developing and Testing High-Performance SHM Sensors Mounting Low-Noise MEMS Accelerometers.
- Author
-
Crognale, Marianna, Rinaldi, Cecilia, Potenza, Francesco, Gattulli, Vincenzo, Colarieti, Andrea, and Franchi, Fabio
- Subjects
STRUCTURAL health monitoring, TIME-frequency analysis, FREQUENCY-domain analysis, DATA pipelining, MODAL analysis, STRUCTURAL dynamics, WIRELESS sensor networks, SENSOR networks
- Abstract
Recently, there has been increased interest in adopting novel sensing technologies for continuously monitoring structural systems. In this respect, micro-electro-mechanical system (MEMS) sensors are widely used in several applications, including structural health monitoring (SHM), in which accelerometric samples are acquired to perform modal analysis. Thanks to their significantly lower cost, ease of installation in the structure, and lower power consumption, they enable extensive, pervasive, and battery-less monitoring systems. This paper presents an innovative high-performance device for SHM applications, based on a low-noise triaxial MEMS accelerometer, providing a guideline and insightful results about the opportunities and capabilities of these devices. Sensor nodes have been designed, developed, and calibrated to meet structural vibration monitoring and modal identification requirements. These components include a protocol for reliable command dissemination through the network and data collection, and improvements to software components for data pipelining, jitter control, and high-frequency sampling. Devices were tested in the lab using shaker excitation. Results demonstrate that MEMS-based accelerometers are a feasible solution to replace expensive piezo-based accelerometers. Deploying MEMS is a promising way to minimize sensor node energy consumption. Time and frequency domain analyses show that MEMS can correctly detect modal frequencies, which are useful parameters for damage detection. The acquired data from the test bed were used to examine the functioning of the network, data transmission, and data quality. The proposed architecture has been successfully deployed in a real case study to monitor the structural health of the Marcus Aurelius Exedra Hall within the Capitoline Museum of Rome. The performance robustness was demonstrated, and the results showed that the wired sensor network provides dense and accurate vibration data for continuous structural monitoring. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
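The record above identifies modal frequencies from accelerometer streams via time- and frequency-domain analysis. The following is a minimal, hypothetical illustration (not the authors' pipeline) of frequency-domain peak picking on a simulated acceleration signal using NumPy and SciPy; the sampling rate, signal, and threshold are invented for the example.

    import numpy as np
    from scipy.signal import welch, find_peaks

    fs = 200.0                     # assumed sampling rate (Hz), hypothetical
    t = np.arange(0, 60, 1 / fs)   # 60 s of synthetic acceleration data

    # Synthetic response: two "modes" at 3.2 Hz and 8.7 Hz plus sensor noise
    accel = (np.sin(2 * np.pi * 3.2 * t)
             + 0.5 * np.sin(2 * np.pi * 8.7 * t)
             + 0.1 * np.random.randn(t.size))

    # Welch periodogram: average FFTs over windowed segments to reduce variance
    freqs, psd = welch(accel, fs=fs, nperseg=4096)

    # Pick spectral peaks as candidate modal frequencies
    peaks, _ = find_peaks(psd, height=0.05 * psd.max())
    print("candidate modal frequencies (Hz):", np.round(freqs[peaks], 2))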
3. Cloud-Based Infrastructure and DevOps for Energy Fault Detection in Smart Buildings.
- Author
-
Horvath, Kaleb, Abid, Mohamed Riduan, Merino, Thomas, Zimmerman, Ryan, Peker, Yesem, and Khan, Shamim
- Subjects
ENERGY infrastructure, ANOMALY detection (Computer security), INTELLIGENT buildings, DATA warehousing, CLOUD computing, DATA pipelining, SHARED workspaces
- Abstract
We have designed a real-world smart building energy fault detection (SBFD) system on a cloud-based Databricks workspace, a high-performance computing (HPC) environment for big-data-intensive applications powered by Apache Spark. By avoiding a Smart Building Diagnostics as a Service approach and keeping a tightly centralized design, the rapid development and deployment of the cloud-based SBFD system was achieved within one calendar year. Thanks to Databricks' built-in scheduling interface, a continuous pipeline of real-time ingestion, integration, cleaning, and analytics workflows capable of energy consumption prediction and anomaly detection was implemented and deployed in the cloud. The system currently provides fault detection in the form of predictions and anomaly detection for 96 buildings on an active military installation. The system's various jobs all converge within 14 min on average. It facilitates the seamless interaction between our workspace and a cloud data lake storage provided for secure and automated initial ingestion of raw data provided by a third party via the Secure File Transfer Protocol (SFTP) and BLOB (Binary Large Objects) file system secure protocol drivers. With a powerful Python binding to the Apache Spark distributed computing framework, PySpark, these actions were coded into collaborative notebooks and chained into the aforementioned pipeline. The pipeline was successfully managed and configured throughout the lifetime of the project and is continuing to meet our needs in deployment. In this paper, we outline the general architecture and how it differs from previous smart building diagnostics initiatives, present details surrounding the underlying technology stack of our data pipeline, and enumerate some of the necessary configuration steps required to maintain and develop this big data analytics application in the cloud. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
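Records 3 and 10 describe a PySpark-based chain of ingestion, cleaning, and prediction jobs scheduled on Databricks. The sketch below shows one hypothetical stage of such a chain; the paths, column names, model, and anomaly threshold are all assumptions for illustration, not the authors' code.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.regression import LinearRegressionModel

    spark = SparkSession.builder.appName("sbfd-sketch").getOrCreate()

    # Ingest raw meter readings landed in cloud storage (hypothetical path)
    raw = spark.read.json("/mnt/datalake/raw/meter_readings/")

    # Clean: drop incomplete rows and keep plausible consumption values
    clean = (raw.dropna(subset=["building_id", "timestamp", "kwh"])
                .filter(F.col("kwh") >= 0))

    # Score with a previously trained model (hypothetical features and path)
    features = VectorAssembler(
        inputCols=["outdoor_temp", "hour_of_day"], outputCol="features"
    ).transform(clean)
    model = LinearRegressionModel.load("/mnt/models/energy_lr")
    scored = model.transform(features).withColumn(
        "anomaly", F.abs(F.col("kwh") - F.col("prediction")) > F.lit(3.0)
    )

    # Append results for downstream fault reporting (Delta format on Databricks)
    scored.write.mode("append").format("delta").save("/mnt/datalake/gold/faults")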
4. Software Engineering of Machine Learning Systems: Seeking to make machine learning more dependable.
- Author
-
Isbell, Charles, Littman, Michael L., and Norvig, Peter
- Subjects
MACHINE learning, SOFTWARE engineering, DATA pipelining, SOFTWARE engineers, DATA management, PERSONALLY identifiable information, DATA protection
- Abstract
The article discusses various aspects of the use of machine learning (ML) in the software engineering field, and it mentions ML system failures, the basic elements of software engineering, and efforts to construct dependable and safe systems using ML. Machine learning operations (MLOps) techniques and software engineers are mentioned, along with a call for organizations to utilize data pipelines to manage data and protect personally identifiable information.
- Published
- 2023
- Full Text
- View/download PDF
5. BioLegato: a programmable, object-oriented graphic user interface
- Author
-
Graham Alvare, Abiel Roche-Lima, and Brian Fristensky
- Subjects
Graphic user interface, User experience, Data pipelining, Sequencing, Genomics, Transcriptomics, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Background: Biologists are faced with an ever-changing array of complex software tools with steep learning curves, often run on High Performance Computing platforms. To resolve the tradeoff between analytical sophistication and usability, we have designed BioLegato, a programmable graphical user interface (GUI) for running external programs. Results: BioLegato can run any program or pipeline that can be launched as a command. BioLegato reads specifications for each tool from files written in PCD, a simple language for specifying GUI components that set parameters for calling external programs. Thus, adding new tools to BioLegato can be done without changing the BioLegato Java code itself. The process is as simple as copying an existing PCD file and modifying it for the new program, which is more like filling in a form than writing code. PCD thus facilitates rapid development of new applications using existing programs as building blocks, and getting them to work together seamlessly. Conclusion: BioLegato applies Object-Oriented concepts to the user experience by organizing applications based on discrete data types and the methods relevant to that data. PCD makes it easier for BioLegato applications to evolve with the succession of analytical tools for bioinformatics. BioLegato is applicable not only in biology, but in almost any field in which disparate software tools need to work as an integrated system.
- Published
- 2023
- Full Text
- View/download PDF
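BioLegato's PCD files declaratively map GUI parameters onto command-line invocations of external programs. PCD syntax is not reproduced here; the sketch below only illustrates the general declarative-spec-to-command idea in Python, with a made-up spec (the program and flags shown are an arbitrary example, not a PCD file).

    import subprocess

    # A declarative tool description: one entry per GUI widget (hypothetical
    # example, not PCD syntax). Each field contributes a flag to the command line.
    TOOL_SPEC = {
        "program": "blastn",
        "fields": [
            {"name": "query",  "flag": "-query",  "default": "input.fasta"},
            {"name": "evalue", "flag": "-evalue", "default": "1e-5"},
        ],
    }

    def build_command(spec, values):
        """Turn user-supplied field values into an argument list for the program."""
        cmd = [spec["program"]]
        for field in spec["fields"]:
            cmd += [field["flag"], str(values.get(field["name"], field["default"]))]
        return cmd

    cmd = build_command(TOOL_SPEC, {"evalue": "1e-10"})
    print(" ".join(cmd))               # blastn -query input.fasta -evalue 1e-10
    subprocess.run(cmd, check=False)   # launch the external tool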
6. BioLegato: a programmable, object-oriented graphic user interface.
- Author
-
Alvare, Graham, Roche-Lima, Abiel, and Fristensky, Brian
- Subjects
USER interfaces, GRAPHICAL user interfaces, HIGH performance computing, SOFTWARE development tools, COMPUTING platforms, OBJECT-oriented databases, SYNTHETIC biology
- Abstract
Background: Biologists are faced with an ever-changing array of complex software tools with steep learning curves, often run on High Performance Computing platforms. To resolve the tradeoff between analytical sophistication and usability, we have designed BioLegato, a programmable graphical user interface (GUI) for running external programs. Results: BioLegato can run any program or pipeline that can be launched as a command. BioLegato reads specifications for each tool from files written in PCD, a simple language for specifying GUI components that set parameters for calling external programs. Thus, adding new tools to BioLegato can be done without changing the BioLegato Java code itself. The process is as simple as copying an existing PCD file and modifying it for the new program, which is more like filling in a form than writing code. PCD thus facilitates rapid development of new applications using existing programs as building blocks, and getting them to work together seamlessly. Conclusion: BioLegato applies Object-Oriented concepts to the user experience by organizing applications based on discrete data types and the methods relevant to that data. PCD makes it easier for BioLegato applications to evolve with the succession of analytical tools for bioinformatics. BioLegato is applicable not only in biology, but in almost any field in which disparate software tools need to work as an integrated system. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
7. FPGA based Digital Filter Design for faster operations.
- Author
-
Ariunaa, Kh., Tudevdagva, U., and Hussai, M.
- Subjects
FLOWGRAPHS, DATA pipelining, FIELD programmable gate arrays
- Abstract
A reduced-complexity design of the IIR filter is discussed in this paper. The use of IIR filters has been increasing steadily in recent times, and they are used in real-time applications. The filter complexity is reduced to increase the usage of filters in FPGAs. In this method, the coefficients are expressed in the form of 1's and 0's. Multiplied delays are used to scale down the complexity. The signal flow graph plays a critical role in identifying the difficult path. The IIR filter is programmed on the basis of synchronous data flow and pipelining. Half-band IIR filters are used for high-performance applications. The design is implemented in an FPGA, which results in higher speed while decreasing both cost and power consumption. The entire process is implemented using the Verilog language, which reduces complexity and speeds up the whole process. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
8. Accelerating Dynamic Programming by P-Fold Pipeline Implementation on GPU.
- Author
-
Matsumae, Susumu
- Subjects
DYNAMIC programming, GRAPHICS processing units, DATA pipelining
- Abstract
In this paper, we show the effectiveness of pipeline implementations of Dynamic Programming (DP) on a Graphics Processing Unit (GPU). We deal with a simplified DP problem where each element of its solution table is calculated in order by semi-group operations among several already computed elements in the table. We implement the DP program on GPU in a pipeline fashion, i.e., we use GPU cores for supporting pipeline-stages so that several elements of the solution table are partially computed at one time. Further, to accelerate the pipeline implementation, we propose a p-fold pipeline technique, which enables a degree of parallelism larger than the number of pipeline-stages. [ABSTRACT FROM AUTHOR]
- Published
- 2023
9. PSReliP: an integrated pipeline for analysis and visualization of population structure and relatedness based on genome-wide genetic variant data.
- Author
-
Solovieva, Elena and Sakai, Hiroaki
- Subjects
GENETIC variation, SELECTION (Plant breeding), SINGLE nucleotide polymorphisms, DATA pipelining, GENOME-wide association studies, PIPELINE inspection, FALSE positive error
- Abstract
Background: Population structure and cryptic relatedness between individuals (samples) are two major factors affecting false positives in genome-wide association studies (GWAS). In addition, population stratification and genetic relatedness in genomic selection in animal and plant breeding can affect prediction accuracy. The methods commonly used for solving these problems are principal component analysis (to adjust for population stratification) and marker-based kinship estimates (to correct for the confounding effects of genetic relatedness). Currently, many tools and software are available that analyze genetic variation among individuals to determine population structure and genetic relationships. However, none of these tools or pipelines perform such analyses in a single workflow and visualize all the various results in a single interactive web application. Results: We developed PSReliP, a standalone, freely available pipeline for the analysis and visualization of population structure and relatedness between individuals in a user-specified genetic variant dataset. The analysis stage of PSReliP is responsible for executing all steps of data filtering and analysis and contains an ordered sequence of commands from PLINK, a whole-genome association analysis toolset, along with in-house shell scripts and Perl programs that support data pipelining. The visualization stage is provided by Shiny apps, an R-based interactive web application. In this study, we describe the characteristics and features of PSReliP and demonstrate how it can be applied to real genome-wide genetic variant data. Conclusions: The PSReliP pipeline allows users to quickly analyze genetic variants such as single nucleotide polymorphisms and small insertions or deletions at the genome level to estimate population structure and cryptic relatedness using PLINK software and to visualize the analysis results in interactive tables, plots, and charts using Shiny technology. The analysis and assessment of population stratification and genetic relatedness can aid in choosing an appropriate approach for the statistical analysis of GWAS data and predictions in genomic selection. The various outputs from PLINK can be used for further downstream analysis. The code and manual for PSReliP are available at https://github.com/solelena/PSReliP. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
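The PSReliP record above chains PLINK commands with helper scripts for filtering, population structure, and relatedness. As a rough, hypothetical illustration (assuming PLINK 1.9-style options and made-up file names, and not PSReliP's actual workflow), the steps might be scripted as follows:

    import subprocess

    def run(args):
        """Run one pipeline step and fail loudly if it exits non-zero."""
        print("running:", " ".join(args))
        subprocess.run(args, check=True)

    # 1. Filter variants by missingness and minor allele frequency
    run(["plink", "--vcf", "variants.vcf.gz", "--maf", "0.05", "--geno", "0.1",
         "--make-bed", "--out", "filtered"])

    # 2. Principal component analysis for population structure
    run(["plink", "--bfile", "filtered", "--pca", "10", "--out", "pca"])

    # 3. Relatedness matrix between samples
    run(["plink", "--bfile", "filtered", "--make-rel", "square", "--out", "rel"])
    # Downstream, the .eigenvec and .rel outputs would feed a visualization stage.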
10. Cloud-Based Infrastructure and DevOps for Energy Fault Detection in Smart Buildings
- Author
-
Kaleb Horvath, Mohamed Riduan Abid, Thomas Merino, Ryan Zimmerman, Yesem Peker, and Shamim Khan
- Subjects
data pipelining, big data analytics, smart buildings, energy efficiency, Databricks, ADLS, Electronic computers. Computer science, QA75.5-76.95
- Abstract
We have designed a real-world smart building energy fault detection (SBFD) system on a cloud-based Databricks workspace, a high-performance computing (HPC) environment for big-data-intensive applications powered by Apache Spark. By avoiding a Smart Building Diagnostics as a Service approach and keeping a tightly centralized design, the rapid development and deployment of the cloud-based SBFD system was achieved within one calendar year. Thanks to Databricks’ built-in scheduling interface, a continuous pipeline of real-time ingestion, integration, cleaning, and analytics workflows capable of energy consumption prediction and anomaly detection was implemented and deployed in the cloud. The system currently provides fault detection in the form of predictions and anomaly detection for 96 buildings on an active military installation. The system’s various jobs all converge within 14 min on average. It facilitates the seamless interaction between our workspace and a cloud data lake storage provided for secure and automated initial ingestion of raw data provided by a third party via the Secure File Transfer Protocol (SFTP) and BLOB (Binary Large Objects) file system secure protocol drivers. With a powerful Python binding to the Apache Spark distributed computing framework, PySpark, these actions were coded into collaborative notebooks and chained into the aforementioned pipeline. The pipeline was successfully managed and configured throughout the lifetime of the project and is continuing to meet our needs in deployment. In this paper, we outline the general architecture and how it differs from previous smart building diagnostics initiatives, present details surrounding the underlying technology stack of our data pipeline, and enumerate some of the necessary configuration steps required to maintain and develop this big data analytics application in the cloud.
- Published
- 2024
- Full Text
- View/download PDF
11. OpenFab: A Programmable Pipeline for Multimaterial Fabrication.
- Author
-
Vidimče, Kiril, Szu-Po Wang, Ragan-Kelley, Jonathan, and Matusik, Wojciech
- Subjects
PRINTING software, THREE-dimensional printing, DATA pipelining, FABRICATION (Manufacturing), MATERIALS
- Abstract
3D printing hardware is rapidly scaling up to output continuous mixtures of multiple materials at increasing resolution over ever larger print volumes. This poses an enormous computational challenge: large high-resolution prints comprise trillions of voxels and petabytes of data, and modeling and describing the input with spatially varying material mixtures at this scale are simply challenging. Existing 3D printing software is insufficient; in particular, most software is designed to support only a few million primitives, with discrete material choices per object. We present OpenFab, a programmable pipeline for synthesis of multimaterial 3D printed objects that is inspired by RenderMan and modern GPU pipelines. The pipeline supports procedural evaluation of geometric detail and material composition, using shader-like fablets, allowing models to be specified easily and efficiently. The pipeline is implemented in a streaming fashion: only a small fraction of the final volume is stored in memory, and output is fed to the printer with a little startup delay. We demonstrate it on a variety of multimaterial objects. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
12. Fast hardware-aware matrix-free algorithms for higher-order finite-element discretized matrix multivector products on distributed systems.
- Author
-
Panigrahi, Gourab, Kodali, Nikhil, Panda, Debashis, and Motamarri, Phani
- Subjects
MATRIX multiplications, ALGORITHMS, SPARSE matrices, DATA pipelining, GRAPHICS processing units, DISTRIBUTED algorithms
- Abstract
Recent hardware-aware matrix-free algorithms for higher-order finite-element (FE) discretized matrix-vector multiplications reduce floating point operations and data access costs compared to traditional sparse matrix approaches. In this work, we address a critical gap in existing matrix-free implementations which are not well suited for the action of FE discretized matrices on a very large number of vectors. In particular, we propose efficient matrix-free algorithms for evaluating FE discretized matrix-multivector products on both multi-node CPU and GPU architectures. To this end, we employ batched evaluation strategies, with the batchsize tailored to underlying hardware architectures, leading to better data locality and enabling further parallelization. On CPUs, we utilize even-odd decomposition, SIMD vectorization, and overlapping computation and communication strategies. On GPUs, we develop strategies to overlap compute with data movement for achieving efficient pipelining and reduced data accesses through the use of GPU-shared memory, constant memory and kernel fusion. Our implementation outperforms the baselines for Helmholtz operator action on 1024 vectors, achieving up to 1.4x improvement on one CPU node and up to 2.8x on one GPU node, while reaching up to 4.4x and 1.5x improvement on multiple nodes for CPUs (3072 cores) and GPUs (24 GPUs), respectively. We further benchmark the performance of the proposed implementation for solving a model eigenvalue problem for 1024 smallest eigenvalue-eigenvector pairs by employing the Chebyshev Filtered Subspace Iteration method, achieving up to 1.5x improvement on one CPU node and up to 2.2x on one GPU node while reaching up to 3.0x and 1.4x improvement on multi-node CPUs (3072 cores) and GPUs (24 GPUs), respectively.
Highlights:
• Matrix-free algorithms for FE matrix-multivector products with thousands of vectors on multi-node CPU and GPU architectures.
• Hardware-tuned batched evaluation strategies for improved data locality.
• Architecture-specific implementation strategies to evaluate the tensor contractions.
• Even-Odd decomposition, SIMD vectorization, overlapping compute and communication on CPUs.
• Shared memory, constant memory, kernel fusion, overlap compute with data movement, and efficient pipelining on GPUs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Improving Pipelining Tools for Pre-processing Data.
- Author
-
Novo-Lourés, Maria, Lage, Yeray, Pavón, Reyes, Laza, Rosalía, Ruano-Ordás, David, and Méndez, José Ramón
- Subjects
DATA pipelining, DATA mining, ELECTRONIC data processing, PROGRAMMING languages, DEBUGGING, PIPELINE inspection, PROCESS optimization, BIG data
- Abstract
The last several years have seen the emergence of data mining and its transformation into a powerful tool that adds value to business and research. Data mining makes it possible to explore and find unseen connections between variables and facts observed in different domains, helping us to better understand reality. The programming methods and frameworks used to analyse data have evolved over time. Currently, the use of pipelining schemes is the most reliable way of analysing data and, due to this, several important companies currently offer this kind of service. Moreover, several frameworks compatible with different programming languages are available for the development of computational pipelines, and many research studies have addressed the optimization of data processing speed. However, as this study shows, the presence of early error detection techniques and developer support mechanisms is very limited in these frameworks. In this context, this study introduces different improvements, such as the design of different types of constraints for the early detection of errors, the creation of functions to facilitate debugging of concrete tasks included in a pipeline, the invalidation of erroneous instances and/or the introduction of the burst-processing scheme. Adding these functionalities, we developed Big Data Pipelining for Java (BDP4J, https://github.com/sing-group/bdp4j), a fully functional new pipelining framework that shows the potential of these features. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
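The record above argues for early error detection and instance invalidation inside pipelining frameworks. The sketch below is a generic Python illustration of that idea, with constraint checks before each task and invalid instances dropped rather than aborting the whole run; it does not reproduce the BDP4J API, and all names are hypothetical.

    class Task:
        """A pipeline step with a precondition checked before doing any work."""
        def __init__(self, name, fn, precondition):
            self.name, self.fn, self.precondition = name, fn, precondition

        def __call__(self, instance):
            if not self.precondition(instance):
                raise ValueError(f"{self.name}: precondition failed for {instance!r}")
            return self.fn(instance)

    def run_pipeline(tasks, instances):
        """Run instances through tasks; invalidate failures instead of aborting."""
        valid = []
        for inst in instances:
            try:
                for task in tasks:
                    inst = task(inst)
                valid.append(inst)
            except ValueError as err:
                print("invalidated:", err)   # early, per-instance error report
        return valid

    tasks = [
        Task("strip",  str.strip, lambda x: isinstance(x, str)),
        Task("to_int", int,       lambda x: x.isdigit()),
    ]
    print(run_pipeline(tasks, [" 42 ", "oops", "7"]))   # -> [42, 7]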
14. Using Apache Beam to Pipeline and Process Data.
- Author
-
Kumar, Gaurav
- Subjects
DATA pipelining, OPEN source intelligence, DATA analysis, DATA quality, HUMAN beings
- Abstract
The article discusses the importance of data pipelines for integrating and processing data effectively. It explains the stages of data pipelining and how it is essential for real-time analytics to reduce human intervention. It mentions the importance of data quality. It also notes that Apache Beam, which facilitates data processing pipelines and real-time data filtering, is a free, open-source programming model.
- Published
- 2022
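Apache Beam (summarized in the record above) expresses a data pipeline as a chain of transforms that the chosen runner executes in batch or streaming mode. A minimal sketch with the Beam Python SDK, assuming hypothetical input and output paths and a simple data-quality filter:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def parse_record(line):
        """Split a CSV line into (sensor_id, value); malformed lines yield nothing."""
        parts = line.split(",")
        if len(parts) == 2:
            try:
                yield parts[0], float(parts[1])
            except ValueError:
                pass  # basic data-quality filtering, as the article emphasizes

    with beam.Pipeline(options=PipelineOptions()) as p:
        (p
         | "Read"    >> beam.io.ReadFromText("readings.csv")     # hypothetical input
         | "Parse"   >> beam.FlatMap(parse_record)
         | "Average" >> beam.combiners.Mean.PerKey()
         | "Format"  >> beam.MapTuple(lambda k, v: f"{k},{v:.2f}")
         | "Write"   >> beam.io.WriteToText("averages"))         # output prefix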
15. Data Mesh: Concepts and Principles of a Paradigm Shift in Data Architectures.
- Author
-
Machado, Inês Araújo, Costa, Carlos, and Santos, Maribel Yasmina
- Subjects
DATA warehousing, DATA pipelining, BIG data, LAKES
- Abstract
Inherent to the growing use of the most varied forms of software (e.g., social applications), there is the creation and storage of data that, due to its characteristics (volume, variety, and velocity), make the concept of Big Data emerge. Big Data Warehouses and Data Lakes are concepts already well established and implemented by several organizations, to serve their decision-making needs. After analyzing the various problems demonstrated by those monolithic architectures, it is possible to conclude about the need for a paradigm shift that will make organizations truly data-oriented. In this new paradigm, data is seen as the main concern of the organization, and the pipelining tools and the Data Lake itself are seen as a secondary concern. Thus, the Data Mesh consists in the implementation of an architecture where data is intentionally distributed among several Mesh nodes, in such a way that there is no chaos or data silos, since there are centralized governance strategies and the guarantee that the core principles are shared throughout the Mesh nodes. This paper presents the motivation for the appearance of the Data Mesh paradigm, its features, and approaches for its implementation. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
16. Cloud Pipe Dream.
- Author
-
Cai, Kenrick and Buckingham, John
- Subjects
HIGH technology industries, DATA pipelining, BIG data, DATA analytics, VENTURE capital
- Abstract
The article focuses on Fivetran, a technology company co-founded by George Fraser and Taylor Brown that gathers data for companies and funnels it to big data analytic firms. It attributes the valuation growth of Fivetran to venture capital from blue-chip technology investment firms, including Iconiq Capital and D1 Capital Partners. Also noted is the market lead of Fivetran in data pipelines due to its ease of use.
- Published
- 2022
17. The bdpar Package: Big Data Pipelining Architecture for R.
- Author
-
Ferreiro-Díaz, Miguel, Cotos-Yáñez, Tomás R., Méndez, José R., and Ruano-Ordás, David
- Subjects
DATA pipelining, BIG data
- Abstract
In the last years, big data has become a useful paradigm for taking advantage of multiple sources to find relevant knowledge in real domains (such as the design of personalized marketing campaigns or helping to palliate the effects of several fatal diseases). Big data programming tools and methods have evolved over time from a MapReduce to a pipeline-based archetype. Concretely the use of pipelining schemes has become the most reliable way of processing and analyzing large amounts of data. To this end, this work introduces bdpar, a new highly customizable pipeline-based framework (using the OOP paradigm provided by R6 package) able to execute multiple preprocessing tasks over heterogeneous data sources. Moreover, to increase the flexibility and performance, bdpar provides helpful features such as (i) the definition of a novel object-based pipe operator (%>|%), (ii) the ability to easily design and deploy new (and customized) input data parsers, tasks, and pipelines, (iii) only-once execution which avoids the execution of previously processed information (instances), guaranteeing that only new both input data and pipelines are executed, (iv) the capability to perform serial or parallel operations according to the user needs, (v) the inclusion of a debugging mechanism which allows users to check the status of each instance (and find possible errors) throughout the process. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
18. An Adaptable Memory System Using Reconfigurable Row DRAM To Improve Performance Of Multi Core For Big Data.
- Author
-
Mekhiel, Nagi
- Subjects
BIG data, DATA pipelining
- Abstract
Multi-core based systems access DRAM using multiple different addresses that could map to different rows in the same bank at the same time, causing row conflicts forcing them to wait to activate one row at a time. We present an adaptable memory using reconfigurable row DRAM that divides rows into many segments and uses special latches to allow many accesses that map to different physical rows to form one adaptable logical row accessed by multi-core as one physical row. The adaptable row accesses different rows in a pipeline fashion by overlapping the long DRAM access time between the different accesses. The results show that the adaptable memory system improves the scalability of multi-core by up to 300% and could gain more from improving processor speed and global cache miss rate and memory-processor bus bandwidth. [ABSTRACT FROM AUTHOR]
- Published
- 2021
19. Advances in Modern Information Technologies for Data Analysis in CRYO-EM and XFEL Experiments.
- Author
-
Bobkov, S. A., Teslyuk, A. B., Baymukhametov, T. N., Pichkur, E. B., Chesnokov, Yu. M., Assalauova, D., Poyda, A. A., Novikov, A. M., Zolotarev, S. I., Ikonnikova, K. A., Velikhov, V. E., Vartanyants, I. A., Vasiliev, A. L., and Ilyin, V. A.
- Subjects
INFORMATION technology, DATA pipelining, DATA analysis, X-ray lasers, MIDDLEWARE, ELECTRONIC data processing
- Abstract
A new approach to the organization of data pipelining in cryo-electron microscopy (Cryo-EM) and X-ray free-electron laser (XFEL) experiments is presented. This approach, based on the progress in information technologies (IT) due to the development of containerization techniques, allows one to separate user's work at the application level from the developments of IT experts at the system and middleware levels. A user must only perform two simple operations: pack application packages in containers and write a workflow with data processing logic in a standard format. Some examples of containerized workflows for Cryo-EM and XFEL experiments on study of the spatial structure of single biological nanoobjects (viruses, macromolecules, etc.) are discussed. Examples of program codes for installing applied packages in Docker containers and examples of applied workflows written in the high-level language CWL are presented at the site of the project. The examples have comments, which may help an IT-inexperienced researcher to gain an idea of how to organize Docker containers and form CWL workflows for Cryo-EM and XFEL data pipelining. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
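The record above packages applied Cryo-EM/XFEL tools in Docker containers and chains them with CWL workflows. CWL itself is not shown here; the sketch below only illustrates launching one containerized processing step from Python with the Docker SDK, using an entirely hypothetical image, command, and data path.

    import docker

    client = docker.from_env()

    # Run one containerized processing step on a mounted data directory.
    # Image name, command, and paths are hypothetical placeholders.
    logs = client.containers.run(
        image="example/motioncor:latest",
        command="process --in /data/raw --out /data/aligned",
        volumes={"/scratch/experiment42": {"bind": "/data", "mode": "rw"}},
        remove=True,        # clean up the container once the step finishes
    )
    print(logs.decode())    # the step's stdout, useful for workflow-level logging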
20. Making the Web Faster with HTTP 2.0.
- Author
-
GRIGORIK, ILYA
- Subjects
HTTP (Computer network protocol), INTERNET, COMPUTER network protocols, MOBILE apps, INTERNET traffic, DATA pipelining
- Abstract
The article discusses various aspects of HyperText Transfer Protocol (HTTP) 2.0, a new version of the Internet application protocol as of December 2013. Background is presented on the history of the original HTTP including its success and the issues that have emerged recently. It highlights the performance challenges that come with modern Internet applications including the increase in online traffic caused mainly by mobile networks. The performance limitations of HTTP 1.1 are also highlighted including the failure of its request pipelining feature due to lack of support.
- Published
- 2013
- Full Text
- View/download PDF
21. Bio-electronic aggregates on Neon-Paleolitikos strata.
- Author
-
Sier, André
- Subjects
DATA pipelining, ELECTROPHYSIOLOGY, INFORMATION organization, COMPUTER network protocols, COMPUTER interfaces, MICROPROCESSORS, VIRTUAL networks
- Abstract
Electronic machinic phenomena yield fascinating links with biological processes. Either in the macro-micro-structure of binary encoded information – bytes on media – to the processual flow programs execute on hardware while operating it. Observing micro-electronic worlds akin to living entities: electronic voltages running throughout electronic architectures pipelining data to memory registers; operating systems executing programs on electronic substrates; data flows taking place in machines and in communications protocols within networks. Static art-sci constructs explore and visualize these observations as 2D drawings (Neon Paleolitikos Drawings, 2017–present) or 3D sculptures (Binary and Biological Sculpture Series, 2018–present), creatively exposing their inherent rhythmic organization of information, while dynamic installations (Phoenix.Wolfanddotcom.info, Wolfanddotcom, Half-Plant, 2017, Ant Ennae Labyrinths, 2019–present) propose immersive interference mechanisms that attempt user entanglement in non-human environments. Seven aesthetic case examples are introduced and explored, observing and seeking resonances between micro-granular electronic, biological and hybrid data as source synthesis. This research proposes a look at bio-electronic aggregates on Neon Paleolitikos strata. After the Anthropocene, Neon Paleolitikos is an imaginary epoch dating since the decline of mankind until the zenith of bio-electronic life forms: operational symbioses combined among ruins of silica, transistors, algorithms, cells, plants, animals and electricity. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
22. Data as an essential facility in European law: how to define the "target" market and divert the data pipeline?
- Author
-
Bruc, Édouard
- Subjects
UBIQUITOUS computing, ARTIFICIAL intelligence, INTERNET of things, DATA pipelining, DISTRIBUTED computing
- Abstract
A new economic test called the holistic inter-dynamic analysis is proposed to tackle both sides of the multisided market simultaneously. The dominance issue solved, the risk of 102 TFEU abuse regarding data was already highlighted in the 1996 Database Directive. On that matter, this article assesses each legal condition of a "refusal to supply" access. Apropos the indispensability condition, data's ubiquitous nature may be dispelled by an analysis of the context in which it evolves or its link with AI or the IoT. Concerning its replicability prong, data's feature, feedback loops, network effects, switching costs or economies of scale/scope are conducive to super-dominance. They altogether can create an insurmountable bottleneck. Regarding innovation, the author carefully analyses the "limitation of technical development" condition either on an economic or a legal standpoint. Finally, regarding privacy law, when granting access, a coherent approach appears feasible through a proportionality test. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
23. Outbreak analytics: a developing data science for informing the response to emerging pathogens.
- Author
-
Polonsky, Jonathan A., Baidjoe, Amrish, Kamvar, Zhian N., Cori, Anne, Durski, Kara, Edmunds, W. John, Eggo, Rosalind M., Funk, Sebastian, Kaiser, Laurent, Keating, Patrick, le Polain de Waroux, Olivier, Marks, Michael, Moraga, Paula, Morgan, Oliver, Nouvellet, Pierre, Ratnayake, Ruwan, Roberts, Chrissy H., Whitworth, Jimmy, and Jombart, Thibaut
- Subjects
PUBLIC health, DATA visualization, DATA pipelining, ACQUISITION of data, DATA science
- Abstract
Despite continued efforts to improve health systems worldwide, emerging pathogen epidemics remain a major public health concern. Effective response to such outbreaks relies on timely intervention, ideally informed by all available sources of data. The collection, visualization and analysis of outbreak data are becoming increasingly complex, owing to the diversity in types of data, questions and available methods to address them. Recent advances have led to the rise of outbreak analytics, an emerging data science focused on the technological and methodological aspects of the outbreak data pipeline, from collection to analysis, modelling and reporting to inform outbreak response. In this article, we assess the current state of the field. After laying out the context of outbreak response, we critically review the most common analytics components, their interdependencies, data requirements and the type of information they can provide to inform operations in real time. We discuss some challenges and opportunities and conclude on the potential role of outbreak analytics for improving our understanding of, and response to outbreaks of emerging pathogens. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
24. Scaling Up Modulo Scheduling for High-Level Synthesis.
- Author
-
de Souza Rosa, Leandro, Bouganis, Christos-Savvas, and Bonato, Vanderlei
- Subjects
COMPUTER scheduling, FIELD programmable gate arrays, GENETIC algorithms, MOTHERBOARDS, DATA pipelining
- Abstract
High-level synthesis (HLS) tools have been increasingly used within the hardware design community to bridge the gap between productivity and the need to design large and complex systems. When targeting heterogeneous systems, where the CPU and the field-programmable gate array (FPGA) fabric are both available to perform computations, a design space exploration (DSE) is usually carried out for deciding which parts of the initial code should be mapped to the FPGA fabric such that the overall system's performance is enhanced by accelerating its computation via dedicated processors. As the targeted systems become more complex and larger, leading to a large DSE, a fast estimate of the possible acceleration that can be obtained by mapping certain functionality into the FPGA fabric is of paramount importance. Loop pipelining, which is responsible for the majority of HLS compilation time, is a key optimization toward achieving high-performance acceleration kernels. A new modulo scheduling algorithm is proposed, which reformulates the classical modulo scheduling problem and leads to a reduced number of integer linear problems solved, resulting in large computational savings. Moreover, the proposed approach has a controlled tradeoff between solution quality and computation time. Results show the scalability is improved efficiently from quadratic, for the state-of-the-art method, to linear, for the proposed approach, while the optimized loop suffers a 1% (geomean) increment in the total number of cycles. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
25. Hardware Optimizations and Analysis for the WG-16 Cipher with Tower Field Arithmetic.
- Author
-
Zidaric, Nusa, Aagaard, Mark, and Gong, Guang
- Subjects
STREAM ciphers, COMPUTER input-output equipment, POLYNOMIALS, FIELD programmable gate arrays, CRYPTOGRAPHY, DATA pipelining
- Abstract
This paper explores tower field constructions and hardware optimizations for the WG-16 stream cipher. The constructions $\mathbb{F}_{(((2^2)^2)^2)^2}$ and $\mathbb{F}_{(2^4)^4}$ were chosen because their small subfields enable high speed arithmetic implementations and their regularity provides flexibility in pipeline granularity. A design methodology is presented where the tower field constructions guide how to proceed systematically from algebraic optimizations, through initial hardware implementation, selection of submodules, pipelining, and finally detailed hardware optimizations to increase clock speed. The highest frequency WG(16, 32) keystream generator, obtained for the 65 nm ASIC library, reached a clock speed of 2.44 GHz at 26.3 kGE, and the smallest area keystream generator achieved a clock speed of 0.33 GHz at 9.9 kGE. The highest frequency FPGA implementation on a Xilinx Spartan 6 reached a clock speed of 256 MHz using 631 slices. In addition, the paper demonstrates that LFSR feedback polynomials can be optimized to increase security without hurting performance, and retiming optimizations can be used to increase clock speed without increasing area. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
26. Feedforward-Cutset-Free Pipelined Multiply–Accumulate Unit for the Machine Learning Accelerator.
- Author
-
Ryu, Sungju, Park, Naebeom, and Kim, Jae-Joon
- Subjects
MACHINE learning, DATA pipelining
- Abstract
Multiply–accumulate (MAC) computations account for a large part of machine learning accelerator operations. The pipelined structure is usually adopted to improve the performance by reducing the length of critical paths. An increase in the number of flip-flops due to pipelining, however, generally results in significant area and power increase. A large number of flip-flops are often required to meet the feedforward-cutset rule. Based on the observation that this rule can be relaxed in machine learning applications, we propose a pipelining method that eliminates some of the flip-flops selectively. The simulation results show that the proposed MAC unit achieved a 20% energy saving and a 20% area reduction compared with the conventional pipelined MAC. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
27. Triggered-Issuance and Triggered-Execution: A Control Paradigm to Minimize Pipeline Stalls in Distributed Controlled Coarse-Grained Reconfigurable Arrays.
- Author
-
Lu, Yanan, Liu, Leibo, Deng, Yangdong, Weng, Jian, Yin, Shouyi, Shi, Yiyu, and Wei, Shaojun
- Subjects
DATA pipelining, ARRAY processing, COMPUTER architecture, DISTRIBUTED databases
- Abstract
Distributed controlled coarse-grained reconfigurable arrays (CGRAs) enable efficient execution of irregular control flows by reconciling divergence in the processing elements (PEs). To further improve performance by better exploiting spatial parallelism, the triggered instruction architecture (TIA) eliminates the program counter and branch instructions by converting control flows into predicate dependencies as triggers. However, pipeline stalls, which occur in pipelines composed of both intra and inter-PEs, remain a major obstacle to the overall performance. In fact, the stalls in distributed controlled CGRAs pose a unique problem that is difficult to resolve by previous techniques. This work presents a triggered-issuance and triggered-execution (TITE) paradigm in which the issuance and execution of instructions are separately triggered to further relax the predicate dependencies in TIA. In this paradigm, instructions are paired as dual instructions to eliminate stalls caused by control divergence. Tags that identify the data transmitted between PEs are forwarded for acceleration. As a result, pipeline stalls of both intra- and inter-PEs can be significantly minimized. Experiments show that TITE improves performance by 21 percent, energy efficiency by 17 percent, and area efficiency by 12 percent compared with a baseline TIA. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
28. Unifying Fixed Code Mapping, Communication, Synchronization and Scheduling Algorithms for Efficient and Scalable Loop Pipelining.
- Author
-
Mastoras, Aristeidis and Gross, Thomas R.
- Subjects
COMPUTER scheduling, DATA pipelining, DIGITAL mapping, SYNCHRONIZATION software, COMMUNICATIONS software
- Abstract
Pipelining allows the execution of loop iterations with cross-iteration dependences to overlap in time, provided that the loop body is partitioned into stages such that the data dependences are not violated. Then, the stages are mapped onto threads and communication and synchronization between stages is typically achieved using queues. Pipelining techniques that rely on static scheduling perform poorly for load-imbalanced loops. Moreover, previous research efforts that achieve load-balancing are restricted to work-stealing and imply high overhead for fine-grained loops. In this article, we present URTS, a unified runtime system with compiler support that provides a lightweight dynamic scheduler by combining mapping, communication and synchronization algorithms with a suitable data structure and an efficient ticket mechanism. Particularly, URTS shows that it is possible to combine the efficiency of static scheduling with the load-imbalance tolerance of work-stealing by using a unified design that exploits the properties of a novel data structure. The evaluation on 8- and 32-core machines shows that URTS implies low overhead, of the same order as a static scheduler, for a set of benchmarks chosen from widely-used collections. URTS is a scalable solution that performs efficient dynamic scheduling for fine-grained loops, i.e., a class of interesting loops that is poorly handled by the state-of-the-art due to high overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
29. Polyhedral-Based Dynamic Loop Pipelining for High-Level Synthesis.
- Author
-
Liu, Junyi, Wickerson, John, Bayliss, Samuel, and Constantinides, George A.
- Subjects
DATA pipelining, FIELD programmable analog arrays, ADAPTIVE computing systems, HIGH level synthesis (Electronic design), PARALLEL processing
- Abstract
Loop pipelining is one of the most important optimization methods in high-level synthesis (HLS) for increasing loop parallelism. There has been considerable work on improving loop pipelining, which mainly focuses on optimizing static operation scheduling and parallel memory accesses. Nonetheless, when loops contain complex memory dependencies, current techniques cannot generate high performance pipelines. In this paper, we extend the capability of loop pipelining in HLS to handle loops with uncertain dependencies (i.e., parameterized by an undetermined variable) and/or nonuniform dependencies (i.e., varying between loop iterations). Our optimization allows a pipeline to be statically scheduled without the aforementioned memory dependencies, but an associated controller will change the execution speed of loop iterations at runtime. This allows the augmented pipeline to process each loop iteration as fast as possible without violating memory dependencies. We use a parametric polyhedral analysis to generate the control logic for when to safely run all loop iterations in the pipeline and when to break the pipeline execution to resolve memory conflicts. Our techniques have been prototyped in an automated source-to-source code transformation framework, with Xilinx Vivado HLS, a leading HLS tool, as the RTL generation backend. Over a suite of benchmarks, experiments show that our optimization can implement optimized pipelines at almost the same clock speed as without our transformations, running approximately 3.7×–10× faster, with a reasonable resource overhead. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
30. Data Packet Processing Model based on Multi-Core Architecture.
- Author
-
Xian Zhang, Dong Yin, Taiguo Qu, Jia Liu, and Yiwen Liu
- Subjects
DATA packeting, MULTICORE processors, DATA transmission systems, DATA pipelining, COMPUTER architecture, PARALLEL processing
- Abstract
According to the characteristics of pipeline structure and multi-core processor structure for packet processing in network detection applications, the horizontal-based parallel architecture model and tree-based parallel architecture model are proposed for packet processing of Snort application. The principle of a tree-based parallel architecture model is to use pipelining and flow-pinning technology to design a processor that is specifically used to capture data packets, and other processors are responsible for other stages of parallel processing of the data packets. The experimental comparison and analysis show that the tree-based parallel architecture model has higher performance on the second-level cache hit ratio, throughput, CPU utilization, and inter-core load balancing compared to the horizontal-based parallel architecture model for packet processing of Snort application. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
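The record above dedicates one core to packet capture and fans packets out to worker cores, pipeline-style. The sketch below mimics that split with Python multiprocessing queues and simulated packets; it is a conceptual illustration only, not Snort's architecture and not real capture code.

    import multiprocessing as mp

    N_PACKETS, N_WORKERS = 20, 4

    def capture(queue):
        """Dedicated 'capture' process: here it simply fabricates packets."""
        for i in range(N_PACKETS):
            queue.put({"id": i, "payload": f"pkt-{i}"})
        for _ in range(N_WORKERS):            # one stop sentinel per worker
            queue.put(None)

    def worker(queue, results):
        """Worker process: 'inspects' packets pulled from the shared queue."""
        while (pkt := queue.get()) is not None:
            results.put((pkt["id"], len(pkt["payload"])))  # stand-in for inspection

    if __name__ == "__main__":
        q, out = mp.Queue(), mp.Queue()
        procs = [mp.Process(target=capture, args=(q,))]
        procs += [mp.Process(target=worker, args=(q, out)) for _ in range(N_WORKERS)]
        for p in procs:
            p.start()
        results = [out.get() for _ in range(N_PACKETS)]    # drain before joining
        for p in procs:
            p.join()
        print(sorted(results))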
31. P‐57: A Local Histogram Framework for Contrast Enhancement.
- Author
-
Zhou, Xue-Bing, Wen, Yi-Chien, Jin, Yu-Feng, Syu, Shen-Sian, and Jou, Ming-Jong
- Subjects
LIQUID crystal displays, CONTRAST effect, DATA pipelining
- Abstract
This paper proposes an implementation framework based on local contrast enhancement (LCE). The framework, implemented on Xilinx Kintex‐7 FPGA platform, is lighted on the 55″ Ultra‐High Definition (UHD) LCD panel and shows the result of image contrast enhancement. In accordance with the hardware design concept, the implementation framework adopts the module partition, hardware parallel processing structure and pipeline processing architecture. The hardware implementation fully achieves the real‐time requirements. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
32. SIMD vectorization for the Lennard-Jones potential with AVX2 and AVX-512 instructions.
- Author
-
Watanabe, Hiroshi and Nakagawa, Koh M.
- Subjects
SIMD (Computer architecture), NETWORK performance, COMPUTER storage devices, DATA pipelining, CENTRAL processing units, MOLECULAR dynamics
- Abstract
This work describes the SIMD vectorization of the force calculation of the Lennard-Jones potential with Intel AVX2 and AVX-512 instruction sets. Since the force-calculation kernel of the molecular dynamics method involves indirect access to memory, the data layout is one of the most important factors in vectorization. We find that the Array of Structures (AoS) with padding exhibits better performance than Structure of Arrays (SoA) with appropriate vectorization and optimizations. In particular, AoS with 512-bit width exhibits the best performance among the architectures. While the difference in performance between AoS and SoA is significant for the vectorization with AVX2, that with AVX-512 is minor. The effect of other optimization techniques, such as software pipelining together with vectorization, is also discussed. We present results for benchmarks on three CPU architectures: Intel Haswell (HSW), Knights Landing (KNL), and Skylake (SKL). The performance gains by vectorization are about 42% on HSW compared with the code optimized without vectorization. On KNL, the hand-vectorized codes exhibit 34% better performance than the codes vectorized automatically by the Intel compiler. On SKL, the code vectorized with AVX2 exhibits slightly better performance than that vectorized with AVX-512. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
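The record above finds that the data layout (AoS with padding vs. SoA) drives SIMD performance of the force kernel. The NumPy sketch below only illustrates what the two layouts look like for particle coordinates; it is not the paper's benchmark code, and the padding field is a generic example.

    import numpy as np

    n = 8  # number of particles (toy size)

    # Structure of Arrays (SoA): one contiguous array per coordinate
    soa = {"x": np.zeros(n), "y": np.zeros(n), "z": np.zeros(n)}

    # Array of Structures (AoS) with padding: each particle is one 4-field record,
    # so a particle occupies 32 bytes and maps cleanly onto wide SIMD registers
    aos = np.zeros(n, dtype=[("x", "f8"), ("y", "f8"), ("z", "f8"), ("pad", "f8")])

    # The same per-particle update written against both layouts
    soa["x"] += 0.1 * soa["y"]      # SoA: operate column-wise
    aos["x"] += 0.1 * aos["y"]      # AoS: operate on one field of the records

    print(aos.dtype.itemsize, "bytes per particle in the padded AoS layout")  # 32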
33. MICROPIPELINES.
- Author
-
Sutherland, Ivan E.
- Subjects
DATA pipelining, MICROPROCESSORS, REAL-time computing
- Abstract
Describes the use of pipeline processors as a paradigm for very high speed computers. Capability of pipeline processors to provide high speed; Types of pipeline processes; Features and functions; Conceptual frameworks for micropipelines.
- Published
- 1989
- Full Text
- View/download PDF
34. A Variable-Clock-Cycle-Path VLSI Design of Binary Arithmetic Decoder for H.265/HEVC.
- Author
-
Zhou, Jinjia, Zhou, Dajiang, Zhang, Shuping, Kimura, Shinji, and Goto, Satoshi
- Subjects
DESIGN & construction of very large scale circuit integration, DECODERS (Electronics), VIDEO coding, DATA pipelining, ENTROPY
- Abstract
The next-generation 8K ultra-high-definition video format involves an extremely high bit rate, which imposes a high throughput requirement on the entropy decoder component of a video decoder. Context adaptive binary arithmetic coding (CABAC) is the entropy coding tool in the latest video coding standards including H.265/High Efficiency Video Coding and H.264/Advanced Video Coding. Due to critical data dependencies at the algorithm level, a CABAC decoder is difficult to be accelerated by simply leveraging parallelism and pipelining. This letter presents a new very-large-scale integration arithmetic decoder, which is the most critical bottleneck in CABAC decoding. Our design features a variable-clock-cycle-path architecture that exploits the differences in critical path delay and in probability of occurrence between various types of binary symbols (bins). The proposed design also incorporates a novel data-forwarding technique (rLPS forwarding) and a fast path-selection technique (coarse bin type decision), and is enhanced with the capability of processing additional bypass bins. As a result, its maximum throughput achieves 1010 Mbins/s in 90-nm CMOS, when decoding 0.96 bin per clock cycle at a maximum clock rate of 1053 MHz, which outperforms previous works by 19.1%. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
35. Architecture and Synthesis for Area-Efficient Pipelining of Irregular Loop Nests.
- Author
-
Liu, Gai, Tan, Mingxing, Dai, Steve, Zhao, Ritchie, and Zhang, Zhiru
- Subjects
HIGH level synthesis (Electronic design), DATA pipelining, COMPUTATIONAL complexity, MATHEMATICAL bounds, BANDWIDTHS
- Abstract
Modern high-level synthesis (HLS) tools commonly employ pipelining to achieve efficient loop acceleration by overlapping the execution of successive loop iterations. While existing HLS pipelining techniques obtain good performance with low complexity for regular loop nests, they provide inadequate support for effectively synthesizing irregular loop nests. For loop nests with dynamic-bound inner loops, current pipelining techniques require unrolling of the inner loops, which is either very expensive in resource or even inapplicable due to dynamic loop bounds. To address this major limitation, this paper proposes ElasticFlow, a novel architecture capable of dynamically distributing inner loops to an array of processing units (LPUs) in an area-efficient manner. The proposed LPUs can be either specialized to execute an individual inner loop or shared among multiple inner loops to balance the tradeoff between performance and area. A customized banked memory architecture is proposed to coordinate memory accesses among different LPUs to maximize memory bandwidth without significantly increasing memory footprint. We evaluate ElasticFlow using a variety of real-life applications and demonstrate significant performance improvements over a state-of-the-art commercial HLS tool for Xilinx FPGAs. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
36. Generalisation of recursive doubling for AllReduce: Now with simulation.
- Author
-
Ruefenacht, Martin, Bull, Mark, and Booth, Stephen
- Subjects
COMPUTER simulation, CRAY computers, DATA pipelining, COMPUTER software execution, RECURSIVE functions
- Abstract
The performance of AllReduce is crucial at scale. The recursive doubling with pairwise exchange algorithm theoretically achieves O(log₂ N) scaling for short messages with N peers, but is limited by improvements in network latency. A multi-way exchange can be implemented using message pipelining, which is easier to improve than latency. Using our method, recursive multiplying, we show reductions in execution time of between 8% and 40% of AllReduce on a Cray XC30 over recursive doubling. Using a custom simulator we further explore the dynamics of recursive multiplying. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
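Recursive doubling, generalized in the record above, completes an AllReduce in log₂ N rounds by pairing each rank with the peer whose rank differs in exactly one bit. Below is a small, self-contained Python simulation of the classic pairwise algorithm (not the paper's recursive-multiplying generalization) for a power-of-two number of ranks.

    def allreduce_recursive_doubling(values):
        """Simulate AllReduce (sum) over N ranks, N a power of two, in log2(N) rounds."""
        n = len(values)
        assert n & (n - 1) == 0, "this simple simulation assumes a power-of-two rank count"
        partial = list(values)                 # partial[r] is rank r's running sum
        step = 1
        while step < n:                        # log2(n) pairwise-exchange rounds
            new = partial[:]
            for rank in range(n):
                peer = rank ^ step             # partner differs in exactly one bit
                new[rank] = partial[rank] + partial[peer]
            partial, step = new, step * 2
        return partial                         # every rank now holds the full sum

    print(allreduce_recursive_doubling([1, 2, 3, 4, 5, 6, 7, 8]))  # [36, 36, ..., 36]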
37. Distributed snapshot maintenance in wide-column NoSQL databases using partitioned incremental ETL pipelines.
- Author
-
Qu, Weiping and Dessloch, Stefan
- Subjects
NONRELATIONAL databases, DATA warehousing, DATA pipelining, PHOTOGRAPHS, BIG data
- Abstract
Wide-column NoSQL databases are an important class of NoSQL (Not only SQL) databases which scale horizontally and feature high access performance on sparse tables. With current trends towards big Data Warehouses (DWs), it is attractive to run existing business intelligence/data warehousing applications on higher volumes of data in wide-column NoSQL databases for low latency by mapping multidimensional models to wide-column NoSQL models or using additional SQL add-ons. For example, applications like retail management can run over integrated data sets stored in big DWs or in the cloud to capture current item-selling trends. Many of these systems also employ Snapshot Isolation (SI) as a concurrency control mechanism to achieve high throughput for read-heavy workloads. SI works well in a DW environment, as analytical queries can now work on (consistent) snapshots and are not impacted by concurrent update jobs performed by online incremental Extract-Transform-Load (ETL) flows that refresh fact/dimension tables. However, the snapshot made available in the DW is often stale, since at the moment when an analytical query is issued, the source updates (e.g. in a remote retail store) may not have been extracted and processed by the ETL process in time due to high input data volume or slow processing speed. This staleness may cause incorrect results for time-critical decision support queries. To address this problem, snapshots which are supposed to be accessed by analytical queries need to be first maintained by corresponding ETL flows to reflect source updates based on given freshness needs. Snapshot maintenance in this work means maintaining the distributed data partitions that are required by a query. Since most NoSQL databases are not ACID compliant and do not provide full-fledged distributed transaction support, a snapshot may be inconsistently derived when its data partitions are updated by different ETL maintenance jobs. This paper describes an extended version of the HBelt system [1], which tightly integrates the wide-column NoSQL database HBase with a clustered & pipelined ETL engine. Our objective is to efficiently refresh HBase tables with remote source updates while a consistent snapshot is guaranteed across distributed partitions for each scan request in analytical queries. A consistency model is defined and implemented to address so-called distributed snapshot maintenance. To achieve this, ETL jobs and analytical queries are scheduled in a distributed processing environment. In addition, a partitioned, incremental ETL pipeline is introduced to increase the performance of ETL (update) jobs. We validate the efficiency gain in terms of data pipelining and data partitioning using the TPC-DS benchmark, which simulates a modern decision support system for a retail product supplier. Experimental results show that high query throughput can be achieved in HBelt when distributed, refreshed snapshots are demanded. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
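As a rough illustration of the consistency requirement described in the abstract above, the sketch below defers a scan until every partition it touches has been maintained up to the same watermark. All names (Partition, serve_scan, the watermark field) are hypothetical and do not reflect HBelt's actual API.

```python
# Minimal sketch of distributed snapshot maintenance: a scan is answered only
# after every partition touched by the query has been refreshed by its ETL
# maintenance job up to the same requested watermark.

from dataclasses import dataclass

@dataclass
class Partition:
    name: str
    maintained_up_to: int   # highest source-update sequence applied by ETL

def serve_scan(partitions, requested_watermark):
    """Report whether the scan can run on a consistent snapshot."""
    lagging = [p.name for p in partitions if p.maintained_up_to < requested_watermark]
    if lagging:
        # In a real system this would trigger/await the incremental ETL
        # pipelines for the lagging partitions before the scan proceeds.
        return f"defer scan: partitions {lagging} behind watermark {requested_watermark}"
    return f"scan runs on consistent snapshot @ {requested_watermark}"

parts = [Partition("p0", 120), Partition("p1", 118), Partition("p2", 120)]
print(serve_scan(parts, 120))   # p1 still needs maintenance
```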
38. Design and Implementation of Pipelining Based Area Efficient Fast Multiplier.
- Author
-
Jain, Ruby and Jain, Vivek
- Subjects
DATA pipelining ,ANALOG multipliers ,PROGRAM transformation ,DIGITAL electronics ,SIMULATION methods & models - Abstract
Optimization of a multiplier is a challenging task. A number of researchers have tried to optimize multiplier performance, but users still need more highly optimized multipliers. The multiplier is a key building block of any digital circuit or computing device. For advances in the fields of DSP and FFT, and to improve the performance of gadgets, we need faster and more area-efficient multipliers. Several multiplier architectures have been proposed previously; whenever area is optimized, the time required increases, and vice versa. Here a pipeline-based architecture is proposed to optimize both area and speed. The multiplier is simulated in Quartus II. [ABSTRACT FROM AUTHOR]
- Published
- 2017
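To illustrate why pipelining a multiplier raises throughput, here is a cycle-by-cycle behavioural model of a two-stage pipelined multiplier (stage 1 forms shifted partial products, stage 2 sums them). The stage split and operand width are illustrative assumptions, not the paper's Quartus II design.

```python
# Minimal sketch of a two-stage pipelined multiplier: a pipeline register
# between the stages lets a new operand pair enter every cycle.

def partial_products(a, b, width=8):
    """Stage 1: one shifted copy of a per set bit of b."""
    return [(a << i) for i in range(width) if (b >> i) & 1]

def pipelined_multiply(pairs):
    stage1_reg = None                 # pipeline register between the stages
    results = []
    stream = list(pairs) + [None]     # one extra cycle to drain the pipeline
    for op in stream:
        if stage1_reg is not None:    # stage 2: reduce the partial products
            results.append(sum(stage1_reg))
        stage1_reg = partial_products(*op) if op else None
    return results

print(pipelined_multiply([(3, 5), (7, 9), (12, 12)]))  # [15, 63, 144]
```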
39. Efficient FPGA Mapping of Pipeline SDF FFT Cores.
- Author
-
Ingemarsson, Carl, Kallstrom, Petter, Qureshi, Fahad, and Gustafsson, Oscar
- Subjects
FIELD programmable gate arrays ,FOURIER transforms ,DATA pipelining - Abstract
In this paper, an efficient mapping of the pipeline single-path delay feedback (SDF) fast Fourier transform (FFT) architecture to field-programmable gate arrays (FPGAs) is proposed. By considering the architectural features of the target FPGA, significantly better implementation results are obtained. This is illustrated by mapping an R2²SDF 1024-point FFT core toward both Xilinx Virtex-4 and Virtex-6 devices. The optimized FPGA mapping is explored in detail. Algorithmic transformations that allow a better mapping are proposed, resulting in implementation achievements that by far outperform earlier published work. For Virtex-4, the results show a 350% increase in throughput per slice and a 25% reduction in block RAM (BRAM) use, with the same amount of DSP48 resources, compared with the best earlier published result. The resulting Virtex-6 design sees even larger increases in throughput per slice compared with the Xilinx FFT IP core, using half as many DSP48E1 blocks and fewer BRAM resources. The results clearly show that the FPGA mapping is crucial, not only the architecture and algorithm choices. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
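The delay-feedback data flow that the abstract above maps onto FPGA primitives can be sketched behaviourally: one radix-2 SDF stage is a delay line plus a single butterfly processing one sample per cycle. Twiddle multiplication, the multi-stage pipeline, and the R2² control are omitted; the frame length and sample values are illustrative.

```python
# Minimal behavioural sketch of one radix-2 single-path delay feedback (SDF)
# FFT stage: delay line of length D (a BRAM/shift register on an FPGA) plus
# one butterfly, one sample per cycle, latency D samples.

from collections import deque

def sdf_stage(samples, D):
    delay = deque([0.0] * D)              # feedback delay line
    out = []
    for n, x in enumerate(samples):
        if (n // D) % 2 == 0:             # first half of a frame: fill the delay line,
            out.append(delay.popleft())   # emit differences left over from the previous frame
            delay.append(x)
        else:                             # second half: butterfly is active
            a = delay.popleft()           # a = x[n - D]
            out.append(a + x)             # sum goes downstream immediately
            delay.append(a - x)           # difference is fed back for the next half frame
    return out

# 8 samples, D = 4: outputs 4..7 are x[i] + x[i+4]; the differences would
# stream out during the first half of the next frame.
print(sdf_stage([1, 2, 3, 4, 5, 6, 7, 8], D=4))
```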
40. Near- and Sub- Vt Pipelines Based on Wide-Pulsed-Latch Design Techniques.
- Author
-
Jin, Wei, Kim, Seongjong, He, Weifeng, Mao, Zhigang, and Seok, Mingoo
- Subjects
DATA pipelining ,FINITE impulse response filters ,TEMPERATURE measurements - Abstract
This paper presents a methodology and chip demonstration for designing near-/sub-threshold voltage (Vt) pipelines using pulsed latches clocked with very wide pulses. Pulsed-latch-based design is known for its time-borrowing capability, but the amount of time borrowing is limited by the hold time constraint. To enable more cycle borrowing, in this paper we aim to pad short paths to ~1/3 of the cycle time using a multi-Vt cell library. While delay padding using multi-Vt cells is common in super-Vt design, the small delay difference among multi-Vt cells has not allowed such extensive short-path padding due to the large area overhead. However, in the near-/sub-Vt regime, circuit delay becomes exponentially sensitive to Vt, suggesting that high-Vt cells can significantly reduce the overhead of padding. We build a semi-automatic short-path padding flow around this idea, and use it to design: 1) ISCAS benchmark circuits and 2) an 8-bit 8-tap finite impulse response (FIR) core, the latter fabricated in a 65-nm CMOS technology. The chip measurement shows that the proposed FIR core achieves 45.2% throughput (frequency), 11% energy-efficiency (energy/cycle), and 38% energy-delay-product improvements at 0.35 V over the flip-flop-pipelined baseline. The measurement results also confirm that the proposed FIR core operates robustly with the same pulsewidth setting across process, voltage, and temperature variations. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
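The short-path padding idea in the abstract above can be sketched as a simple budgeting step: pad every short path up to roughly one third of the cycle time so a wide-pulse latch can borrow time without hold violations. The cycle time and buffer delay below are made-up numbers; the paper uses a semi-automatic flow over a real multi-Vt standard-cell library.

```python
# Minimal sketch of short-path padding with slow (high-Vt) buffers.

def pad_short_paths(path_delays_ns, cycle_ns, hvt_buf_delay_ns):
    """Return (#buffers, padded delay) per path; padding target = cycle/3."""
    target = cycle_ns / 3.0
    plan = []
    for d in path_delays_ns:
        shortfall = max(0.0, target - d)
        # High-Vt buffers are slow (and leakage-cheap) at near-/sub-Vt,
        # so few of them are needed per padded path.
        n_bufs = -(-shortfall // hvt_buf_delay_ns) if shortfall else 0   # ceiling
        plan.append((int(n_bufs), d + n_bufs * hvt_buf_delay_ns))
    return plan

# 30 ns cycle at low voltage (illustrative), high-Vt buffer ~2.5 ns each.
print(pad_short_paths([1.0, 4.0, 12.0, 25.0], cycle_ns=30.0, hvt_buf_delay_ns=2.5))
```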
41. A Nanosecond-Transient Fine-Grained Digital LDO With Multi-Step Switching Scheme and Asynchronous Adaptive Pipeline Control.
- Author
-
Yang, Fan and Mok, Philip K. T.
- Subjects
VOLTAGE control ,DATA pipelining ,SYSTEMS on a chip - Abstract
This paper introduces a multi-step switching scheme for a digital low dropout regulator (DLDO) that emerges as a new way of achieving nanosecond-transient and fine-grained on-chip voltage regulation. The multi-step switching scheme takes advantage of adaptive pipeline control and asynchronous clocking for area- and power-efficient digital controller utilization. It speeds up the transient response by varying the pass-transistor sizing in two available coarse step sizes according to the perturbation, while maintaining a small output voltage ripple by toggling in a finer step during steady operation. A prototype proving the proposed concept, i.e., a 0.6–1.0-V input, 50–200-mV dropout, and 500-mA maximum loading DLDO with an on-chip 1.5-nF output capacitor, is fabricated in a 65-nm CMOS process to verify the effectiveness of this scheme. By employing the multi-step switching scheme and adaptive control, the DLDO achieves a fast transient response to nanosecond-scale load current changes and 100 mV per 10 ns reference voltage switching, as well as a resolution of 768 levels (~9.5 bits) with a 5-mV output ripple. The quiescent current consumed by this DLDO in steady operation is as low as 300 μA. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
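A coarse/fine control loop of the kind described above can be sketched in a few lines: large output errors toggle the pass-transistor array in coarse steps, small errors toggle a single fine unit to keep the ripple low. The thresholds, step sizes and voltages below are illustrative assumptions, not the paper's values.

```python
# Minimal sketch of multi-step switching control for a digital LDO.

def dldo_step(v_out, v_ref, strength, coarse_big=64, coarse_small=16,
              err_big=0.050, err_small=0.010):
    """Return the updated number of enabled pass-transistor units."""
    err = v_ref - v_out
    if abs(err) > err_big:        # large transient: biggest coarse step
        step = coarse_big
    elif abs(err) > err_small:    # medium transient: smaller coarse step
        step = coarse_small
    else:                         # steady state: fine step limits output ripple
        step = 1
    return max(0, strength + step * (1 if err > 0 else -1))

strength = 300
for v in (0.82, 0.88, 0.899, 0.9005):      # output recovering towards a 0.9 V reference
    strength = dldo_step(v, 0.9, strength)
    print(strength)
```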
42. Using SPDY to improve Web 2.0 over satellite links.
- Author
-
Cardaci, Andrea, Caviglione, Luca, Ferro, Erina, and Gotta, Alberto
- Subjects
NATURAL satellites ,HTTP (Computer network protocol) ,WIRELESS communications ,MOBILE operating systems ,DATA pipelining - Abstract
During the last decade, the Web has grown in terms of complexity, while the evolution of the HTTP (Hypertext Transfer Protocol) has not experienced the same trend. Even though HTTP 1.1 adds improvements like persistent connections and request pipelining, they are not decisive, especially in modern mixed wireless/wired networks, often including satellites. The latter play a key role in accessing the Internet everywhere, and they are one of the preferred methods to provide connectivity in rural areas or for disaster relief operations. However, they suffer from high latency and packet losses, which degrade the browsing experience. Consequently, the investigation of protocols mitigating the limitations of HTTP, also in challenging scenarios, is crucial both for industry and academia. In this perspective, SPDY, which is a protocol optimized for access to Web 2.0 contents over fixed and mobile devices, could be suitable also for satellite links. Therefore, this paper evaluates its performance when used both in real and emulated satellite scenarios. Results indicate the effectiveness of SPDY compared with HTTP, but at the price of more fragile behavior in the presence of errors. Besides, SPDY can also reduce the transport overhead experienced by middleboxes typically deployed by service providers using satellite links. Copyright © 2016 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
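A back-of-the-envelope model shows why request multiplexing matters more as round-trip time grows, which is the situation the abstract above studies. The formula below ignores bandwidth, loss and server think time, and the RTT and object count are assumptions; the paper's conclusions come from real and emulated satellite testbeds, not from this calculation.

```python
# Rough page-load model over a GEO satellite link (~600 ms RTT).

def load_time(n_objects, rtt_s, mode, max_pipeline=6):
    setup = rtt_s                      # TCP handshake (TLS ignored for brevity)
    if mode == "serial":               # one request per round trip
        return setup + n_objects * rtt_s
    if mode == "pipelined":            # HTTP/1.1 pipelining in batches
        batches = -(-n_objects // max_pipeline)   # ceiling division
        return setup + batches * rtt_s
    if mode == "multiplexed":          # SPDY-style: all requests share one round trip
        return setup + rtt_s
    raise ValueError(mode)

for mode in ("serial", "pipelined", "multiplexed"):
    print(mode, round(load_time(40, 0.6, mode), 1), "s")
```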
43. An Efficient Hardware Architecture for High Throughput AES Encryptor Using MUX Based Sub Pipelined S-Box.
- Author
-
Priya, Sridevi, Karthigaikumar, Palanivel, Siva Mangai, N., and Kirti Gaurav Das, P.
- Subjects
DATA pipelining ,ADVANCED Encryption Standard ,LOGIC circuits ,RECONFIGURABLE optical add-drop multiplexers ,HARDWARE - Abstract
In this paper, an efficient structural architecture is proposed for the AES encryption process to achieve high throughput with low device utilization. Breakable and controllable structures for the main AES blocks at the gate level are designed and used here. A control unit using a high-speed combinational logic circuit is designed to control the AES structural architecture. A modified MUX-based S-Box is introduced in AES instead of the conventional S-Box to reduce area without affecting throughput. In addition, the MixColumns transformation of the encryption process is modified to reduce hardware complexity. Five-stage sub-pipelining is introduced in the AES MUX-based S-Box, together with six pipelining stages in the AES encryption process, to increase throughput further. The aim of this work is to investigate both the existing and new architectures. The modified MUX-based S-Box for the Rijndael algorithm has been used in the 128-bit AES encryption process. The role of the five-stage sub-pipelined MUX-based S-Box of the 128-bit pipelined AES is to reduce the critical path delay to a minimum for achieving a high clock frequency. Rows of multiplexers in the AES architecture are used for breaking and controlling the design. The modified 128-bit encryption was implemented on Virtex-4, Virtex-5, and Spartan 3 FPGA devices. The results of the proposed architecture are analysed, and the throughput and area for the implemented design are calculated. The calculated results are compared with other architectures (Liberatori et al. in 3rd southern proceedings of the IEEE conference on programmable logic, SPL'07, pp 195-198, 2007; Farashahi et al. in Microelectron J 45:1014-1025, 2014; Good and Benaissa in IET Inf Secur 1(1):1-10, 2007; Sireesha and Madhava Rao in Int J Sci Res 3(9):1-5, 2013; Gielata et al. in Proceedings of the international conference on signals and electronic systems (ICSES), pp 137-140, 2008; El Adib and Raissouni in Int J Inf Netw Secur 1(2):1-10, 2012; Good and Benaissa in Lecture Notes Computer Science, vol 3659, pp 427-440, 2005). The results show that the proposed architecture gives a 58% improvement with a 1.08% reduction in area. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
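A functional illustration of the MUX-based S-Box idea from the abstract above: the 256-entry S-Box is split into four 64-entry blocks, and a multiplexer driven by the two most significant input bits selects among them. The decomposition and the software S-Box generation below are a behavioural sketch only; they do not reproduce the paper's gate-level structure or its five sub-pipeline stages.

```python
# Minimal sketch of a MUX-selected AES S-Box (functional model).

def gf_mul(a, b):                      # multiplication in GF(2^8), AES polynomial 0x11B
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a):                         # brute-force multiplicative inverse (0 maps to 0)
    return next((x for x in range(1, 256) if gf_mul(a, x) == 1), 0)

def sbox_entry(x):                     # inverse followed by the AES affine transform
    y = gf_inv(x)
    out = 0
    for i in range(8):
        bit = ((y >> i) ^ (y >> ((i + 4) % 8)) ^ (y >> ((i + 5) % 8)) ^
               (y >> ((i + 6) % 8)) ^ (y >> ((i + 7) % 8)) ^ (0x63 >> i)) & 1
        out |= bit << i
    return out

SBOX = [sbox_entry(x) for x in range(256)]
SUBTABLES = [SBOX[i * 64:(i + 1) * 64] for i in range(4)]   # four smaller ROMs

def mux_sbox(x):
    sel, addr = x >> 6, x & 0x3F       # MUX select = two MSBs, address = low 6 bits
    return SUBTABLES[sel][addr]

assert all(mux_sbox(x) == SBOX[x] for x in range(256))
print(hex(mux_sbox(0x53)))             # 0xed, the classic SubBytes test value
```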
44. Reconfigurable Constant Multiplication for FPGAs.
- Author
-
Moller, Konrad, Kumm, Martin, Kleinlein, Marco, and Zipf, Peter
- Subjects
- *
FIELD programmable gate arrays , *HEURISTIC , *MULTIPLEXING equipment , *ELECTRIC switchgear , *DATA pipelining - Abstract
This paper introduces a new heuristic to generate pipelined run-time reconfigurable constant multipliers for field-programmable gate arrays (FPGAs). It produces results close to the optimum. It is based on an optimal algorithm which fuses already optimized pipelined constant multipliers generated by an existing heuristic called reduced pipelined adder graph (RPAG). Switching between different single or multiple constant outputs is realized by the insertion of multiplexers. The heuristic searches for a solution that results in minimal multiplexer overhead. Using the proposed heuristic reduces the run-time of the fusion process, which raises the usability and application domain of the proposed method of run-time reconfiguration. An extensive evaluation of the proposed method confirms a 9%–26% FPGA resource reduction on average compared to previous work. For reconfigurable multiple constant multiplication, resource savings of up to 75% can be shown compared to a standard generic lookup table based multiplier. Two low level optimizations are presented, which further reduce resource consumption and are included into an automatic VHDL code generation based on the FloPoCo library. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
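The abstract above fuses pipelined adder graphs and switches between them with multiplexers; the toy sketch below shows the underlying principle of multiplexer-based reconfigurable constant multiplication with a single shift-and-add datapath. The constants and structure are made-up examples, not the adder graphs produced by RPAG or the paper's fusion heuristic.

```python
# Minimal sketch of reconfigurable constant multiplication:
# one adder/subtractor, one shift, and a MUX selecting the configuration.

CONFIG = {
    7: ("sub", 3),   # 7*x = (x << 3) - x
    9: ("add", 3),   # 9*x = (x << 3) + x
    5: ("add", 2),   # 5*x = (x << 2) + x
}

def reconfigurable_mult(x, constant):
    """Multiply by one of several constants using the same datapath."""
    op, shift = CONFIG[constant]          # the MUX: select a configuration
    shifted = x << shift
    return shifted - x if op == "sub" else shifted + x

for c in CONFIG:
    assert reconfigurable_mult(11, c) == 11 * c
print("all configurations verified")
```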
45. Optimized sparse Cholesky factorization on hybrid multicore architectures.
- Author
-
Tang, Meng, Gadou, Mohamed, Rennich, Steven, Davis, Timothy A., and Ranka, Sanjay
- Subjects
FACTORIZATION ,MULTICORE processors ,GRAPHICS processing units ,SPARSE matrices ,DATA pipelining - Abstract
We present techniques for supernodal sparse Cholesky factorization on a hybrid multicore platform consisting of a multicore CPU and a GPU. The techniques are the subtree algorithm, pipelining, and multithreading. The subtree algorithm [15] minimizes PCIe transmissions by storing an entire branch of the elimination tree in the GPU memory (the elimination tree is a tree data structure describing the workflow of the factorization), and also reduces the total kernel launch time by launching BLAS kernels in batches. The pipelining technique overlaps the execution of GPU kernels and PCIe data transfers. The multithreading technique [17] creates multiple threads for both the CPU and the GPU, to exploit the concurrency exposed by the elimination tree. Our experimental results on a platform consisting of an Intel multicore processor along with an Nvidia GPU indicate a significant improvement in performance and energy over CHOLMOD (SuiteSparse 4.5.3), a sparse Cholesky factorization package, after these techniques are applied. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
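The subtree idea above can be sketched as a tree-walk that hands a whole branch of the elimination tree to the GPU as soon as its working set fits in GPU memory, so that branch factorizes without per-node PCIe transfers. The tree shape, memory sizes and greedy rule below are illustrative, not the scheduling used by the paper or by CHOLMOD.

```python
# Minimal sketch of subtree placement for hybrid CPU/GPU sparse Cholesky.

def iter_subtree(tree, node):
    yield node
    for child in tree.get(node, []):
        yield from iter_subtree(tree, child)

def subtree_mem(tree, mem, node):
    """Total working-set size of the subtree rooted at `node` (MB)."""
    return sum(mem[n] for n in iter_subtree(tree, node))

def assign_subtrees(tree, mem, root, gpu_capacity_mb):
    """Return {node: 'gpu'|'cpu'}: the largest branches that fit go to the GPU."""
    placement = {}
    def visit(node):
        if subtree_mem(tree, mem, node) <= gpu_capacity_mb:
            for n in iter_subtree(tree, node):
                placement[n] = "gpu"          # whole branch resides in GPU memory
        else:
            placement[node] = "cpu"           # too big: this node stays on the CPU...
            for child in tree.get(node, []):
                visit(child)                  # ...and we recurse into its children
    visit(root)
    return placement

tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": []}   # parent -> children
mem  = {"root": 6, "a": 3, "b": 2, "a1": 1, "a2": 1}      # MB per supernode
print(assign_subtrees(tree, mem, "root", gpu_capacity_mb=5))
```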
46. Pipelined-Scheduling of Multiple Embedded Applications on a Multi-Processor-SoC.
- Author
-
Salamy, Hassan and Aslan, Semih
- Subjects
- *
SYSTEMS on a chip , *EMBEDDINGS (Mathematics) , *DATA pipelining , *BENCHMARK problems (Computer science) , *MATHEMATICAL decoupling - Abstract
Due to clock and power constraints, it is hard to extract more computing power out of single-core architectures. Thus, multi-core systems are now the architecture of choice to provide the needed computing power. In embedded systems, the multi-processor system-on-a-chip (MPSoC) is widely used to provide the power needed to effectively run complex embedded applications. However, to effectively utilize an MPSoC system, tools that generate optimized schedules are highly needed. In this paper, we design an integrated approach to task scheduling and memory partitioning for multiple applications utilizing the MPSoC system simultaneously. This is in contrast to the traditional decoupled approach that treats task scheduling and memory partitioning as two separate problems. Our framework is also based on pipelined scheduling to increase the throughput of the system. Results on different benchmarks show the effectiveness of our techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
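The throughput argument behind pipelined scheduling, as used in the abstract above, is that once tasks are grouped into pipeline stages mapped to different cores, the initiation interval (and hence throughput) is set by the slowest stage, while latency is the sum of all stages. The task times and stage assignment below are illustrative, not the output of the paper's scheduler.

```python
# Minimal sketch of pipelined-schedule metrics on an MPSoC.

def pipeline_metrics(stage_times_ms):
    ii = max(stage_times_ms)                  # initiation interval = slowest stage
    return {
        "latency_ms":   sum(stage_times_ms),  # one item's end-to-end time
        "interval_ms":  ii,
        "throughput_s": 1000.0 / ii,          # items completed per second
    }

# Application tasks grouped into 4 stages on 4 cores (times in ms).
print(pipeline_metrics([3.0, 5.0, 4.0, 2.0]))
# Compare with running the whole application on one core (no pipelining):
print(pipeline_metrics([3.0 + 5.0 + 4.0 + 2.0]))
```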
47. Advances in deepwater structure installation technologies.
- Author
-
Yi Wang, Menglan Duan, Huaguo Liu, Runhong Tian, and Chao Peng
- Subjects
SUBMARINE trenches ,COMPUTER simulation ,UNDERWATER drilling ,INSTALLATION of equipment ,DATA pipelining - Abstract
New offshore projects are targeting water depths of over 3000 m far from land, and the preferred option for field development is deepwater structures, which include subsea equipment and pipeline systems. Many publications focus on deepwater structure installation technology in order to understand the behaviour of structures during installation and to control the installation process safely. In this paper, installation solutions backed by engineering tools and numerical simulation methods are presented and discussed for subsea equipment and pipelines, respectively. The corresponding latest advances in installation technologies are presented, together with their main characteristics and critical challenges. The authors also discuss general trends in future development that may result in further advances. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
48. Systematic Methodology for the Quantitative Analysis of Pipeline-Register Reliability.
- Author
-
Jeyapaul, Reiley, Flores, Roberto, Avila, Alfonso, and Shrivastava, Aviral
- Subjects
REGISTERS (Computers) ,DATA pipelining ,SOFT errors - Abstract
Decades of rapid aggressive technology scaling have brought the challenge of soft errors to modern computing systems. Sequential elements (registers) in the processor pipeline exposed to charge-carrying particles generate bit flips or soft errors that can translate into system failures. Next to the processor cache, the pipeline registers (PRs) - registers between two pipeline stages - account for more than 50% of soft-error failures in the system. In this paper, for the first time, we apply architecturally correct execution models that quantitatively define the vulnerability (or exposure to soft errors) of microarchitectural components, and extend them to define the vulnerability of PRs - PR vulnerability (PRV). We develop gemV-Pipe, a simulation toolset for the systematic, accurate, and quantitative estimation and analysis of PRV. Our detailed ISA-aware analysis in gemV-Pipe reveals interesting facts about the data-access behavior of PRs: 1) the vulnerability of each PR is not proportional to its size; 2) the PR bits used for one instruction may not be used (and are thus not vulnerable) for another, which makes PRV extremely instruction-dependent; and 3) the functionality of the data stored on the PR bits can be used to classify them as instruction, control, and data bits - each of which differs in its instruction-specific behavior and vulnerability. Applying the insight gained, we perform a design space exploration on selectively hardening the PR bits, and demonstrate that 75% improved reliability can be achieved for less than 15% power overhead. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
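Vulnerability metrics of the kind quantified in the abstract above are commonly computed from write/read intervals: a bit is vulnerable from the cycle it is written until the last cycle it is read, while bits written but never read again contribute nothing. The event trace below is made up; gemV-Pipe derives such intervals from detailed microarchitectural simulation.

```python
# Minimal sketch of register vulnerability as write-to-last-read bit-cycles.

def vulnerable_bit_cycles(events, width_bits):
    """events: list of (cycle, 'write'|'read'); returns vulnerable bit-cycles."""
    vulnerable = 0
    last_event = None
    for cycle, kind in sorted(events):
        if kind == "write":
            last_event = cycle
        elif kind == "read" and last_event is not None:
            vulnerable += (cycle - last_event) * width_bits
            last_event = cycle      # the window extends from write to the last read
    return vulnerable

# 32-bit pipeline register: written at cycle 10, read at 11 and 14, rewritten at 20.
trace = [(10, "write"), (11, "read"), (14, "read"), (20, "write")]
print(vulnerable_bit_cycles(trace, width_bits=32))   # (14 - 10) * 32 = 128 bit-cycles
```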
49. Minimizing Delay in Network Function Virtualization with Shared Pipelines.
- Author
-
Rottenstreich, Ori, Keslassy, Isaac, Revah, Yoram, and Kadosh, Aviran
- Subjects
- *
VIRTUAL machine systems , *PACKET transport networks , *DATA pipelining , *GREEDY algorithms , *MULTICORE processors - Abstract
Pipelines are widely used to increase throughput in multi-core chips by parallelizing packet processing while relying on virtualization. Typically, each packet type is served by a dedicated pipeline with several cores, each implementing a network service. However, with the increase in the number of packet types and their number of required services, there are not enough cores for pipelines. In this paper, we study pipeline sharing, such that a single pipeline can be used to serve several packet types. Pipeline sharing decreases the needed total number of cores, but typically increases pipeline lengths and therefore packet delays. We consider two novel optimization problems of allocating cores between different packet types such that the average or the worst-case delay is minimized. We study the two problems and suggest optimal algorithms that apply under different assumptions on the input. We also present greedy algorithms for the general case. Last, we examine our solutions on synthetic examples as well as on real-life applications and demonstrate that they often achieve close-to-optimal delays. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
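The trade-off studied in the abstract above can be illustrated with a small cost model: merging the service chains of several packet types into one shared pipeline saves cores (the union of services) but every merged type now traverses the full, longer pipeline. The chains, traffic weights and delay model below are illustrative; the paper's optimal and greedy core-allocation algorithms are not reproduced here.

```python
# Minimal sketch of the cores-vs-delay trade-off of pipeline sharing in NFV.

def sharing_cost(packet_types, grouping):
    """packet_types: {name: (service_chain, traffic_share)};
    grouping: list of lists of names sharing one pipeline."""
    cores, avg_delay, worst_delay = 0, 0.0, 0
    for group in grouping:
        services = []                       # merged chain, duplicates removed
        for name in group:
            for s in packet_types[name][0]:
                if s not in services:
                    services.append(s)
        cores += len(services)              # one core per service in the pipeline
        for name in group:
            delay = len(services)           # delay = length of the shared pipeline
            avg_delay += packet_types[name][1] * delay
            worst_delay = max(worst_delay, delay)
    return cores, avg_delay, worst_delay

types = {"web":  (["parse", "nat", "fw"], 0.6),
         "voip": (["parse", "fw", "qos"], 0.3),
         "mgmt": (["parse", "acl"],       0.1)}
print(sharing_cost(types, [["web"], ["voip"], ["mgmt"]]))   # dedicated pipelines
print(sharing_cost(types, [["web", "voip"], ["mgmt"]]))     # web and voip share one
```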
50. The Effect of Dependency on Scalar Pipeline Architecture.
- Author
-
Patel, Renuka and Kumar, Sanjay
- Subjects
DATA pipelining ,PARALLEL processing ,COMPUTER software execution ,MIPS (Computer architecture) ,ELECTRONIC data processing - Abstract
Pipelining is one of the methods used to improve a processor's performance. Pipelining is an easy and economical way to achieve Instruction Level Parallelism (ILP). There are five types of pipelines--scalar, superscalar, superpipelined, underpipelined, and superscalar-superpipelined. But dependency is a major bottleneck in all types of pipelines. Therefore, in this paper, a simulator is developed using the C language for observing the effect of dependencies on a scalar pipeline. In our proposed simulator, CPI, IPC, and clock-cycle-wise stage occupation are shown in detail, and it subsequently also calculates the total number of clock cycles needed to execute the instructions. [ABSTRACT FROM AUTHOR]
- Published
- 2017
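A toy model of what such a simulator measures is sketched below: a 5-stage scalar pipeline that counts stalls caused by read-after-write (RAW) dependencies and reports total cycles, CPI and IPC. No forwarding is modelled and the instruction format is a made-up (dest, src1, src2) triple; the paper's simulator is written in C and more detailed.

```python
# Minimal sketch of a 5-stage scalar pipeline with RAW stall counting.

def simulate(instrs, stages=5, regs_ready_after=3):
    """instrs: list of (dest, src1, src2) register indices."""
    ready_cycle = {}          # register -> cycle its value becomes available
    cycle, stalls = 0, 0
    for dest, src1, src2 in instrs:
        start = cycle + 1                               # earliest issue (one per cycle)
        for src in (src1, src2):                        # RAW hazards delay the issue
            if src in ready_cycle:
                start = max(start, ready_cycle[src])
        stalls += start - (cycle + 1)
        cycle = start
        ready_cycle[dest] = cycle + regs_ready_after    # writeback, no forwarding
    total_cycles = cycle + (stages - 1)                 # drain the pipeline
    n = len(instrs)
    return {"cycles": total_cycles, "stalls": stalls,
            "CPI": total_cycles / n, "IPC": n / total_cycles}

# r3 = r1 op r2 ; r5 = r3 op r4 (RAW on r3) ; r7 = r6 op r6 (independent)
print(simulate([(3, 1, 2), (5, 3, 4), (7, 6, 6)]))
```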