Author: "Pena, Tomás F." - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Pena, Tomás F."' showing total 14 results

1. PASTASpark: multiple sequence alignment meets Big Data.

Author: Abuín, José M., Pena, Tomás F., and Pichel, Juan C.
Subjects: *CURRICULUM alignment, *BIOINFORMATICS, *BIOMATHEMATICS, *COMPUTERS in biology, *BIG data
Abstract: Motivation: One basic step in many bioinformatics analyses is the multiple sequence alignment. One of the state-of-the-art tools to perform multiple sequence alignment is PASTA (Practical Alignments using SATé and TrAnsitivity). PASTA supports multithreading but it is limited to process datasets on shared memory systems. In this work we introduce PASTASpark, a tool that uses the Big Data engine Apache Spark to boost the performance of the alignment phase of PASTA, which is the most expensive task in terms of time consumption. Results: Speedups up to 10× with respect to single-threaded PASTA were observed, which allows to process an ultra-large dataset of 200 000 sequences within the 24-h limit. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

2. Model Performance Prediction for Hyperparameter Optimization of Deep Learning Models Using High Performance Computing and Quantum Annealing.

Author: García Amboage, Juan Pablo, Wulff, Eric, Girone, Maria, and Pena, Tomás F.
Subjects: *PREDICTION models, *DEEP learning, *HIGH performance computing, *QUANTUM annealing, *MATHEMATICAL optimization
Abstract: Hyperparameter Optimization (HPO) of Deep Learning (DL)-based models tends to be a compute resource intensive process as it usually requires training the target model with many different hyperparameter configurations. We show that integrating model performance prediction with early stopping methods holds great potential to speed up the HPO process of deep learning models. Moreover, we propose a novel algorithm called Swift-Hyperband that can use either classical or quantum Support Vector Regression (SVR) for performance prediction and benefit from distributed High Performance Computing (HPC) environments. This algorithm is tested not only for the Machine-Learned Particle Flow (MLPF) model, used in High-Energy Physics (HEP), but also for a wider range of target models from domains such as computer vision and natural language processing. Swift-Hyperband is shown to find comparable (or better) hyperparameters as well as using less computational resources in all test cases. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

3. Assessing Intel OneAPI capabilities and cloud-performance for heterogeneous computing.

Author: Alcaraz, Silvia R., Laso, Ruben, Lorenzo, Oscar G., Vilariño, David L., Pena, Tomás F., and Rivera, Francisco F.
Subjects: *HETEROGENEOUS computing, *IMAGE denoising, *DYNAMIC balance (Mechanics), *DYNAMIC loads, *LOAD balancing (Computer networks)
Abstract: This work presents a performance-oriented study of a heterogeneous application developed with Intel OneAPI to solve two well-known diffusion problems: heat diffusion and image denoising. We have explored CPU+iGPU and CPU+FPGA schemes, applying dynamic load balancing and conducting experiments on Intel DevCloud. The results demonstrate that the CPU+iGPU scheme outperforms the execution times achieved by the fastest device when the problem is sufficiently computationally demanding. We also found that the performance of the CPU+FPGA scheme is heavily affected by bandwidth limitations and specific strategies to manage memory efficiently are required. Moreover, it was demonstrated that dynamic workload balancing is crucial due to possible performance fluctuations in any of the implicated devices. In conclusion, Intel OneAPI provides a helpful tool for multi-platform development using a unique high-level language, DPC++. However, developing specific code for each platform is necessary to achieve optimal performance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. QPU integration in OpenCL for heterogeneous programming.

Author: Vázquez-Pérez, Jorge, Piñeiro, César, Pichel, Juan C., Pena, Tomás F., and Gómez, Andrés
Subjects: *QUANTUM computing, *QUBITS, *HETEROGENEOUS computing, *QUANTUM chemistry, *QUANTUM computers, *PROOF of concept
Abstract: The integration of quantum processing units (QPUs) in a heterogeneous high-performance computing environment requires solutions that facilitate hybrid classical–quantum programming. Standards such as OpenCL facilitate the programming of heterogeneous environments, consisting of CPUs and hardware accelerators. This study presents an innovative method that incorporates QPU functionality into OpenCL, standardizing quantum processes within classical environments. By leveraging QPUs within OpenCL, hybrid quantum–classical computations can be sped up, impacting domains like cryptography, optimization problems, and quantum chemistry simulations. Using Portable Computing Language (Jääskeläinen et al. in Int J Parallel Program 43(5):752–785, 2014. https://doi.org/10.1007/s10766-014-0320-y) and the Qulacs library (Suzuki et al. in Quantum 5:559, 2021. https://doi.org/10.22331/q-2021-10-06-559), results demonstrate, for instance, the successful execution of Shor's algorithm (Nielsen and Chuang in Quantum computation and quantum information, 10th anniversary edn. Cambridge University Press, Cambridge, 2010), serving as a proof of concept for extending the approach to larger qubit systems and other hybrid quantum–classical algorithms. This integration approach bridges the gap between quantum and classical computing paradigms, paving the way for further optimization and application to a wide range of computational problems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. Parallel sparse approximate preconditioners applied to the solution of BEM systems

Author: González, Patricia, Pena, Tomás F., and Cabaleiro, José C.
Subjects: *BOUNDARY element methods, *ENGINEERING, *PARALLEL processing, *COMPUTERS, *EQUATIONS, *NUMERICAL analysis
Abstract: Engineering problems, such as those dealing with boundary element methods (BEM), usually involve high computational costs. Previous works have proposed effective parallel implementations of BEM. These works have proved the benefit of using parallel computers to solve large civil engineering problems. Despite the profits of the parallel implementation in the discretization step of these methods, the drawback remains the solution of the systems of equations obtained. Because of their fully populated coefficient matrices, direct methods are usually preferable to solve these linear systems. However, parallel implementations of iterative methods achieve higher efficiency, but need suitable preconditioners in order to speed up their convergence. The aim of this work is the research into parallel Sparse Approximate Inverse Preconditioners (SPAI) for the iterative solution of dense systems of equations arising from the boundary element codes. Different heuristics for the construction of SPAI preconditioners suitable for this kind of systems are proposed. Some experiments and numerical results, on distributed-memory computers, are presented in this paper. [Copyright © Elsevier]
Published: 2004
Full Text: View/download PDF

6. Enabling BOINC in infrastructure as a service cloud system.

Author: Montes, Diego, Añel, Juan A., Pena, Tomás F., Uhe, Peter, and Wallom, David C. H.
Subjects: *CLIMATE change forecasts, *CLOUD computing, *COMPUTING platforms, *ATMOSPHERIC models, *COST effectiveness
Abstract: Volunteer or crowd computing is becoming increasingly popular for solving complex research problems from an increasingly diverse range of areas. The majority of these have been built using the Berkeley Open Infrastructure for Network Computing (BOINC) platform, which provides a range of different services to manage all computation aspects of a project. The BOINC system is ideal in those cases where not only does the research community involved need low-cost access to massive computing resources but also where there is a significant public interest in the research being done. We discuss the way in which cloud services can help BOINC-based projects to deliver results in a fast, on demand manner. This is difficult to achieve using volunteers, and at the same time, using scalable cloud resources for short on demand projects can optimize the use of the available resources. We show how this design can be used as an efficient distributed computing platform within the cloud, and outline new approaches that could open up new possibilities in this field, using Climateprediction.net (http://www.climateprediction.net/) as a case study. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

7. Enabling BOINC in Infrastructure as a Service Cloud Systems.

Author: Montes, Diego, Añel, Juan A., Pena, Tomás F., Uhe, Peter, and Wallom, David C. H.
Subjects: *CLOUD computing, *SCIENTIFIC community, *PUBLIC interest
Abstract: Volunteer or Crowd computing is becoming increasingly popular to solve complex research problems, from an increasing diverse range of areas. The majority of these have been built using the Berkeley Open Infrastructure for Network Computing (BOINC) platform, which provides a range of different services to manage all computation aspects of a project. The BOINC system is ideal in those cases where not only does the research community involved need low cost access to massive computing resource but also that there is a significant public interest in the research done. We discuss the way in which Cloud services can help BOINC based projects to deliver results in a fast, on demand manner. This is difficult to achieve using volunteers, and at the same time, using scalable cloud resources for short on demand projects can optimize the use of the available resources. We show how this design can be used as an efficient distributed computing platform within the Cloud, and outline new approaches that could open up new possibilities in this field, using http://climateprediction.net as a case study. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

8. SparkBWA: Speeding Up the Alignment of High-Throughput DNA Sequencing Data.

Author: Abuín, José M., Pichel, Juan C., Pena, Tomás F., and Amigo, Jorge
Subjects: *NUCLEOTIDE sequencing, *GENOMES, *BIG data, *COMPUTATIONAL biology, *SCALABILITY
Abstract: Next-generation sequencing (NGS) technologies have led to a huge amount of genomic data that need to be analyzed and interpreted. This fact has a huge impact on the DNA sequence alignment process, which nowadays requires the mapping of billions of small DNA sequences onto a reference genome. In this way, sequence alignment remains the most time-consuming stage in the sequence analysis workflow. To deal with this issue, state of the art aligners take advantage of parallelization strategies. However, the existent solutions show limited scalability and have a complex implementation. In this work we introduce SparkBWA, a new tool that exploits the capabilities of a big data technology such as Spark to boost the performance of one of the most widely adopted aligners, the Burrows-Wheeler Aligner (BWA). The design of SparkBWA uses two independent software layers in such a way that no modifications to the original BWA source code are required, which assures its compatibility with any BWA version (future or legacy). SparkBWA is evaluated in different scenarios showing noticeable results in terms of performance and scalability. A comparison to other parallel BWA-based aligners validates the benefits of our approach. Finally, an intuitive and flexible API is provided to NGS professionals in order to facilitate the acceptance and adoption of the new tool. The source code of the software described in this paper is publicly available at , with a GPL3 license. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

9. A fast and optimal pathfinder using airborne LiDAR data.

Author: Yermo, Miguel, Rivera, Francisco F., Cabaleiro, José C., Vilariño, David L., and Pena, Tomás F.
Subjects: *POINT cloud, *DIGITAL elevation models, *AIRBORNE lasers, *LIDAR, *HIGHWAY planning
Abstract: Determining the optimal path between two points in a 3D point cloud is a problem that has been addressed in many different situations: from road planning and escape routes determination, to network routing and facility layout. This problem is addressed using different input information, being 3D point clouds one of the most valuable. Its main utility is to save costs, whatever the field of application is. In this paper, we present a fast algorithm to determine the least cost path in an Airborne Laser Scanning point cloud. In some situations, like finding escape routes for instance, computing the solution in a very short time is crucial, and there are not many works developed in this theme. State of the art methods are mainly based on a digital terrain model (DTM) for calculating these routes, and these methods do not reflect well the topography along the edges of the graph. Also, the use of a DTM leads to a significant loss of both information and precision when calculating the characteristics of possible routes between two points. In this paper, a new method that does not require the use of a DTM and is suitable for airborne point clouds, whether they are classified or not, is proposed. The problem is modeled by defining a graph using the information given by a segmentation and a Voronoi Tessellation of the point cloud. The performance tests show that the algorithm is able to compute the optimal path between two points by processing up to 678,820 points per second in a point cloud of 40,000,000 points and 16 km² of extension. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

10. BigBWA: approaching the Burrows-Wheeler aligner to Big Data technologies.

Author: Abuin, José M., Pichel, Juan C., Pena, Tomás F., and Amigo, Jorge
Subjects: *BIG data, *FAULT tolerance (Engineering), *SEQUENCE alignment, *SOURCE code, *SEQUENCE analysis
Abstract: BigBWA is a new tool that uses the Big Data technology Hadoop to boost the performance of the Burrows-Wheeler aligner (BWA). Important reductions in the execution times were observed when using this tool. In addition, BigBWA is fault tolerant and it does not require any modification of the original BWA source code. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF

11. A parallel 3D semiconductor device simulator for gradual heterojunction bipolar transistors.

Author: García-Loureiro, Antonio J., López-González, J. M., and Pena, Tomás F.
Subjects: *SEMICONDUCTORS, *BIPOLAR transistors, *POISSON'S equation, *FINITE element method, *ELECTRIC conductivity, *ELECTRONIC circuits
Abstract: In this paper, we present a parallel three-dimensional semiconductor device simulator for gradual heterojunction bipolar transistor. This simulator uses the drift-diffusion transport model. The Poisson equation and continuity equations were discretized using a finite element method (FEM) on an unstructured tetrahedral mesh. Fermi–Dirac statistics is considered in our model and a compact formulation is used that makes it easy to take into account other effects such as the non-parabolic nature of the bands or the presence of various subbands in the conduction process. Domain decomposition methods were tested to solve the linear systems. We have applied this simulator to a gradual heterojunction bipolar transistor (HBT), and we present some measures of the parallel execution time for several solvers and some electrical results. This code has been implemented for distributed memory multicomputers, making use of the MPI message passing standard library and a parallel solver library. Copyright © 2002 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
Published: 2003
Full Text: View/download PDF

12. Parallel iterative solvers involving fast wavelet transforms for the solution of BEM systems

Author: González, Patricia, Cabaleiro, José C., and Pena, Tomás F.
Subjects: *BOUNDARY element methods, *WAVELETS (Mathematics)
Abstract: This paper describes the parallelization of a strategy to speed up the convergence of iterative methods applied to boundary element method (BEM) systems arising from problems with non-smooth boundaries and mixed boundary conditions. The aim of the work is the application of fast wavelet transforms as a black box transformation in existing boundary element codes. A new strategy was proposed, applying wavelet transforms on the interval, so it could be used in case of non-smooth coefficient matrices. Here, we describe the parallel iterative scheme and we present some of the results we have obtained. [Copyright © Elsevier]
Published: 2002

13. Big Data in metagenomics: Apache Spark vs MPI.

Author: Abuín, José M., Lopes, Nuno, Ferreira, Luís, Pena, Tomás F., and Schmidt, Bertil
Subjects: *METAGENOMICS, *BIG data, *HIGH performance computing, *FOOD composition, *MESSAGE passing (Computer science), *BACTERIAL genomes
Abstract: The progress of next-generation sequencing has led to the availability of massive data sets used by a wide range of applications in biology and medicine. This has sparked significant interest in using modern Big Data technologies to process this large amount of information in distributed memory clusters of commodity hardware. Several approaches based on solutions such as Apache Hadoop or Apache Spark, have been proposed. These solutions allow developers to focus on the problem while the need to deal with low level details, such as data distribution schemes or communication patterns among processing nodes, can be ignored. However, performance and scalability are also of high importance when dealing with increasing problem sizes, making in this way the usage of High Performance Computing (HPC) technologies such as the message passing interface (MPI) a promising alternative. Recently, MetaCacheSpark, an Apache Spark based software for detection and quantification of species composition in food samples has been proposed. This tool can be used to analyze high throughput sequencing data sets of metagenomic DNA and allows for dealing with large-scale collections of complex eukaryotic and bacterial reference genomes. In this work, we propose MetaCache-MPI, a fast and memory efficient solution for computing clusters which is based on MPI instead of Apache Spark. In order to evaluate its performance a comparison is performed between the original single CPU version of MetaCache, the Spark version and the MPI version we are introducing. Results show that for 32 processes, MetaCache-MPI is 1.65× faster while consuming 48.12% of the RAM memory used by Spark for building a metagenomics database. For querying this database, also with 32 processes, the MPI version is 3.11× faster, while using 55.56% of the memory used by Spark. We conclude that the new MetaCache-MPI version is faster in both building and querying the database and uses less RAM memory, when compared with MetaCacheSpark, while keeping the accuracy of the original implementation. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

14. A big data approach to metagenomics for all-food-sequencing.

Author: Kobus, Robin, Abuín, José M., Müller, André, Hellmann, Sören Lukas, Pichel, Juan C., Pena, Tomás F., Hildebrandt, Andreas, Hankeln, Thomas, and Schmidt, Bertil
Subjects: *BIG data, *SHOTGUN sequencing, *BACTERIAL genomes, *FOOD composition, *COMPUTER workstation clusters, *FOOD testing
Abstract: Background: All-Food-Sequencing (AFS) is an untargeted metagenomic sequencing method that allows for the detection and quantification of food ingredients including animals, plants, and microbiota. While this approach avoids some of the shortcomings of targeted PCR-based methods, it requires the comparison of sequence reads to large collections of reference genomes. The steadily increasing amount of available reference genomes establishes the need for efficient big data approaches. Results: We introduce an alignment-free k-mer based method for detection and quantification of species composition in food and other complex biological matters. It is orders-of-magnitude faster than our previous alignment-based AFS pipeline. In comparison to the established tools CLARK, Kraken2, and Kraken2+Bracken it is superior in terms of false-positive rate and quantification accuracy. Furthermore, the usage of an efficient database partitioning scheme allows for the processing of massive collections of reference genomes with reduced memory requirements on a workstation (AFS-MetaCache) or on a Spark-based compute cluster (MetaCacheSpark). Conclusions: We present a fast yet accurate screening method for whole genome shotgun sequencing-based biosurveillance applications such as food testing. By relying on a big data approach it can scale efficiently towards large-scale collections of complex eukaryotic and bacterial reference genomes. AFS-MetaCache and MetaCacheSpark are suitable tools for broad-scale metagenomic screening applications. They are available at https://muellan.github.io/metacache/afs.html (C++ version for a workstation) and https://github.com/jmabuin/MetaCacheSpark (Spark version for big data clusters). [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF
