1,930 results
Search Results
2. Comments on a paper “A Hermitian Morita theorem for algebras with anti-structure”
- Author
-
Dasgupta, Bhanumati
- Subjects
- *
ALGEBRA , *MATHEMATICS , *MATHEMATICAL analysis , *ALGORITHMS - Abstract
Abstract: In 1.9 of the paper [A. Hahn, A Hermitian Morita theorem for algebras with anti-structure, J. Algebra 93 (1985) 215–235], should be replaced by . This leads to minor changes in the rest of the paper where the ring should be replaced by its opposite and vice versa. [Copyright &y& Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
3. Almost optimal query algorithm for hitting set using a subset query.
- Author
-
Bishnu, Arijit, Ghosh, Arijit, Kolay, Sudeshna, Mishra, Gopinath, and Saurabh, Saket
- Subjects
- *
HYPERGRAPHS , *INDEPENDENT sets , *COMBINATORIAL optimization , *ALGORITHMS - Abstract
In this paper, we focus on Hitting-Set , a fundamental problem in combinatorial optimization, through the lens of sublinear time algorithms. Given access to the hypergraph through a subset query oracle in the query model, we give sublinear time algorithms for Hitting-Set with almost tight parameterized query complexity. In parameterized query complexity , we estimate the number of queries to the oracle based on the parameter k , the size of the Hitting-Set. The subset query oracle we use in this paper is called Generalized d -partite Independent Set query oracle (GPIS) and it was introduced by Bishnu et al. (ISAAC'18). GPIS is a generalization to hypergraphs of the Bipartite Independent Set query oracle (BIS) introduced by Beame et al. (ITCS'18 and TALG'20) for estimating the number of edges in graphs. Since its introduction GPIS query oracle has been used for estimating the number of hyperedges independently by Dell et al. (SODA'20 and SICOMP'22) and Bhattacharya et al. (STACS'22), and for estimating the number of triangles in a graph by Bhattacharya et al. (ISAAC'19 and TOCS'21). Formally, GPIS is defined as follows: GPIS oracle for a d-uniform hypergraph H takes as input d pairwise disjoint non-empty subsets A 1 , ... , A d of vertices in H and answers whether there is a hyperedge in H that intersects each set A i , where i ∈ { 1 , 2 , ... , d }. For d = 2 , the GPIS oracle is nothing but BIS oracle. We show that d - Hitting-Set , the hitting set problem for d -uniform hypergraphs, can be solved using O ˜ d (k d log n) GPIS queries. Additionally, we also showed that d - Decision-Hitting-Set , the decision version of d - Hitting-Set can be solved with O ˜ d (min { k d log n , k 2 d 2 }) GPIS queries. We complement these parameterized upper bounds with an almost matching parameterized lower bound that states that any algorithm that solves d - Decision-Hitting-Set requires Ω (( k + d d )) GPIS queries. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
4. Computing free non-commutative Gröbner bases over [formula omitted] with Singular:Letterplace.
- Author
-
Levandovskyy, Viktor, Metzlaff, Tobias, and Abou Zeid, Karim
- Subjects
- *
GROBNER bases , *INTEGRAL domains , *ASSOCIATIVE algebras , *ASSOCIATIVE rings , *COMMUTATIVE rings - Abstract
With this paper we present an extension of our recent ISSAC paper about computations of Gröbner(-Shirshov) bases over free associative algebras Z 〈 X 〉. We present all the needed proofs in details, add a part on the direct treatment of the ring Z / m Z as well as new examples and applications to e.g. Iwahori-Hecke algebras. The extension of Gröbner bases concept from polynomial algebras over fields to polynomial rings over rings allows to tackle numerous applications, both of theoretical and of practical importance. Gröbner and Gröbner-Shirshov bases can be defined for various non-commutative and even non-associative algebraic structures. We study the case of associative rings and aim at free algebras over principal ideal rings. We concentrate on the case of commutative coefficient rings without zero divisors (i.e. a domain). Even working over Z allows one to do computations, which can be treated as universal for fields of arbitrary characteristic. By using the systematic approach, we revisit the theory and present the algorithms in the implementable form. We show drastic differences in the behaviour of Gröbner bases between free algebras and algebras, which are close to commutative. Even the process of the formation of critical pairs has to be reengineered, together with implementing the criteria for their quick discarding. We present an implementation of algorithms in the Singular subsystem called Letterplace , which internally uses Letterplace techniques (and Letterplace Gröbner bases), due to La Scala and Levandovskyy. Interesting examples and applications accompany our presentation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
5. An algorithmic approach to small limit cycles of nonlinear differential systems: The averaging method revisited.
- Author
-
Huang, Bo and Yap, Chee
- Subjects
- *
NONLINEAR systems , *LIMIT cycles , *MAPLE , *ALGORITHMS - Abstract
This paper introduces an algorithmic approach to the analysis of bifurcation of limit cycles from the centers of nonlinear continuous differential systems via the averaging method. We develop three algorithms to implement the averaging method. The first algorithm allows one to transform the considered differential systems to the normal form of averaging. Here, we restricted the unperturbed term of the normal form of averaging to be identically zero. The second algorithm is used to derive the computational formulae of the averaged functions at any order. The third algorithm is based on the first two algorithms and determines the exact expressions of the averaged functions for the considered differential systems. The proposed approach is implemented in Maple and its effectiveness is shown by several examples. Moreover, we report some incorrect results in published papers on the averaging method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
6. GPU-accelerated scalable solver with bit permutated cyclic-min algorithm for quadratic unconstrained binary optimization.
- Author
-
Yasudo, Ryota, Nakano, Koji, Ito, Yasuaki, Katsuki, Ryota, Tabata, Yusuke, Yazane, Takashi, and Hamano, Kenichiro
- Subjects
- *
QUANTUM annealing , *RANDOM numbers , *ISING model , *COMBINATORIAL optimization , *ALGORITHMS , *GRAPHICS processing units , *IMAGE encryption - Abstract
• The adaptive bulk search is a QUBO solver alternative to quantum annealing. • The cyclic-min algorithm is a SA-like algorithm without random number generation. • The bit permutated cyclic-min algorithm provides near-optimal solutions for QUBO. • Our scalable implementation attains linear improvement of the search rate. A wide range of combinatorial optimization problems can be reduced to the Ising model, and equivalently the quadratic unconstrained binary optimization (QUBO) problem. Thus, in recent years, researchers have proposed to solve QUBO on FPGAs, GPUs, and special-purpose processors. The adaptive bulk search (ABS) is a previously-proposed framework for solving QUBO in parallel on multiple GPUs. In the ABS, a CPU host performs a GA-based global search while GPUs asynchronously perform many local searches in parallel. An original ABS adopts a simple local search algorithm called a cyclic-min algorithm, which does not use pseudo random numbers. However, the lack of randomness may cause a potential drawback of restricted bit-flipping operations in a local search. To avoid this drawback, this paper proposes a cyclic-min algorithm with randomly-generated multiple bit permutations, which enables a more effective local search with random number generation in CPUs (not in GPUs). Furthermore, this paper introduces a scalable implementation of the ABS with MPI and OpenMP. Our experimental results on TSUBAME3.0 show that the solution quality improves and the throughput linearly increases as the number of GPUs increases; with 256 GPUs, it evaluates 20.1 × 10 12 solutions per second. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
7. Unsupervised question-retrieval approach based on topic keywords filtering and multi-task learning.
- Author
-
Shang, Aiguo, Zhu, Xinjuan, Danner, Michael, and Rätsch, Matthias
- Subjects
- *
INFORMATION retrieval , *LEARNING , *QUESTION answering systems , *CONTRASTIVE linguistics , *ALGORITHMS - Abstract
Currently, the majority of retrieval-based question-answering systems depend on supervised training using question pairs. However, there is still a significant need for further exploration of how to employ unsupervised methods to improve the accuracy of retrieval-based question-answering systems. From the perspective of question topic keywords, this paper presents TFCSG, an unsupervised question-retrieval approach based on topic keyword filtering and multi-task learning. Firstly, we design the topic keyword filtering algorithm, which, unlike the topic model, can sequentially filter out the keywords of the question and can provide a training corpus for subsequent unsupervised learning. Then, three tasks are designed in this paper to complete the training of the question-retrieval model. The first task is a question contrastive learning task based on topic keywords repetition strategy, the second is questions and its corresponding sequential topic keywords similarity distribution task, and the third is a sequential topic keywords generation task using questions. These three tasks are trained in parallel in order to obtain quality question representations and thus improve the accuracy of question-retrieval task. Finally, our experimental results on the four publicly available datasets demonstrate the effectiveness of the TFCSG, with an average improvement of 7.1%, 4.4%, and 3.5% in the P@1, MAP, and MRR metrics when using the BERT model compared to the baseline model. The corresponding metrics improved by 5.7%, 3.5% and 3.0% on average when using the RoBERTa model. The accuracy of unsupervised similar question-retrieval task is effectively improved. In particular, the values of P@1, P@5, and P@10 are close, the retrieved similar questions are ranked more advance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Computing pullback function of second order differential operators by using their semi-invariants.
- Author
-
Aldossari, Shayea
- Subjects
- *
ALGORITHMS - Abstract
In this paper, we present an algorithm that, given a differential operator of order 2, detects if this operator comes from a simpler operator under a non-trivial change of variables. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
9. Deletion to scattered graph classes II - improved FPT algorithms for deletion to pairs of graph classes.
- Author
-
Jacob, Ashwin, Majumdar, Diptapriyo, and Raman, Venkatesh
- Subjects
- *
ALGORITHMS , *APPROXIMATION algorithms , *BIPARTITE graphs , *LOGIC - Abstract
The problem of deletion of vertices to a hereditary graph class is a well-studied problem in parameterized complexity. Recently, a natural extension of the problem was initiated where we are given a finite set of hereditary graph classes and we determine whether k vertices can be deleted from a given graph so that the connected components of the resulting graph belong to one of the given hereditary graph classes. The problem is shown to be fixed parameter tractable (FPT) when the deletion problem to each of the given hereditary graph classes is fixed-parameter tractable, and the property of being in any of the graph classes is expressible in the counting monodic second order (CMSO) logic. This paper focuses on pairs of specific graph classes (Π 1 , Π 2) in which we would like the connected components of the resulting graph to belong to, and design simpler and more efficient FPT algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
10. Markov chains and unambiguous automata.
- Author
-
Baier, Christel, Kiefer, Stefan, Klein, Joachim, Müller, David, and Worrell, James
- Subjects
- *
MARKOV processes , *ALGORITHMS - Abstract
Unambiguous automata are nondeterministic automata in which every word has at most one accepting run. In this paper we give a polynomial-time algorithm for model checking discrete-time Markov chains against ω -regular specifications represented as unambiguous automata. We furthermore show that the complexity of this model checking problem lies in NC: the subclass of P comprising those problems solvable in poly-logarithmic parallel time. These complexity bounds match the known bounds for model checking Markov chains against specifications given as deterministic automata, notwithstanding the fact that unambiguous automata can be exponentially more succinct than deterministic automata. We report on an implementation of our procedure, including an experiment in which the implementation is used to model check LTL formulas on Markov chains. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
11. Grid recognition: Classical and parameterized computational perspectives.
- Author
-
Gupta, Siddharth, Sa'ar, Guy, and Zehavi, Meirav
- Subjects
- *
PARAMETERIZATION , *ALGORITHMS , *TREES - Abstract
Over the past few decades, a large body of works studied the (in)tractability of various computational problems on grid graphs, which often yield substantially faster algorithms than general graphs. Unfortunately, the recognition of a grid graph is hard—it was shown to be NP-hard already in 1987. In this paper, we provide several positive results in this regard in the framework of parameterized complexity. Specifically, our contribution is threefold. First, we show that the problem is FPT parameterized by k + mcc where mcc is the maximum size of a connected component of G. Second, we present a new parameterization, denoted a G , relating graph distance to geometric distance. We show that the problem is para-NP-hard parameterized by a G , but FPT parameterized by a G on trees, as well as FPT parameterized by k + a G. Third, we show that the recognition of k × r grid graphs is NP-hard on graphs of pathwidth 2 where k = 3. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
12. Industrial defect detection and location based on greedy membrane clustering algorithm.
- Author
-
Tang, Yaorui, Yang, Bo, Peng, Hong, and Luo, Xiaohui
- Subjects
- *
ALGORITHMS , *ARTIFICIAL membranes , *DISTRIBUTED computing , *GLOBAL optimization , *PARALLEL programming , *FEATURE extraction - Abstract
This paper introduces a related model of membrane calculation in the defect detection and positioning of industrial components. It has the characteristics of distributed and parallel computing, and can efficiently search for better solutions in a given feature space. Inspired by the membrane clustering algorithm, this paper proposes a greedy membrane clustering algorithm and names it GMCA. GMCA is applied after the extraction of local features of normal samples. It uses a greedy strategy to construct a sub-feature set that describes the local characteristics of normal samples. During training, GMCA can learn the membrane cluster center of normal image blocks and each sub-feature within the cluster. At test time, the anomaly map is obtained by calculating the distance from the test sample block to the corresponding cluster center and the maximum distance from the cluster center to the nearest neighbor in the training sample. This solves the limitation of traditional algorithms requiring dataset alignment. In the unsupervised dataset MvTec AD, samples can be divided into object categories and texture categories according to the background of images. The pixel-level anomaly location index (AUROC) of this method on object category data reaches 98.3%. The image-level anomaly detection index (AUROC) on texture category data reaches 99.1%. • We design a computational model of membrane clustering using the evolutionary mechanisms and communicative mechanisms of cells. • GMCA has the global optimization characteristics of high accuracy and fast convergence of the membrane clustering algorithm. • GMCA has the local optimization characteristics of the greedy strategy. • GMCA solves the limitation of traditional defect detection and positioning methods that require dataset alignment. • Numerous experiments show the proposed GMCA performs competitively in industrial defect detection and location prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Algebraic number fields and the LLL algorithm.
- Author
-
Uray, M.J.
- Subjects
- *
ALGEBRAIC numbers , *ALGEBRAIC fields , *GAUSSIAN elimination , *ALGORITHMS , *ARITHMETIC - Abstract
In this paper we analyze the computational costs of various operations and algorithms in algebraic number fields using exact arithmetic. Let K be an algebraic number field. In the first half of the paper, we calculate the running time and the size of the output of many operations in K in terms of the size of the input and the parameters of K. We include some earlier results about these, but we go further than them, e.g. we also analyze some R -specific operations in K like less-than comparison. In the second half of the paper, we analyze two algorithms: the Bareiss algorithm, which is an integer-preserving version of the Gaussian elimination, and the LLL algorithm, which is for lattice basis reduction. In both cases, we extend the algorithm from Z n to K n , and give a polynomial upper bound on the running time when the computations in K are performed exactly (as opposed to floating-point approximations). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Distributed and individualized computation offloading optimization in a fog computing environment.
- Author
-
Li, Keqin
- Subjects
- *
NASH equilibrium , *COMBINATORIAL optimization , *ALGORITHMS , *MATHEMATICAL optimization , *INFORMATION technology - Abstract
• Formulate a non-cooperative game with both UEs and MECs as players. • Prove the existence of and convergence to a Nash equilibrium. • Develop a set of algorithms to find the Nash equilibrium. • Demonstrate numerical examples of non-cooperative games with and without MECs' participation. In a newly emerged fog computing environment, various user equipments (UE) enhance their computing power and extend their battery lifetime by computation offloading to mobile edge cloud (MEC) servers. Such an environment is distributed and competitive in nature. In this paper, we take a game theoretical approach to computation offloading optimization in a fog computing environment. Such an approach captures and characterizes the nature of a competitive environment. The main contributions of the paper can be summarized as follows. First, we formulate a non-cooperative game with both UEs and MECs as players. Each UE attempts to minimize the execution time of its tasks with an energy constraint. Each MEC attempts to minimize the product of its power consumption for computation and execution time for allocated tasks. Second, we develop a heuristic algorithm for a UE to determine its "heuristically" best response to the current situation, an algorithm for an MEC to determine its best response to the current situation, and an iterative algorithm to find the Nash equilibrium. Third, we prove that our iterative algorithm converges to a Nash equilibrium. We demonstrate numerical examples of our non-cooperative games with and without MECs' participation. We observe that our iterative algorithm always quickly converges to a Nash equilibrium. The uniqueness of our non-cooperative games is that the strategy set of a player can be discrete and the payoff function of a player can be obtained by a heuristic algorithm for combinatorial optimization. To the best of the author's knowledge, there has been no such investigation of non-cooperative games based on combinatorial optimization for computation offloading optimization in a fog computing environment. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
15. Application of the Layered Algorithm in search of an airborne contaminant source.
- Author
-
Szaban, Miroslaw and Wawrzynczak, Anna
- Subjects
- *
SEARCH algorithms , *METROPOLITAN areas , *CELLULAR automata , *ALGORITHMS , *MATHEMATICAL optimization , *MICROBIOLOGICAL aerosols - Abstract
The paper presents a new method of optimization by the Layered Algorithm (LA). The proposed algorithm reduces the initial area to the sub-area containing the optimum. The proposed technique is based on the classification of the optimized function values (data sampled by sensors). The classification method uses a two-dimensional three-state Cellular Automata (CA). The CA classifies all area points ascribed to the CA cells based on their values. Specification of the categorization layers to the data gives a possibility to identify the different levels areas. Consequently, after analysis, a sub-area containing the optimum can be designated. In this paper, the proposed algorithm is applied to find the location of the airborne contaminant source by analyzing the concentration of released substances reported by mobile sensors distributed over the domain of interest. The Gaussian dispersion model simulation of the contaminant dispersion in the urbanized area is applied to generate the data used to verify the efficiency of the proposed Layered Algorithm. The LA successfully estimates the sub-area of the considered domain where the contamination source is located, taking to account data from sensors solely. • Optimization method reducing the initial area to the sub-area containing the optimum. • Verification of the Layered Algorithm (LA) with Selected Benchmark Functions. • Analysis of the LA efficiency to localize the airborne contaminant source. • LA results assessment by the measures: classification and accuracy error, relevance. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
16. A high-performance VLSI array reconfiguration scheme based on network flow under row and column rerouting.
- Author
-
Ding, Hao, Qian, Junyan, Zhao, Lingzhong, and Zhai, Zhongyi
- Subjects
- *
VERY large scale circuit integration , *ARRAY processors , *TELECOMMUNICATION equipment , *ALGORITHMS , *BOTTLENECKS (Manufacturing) - Abstract
• A network flow model of the VLSI processor array is constructed under row and column rerouting. • A new strategy for selecting the bottleneck row in the logic array using the minimum cutting technique is proposed. • The proposed schemes significantly reduce the interconnect redundancy of the logical subarray. The reconfiguration algorithms have been extensively investigated to ensure the reliability and stability for the processor arrays with faults. It is important to reduce the power consumption, capacitance and communication costs in the processors by reducing the interconnection length of the VLSI array. This paper discusses the reconfiguration problem of the high-performance VLSI processor array under the row and column rerouting constraints. A novel method, making use the idea of network flow, is proposed in this paper. Firstly, a network flow model of the VLSI processor array is constructed, such that the high-performance VLSI target array can be obtained by utilizing the minimal cost flow algorithm. Secondly, we propose a new strategy for bottleneck row selection in the logical array using the minimum cut technique, which can find a more suitable bottleneck row. Finally, we conducted reliable experiments to clearly reveal the efficiency of the new rerouting scheme and algorithm in reducing the number of long interconnects. The experimental results show that, for a host array with size of 256×256, the number of long interconnects in the subarray can be reduced by up to 79.22% and 55.88% without performance penalty for random faults with density of 1% and 25% respectively, when compared with state-of-the-art. In addition, the proposed scheme improves existing algorithm in terms of subarray size. On a 256×256 host array with 25% faulty density, the average improvement in subarray size is up to 3.77% compared with state-of-the-art. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
17. A two-phase heuristic algorithm for power-aware offline scheduling in IaaS clouds.
- Author
-
Ignatov, A., Maslova, I., Posypkin, M., Yang, W., and Wu, J.
- Subjects
- *
HEURISTIC algorithms , *SCHEDULING , *ONLINE algorithms , *ALGORITHMS - Abstract
The paper aims at mitigating hot-spots during Offline Scheduling in IaaS (Infrastructure-as-a-Service) cloud systems. Unlike previous studies, the research focuses on identifying and resolving hot-spots not at servers, but at server racks. A two-phase algorithm for performing power-aware offline scheduling is proposed. The first phase aims at identifying and mitigating hot-spots at racks, while the second phase performs VM consolidation, i.e. minimization of the number of occupied servers while maintaining a feasible VM mapping and low migration costs. The proposed algorithm takes into account the dynamic nature of VM's resource consumption: it does not only resolve detected hot-spots, but also tries to avoid hot-spots in a reasonable future time period. The algorithm was tested with the data from a real IaaS cloud with different sets of algorithm's parameters. Experimental evaluation showed that the statistical estimates of the future VM's resource consumption provide the most reliable mapping, which is a result of minimization of the number of new hot-spot occurrences. • A server rack is a hot-spot if it exceeds the upper bound of power consumption. • Hot-spot identification, mitigation, and avoidance is crucial for IaaS systems. • An algorithm is proposed for rack hot-spot mitigation and consolidation of VMs. • Considering future CPU utilization values is associated with migrating more memory. • Ignoring future CPU utilization values leads to more new hot-spots happening shortly. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. A new algorithm for computing staggered linear bases.
- Author
-
Hashemi, Amir and Michael Möller, H.
- Subjects
- *
GROBNER bases , *VECTOR fields , *ALGORITHMS , *VECTOR spaces , *POLYNOMIALS - Abstract
Considering a multivariate polynomial ideal over a given field as a vector space, we investigate for such an ideal a particular linear basis, a so-called staggered linear basis, which contains a Gröbner basis as well. In this paper, we present a simple and efficient algorithm to compute a staggered linear basis. The new framework is equipped with some novel criteria (including both Buchberger's criteria) to detect superfluous reductions. The proposed algorithm has been implemented in Maple, and we provide an illustrative example to show how it works. Finally, the efficiency of this algorithm compared to the existing methods is discussed via a set of benchmark polynomials. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
19. Affinity-aware resource provisioning for long-running applications in shared clusters.
- Author
-
Mommessin, Clément, Yang, Renyu, Shakhlevich, Natalia V., Sun, Xiaoyang, Kumar, Satish, Xiao, Junqing, and Xu, Jie
- Subjects
- *
ALGORITHMS - Abstract
Resource provisioning plays a pivotal role in determining the right amount of infrastructure resource to run applications and reduce the monetary cost. A significant portion of production clusters is now dedicated to long-running applications (LRAs), which are typically in the form of microservices and executed in the order of hours or even months. It is therefore practically important to plan ahead the placement of LRAs in a shared cluster for the minimized number of compute nodes required by them. Existing works on LRA scheduling are often application-agnostic, without particularly addressing the constraining requirements imposed by LRAs, such as co-location affinity constraints and time-varying resource requirements. In this paper, we present an affinity-aware resource provisioning approach for deploying large-scale LRAs in a shared cluster subject to multiple constraints, with the objective of minimizing the number of compute nodes in use. We investigate a broad range of solution algorithms which fall into three main categories: Application-Centric, Node-Centric, and Multi-Node approaches, and tune them for typical large-scale real-world scenarios. Experimental studies driven by the Alibaba Tianchi dataset show that our algorithms can achieve competitive scheduling effectiveness and running time, as compared with the heuristics used by the latest work including Medea and Lra Sched. Best results are obtained by the Application-Centric algorithms, if the algorithm's running time is of primary concern, and by Multi-Node algorithms, if the solution quality is of primary concern. • Resource provisioning for long-running applications is studied in shared clusters. • Temporal resource requests and application-level affinity are considered. • Application-Centric algorithms win if the running time is of primary concern. • Node-Centric Algorithms offer a trade-off between solution quality and time. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. ParaLiNGAM: Parallel causal structure learning for linear non-Gaussian acyclic models.
- Author
-
Shahbazinia, Amirhossein, Salehkaleybar, Saber, and Hashemi, Matin
- Subjects
- *
ACYCLIC model , *MACHINE learning , *PARALLEL algorithms , *ALGORITHMS - Abstract
One of the key objectives in many fields in machine learning is to discover causal relationships among a set of variables from observational data. In linear non-Gaussian acyclic models (LiNGAM), it can be shown that the true underlying causal structure can be identified uniquely from merely observational data. The DirectLiNGAM algorithm is a well-known solution to learn the true causal structure in a high dimensional setting. DirectLiNGAM algorithm executes in a sequence of iterations and it performs a set of comparisons between pairs of variables in each iteration. Unfortunately, the runtime of this algorithm grows significantly as the number of variables increases. In this paper, we propose a parallel algorithm, called ParaLiNGAM, to learn casual structures based on DirectLiNGAM algorithm. We propose a threshold mechanism that can reduce the number of comparisons remarkably compared with the sequential solution. Moreover, in order to further reduce runtime, we employ a messaging mechanism between workers. We also present an implementation of ParaLiNGAM on GPU, considering hardware constraints. Experimental results on synthetic and real data show that our proposed solution outperforms DirectLiNGAM by a factor up to 4788X, and by a median of 2344X. • A scalable algorithm for learning causal structures in linear non-Gaussian models. • Significant speedup by selecting the necessary computations judiciously. • Speedup ratio of up to 4788 on GPUs in real-world and synthetic datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
21. Toward the best algorithm for approximate GCD of univariate polynomials.
- Author
-
Nagasaka, Kosaku
- Subjects
- *
POLYNOMIALS , *ALGORITHMS , *SYLVESTER matrix equations - Abstract
Approximate polynomial GCD (greatest common divisor) of polynomials with a priori errors on their coefficients, is one of interesting problems in Symbolic-Numeric Computations. In fact, there are many known algorithms: QRGCD, UVGCD, STLN based methods, Fastgcd and so on. The fundamental question of this paper is "which is the best?" from the practical point of view, and subsequently "is there any better way?" by any small extension, any effect by pivoting, and any combination of sub-routines along the algorithms. In this paper, we consider a framework that covers those algorithms and their sub-routines, and makes their sub-routines being interchangeable between the algorithms (i.e. disassembling the algorithms and reassembling their parts). By this framework along with/without small new extensions and a newly adapted refinement sub-routine, we have done many performance tests and found the current answer. In summary, 1) UVGCD is the best way to get smaller tolerance, 2) modified Fastgcd is better for GCD that has one or more clusters of zeros with large multiplicity, and 3) modified ExQRGCD is better for GCD that has no cluster of zeros. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
22. BSF: A parallel computation model for scalability estimation of iterative numerical algorithms on cluster computing systems.
- Author
-
Sokolinsky, Leonid B.
- Subjects
- *
COMPUTER workstation clusters , *COMPUTER systems , *SCALABILITY , *PARALLEL algorithms , *ALGORITHMS - Abstract
This paper examines a novel parallel computation model called bulk synchronous farm (BSF) that focuses on estimating the scalability of compute-intensive iterative algorithms aimed at cluster computing systems. The main advantage of the proposed model is that it allows to estimate the scalability of a parallel algorithm before its implementation. Another important feature of the BSF model is the representation of problem data in the form of lists that greatly simplifies the logic of building applications. In the BSF model, a computer is a set of processor nodes connected by a network and organized according to the master/slave paradigm. A cost metric of the BSF model is presented. This cost metric requires the algorithm to be represented in the form of operations on lists. This allows us to derive an equation that predicts the scalability boundary of a parallel program: the maximum number of processor nodes after which the speedup begins to decrease. The paper includes examples of applying the BSF model to designing and analyzing parallel numerical algorithms. The large-scale computational experiments conducted on a cluster computing system confirm the adequacy of the analytical estimations obtained using the BSF model. • Scalability is important property of parallel algorithm. • The main measure for evaluating scalability is speedup. • Algorithm is scalable if speedup increases linearly with number of processors. • Scalability boundary is number of processors at which speedup peak is reached. • Parallel computation model can predict scalability boundary of parallel algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
23. A decentralized algorithm to combine topology control with network coding.
- Author
-
Khalily-Dermany, Moammad
- Subjects
- *
LINEAR network coding , *WIRELESS sensor networks , *ELECTRIC network topology , *SUBGRADIENT methods , *TOPOLOGY , *ALGORITHMS , *DECOMPOSITION method , *ROUTING algorithms - Abstract
Network coding and topology control techniques have been widely used to increase throughput and improve the lifetime of Wireless Sensor Networks (WSNs). This paper considers the simultaneous utilization of these techniques in a WSN and proposes convex non-linear programming. Since solving the problem for a large-scale and dynamic WSN is impractical and almost impossible, Lagrangian, sub-gradient and the decomposition methods are employed to provide a decentralized algorithm. In the proposed algorithm, a node makes the computations by acquiring local knowledge and information from its neighbors. This paper provides a mathematical language to build an analytic foundation for the design of a modularized and decentralized algorithm that provides transmission ranges and routes for a WSN. The simulation results show that increasing the number of sources, sinks, sensors, and traffic load leads to improving the lifetime which is acquired by the proposed algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
24. An extension of holonomic sequences: C2-finite sequences.
- Author
-
Jiménez-Pastor, Antonio, Nuspl, Philipp, and Pillwein, Veronika
- Subjects
- *
LINEAR equations , *COMPUTER scientists , *MATHEMATICIANS , *DIFFERENCE equations , *POLYNOMIALS , *SHIFT registers - Abstract
Holonomic sequences are widely studied as many objects interesting to mathematicians and computer scientists are in this class. In the univariate case, these are the sequences satisfying linear recurrences with polynomial coefficients and also referred to as D -finite sequences. A subclass are C -finite sequences satisfying a linear recurrence with constant coefficients. We investigate the set of sequences which satisfy linear recurrence equations with coefficients that are C -finite sequences and call them C 2 -finite sequences. These sequences are a natural generalization of holonomic sequences. In this paper, we show that C 2 -finite sequences form a difference ring and provide methods to compute in this ring. Furthermore, we provide an analogous construction for D 2 -finite sequences, i.e., sequences satisfying a linear recurrence with holonomic coefficients. We show that these constructions can be iterated and obtain an increasing chain of difference rings. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
25. EdgeDecAp: An auction-based decentralized algorithm for optimizing application placement in edge computing.
- Author
-
Smolka, Sven, Wißenberg, Leon, and Mann, Zoltán Ádám
- Subjects
- *
ALGORITHMS , *LOCAL knowledge , *SERVER farms (Computer network management) , *MATHEMATICAL optimization , *EDGE computing , *DECISION making - Abstract
In edge computing, application components can be placed over a range of computational devices from cloud data centers to nodes at the network edge. Application placement can have significant impact on important metrics like latency and resource utilization. Thus, application placement is an important optimization problem. In edge computing, the characteristics of both the infrastructure and the application may change over time, which may require the dynamic re-optimization of the application placement. Most algorithms suggested so far for the dynamic re-optimization of edge application placement are centralized, i.e., they rely on one entity collecting information from the whole infrastructure and making decisions centrally. However, centralized approaches suffer from limited scalability and are vulnerable to failures. In this paper, we present a decentralized approach for the dynamic re-optimization of edge application placement. We adopt an algorithm of Malek et al. for distributed systems and modify it to make it applicable to edge computing. In this approach, each node makes decisions autonomously, using auctions for coordination. Our empirical results demonstrate that the proposed algorithm is very effective in optimizing edge application placement. In an edge system with 637 edge nodes and 563 end devices, our algorithm achieves 54% higher reduction of application latency than a previous decentralized algorithm. • Optimized placement of application components on cloud and edge devices. • Decentralized optimization algorithm using local knowledge and local modifications. • Auctions and distributed locking for synchronization. • Minimization of application latency. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. IoT architecture for adaptation to transient devices.
- Author
-
Ariza, Jairo, Garcés, Kelly, Cardozo, Nicolás, Sánchez, Juan Pablo Rodríguez, and Vargas, Fernando Jiménez
- Subjects
- *
ALGORITHMS , *SELF-adaptive software - Abstract
IoT environments are continuously changing. Changes may come from the service, connectivity, or physical layers of the IoT architecture. Therefore, to function appropriately, the system needs to dynamically adapt to its environment. In previous work, we posited eight challenges to foster adaptation through all architecture layers of IoT systems. In this paper, we address the challenges to manage the inclusion of new devices and devices' transient connection , by means of dynamic adaptations incorporated into our proposed software architecture for adaptive IoT systems. To manage dynamic adaptations, we extend the reference IoT architecture with our specialized components. In particular, we use (1) ontologies and instances to represent the domain knowledge; (2) a matching algorithm to pair services and IoT devices, taking into account their functional requirements, quality attributes and sensors properties; and (3) a match update algorithm used whenever sensors become (un)available. We evaluate the effectiveness of our solution with respect to the accuracy of matching services and IoT devices, and the response to environment changes. • This paper presents a new adaptive architecture for IoT Systems. • Ontology matching algorithm-based architecture to manage transient connections. • Functional, non-functional, and QoS matching algorithm between IoT components. • Matching algorithm with better precision and recall compared with the gold-standard. • Architecture deployed in a real case study for a SUDS system in the city of Bogotá. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
27. Circular pattern matching with k mismatches.
- Author
-
Charalampopoulos, Panagiotis, Kociumaka, Tomasz, Pissis, Solon P., Radoszewski, Jakub, Rytter, Wojciech, Straszyński, Juliusz, Waleń, Tomasz, and Zuba, Wiktor
- Subjects
- *
PATTERN matching , *ALGORITHMS , *HAMMING distance - Abstract
We consider the circular pattern matching with k mismatches (k -CPM) problem in which one is to compute the minimal Hamming distance of every length- m substring of T and any cyclic rotation of P , if this distance is no more than k. It is a variation of the well-studied k -mismatch problem. A multitude of papers has been devoted to solving the k -CPM problem, but only average-case upper bounds are known. In this paper, we present the first non-trivial worst-case upper bounds for this problem. Specifically, we show an O (n k) -time algorithm and an O (n + n m k 4) -time algorithm. The latter algorithm applies in an extended way a technique that was very recently developed for the k -mismatch problem Bringmann et al. (2019) [10]. A preliminary version of this work appeared at FCT 2019 [35]. In this version we improve the time complexity of the second algorithm from O (n + n m k 5) to O (n + n m k 4). [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
28. Constructive arithmetics in Ore localizations enjoying enough commutativity.
- Author
-
Hoffmann, Johannes and Levandovskyy, Viktor
- Subjects
- *
COMMUTATIVE algebra , *ARITHMETIC , *NONCOMMUTATIVE rings , *ALGORITHMS , *ORES , *LOCALIZATION (Mathematics) - Abstract
This paper continues a research program on constructive investigations of non-commutative Ore localizations, initiated in our previous papers, and particularly touches the constructiveness of arithmetics within such localizations. Earlier we have introduced monoidal, geometric and rational types of localizations of domains as objects of our studies. Here we extend this classification to rings with zero divisors and consider Ore sets of the mentioned types which are commutative enough: such a set either belongs to a commutative algebra or it is central or its elements commute pairwise. By using the systematic approach we have developed before, we prove that arithmetic within the localization of a commutative polynomial algebra is constructive and give the necessary algorithms. We also address the important question of computing the local closure of ideals which is also known as the desingularization, and present an algorithm for the computation of the symbolic power of a given ideal in a commutative ring. We also provide algorithms to compute local closures for certain non-commutative rings with respect to Ore sets with enough commutativity. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
29. Vulnerability assessment of fault-tolerant optical network-on-chips.
- Author
-
Abdollahi, Meisam and Mohammadi, Siamak
- Subjects
- *
DATA transmission systems , *SYSTEMS on a chip , *OPTICAL resonators , *OPTICAL devices , *ALGORITHMS , *MOTIVATION (Psychology) - Abstract
Multi/Many-core systems based on the traditional electrical network-on-chip are confronted with the limited bandwidth, high latency, and reliability challenges. The emerging optical solution of silicon photonics has promised to improve the design parameters of electrical interconnections for future multiprocessor system on chips. Although the optical network-on-chip has advantages in terms of reliability compared to the electrical ones, important phenomena such as crosstalk noise, process variation, and temperature fluctuations must be carefully considered. These physical challenges have substantial adverse effects on the correct functionality of on-chip optical devices such as microring resonator and the optical waveguide. Malfunction of these elements may cause the injection of faults that should be tolerated by the reliable system. In this paper, we have assessed the effect of the fault-tolerant design of optical routers on the reliability parameters through application mapping. We propose a fault-tolerant algorithm to modify optical routers to improve the reliability parameter. The vulnerability analysis of the proposed algorithm shows that besides obtaining the fault-tolerant capability in optical routers, only about 9.63% and 19.29% of SNR decrease for real-world applications and seven traditional optical routers are achieved in the case of a single fault and two faults injections, respectively. • In the first step, we explain the primitives of optical on-chip data transmission and discuss on its reliability challenges and also the paper's motivation. • A general framework to construct a fault-tolerant optical router. • Evaluation the reliability parameters of various fault-tolerant optical routers. • Optimized application mapping for fair comparison in different optical routers. • Vulnerability analysis through fault injection for the proposed fault-tolerant schema. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
30. Integration of multi-objective PSO based feature selection and node centrality for medical datasets.
- Author
-
Rostami, Mehrdad, Forouzandeh, Saman, Berahmand, Kamal, and Soltani, Mina
- Subjects
- *
FEATURE selection , *REPRESENTATIONS of graphs , *ALGORITHMS , *CENTRALITY , *DATABASES , *COMPUTER engineering - Abstract
In the past decades, the rapid growth of computer and database technologies has led to the rapid growth of large-scale medical datasets. On the other, medical applications with high dimensional datasets that require high speed and accuracy are rapidly increasing. One of the dimensionality reduction approaches is feature selection that can increase the accuracy of the disease diagnosis and reduce its computational complexity. In this paper, a novel PSO-based multi objective feature selection method is proposed. The proposed method consists of three main phases. In the first phase, the original features are showed as a graph representation model. In the next phase, feature centralities for all nodes in the graph are calculated, and finally, in the third phase, an improved PSO-based search process is utilized to final feature selection. The results on five medical datasets indicate that the proposed method improves previous related methods in terms of efficiency and effectiveness. • In this paper, a novel feature selection method is developed by integrating of node centrality and PSO algorithm. • Proposed method selects a subset of features in the medical dataset with lowest redundancy and the highest relevance. • In this proposed method the optimal number of final feature set is determined automatically. • Experimental results showed that the proposed method has the best performance among different disease diagnosis methods. • The results on five medical datasets indicate that the proposed method improves disease diagnosis prediction accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
31. A parallel branch-and-bound algorithm with history-based domination and its application to the sequential ordering problem.
- Author
-
Gonggiatgul, Taspon, Shobaki, Ghassan, and Muyan-Özçelik, Pınar
- Subjects
- *
MULTICORE processors , *PARALLEL algorithms , *NP-hard problems , *NP-complete problems , *ALGORITHMS , *COMBINATORIAL optimization - Abstract
In this paper, we describe the first parallel Branch-and-Bound (B&B) algorithm with a history-based domination technique. Although history-based domination substantially speeds up a B&B search, it makes parallelization much more challenging. Our algorithm is the first parallel exact algorithm for the Sequential Ordering Problem using a pure B&B approach. To effectively explore the solution space, we have developed three novel parallelization techniques: thread restart, parallel history domination, and history-table memory management. The proposed algorithm was experimentally evaluated using the SOPLIB and TSPLIB benchmarks on multi-core processors. Using 32 threads with a time limit of one hour, the algorithm gives geometric-mean speedups of 72x and 20x on the medium-difficulty SOPLIB and TSPLIB instances, respectively. On the hard instances, it solves 12 instances that the sequential algorithm does not solve, with geometric-mean speedups of 16x on SOPLIB and 32x on TSPLIB. Super-linear speedups up to 366x are seen on 16 instances. • We propose the first parallel B&B algorithm that involves a history-based domination technique. • The proposed algorithm includes three novel parallelization techniques: thread restart, parallel history-based domination, and history-table memory management. • The proposed algorithm is the first parallel B&B algorithm for the Sequential Ordering Problem, which is an NP-hard sequencing problem with precedence constraints. • We present a thorough experimental evaluation of the proposed parallel algorithm on SOPLIB and TSPLIB. The results show that super-linear speedup is not an anomaly; it can be achieved on many instances. We report a speedup ratio as high as 285 on a 32-core processor. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. Simulating structural plasticity of the brain more scalable than expected.
- Author
-
Czappa, Fabian, Geiß, Alexander, and Wolf, Felix
- Subjects
- *
SYNAPSES , *SCALABILITY , *ALGORITHMS , *NEURONS - Abstract
Structural plasticity of the brain describes the creation of new and the deletion of old synapses over time. Rinke et al. (JPDC 2018) introduced a scalable algorithm that simulates structural plasticity for up to one billion neurons on current hardware using a variant of the Barnes–Hut algorithm. They demonstrate good scalability and prove a runtime complexity of O (n log 2 n). In this comment paper, we show that with careful consideration of the algorithm and a rigorous proof, the theoretical runtime can even be classified as O (n log n). • Improved serial runtime bound for the adapted Barnes–Hut algorithm to Θ(n*log(n)). • Improved the parallel runtime bound to Θ(n/p*log(n)+p). • Mathematical justification that the given runtime bound is sharp. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Faster Graph bipartization.
- Author
-
Kolay, Sudeshna, Misra, Pranabendu, Ramanujan, M.S., and Saurabh, Saket
- Subjects
- *
OPERATIONS research , *GEOMETRIC vertices , *TRANSVERSAL lines , *ALGORITHMS , *LINEAR dependence (Mathematics) - Abstract
In the Graph bipartization (or Odd Cycle Transversal) problem, the objective is to decide whether a given graph G can be made bipartite by the deletion of k vertices for some given k. The parameterized complexity of Odd Cycle Transversal was resolved in the breakthrough paper of Reed, Smith and Vetta [Operations Research Letters, 2004], who developed an algorithm running in time O (3 k k m n). The question of improving the dependence on the input size to linear, which was another long standing open problem in the area, was resolved by Iwata et al. [SICOMP 2016] and Ramanujan and Saurabh [TALG 2017], who presented O (4 k (m + n)) and 4 k k O (1) (m + n) algorithms respectively. In this paper, we obtain a faster algorithm that runs in time 3 k k O (1) (m + n) and hence preserves the linear dependence on the input size while nearly matching the dependence on k incurred by the algorithm of Reed, Smith and Vetta. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
34. Investigations on the approximability and computability of the Hilbert transform with applications.
- Author
-
Boche, Holger and Pohl, Volker
- Subjects
- *
HILBERT transform , *BANACH spaces , *FUNCTION spaces , *ALGORITHMS - Abstract
It was recently shown that on a large class of important Banach spaces there exist no linear methods which are able to approximate the Hilbert transform from samples of the given function. This implies that there is no linear algorithm for calculating the Hilbert transform which can be implemented on a digital computer and which converges for all functions from the corresponding Banach spaces. The present paper develops a much more general framework which also includes non-linear approximation methods. All algorithms within this framework have only to satisfy an axiom which guarantees the computability of the algorithm based on given samples of the function. The paper investigates whether there exists an algorithm within this general framework which converges to the Hilbert transform for all functions in these Banach spaces. It is shown that non-linear methods give actually no improvement over linear methods. Moreover, the paper discusses some consequences regarding the Turing computability of the Hilbert transform and the existence of computational bases in Banach spaces. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
35. A parallel algorithm for constructing multiple independent spanning trees in bubble-sort networks.
- Author
-
Kao, Shih-Shun, Klasing, Ralf, Hung, Ling-Ju, Lee, Chia-Wei, and Hsieh, Sun-Yuan
- Subjects
- *
SPANNING trees , *RECURSIVE functions , *TIME complexity , *PROBLEM solving , *PARALLEL algorithms , *ALGORITHMS - Abstract
The use of multiple independent spanning trees (ISTs) for data broadcasting in networks provides a number of advantages, including the increase of fault-tolerance and secure message distribution. Thus, the designs of multiple ISTs on several classes of networks have been widely investigated. Kao et al. (2019) [18] proposed an algorithm to construct independent spanning trees in bubble-sort networks. The algorithm is executed in a recursive function and thus is hard to parallelize. In this paper, we focus on the problem of constructing ISTs in bubble-sort networks B n and present a non-recursive algorithm. Our approach can be fully parallelized, i.e., every vertex can determine its parent in each spanning tree in constant time. This solves the open problem from the paper by Kao et al. Furthermore, we show that the total time complexity O (n ⋅ n !) of our algorithm is asymptotically optimal, where n is the dimension of B n and n ! is the number of vertices of the network. • We present an algorithm for constructing multiple independent spanning trees in bubble-sort networks. • Our approach can be fully parallelized, thus solving an open problem from Kao et al. (2019). • We show that the total time complexity of our algorithm is asymptotically optimal. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Implementation of bio-inspired hybrid algorithm with mutation operator for robotic path planning.
- Author
-
Gul, Faiza, Mir, Imran, Alarabiat, Deemah, Alabool, Hamzeh Mohammad, Abualigah, Laith, and Mir, Suleman
- Subjects
- *
ROBOTIC path planning , *BIOLOGICALLY inspired computing , *PARTICLE swarm optimization , *ALGORITHMS , *NP-hard problems , *GENETIC algorithms - Abstract
Path planning is an NP-hard problem that is aimed to satisfy multi-constraint optimization requirements. In autonomous robotics applications, path planning together with collision avoidance presents a challenging task. It necessitates the generation of possible search directions from a designated point to a fixed varying destination location satisfying spatial constraints. This paper presents a framework for the design of an intelligent multi-objective robotic path planning algorithm. The algorithm relies on the generation of way-points by hybridizing two meta-heuristics techniques, namely Grey Wolf Algorithm (GWO) and Particle Swarm Optimization (PSO). A frequency-based modification in GWO search operators is introduced to fasten the search process. An improvised search strategy is employed for collision detection and avoidance, which converts non-desired points into the desired point. Sensors are deployed around the robot vicinity for search optimization. Mutation operators are then introduced to improve path length by smoothing out the trajectory. The proposed algorithm's effectivity is then validated through extensive simulations, in which different condition environments are simulated. To validate the effectiveness of the proposed methodology, the results are compared with contemporary algorithms namely Minimum Angle Artificial Bee Colony (MAABC) algorithms, Hybrid Cuckoo Search-Bat Algorithm (BA-CSA), Bacterial Bolony (BC) and Genetic Algorithm (GA) algorithms. The results conclusively demonstrated that the proposed algorithm ensures effective performance in path smoothness and safety under a wide range of conditions. • A novel intelligent multi-objective framework for robotic path planning algorithm. • A hybridizing two meta-heuristics techniques to solve the path planning. • An improvised search strategy is employed for collision detection and avoidance. • The proposed algorithm's effectively validated through extensive simulations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
37. CP-SGD: Distributed stochastic gradient descent with compression and periodic compensation.
- Author
-
Yu, Enda, Dong, Dezun, Xu, Yemao, Ouyang, Shuo, and Liao, Xiangke
- Subjects
- *
TRAIN delays & cancellations , *IMAGE compression , *DEEP learning , *SCALABILITY , *ALGORITHMS - Abstract
Communication overhead is the key challenge for distributed training. Gradient compression is a widely used approach to reduce communication traffic. When combined with a parallel communication mechanism method like pipeline, gradient compression technique can greatly alleviate the impact of communication overhead. However, there exist two problems of gradient compression technique to be solved. Firstly, gradient compression brings in extra computation cost, which will delay the next training iteration. Secondly, gradient compression usually leads to a decrease in convergence accuracy. In this paper, we combine parallel mechanism with gradient quantization and periodic full-gradient compensation, and propose a new distributed optimization method named CP-SGD, which can hide the overhead of gradient compression, overlap part of the communication and obtain high convergence accuracy. The local update operation in CP-SGD allows the next iteration to be launched quickly without waiting for the completion of gradient compression and the current communication process. Besides, the accuracy loss caused by gradient compression is solved by k-step correction method introduced in CP-SGD, which provides a gradient correction every k iterations. We prove that CP-SGD has a convergence guarantee and it achieves at least O (1 K + 1 K) convergence rate, where K is the number of iterations. We conduct extensive experiments on MXNet to verify the convergence properties and scaling performance of CP-SGD. Experimental results on a 32-GPU cluster show that convergence accuracy of CP-SGD is close to or even slightly better than that of S-SGD, and its end-to-end time is 30% less than 2-bit gradient compression under a 56Gbps bandwidth environment. In addition, we analyze the performance of CP-SGD when training on 8, 16 and 32 GPUs. It is found that CP-SGD is suitable for most compression-supported update algorithms, and its scalability is approximately linear. • CP-SGD: Distributed Stochastic Gradient Descent with Compression and Periodic Compensation. • CP-SGD help distributed deep learning train with high-accuracy and friendly-communication simultaneously. • Combining the advantages of parallel mechanism and gradient compression. • K-step correction technique makes CP-SGD achieve codesign optimizations and improve compression-based methods greatly. • CP-SGD suits most compression-supported update algorithms and gradient compression strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
38. HY-DBSCAN: A hybrid parallel DBSCAN clustering algorithm scalable on distributed-memory computers.
- Author
-
Wu, Guoqing, Cao, Liqiang, Tian, Hongyun, and Wang, Wei
- Subjects
- *
MODERN architecture , *ALGORITHMS , *COMPUTER workstation clusters , *COMPUTERS , *PARALLEL algorithms , *DISTRIBUTED algorithms , *SCALABILITY - Abstract
• A parallel scalable DBSCAN algorithm which outperforms other implementations. • Optimizations for data partitioning, spatial indexing, and cluster merging. • Exploiting hybrid parallelization to take advantage of modern HPC architectures. • Demonstrating accuracy, performance and scalability of our algorithm. Dbscan is a density-based clustering algorithm which is well known for its ability to discover clusters of arbitrary shape as well as to distinguish noise. As it is computationally expensive for large datasets, research studies on the parallelization of Dbscan have been received a considerable amount of attention. In this paper we present an exact, efficient and scalable parallel Dbscan algorithm which we call Hy-Dbscan. It employs three major techniques to enable scalable data clustering on distributed-memory computers i) a modified kd-tree for domain decomposition, ii) a spatial indexing approach based on grid and inference, and iii) a cluster merging scheme based on distributed Rem's Union-Find algorithm. Moreover, Hy-Dbscan exploits process level and thread level parallelization. In experiments, we have demonstrated performance and scalability using two scientific datasets on up to 2048 cores of a distributed-memory computer. Through extensive evaluation, we show that Hy-Dbscan significantly outperforms previous state-of-the-art Dbscan implementations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
39. Relative approximate bisimulations for fuzzy picture automata.
- Author
-
Yang, Chao, Wu, Ruiling, Sun, Xiaobing, Wang, Qichao, and Li, Yongming
- Subjects
- *
BISIMULATION , *PICTURES , *ROBOTS , *ALGORITHMS - Abstract
In this paper, we firstly give the notion of relative ϵ -approximate bisimulations between two fuzzy two-dimensional on-line tessellation automata (F2OTAs) and elaborate that the relative error between two F2OTAs, accompanying with a relative ϵ -approximate bisimulation between them, is less than or equal to ϵ where ϵ ∈ [ 0 , 1 ]. Moreover, if relative ϵ -approximate bisimulations between two F2OTAs are restricted to be surjective functional, then some properties are drawn and by which relative ϵ -approximate bisimulations on a F2OTA are defined. We construct the factor F2OTA of a F2OTA with respect to a relative ϵ -approximate bisimulation on it and describe the relationship between this factor automaton and the original F2OTA. Whereafter, we novelly design two algorithms to compute all maximal relative ϵ -approximate bisimulations on a given F2OTA. Finally, we discuss some interesting properties of bisimulations on F2OTAs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Efficient topology reconfiguration for NoC-based multiprocessors: A greedy-memetic algorithm.
- Author
-
Qian, Junyan, Zhang, Chuanfang, Wu, Zheng, Ding, Hao, and Li, Long
- Subjects
- *
MULTIPROCESSORS , *MULTICORE processors , *ALGORITHMS , *COMMUNICATION infrastructure , *NETWORKS on a chip , *GREEDY algorithms - Abstract
In multi-core processor systems, the Network-on-Chip (NoC) serves as a vital communication infrastructure. To ensure chip reliability during potential failures, this paper proposes a two-level topology reconfiguration algorithm with core-level redundancy technology. Initially, a heuristic topology reconfiguration method utilizing a greedy strategy is proposed to perform local replacement of faulty processing elements (PEs) and generate an initial logical topology with shorter interconnection paths between PEs. Then, an intelligent optimization method based on memetic algorithm is introduced to optimize the generated initial topology for better communication performance. The experimental results demonstrate that compared to the current state-of-the-art algorithm, the proposed algorithm achieves an average improvement of 13.92% and 30.83% on various size topologies in terms of distance factor (DF) and congestion factor (CF), which represent communication delay and traffic balance respectively. The proposed algorithm significantly enhances the communication performance of the target topology, mitigating communication latency and potential congestion problems. • The candidate set aids greedy selection for optimal fault-free PE replacements. • Local greedy strategy reduces communication delay, eases congestion issues. • Memetic algorithm further optimizes the target topology in the reconfiguration. • The proposed algorithms show strong stability, adaptability in large-scale arrays. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Explainable AI Insights for Symbolic Computation: A case study on selecting the variable ordering for cylindrical algebraic decomposition.
- Author
-
Pickering, Lynn, del Río Almajano, Tereso, England, Matthew, and Cohen, Kelly
- Subjects
- *
MACHINE learning , *ARTIFICIAL intelligence , *SYMBOLIC computation , *COMPUTER systems , *ALGORITHMS - Abstract
In recent years there has been increased use of machine learning (ML) techniques within mathematics, including symbolic computation where it may be applied safely to optimise or select algorithms. This paper explores whether using explainable AI (XAI) techniques on such ML models can offer new insight for symbolic computation, inspiring new implementations within computer algebra systems that do not directly call upon AI tools. We present a case study on the use of ML to select the variable ordering for cylindrical algebraic decomposition. It has already been demonstrated that ML can make the choice well, but here we show how the SHAP tool for explainability can be used to inform new heuristics of a size and complexity similar to those human-designed heuristics currently commonly used in symbolic computation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. An energy and carbon-aware algorithm for renewable energy usage maximization in distributed cloud data centers.
- Author
-
Zhao, Daming and Zhou, Jiantao
- Subjects
- *
SERVER farms (Computer network management) , *RENEWABLE energy sources , *ENERGY consumption , *VIRTUAL machine systems , *CARBON emissions , *ALGORITHMS , *ECOLOGICAL impact - Abstract
The vigorous development and the increasing popularity of cloud computing highlight the necessity of reducing data center energy consumption and the environmental impact of carbon dioxide emissions. For geographically distributed data centers, cloud servers are connected to the conventional power grid and in addition they are supported by an attached renewable energy source. Since the carbon footprint rate of energy consumption has dynamic differences in space, reducing energy consumption does not mean decrease carbon emission, which indicates that energy consumption and carbon footprint need to be synergistically optimized. In this paper, an energy and carbon-aware algorithm for virtual machine placement is proposed. The goal is to obtain a virtual machine allocation scheme that aims to achieve the trade-off between energy consumption and carbon emissions by improving renewable energy utilization. The experimental results show that the proposed approach is more energy-efficient and greener, which can also maximize the renewable energy utilization with 73.11% while ensuring the SLA violation with 0.2% in comparison to the baseline algorithms. • Propose a novel energy-aware VM placement approach in cloud data centers. • Introduce renewable energy for managing energy consumption and carbon footprint. • Design renewable energy generation prediction method based on EEMD and TCN. • Demonstrate the superiority of the proposed algorithm with five baseline algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
43. Model-based selection of optimal MPI broadcast algorithms for multi-core clusters.
- Author
-
Nuriyev, Emin, Rico-Gallego, Juan-Antonio, and Lastovetsky, Alexey
- Subjects
- *
ALGORITHMS , *MULTICORE processors , *COMMUNICATION models , *BROADCASTING industry , *PROBLEM solving - Abstract
The performance of collective communication operations determines the overall performance of MPI applications. Different algorithms have been developed and implemented for each MPI collective operation, but none proved superior in all situations. Therefore, MPI implementations have to solve the problem of selecting the optimal algorithm for the collective operation depending on the platform, the number of processes involved, the message size(s), etc. The current solution method is purely empirical. Recently, an alternative solution method using analytical performance models of collective algorithms has been proposed and proved both accurate and efficient for one-process-per-CPU configurations. The method derives the analytical performance models of algorithms from their code implementation rather than from high-level mathematical definitions, and estimates the parameters of the models separately for each algorithm. The method is network and topology oblivious and uses the Hockney model for point-to-point communications. In this paper, we extend that selection method to the case of clusters of multi-core processors, where each core of the platform runs a process of the MPI application. We present the proposed approach using Open MPI broadcast algorithms, and experimentally validate it on three different clusters of multi-core processors, Grisou, Gros and MareNostrum4. • A novel model-based method for runtime selection of optimal collective algorithms. • Application of the proposed method to the Open MPI broadcast algorithms. • Experimental validation of the method on two modern multi-core clusters. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
44. SG-PBFT: A secure and highly efficient distributed blockchain PBFT consensus algorithm for intelligent Internet of vehicles.
- Author
-
Xu, Guangquan, Bai, Hongpeng, Xing, Jun, Luo, Tao, Xiong, Neal N., Cheng, Xiaochun, Liu, Shaoying, and Zheng, Xi
- Subjects
- *
DISTRIBUTED algorithms , *BLOCKCHAINS , *ALGORITHMS , *INTERNET , *INTERNET of things , *FAULT tolerance (Engineering) , *INTELLIGENT transportation systems - Abstract
• Our algorithm is more efficient, has less communication overhead, and has greater throughput than the original PBFT algorithm. • We propose a Vehicle - Service Provider - Roadside Unit architecture to ensure the orderliness and safety of vehicles. • We propose an identity authentication scheme based on blockchain. • Experimental results demonstrate that our solution is better than PBFT, G-PBFT and CPBFT. As an application of Internet of Things (IoT) technology, the Internet of Vehicles (IoV) faces two main security issues: (1) the central server of the IoV may not be powerful enough to support the centralized authentication of the rapidly increasing connected vehicles, (2) the IoV itself may not be robust enough to single-node attacks. To address these issues, in this paper, we propose SG-PBFT (Score Grouping-PBFT), a secure and efficient distributed consensus algorithm for blockchain applications in the IoV. The distributed structure can reduce the pressure on the central server and decrease the risk of single-node attacks. The SG-PBFT consensus algorithm improves the traditional practical Byzantine fault tolerance (PBFT) consensus algorithm by optimizing the PBFT consensus process and using a score grouping mechanism to achieve a higher consensus efficiency. The experimental results show that the method can greatly improve the consistency efficiency and effectively prevent single-node attacks. Specifically, when the number of consensus nodes reaches 1000, the consensus time of our algorithm is only about 27% of what is required for the state-of-the-art PBFT consensus algorithm. Our proposed SG-PBFT algorithm is versatile and can be used in other application scenarios which require high consensus efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
45. Medians in median graphs and their cube complexes in linear time.
- Author
-
Bénéteau, Laurine, Chalopin, Jérémie, Chepoi, Victor, and Vaxès, Yann
- Subjects
- *
CUBES , *CHARTS, diagrams, etc. , *TIME , *ALGORITHMS - Abstract
The median of a set of vertices P of a graph G is the set of all vertices x of G minimizing the sum of distances from x to all vertices of P. In this paper, we present a linear time algorithm to compute medians in median graphs. We also present a linear time algorithm to compute medians in the associated ℓ 1 -cube complexes. Our algorithm is based on the majority rule characterization of medians in median graphs and on a fast computation of parallelism classes of edges (Θ-classes) via Lexicographic Breadth First Search (LexBFS). We show that any LexBFS ordering of the vertices of a median graph satisfies the following fellow traveler property : the parents of any two adjacent vertices are also adjacent. Using the fast computation of the Θ-classes, we also compute the Wiener index (total distance) in linear time and the distance matrix in optimal quadratic time. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
46. Malicious code detection based on CNNs and multi-objective algorithm.
- Author
-
Cui, Zhihua, Du, Lei, Wang, Penghong, Cai, Xingjuan, and Zhang, Wensheng
- Subjects
- *
COMPUTER security vulnerabilities , *COMPUTER network security , *GENETIC algorithms , *CIPHERS , *ARTIFICIAL neural networks , *ALGORITHMS - Abstract
An increasing amount of malicious code causes harm on the internet by threatening user privacy as one of the primary sources of network security vulnerabilities. The detection of malicious code is becoming increasingly crucial, and current methods of detection require much improvement. This paper proposes a method to advance the detection of malicious code using convolutional neural networks (CNNs) and intelligence algorithm. The CNNs are used to identify and classify grayscale images converted from executable files of malicious code. Non-dominated Sorting Genetic Algorithm II (NSGA-II) is then employed to deal with the data imbalance of malware families. A series of experiments are designed for malware image data from Vision Research Lab. The experimental results demonstrate that the proposed method is effective, maintaining higher accuracy and less loss. • A technique for converting a malware binary to an image was introduced. • In this paper, a method based on CNN is used to identify and classify the malicious codes. • An effective data equilibrium approach based on the NSGA-II was designed. • The proposed method was demonstrated through the extensive experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
47. Linear rank-width of distance-hereditary graphs II. Vertex-minor obstructions.
- Author
-
Kanté, Mamadou Moustapha and Kwon, O-joung
- Subjects
- *
GRAPH theory , *POLYNOMIALS , *ALGORITHMS , *SUBGRAPHS , *GEOMETRIC vertices - Abstract
In the companion paper (Adler et al., 2017), we presented a characterization of the linear rank-width of distance-hereditary graphs, from which we derived an algorithm to compute it in polynomial time. In this paper, we investigate structural properties of distance-hereditary graphs based on this characterization. First, we prove that for a fixed tree T , every distance-hereditary graph of sufficiently large linear rank-width contains a vertex-minor isomorphic to T . We extend this property to bigger graph classes, namely, classes of graphs whose prime induced subgraphs have bounded linear rank-width. Here, prime graphs are graphs containing no splits. We conjecture that for every tree T , every graph of sufficiently large linear rank-width contains a vertex-minor isomorphic to T . Our result implies that it is sufficient to prove this conjecture for prime graphs. For a class Φ of graphs closed under taking vertex-minors, a graph G is called a vertex-minor obstruction for Φ if G ∉ Φ but all of its proper vertex-minors are contained in Φ . Secondly, we provide, for each k ⩾ 2 , a set of distance-hereditary graphs that contains all distance-hereditary vertex-minor obstructions for graphs of linear rank-width at most k . Also, we give a simpler way to obtain the known vertex-minor obstructions for graphs of linear rank-width at most 1. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
48. A two-grid method for incompressible miscible displacement problems by mixed finite element and Eulerian–Lagrangian localized adjoint methods.
- Author
-
Wang, Yang and Chen, Yanping
- Subjects
- *
MISCIBLE displacement (Petroleum engineering) , *INCOMPRESSIBLE flow , *FINITE element method , *LAGRANGIAN functions , *ALGORITHMS - Abstract
Abstract In this paper, we present a scheme for solving two-dimensional miscible displacement problems using Eulerian–Lagrangian localized adjoint methods and mixed finite element methods. Since only the velocity and not the pressure appears explicitly in the concentration equation, an Eulerian–Lagrangian localized adjoint method is used to solve the concentration equation and a mixed finite element method is used for the pressure equation. To linearize and decouple the mixed-method equations, we use a two-grid algorithm based on the Newton iteration method for this fully discrete problems. First, we solve the original nonlinear equations on the coarse grid, then, we solve the linearized problem on the fine grid using Newton iteration once. It is shown that the coarse grid can be much coarser than the fine grid and achieve asymptotically optimal approximation as long as the mesh sizes satisfy H = O (h 1 / 2) in this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
49. Computing with quadratic forms over number fields.
- Author
-
Koprowski, Przemysław and Czogała, Alfred
- Subjects
- *
ALGORITHMS , *HYPERBOLIC differential equations , *SCANNING electron microscopy , *INTEGERS - Abstract
This paper presents fundamental algorithms for the computational theory of quadratic forms over number fields. In the first part of the paper, we present algorithms for checking if a given non-degenerate quadratic form over a fixed number field is either isotropic (respectively locally isotropic) or hyperbolic (respectively locally hyperbolic). Next we give a method of computing the dimension of an anisotropic part of a quadratic form. The second part of the paper is devoted to algorithms computing two field invariants: the level and the Pythagoras number. Ultimately we present an algorithm verifying whether two number fields have isomorphic Witt rings (i.e. are Witt equivalent). [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
50. Approximate and incremental network function placement.
- Author
-
Lukovszki, Tamás, Rost, Matthias, and Schmid, Stefan
- Subjects
- *
COMPUTER networks , *DISTRIBUTED computing , *RANDOM access memory , *CENTRAL processing units , *COMPUTER storage devices , *INFORMATION networks - Abstract
The virtualization of modern computer networks introduces interesting new opportunities for a more flexible placement of network functions and middleboxes (firewalls, proxies, traffic optimizers, virtual switches, etc.). This paper studies approximation algorithms for the incremental deployment of a minimum number of middleboxes, such that capacity constraints at the middleboxes and length constraints on the communication routes are respected. Based on a new, purely combinatorial and rigorous proof of submodularity, we obtain our main result: a deterministic greedy approximation algorithm which only employs augmenting paths to serve future communication pairs. Hence, our algorithm does not require any changes to the locations of existing middleboxes or the preemption of previously served communication pairs when additional middleboxes are deployed. It is hence particularly attractive for incremental deployments. We prove that the achieved polynomial-time approximation bound is optimal, unless P = NP holds. This paper also initiates the study of a weighted problem variant, in which entire groups of nodes need to communicate via a middlebox, possibly at different rates. We present an LP relaxation and randomized rounding algorithm for this problem, leveraging an interesting connection to scheduling. We complement our formal results with a simulation study of a large set of synthetically generated instances. Our results indicate that the presented algorithms yield near-optimal solutions in practice. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.