15 results
Search Results
2. Statistical analysis of a hierarchical clustering algorithm with outliers.
- Author
- Klutchnikoff, Nicolas, Poterie, Audrey, and Rouvière, Laurent
- Subjects
- STATISTICS; CLUSTER analysis (Statistics); HIERARCHICAL clustering (Cluster analysis); ALGORITHMS
- Abstract
It is well known that, in the presence of outliers, the single linkage algorithm generally fails to identify clusters. In this paper, we construct a new version of this algorithm, less sensitive to outliers, and study both its theoretical properties and its practical behavior. In particular, we provide an oracle-type inequality which guarantees that our procedure recovers clusters with high probability under mild assumptions on the distribution of the outliers. Using this inequality, we prove the consistency of our method and exhibit rates of convergence in various situations. The performance of this approach is also assessed through simulation studies. A thorough comparison with several classical clustering algorithms on simulated data is presented. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
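To make the abstract's starting point concrete, here is a minimal sketch of single-linkage clustering with a naive outlier-pruning step (clusters smaller than `min_size` are relabelled as outliers). This is standard `scipy` single linkage plus an illustrative pruning rule, not the authors' procedure; the threshold and data are made up.

```python
# Minimal sketch: single linkage + naive pruning of tiny clusters.
# Illustrates the outlier sensitivity discussed in the abstract; the
# paper's actual robust algorithm is different.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def robust_single_linkage(X, dist_threshold, min_size=3):
    Z = linkage(X, method="single")                     # single-linkage dendrogram
    labels = fcluster(Z, t=dist_threshold, criterion="distance")
    sizes = np.bincount(labels)
    # Members of tiny clusters are treated as outliers (label 0)
    return np.where(sizes[labels] >= min_size, labels, 0)

rng = np.random.default_rng(0)
cluster_a = rng.normal(0.0, 0.1, size=(20, 2))
cluster_b = rng.normal(5.0, 0.1, size=(20, 2))
outlier = np.array([[2.5, 2.5]])                        # one isolated point
X = np.vstack([cluster_a, cluster_b, outlier])

labels = robust_single_linkage(X, dist_threshold=1.0)
print(labels[-1] == 0)                                  # outlier flagged
```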
3. A frequency based algorithm for identification of single and double cracked beams via a statistical approach used in experiment
- Author
- Mazanoglu, K. and Sabuncu, M.
- Subjects
- ALGORITHMS; FRACTURE mechanics; MATHEMATICAL models; FINITE element method; SPECTRUM analysis; CONTOURS (Cartography); ERRORS; STATISTICS
- Abstract
An algorithm for detecting cracks in beams and a statistical process for minimising measurement errors in experiments are presented in this paper. Natural frequencies are determined using a theoretical model for different depths and locations of a single crack. The ratios of cracked to un-cracked natural frequencies constitute prediction tables scaled along two axes, crack location and crack depth. Frequency contour lines corresponding to measured natural-frequency ratios are matched against the interpolated prediction table, called the frequency map, and used to detect a single crack. Contour lines alone, however, give no information about the existence of two cracks. The algorithm presented in this paper makes it possible to locate suitable positions of two cracks searched over the frequency map. The algorithm is tested on examples employing the frequency map prepared by the presented theory and input frequency ratios obtained from a commercial finite element program. It is also verified using the natural frequencies of cracked and un-cracked cantilever beams measured in several experiments. In measurement, determining accurate natural-frequency ratios is crucial for successful crack detection. Therefore, this paper also presents a statistical approach called 'recursively scaled zoomed frequencies (RSZF)' for minimising the deviations caused by limited sensitivity and resolution in measured natural frequencies. In this approach, the measured frequencies in the spectrum are modified by the mean of the natural frequencies determined at different frequency scales. Zoomed frequencies are obtained by cubic spline interpolation, which increases the resolution of the frequency spectrum. RSZF is especially valuable when cracks must be detected from very small data sets. All experimental results show that single and double cracks are successfully detected by the presented methods. [Copyright Elsevier]
- Published
- 2012
- Full Text
- View/download PDF
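The spline "zoom" step in the abstract above can be sketched in a few lines: a coarsely sampled spectrum is interpolated with a cubic spline so a natural-frequency peak can be located on a much finer grid. Only the interpolation idea is shown, not the full recursive RSZF procedure; the peak frequency and grids are illustrative.

```python
# Sketch: refining a spectral peak estimate by cubic spline "zooming".
import numpy as np
from scipy.interpolate import CubicSpline

f_true = 10.3                                     # illustrative natural frequency (Hz)
freqs = np.arange(0.0, 20.0, 1.0)                 # coarse 1 Hz spectral grid
spectrum = 1.0 / (1.0 + (freqs - f_true) ** 2)    # synthetic resonance peak

coarse_peak = freqs[np.argmax(spectrum)]          # best estimate on the coarse grid

zoom = CubicSpline(freqs, spectrum)               # spline-interpolated spectrum
fine = np.arange(0.0, 19.0, 0.01)                 # 100x finer grid
fine_peak = fine[np.argmax(zoom(fine))]

print(coarse_peak)                                # 10.0
print(abs(fine_peak - f_true) < abs(coarse_peak - f_true))
```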
4. Online selection of intervals and t-intervals.
- Author
- Bachmann, Unnar Th., Halldórsson, Magnús M., and Shachnai, Hadas
- Subjects
- COMPUTER science; CONFIDENCE intervals; INTERVAL analysis; ALGORITHMS; INFORMATION theory; STATISTICS
- Abstract
A t-interval is a union of at most t half-open intervals on the real line. An interval is the special case where t = 1. In this paper we study the problems of online selection of intervals and t-intervals. We derive lower bounds and (almost) matching upper bounds on the competitive ratios of randomized algorithms for selecting intervals, 2-intervals and t-intervals, for any t. While offline t-interval selection has been studied before, the online version is considered here for the first time. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
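For context, the offline optimum that competitive ratios are measured against is computed by the classic earliest-finishing-time greedy, which selects a maximum set of pairwise-disjoint half-open intervals. This baseline is textbook material, not the paper's online algorithms; the example intervals are made up.

```python
# Offline baseline: maximum disjoint selection of half-open intervals
# via the earliest-finishing-time greedy.
def max_disjoint_intervals(intervals):
    """intervals: list of half-open (start, end) pairs."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:          # half-open: touching intervals are disjoint
            chosen.append((start, end))
            last_end = end
    return chosen

jobs = [(0, 3), (2, 5), (3, 7), (6, 8), (1, 9)]
print(max_disjoint_intervals(jobs))    # [(0, 3), (3, 7)]
```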
5. Bridge the gap between statistical and hand-crafted grammars
- Author
- Basirat, Ali and Faili, Heshaam
- Subjects
- GRAMMAR; INFORMATION retrieval; PROGRAMMING languages; TRANSLATIONS; NATURAL language processing; APPLICATION software; STATISTICS; ALGORITHMS
- Abstract
LTAG is a rich formalism for performing NLP tasks such as semantic interpretation, parsing, machine translation and information retrieval. Depending on the specific NLP task, different kinds of LTAGs may be developed for a language. Each of these LTAGs is enriched with specific features, such as semantic representation or statistical information, that make it suitable for that task. The distribution of these capabilities among separate LTAGs makes it difficult to benefit from all of them in NLP applications. This paper discusses a statistical model that bridges two kinds of LTAGs for a natural language in order to benefit from the capabilities of both. To do so, an HMM was trained that maps an elementary tree sequence of a source LTAG onto an elementary tree sequence of a target LTAG. Training was performed using the standard HMM training algorithm, Baum–Welch. To lead the training algorithm to a better solution, the initial state of the HMM was also trained by a novel EM-based semi-supervised bootstrapping algorithm. The model was tested on two English LTAGs, XTAG (XTAG-Group, 2001) and MICA's grammar (Bangalore et al., 2009), as the target and source LTAGs, respectively. The empirical results confirm that the model provides a satisfactory way of linking these LTAGs so that they share their capabilities. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
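The linking model above is an HMM trained with Baum–Welch. The forward recursion at the heart of that training loop can be sketched on a toy two-state HMM; all parameters below are illustrative stand-ins, not grammar probabilities from the paper.

```python
# Sketch of the forward algorithm, the core recursion inside Baum-Welch.
import numpy as np

A = np.array([[0.7, 0.3],          # state-transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],          # emission probabilities (2 symbols)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])          # initial state distribution

def forward(obs):
    """Return P(obs | model) via the forward recursion."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

p = forward([0, 1, 0])
print(0.0 < p < 1.0)               # a valid sequence probability
```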
6. Application of cepstrum pre-whitening for the diagnosis of bearing faults under variable speed conditions
- Author
- Borghesani, P., Pennacchi, P., Randall, R.B., Sawalhi, N., and Ricci, R.
- Subjects
- CEPSTRUM analysis (Mechanics); BEARINGS (Machinery); FRACTURE mechanics; ROLLING (Metalwork); SIGNAL processing; STATISTICS; ALGORITHMS
- Abstract
Diagnostics of rolling element bearings involves a combination of different signal enhancement and analysis techniques. The most common procedure comprises a first step of order tracking and synchronous averaging, which removes from the signal the undesired components synchronous with the shaft harmonics, and a final step of envelope analysis to obtain the squared envelope spectrum. This indicator has been studied thoroughly, and statistically based criteria for identifying damaged bearings have been obtained. The statistical thresholds are valid only if all deterministic components in the signal have been removed. Unfortunately, in various industrial applications characterized by heterogeneous vibration sources, the first step of synchronous averaging is not sufficient to eliminate the deterministic components completely, and an additional pre-whitening step is needed before the envelope analysis. Different techniques have been proposed for this purpose in the past; the most widespread are linear prediction filters and spectral kurtosis. Recently, a new pre-whitening technique based on cepstral analysis has been proposed: the so-called cepstrum pre-whitening. Owing to its low computational requirements and its simplicity, it is a good candidate for the intermediate pre-whitening step in an automatic damage-recognition algorithm. In this paper, the effectiveness of the new technique is tested on data measured on a full-scale industrial bearing test-rig able to reproduce harsh operating conditions. A benchmark comparison with the traditional pre-whitening techniques is made as a final verification of the potential of cepstrum pre-whitening. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
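The cepstrum pre-whitening operation itself is remarkably compact: dividing the spectrum by its own magnitude (equivalent to zeroing the real cepstrum at all non-zero quefrencies) flattens every deterministic tone. The sketch below shows that one step on a synthetic signal; the test signal and seed are illustrative, not bearing data.

```python
# Cepstrum pre-whitening: x_cpw = Re( IFFT( FFT(x) / |FFT(x)| ) ).
import numpy as np

def cepstrum_prewhiten(x, eps=1e-12):
    X = np.fft.fft(x)
    return np.real(np.fft.ifft(X / (np.abs(X) + eps)))

# A strong sinusoid buried in white noise: after pre-whitening the
# magnitude spectrum is essentially flat, so the tone no longer masks
# impulsive (bearing-fault) content.
rng = np.random.default_rng(1)
n = 1024
t = np.arange(n)
x = 10.0 * np.sin(2 * np.pi * 50 * t / n) + rng.normal(0, 1, n)

mag = np.abs(np.fft.fft(cepstrum_prewhiten(x)))
print(mag.std() / mag.mean() < 0.01)   # near-unit magnitude in every bin
```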
7. Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
- Author
- Dines, John, Liang, Hui, Saheer, Lakshmi, Gibson, Matthew, Byrne, William, Oura, Keiichiro, Tokuda, Keiichi, Yamagishi, Junichi, King, Simon, Wester, Mirjam, Hirsimäki, Teemu, Karhila, Reima, and Kurimo, Mikko
- Subjects
- TRANSLATIONS; SPEECH perception; ALGORITHMS; STATISTICS; ORATORS; HIDDEN Markov models; SIMILARITY (Language learning); AUDITORY perception
- Abstract
In this paper we present results of unsupervised cross-lingual speaker adaptation applied to text-to-speech synthesis. The application of our research is the personalisation of speech-to-speech translation, in which we employ an HMM statistical framework for both speech recognition and synthesis. This framework provides a logical mechanism for adapting synthesised speech output to the voice of the user by way of speech recognition. In this work we present results of several different unsupervised and cross-lingual adaptation approaches, as well as an end-to-end speaker-adaptive speech-to-speech translation system. Our experiments show that we can successfully apply speaker adaptation in both unsupervised and cross-lingual scenarios, and our proposed algorithms seem to generalise well across several language pairs. We also discuss important future directions, including the need for better evaluation metrics. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
8. Multivariate statistical analysis strategy for multiple misfire detection in internal combustion engines
- Author
- Hu, Chongqing, Li, Aihua, and Zhao, Xingyang
- Subjects
- INTERNAL combustion engines; MULTIVARIATE analysis; STATISTICS; ALGORITHMS; TORSIONAL vibration; SPEED; ENGINE cylinders; SIGNAL processing
- Abstract
This paper proposes a multivariate statistical analysis approach to processing the instantaneous engine speed signal for the purpose of locating multiple misfire events in internal combustion engines. The state of each cylinder is described by a characteristic vector extracted from the instantaneous engine speed signal following a three-step procedure. These characteristic vectors are treated as the values of process parameters of an engine cycle. Determining the occurrence of misfire events and identifying the misfiring cylinders can therefore be accomplished by a principal component analysis (PCA) based pattern recognition methodology. The proposed algorithm can be implemented easily in practice because the threshold can be defined adaptively without information about the operating conditions. In addition, the effect of torsional vibration on the engine speed waveform is interpreted as the presence of a super powerful cylinder, which is also isolated by the algorithm. The misfiring cylinder and the super powerful cylinder are often adjacent in the firing sequence, so missed detections and false alarms can be avoided effectively by checking the relationship between the cylinders. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
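The PCA pattern-recognition step can be sketched generically: characteristic vectors from healthy cycles define a principal subspace, and a cycle whose reconstruction error exceeds an adaptively chosen threshold is flagged. The data, subspace dimension, and 3-sigma rule below are synthetic illustrations, not the paper's feature extraction or calibration.

```python
# Sketch: PCA reconstruction-error anomaly detection on per-cycle
# characteristic vectors (6 "cylinders", synthetic data).
import numpy as np

rng = np.random.default_rng(2)
train = rng.normal(0, 0.1, size=(200, 6))        # healthy engine cycles
mean = train.mean(axis=0)
C = train - mean
_, _, Vt = np.linalg.svd(C, full_matrices=False)
P = Vt[:2]                                        # top-2 principal axes

def residual(v):
    """Distance of a cycle vector from the healthy principal subspace."""
    c = v - mean
    return np.linalg.norm(c - (c @ P.T) @ P)

res_train = np.linalg.norm(C - (C @ P.T) @ P, axis=1)
thr = res_train.mean() + 3 * res_train.std()      # adaptive threshold

healthy = rng.normal(0, 0.1, size=6)
misfire = healthy.copy()
misfire[3] += 2.0                                 # cylinder 4 misfires

print(residual(misfire) > thr)                    # misfire cycle flagged
```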
9. Liberating the dimension
- Author
- Kuo, Frances Y., Sloan, Ian H., Wasilkowski, Grzegorz W., and Woźniakowski, Henryk
- Subjects
- MULTIVARIATE analysis; MATHEMATICAL variables; MATHEMATICAL constants; CYBERNETICS; STATISTICS; ALGORITHMS
- Abstract
Many recent papers have considered the problem of multivariate integration and studied the tractability of the problem in the worst-case setting as the dimensionality d increases. The typical question is: can we find an algorithm for which the error is bounded polynomially in d, or even independently of d? And the general answer is: yes, if we have a suitably weighted function space. Since there are important problems with infinitely many variables, here we take one step further: we consider the integration problem with infinitely many variables, thus liberating the dimension, and we seek algorithms with small error and minimal cost. In particular, we assume that the cost of evaluating a function depends on the number of active variables. The choice of the cost function plays a crucial role in the infinite-dimensional setting. We present a number of lower and upper estimates of the minimal cost for product and finite-order weights. In some cases, the bounds are sharp. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
10. Using mobile beacons to locate sensors in obstructed environments
- Author
- Ding, Yong, Wang, Chen, and Xiao, Li
- Subjects
- DETECTORS; ULTRASONICS; DISTANCE measurement equipment; STATISTICS; ALGORITHMS; LOCALIZATION theory
- Abstract
Locating sensors in an indoor environment is a challenging problem due to insufficient distance measurements caused by the short range of ultrasound and incorrect distance measurements caused by its multipath effect. In this paper, we propose a virtual ruler approach in which a vehicle equipped with multiple ultrasound beacons travels around the area to measure distances between pairwise sensors. The virtual ruler not only obtains sufficient distances between pairwise sensors but also eliminates incorrect distances during the distance-measurement phase of sensor localization. We propose to measure the distance between pairwise sensors from multiple perspectives using the virtual ruler and to filter incorrect values through a statistical approach. By assigning measured distances confidence values, the localization algorithm can localize each sensor based on high-confidence distances, which greatly improves localization accuracy. Our performance evaluation shows that the proposed approach achieves better localization results than previous approaches in an indoor environment. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
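The statistical filtering idea, repeated measurements of the same pairwise distance voting out multipath-corrupted values, can be sketched with a simple median-based rejection rule. The tolerance, the confidence definition, and the readings below are all illustrative assumptions, not the paper's calibration.

```python
# Sketch: reject distance measurements inconsistent with the majority,
# and score confidence by the fraction of measurements kept.
from statistics import median

def filter_distance(measurements, tol=0.5):
    """Return (distance_estimate, confidence) from repeated measurements."""
    m = median(measurements)
    kept = [d for d in measurements if abs(d - m) <= tol]
    return sum(kept) / len(kept), len(kept) / len(measurements)

# True distance 4.0 m; two multipath readings are badly inflated
# (multipath reflections only lengthen a measured distance).
readings = [4.1, 3.9, 4.0, 6.7, 4.2, 7.3]
dist, conf = filter_distance(readings)
print(round(dist, 2), round(conf, 2))   # 4.05 0.67
```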
11. Dealing with label switching in mixture models under genuine multimodality
- Author
- Grün, Bettina and Leisch, Friedrich
- Subjects
- NONPARAMETRIC statistics; STATISTICS; ALGORITHMS; MARKOV processes
- Abstract
The fitting of finite mixture models is an ill-defined estimation problem, as completely different parameterizations can induce similar mixture distributions. This leads to multiple modes in the likelihood, which is a problem for frequentist maximum likelihood estimation and complicates the statistical inference of Markov chain Monte Carlo draws in Bayesian estimation. For the analysis of the posterior density of these draws, a suitable separation into different modes is desirable. In addition, a unique labelling of the component-specific estimates is necessary to solve the label switching problem. This paper presents and compares two approaches to achieve these goals: relabelling under multimodality and constrained clustering. The algorithmic details are discussed, and their application is demonstrated on artificial and real-world data. [Copyright Elsevier]
- Published
- 2009
- Full Text
- View/download PDF
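Label switching itself is easy to demonstrate: each MCMC draw of component parameters can be permuted so it is closest to a reference draw. The brute-force distance-based relabelling below is the generic idea only; the paper's two approaches (relabelling under multimodality, constrained clustering) are more refined, and the numbers are made up.

```python
# Sketch: undo label switching by permuting a draw to best match a
# reference draw (minimal squared distance over all permutations).
from itertools import permutations

def relabel(draw, reference):
    """Return the permutation of `draw` closest to `reference` (L2)."""
    k = len(draw)
    best = min(permutations(range(k)),
               key=lambda p: sum((draw[p[i]] - reference[i]) ** 2
                                 for i in range(k)))
    return [draw[i] for i in best]

reference = [0.0, 5.0, 10.0]           # component means in a reference draw
switched = [10.2, 0.1, 4.9]            # same draw, labels permuted by the sampler
print(relabel(switched, reference))    # [0.1, 4.9, 10.2]
```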
12. Ideal denoising for signals in sub-Gaussian noise
- Author
- Ferrando, Sebastian E. and Pyke, Randall
- Subjects
- ALGORITHMS; NOISE; STATISTICS; ALGEBRA
- Abstract
Donoho and Johnstone introduced an algorithm, and a supporting inequality, that allow the selection of an orthonormal basis for optimal denoising. The present paper concentrates on extending and improving this result; the main contribution is to incorporate a wider class of noise vectors. The class of strict sub-Gaussian random vectors allows us to obtain large-deviation inequalities uniformly over all bases in a given library. The results are obtained while maintaining the algorithmic properties of the original results. [Copyright Elsevier]
- Published
- 2008
- Full Text
- View/download PDF
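The Donoho–Johnstone scheme the paper builds on is easy to sketch: expand the noisy signal in an orthonormal basis, kill every coefficient below the universal threshold sigma*sqrt(2 log n), and invert. The example below uses the orthonormal DCT and classical Gaussian noise; the paper's point is that sub-Gaussian noise admits similar uniform guarantees, which this sketch does not reproduce.

```python
# Sketch: hard thresholding in an orthonormal basis (DCT) at the
# universal threshold sigma * sqrt(2 log n).
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(3)
n = 1024
sigma = 0.5
t = np.linspace(0, 1, n)
clean = np.cos(2 * np.pi * 4 * t)                  # smooth target signal
noisy = clean + rng.normal(0, sigma, n)

coef = dct(noisy, norm="ortho")                    # orthonormal transform
thr = sigma * np.sqrt(2 * np.log(n))
coef[np.abs(coef) < thr] = 0.0                     # hard thresholding
denoised = idct(coef, norm="ortho")

err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
print(err_denoised < err_noisy)                    # thresholding helped
```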
13. Spectral independent component analysis
- Author
- Singer, A.
- Subjects
- CLUSTER analysis (Statistics); SPECTRUM analysis; STATISTICS; ALGORITHMS
- Abstract
Independent component analysis (ICA), the un-mixing of a signal into a linear combination of independent components, is one of the main problems in statistics, with a wide range of applications. The un-mixing is usually performed by finding a rotation that optimizes a functional closely related to the differential entropy. In this paper we solve the linear ICA problem by analyzing the spectrum and eigenspaces of the graph Laplacian of the data. The spectral ICA algorithm is based on two observations. First, independence of random variables is equivalent to the eigenfunctions of the limiting continuous operator of the graph Laplacian having a separation-of-variables form. Second, the first non-trivial Neumann eigenfunction of any Sturm–Liouville operator is monotonic. Both the degenerate and non-degenerate spectra, corresponding to identical and non-identical sources, are studied. We provide successful numerical experiments of the algorithm. [Copyright Elsevier]
- Published
- 2006
- Full Text
- View/download PDF
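The object the algorithm analyzes, the graph Laplacian of the sample cloud, can be built in a few lines with a Gaussian kernel. Only the construction and its basic spectral properties are shown; the paper's separation-of-variables test on the eigenvectors is not reproduced, and the kernel width and data are illustrative.

```python
# Sketch: unnormalized graph Laplacian of a data cloud with a Gaussian
# kernel; its smallest eigenvalue is 0 (constant eigenvector) and it is
# positive semi-definite.
import numpy as np

def graph_laplacian(X, eps=0.5):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / eps)                 # Gaussian kernel affinities
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))
    return D - W                          # L = D - W

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(100, 2))     # sample cloud of the mixed signal
L = graph_laplacian(X)

eigvals = np.linalg.eigvalsh(L)           # ascending eigenvalues
print(abs(eigvals[0]) < 1e-8)             # constant vector is in the kernel
print(bool(np.all(eigvals[1:] > -1e-8)))  # positive semi-definite
```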
14. Bayesian model inversion using stochastic spectral embedding.
- Author
- Wagner, Paul-Remo, Marelli, Stefano, and Sudret, Bruno
- Subjects
- POLYNOMIAL chaos; PROBLEM solving; INVERSE problems; STATISTICS; ALGORITHMS; INVERSIONS (Geometry)
- Abstract
• We present the SSLE approach for Bayesian model inversion based on SSE and SLE.
• Posterior quantities are obtained analytically from the SSLE coefficients.
• The original SSE algorithm is enhanced with an active learning enrichment scheme.
• This method generalizes and drastically improves the efficiency of the SLE approach.
• We showcase the method on three problems of different complexity and dimensionality.
In this paper we propose a new sampling-free approach to solving Bayesian model inversion problems that extends the previously proposed spectral likelihood expansions (SLE) method. Our approach, called stochastic spectral likelihood embedding (SSLE), uses the recently presented stochastic spectral embedding (SSE) method for local spectral expansion refinement to approximate the likelihood function at the core of Bayesian inversion problems. We show that, as with SLE, this approach yields analytical expressions for key statistics of the Bayesian posterior distribution, such as the evidence, posterior moments and posterior marginals, by direct post-processing of the expansion coefficients. Because SSLE and SSE rely on direct approximation of the likelihood function, they are largely independent of the computational/mathematical complexity of the forward model. We further enhance the efficiency of SSLE by introducing a likelihood-specific adaptive sample enrichment scheme. To showcase the performance of the proposed SSLE, we solve three problems that exhibit different kinds of complexity in the likelihood function: multimodality, high posterior concentration and high nominal dimensionality. We demonstrate how SSLE significantly improves on SLE, and present it as a promising alternative to existing inversion frameworks. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
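The quantities SSLE extracts analytically, the evidence and posterior moments as prior-weighted integrals of the likelihood, can be illustrated on a 1D conjugate-Gaussian toy problem. The sketch replaces the spectral expansion with plain Gauss–Hermite quadrature under a standard normal prior; the data mean and noise variance are made-up numbers chosen so the exact posterior mean (2/3) is known.

```python
# Sketch: evidence Z = E_prior[L] and posterior mean E[x|data] computed
# as prior-weighted integrals of the likelihood (Gauss-Hermite quadrature
# standing in for the spectral expansion).
import numpy as np

nodes, weights = np.polynomial.hermite_e.hermegauss(40)
weights = weights / weights.sum()          # expectation weights for prior N(0,1)

def likelihood(x, data_mean=1.0, noise_var=0.5):
    return np.exp(-(data_mean - x) ** 2 / (2 * noise_var)) \
        / np.sqrt(2 * np.pi * noise_var)

L = likelihood(nodes)
evidence = np.sum(weights * L)                      # Z = E_prior[L]
post_mean = np.sum(weights * nodes * L) / evidence  # first posterior moment

# Conjugate ground truth: posterior mean = 1/(1+0.5) * 1 = 2/3
print(round(post_mean, 3))   # 0.667
```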
15. The Application of Bayes's Theorem When the True Data State is Uncertain.
- Author
- Gettys, Charles F. and Willke, T. A.
- Subjects
- ALGORITHMS; ALGEBRA; FOUNDATIONS of arithmetic; PROBABILITY theory; MATHEMATICAL combinations; MATHEMATICS; BAYES' theorem; STATISTICAL decision making; STATISTICS
- Abstract
This paper discusses the application of Bayes's theorem to cases where the true state of the world is not known with certainty. An algorithm is proposed that relaxes the requirement of Bayes's theorem that the true data state be known with certainty by postulating a true but unobservable elementary event, ω, which gives rise to posterior probabilities that reflect the uncertainty of the data. A derivation is presented for the calculation of Bayesian posterior probabilities that uses these probabilities as its input, rather than the true event, ω, which is assumed to be unavailable. Suggestions are made for applying this modification of Bayes's theorem to cascaded inference processes. [ABSTRACT FROM AUTHOR]
- Published
- 1969
- Full Text
- View/download PDF
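One standard way to realize the idea of conditioning on uncertain data is to mix the ordinary Bayesian posteriors over the possible data states, weighted by their probabilities (what is now usually called Jeffrey's rule). The sketch below shows that mixture; the paper's own derivation via the elementary event ω differs in detail, and the numbers are illustrative.

```python
# Sketch: posterior over hypotheses when the observed datum D is itself
# uncertain, as a probability-weighted mixture of ordinary posteriors.
def posterior(prior, likelihood, data_probs):
    """prior[h], likelihood[d][h], data_probs[d] -> P(h | uncertain data)."""
    result = []
    for h in range(len(prior)):
        total = 0.0
        for d, pd in enumerate(data_probs):
            evidence = sum(prior[j] * likelihood[d][j]
                           for j in range(len(prior)))
            total += pd * prior[h] * likelihood[d][h] / evidence
        result.append(total)
    return result

prior = [0.5, 0.5]
likelihood = [[0.8, 0.3],      # P(D=0 | h)
              [0.2, 0.7]]      # P(D=1 | h)
post = posterior(prior, likelihood, data_probs=[0.9, 0.1])  # D=0 is 90% certain
print(round(post[0], 3))   # 0.677
```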
Discovery Service for Jio Institute Digital Library