28 results for "Bayesian statistical decision theory -- Usage"
Search Results
2. Marginalized neural network mixtures for large-scale regression
- Author: Lazaro-Gredilla, M. and Figueiras-Vidal, A.R.
- Subjects: Neural networks -- Design and construction; Bayesian statistical decision theory -- Usage; Gaussian processes -- Analysis; Neural network; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
3. Inference from aging information
- Author: de Oliveira, E.A. and Caticha, N.
- Subjects: Bayesian statistical decision theory -- Usage; Machine learning -- Analysis; Combinatorial probabilities -- Usage; Geometric probabilities -- Usage; Probabilities -- Usage; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
4. The infinite hidden Markov random field model
- Author: Chatzis, S.P. and Tsechpenakis, G.
- Subjects: Bayesian statistical decision theory -- Usage; Image processing -- Analysis; Markov processes -- Usage; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
5. Simplifying mixture models through function approximation
- Author: Kai Zhang and Kwok, J.T.
- Subjects: Asymptotes -- Usage; Bayesian statistical decision theory -- Usage; Clustering (Computers) -- Analysis; Kernel functions -- Usage; Machine learning -- Analysis; Server clustering; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
6. Recursive Bayesian recurrent neural networks for time-series modeling
- Author: Mirikitani, D.T. and Nikolaev, N.
- Subjects: Bayesian statistical decision theory -- Usage; Kalman filtering -- Usage; Neural networks -- Analysis; Recursive functions -- Usage; Neural network; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
7. A multiobjective simultaneous learning framework for clustering and classification
- Author: Weiling Cai, Songcan Chen, and Daoqiang Zhang
- Subjects: Bayesian statistical decision theory -- Usage; Clustering (Computers) -- Analysis; Object recognition (Computers) -- Analysis; Pattern recognition -- Analysis; Server clustering; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
8. Relevance units latent variable model and nonlinear dimensionality reduction
- Author: Junbin Gao, Jun Zhang, and Tien, D.
- Subjects: Bayesian statistical decision theory -- Usage; Gaussian processes -- Analysis; Kernels -- Models; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
9. A Dirichlet process mixture of generalized Dirichlet distributions for proportional data modeling
- Author: Bouguila, N. and Ziou, D.
- Subjects: Bayesian statistical decision theory -- Usage; Clustering (Computers) -- Usage; Database administration -- Analysis; Series, Dirichlet -- Usage; Server clustering; Business; Computers; Electronics; Electronics and electrical industries
- Published: 2010
10. The Bayesian ARTMAP
- Author: Vigdor, Boaz and Lerner, Boaz
- Subjects: Algorithms -- Methods; Bayesian statistical decision theory -- Usage; Neural networks -- Usage; Neural networks -- Methods; Fuzzy algorithms -- Methods; Fuzzy logic -- Methods; Fuzzy systems -- Methods; Algorithm; Neural network; Fuzzy logic; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: In this paper, we modify the fuzzy ARTMAP (FA) neural network (NN) using the Bayesian framework in order to improve its classification accuracy while simultaneously reducing its category proliferation. The proposed algorithm, called Bayesian ARTMAP (BA), preserves the FA advantages and also enhances its performance by the following: 1) representing a category using a multidimensional Gaussian distribution, 2) allowing a category to grow or shrink, 3) limiting a category hypervolume, 4) using Bayes' decision theory for learning and inference, and 5) employing the probabilistic association between every category and a class in order to predict the class. In addition, the BA estimates the class posterior probability and thereby enables the introduction of loss and classification according to the minimum expected loss. Based on these characteristics and using synthetic and 20 real-world databases, we show that the BA outperforms the FA, either trained for one epoch or until completion, with respect to classification accuracy, sensitivity to statistical overlapping, learning curves, expected loss, and category proliferation. Index Terms--Bayes' decision theory, category proliferation, classification, fuzzy ARTMAP (FA), neural network (NN).
- Published: 2007
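As a worked illustration of the minimum-expected-loss rule this abstract describes, here is a minimal numpy sketch; the posteriors and loss matrix are invented for illustration and this is not the authors' code:

```python
import numpy as np

# Hypothetical class-posterior estimates P(class | x) for 3 samples, 3 classes,
# e.g., as a Bayesian ARTMAP-style model would produce them.
posteriors = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.34, 0.33, 0.33],
])

# loss[i, j] = cost of predicting class j when the true class is i.
# Zero on the diagonal; off-diagonal costs are illustrative assumptions.
loss = np.array([
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
])

# Expected loss of predicting class j: sum_i P(i | x) * loss[i, j].
expected_loss = posteriors @ loss

# Minimum-expected-loss decision (reduces to the MAP rule under 0-1 loss).
decisions = expected_loss.argmin(axis=1)
print(decisions)
```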
11. Markov and semi-Markov switching of source appearances for nonstationary independent component analysis
- Author: Hirayama, Jun-ichiro, Maeda, Shin-ichi, and Ishii, Shin
- Subjects: Markov processes -- Evaluation; Bayesian statistical decision theory -- Usage; Discriminant analysis -- Methods; Factor analysis -- Methods; Neural networks -- Research; Neural network; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: Independent component analysis (ICA) is currently the most popularly used approach to blind source separation (BSS), the problem of recovering unknown source signals when their mixtures are observed but the actual mixing process is unknown. Many ICA algorithms assume that a fixed set of source signals consistently exists in mixtures throughout the time-series to be examined. However, real-world signals often have such difficult nonstationarity that each source signal abruptly appears or disappears, thus the set of active sources dynamically changes with time. In this paper, we propose switching ICA (SwICA), which focuses on such situations. The proposed approach is based on the noisy ICA formulated as a generative model. We employ a special type of hidden Markov model (HMM) to represent such prior knowledge that the source may abruptly appear or disappear with time. The special HMM setting then provides an effect of variable selection in a dynamic way. We use the variational Bayes (VB) method to derive an effective approximation of Bayesian inference for this model. In simulation experiments using artificial and realistic source signals, the proposed method exhibited performance superior to existing methods, especially in the presence of noise. The compared methods include the natural-gradient ICA with a nonholonomic constraint, and the existing ICA method incorporating an HMM source model, which aims to deal with general nonstationarities that may exist in source signals. In addition, the proposed method could successfully recover the source signals even when the total number of true sources was overestimated or was larger than that of mixtures. We also propose a modification of the basic Markov model into a semi-Markov model, and show that the semi-Markov one is more effective for robust estimation of the source appearance. Index Terms--Blind source separation (BSS), hidden Markov model (HMM), hidden semi-Markov model (HSMM), independent component analysis (ICA), variational Bayes (VB) method.
- Published: 2007
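A minimal generative-model sketch of the switching idea described above: per-source Markov chains gate which sources are active before linear mixing. All parameters are illustrative assumptions, not the paper's settings, and the VB inference itself is not shown:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_src, n_mix = 1000, 3, 3

# Per-source two-state Markov chain: state 1 = active, 0 = silent.
p_stay = 0.98                        # assumed self-transition probability
z = np.zeros((T, n_src), dtype=int)
z[0] = 1
for t in range(1, T):
    stay = rng.random(n_src) < p_stay
    z[t] = np.where(stay, z[t - 1], 1 - z[t - 1])

s = rng.laplace(size=(T, n_src))     # super-Gaussian source signals
A = rng.normal(size=(n_mix, n_src))  # unknown mixing matrix
noise = 0.1 * rng.normal(size=(T, n_mix))

# Observed mixtures: only the currently active sources contribute.
x = (z * s) @ A.T + noise
```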
12. Unsupervised learning of Gaussian mixtures based on variational component splitting
- Author: Constantinopoulos, Constantinos and Likas, Aristidis
- Subjects: Algorithms -- Usage; Clustering (Computers) -- Analysis; Bayesian statistical decision theory -- Usage; Algorithm; Server clustering; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: In this paper, we present an incremental method for model selection and learning of Gaussian mixtures based on the recently proposed variational Bayes approach. The method adds components to the mixture using a Bayesian splitting test procedure: a component is split into two components and then variational update equations are applied only to the parameters of the two components. As a result, either both components are retained in the model or one of them is found to be redundant and is eliminated from the model. In our approach, the model selection problem is treated locally, in a region of the data space, so we can set more informative priors based on the local data distribution. A modified Bayesian mixture model is presented to implement this approach, along with a learning algorithm that iteratively applies a splitting test on each mixture component. Experimental results and comparisons with two other techniques testify to the adequacy of the proposed approach. Index Terms--Clustering, mixture models, model selection, variational Bayes methods.
- Published: 2007
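A sketch of the split initialization the abstract describes: one Gaussian component is split into two children along its principal axis before the local variational updates (not shown here) decide whether both survive. The half-weight and shrunk-covariance choices below are common heuristics assumed for illustration, not the paper's exact equations:

```python
import numpy as np

def split_component(weight, mean, cov):
    """Split one Gaussian component into two children along its principal
    axis -- an initialization for a Bayesian splitting test, after which
    local (variational) updates decide whether both children are kept."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    direction = eigvecs[:, -1] * np.sqrt(eigvals[-1])  # principal axis
    mean_a = mean + 0.5 * direction
    mean_b = mean - 0.5 * direction
    # Each child inherits half the parent's weight; the covariance is
    # shrunk along the split direction (a heuristic assumed here).
    cov_child = cov - 0.25 * np.outer(direction, direction)
    return (weight / 2, mean_a, cov_child), (weight / 2, mean_b, cov_child)
```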
13. Distribution modeling of nonlinear inverse controllers under a Bayesian framework
- Author: Herzallah, Randa and Lowe, David
- Subjects: Neural networks -- Models; Neural networks -- Research; Stochastic processes -- Usage; Bayesian statistical decision theory -- Usage; Neural network; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: The inverse controller is traditionally assumed to be a deterministic function. This paper presents a pedagogical methodology for estimating the stochastic model of the inverse controller. The proposed method is based on Bayes' theorem. Using Bayes' rule to obtain the stochastic model of the inverse controller allows the use of knowledge of uncertainty from both the inverse and the forward model in estimating the optimal control signal. The paper presents the methodology for general nonlinear systems and is demonstrated on nonlinear single-input-single-output (SISO) and multiple-input-multiple-output (MIMO) examples. Index Terms--Distribution modelling, inverse controller, neural networks, stochastic systems, uncertainty.
- Published: 2007
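For intuition, here is a sketch of Bayes' rule applied to a locally linearized scalar forward model: combining a Gaussian prior over the control with the forward-model likelihood yields a Gaussian posterior over the control signal. All numbers are assumptions, and the paper's neural-network models are replaced by a linear stand-in:

```python
import numpy as np

# Locally linearized forward model y = a*u + b with Gaussian noise.
a, b = 2.0, 0.5          # assumed local model parameters
sigma_y = 0.2            # forward-model (observation) uncertainty
u0, sigma_u = 0.0, 1.0   # Gaussian prior over the control signal
y_target = 1.8           # desired plant output

# Bayes' rule for Gaussians: posterior over u given the target output,
# weighing prior knowledge against the forward model's uncertainty.
post_prec = 1.0 / sigma_u**2 + a**2 / sigma_y**2
post_var = 1.0 / post_prec
post_mean = post_var * (u0 / sigma_u**2 + a * (y_target - b) / sigma_y**2)

print(f"control posterior ~ N({post_mean:.3f}, {np.sqrt(post_var):.3f}^2)")
```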
14. Posterior probability support vector machines for unbalanced data
- Author: Tao, Qing, Wu, Gao-Wei, Wang, Fei-Yue, and Wang, Jue
- Subjects: Algorithms -- Usage; Bayesian statistical decision theory -- Usage; Sequence controllers, Programmable; Algorithm; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: This paper proposes a complete framework of posterior probability support vector machines (PPSVMs) for weighted training samples using modified concepts of risks, linear separability, margin, and optimal hyperplane. Within this framework, a new optimization problem for unbalanced classification problems is formulated and a new concept of support vectors established. Furthermore, a soft PPSVM with an interpretable parameter ν is obtained, which is similar to the ν-SVM developed by Schölkopf et al., and an empirical method for determining the posterior probability is proposed as a new approach to determine ν. The main advantage of a PPSVM classifier lies in the fact that it is closer to the Bayes optimal without knowing the distributions. To validate the proposed method, two synthetic classification examples are used to illustrate the logical correctness of PPSVMs and their relationship to regular SVMs and Bayesian methods. Several other classification experiments are conducted to demonstrate that the performance of PPSVMs is better than regular SVMs in some cases. Compared with fuzzy support vector machines (FSVMs), the proposed PPSVM is a natural and analytical extension of regular SVMs based on statistical learning theory. Index Terms--Bayesian decision theory, classification, margin, maximal margin algorithms, ν-SVM, posterior probability, support vector machines (SVMs), unbalanced data.
- Published: 2005
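The sample-weighting idea can be sketched with an off-the-shelf SVM: scikit-learn's `SVC.fit` accepts per-sample weights, which here stand in for the paper's posterior-probability weights. The crude class-based weights below are an assumption for illustration, not the paper's empirical posterior estimate:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Unbalanced two-class toy data: 200 negatives, 20 positives.
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (20, 2))])
y = np.hstack([np.zeros(200), np.ones(20)])

# Per-sample weights standing in for estimated posterior probabilities
# (a crude heuristic: upweight the rare class).
weights = np.where(y == 1, 10.0, 1.0)

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y, sample_weight=weights)   # weighted training samples
```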
15. Extracting semantics from audiovisual content: the final frontier in multimedia retrieval
- Author: Naphade, Milind R. and Huang, Thomas S.
- Subjects: Electrical engineering -- Research; Neural networks -- Research; Semantics -- Usage; Pattern recognition -- Usage; Bayesian statistical decision theory -- Usage; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: Multimedia understanding is a fast emerging interdisciplinary research area. There is tremendous potential for effective use of multimedia content through intelligent analysis. Diverse application areas are increasingly relying on multimedia understanding systems. Advances in multimedia understanding are related directly to advances in signal processing, computer vision, pattern recognition, multimedia databases, and smart sensors. We review the state-of-the-art techniques in multimedia retrieval. In particular, we discuss how multimedia retrieval can be viewed as a pattern recognition problem. We discuss how reliance on powerful pattern recognition and machine learning techniques is increasing in the field of multimedia retrieval. We review state-of-the-art multimedia understanding systems with particular emphasis on a system for semantic video indexing centered around multijects and multinets. We discuss how semantic retrieval is centered around concepts and context and also discuss various mechanisms for modeling concepts and context. Index Terms--Bayesian networks, decision theory, factor graphs, machine learning, multijects, multimedia understanding, multinets, semantic video indexing, statistical pattern recognition, sum-product algorithm.
- Published: 2002
16. Minimizing risk using prediction uncertainty in neural network estimation fusion and its application to papermaking
- Author: Edwards, Peter J., Peacock, Andrew M., Renshaw, David, Hannah, John M., and Murray, Alan F.
- Subjects: Electrical engineering -- Research; Bayesian statistical decision theory -- Usage; Neural networks -- Research; Paper industry -- Research; Mathematical models -- Usage; Uncertainty -- Research; Competing risks -- Research; Forecasting -- Research; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: This paper presents Bayesian information fusion theory in the context of neural-network model combination. It shows how confidence measures can be combined with individual model estimates to minimize risk through the fusion process. The theory is illustrated through application to the real task of quality prediction in the papermaking industry. Prediction uncertainty estimates are calculated using approximate Bayesian learning. These are incorporated into model combination as confidence measures. Cost functions in the fusion center are used to control the influence of the confidence measures and improve the performance of the resultant committee. Index Terms--Bayesian learning, confidence measures, industrial application, information fusion, model combination, prediction uncertainty estimation, risk minimization.
- Published: 2002
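The core fusion step can be sketched as inverse-variance (precision) weighting, where each member's confidence measure down-weights uncertain estimates. The numbers are illustrative, and the paper's cost-function machinery in the fusion center is not reproduced:

```python
import numpy as np

# Point estimates from three committee members for one quality target,
# each with its own predictive variance (confidence measure).
estimates = np.array([4.1, 3.8, 4.6])
variances = np.array([0.04, 0.25, 0.09])

# Inverse-variance weighting: uncertain members contribute less.
precision = 1.0 / variances
fused = np.sum(precision * estimates) / np.sum(precision)
fused_var = 1.0 / np.sum(precision)

print(f"fused estimate: {fused:.3f} +/- {np.sqrt(fused_var):.3f}")
```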
17. Bayesian retrieval in associative memories with storage errors
- Author: Sommer, Friedrich T. and Dayan, Peter
- Subjects: Neural networks -- Research; Bayesian statistical decision theory -- Usage; Associative memory (Computers) -- Research; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: It is well known that for finite-sized networks, one-step retrieval in the autoassociative Willshaw net is a suboptimal way to extract the information stored in the synapses. Iterative retrieval strategies are much better, but have hitherto only had heuristic justification. We show how they emerge naturally from considerations of probabilistic inference under conditions of noisy and partial input and a corrupted weight matrix. We start from the conditional probability distribution over possible patterns for retrieval. This contains all possible information that is available to an observer of the network and the initial input. Since this distribution is over exponentially many patterns, we use it to develop two approximate, but tractable, iterative retrieval methods. One performs maximum likelihood inference to find the single most likely pattern, using the (negative log of the) conditional probability as a Lyapunov function for retrieval. In physics terms, if storage errors are present, then the modified iterative update equations contain an additional antiferromagnetic interaction term and site dependent threshold values. The second method makes a mean field assumption to optimize a tractable estimate of the full conditional probability distribution. This leads to iterative mean field equations which can be interpreted in terms of a network of neurons with sigmoidal responses but with the same interactions and thresholds as in the maximum likelihood update equations. In the absence of storage errors, both models become very similar to the Willshaw model, where standard retrieval is iterated using a particular form of linear threshold strategy. Index Terms--Bayesian reasoning, correlation associative memory, graded response neurons, iterative retrieval, maximum likelihood retrieval, mean field methods, threshold strategies, storage errors, Willshaw model.
- Published: 1998
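A minimal numpy sketch of the Willshaw setting the abstract starts from: clipped Hebbian storage and iterated linear-threshold retrieval from a partial cue. This is the simple heuristic baseline the paper justifies and improves upon, not its maximum likelihood or mean-field schemes:

```python
import numpy as np

def willshaw_store(patterns):
    """Binary (clipped) Hebbian storage: W[i, j] = 1 iff some stored
    pattern has both bits i and j active."""
    W = np.zeros((patterns.shape[1], patterns.shape[1]), dtype=int)
    for p in patterns:
        W |= np.outer(p, p)
    return W

def iterative_retrieve(W, cue, steps=10):
    """Iterated linear-threshold retrieval; steps=1 is one-step retrieval."""
    x = cue.copy()
    for _ in range(steps):
        theta = x.sum()                      # all-active-inputs threshold
        x_new = (W @ x >= theta).astype(int)
        if np.array_equal(x_new, x):
            break
        x = x_new
    return x

rng = np.random.default_rng(0)
patterns = (rng.random((20, 256)) < 0.05).astype(int)  # sparse patterns
W = willshaw_store(patterns)

cue = patterns[0].copy()
cue[np.flatnonzero(cue)[:5]] = 0       # partial input: drop 5 active bits
print((iterative_retrieve(W, cue) == patterns[0]).all())
```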
18. Averaging, maximum penalized likelihood and Bayesian estimation for improving Gaussian mixture probability density estimates
- Author: Ormoneit, Dirk and Tresp, Volker
- Subjects: Gaussian distribution -- Analysis; Neural networks -- Research; Bayesian statistical decision theory -- Usage; Estimation theory -- Research; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: We apply the idea of averaging ensembles of estimators to probability density estimation. In particular, we use Gaussian mixture models which are important components in many neural-network applications. We investigate the performance of averaging using three data sets. For comparison, we employ two traditional regularization approaches, i.e., a maximum penalized likelihood approach and a Bayesian approach. In the maximum penalized likelihood approach we use penalty functions derived from conjugate Bayesian priors such that an expectation maximization (EM) algorithm can be used for training. In all experiments, the maximum penalized likelihood approach and averaging improved performance considerably if compared to a maximum likelihood approach. In two of the experiments, the maximum penalized likelihood approach outperformed averaging. In one experiment averaging was clearly superior. Our conclusion is that maximum penalized likelihood gives good results if the penalty term in the cost function is appropriate for the particular problem. If this is not the case, averaging is superior since it shows greater robustness by not relying on any particular prior assumption. The Bayesian approach worked very well on a low-dimensional toy problem but failed to give good performance in higher dimensional problems. Index Terms--Bagging, Bayesian inference, data augmentation, EM algorithm, ensemble averaging, Gaussian mixture model, Gibbs sampling, penalized likelihood, probability density estimation.
- Published: 1998
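The averaging approach can be sketched directly with scikit-learn: fit several Gaussian mixtures on bootstrap resamples and average the resulting densities (bagging for density estimation). Component counts and data are illustrative assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (150, 1)), rng.normal(3, 0.5, (150, 1))])

# Average an ensemble of mixture density estimators trained on bootstrap
# resamples of the data.
n_members, densities = 10, []
x_grid = np.linspace(-6, 6, 200).reshape(-1, 1)
for seed in range(n_members):
    boot = X[rng.integers(0, len(X), len(X))]
    gmm = GaussianMixture(n_components=4, random_state=seed).fit(boot)
    densities.append(np.exp(gmm.score_samples(x_grid)))   # log p -> p

avg_density = np.mean(densities, axis=0)   # ensemble-averaged p(x)
```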
19. A review of Bayesian neural networks with an application to near infrared spectroscopy
- Author: Thodberg, Hans Henrik
- Subjects: Neural networks -- Models; Bayesian statistical decision theory -- Usage; Near infrared spectroscopy -- Analysis; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: MacKay's Bayesian framework for backpropagation is a practical and powerful means to improve the generalization ability of neural networks. It is based on a Gaussian approximation to the posterior weight distribution. The framework is extended, reviewed, and demonstrated in a pedagogical way. The notation is simplified using the ordinary weight decay parameter, and a detailed and explicit procedure for adjusting several weight decay parameters is given. Bayesian backprop is applied in the prediction of fat content in minced meat from near infrared spectra. It outperforms 'early stopping' as well as quadratic regression. The evidence of a committee of differently trained networks is computed, and the corresponding improved generalization is verified. The error bars on the predictions of the fat content are computed. There are three contributors: the random noise, the uncertainty in the weights, and the deviation among the committee members. The Bayesian framework is compared to Moody's GPE. Finally, MacKay and Neal's automatic relevance determination, in which the weight decay parameters depend on the input number, is applied to the data with improved results.
- Published: 1996
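A sketch of the three-contributor error bar described in the abstract, combining an intrinsic noise estimate, per-member weight uncertainty, and the deviation among committee members. All numbers are invented for illustration:

```python
import numpy as np

# Per-member predictions and (assumed) weight-uncertainty variances for one
# test input, e.g., from Laplace-approximate Bayesian backprop networks.
member_preds = np.array([22.1, 21.6, 22.9, 22.4])
weight_vars = np.array([0.30, 0.45, 0.25, 0.35])
noise_var = 0.8                     # estimated intrinsic noise level

committee_mean = member_preds.mean()
committee_spread = member_preds.var()   # deviation among committee members

# Total predictive variance: noise + average weight uncertainty + spread,
# the three contributors listed in the abstract.
total_var = noise_var + weight_vars.mean() + committee_spread
print(f"{committee_mean:.2f} +/- {np.sqrt(total_var):.2f}")
```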
20. A clustering technique for digital communications channel equalization using radial basis function networks
- Author: Sheng Chen, Mulgrew, Bernard, and Grant, Peter M.
- Subjects: Digital communications -- Research; Bayesian statistical decision theory -- Usage; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: The paper investigates the application of a radial basis function network to digital communications channel equalization. It is shown that the radial basis function network has an identical structure to the optimal Bayesian symbol-decision equalizer solution and, therefore, can be employed to implement the Bayesian equalizer. The training of a radial basis function network to realize the Bayesian equalization solution can be achieved efficiently using a simple and robust supervised clustering algorithm. During data transmission a decision-directed version of the clustering algorithm enables the radial basis function network to track a slowly time-varying environment. Moreover, the clustering scheme provides an automatic compensation for nonlinear channel and equipment distortion. This represents a radically new approach to the adaptive equalizer design. Computer simulations are included to illustrate the analytical results.
- Published: 1993
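A rough end-to-end sketch of the scheme, assuming a two-tap channel and BPSK symbols: supervised clustering estimates the channel-state centers per class, and an RBF decision with those centers mirrors the Bayesian equalizer structure. K-means here stands in for the paper's clustering algorithm, and all parameters are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Simulated BPSK symbols through a short FIR channel with noise.
symbols = rng.choice([-1.0, 1.0], size=2000)
h = np.array([1.0, 0.5])                       # assumed channel taps
received = np.convolve(symbols, h)[: len(symbols)] + 0.2 * rng.normal(size=2000)

# Equalizer input: pairs of consecutive received samples.
X = np.column_stack([received[1:], received[:-1]])
y = symbols[1:]                                # desired symbol decisions

# Supervised clustering: estimate the channel-state centers per class
# (4 states per class for a 2-tap channel and 2-sample equalizer input).
centers, labels = [], []
for cls in (-1.0, 1.0):
    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X[y == cls])
    centers.append(km.cluster_centers_)
    labels += [cls] * 4
centers, labels = np.vstack(centers), np.array(labels)

def rbf_equalize(r, rho=0.08):
    """RBF decision with the Bayesian-equalizer structure: sign of the
    class-weighted sum of Gaussian kernels at the state centers."""
    k = np.exp(-np.sum((centers - r) ** 2, axis=1) / (2 * rho))
    return np.sign(np.sum(labels * k))

pred = np.array([rbf_equalize(r) for r in X])
print("symbol error rate:", np.mean(pred != y))
```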
21. Pricing and Hedging Derivative Securities with Neural Networks: Bayesian Regularization, Early Stopping, and Bagging
- Author: Gencay, Ramazan and Qi, Min
- Subjects: Neural networks -- Research; Bayesian statistical decision theory -- Usage; Options (Finance) -- Research; Derivatives (Financial instruments) -- Research; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: We study the effectiveness of cross validation, Bayesian regularization, early stopping, and bagging to mitigate overfitting and improve generalization for pricing and hedging derivative securities with daily S&P 500 index call options from January 1988 to December 1993. Our results indicate that Bayesian regularization can generate significantly smaller pricing and delta-hedging errors than the baseline neural-network (NN) model and the Black-Scholes model for some years. While early stopping does not affect the pricing errors, it significantly reduces the hedging error in four of the six years we investigated. Although computationally most demanding, bagging seems to provide the most accurate pricing and delta-hedging. Furthermore, the standard deviation of the MSPE of bagging is far less than that of the baseline model in all six years, and the standard deviation of the AHE of bagging is far less than that of the baseline model in five out of six years. Since we find in general that these regularization methods work as effectively as the homogeneity hint, we suggest they be used at least in cases when no appropriate hints are available. Index Terms--Bagging, Bayesian regularization, early stopping, hedging error, neural networks (NNs), option price.
- Published: 2001
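For reference, the Black-Scholes model mentioned as a comparison point can be written as a short scipy sketch; the inputs below are assumed for illustration:

```python
import numpy as np
from scipy.stats import norm

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes European call price, the parametric baseline the
    neural-network pricers in the paper are compared against."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

# Illustrative (assumed) inputs: index level, strike, 3 months to expiry,
# 5% risk-free rate, 20% volatility. The hedge ratio (delta) is N(d1).
print(black_scholes_call(S=350.0, K=340.0, T=0.25, r=0.05, sigma=0.20))
```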
22. Multiresolution Forecasting for Futures Trading Using Wavelet Decompositions
- Author: Zhang, Bai-Ling, Coggins, Richard, Jabri, Marwan Anwar, Dersch, Dominik, and Flower, Barry
- Subjects: Neural networks -- Research; Bayesian statistical decision theory -- Usage; Futures -- Management; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: In this paper, we investigate the effectiveness of a financial time-series forecasting strategy which exploits the multiresolution property of the wavelet transform. A financial series is decomposed into an overcomplete, shift-invariant scale-related representation. In transform space, each individual wavelet series is modeled by a separate multilayer perceptron (MLP). To better utilize the detailed information in the lower scales of wavelet coefficients (high frequencies) and general (trend) information in the higher scales of wavelet coefficients (low frequencies), we applied the Bayesian method of automatic relevance determination (ARD) to choose short past windows (short-term history) for the inputs to the MLPs at lower scales and long past windows (long-term history) at higher scales. To form the overall forecast, the individual forecasts are then recombined by the linear reconstruction property of the inverse transform with the chosen autocorrelation shell representation, or by another perceptron which learns the weight of each scale in the prediction of the original time series. The forecast results are then passed to a money management system to generate trades. Compared with previous work combining wavelet techniques and neural networks for financial time series, our contributions include 1) proposing a three-stage prediction scheme; 2) applying a multiresolution prediction which is strictly based on the autocorrelation shell representation; 3) incorporating the Bayesian technique ARD with MLP training for the selection of relevant inputs; and 4) using a realistic money management system and trading model to evaluate the forecasting performance. Using an accurate trading model, our system shows promising profitability performance. Results comparing the performance of the proposed architecture with an MLP without wavelet preprocessing on 10-year bond futures indicate a doubling in profit per trade ($AUD1753:$AUD819) and a Sharpe ratio improvement of 0.732 versus 0.367, as well as significant improvements in the ratio of winning to losing trades, thus indicating significant potential profitability for live trading. Index Terms--Autocorrelation shell representation, automatic relevance determination, financial time series, futures trading, multilayer perceptron, relevance determination, wavelet decomposition.
- Published: 2001
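A compressed sketch of the per-scale idea, using a simple à trous-style additive decomposition as a stand-in for the autocorrelation shell representation, one small MLP per scale, and input windows that grow with scale (mimicking what ARD selected in the paper). Everything below, from the kernel to the window lengths, is an illustrative assumption:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def a_trous(x, levels=3):
    """Additive a-trous-style decomposition: x = smooth + sum(details),
    a shift-invariant stand-in for the autocorrelation shell representation."""
    details, smooth = [], x.astype(float)
    for j in range(levels):
        k = 2 ** j                                  # dilation grows per level
        p = np.pad(smooth, k, mode="edge")
        smoother = 0.25 * p[:-2 * k] + 0.5 * p[k:-k] + 0.25 * p[2 * k:]
        details.append(smooth - smoother)
        smooth = smoother
    return details, smooth

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=512))            # stand-in futures series

details, smooth = a_trous(series)
bands = details + [smooth]
# Short windows at fine scales, long windows at coarse scales.
windows = [4, 8, 16, 32]

forecast = 0.0
for band, w in zip(bands, windows):
    X = np.array([band[t - w:t] for t in range(w, len(band))])
    y = band[w:]
    mlp = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                       random_state=0).fit(X[:-1], y[:-1])
    # Next-step forecast per band; summing is valid because the
    # decomposition is additive.
    forecast += mlp.predict(band[-w:].reshape(1, -1))[0]

print("one-step-ahead forecast:", forecast)
```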
23. Modeling Exchange Rates: Smooth Transitions, Neural Networks, and Linear Models
- Author: Medeiros, Marcelo C., Veiga, Alvaro, and Pedreira, Carlos Eduardo
- Subjects: Neural networks -- Research; Bayesian statistical decision theory -- Usage; Smoothing (Numerical analysis) -- Research; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: The goal of this paper is to test for and model nonlinearities in several monthly exchange rate time series. We apply two different nonlinear alternatives, namely: the artificial neural-network time series model estimated with Bayesian regularization and a flexible smooth transition specification, called the neuro-coefficient smooth transition autoregression. The linearity test rejects the null hypothesis of linearity in 10 out of 14 series. We compare, using different measures, the forecasting performance of the nonlinear specifications with the linear autoregression and the random walk models. Index Terms--Bayesian regularization, exchange rates, neural networks, smooth transition models, time series.
- Published: 2001
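The smooth transition idea can be sketched concretely as a logistic smooth transition autoregression (LSTAR), the basic form that the neuro-coefficient specification generalizes: two AR regimes blended by a smooth logistic weight. The coefficients below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstar_step(y_prev, gamma=4.0, c=0.0):
    """One step of a logistic smooth transition autoregression: two AR(1)
    regimes blended by a logistic transition function of the lagged value."""
    G = 1.0 / (1.0 + np.exp(-gamma * (y_prev - c)))   # transition weight
    regime0 = 0.9 * y_prev            # AR coefficients are illustrative
    regime1 = -0.3 * y_prev + 0.5
    return (1 - G) * regime0 + G * regime1 + 0.1 * rng.normal()

y = [0.1]
for _ in range(500):
    y.append(lstar_step(y[-1]))
```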
24. General Statistical Inference for Discrete and Mixed Spaces By an Approximate Application of the Maximum Entropy Principle
- Author: Yan, Lian and Miller, David J.
- Subjects: Entropy (Information theory) -- Analysis; Discrete-time systems -- Usage; Bayesian statistical decision theory -- Usage; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: We propose a new method for learning a general statistical inference engine, operating on discrete and mixed discrete/continuous feature spaces. Such a model allows inference on any of the discrete features, given values for the remaining features. Applications are, e.g., to medical diagnosis with multiple possible diseases, fault diagnosis, information retrieval, and imputation in databases. Bayesian networks (BN's) are versatile tools that possess this inference capability. However, BN's require explicit specification of conditional independencies, which may be difficult to assess given limited data. Alternatively, Cheeseman proposed finding the maximum entropy (ME) joint probability mass function (pmf) consistent with arbitrary lower order probability constraints. This approach is in principle powerful and does not require explicit expression of conditional independence. However, until now, the huge learning complexity has severely limited the use of this approach. Here we propose an approximate ME method, which also encodes arbitrary low-order constraints but while retaining quite tractable learning. Our method uses a restriction of joint pmf support (during learning) to a subset of the feature space. Results on the University of California-Irvine repository reveal performance gains over several BN approaches and over multilayer perceptrons. Index Terms--Continuous and discrete feature spaces, maximum entropy, multiple inference task, probabilistic expert system.
- Published: 2000
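The maximum entropy construction can be illustrated with iterative proportional fitting on three binary variables: starting from a uniform joint and repeatedly rescaling to match each pairwise marginal converges to the maximum entropy joint pmf consistent with those constraints. This toy stands in for the paper's approximate large-scale method; the marginals are invented (and mutually consistent):

```python
import numpy as np

# Target pairwise marginals for three binary variables (illustrative).
pairs = {(0, 1): np.array([[0.3, 0.2], [0.1, 0.4]]),
         (0, 2): np.array([[0.25, 0.25], [0.15, 0.35]]),
         (1, 2): np.array([[0.2, 0.2], [0.2, 0.4]])}

# Iterative proportional fitting: from uniform, rescale the joint pmf to
# match each pairwise constraint in turn; the fixed point is the maximum
# entropy joint consistent with the constraints.
p = np.full((2, 2, 2), 1 / 8)
for _ in range(200):
    for (i, j), target in pairs.items():
        k = 3 - i - j                        # the marginalized-out axis
        current = p.sum(axis=k)
        ratio = np.where(current > 0, target / current, 0.0)
        p = p * np.expand_dims(ratio, axis=k)

# Inference on any variable given the others, e.g., P(x2 | x0=1, x1=0):
print(p[1, 0, :] / p[1, 0, :].sum())
```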
25. Bayesian Nonlinear Model Selection and Neural Networks: A Conjugate Prior Approach
- Author: Vila, Jean-Pierre, Wagner, Verene, and Neveu, Pascal
- Subjects: Neural networks -- Models; Bayesian statistical decision theory -- Usage; Regression analysis -- Methods; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: In order to select the best predictive neural-network architecture in a set of several candidate networks, we propose a general Bayesian nonlinear regression model comparison procedure, based on the maximization of an expected utility criterion. This criterion selects the model under which the training set achieves the highest level of internal consistency, through the predictive probability distribution of each model. The density of this distribution is computed as the model posterior predictive density and is asymptotically approximated from the assumed Gaussian likelihood of the data set and the related conjugate prior density of the parameters. The use of such a conjugate prior allows the analytic calculation of the parameter posterior and predictive posterior densities, in an empirical-Bayes-like approach. This Bayesian selection procedure allows us to compare general nonlinear regression models and in particular feedforward neural networks, in addition to embedded models as usual with asymptotic comparison tests. Index Terms--Bayesian model selection, conjugate prior distribution, empirical Bayes methods, expected utility criterion, feedforward neural network, nonlinear regression.
- Published: 2000
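The conjugate-prior payoff is easiest to see in the linear-Gaussian case, where the marginal likelihood of each candidate model is available in closed form. The sketch below compares polynomial models by evidence as a stand-in for the paper's neural-network procedure; the hyperparameters are assumed:

```python
import numpy as np

def log_evidence(Phi, t, alpha=1.0, beta=25.0):
    """Marginal likelihood of a linear-Gaussian model with conjugate
    Gaussian prior N(0, alpha^-1 I) and noise precision beta -- the
    analytically tractable core of conjugate-prior model comparison."""
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi
    m = beta * np.linalg.solve(A, Phi.T @ t)
    E = 0.5 * beta * np.sum((t - Phi @ m) ** 2) + 0.5 * alpha * m @ m
    return (0.5 * M * np.log(alpha) + 0.5 * N * np.log(beta) - E
            - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2 * np.pi))

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
t = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=40)

# Compare candidate polynomial models by evidence: the best-supported
# degree wins, without a separate validation set.
for degree in range(1, 9):
    Phi = np.vander(x, degree + 1, increasing=True)
    print(degree, round(log_evidence(Phi, t), 2))
```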
26. The Application of Neural Networks to the Papermaking Industry
- Author: Edwards, Peter J., Murray, Alan F., Papadopoulos, Georgios, Wallace, A. Robin, Barnard, John, and Smith, Gordon
- Subjects: Neural networks -- Research; Papermaking machinery -- Product enhancement; Bayesian statistical decision theory -- Usage; Quality control equipment -- Usage; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: This paper describes the application of neural network techniques to the papermaking industry, particularly for the prediction of paper 'curl.' Paper curl is an important quality measure that can only be measured reliably off-line after manufacture, making it difficult to control. Here we predict, before paper manufacture and from characteristics of the current reel, whether the paper curl will be acceptable and the level of curl. For both the case of predicting the probability that paper will be 'out-of-specification' and that of predicting the level of curl, we include confidence intervals indicating to the machine operator whether the predictions should be trusted. The results and the associated discussion describe a successful application of neural networks to a difficult, but important, real-world task taken from the papermaking industry. In addition, the techniques described are widely applicable in industry where direct prediction of a quality measure and its acceptability are desirable, with a clear indication of prediction confidence. Index Terms--Bayesian inference, collinearity reduction, committee of networks, confidence measures, multilayer perceptron, symbolic data.
- Published: 1999
27. Bayesian Approach to Neural-Network Modeling with Input Uncertainty
- Author: Wright, W. A.
- Subjects: Bayesian statistical decision theory -- Usage; Neural networks -- Models; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: It is generally assumed when using Bayesian inference methods for neural networks that the input data contains no noise or corruption. For real-world (errors in variable) problems this is clearly an unsafe assumption. This paper presents a Bayesian neural-network framework which allows for input noise provided that some model of the noise process exists. In the limit where the noise process is small and symmetric it is shown, using the Laplace approximation, that this method gives an additional term to the usual Bayesian error bar which depends on the variance of the input noise process. Further, by treating the true (noiseless) input as a hidden variable and sampling this jointly with the network's weights, using a Markov chain Monte Carlo method, it is demonstrated that it is possible to infer the regression over the noiseless input. Index Terms--Bayesian estimation, errors in variables, uncertainty.
- Published: 1999
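A sketch of the augmented error bar in the small-noise limit: the usual Laplace-approximation variance plus an extra term proportional to the input-noise variance. The toy model, MAP weights, and covariances below are all assumptions for illustration:

```python
import numpy as np

def predict(x, w):
    """Toy network stand-in: prediction as a function of input and weights."""
    return np.tanh(w[0] * x + w[1]) * w[2]

x0 = 0.7
w_map = np.array([1.5, -0.2, 2.0])       # assumed MAP weights
Sigma_w = np.diag([0.01, 0.02, 0.05])    # assumed posterior weight covariance
noise_var, input_var = 0.05, 0.04        # output-noise and input-noise variances

eps = 1e-5
# Gradients by central finite differences.
g_w = np.array([(predict(x0, w_map + eps * e) - predict(x0, w_map - eps * e))
                / (2 * eps) for e in np.eye(3)])
g_x = (predict(x0 + eps, w_map) - predict(x0 - eps, w_map)) / (2 * eps)

# Standard Bayesian error bar plus the additional input-noise term from the
# small-noise Laplace-approximation result described in the abstract.
total_var = noise_var + g_w @ Sigma_w @ g_w + input_var * g_x**2
print(f"{predict(x0, w_map):.3f} +/- {np.sqrt(total_var):.3f}")
```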
28. A tight bound on concept learning
- Author: Takahashi, Haruhisa and Gu, Hanzhong
- Subjects: Approximation theory -- Analysis; Bayesian statistical decision theory -- Usage; Distributions, Theory of (Functional analysis) -- Analysis; Interpolation -- Research; Machine learning -- Research; Neural networks -- Research; Business; Computers; Electronics; Electronics and electrical industries
- Abstract: A tight bound on the generalization performance of concept learning is shown by a novel approach. Unlike the existing theories, the new approach uses no assumption on large sample size as in the Bayesian approach and does not consider uniform learnability as in the VC dimension analysis. We analyze the generalization performance of a particular learning algorithm that is not necessarily well behaved, in the hope that once the learning curves or sample complexity of this algorithm are obtained, they are applicable to real learning situations. The result is expressed in a dimension called the Boolean interpolation dimension, and is tight in the sense that it meets the lower bound requirement of Baum and Haussler. The Boolean interpolation dimension is not greater than the number of modifiable system parameters, and is definable for almost all real-world networks such as back-propagation networks and linear threshold multilayer networks. It is shown that the generalization error follows a beta distribution with parameters m, the number of training examples, and d, the Boolean interpolation dimension. This implies that for large d, the learning results tend to the average-case result, known as the self-averaging property of learning. The bound is shown to be applicable to practical learning algorithms that can be modeled by the Gibbs algorithm with a uniform prior. The result is also extended to the case of inconsistent learning. Index Terms--Backpropagation, generalization error, interpolation dimension, neural networks, PAC learning, sample complexity, VC dimension.
- Published: 1998