192 results for "Sample space"
Search Results
2. Globalized distributionally robust optimization based on samples
- Author
- Li, Yueyao and Xing, Wenxun
- Published
- 2024
- Full Text
- View/download PDF
3. A critique of terms and symbols in school mathematics
- Author
- Robertas Vilkas
- Subjects
- notations, polynomial, elementary event, sample space, even number, Mathematics, QA1-939
- Abstract
Criticism of some terms and symbols of school mathematics is presented, and recommendations are given. The problem of writing decimal fractions using a comma and the resulting semicolon when separating such numbers, which contradicts not only the international symbols of mathematics, but also the general rules of the Lithuanian language, is discussed separately. The incorrect use of the concept of an elementary event not only in schools, but also in universities, is discussed in more detail.
- Published
- 2023
- Full Text
- View/download PDF
4. Deep learning for industrial image: challenges, methods for enriching the sample space and restricting the hypothesis space, and possible issue.
- Author
- Liu, Tianyuan, Bao, Jinsong, Wang, Junliang, and Wang, Jiacheng
- Subjects
- DEEP learning, COMPUTATIONAL learning theory, PATTERN recognition systems, IMAGE recognition (Computer vision), PROBLEM solving, PRODUCT improvement
- Abstract
Deep learning (DL) is an important enabling technology for intelligent manufacturing, and DL-based industrial image pattern recognition (DLBIIPR) plays a vital role in improving product quality and production efficiency. Although DL has been widely used on natural images, industrial images often exhibit a mix of characteristics, such as small samples, class imbalance, small targets, strong interference, fine granularity, temporality, and rich semantics, that reduce the feasibility and generalization of DLBIIPR. To address this problem, the paper surveys approaches commonly used in industry for enriching the sample space and restricting the hypothesis space. To improve the confidence of front-line workers in DL models, explainable deep learning (XDL) methods are also reviewed, and a case study is used to verify their effectiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
5. What Is Actually Equated in "Test Equating"? A Didactic Note.
- Author
- van der Linden, Wim J.
- Subjects
- SCALING (Social sciences), TEST scoring, PROBABILITY theory
- Abstract
The current literature on test equating generally defines it as the process necessary to obtain score comparability between different test forms. The definition is in contrast with Lord's foundational paper which viewed equating as the process required to obtain comparability of measurement scale between forms. The distinction between the notions of scale and score is not trivial. The difference is explained by connecting these notions with standard statistical concepts as probability experiment, sample space, and random variable. The probability experiment underlying equating test forms with random scores immediately gives us the equating transformation as a function mapping the scale of one form into the other and thus supports the point of view taken by Lord. However, both Lord's view and the current literature appear to rely on the idea of an experiment with random examinees which implies a different notion of test scores. It is shown how an explicit choice between the two experiments is not just important for our theoretical understanding of key notions in test equating but also has important practical consequences. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
6. Structural Zero Data of COVID-19 Discovers Exodus Probabilities
- Author
- Shanmugam R and Singh KP
- Subjects
- public knowledge, perception, communication, sample space, dependent probabilities, Venn layout, odds, odds ratio, survey outcomes, Medicine (General), R5-920
- Abstract
Background: Challenges in managing, mitigating, or preventing the COVID-19 pandemic are felt by medical and healthcare professionals and by governing agencies. Health researchers conduct surveys among citizens to capture their opinions on COVID-19. In such surveys, as in Hanafiah and Wan (2020), a structural-zero category (different from a sampling zero) occurs when questions address perception, knowledge, and communication regarding COVID-19.
Materials: The data were collected in a survey conducted among Malaysians by Hanafiah and Wan regarding COVID-19, focusing on people's responses about public communication, knowledge, and perception.
Methods: One of the four question categories in the survey is mutually exclusive with the other three, so there is no entry in that category; such a group is called a structurally zero category in the literature. The literature has not probed how the unknown proportion belonging to the structural-zero category migrates to the other categories. In this article, a new probability-based method configures that proportion, which is the essence of our approach.
Results: The mutually exclusive nature of the subquestions produced a structural zero in the data. A careful analysis of the data led to probability concepts so far unknown in the literature, which we name "Exodus probabilities". Their discovery and utility are illustrated and elaborated with an application to COVID-19; the methodology is also useful in engineering, epidemiology, marketing, communication networking, and other applications.
Conclusion: What is novel about the discovery of the exodus probability is the evolution of the concepts from the structural-zero category. When a category is eliminated, the proportions of the sample may have silently transited to other viable categories, and our research question is about configuring those proportions.
Keywords: public knowledge, perception, communication, sample space, dependent probabilities, Venn layout, odds, odds ratio, survey outcomes
- Published
- 2021
7. An Approach to Predicting Fatigue Crack Growth Under Mixed-Mode Loading Based on Improved Gaussian Process
- Author
- Honghui Wang, Xin Fang, Guijie Liu, Yingchun Xie, Xiaojie Tian, Dingxin Leng, and Weilei Mu
- Subjects
- Fatigue crack growth, mixed-mode, improved Gaussian process, sample space, local sample densification, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
This paper proposes an approach to predicting fatigue crack growth under mixed-mode loading based on an improved Gaussian process. After analyzing the theoretical background of fatigue crack growth, a corresponding finite element model is built to generate sufficient simulation data, from which the key parameters (e.g., the stress intensity factor) of the crack growth process are obtained. A Gaussian process model is then fitted to capture the nonlinear, continuous variation of the stress intensity factor during crack growth, especially under mixed-mode loading. Next, a local sample densification method is applied to improve the Gaussian sample generation process according to a simplified model of the crack growth path. On this basis, a fatigue crack growth prediction model using the improved Gaussian process is completed and verified against test data for lower bainite steel (SCM435). The results show that the proposed approach has better computational accuracy and efficiency than the traditional finite element method in predicting fatigue crack growth under mixed-mode loading.
- Published
- 2021
- Full Text
- View/download PDF
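Result 7 above centres on fitting a Gaussian process to simulated stress-intensity-factor data. As a minimal, hedged sketch of that general idea (not the authors' model), scikit-learn's GaussianProcessRegressor can be fitted to synthetic crack-length data; the data, kernel, and noise level below are illustrative assumptions.

```python
# Minimal Gaussian-process regression sketch: fit noisy samples of a nonlinear
# response (a stand-in for stress intensity factor vs. crack length) and predict
# on a dense grid with uncertainty. Illustrative only; not the paper's model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
a = rng.uniform(1.0, 10.0, size=(40, 1))                 # hypothetical crack lengths (mm)
k = 12.0 * np.sqrt(a[:, 0]) + rng.normal(0.0, 0.5, 40)   # noisy "SIF-like" response

gp = GaussianProcessRegressor(kernel=RBF(length_scale=2.0) + WhiteKernel(1e-2),
                              normalize_y=True)
gp.fit(a, k)

a_grid = np.linspace(1.0, 10.0, 200).reshape(-1, 1)
mean, std = gp.predict(a_grid, return_std=True)          # prediction with uncertainty band
print(mean[:3], std[:3])
```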
8. FAIRNESS IN GAMES: A STUDY ON CHILDREN'S AND ADULTS' UNDERSTANDING OF PROBABILITY.
- Author
- BATISTA, RITA, BORBA, RUTE, and HENRIQUES, ANA
- Subjects
- PROBABILITY theory, BAYESIAN analysis, REASONING, THOUGHT & thinking, GAMES
- Abstract
This study aims to analyze the reasoning that children and adults with the same school level use to assess and justify the fairness of games, considering aspects of probability such as randomness, sample space, and comparison of probabilities. Data collection included a Piagetian clinical interview based on games of chance. The results showed that participants' judgments about the fairness of the games depended mainly on their understanding of the independence of events, analysis of the sample space, and perception of proportionality when comparing probabilities, and indicated they may hold misunderstandings about these ideas. The similarly low performance of adults and children on probabilistic reasoning tasks indicates that the maturity and experience of the adults were not enough to develop appropriate probabilistic reasoning and to use it to assess the fairness of a game consistently. Thus, teaching interventions to expand and consolidate students' learning in probability are recommended, and the activities presented in this study may serve as a basis for such interventions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
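Several of the probability-education entries (results 8, 9, and 11) hinge on enumerating a sample space and comparing event probabilities. A small sketch of that computation for an invented two-dice game (not one of the tasks used in the studies) follows.

```python
# Enumerate the sample space of two dice and compare the probabilities of two
# events to judge whether a hypothetical game is fair.
from itertools import product
from fractions import Fraction

sample_space = list(product(range(1, 7), repeat=2))      # 36 equally likely outcomes

p_sum_seven = Fraction(sum(1 for a, b in sample_space if a + b == 7), len(sample_space))
p_double = Fraction(sum(1 for a, b in sample_space if a == b), len(sample_space))

print(p_sum_seven, p_double)     # 1/6 and 1/6 -> this particular game would be fair
print(p_sum_seven == p_double)   # True
```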
9. Solving Probabilistic Tasks in Geometrical Context by Primary School Students.
- Author
- Hernández Solís, Luis Armando, Batanero, Carmen, Gea, María Magdalena, and Álvarez-Arroyo, Rocío
- Abstract
We present an exploratory study of solving probabilistic tasks proposed to a sample of 55 primary school 6th grade Costa Rican children on comparison of probabilities and the construction of the sample space, analysing their strategies and errors. Comparing the results with previous investigations, an improvement is observed in the item in which the comparison of favorable and possible cases can be applied, and where the comparison of areas is necessary; however, there were no differences in the item in which the order in which the favorable cases are located is introduced as a distractor. The sample space is generally correctly built in the cases of possible and equiprobable event, but not in those of impossible or certain events. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
10. Exact Confidence Limits for the Parameter of an Exponential Distribution in the Accelerated Life Tests under Type-I Censoring.
- Author
- Zheng, De-qiang and Fang, Xiang-zhong
- Abstract
Life data frequently arise in many reliability studies, such as accelerated life tests studies. This paper considers the part of life data where failure and censoring observations may exist. To develop statistical methods and theory for the analysis of these data, a new approach was proposed to obtain the exact lower and upper confidence limits for the mean life of the exponential distribution with Type-I censoring data. It is assumed that the acceleration factor is a random variable, and that the distribution of the acceleration factor is known from some empirical information or the meta analysis. A method for constructing the lower and upper confidence limits for the parameter based on an ordering relation among the sample space was proposed. Simulation studies and analyses of two examples suggest that the proposed method performed well. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
11. CONSTRUCTING INFORMAL VOCABULARY: A MIDDLE-SCHOOLER'S PATH TO PROBABILISTIC REASONING.
- Author
- Findley, Kelly and Atabas, Sebnem
- Subjects
- MIDDLE school education, STUDENT participation, PROBABILISTIC databases, REASONING, VACATION schools, INSTRUCTIONAL innovations
- Abstract
Our research investigates the probabilistic reasoning of middle-school students making sense of sample space, distribution, and long-run likelihood. Eight students participating in a 3-week summer class on probability participated in games and activities that focused on concepts from the Common Core State Standards for grades 6-8. These activities utilized both concrete and simulated environments. Through engagement with the games and activities, we captured seeds of probabilistic reasoning that emerged with one student in particular, Evan, and took note of the vocabulary he and the class created to make sense of seemingly paradoxical situations. Implications for instruction on probability are briefly discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2018
12. A new parallel data geometry analysis algorithm to select training data for support vector machine
- Author
- Shu Lv, Kai-bo Shi, and Yunfeng Shi
- Subjects
- Support vector machine, sample reduction, Mahalanobis distance, geometry analysis, parallel, Measure (data warehouse), Centroid, Geometry, Sample (statistics), Outlier, Trigonometric functions, Sample space, Computer science, General Mathematics, Algorithm, Mathematics, QA1-939
- Abstract
Support vector machine (SVM) is one of the most powerful machine learning technologies and has attracted wide attention because of its remarkable performance. However, when dealing with the classification of large-scale datasets, the high complexity of the SVM model leads to low efficiency and becomes impractical. Exploiting the sparsity of SVM in the sample space, this paper presents a new parallel data geometry analysis (PDGA) algorithm that reduces the SVM training set and thereby improves training efficiency. PDGA introduces the Mahalanobis distance to measure the distance from each sample to its centroid and, based on this, identifies non-support vectors and outliers at the same time to remove redundant data. To reduce the training set further, a cosine angle distance analysis method is proposed to determine whether remaining samples are redundant while ensuring that valuable data are not removed. Unlike previous data geometry analysis methods, the PDGA algorithm is implemented in parallel, which greatly reduces the computational cost. Experimental results on an artificial dataset and six real datasets show that the algorithm adapts to different sample distributions, significantly reduces training time and memory requirements without sacrificing classification accuracy, and clearly outperforms the other five competing algorithms.
- Published
- 2021
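Result 12 screens SVM training data by the Mahalanobis distance of each sample to its class centroid; that distance computation looks roughly like the sketch below, where the synthetic data and the pruning thresholds are illustrative assumptions, not the PDGA algorithm itself.

```python
# Mahalanobis distance of each sample to its class centroid, used as a rough
# screen for outliers and for points unlikely to become support vectors.
import numpy as np

rng = np.random.default_rng(1)
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=[[2.0, 0.6], [0.6, 1.0]], size=200)

centroid = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - centroid
d = np.sqrt(np.einsum('ij,jk,ik->i', diff, cov_inv, diff))   # one distance per sample

far = d > np.quantile(d, 0.99)     # outlier candidates
near = d < np.quantile(d, 0.25)    # deep-interior points, unlikely support vectors
print(far.sum(), near.sum())
```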
13. Kernel C-Means Clustering Algorithms for Hesitant Fuzzy Information in Decision Making.
- Author
- Li, Chaoqun, Zhao, Hua, and Xu, Zeshui
- Subjects
- CLUSTER analysis (Statistics), KERNEL functions, FUZZY sets, FUZZY systems, DECISION making, STATISTICAL decision making
- Abstract
When facing clustering problems for hesitant fuzzy information, we normally solve them on sample space by using a certain hesitant fuzzy clustering algorithm, which is usually time-consuming or generates inaccurate clustering results. To overcome the issue, we propose a novel hesitant fuzzy clustering algorithm called hesitant fuzzy kernel C-means clustering (HFKCM) by means of kernel functions, which maps the data from the sample space to a high-dimensional feature space. As a result, the differences between different samples are expanded and thus make the clustering results much more accurate. By conducting simulation experiments on distributions of facilities and the twenty-first Century Maritime Silk Road, the results reveal the feasibility and availability of the proposed HFKCM algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
15. Optimal performance evaluation of thermal AGC units based on multi-dimensional feature analysis.
- Author
- Li, Bin, Wang, Shuai, Li, Botong, Li, Hongbo, and Wu, Jianzhong
- Subjects
- DATA scrubbing, AUTOMATIC control systems, EVALUATION methodology, PROBLEM solving
- Abstract
• Tracking abilities of AGC units were evaluated based on energy regulation.
• An AGC sample space was constructed and its regularities were analysed in a data-driven way.
• Appropriate operating conditions for different AGC units were obtained in detail.
In modern energy systems, automatic generation control (AGC) is the core technology for real-time output regulation of thermal power generators. The performance of thermal AGC units must be accurately evaluated to measure their actual contribution to the energy system. However, current conventional evaluation methods neither distinguish nor quantify the difficulty of the tasks undertaken by AGC units. An optimal performance evaluation method based on multi-dimensional feature analysis is therefore proposed. First, a performance index describing the difference between the expected regulating energy and the actual regulated energy of AGC units is designed, which improves the applicability of the evaluation to real engineering scenarios. Then, after data preprocessing and cleaning, a sample space is constructed to clearly distinguish the difficulty of the tasks performed by AGC units. Finally, a multi-dimensional feature analysis in this sample space is used to find the optimal performance points of AGC units. The proposed methods were verified on real AGC units using historical data. The experimental results show that the method yields detailed evaluation results for thermal AGC units with different control requirements and solves the evaluation failures of the traditional method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
17. Random Fuzzy Clustering Granular Hyperplane Classifier
- Author
- Wei Li, Chao Tang, Xiaoyu Ma, Yumin Chen, Kaiqiang Zhang, and Youmeng Luo
- Subjects
- Fuzzy clustering, granular computing, parallel distributed granulation, fuzzy clustering granular hyperplane, fuzzy sets, Genetic algorithm, Reinforcement learning, Cluster analysis, Time complexity, Pattern recognition, Classification, Hyperplane, Sample space, Rough set, Artificial intelligence, Classifier (UML), Computer science, General Computer Science, General Engineering, General Materials Science, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Granular computing is a method for studying human intelligent information processing and has advantages for knowledge discovery. In this paper, we convert a classification problem in the sample space into a classification problem in a fuzzy clustering granular space and propose a random fuzzy clustering granular hyperplane classifier (RFCGHC) from the perspective of granular computing. Whereas most classifiers can only process numerical data, RFCGHC can process both non-numerical data, such as information granules, and numerical data. The classic granulation method is generally serial and has high time complexity, so we design a parallel distributed granulation method to enhance efficiency. First, a clustering algorithm with an adaptive number of cluster centers is proposed, in which the ratio of the between-category to within-category standard deviation serves as the evaluation criterion; the method yields the clusters and the optimal number of cluster centers. On this basis, the sample set can be divided into many subsets and each sample can be granulated by these cluster centers, forming a fuzzy clustering granular space in which fuzzy clustering granules, fuzzy clustering granular vectors, and their operators can be defined. To obtain the optimal hyperplane in this granular space, we design a loss function and evaluate each category probabilistically with the fuzzy clustering granular hyperplane; a genetic algorithm based on fuzzy clustering granules is adopted to solve the loss function. Experimental results and theoretical analysis show that RFCGHC performs well.
- Published
- 2020
18. A Dynamic Network Model For Population Growth And Urbanization
- Author
- Emir Haliki (Ege Üniversitesi)
- Subjects
- Dynamic network analysis, network model, network topology, population growth, logistic differential equation, connected component, clustering coefficient, correlation, Sample space, Topology, Computer science, Basic Sciences, General Medicine, Science (General), Engineering (General). Civil engineering (General)
- Abstract
Dynamic networks are those whose states change over time, and such changes are generally associated with the topology of the network. Dynamic models are needed for numerous systems that can be described as network models; those related to the propagation of living organisms are a typical example. The study defines a sample space on a network topology of the human population, examines how it spreads as the population grows, and analyses the correlations between population growth and various variables in the modeled dynamic network.
- Published
- 2019
19. Classification of complex systems by their sample-space scaling exponents
- Author
- Jan Korbel, Rudolf Hanel, and Stefan Thurner
- Subjects
- scaling expansion, extensive entropy, super-exponential systems, complex systems, sample space, Science, Physics, QC1-999
- Abstract
The nature of statistics, statistical mechanics, and consequently the thermodynamics of stochastic systems is largely determined by how the number of states W(N) depends on the size N of the system. Here we propose a scaling expansion of the phase-space volume W(N) of a stochastic system. The corresponding expansion coefficients (exponents) define the universality class the system belongs to. Systems within the same universality class share the same statistics and thermodynamics. For sub-exponentially growing systems such expansions have been shown to exist. By using the scaling expansion this classification can be extended to all stochastic systems, including correlated, constrained, and super-exponential systems. The extensive entropy of these systems can be easily expressed in terms of these scaling exponents. Systems with super-exponential phase-space growth include important examples, such as magnetic coins that combine combinatorial and structural statistics. We discuss other applications in the statistics of networks, aging, and cascading random walks.
- Published
- 2018
- Full Text
- View/download PDF
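As a point of reference for result 19, the textbook exponential case shows why the growth of W(N) controls whether entropy is extensive; this is a standard illustration, not the paper's scaling expansion.

```latex
% For N independent two-state units the phase-space volume grows exponentially,
% so the Boltzmann entropy is extensive (proportional to N):
W(N) = 2^{N}, \qquad S(N) = k_B \ln W(N) = N \, k_B \ln 2 .
% Sub- or super-exponential growth of W(N) breaks this proportionality,
% which is what motivates classifying systems by their scaling exponents.
```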
20. Variabilidade amostral das séries mensais de precipitação pluvial em duas regiões do Brasil: Pelotas-RS e Campinas-SP (Sample variability of monthly precipitation series in two regions of Brazil: Pelotas-RS and Campinas-SP)
- Author
- Gabriel Constantino Blain, Mary Toshie Kayano, Marcelo Bento Paes de Camargo, and Jorge Lulu
- Subjects
- espaço amostral, teste da razão da máxima verossimilhança, sample space, likelihood ratio test, Meteorology. Climatology, QC851-999
- Abstract
The present work evaluated the sample variability of the gamma distribution parameters fitted to monthly precipitation series in the regions of Campinas-SP and Pelotas-RS, which have data for the 1890-2006 and 1890-2005 periods, respectively. The sample spaces considered were 58, 39, and 29 years for Campinas, and 58 and 29 years for Pelotas. Analyses were performed using the likelihood ratio test and showed significant sample alterations. No trend was detected in the monthly precipitation series of the Campinas-SP region, whereas an increasing trend was detected in the monthly precipitation series of the Pelotas-RS region when comparing the 1948-1976 and 1977-2005 samples.
- Published
- 2009
- Full Text
- View/download PDF
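Result 20 fits gamma distributions to monthly precipitation subsamples and compares them with a likelihood ratio test. A minimal SciPy sketch of that workflow is given below; the synthetic data and the pooled-versus-split test setup are assumptions for illustration, not the authors' exact procedure.

```python
# Fit gamma distributions to two precipitation subsamples and compare them with
# a likelihood ratio test (separate fits vs. one pooled fit). Illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
early = rng.gamma(shape=2.0, scale=40.0, size=29 * 12)    # e.g. 29 years of monthly totals
late = rng.gamma(shape=2.0, scale=55.0, size=29 * 12)

def gamma_loglik(x):
    a, loc, scale = stats.gamma.fit(x, floc=0)            # location fixed at zero
    return stats.gamma.logpdf(x, a, loc=loc, scale=scale).sum()

ll_split = gamma_loglik(early) + gamma_loglik(late)       # 4 free parameters
ll_pooled = gamma_loglik(np.concatenate([early, late]))   # 2 free parameters

lr_stat = 2.0 * (ll_split - ll_pooled)                    # ~ chi-square with df = 2
p_value = stats.chi2.sf(lr_stat, df=2)
print(lr_stat, p_value)
```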
21. Joint Feature-Space and Sample-Space Based Heterogeneous Feature Transfer Method for Object Recognition Using Remote Sensing Images with Different Spatial Resolutions
- Author
- Erwei Yin, Wei Qin, Wei Hu, Xiangyi Meng, Ye Yan, Xie Liang, Xiyuan Kong, and Yan Huijiong
- Subjects
- Computer science, Chemical technology, Feature vector, Cognitive neuroscience of visual object recognition, classification of remote sensing images, TP1-1185, transfer learning, Biochemistry, Atomic and Molecular Physics, and Optics, Analytical Chemistry, Transformation (function), Dimension (vector space), Feature (computer vision), Remote Sensing Technology, Sample space, heterogeneous feature transfer, Electrical and Electronic Engineering, Projection (set theory), Focus (optics), Instrumentation, negative transfer, Remote sensing
- Abstract
To improve the classification of high-resolution remote sensing images (RSIs), feature transfer methods are needed to mine the information shared between high-resolution and low-resolution RSIs so the classifiers can be trained together. Most existing feature transfer methods can only handle homogeneous data (i.e., data with the same dimension) and are sensitive to image quality, while RSIs of different resolutions present different feature dimensions and samples acquired under different illumination conditions. Unlike existing methods that focus only on the projection transformation in feature space, a joint feature-space and sample-space heterogeneous feature transfer (JFSSS-HFT) method is proposed that processes heterogeneous multi-resolution images in feature space using projection matrices of different dimensions and, at the same time, reduces the impact of outliers through adaptive weight factors in the sample space to limit negative transfer. Moreover, a maximum interclass variance term is embedded to improve the discriminative ability of the transferred features. To solve the resulting optimization problem, the alternating-direction method of multipliers (ADMM) is introduced to alternately optimize the parameters of JFSSS-HFT. Experiments on ship and airplane patches with different resolutions show that JFSSS-HFT obtains better classification results than typical feature transfer methods.
- Published
- 2021
- Full Text
- View/download PDF
22. A Novel Optimization Method for Conventional Primary and Secondary School Classrooms in Southern China Considering Energy Demand, Thermal Comfort and Daylighting
- Author
- Hao Qian, Yizhe Xu, Gang Wang, Yanlong Jiang, Liang Sun, and Chengchu Yan
- Subjects
- Computer science, Geography, Planning and Development, TJ807-830, Management, Monitoring, Policy and Law, TD194-195, Renewable energy sources, multi-objective, Genetic algorithm, GE1-350, Artificial neural network, Environmental effects of industries and plants, Renewable Energy, Sustainability and the Environment, Sorting, Energy consumption, Industrial engineering, Simulation software, Environmental sciences, envelope optimization, ANN, school classroom, Benchmark (computing), Sample space, Daylighting
- Abstract
The classroom environment is of great significance for the health of primary and secondary school students, but a comfortable indoor environment often requires higher energy consumption. This paper presents a multi-objective optimization method based on an artificial neural network (ANN) model, which can help designers efficiently optimize the design of primary and secondary school classrooms in southern China. In this optimization method, first, the optimization objectives and variables are determined according to building characteristics, and the physical model is established through simulation software (EnergyPlus) to generate the sample space. Second, sensitivity analysis is carried out for each optimization variable, and the physical model is modified according to the results to regenerate the sample space. Third, the ANN model is trained by using the regenerated sample space, and the Pareto optimal solution is generated through the use of the non-dominated sorting genetic algorithm II (NSGA-II). Finally, the effectiveness of the multi-objective optimization method is proven through a typical case of primary and secondary school classrooms in Nanjing, China. The results show that, compared with the benchmark scheme, TES decreased by 810.8 kWh at most, PT increased by 47.8% at most and DI increased by 4.2% at most.
- Published
- 2021
23. Functional, randomized and smoothed multivariate quantile regions
- Author
- Olivier P. Faugeras and Ludger Rüschendorf
- Subjects
- Statistics and Probability, Numerical Analysis, Multivariate statistics, Markov chain, Mass transportation, Multivariate normal distribution, Humanities and Social Sciences/Economics and Finance, Stability (probability), Copula (probability theory), Depth area, Copula, Sample space, Applied mathematics, Statistics, Probability and Uncertainty, Vector quantiles, Economics and Finance, Smoothing, Quantile, Mathematics
- Abstract
The mass transportation approach to multivariate quantiles in Chernozhukov et al. (2017) was modified in Faugeras and Rüschendorf (2017) by a two-step procedure. In the first step, a mass transportation problem from a spherical reference measure to the copula is solved; it is combined in the second step with a marginal quantile transformation in the sample space. Generalized quantiles given by suitable Markov morphisms are also introduced there. In the present paper, this approach is further extended by a functional approach in terms of membership functions and by the introduction of randomized quantile regions. In addition, in the case of continuous marginals, a smoothed version of the empirical quantile regions is obtained by smoothing the empirical copula. All three extended approaches give empirical quantile areas of exact level and improved stability. The resulting depth areas give a valid representation of the central quantile areas of a multivariate distribution and provide a valuable tool for their analysis.
- Published
- 2021
24. The Event: An Underexamined Risk Concept.
- Author
- Yellman, Ted W.
- Subjects
- RISK assessment, AXIOMS, INCONSISTENT verdicts, LINGUISTICS, TEXTBOOKS, ASSERTIVENESS (Psychology)
- Abstract
Some of the terms used in risk assessment and management are poorly and even contradictorily defined. One such term is 'event,' which arguably describes the most basic of all risk-related concepts. The author cites two contemporary textbook interpretations of 'event' that he contends are incorrect and misleading. He then examines the concept of an event in A. N. Kolmogorov's probability axioms and in several more-current textbooks. Those concepts are found to be too narrow for risk assessments and inconsistent with the actual usage of 'event' by risk analysts. The author goes on to define and advocate linguistic definitions of events (as opposed to mathematical definitions)-definitions constructed from natural language. He argues that they should be recognized for what they are: the de facto primary method of defining events. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
25. A novelty data mining approach for multi-influence factors on billet gas consumption in reheating furnace
- Author
- Tang Kai, Jiaqi Li, Demin Chen, Biao Lu, and Yibo Zhao
- Subjects
- Iron and steel industry, Billet gas consumption, Reheating furnace, Multi-influence factors, Data mining, Residence time (statistics), Degree (temperature), Novelty, Division (mathematics), Data set, Sample space, Gas consumption, Interpolation, Engineering (miscellaneous), Engineering (General). Civil engineering (General), Fluid Flow and Transfer Processes, Mathematics, TA1-2040
- Abstract
To systematically and quantitatively analyze the factors influencing billet gas consumption (BGC) in a reheating furnace, this paper proposes a novel data mining approach for multi-factor BGC analysis. The model comprises four steps. First, a BGC apportionment model is established based on the energy apportionment model of the reheating furnace. Second, the BGC data set is obtained according to the division of the billet sample space (BSS). Third, a data interpolation method for the various BSS subsets (BSSSs) is put forward. Finally, a method for analyzing the degree to which each factor influences BGC is described in detail; in particular, a contribution degree model is established to quantify the influence of each factor. The case study showed that working groups (WGs) should be excluded because of their weak influence on BGC, and that the contribution degrees, from weak to strong, were working shifts (WSs) (1.61%), residence time (9.7%), and loading temperature (88.68%). Residence time and loading temperature should therefore be emphasized among all factors, and measures and suggestions for improving them are put forward.
- Published
- 2021
26. Risk Assessment of Hypertension in Steel Workers Based on LVQ and Fisher-SVM Deep Excavation
- Author
- Hai-Dong Wang, Jing Li, Juxiang Yuan, Xin Zhang, Lu Zhang, Robertas Damaševičius, Wei Wei, Jianhui Wu, Guoli Wang, Jie Wang, and Marcin Wozniak
- Subjects
- LVQ neural network, Learning vector quantization, Framingham Risk Score, hypertension, Artificial neural network, risk assessment, Risk factor (computing), Fisher-SVM, Steel workers, Support vector machine, Sample size determination, Statistics, Sample space, General Computer Science, General Engineering, General Materials Science, Electrical engineering. Electronics. Nuclear engineering, TK1-9971, Mathematics
- Abstract
The steel industry is one of the pillar industries in China, and the physical and mental health of steel workers is tied to its development. Steel workers have long worked in shift-based, high-temperature, noisy, highly stressful, front-line environments, and these occupational factors affect their health. Existing hypertension risk scoring models do not include occupation-related factors, so they are not applicable to steel workers, and a dedicated risk scoring model is needed. In this study, the learning vector quantization (LVQ) neural network algorithm and the Fisher-SVM coupling algorithm are applied to estimate the hypertension risk of steel workers, and the "tailing" phenomenon of the two algorithms is analyzed graphically to describe how sample size changes in different intervals affect classification performance. The results show that the classification accuracy depends on the size of the sample space. When the sample size n ≤ 30 * (k + 1), the Fisher-SVM coupling algorithm is more suitable: its average accuracy is 90.00%, whereas the LVQ algorithm reaches only 63.34%. When n > 30 * (k + 1), the LVQ algorithm is more suitable: its average accuracy is 93.33%, whereas the Fisher-SVM coupling algorithm reaches only 76.67%. The sample size in this paper is 4422, for which the LVQ neural network model predicts more accurately. Based on the relative importance of each risk factor obtained from this model, a hypertension risk rating scale for steel workers was established: a score greater than 18 is considered high risk, 12-18 medium risk, and less than 12 low risk. On verification with examples, the accuracy of the scale is 90.50%, showing that the established scoring system can effectively assess the hypertension risk of steel workers and provide an effective basis for primary prevention.
- Published
- 2019
27. A Parameter-Free Cleaning Method for SMOTE in Imbalanced Classification
- Author
- Ruiqing Liu, Yanping Zhang, Zihan Ding, Yuanting Yan, Xiuquan Du, and Jie Chen
- Subjects
- constructive covering algorithm, Imbalanced data, oversampling, SMOTE, data cleaning, Sample space, Measure (mathematics), Data mining, Computer science, General Computer Science, General Engineering, General Materials Science, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Oversampling is an efficient technique for dealing with the class-imbalance problem. It addresses the problem by duplicating or generating minority-class samples to balance the distribution between the majority and minority classes, and the synthetic minority oversampling technique (SMOTE) is a typical representative. Over the past decade researchers have proposed many variants of SMOTE, but existing oversampling methods may generate incorrect minority-class samples in some scenarios, and effectively mining the inherently complex characteristics of imbalanced data remains a challenge. To this end, this paper proposes a parameter-free data cleaning method that improves SMOTE based on the constructive covering algorithm. The dataset generated by SMOTE is first partitioned into a group of covers, then hard-to-learn samples are detected from the characteristics of the sample space distribution, and finally a pair-wise deletion strategy removes them. Experimental results on 25 imbalanced datasets show that the proposed method is superior to the comparison methods in terms of various metrics, such as F-measure, G-mean, and Recall. The method not only reduces the complexity of the dataset but also improves the performance of the classification model.
- Published
- 2019
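Result 27 builds on SMOTE. For orientation, the basic SMOTE interpolation step it starts from (new minority samples placed on the segment between a minority point and one of its nearest minority neighbours) can be sketched as below; this is the standard technique only, not the paper's covering-based cleaning step.

```python
# Basic SMOTE-style synthesis: interpolate new minority samples between a
# minority point and one of its k nearest minority neighbours.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote(X_min, n_new, k=5, seed=0):
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)   # +1: each point is its own neighbour
    _, idx = nn.kneighbors(X_min)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = rng.choice(idx[i][1:])                         # a true neighbour, not the point itself
        lam = rng.random()                                 # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)

X_minority = np.random.default_rng(3).normal(size=(30, 4))
print(smote(X_minority, n_new=60).shape)                   # (60, 4)
```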
28. An Optimized Neural Network Classification Method Based on Kernel Holistic Learning and Division
- Author
- Yan Che, Li Tongbin, Hui Wen, Deli Chen, and Jianlu Yang
- Subjects
- Kernel (linear algebra), Kernel (statistics), Radial basis function, Radial basis function kernel, Robustness (computer science), Pattern recognition, Sample space, Benchmark (computing), Artificial intelligence, Classifier (UML), Subspace topology, Computer science, General Mathematics, General Engineering, Engineering (General). Civil engineering (General), QA1-939, TA1-2040, Mathematics
- Abstract
An optimized neural network classification method based on kernel holistic learning and division (KHLD) is presented. The proposed method is based on the learned radial basis function (RBF) kernel as the research object. The kernel proposed here can be considered a subspace region consisting of the same pattern category in the training sample space. By extending the region of the sample space of the original instances, relevant information between instances can be obtained from the subspace, and the classifier’s boundary can be far from the original instances; thus, the robustness and generalization performance of the classifier are enhanced. In concrete implementation, a new pattern vector is generated within each RBF kernel according to the instance optimization and screening method to characterize KHLD. Experiments on artificial datasets and several UCI benchmark datasets show the effectiveness of our method.
- Published
- 2021
- Full Text
- View/download PDF
29. Complete and competitive financial markets in a complex world
- Author
- Gianluca Cassese
- Subjects
- Statistics and Probability, Financial economics, Mathematical finance, Financial market, Extension (predicate logic), Mathematical Finance (q-fin.MF), Microeconomics, Market power, Market completeness, No arbitrage, Economics, Sample space, Sublinear pricing, Arbitrage, Statistics, Probability and Uncertainty, Finance, Probability measure, Competitive completion
- Abstract
We investigate the possibility of completing financial markets in a model with no exogenous probability measure, with market imperfections and with an arbitrary sample space. We also consider whether such an extension may be possible in a competitive environment. Our conclusions highlight the economic role of complexity.
- Published
- 2021
30. KNN Classification with One-step Computation
- Author
- Shichao Zhang and Jiaye Li
- Subjects
- Numerical linear algebra, Computation, Pattern recognition, Weighting, k-nearest neighbors algorithm, Set (abstract data type), Classification rule, Sample space, Test data, Artificial intelligence, Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Computer science, Computer Science Applications, Computational Theory and Mathematics, Information Systems
- Abstract
KNN classification is a lazy learning mode: only when a test sample is to be predicted are a suitable K value chosen and the K nearest neighbors searched from the whole training sample space; this is referred to as the lazy part of KNN classification. The lazy part has been the bottleneck of applying KNN classification because of the complete search for the K nearest neighbors. In this paper, a one-step computation is proposed to replace it. The one-step computation transforms the lazy part into a matrix computation as follows. Given a test sample, the training samples are first used to fit it under a least squares loss function; a relationship matrix is then generated by weighting all training samples according to their influence on the test sample; finally, a group lasso performs sparse learning of the relationship matrix. In this way, setting the K value and searching the K nearest neighbors are integrated into a unified computation. In addition, a new classification rule is proposed to improve the performance of one-step KNN classification. The proposed approach is experimentally evaluated and shown to be efficient and promising.
- Published
- 2020
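For contrast with the one-step formulation in result 30, the "lazy" part it replaces (choosing K and searching the K nearest neighbours for every test point) is sketched below as plain KNN; this is the baseline technique, not the authors' matrix computation.

```python
# Plain (lazy) KNN classification: for each test point, search the K nearest
# training samples and take a majority vote. The per-query neighbour search is
# the bottleneck that a one-step / matrix formulation aims to avoid.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, X_test, k=3):
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)        # distance to every training sample
        nearest = np.argsort(dists)[:k]                    # indices of the K nearest neighbours
        preds.append(Counter(y_train[nearest]).most_common(1)[0][0])
    return np.array(preds)

rng = np.random.default_rng(4)
X_train = rng.normal(size=(100, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
X_test = rng.normal(size=(5, 2))
print(knn_predict(X_train, y_train, X_test, k=5))
```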
31. Entropy and the Link Action in the Causal Set Path-Sum
- Author
- Sumati Surya, Abhishek Mathur, and Anup Anand Singh
- Subjects
- Causal sets, Quantum gravity, Path integral formulation, Action (physics), Entropy (classical thermodynamics), Sample space, Range (mathematics), Limit (mathematics), Statistical physics, Physics, Physics and Astronomy (miscellaneous), High Energy Physics - Theory (hep-th), General Relativity and Quantum Cosmology (gr-qc), Combinatorics (math.CO)
- Abstract
In causal set theory the gravitational path integral is replaced by a path-sum over a sample space $\Omega_n$ of $n$-element causal sets. The contribution from non-manifold-like orders dominates $\Omega_n$ for large $n$ and therefore must be tamed by a suitable action in the low energy limit of the theory. We extend the work of Loomis and Carlip on the contribution of sub-dominant bilayer orders to the causal set path-sum and show that the "link action" suppresses the dominant Kleitman-Rothschild orders for the same range of parameters.
- Published
- 2020
32. General flation models for count data
- Author
- Helen Ogden and Dankmar Böhning
- Subjects
- Statistics and Probability, Inflation, Inference, Truncated distribution, Sample space, Statistics, Probability and Uncertainty, Baseline (configuration management), Random variable, Count data, Mathematics
- Abstract
The paper discusses very general extensions to existing inflation models for discrete random variables, allowing an arbitrary set of points in the sample space to be either inflated or deflated relative to a baseline distribution. The term flation is introduced to cover either inflation or deflation of counts. Examples include one-inflated count models where the baseline distribution is zero-truncated and count models for data with a few unusual large values. The main result is that inference about the baseline distribution can be based solely on the truncated distribution which arises when the entire set of flation points is truncated. A major application of this result relates to estimating the size of a hidden target population, and examples are provided to illustrate our findings.
- Published
- 2020
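Result 32's "flation" idea starts from adding (or removing) probability mass at chosen points of a baseline count distribution. A minimal sketch of a one-point inflation mixture is shown below; the Poisson baseline, the inflated point, and the mixing weight are illustrative choices rather than the paper's models.

```python
# One-point inflated count model: mix a baseline Poisson pmf with extra mass at
# a chosen point (here x = 1, i.e. one-inflation with weight w).
import numpy as np
from scipy import stats

def inflated_pmf(x, lam, w, inflated_point=1):
    base = stats.poisson.pmf(x, lam)
    return (1.0 - w) * base + w * (x == inflated_point)

x = np.arange(0, 15)
pmf = inflated_pmf(x, lam=2.5, w=0.15)
print(pmf.round(4))
print(pmf.sum())    # ~1 once the support is wide enough to cover the Poisson tail
```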
33. Sequential importance sampling for multiresolution Kingman–Tajima coalescent counting
- Author
- Lorenzo Cappello and Julia A. Palacios
- Subjects
- sequential importance sampling, Coalescent theory, enumeration, Cardinality, Sample space, Statistical inference, State space, Importance sampling, Population, Set (abstract data type), Algorithm, Computer science, Statistics and Probability, Modeling and Simulation, Statistics, Probability and Uncertainty
- Abstract
Statistical inference of evolutionary parameters from molecular sequence data relies on coalescent models to account for the shared genealogical ancestry of the samples. However, inferential algorithms do not scale to available data sets. A strategy to improve computational efficiency is to rely on simpler coalescent and mutation models, resulting in smaller hidden state spaces. An estimate of the cardinality of the state space of genealogical trees at different resolutions is essential to decide the best modeling strategy for a given dataset. To our knowledge, there is neither an exact nor approximate method to determine these cardinalities. We propose a sequential importance sampling algorithm to estimate the cardinality of the sample space of genealogical trees under different coalescent resolutions. Our sampling scheme proceeds sequentially across the set of combinatorial constraints imposed by the data which, in this work, are completely linked sequences of DNA at a nonrecombining segment. We analyze the cardinality of different genealogical tree spaces on simulations to study the settings that favor coarser resolutions. We apply our method to estimate the cardinality of genealogical tree spaces from mtDNA data from the 1000 genomes and a sample from a Melanesian population at the $\beta $-globin locus.
- Published
- 2020
34. High-Efficiency Min-Entropy Estimation Based on Neural Network for Random Number Generators
- Author
-
Shuangyi Zhu, Yuan Ma, Jiwu Jing, Tianyu Chen, Jingqiang Lin, Na Lv, and Jing Yang
- Subjects
Science (General), Computer Networks and Communications, Computer science, Random number generation, Estimator, Min entropy, Sample (statistics), Entropy estimation, Sample space, Data mining, Time complexity, Randomness, Technology (General), Information Systems
Random number generator (RNG) is a fundamental and important cryptographic element, which has made an outstanding contribution to guaranteeing the network and communication security of cryptographic applications in the Internet age. In reality, if the random number used cannot provide sufficient randomness (unpredictability) as expected, these cryptographic applications are vulnerable to security threats and may cause system crashes. Min-entropy is one of the approaches usually employed to quantify unpredictability. The NIST Special Publication 800-90B adopts the concept of min-entropy in the design of its statistical entropy estimation methods, and the predictive model-based estimators added in the second draft of this standard effectively improve the overall capability of the test suite. However, these predictors have problems of limited application scope and high computational complexity, e.g., they have shortfalls in evaluating random numbers with long dependence and multivariate data, owing to the huge time complexity (i.e., high-order polynomial time complexity). Fortunately, there has been increasing attention to using neural networks to model and forecast time series, and random numbers are also a type of time series. In our work, we propose several new and efficient approaches for min-entropy estimation by using neural network technologies and design a novel execution strategy for the proposed entropy estimation to make it applicable to the validation of both stationary and non-stationary sources. Compared with the 90B's predictors officially published in 2018, the experimental results on various simulated and real-world data sources demonstrate that our predictors have better performance in accuracy, scope of applicability, and execution efficiency. The average execution efficiency of our predictors can be up to 10 times higher than that of the 90B's for a sample size of $10^6$ with different sample spaces. Furthermore, when the sample space is over $2^2$ and the sample size is over $10^8$, the 90B's predictors cannot give estimated results, whereas our predictors can still provide accurate results. Copyright © 2019 John Wiley & Sons, Ltd.
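A rough sketch of the predictor-to-min-entropy conversion, with a plain frequency predictor standing in for the paper's neural networks; the context order, the 99% confidence constant, and the binary alphabet are illustrative assumptions rather than the authors' design.

```python
import numpy as np
from collections import defaultdict, Counter

def predictor_min_entropy(samples, order=3, alphabet_size=2):
    """Sketch of predictor-based min-entropy estimation: predict each symbol from
    the previous `order` symbols with a frequency table, measure the global
    prediction accuracy, and convert its upper confidence bound p to a per-symbol
    min-entropy estimate H = -log2(p), in the spirit of NIST SP 800-90B."""
    counts = defaultdict(Counter)
    correct, total = 0, 0
    for i in range(order, len(samples)):
        ctx = tuple(samples[i - order:i])
        if counts[ctx]:
            guess = counts[ctx].most_common(1)[0][0]
            correct += (guess == samples[i])
            total += 1
        counts[ctx][samples[i]] += 1
    p_hat = correct / max(total, 1)
    # 99% upper confidence bound on the predictor's accuracy (assumed constant 2.576)
    p_up = min(1.0, p_hat + 2.576 * np.sqrt(p_hat * (1 - p_hat) / max(total, 1)))
    return -np.log2(max(p_up, 1.0 / alphabet_size))  # cannot exceed log2(alphabet size)

bits = np.random.default_rng(1).integers(0, 2, 100_000).tolist()
print(predictor_min_entropy(bits))   # close to 1 bit/sample for an ideal binary source
```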
- Published
- 2020
- Full Text
- View/download PDF
35. Is Compositional Data Analysis (CoDA) a theory able to discover complex dynamics in aqueous geochemical systems?
- Author
-
Sauro Graziano R., Gozzi C., and Buccianti A.
- Subjects
Mahalanobis distance, anthropic perturbations, Matrix (mathematics), Complex dynamics, Fractal, Geochemistry and Petrology, Intermittency, Sample space, Economic Geology, Statistical physics, Compositional data, Robust principal component analysis, Geology
The study of the structure of compositional changes characterizing a geochemical system in time or space could be the key to understanding its dynamics and resilience to environmental and anthropic perturbations. Variables characterizing the composition of an aqueous system (river or ground waters) constitute a multivariate system that moves as a whole owing to multiple interrelationships and feedback mechanisms. If the goal of the research is to understand how the whole moves, Compositional Data Analysis (CoDA) theory offers the correct statistical framework, since aqueous geochemical data pertain to the simplex sample space. Cases (rows) of a compositional matrix can be ranked by some criterion (e.g. increasing runoff or conductivity values), and the differences between subsequent rows calculated with the perturbation operator, which measures change in the simplex geometry. The resulting perturbation matrix can then be analyzed by robust Principal Component Analysis (robust PCA) to discover associations between variables subjected to proportional perturbations during geochemical processes. The robust Mahalanobis distance from the barycenter of this matrix also appears useful for revealing the presence of intermittency in time or space. Intermittency is closely related to the fractal nature of variability and is an emergent property of self-organized, complex dissipative systems, since it optimizes the dissipation of energy gradients and maximizes entropy production. Application examples for the surficial waters of the Alps region and the Arno river basin (Tuscany, Central Italy) show that the methodology is powerful and able to discover complex dynamics moving on the boundary between different sample spaces, from the Euclidean to the fractal one. The proposed approach shifts the perspective to the nature of the dynamic interactions and transitions affecting a geochemical landscape, revealing how variability moves and how the presence of intermittency and resilient behavior affects evolution and prediction.
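A small sketch of the workflow described above, using ordinary PCA in place of the robust variant and synthetic compositions in place of hydrochemical data (all sizes and names are assumptions):

```python
import numpy as np

def closure(x):
    """Rescale each row (composition) to sum to 1."""
    return x / x.sum(axis=1, keepdims=True)

def perturbation_diff(comp):
    """Perturbation difference between consecutive rows of a ranked compositional
    matrix: the closed component-wise ratio row_{i+1} / row_i."""
    return closure(comp[1:] / comp[:-1])

def clr(comp):
    """Centred log-ratio transform, mapping simplex compositions to real space."""
    logx = np.log(comp)
    return logx - logx.mean(axis=1, keepdims=True)

rng = np.random.default_rng(0)
comp = closure(rng.lognormal(size=(50, 4)))          # 50 synthetic samples, 4 parts
diffs = clr(perturbation_diff(comp))

# Ordinary PCA on the clr-transformed perturbation matrix (a stand-in for the
# robust PCA of the paper): eigendecomposition of the covariance matrix.
eigval, eigvec = np.linalg.eigh(np.cov(diffs, rowvar=False))
print("explained variance ratios:", eigval[::-1] / eigval.sum())
```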
- Published
- 2020
36. Cancer Characteristic Gene Selection via Sample Learning Based on Deep Sparse Filtering
- Author
-
Lin Zhang, Yuhu Cheng, Jian Liu, Z. Jane Wang, and Xuesong Wang
- Subjects
China, Computer science, Medicine, Sample (statistics), Machine Learning, Neoplasms, Biomarkers, Tumor, Humans, Gene Expression Profiling, Cancer, Pattern recognition, Oncogenes, Prognosis, Identification (information), Gene selection, Cancer genetics, Sample space, Artificial intelligence, Algorithms, Multidisciplinary
Identification of characteristic genes associated with specific biological processes of different cancers could provide insights into the underlying cancer genetics and cancer prognostic assessment. It is of critical importance to select such characteristic genes effectively. In this paper, a novel unsupervised characteristic gene selection method based on sample learning and sparse filtering, Sample Learning based on Deep Sparse Filtering (SLDSF), is proposed. With sample learning, the proposed SLDSF can better represent the gene expression level in the transformed sample space. Most unsupervised characteristic gene selection methods do not consider deep structures, yet a multilayer structure may learn more meaningful representations than a single layer; therefore, deep sparse filtering is investigated here to implement sample learning in the proposed SLDSF. Experimental studies on several microarray and RNA-Seq datasets demonstrate that the proposed SLDSF is more effective than several representative characteristic gene selection methods (e.g., RGNMF, GNMF, RPCA and PMD) for selecting cancer characteristic genes.
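A single-layer sparse filtering pass (in the spirit of Ngiam et al.) conveys the representation-learning step; the paper's SLDSF stacks such layers and applies them to sample learning, so the sketch below is only an illustration on made-up data.

```python
import torch

def sparse_filtering_features(X, n_features=16, epochs=300, lr=0.01, seed=0):
    """Single-layer sparse filtering: learn a weight matrix W so that the doubly
    normalised feature matrix is sparse, then return the transformed samples."""
    torch.manual_seed(seed)
    X = torch.as_tensor(X, dtype=torch.float32)
    W = torch.randn(X.shape[1], n_features, requires_grad=True)
    opt = torch.optim.Adam([W], lr=lr)
    for _ in range(epochs):
        F = torch.sqrt((X @ W) ** 2 + 1e-8)           # soft absolute value of features
        F = F / (F.norm(dim=0, keepdim=True) + 1e-8)  # normalise each feature over samples
        F = F / (F.norm(dim=1, keepdim=True) + 1e-8)  # normalise each sample over features
        loss = F.sum()                                # L1 sparsity objective
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (X @ W).detach()

# Toy "expression matrix": 100 samples x 50 genes (purely illustrative).
X = torch.randn(100, 50)
print(sparse_filtering_features(X).shape)   # torch.Size([100, 16])
```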
- Published
- 2018
37. DONALD: A 2.5 T wide sample space permanent magnet
- Author
-
Simon R. Hall, Jason Potticary, Michael P. Avery, and Doug Mills
- Subjects
Materials science, Biomedical Engineering, Mechanical engineering, Field strength, Industrial and Manufacturing Engineering, Sample environment, Permanent magnet, Horse-shoe magnet, Instrumentation, Civil and Structural Engineering, Electromagnet, Mechanical Engineering, Demagnetizing field, Cuvette, Volume (thermodynamics), Magnet, Sample space
The permanent magnet apparatus described herein is based upon the C-shaped permanent magnet. It is designed to maximise field strength while increasing the pole gap to 5 mm, providing a sample volume large enough for wide applicability. The production of this equipment aims to provide a homogeneous, high-field (∼2.5 T) magnetic sample environment with a volume large enough to accommodate solution crystallisation experiments in sample chambers such as NMR tubes and cuvettes, whilst simultaneously allowing direct observation of the sample from a wide angle. Although the resulting rig is not lightweight at 26.5 kg, it is eminently more portable than an equivalent electromagnet system (of the order of 625 kg), and provides a maximum field strength of 2.468 T with a relatively low stray field.
- Published
- 2018
38. Reversible Discriminant Analysis
- Author
-
Yuan-Hai Shao, Chun-Na Li, Lan Bai, and Zhen Wang
- Subjects
General Computer Science, linear discriminant analysis, supervised learning, Between-class scatter, Dimension (vector space), Scatter matrix, General Materials Science, Mathematics, dimensionality reduction, Supervised learning, General Engineering, Pattern recognition, Kernel (statistics), Principal component analysis, Sample space, Artificial intelligence
Principal component analysis (PCA) and linear discriminant analysis (LDA) are classical methods for dimensionality reduction in unsupervised and supervised learning, respectively. However, compared with PCA, LDA loses several advantages because of the singularity of its between-class scatter, which results in a singular mapping and restricts the reduced dimension. In this paper, we propose a dimensionality reduction method based on a full-rank between-class scatter, called reversible discriminant analysis (RDA). Based on the newly defined between-class scatter matrix, RDA obtains a nonsingular mapping. Thus, RDA can reduce the sample space to an arbitrary dimension, and the mapped sample can be recovered. RDA is also extended to kernel-based dimensionality reduction. In addition, PCA and LDA are special cases of RDA. Experiments on benchmark and real-world problems confirm the effectiveness of the proposed method.
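The rank deficiency mentioned above is easy to verify numerically; the sketch below computes the classical LDA scatter matrices (not the paper's redefined between-class scatter) on toy data.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices as used in LDA.
    With c classes, rank(Sb) <= c - 1, which is the singularity that limits the
    reduced dimension in classical LDA."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * diff @ diff.T
    return Sw, Sb

rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 30)                       # 3 classes, 30 samples each
X = rng.normal(size=(90, 5)) + y[:, None]             # class-dependent shift
Sw, Sb = scatter_matrices(X, y)
print("rank of Sb:", np.linalg.matrix_rank(Sb))       # at most c - 1 = 2
```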
- Published
- 2018
39. I Can't Make Heads Or Tails Out Of What You Are Saying, So Let's Just Agree To Be Fair.
- Author
-
Carter, Rickey E.
- Subjects
- *
STATISTICS education , *COINS , *PROBLEM-based learning , *SIMULATION methods & models , *ACTIVE learning , *PROBLEM solving - Abstract
Assuming a coin is fair is commonplace in introductory statistics education. This article offers three approaches to test whether a coin is fair. The approaches lend themselves to straightforward simulation studies that can enrich student understanding of joint probability and sample size requirements. Simulation studies comparing the relative merits of the three, or other potential, approaches are an example of problem-based learning. [ABSTRACT FROM AUTHOR]
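One generic approach of the kind the article invites, an exact binomial test wrapped in a small power simulation, might look like the following (the sample size and the unfair probability are arbitrary choices, and the article's own three approaches may differ).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Flip the coin n times and apply an exact two-sided binomial test of H0: p = 0.5.
n, p_true = 200, 0.5
heads = rng.binomial(n, p_true)
print("p-value for a fair coin:", stats.binomtest(heads, n, 0.5).pvalue)

# Small simulation study of power: how often is an unfair coin (p = 0.6) rejected?
rejections = sum(stats.binomtest(rng.binomial(n, 0.6), n, 0.5).pvalue < 0.05
                 for _ in range(1000))
print("empirical power against p = 0.6:", rejections / 1000)
```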
- Published
- 2013
- Full Text
- View/download PDF
40. Orientation uncertainty goes bananas: An algorithm to visualise the uncertainty sample space on stereonets for oriented objects measured in boreholes.
- Author
-
Stigsson, Martin and Munier, Raymond
- Subjects
- *
BOREHOLES , *UNCERTAINTY , *BANANAS , *ALGORITHMS , *OBJECT-oriented methods (Computer science) , *NUMERICAL analysis , *GEOMETRIC analysis - Abstract
Abstract: Measurements of structure orientations are afflicted with uncertainties which arise from many sources. Commonly, such uncertainties involve instrument imprecision, external disturbances and human factors. The aggregated uncertainty depends on the uncertainty of each of the sources. The orientation of an object measured in a borehole (e.g. a fracture) is calculated using four parameters: the bearing and inclination of the borehole and two relative angles of the measured object to the borehole. Each parameter may be a result of one or several measurements. The aim of this paper is to develop a method to both calculate and visualize the aggregated uncertainty resulting from the uncertainty in each of the four geometrical constituents. Numerical methods were used to develop a VBA-application in Microsoft Excel to calculate the aggregated uncertainty. The code calculates two different representations of the aggregated uncertainty: a 1-parameter uncertainty, the ‘minimum dihedral angle’, denoted by Ω; and a non-parametric visual representation of the uncertainty, denoted by χ. The simple 1-parameter uncertainty algorithm calculates the minimum dihedral angle accurately, but overestimates the probability space that plots as an ellipsoid on a lower hemisphere stereonet. The non-parametric representation plots the uncertainty probability space accurately, usually as a sector of an annulus for steeply inclined boreholes, but is difficult to express numerically. The 1-parameter uncertainty can be used for evaluating statistics of large datasets whilst the non-parametric representation is useful when scrutinizing single or a few objects. [Copyright © Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
41. Chronogenesis, Cosmogenesis and Collapse.
- Author
-
Pearle, Philip
- Subjects
- *
EIGENVALUES , *PROBABILITY theory , *SUPERPOSITION principle (Physics) , *MATHEMATICAL ability , *QUANTUM gravity ,UNIVERSE - Abstract
A simple quantum model describing the onset of time is presented. This is combined with a simple quantum model of the onset of space. A major purpose is to explore the interpretational issues which arise. The state vector is a superposition of states representing different 'instants.' The sample space and probability measure are discussed. Critical to the dynamics is state vector collapse: it is argued that a tenable interpretation is not possible without it. Collapse provides a mechanism whereby the universe size, like a clock, is narrowly correlated with the quantized time eigenvalues. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
42. A tale of two probabilities.
- Author
-
Falk, Ruma and Kendig, Keith
- Subjects
- *
CONDITIONAL probability , *BAYES' theorem , *TEACHING , *FAMILY size , *STATISTICS education , *SAMPLE size (Statistics) , *MATHEMATICAL models - Abstract
Two contestants debate the notorious probability problem of the sex of the second child. The conclusions boil down to explication of the underlying scenarios and assumptions. Basic principles of probability theory are highlighted. [ABSTRACT FROM AUTHOR]
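The dependence of the answer on the assumed scenario can be made concrete by enumerating the sample space; the sketch below contrasts two common readings of the problem and is a generic illustration rather than the contestants' exact argument.

```python
from fractions import Fraction
from itertools import product

# Two-child families with all four birth orders equally likely.
families = list(product("BG", repeat=2))

# Scenario 1: we learn that "at least one child is a girl".
at_least_one_girl = [f for f in families if "G" in f]
p1 = Fraction(sum(f == ("G", "G") for f in at_least_one_girl), len(at_least_one_girl))
print("P(both girls | at least one girl) =", p1)   # 1/3

# Scenario 2: a randomly met child turns out to be a girl,
# so each family is weighted by its number of girls.
weights = {f: f.count("G") for f in families}
p2 = Fraction(weights[("G", "G")], sum(weights.values()))
print("P(both girls | a randomly met child is a girl) =", p2)   # 1/2
```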
- Published
- 2013
- Full Text
- View/download PDF
43. No Train Paradox.
- Author
-
Laraudogoitia, Jon
- Subjects
RELATIVITY ,PARADOX ,LOGIC ,PHILOSOPHY ,THEORY of knowledge - Abstract
In 'The Train Paradox' (Philosophia (2006) 34: 437-438) Gwiazda proposes the use of the relativity of simultaneity to formulate a new paradox. My purpose here is to show that there is no Train Paradox in Gwiazda's sense. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
44. Pre-Service Teachers' Conceptions of Probability.
- Author
-
Odafe, Victor U.
- Subjects
- *
STUDENT teacher attitudes , *PROBABILITY theory , *SCIENCE education , *BASIC education , *DECISION making , *EDUCATION - Abstract
Probability knowledge and skills are needed in science and in making daily decisions that are sometimes made under uncertain conditions. Hence, there is the need to ensure that the pre-service teachers of our children are well prepared to teach probability. Pre-service teachers' conceptions of probability are identified, and ways of helping them (teachers) learn the subject more thoroughly are suggested. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
45. From personal to conventional probabilities: from sample set to sample space.
- Author
-
Chernoff, Egan and Zazkis, Rina
- Subjects
- *
MATHEMATICS education , *PROBABILITY learning , *TRAINING of mathematics teachers , *PEDAGOGICAL content knowledge , *EXAMPLE , *MATHEMATICAL models of learning , *MATHEMATICAL ability testing - Abstract
This article is a systematic reflection on a sequence of episodes related to teaching probability. Our central claim is that reducing problems to a consideration of the sample space, which consists of equiprobable outcomes, may not be in accord with learners' initial ways of reasoning. We suggest a 'desirable pedagogical approach' in which the solution builds on the set of outcomes as identified by learners and serves as a bridge towards mathematical convention. To explore prospective high school mathematics teachers' ideas related to addressing a potential learner's mistake and their reactions towards the suggested approach, we presented them with two tasks. In Task I, participants (n = 30) were asked to suggest a pedagogical remedy to a frequent mistake found in dealing with a standard probability problem, whereas in Task II, they were asked to solve a probabilistic problem, which they had not encountered previously. We discuss participants' mathematical solutions to Task II in reference to their pedagogical approaches to Task I. The presented disparity serves in extending the convincing power of the suggested pedagogical approach. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
46. How Many Alternatives? Partitions Pose Problems for Predictions and Diagnoses.
- Author
-
Smithson, Michael
- Subjects
- *
DECISION making , *PROBABILITY theory , *FORECASTING , *LEGAL judgments , *PROBABILITY learning - Abstract
This paper focuses on one matter that poses a problem for both human judges and standard probability frameworks, namely the assumption of a unique (privileged) and complete partition of the state-space of possible events. This is tantamount to assuming that we know all possible outcomes or alternatives in advance of making a decision, but it is clear that there are many practical situations in prediction, diagnosis, and decision-making where such partitions are contestable and/or incomplete. The paper begins by surveying the impact of partitions on the choice of priors in formal probabilistic updating frameworks, and on human subjective probability judgements. That material is followed by an overview of strategies for dealing with partition dependence, including considerations of how a rational agent's preferences may determine the choice of a partition. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
47. Selection Neglect in Mutual Fund Advertisements.
- Author
-
Koehler, Jonathan J.
- Subjects
DECISION making ,HEURISTIC ,MUTUAL funds ,STOCKS (Finance) ,BUSINESS forecasting ,CONSUMER confidence ,STATISTICS - Abstract
Mutual fund companies selectively advertise their better-performing funds. However, investors respond to advertised performance data as if those data were unselected (i.e., representative of the population). We identify the failure to discount selected or potentially selected data as selection neglect. We examine these phenomena in an archival study (Study 1) and two controlled experiments (Studies 2 and 3). Study 1 identifies selection bias in mutual fund advertising by showing that the median performance rank for advertised funds is between the 79th and 100th percentile. Study 2 finds that both novice investors and financial professionals fall victim to selection neglect in a financial advertising task unless the advertisement makes the selective nature of available performance data transparent. Study 3 shows that selection neglect associated with a large well-known company can be debiased with a simple extrinsic sample space cue, although individual differences in statistical reasoning also matter. We argue that selection neglect results from a general tendency to ignore underlying sample spaces rather than a fundamental misunderstanding about the data selection process or the value of selected data. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
48. Sample space partitions: An investigative lens
- Author
-
Chernoff, Egan J.
- Subjects
- *
PARTITIONS (Mathematics) , *PROBABILITY theory , *MATHEMATICAL sequences , *ITERATIVE methods (Mathematics) , *REASONING , *VERBAL responses , *TASKS , *SET theory , *COMPARATIVE studies - Abstract
Abstract: In this study, subjects are presented with sequences of heads and tails, derived from flipping a fair coin, and asked to consider their chances of occurrence. In this new iteration of the comparative likelihood task, the ratio of heads to tails in all of the sequences is maintained. In order to help situate participants’ responses within conventional probability, this article employs unconventional set descriptions of the sample space organized according to: switches, longest run, and switches and longest run, which are all based upon subjects’ verbal descriptions of the sample space. Results show that normatively incorrect responses to the task are not devoid of correct probabilistic reasoning. The notion of alternative set descriptions is further developed, and the article contends that sample space partitions can act as an investigative lens for research on the comparative likelihood task, and probability education in general. [Copyright © Elsevier]
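The partitioning idea is easy to reproduce: enumerate all sequences with a fixed heads-to-tails ratio and group them by switches and longest run (sequence length 6 is an arbitrary choice for illustration).

```python
from itertools import permutations

def longest_run(seq):
    """Length of the longest run of identical symbols."""
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

def switches(seq):
    """Number of adjacent positions where the symbol changes."""
    return sum(a != b for a, b in zip(seq, seq[1:]))

# All distinct sequences of 3 heads and 3 tails, partitioned by (switches, longest run).
seqs = sorted(set(permutations("HHHTTT")))
partition = {}
for s in seqs:
    partition.setdefault((switches(s), longest_run(s)), []).append("".join(s))

for key, members in sorted(partition.items()):
    print(key, len(members), members)
```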
- Published
- 2009
- Full Text
- View/download PDF
49. Proof-without-words: Markov's inequality.
- Author
-
HARTE, ROBIN E., HORGAN, JANE, and POWER, JAMES
- Subjects
- *
MATHEMATICAL equivalence , *EVIDENCE - Abstract
We offer a "proof without words" of the inequalities of Markov and Chebyshev. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
50. Adaptive Algorithms for Estimating Betweenness and k -path Centralities
- Author
-
Talel Abdessalem, Albert Bifet, and Mostafa Haghir Chehreghani
- Subjects
Computer science, Directed graph, Vertex (geometry), Betweenness centrality, Sample space, Centrality, Social network analysis, Algorithm
Betweenness centrality and k-path centrality are two important indices that are widely used to analyze social, technological and information networks. In the current paper, first, given a directed network $G$ and a vertex $r \in V(G)$, we present a novel adaptive algorithm for estimating the betweenness score of $r$. Our algorithm first computes two subsets of the vertex set of $G$, called $\mathcal{RF}(r)$ and $\mathcal{RT}(r)$. They define the sample spaces of the start-points and the end-points of the samples. Then, it adaptively samples from $\mathcal{RF}(r)$ and $\mathcal{RT}(r)$ and stops as soon as some condition is satisfied. The stopping condition depends on the samples met so far, $|\mathcal{RF}(r)|$ and $|\mathcal{RT}(r)|$. We show that compared to the well-known existing algorithms, our algorithm gives a better $(\lambda,\delta)$-approximation. Then, we propose a novel algorithm for estimating the k-path centrality of $r$. Our algorithm is based on computing two sets $\mathcal{RF}(r)$ and $\mathcal{D}(r)$. While $\mathcal{RF}(r)$ defines the sample space of the source vertices of the sampled paths, $\mathcal{D}(r)$ defines the sample space of the other vertices of the paths. We show that in order to give a $(\lambda,\delta)$-approximation of the k-path score of $r$, our algorithm requires considerably fewer samples. Moreover, it processes each sample faster and with less memory. Finally, we empirically evaluate our proposed algorithms and show their superior performance. Also, we show that they can be used to efficiently compute centrality scores of a set of vertices.
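For orientation, a plain fixed-sample-size estimator of a single vertex's betweenness score (the non-adaptive baseline such adaptive schemes improve upon; the graph, sample count, and seeds are arbitrary) can be sketched with networkx.

```python
import random
import networkx as nx

def estimate_betweenness(G, r, n_samples=2000, seed=0):
    """Fixed-size sampling estimator of the (unnormalised) betweenness score of r:
    sample source/target pairs uniformly, average the fraction of shortest s-t
    paths that pass through r, and rescale by the number of ordered pairs."""
    rng = random.Random(seed)
    nodes = [v for v in G if v != r]
    total = 0.0
    for _ in range(n_samples):
        s, t = rng.sample(nodes, 2)
        try:
            paths = list(nx.all_shortest_paths(G, s, t))
        except nx.NetworkXNoPath:
            continue
        total += sum(r in p for p in paths) / len(paths)
    n = G.number_of_nodes()
    return total / n_samples * (n - 1) * (n - 2)

G = nx.erdos_renyi_graph(200, 0.05, seed=1, directed=True)
r = 0
print("sampled estimate:", estimate_betweenness(G, r))
print("exact value:     ", nx.betweenness_centrality(G, normalized=False)[r])
```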
- Published
- 2019