Descriptor: "Zipf's law" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zipf's law"' showing total 3,456 results


101. Range-limited Heaps' law for functional DNA words in the human genome.

Author: Li, Wentian, Almirantis, Yannis, and Provata, Astero
Subjects: *HUMAN genome, *HUMAN DNA, *ZIPF'S law, *PROTEIN domains, *HUMAN chromosomes
Abstract: Heaps' or Herdan-Heaps' law is a linguistic law describing the relationship between the vocabulary/dictionary size (type) and word counts (token) to be a power-law function. Its existence in genomes with certain definition of DNA words is unclear partly because the dictionary size in genome could be much smaller than that in a human language. We define a DNA word as a coding region in a genome that codes for a protein domain. Using human chromosomes and chromosome arms as individual samples, we establish the existence of Heaps' law in the human genome within limited range. Our definition of words in a genomic or proteomic context is different from other definitions such as over-represented k-mers which are much shorter in length. Although an approximate power-law distribution of protein domain sizes due to gene duplication and the related Zipf's law is well known, their translation to the Heaps' law in DNA words is not automatic. Several other animal genomes are shown herein also to exhibit range-limited Heaps' law with our definition of DNA words, though with various exponents. When tokens were randomly sampled and sample sizes reached the maximum level, a deviation from the Heaps' law was observed, but a quadratic regression in log–log type-token plot fits the data perfectly. Investigation of type-token plot and its regression coefficients could provide an alternative narrative of reusage and redundancy of protein domains as well as creation of new protein domains from a linguistic perspective. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
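
Note on record 101: the type-token relationship described in the abstract can be illustrated with a minimal sketch. It assumes the usual power-law form of Heaps' law, V(n) = K * n^beta, and estimates K and beta by ordinary least squares on the log-log type-token curve. The function names and the symbols K and beta are illustrative choices, not taken from the paper, and the sketch does not reproduce the authors' range-limited or quadratic fits.

```python
import math


def type_token_curve(tokens):
    """Return [(n, V(n))], where V(n) is the number of distinct types
    seen among the first n tokens."""
    seen = set()
    curve = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n, len(seen)))
    return curve


def fit_heaps(points):
    """Least-squares fit of log V = log K + beta * log n.

    points: iterable of (n, V) pairs with n > 0 and V > 0.
    Returns (K, beta). Hypothetical helper for illustration only.
    """
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(v) for _, v in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


if __name__ == "__main__":
    # Toy token stream; a real analysis would use protein-domain "words".
    tokens = "a b a c b d a e c f".split()
    k, beta = fit_heaps(type_token_curve(tokens))
    print(f"K = {k:.3f}, beta = {beta:.3f}")
```

A straight-line fit of this kind only describes the range over which the log-log curve is linear; the abstract reports that a quadratic term is needed once sampling approaches the full data.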

102. Beyond Zipf's law: Exploring the discrete generalized beta distribution in open-source repositories.

Author: Nowak, Przemysław, Santolini, Marc, Singh, Chakresh, Siudem, Grzegorz, and Tupikina, Liubov
Subjects: *ZIPF'S law, *BETA distribution, *DATA libraries, *CORPORA, *SYSTEM dynamics
Abstract: Rank-size distributions, such as Zipf's Law, have been instrumental in providing insights into the emergence of hierarchies across diverse systems, from linguistic corpuses to urban structures. However, the application of Zipf's Law reveals limitations, particularly in its focus on distribution tails, sometimes overlooking a large proportion of the data which might play a pivotal role in system dynamics. Yet, fitting rank-size distributions other than a straight line on the log–log scale requires caution. In this study, we re-evaluate the utility of rank-size distributions by contrasting the traditional Zipf's Law with the Discrete Generalized Beta Distribution (DGBD). We show the need of cautious fitting techniques for rank distributions, including the use of binning to prevent overfitting to data tails. Through both analytical derivation and empirical validation on commit data of open-source repositories, we show that DGBD consistently improves over Zipf distribution for concave rank distributions of large datasets (N ≥ 100). This approach contributes to the advancement of methodologies for analyzing hierarchical systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
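
Note on record 102: the abstract does not spell out the DGBD formula. A minimal sketch, assuming the commonly cited form f(r) = A * (N + 1 - r)^b / r^a for ranks r = 1..N (with Zipf's law as the special case b = 0), fits the three parameters by ordinary least squares in log space. The names fit_dgbd and solve_linear are hypothetical, and this plain fit omits the binning the authors recommend to avoid overfitting the tails.

```python
import math


def solve_linear(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_dgbd(values):
    """Fit log f(r) = log A - a*log(r) + b*log(N + 1 - r).

    values: positive sizes sorted in descending order (rank 1 first).
    Returns (A, a, b). Assumed parameterization, for illustration only.
    """
    n = len(values)
    rows = [[1.0, -math.log(r), math.log(n + 1 - r)] for r in range(1, n + 1)]
    ys = [math.log(v) for v in values]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(rows, ys)) for i in range(3)]
    log_a, a, b = solve_linear(xtx, xty)
    return math.exp(log_a), a, b


if __name__ == "__main__":
    # Synthetic rank-size data generated from the assumed form.
    n = 50
    data = [100.0 * (n + 1 - r) ** 0.5 / r ** 1.2 for r in range(1, n + 1)]
    print(fit_dgbd(data))
```

A fitted b close to zero suggests that a plain Zipf fit is adequate; a clearly positive b captures the downward bend at high ranks that a straight line on the log-log scale misses.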

103. Assessing Regional Development Balance Based on Zipf’s Law: The Case of Chinese Urban Agglomerations

Author: Liang Kong, Qinglin Wu, Jie Deng, Leichao Bai, Zhongsheng Chen, Zhong Du, and Mingliang Luo
Subjects: urban agglomeration, unevenness, Zipf’s law, China, Geography (General), G1-922
Abstract: With the deepening of urbanization in China, the coordinated development of cities in different regions is an important part of the sustainable development of the country, and the reasonable quantification of the unbalanced development of cities in different regions is an important issue facing the society nowadays. Previous studies usually use population data to analyze the power-law distribution law to quantify the imbalance of urban development in different regions, but China’s population data span a large number of years and numerous division criteria, and the results obtained from different population data are widely disparate and have obvious limitations. The paper starts from a fractal perspective and utilizes OpenStreetMap (OSM) data to extract national road intersections from 2015 to 2022, calculates critical distance thresholds for eight years using urban expansion curves, generates urban agglomerations in China, and quantifies the imbalance of urban development in different regions by calculating the urban agglomeration power-law index. The results indicate that (1) the critical distance threshold of urban expansion curves exhibits a slight overall increase and stabilizes within the range of 120–130 m, (2) the number of urban agglomerations in China has been increasing significantly year by year, but the power-law index has been decreasing from 1.49 in 2015 to 1.36 in 2022, and (3) the number of urban agglomerations and the power-law index of the Beijing–Tianjin–Hebei, Yangtze River Delta, Pearl River Delta, and Chengdu–Chongqing regions are consistent with the national-scale trend, indicating that the scale distribution of urban agglomerations in China at this stage does not conform to Zipf’s law and that there is a certain Matthew effect among cities in different geographic areas, with large unevenness. The results of the study can provide new ideas for assessing the coordinated development of cities in different regions. It compensates for the instability of population and economic data in traditional studies.
Published: 2023
Full Text: View/download PDF

104. Empirical Laws of Natural Language Processing for Neural Language Generated Text

Author: Sumedha and Rohilla, Rajesh
Editors: Bhattacharya, Mahua, Kharb, Latika, and Chahal, Deepak
Series editorial board: Filipe, Joaquim, Ghosh, Ashish, Prates, Raquel Oliveira, and Zhou, Lizhu
Published: 2021
Full Text: View/download PDF

105. Meaningfulness and Unit of Zipf’s Law: Evidence from Danmu Comments

Author: Zhou, Yihan
Editors: Li, Sheng, Sun, Maosong, Liu, Yang, Wu, Hua, Kang, Liu, Che, Wanxiang, He, Shizhu, and Rao, Gaoqi
Series founding editors: Goos, Gerhard and Hartmanis, Juris
Series editorial board: Bertino, Elisa, Gao, Wen, Steffen, Bernhard, Woeginger, Gerhard, and Yung, Moti
Published: 2021
Full Text: View/download PDF

106. Analyzing Social Engineering Research through Co-authorship Networks Using Scopus Database during 1926-2020

Author: Leila Khalili and Nayana Darshani Wijayasundara
Subjects: bibliometric, co-authorship networks, centrality measures, social engineering, zipf’s law, Bibliography. Library science. Information resources, Communication. Mass media, P87-96
Abstract: Purpose: Hacking the human brain and manipulating human trust to obtain information and get monetary gains is called social engineering. This study aims to visualize and analyze the co-authorship networks in the Scopus citation database's social engineering research from 1926 to 2020. Method: The present quantitative study used the bibliometric method and social network analysis. The study collected data from the Scopus database. A total number of 1994 records were taken as the sample of the study. Researchers used descriptive and inferential statistics and social network analysis to obtain results; to do this, different software types were used in the study (SPSS, Microsoft Excel, Text Statistics Analyzer, ISI.exe, Pajek, and VOSviewer). Findings: The findings indicate the top three sources of publishing and the related subject areas. Furthermore, the top three core authors and countries were identified. Also, the authors with high centrality measures in the co-authorship networks were identified. A large majority of papers had only one author. The Collaborative Coefficient among researchers was 0.36. Based on the results of Spearman's test, there was a significant association between the number of documents, the number of citations, and the rate of total link strength of the countries. Likewise, there was a positive and high significant association between degree and closeness centralities. Conclusion: The researchers' frequently used keywords in this area were social engineering, phishing, and information security; in addition, the frequency of keywords was not compatible with Zipf’s Law. A small sample of keywords will not properly follow Zipf’s distribution.
Published: 2022
Full Text: View/download PDF

107. Compilation, Analysis and Application of a Comprehensive Bangla Corpus KUMono

Author: Aysha Akther, Md. Shymon Islam, Hafsa Sultana, A. K. Z. Rasel Rahman, Sujana Saha, Kazi Masudul Alam, and Rameswar Debnath
Subjects: NLP, Bangla corpus, N-gram, Zipf’s law, article categorization, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Research in Natural Language Processing (NLP) and computational linguistics highly depends on a good quality representative corpus of any specific language. Bangla is one of the most spoken languages in the world but Bangla NLP research is in its early stage of development due to the lack of quality public corpus. This article describes the detailed compilation methodology of a comprehensive monolingual Bangla corpus, KUMono (Khulna University Monolingual corpus). The newly developed corpus consists of more than 350 million word tokens and more than one million unique tokens from 18 major text categories of online Bangla websites. We have conducted several word-level and character-level linguistic phenomenon analyses based on empirical studies of the developed corpus. The corpus follows Zipf’s curve and hapax legomena rule. The quality of the corpus is also assessed by analyzing and comparing the inherent sparseness of the corpus with existing Bangla corpora, by analyzing the distribution of function words of the corpus and vocabulary growth rate. We have developed a Bangla article categorization application based on the KUMono corpus and received compelling results by comparing to the state-of-the-art models.
Published: 2022
Full Text: View/download PDF

108. The relationship between inorganic nutrients and diversity of dinoflagellate cysts: An evaluation from the perspective of species abundance distribution

Author: Junfeng Gao and Qiang Su
Subjects: dinoflagellate cysts, inorganic nutrients, coastal ecosystem, species abundance distribution, fractal model, Zipf’s law, Science, General. Including nature conservation, geographical distribution, QH1-199.5
Abstract: The relationships between the inorganic nutrients and diversity of dinoflagellate cysts (the N-Dc relationships) are one of the most central issues in coastal ecology. It is not only an important pathway to explore the ecological processes of plankton, but also a key element for assessing eutrophication in marine ecosystems. Although the N-Dc relationships have been studied for many years, they have remained controversial, which may be attributed to (1) using samples collected from a single source, (2) considering an insufficient range of nutrient concentrations, and (3) rarely taking into account species abundance distributions (SAD) that could better represent diversity. In this study, the N-Dc relationships are evaluated according to a compiled dataset, which covers a wide range of nutrient concentrations. Species diversity of cysts is estimated by four common diversity metrics and a new SAD parameter. Results show that all diversity metrics are negatively related to nutrients, which supports that low diversity of cysts could be considered as a signal of eutrophication. Additionally, this study finds a new pattern that SAD of cysts (Nr/N1, where Nr and N1 are the abundances of the r-th and the first species in descending order) with decreasing nutrients appears to gradually approach 1: 1/2: 1/3…. In the future, if this pattern can be verified by more investigations, understanding the negative N-Dc relationships is more likely to provide new direction for assessing and managing eutrophication in coastal ecosystem, and even for exploring the general mechanisms determining diversity.
Published: 2023
Full Text: View/download PDF

109. Scaling the living space: Zipf’s law for traditional courtyard houses in South China

Author: Yizhi Zhou and Yiming Li
Subjects: urban morphology, zipf’s law, pareto distribution, scaling of living space, self-organization, Engineering (General). Civil engineering (General), TA1-2040, City planning, HT165.5-169.9
Abstract: In the traditional feudalistic society of China, there is a characteristic residential pattern of several core families with common ancestors living together in one house, as far as possible. Hence, the habitation of large families and their social dynamic always have a complex function and hierarchical structure. In this article, we consider a courtyard in South China as an example to enable a discussion of the mathematical relationship among the five basic functional spaces in it. Based on Zipf’s law, we find that the distribution of the five types of spaces, from large to small, can be described by the Pareto distribution with a shape parameter close to −1. Moreover, the Zipf parameters of different houses in the same area conform to the double Pareto distribution. This suggests that the size and shape of a residence also follows well-defined scaling laws. Additionally, it indicates that houses, at least traditional Chinese houses, have strong self-organization and self-similarity. It also shows that the power law of the Pareto distribution is applicable not only to the macro scale of the city but also the micro scale of housing.
Published: 2022
Full Text: View/download PDF

110. A comprehensive analysis of the relationship between temperature and species diversity: The case of planktonic foraminifera

Author: Junfeng Gao and Qiang Su
Subjects: unimodal relationship, diversity index, species abundance distribution, fractal theory, Zipf’s law, Science, General. Including nature conservation, geographical distribution, QH1-199.5
Abstract: The relationship between temperature (T) and species diversity is one of the most fundamental issues in marine diversity. Although their relationships have been discussed for many years, how species diversity is related to T remains a controversial question. Previous studies have identified three T–diversity relationships: positive, negative, and unimodal. Recently, the unimodal relationship has received great attention. However, these studies may be biased by (1) considering the insufficient T range of database, (2) using a single diversity metric (generally species richness, S), and (3) rarely considering species abundance distribution (SAD) that can better represent diversity. Here, to seek a more comprehensive understanding of T–diversity relationships, their relationships are evaluated according to a global planktonic foraminifera dataset, which is usually considered as a model dataset for exploring diversity pattern. Species diversity are estimated by four most commonly used metrics and a new SAD parameter (p). Results show that S and Shannon’s index support the typical unimodal relationship with T. However, evenness and dominance do not have significant unimodality. Additionally, this study conjectures that the SAD parameter p with increasing T will gradually approach the minimum 1, noting that SAD (Nr/N1, where Nr and N1 are the abundance of the rth and the first species in descending order) tends to be 1:1/2:1/3…. This study suggests that the T–diversity relationship cannot be wholly reflected by S and the other aspects of diversity (especially SAD) should be considered.
Published: 2022
Full Text: View/download PDF

111. Continuous vs. Discrete Urban Ranks: Explaining the Evolution in the Italian Urban Hierarchy over Five Decades.

Author: Capello, Roberta, Caragliu, Andrea, and Gerritse, Michiel
Subjects: *ZIPF'S law, *URBANIZATION, *METROPOLIS
Abstract: The reasons for changes in ranking within urban systems are a matter of a wide and long debate. Some focus on a continuous and smooth ordering of cities by their size within the urban system, in the tradition of Zipf's law. Others focus on discrete, discontinuous ordering, as cities take on functions at different levels, such as specialized market places or high-level education, in the tradition of Christaller. We enter the debate by empirically evaluating whether the same determinants explain continuous or discrete changes in urban ranks in the evolution of the Italian urban hierarchy over the years 1971 to 2011. We empirically show that small, continuous changes of cities' ranks have different drivers than large, discontinuous leaps. The presence of high-level functions in a city predicts major leaps across discrete ranks. Results are robust to the use of an instrumental variable strategy based on a shift–share argument. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

112. A Zipfian Approach to Words in Contexts: The Cases of Modern English and Chinese.

Author: Cong, Jin
Subjects: *CHINESE language, *ZIPF'S law, *ENGLISH language, *LINGUISTIC complexity, *ANTHROPOLOGICAL linguistics, *CHINESE people, *VOCABULARY
Abstract: The system-level complexity of language has been thoroughly investigated in terms of Zipf's law, whose quantitative features have proved to reflect text/language typology. This study extends the scope of Zipf's law from the macroscopic scale of language to specific words in contexts, with the aim of examining its potential as an indicator of word typology. The focus is confined to the high-frequency words in English and Chinese as found in the FLOB and LCMC corpora. It has been found that the log–log rank-frequency distributions of contextual words of the words in question generally abide by the linear function y = ax+b. Moreover, it has been shown that an adjusted version of parameter a can help to distinguish the words in question's classes. The contextual information as reflected by this Zipf-based index might be more important to the emergence of word classes of Chinese, which has no real inflection as a word-class indicator. From a Zipfian approach, the findings have preliminarily approved Saussure's systems thinking regarding linguistic signs. Meanwhile, they may also contribute to such fields as usage-based linguistics. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

113. Estimating the Total Volume of Queries to a Search Engine.

Author: Lillo, Fabrizio and Ruggieri, Salvatore
Subjects: *ZIPF'S law, *DATA binning, *STATISTICAL errors, *LEAST squares, *STATISTICAL models, *SEARCH engines
Abstract: We study the problem of estimating the total number of searches (volume) of queries in a specific domain, which were submitted to a search engine in a given time period. Our statistical model assumes that the distribution of searches follows a Zipf’s law, and that the observed sample volumes are biased accordingly to three possible scenarios. These assumptions are consistent with empirical data, with keyword research practices, and with approximate algorithms used to take counts of query frequencies. A few estimators of the parameters of the distribution are devised and experimented, based on the nature of the empirical/simulated data. For continuous data, we recommend using nonlinear least square regression (NLS) on the top-volume queries, where the bound on the volume is obtained from the well-known Clauset, Shalizi and Newman (CSN) estimation of power-law parameters. For binned data, we propose using a Chi-square minimization approach restricted to the top-volume queries, where the bound is obtained by the binned version of the CSN method. Estimations are then derived for the total number of queries and for the total volume of the population, including statistical error bounds. We apply the methods on the domain of recipes and cooking queries searched in Italian in 2017. The observed volumes of sample queries are collected from Google Trends (continuous data) and SearchVolume (binned data). The estimated total number of queries and total volume are computed for the two cases, and the results are compared and discussed. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

114. The effects of type and token frequency on word length: a cross-linguistic study.

Author: Berg, Thomas, Zörnig, Peter, and Lehr, Charlotte
Subjects: ZIPF'S law, WORD frequency, DISTRIBUTION (Probability theory)
Abstract: Inspired by Zipf's Law of Abbreviation, previous research was mostly directed at the interaction of word length and token frequency. Much less is known about the relationship of word length and type frequency, let alone about the differential impact of type and token frequency on word length. These issues are examined on the basis of a non-representative sample of 10 languages. The token frequency analysis reveals that 8 of the 10 languages show a monotonic decrease in frequency with increasing length while 2 languages reveal a unimodal distribution. By contrast, all 10 languages exhibit a rise followed by a monotonic drop of the frequency curve in the type frequency analysis. There appears to be a notable effect of type frequency on the nature of the token frequency distribution: the greater the average length of the words in the lexicon, the higher the probability of a unimodal distribution. Two principles are required to account for these results—a general dispreference for using long words and a language-particular dispreference for short words in the lexicon. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

115. Quantitative methods to determine the student workload. I. Empirical study based on digital platforms.

Author: Velazquez, L., Atenas, B., and Castro-Palacio, J. C.
Subjects: *ACADEMIC workload of students, *DIGITAL technology, *ZIPF'S law, *QUANTITATIVE research, *EMPIRICAL research, *DIGITAL learning
Abstract: We present a quantitative study of an online course developed during COVID19 sanitary emergency in Chile. We reconstruct the teaching–learning process considering the activity logs on digital platforms in order to answer the question of How do our students study? The results from the analysis evidence the complex adaptive character of the academic environment, which exhibits regularities similar to those found in financial markets (e.g., distributions of the daily time devoted to learning activities follow patterns like Pareto's or Zipf's law). Our empirical results illustrate (i) the relevance of economic notions in the understanding of the teaching–learning processes and (ii) the reliability of quantitative methods based on digital platforms to conduct experimental studies in this framework. We introduce in the present work a series of indicators to characterize the performance of professors, students' follow-up of the course, and their learning progress by crossing information with the results of assessments. In this context, the learning rate appears as a key statistical descriptor for the allocation of the student workload. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

116. The Impacts of COVID-19 on the Rank-Size Distribution of Regional Tourism Central Places: A Case of Guangdong-Hong Kong-Macao Greater Bay Area.

Author: Xu, Xiaohui
Abstract: It is well known that Zipf's rank-size law is powerful to investigate the rank-size distribution of tourist flow. Recently, widespread attention has been drawn to investigating the impacts of COVID-19 on tourism for its sustainability. However, little is known about the impacts of COVID-19 on the rank-size distribution of regional tourism central places. Taking Guangdong-Hong Kong-Macao Greater Bay Area as a research case, this article aims to examine the fractal characteristics of the rank-size distribution of regional tourism central places, revealing the impacts which COVID-19 has on the rank-size distribution of regional tourism central places. Based on the census data over the years from 2008 to 2021, this paper reveals that before COVID-19, the rank-size distribution of the tourism central places in Guangdong-Hong Kong-Macao Greater Bay Area appears monofractal, and the difference in the size of the tourism central places has a tendency to gradually decrease; in 2020, with the outbreak of COVID-19, the characteristic of the rank-size distribution shows that the original monofractal is broken into multifractal; in 2021, with COVID-19 becoming under control, the structure of tourism size distribution, changes into bifractal based on the original multifractal, showing that the rank-size distribution of tourism central places in Guangdong-Hong Kong-Macao Greater Bay Area becomes more ideal and the tourism order becomes better than the last year. The results obtained not only fill in the gap about the impacts of COVID-19 on tourism size distribution, but also contribute to the application of fractal theory to tourism size distribution. In addition, we propose some suggestions to the local governments and tourism authorities which have practical significance to tourism planning. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

117. Performance Study and Optimization of 3D-MANET: A New Analytical Perspective Based on Zipf's Law.

Author: Cheng, Zairan and Liu, Ying
Subjects: ZIPF'S law, AD hoc computer networks, STOCHASTIC processes, PROBABILITY theory, NETWORK performance, QUEUING theory, PERFORMANCE theory
Abstract: This paper studies the throughput capacity and delay scaling laws in a three-dimensional mobile ad hoc network (3D-MANET) under different routing schemes. Previous work generally assumed that nodes follow a uniform distribution or a power-law distribution to move in the network. From the perspective of the entire network, it is difficult for this network model to reflect communication entities' distribution in real 3D space. Moreover, the research results of analyzing network performance using different routing schemes are limited, and the research work is insufficient. Unlike previous related studies, we propose a cell-gridded network model that considers the actual environment with cells of the node aggregation degree, which follows Zipf's law with exponent γ. Our model can cover a variety of distribution scenarios with changes in the γ value. The packet delivery rate, network capacity, and delay performance of 3D-MANET adopting the traditional two-hop nonredundant and redundant relay routing scheme are examined utilizing theoretical tools such as probability theory, random process, and queuing theory. We propose a wireless access point- (WAP-) enabled multihop relay routing scheme. By deploying WAP in cells with a high γ, nodes can access WAP and broadcast packets, which accelerates the delivery of packets, and the results obtained by applying this scheme indicate that compared with the two-hop relay scheme, the WAP multihop relay effectively improves the delay performance and the transmission efficiency with less loss of capacity performance. Additionally, a better delay-capacity trade-off performance is achieved. Finally, we discuss the influence of parameters such as the number of network nodes n, the number of network cells m, the redundancy r, and γ on the capacity and delay. The analysis results confirm that exploiting the users' distribution status information and dividing the cells reasonably will save deployment costs and further improve network performance. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

118. Applicability of Zipf's Law in Traditional Chinese Medicine Prescriptions.

Author: Li, Yuanbai, Du, Yu, Liu, Fangzhou, Zhang, Yiying, Li, Meng, Wang, Jing, Li, Yihao, and Yang, Yang
Subjects: *ZIPF'S law, *CHINESE medicine, *DISTRIBUTION (Probability theory), *MEDICAL prescriptions
Abstract: Traditional Chinese medicine (TCM) prescriptions have been used to cure diseases in China for thousands of years, in which many TCM herbs have no definite common quantity. Some key TCM herbs are commonly used and thus deserve in-depth investigations based on a more acceptable classification method. This study analyzes whether TCM prescriptions follow Zipf's law and attempts to obtain the thresholds of key TCM herbs based on the application of Zipf's law. A total of 84,418 TCM prescriptions were collected and standardized. We tested whether Zipf's law and Zipf's distribution fit the Chinese herb distributions. A linear fitting experiment was performed to verify the relationship between the frequency distribution and frequency of TCM herbs. The distribution of TCM herbs in TCM prescriptions conformed to Zipf's law. Accordingly, the thresholds were obtained for the key TCM herbs. The distribution of TCM herbs in TCM prescriptions follows Zipf's law. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

119. PRIVATE LABELS FROM 2010 TO 2018: A BIBLIOMETRIC STUDY.

Author: André Braga, Guilherme, Freitas, Vérica, and Freitas de Paula, Verônica Angélica
Abstract: Copyright of Revista de Administração da UNIMEP is the property of Revista de Administração da UNIMEP. This abstract may be abridged; users should refer to the original published version of the material for the full abstract.
Published: 2022

120. Intra WBAN routing using Zipf's law and intelligent transmission power switching approach (ZITA).

Author: Roy, Moumita, Chowdhury, Chandreyee, Ahmed, Ghufran, Aslam, Nauman, Chattopadhyay, Samiran, and Islam, Saif Ul
Abstract: Wireless body area networks (WBANs) are becoming a popular and convenient mechanism for IoT-based health monitoring applications. Maintaining the energy efficiency of the nodes in WBANs without degrading network performance is one of the crucial factors for the success of this paradigm. Obtaining routes for data packets should be a dynamic decision depending on network conditions. Consequently, in this paper, a novel cost-based routing protocol ZITA has been proposed that addresses primary issues of WBAN routing, such as timeliness, link quality, temperature control, and energy efficiency while finding the next hop for data packets. Zipf's law is applied for relay selection to ensure the distribution of forwarding load among the potential relays. ZITA controls the transmission power level adaptively in order to cope with the time-varying channel conditions following multi-hop architecture. The protocol is simulated and the results show that the protocol gives better performance in terms of data received by the sink, heat dissipation of the wearable as well as implantable sensor nodes, and load sharing among relay nodes. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

121. Offsetting love and hate: The prosodic effects of the non-standard 1sg in tweets to Boris Johnson and Jeremy Corbyn over four days of the UK general election.

Author: Burnett, Sophia
Subjects: WRITTEN communication, ELECTIONS, ORAL communication, NATIVE language, REFERENDUM, LANGUAGE & languages, ZIPF'S law
Published: 2022
Full Text: View/download PDF

122. Situated incidental vocabulary acquisition: The effects of in-class and out-of-class novel reading.

Author: Reynolds, Barry Lee
Subjects: READING comprehension, SECOND language acquisition, VOCABULARY, ZIPF'S law, LANGUAGE teachers
Abstract: Although there are researchers that claim all vocabulary can be learned through incidental acquisition ([31], [33]), the majority of vocabulary researchers have concluded that learners who have acquired an advanced level of second language vocabulary knowledge have done so through a combination of both intentional and incidental means ([45]). L2 vocabulary size was able to account for the most variance in the incidental vocabulary acquisition outcomes, indicating the higher the L2 vocabulary size the more words that could be acquired incidentally through reading. [65] categorized vocabulary learning strategies as those either used to infer meaning of newly encountered words or to consolidate meaning of already known words. Keywords: incidental vocabulary acquisition; in-class reading; out-of-class reading; L2 vocabulary; experimental context. Acquisition of vocabulary knowledge is the prerequisite for all second language communication and comprehension because without vocabulary knowledge meaning cannot be conveyed or understood ([5]). [Extracted from the article]
Published: 2022
Full Text: View/download PDF

123. Characterizing the livingness of geographic space across scales using global nighttime light data

Author: Ren, Zheng, Jiang, Bin, de Rijke, Chris, and Seipel, Stefan
Abstract: The hierarchical structure of geographic or urban space can be well-characterized by the concept of living structure, a term coined by Christopher Alexander. All spaces, regardless of their size, possess certain degrees of livingness that can be mathematically quantified. While previous studies have successfully quantified the livingness of small spaces such as images or artworks, the livingness of geographic space has not yet been characterized in a recursive manner. Zipf’s law has been observed in urban systems and intra-urban structures. However, whether Zipf’s law is applicable to the hierarchical substructures of geographic space has rarely been investigated. In this study, we recursively extract the substructures of geographic space using global nighttime light imagery. We quantify the livingness of global cities considering the number of substructures (S) and their inherent hierarchy (H). We further investigate the scaling properties of the extracted substructures across scales and the relationships between livingness and population for global cities. The results demonstrate that all substructures of global cities form a living structure that conforms to Zipf’s law. The degree of livingness better captures population distribution than nighttime light intensity values for the global cities. This study contributes in three aspects: First, it considers global cities as a whole to quantify spatial livingness. Second, it applies the concept of livingness to cities to better capture the spatial structure of the population using nighttime light data. Third, it introduces a novel method to recursively extract substructures from nighttime images, offering a valuable tool to investigate urban structures across multiple spatial scales.
Published: 2024
Full Text: View/download PDF

124. Firms Growth, Distribution, and Non-Self-Averaging Revisited

Author: Fujiwara, Yoshi, Fujimoto, Takahiro, Editor-in-Chief, Aruka, Yuji, Editor-in-Chief, Aoyama, Hideaki, editor, and Yoshikawa, Hiroshi, editor
Published: 2020
Full Text: View/download PDF

125. City Size Distribution in Colombia and Its Regions, 1835–2005

Author: Pérez-Valbuena, Gerson Javier, Meisel-Roca, Adolfo, Higano, Yoshiro, Editor-in-Chief, Poot, Jacques, editor, and Roskruge, Matthew, editor
Published: 2020
Full Text: View/download PDF

126. Assessing the Balance of the Urban Settlement System in the European North of Russia

Author: Irina A. Sekushina
Subjects: zipf's law, rank–size rule, town, european north of russia, settlement system, Regional economics. Space in economics, HT388
Abstract: Introduction. In modern economics, one of the most common and simplest methods of analyzing the balance of urban settlement systems is to assess their compliance with Zipf's law or the rank–size rule. The basis of this pattern is the relationship between urban population and its place in the hierarchy of towns ranked in descending order of size. Based on the results of the study conducted, the article assesses the balance of the urban settlement system of the European North of Russia, as one of its regions, by analyzing its compliance with Zipf’s law. Materials and Methods. The official data from the Federal State Statistics Service on the population of towns in the European North of Russia for 1959, 1989 and 2019 were used as materials of the study. The method of constructing a linear regression between the logarithm of the actual population and the logarithm of the rank of the town was used to verify Zipf's law for the urban network of the region in a certain period. In order to substantiate the conclusions drawn, an analysis of the dynamics of the number of towns and the share of the population living in them was carried out. The monographic method, as well as the methods of tabular and graphical data visualization, was used to interpret the results of the calculations. Results. Based on the analysis of data on the application of the rank–size rule for the towns in the European North of Russia, it has been found that Zipf’s law was not fully observed in any time period, which indicates the imbalance of the existing urban settlement system. In the period from 1959 to 2019, there was an increase in the concentration of the population in the major cities of the region. The imbalance is also caused by the growing number of small towns with a population that does not correspond to the optimal value according to Zipf's law. Discussion and Conclusion. 
Based on the calculations, the author has come to the conclusion that the cities of Arkhangelsk and Cherepovets have the potential for growth, as well as some others with a population of up to 100 thousand people. The practical significance of the study lies in the possibility of using the results obtained to prognosticate the population of towns in the European North of Russia when planning the location of production facilities, as well as transport and social infrastructure in the region.
Published: 2021
Full Text: View/download PDF

127. Investigating Metropolitan Hierarchies through a Spatially Explicit (Local) Approach

Author: Rosanna Salvia, Giovanni Quaranta, Kostas Rontos, Pavel Cudlin, and Luca Salvati
Subjects: population dynamics, spatial divides, Zipf’s law, indicators, Mediterranean Europe, Geography (General), G1-922
Abstract: Assuming a non-neutral impact of space, an explicit assessment of metropolitan hierarchies based on local regression models produces a refined description of population settlement patterns and processes over time. We used Geographically Weighted Regressions (GWR) to provide an enriched interpretation of the density gradient in Greece, estimating a spatially explicit rank–size relationship inspired by Zipf’s law. The empirical results of the GWR models quantified the adherence of real data (municipal population density as a predictor of metropolitan hierarchy) to the operational assumptions of the rank–size relationship. Local deviations from its prediction were explained considering the peculiarity of the metropolitan cycle (1961–2011) in the country. Although preliminary and exploratory, these findings decomposed representative population dynamics in two stages of the cycle (namely urbanization, 1961–1991, and suburbanization, 1991–2011). Being in line with earlier studies, this timing allowed a geographical interpretation of the evolution of a particularly complex metropolitan system with intense (urban) primacy and a weak level of rural development over a sufficiently long time interval. Introducing a spatially explicit estimation of the rank–size relationship at detailed territorial resolutions provided an original contribution to regional science, covering broad geographical scales.
Published: 2023
Full Text: View/download PDF

128. Heavy-Tailed Probability Distributions: Some Examples of Their Appearance

Author: Lev B. Klebanov, Yulia V. Kuvaeva-Gudoshnikova, and Svetlozar T. Rachev
Subjects: heavy-tailed distributions, Pareto’s law, Lotka’s law, Zipf’s law, probability-generating function, Mathematics, QA1-939
Abstract: We provide two examples of the appearance of heavy-tailed distributions in social sciences applications. Among these distributions are the laws of Pareto and Lotka and some new ones. The examples are illustrated through the construction of suitable toy models.
Published: 2023
Full Text: View/download PDF

129. The Silicon Valley Bank Failure: Application of Benford’s Law to Spot Abnormalities and Risks

Author: Anurag Dutta, Liton Chandra Voumik, Lakshmanan Kumarasankaralingam, Abidur Rahaman, and Grzegorz Zimon
Subjects: Financial Risk, Benford’s law, Zipf’s law, Silicon Valley Bank, Data Validation, Insurance, HG8011-9999
Abstract: Data are produced every single instant in the modern era of technological breakthroughs we live in today and is correctly termed as the lifeblood of today’s world; whether it is Google or Meta, everyone depends on data to survive. But, with the immense surge in technological boom comes several backlashes that tend to pull it down; one similar instance is the data morphing or modification of the data unethically. In many jurisdictions, the phenomenon of data morphing is considered a severe offense, subject to lifelong imprisonment. There are several cases where data are altered to encrypt reliable details. Recently, in March 2023, Silicon Valley Bank collapsed following unrest prompted by increasing rates. Silicon Valley Bank ran out of money as entrepreneurial investors pulled investments to maintain their businesses afloat in a frigid backdrop for IPOs and individual financing. The bank’s collapse was the biggest since the financial meltdown of 2008 and the second-largest commercial catastrophe in American history. By confirming the “Silicon Valley Bank” stock price data, we will delve further into the actual condition of whether there has been any data morphing in the data put forward by the Silicon Valley Bank. To accomplish the very same, we applied a very well-known statistical paradigm, Benford’s Law and have cross-validated the results using comparable statistics, like Zipf’s Law, to corroborate the findings. Benford’s Law has several temporal proximities, known as conformal ranges, which provide a closer examination of the extent of data morphing that has occurred in the data presented by the various organizations. In this research for validating the stock price data, we have considered the opening, closing, and highest prices of stocks for a time frame of 36 years, between 1987 and 2023. 
Though it is worth mentioning that the data used for this research are coarse-grained, still since the validation is subjected to a larger time horizon of 36 years; Benford’s Law and the similar statistics used in this article can point out any irregularities, which can result in some insight into the situation and into whether there has been any data morphing in the Stock Price data presented by SVB or not. This research has clearly shown that the stock price variations of the SVB diverge much from the permissible ranges, which can give a conclusive direction on further investigations in this issue by the responsible authorities. In addition, readers of this article must note that the conclusion formed about the topic discussed in this article is objective and entirely based on statistical analysis and factual figures presented by the Silicon Valley Bank Group.
Published: 2023
Full Text: View/download PDF

130. Word frequency effects found in free recall are rather due to Bayesian surprise.

Author: Musca, Serban C. and Chemero, Anthony
Subjects: RECOLLECTION (Psychology), WORD frequency, ZIPF'S law, ENVIRONMENTAL psychology, CODING theory, DISTRIBUTION (Probability theory)
Abstract: The inconsistent relation between word frequency and free recall performance (sometimes a positive one, sometimes a negative one, and sometimes no relation) and the non-monotonic relation found between the two cannot all be explained by current theories. We propose a theoretical framework that can explain all extant results. Based on an ecological psychology analysis of the free recall situation in terms of environmental and informational resources available to the participants, we propose that because participants' cognitive system has been shaped by their native language, free recall performance is best understood as the end result of relational properties that preexist the experimental situation and of the way the words from the experimental list interact with those. In addition to this, we borrow from predictive coding theory the idea that the brain constantly predicts "what is coming next" so that it is mainly prediction errors that will propagate information forward. Our ecological psychology analysis indicates there will be "prediction errors" because the word frequency distribution in an experimental word list is inevitably different from the particular Zipf's law distribution of the words in the language that shaped participants' brains. We further propose the particular distributional discrepancies inherent to a given word list will trigger, as a function of the words that are included in the list, their order, and of the words that are absent from the list, a surprisal signal in the brain, something that is isomorphic to the concept of Bayesian surprise. The precise moment when Bayesian surprise is triggered will determine to what word of the list that Bayesian surprise will be associated with, and the word the Bayesian surprise will be associated with will benefit from it and become more memorable as a direct function of the magnitude of the surprisal. 
Two experiments are presented that show a proxy of Bayesian surprise explains the free recall performance and that no effect of word frequency is found above and beyond the effect of that proxy variable. We then discuss how our view can account for all data extant in the literature on the effect of word frequency on free recall. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

131. Study on the Remote Sensing Spectral Method for Disaster Loss Inversion in Urban Flood Areas.

Author: Duan, Chenfei, Zheng, Xiazhong, Jin, Lianghai, Chen, Yun, Li, Rong, and Yang, Yingliu
Subjects: FLOOD warning systems, REMOTE sensing, CITIES & towns, ZIPF'S law, FLOOD damage, RAINFALL
Abstract: To address the problems of traditional hydrological and hydraulic methods of estimating disasters in urban flood areas, such as small scale, poor timeliness, and difficulty of obtaining data, an inversion method of estimating urban flood disaster area based on remote sensing spectroscopy is proposed. In this paper, the spatial distribution of urban flood disasters is first inverted based on large-scale multidimensional remote sensing spectral orthography. Then, spatial coupling inversion of the remote sensing spectrum-urban economy-flood disaster is performed by simulating the urban economic density through single spectral remote sensing at night. Finally, losses at the urban flood area are estimated. The results show that (1) the heavy rain in Henan Province on 20 July is centered in Zhengzhou, and the spatial distribution of urban flood disasters accords with Zipf's law; (2) the estimated damage to the urban flood area in Henan Province is 132,256 billion yuan, and Zhengzhou has the most serious losses at 43,147 billion yuan, accounting for 32.6% of the entire province's losses. These results are consistent with the official data (accuracy ≥ 90%, R2 ≥ 0.95). This study can provide a new approach for accurately and efficiently estimating urban flood damage at a large scale. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

132. Statistical tests for text homogeneity: using forward and backward processes of numbers of different words.

Author: Abebe, Berhane, Chebunin, Mikhail, Kovalevskii, Artyom, and Zakrevskaya, Natalia
Subjects: ZIPF'S law, GAUSSIAN processes, SEARCH engines, ALGORITHMS, GAUSSIAN distribution
Abstract: The processes of growth in the number of diverse words in a text, when reading in the forward and backward directions, are studied in this article. Based upon the statistics achieved from the difference between these two processes, we construct a statistical test. This statistical test is used for text homogeneity checks. The elementary model states that words in a text are selected from some dictionary independent of each other according to the Zipf-Mandelbrot law. P-values of the statistical test are calculated based on the elementary probabilistic model using the asymptotic normality of corresponding statistics. Last but not least, this statistical test is applied for the analysis of homogeneity of sequences of sonnets. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

133. A model for simulating emergent patterns of cities and roads on real-world landscapes.

Author: Aoki, Takaaki, Fujiwara, Naoya, Fricker, Mark, and Nakagaki, Toshiyuki
Subjects: *ZIPF'S law, *CULTURAL landscapes, *NATURAL landscaping, *TOPOGRAPHIC maps, *LANDSCAPES, *HUMAN mechanics
Abstract: Emergence of cities and road networks have characterised human activity and movement over millennia. However, this anthropogenic infrastructure does not develop in isolation, but is deeply embedded in the natural landscape, which strongly influences the resultant spatial patterns. Nevertheless, the precise impact that landscape has on the location, size and connectivity of cities is a long-standing, unresolved problem. To address this issue, we incorporate high-resolution topographic maps into a Turing-like pattern forming system, in which local reinforcement rules result in co-evolving centres of population and transport networks. Using Italy as a case study, we show that the model constrained solely by topography results in an emergent spatial pattern that is consistent with Zipf's Law and comparable to the census data. Thus, we infer the natural landscape may play a dominant role in establishing the baseline macro-scale population pattern, that is then modified by higher-level historical, socio-economic or cultural factors. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

134. Quantifying relevance in learning and inference.

Author: Marsili, Matteo and Roudi, Yasser
Subjects: *ZIPF'S law, *ARTIFICIAL intelligence, *CONCEPT learning, *PROBABILISTIC generative models, *DISTRIBUTION (Probability theory), *LOSSY data compression, *MACHINE learning
Abstract: Learning is a distinctive feature of intelligent behaviour. High-throughput experimental data and Big Data promise to open new windows on complex systems such as cells, the brain or our societies. Yet, the puzzling success of Artificial Intelligence and Machine Learning shows that we still have a poor conceptual understanding of learning. These applications push statistical inference into uncharted territories where data is high-dimensional and scarce, and prior information on "true" models is scant if not totally absent. Here we review recent progress on understanding learning, based on the notion of "relevance". The relevance, as we define it here, quantifies the amount of information that a dataset or the internal representation of a learning machine contains on the generative model of the data. This allows us to define maximally informative samples, on one hand, and optimal learning machines on the other. These are ideal limits of samples and of machines, that contain the maximal amount of information about the unknown generative process, at a given resolution (or level of compression). Both ideal limits exhibit critical features in the statistical sense: Maximally informative samples are characterised by a power-law frequency distribution (statistical criticality) and optimal learning machines by an anomalously large susceptibility. The trade-off between resolution (i.e. compression) and relevance distinguishes the regime of noisy representations from that of lossy compression. These are separated by a special point characterised by Zipf's law statistics. This identifies samples obeying Zipf's law as the most compressed loss-less representations that are optimal in the sense of maximal relevance. Criticality in optimal learning machines manifests in an exponential degeneracy of energy levels, that leads to unusual thermodynamic properties. 
This distinctive feature is consistent with the invariance of the classification under coarse graining of the output, which is a desirable property of learning machines. This theoretical framework is corroborated by empirical analysis showing (i) how the concept of relevance can be useful to identify relevant variables in high-dimensional inference and (ii) that widely used machine learning architectures approach reasonably well the ideal limit of optimal learning machines, within the limits of the data with which they are trained. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

135. Introduction to the special section on the interaction between formal and computational linguistics.

Author: Bernard, Timothée and Winterstein, Grégoire
Subjects: COMPUTATIONAL linguistics, NATURAL language processing, LANGUAGE models, UNIVERSAL language, ZIPF'S law
Abstract: The article presents the discussion on interpretable models based on symbolic methods being still relevant and widely used in the natural language processing industry. Topics include witnessing a growing interest within formal linguistics in both explaining the remarkable successes of neural-based language models and uncovering their limitations; and computational approaches to language would come back to symbolic approaches.
Published: 2022
Full Text: View/download PDF

136. Analysis of Spatial Structure in the Kashgar Metropolitan Area, China.

Author: Li, Jiangang, Li, Songhong, Lei, Jun, Zhang, Xiaolei, Qi, Jianwei, Tohti, Buayxam, and Duan, Zuliang
Subjects: ZIPF'S law, SMALL cities, CITIES & towns, GRAVITY model (Social sciences), ARID regions, METROPOLITAN areas
Abstract: Taking metropolitan areas as space carriers has become the engine of the Chinese government in its promotion of high-quality development, and this has also become an important measure by which to balance regional development. We used Zipf's law and the gravity model to study the urban scale distribution characteristics of the Kashgar Metropolitan Area (KMA) in this paper. We also constructed a spatial structure judgment vector for the KMA and put forward the development objectives of different circles. The findings show the following: (1) large cities have a high primacy of development, while small and medium-sized cities are underdeveloped. At present, the KMA is a concentrated monocentric-pattern metropolitan area, with Kashgar City as its core city. (2) The urban built-up area of Kashgar City is expanding to the east and south, where it has broken through the administrative boundary and become integrated with the urban built-up area of Shule County. The spatial structure characteristics of the KMA have been further clarified. The KMA forms three circles: core, middle, and outer. (3) Tumxuk City, Bachu County, Yecheng County, Shache County, and other counties are far from the core city and cannot be connected with Kashgar, but they are closely related to the surrounding cities, forming the Bachu–Tumxuk Urban Group and the Shache–Zepu–Yecheng Urban Group. This study contributes to the understanding of the characteristics of urban scale distribution and the spatial structure of metropolitan areas in arid regions, as well as providing guidance for the formulation of policies for the development of different circles in the KMA. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

137. How to Employ Zipf's Laws for Content Analysis in Tourism Studies?

Author: Cardoso, Lucília, Araújo Vila, Noelia, Soliman, Mohammad, Filipe Araújo, Arthur, and Feijó de Almeida, Giovana Goretti
Subjects: ZIPF'S law, CONTENT analysis, TOURISM
Abstract: Although Zipf's laws were deployed in various contexts, no previous research has investigated their adoption in tourism academic publications. This study aims to fill this gap by employing Zipf's laws as a content analysis method for analyzing tourism literature, with emphasis on the gamification subject. Employing the DB Gnosis software, the findings revealed that the application of Zipf's laws on obtained tourism gamification articles not only broadens the academic understanding of these laws but also brings about relevant implications for academic literature in tourism. This research provides a new content analysis method in tourism and presents new insights into existing content analysis approaches in tourism. It also adds to the few studies that have addressed the topic of tourism gamification. [ABSTRACT FROM AUTHOR]
Published: 2022

138. Firm-Size Distribution in Poland: Is Power Law Applicable?

Author: Piotr Gabrielczak and Tomasz Serwach
Subjects: power law, zipf's law, firm-size distribution, scaling, Economics as a science, HB71-74
Abstract: This article focuses on the existence of power laws in the firm-size distribution in Poland. Specifically, we empirically test whether the size distribution of companies in Poland has the characteristics of Zipf’s law, a special case of power law observed in many different contexts in empirical economic literature. Our analysis uses 2019 data on the 2,000 largest companies in Poland as ranked by the Rzeczpospolita daily newspaper in its “Lista 2000” (Top 2,000 List). We reviewed theoretical mechanisms generating power laws and used several estimators of the power-law exponent in our empirical analysis. Our results confirm statistically significant deviations from Zipf’s law in the firm-size distribution in Poland. We found evidence that the power law cannot satisfactorily approximate the sales-based distribution of firms.
Published: 2021
Full Text: View/download PDF

139. Social Networks: Modelling and Analysis.

Author: Sengupta, Arindam
Subjects: *SOCIAL network analysis, *ZIPF'S law, *MATHEMATICAL notation, *INFORMATION dissemination, *GRAPH algorithms
Abstract: These include normalised degree centrality, eigenvector centrality, Katz centrality, betweenness centrality and normalised closeness centrality. Readership: Management professionals, marketing analysts, students taking courses in discrete mathematics, networks and associated algorithms. Chapter 1 describes social networks and associated concepts as well as types of analysis in broad terms with some examples and goes on to preview some of the later material. [Extracted from the article]
Published: 2023
Full Text: View/download PDF

140. Word frequency effects found in free recall are rather due to Bayesian surprise

Author: Serban C. Musca and Anthony Chemero
Subjects: word frequency, free recall, ecological psychology, predictive coding, Zipf’s law, Bayesian surprise, Psychology, BF1-990
Abstract: The inconsistent relation between word frequency and free recall performance (sometimes a positive one, sometimes a negative one, and sometimes no relation) and the non-monotonic relation found between the two cannot all be explained by current theories. We propose a theoretical framework that can explain all extant results. Based on an ecological psychology analysis of the free recall situation in terms of environmental and informational resources available to the participants, we propose that because participants’ cognitive system has been shaped by their native language, free recall performance is best understood as the end result of relational properties that preexist the experimental situation and of the way the words from the experimental list interact with those. In addition to this, we borrow from predictive coding theory the idea that the brain constantly predicts “what is coming next” so that it is mainly prediction errors that will propagate information forward. Our ecological psychology analysis indicates there will be “prediction errors” because the word frequency distribution in an experimental word list is inevitably different from the particular Zipf’s law distribution of the words in the language that shaped participants’ brains. We further propose the particular distributional discrepancies inherent to a given word list will trigger, as a function of the words that are included in the list, their order, and of the words that are absent from the list, a surprisal signal in the brain, something that is isomorphic to the concept of Bayesian surprise. The precise moment when Bayesian surprise is triggered will determine to what word of the list that Bayesian surprise will be associated with, and the word the Bayesian surprise will be associated with will benefit from it and become more memorable as a direct function of the magnitude of the surprisal. 
Two experiments are presented that show a proxy of Bayesian surprise explains the free recall performance and that no effect of word frequency is found above and beyond the effect of that proxy variable. We then discuss how our view can account for all data extant in the literature on the effect of word frequency on free recall.
Published: 2022
Full Text: View/download PDF

141. Call of the wild.

Author: Patricelli, Gail
Subjects: *ANIMAL diversity, *HUMAN-animal relationships, *ANIMAL behavior, *ZIPF'S law, *SCIENTIFIC literature, *ANIMAL communication
Published: 2024
Full Text: View/download PDF

142. From Zipf to Price and beyond.

Author: Eliazar, Iddo
Subjects: *ZIPF'S law, *PRICES, *PHASE transitions, *LORENZ curve, *POWER spectra
Abstract: Consider a society comprising n members that are ranked in a decreasing order of their personal wealths. For this society: Zipf's Law manifests the case in which the members' ranks and wealths display an inverse power-law relation; and Price's Law manifests the case in which the √n richest members possess, collectively, half of all the wealth. This paper goes from Zipf's Law to Price's Law and beyond, and it does so by introducing and exploring a novel Generalized Price's Law (GPL): a general allometric scaling of the quantiles of rank distributions. Akin to multifractals, the GPL is governed by spectrums of powers. The GPL spectra are investigated, and are shown to be fractal objects: from a socioeconomic perspective, the spectra are power-law Lorenz curves that characterize poor fractality and rich fractality; from a probabilistic perspective, the spectra are characterized by Pareto statistics and by Lindy Laws. The intersection of Zipf's Law and of the GPL is also investigated, and it is shown to be: (i) the phase transition between the two markedly different fractal regimes of the GPL spectra; (ii) the phase transition between two markedly different macroscopic regimes of Zipf's Law; (iii) Price's Law. Metaphorically, this paper establishes navigation directions in the space of rank distributions: how to get from the 'Zipf street' to the 'Price junction', from there to the new 'GPL avenue', and back. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
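
Illustrative note (not from the article): the two laws named in the abstract above can be compared numerically. The sketch below assumes the usual square-root form of Price's Law (the √n richest members hold about half the wealth) and a pure Zipf profile in which the wealth at rank r is proportional to r^(-exponent). The function names `zipf_wealths` and `top_share` and the choice n = 10,000 are hypothetical, picked only for this example.

```python
import math

def zipf_wealths(n, exponent=1.0):
    """Wealth at rank r is proportional to r**(-exponent) (assumed Zipf profile)."""
    return [rank ** -exponent for rank in range(1, n + 1)]

def top_share(wealths, k):
    """Fraction of the total wealth held by the k richest members."""
    ordered = sorted(wealths, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

n = 10_000
k = math.isqrt(n)  # the sqrt(n) richest members, as in Price's Law
share = top_share(zipf_wealths(n, exponent=1.0), k)
print(f"top {k} of {n} members hold {share:.1%} of the wealth")
```

Changing `exponent` shows how the share held by the top √n members moves away from one half as the rank-wealth relation becomes steeper or flatter.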

143. Defining urban boundaries through DBSCAN and Shannon's entropy: The case of the Mexican National Urban System.

Author: Caudillo-Cos, Camilo Alberto, Montejano-Escamilla, Jorge Alberto, Tapia-McClung, Rodrigo, Ávila-Jiménez, Felipe Gerardo, and Barrera-Alarcón, Itzia Gabriela
Subjects: *UNCERTAINTY (Information theory), *ZIPF'S law, *URBANIZATION, *GEOSPATIAL data, *CITIES & towns, *ENTROPY, *MAXIMUM entropy method, *LAND settlement patterns
Abstract: A novel method is proposed to define cities' boundaries within a given national urban system. Point Volunteered Geographic Information, such as the OpenStreetMap road network nodes, together with national geospatial information, like economic units, are recursively clustered using DBSCAN at different distance thresholds; clusters smaller than a threshold reflecting the rural settlements pattern are filtered, and then Shannon's entropy for the entire system is calculated. The system of percolated nodes obtained when this entropy reaches its maximum is compared against the current definition of what is officially urban. It is found to be very similar to the actual urban-nonurban classification, and also to have a very high correspondence with the current classification of the Mexican National Urban System (SUN). An additional finding is that the percolated system at the maximum entropy value has a very high correspondence with Zipf's law, while the current Mexican SUN does not.
• We define urban boundaries with a combination of crowdsourced and official datasets using DBSCAN.
• The criteria are rural cluster size threshold and the distance at which Shannon's entropy index maximum was found.
• The percolated clusters fit Zipf's Law better than the National Urban System (SUN).
• We obtained a global similarity of around 86% and locally above 75% with only five cities missing.
[ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

144. A long-term, regional-level analysis of Zipf's and Gibrat's laws in the United States.

Author: González-Val, Rafael, Ximénez-de-Embún, Domingo P., and Sanz-Gracia, Fernando
Subjects: *ZIPF'S law, *AMERICAN law, *MUNICIPAL ordinances, *NULL hypothesis, *TEST validity
Abstract: We test the validity of Zipf's and Gibrat's laws for city size distributions at the regional level from 1900 to 2010 by considering US states. Zipf's law is satisfied for a majority of states, but for the United States as a whole it only held during the first half of the twentieth century. The null hypothesis of a power law is not rejected at the national level or for most states (the maximum number of rejections in one year is 13 states out of 48). There is evidence supporting a weak version of Gibrat's law in the long-term; mean growth is independent of initial population for most city sizes over the entire United States and in 27 states, while the variance of growth is size-dependent.
• We test the validity of Zipf's and Gibrat's laws for city size distributions at the US regional level from 1900–2010.
• Zipf's law holds for a majority of states, but for the US as a whole it only held during the first half of the 20th century.
• The null hypothesis of a power law is not rejected at the national level or for most states.
• There is evidence supporting a weak version of Gibrat's law in the long-term.
• Mean growth is independent of initial population for most city sizes over the entire US and in 27 states.
[ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
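
Illustrative note (not from the article): one common, rough way to check Zipf's law for city sizes is to regress log size on log rank and see whether the slope is close to -1. The sketch below uses ordinary least squares on synthetic sizes generated for this example only; it is not necessarily the procedure used by the authors, and the name `rank_size_slope` is hypothetical.

```python
import math

def rank_size_slope(sizes):
    """Least-squares slope of log(size) against log(rank), largest size first."""
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic sizes (an assumption for illustration, not real census data):
# size = 1,000,000 / rank follows Zipf's law exactly.
synthetic_sizes = [1_000_000 / rank for rank in range(1, 201)]
slope = rank_size_slope(synthetic_sizes)
print(f"estimated rank-size slope: {slope:.3f}")
```

A slope near -1 is consistent with Zipf's law. On real data this simple regression is known to be a crude diagnostic, so formal power-law tests are usually preferred.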

145. Caching scheme for information‐centric networks with balanced content distribution.

Author: Dutta, Nitul, Patel, Shobhit K., Faragallah, Osama S., Baz, Mohammed, and Rashed, Ahmed Nabih Zaki
Subjects: *ZIPF'S law, *MATHEMATICAL analysis, *MULTICASTING (Computer networks), *WIRELESS mesh networks
Abstract: Information‐centric network (ICN) emphasizes content retrieval without much regard for the location of its actual producer. This novel networking paradigm makes content retrieval faster and less expensive by shifting data provisioning to the content holder rather than the content owner. Caching is the feature of ICN that makes content serving possible from any intermediate device. Efficient caching is one of the primary requirements for effective deployment of ICN. In this paper, a caching approach with balanced content distribution among network devices is proposed. The selection of contents to be cached is determined through universal and computed using Zipf's law. The dynamic change in popularity of contents is also considered to make caching decisions. For balancing the cached content across the network, every router keeps track of its neighbor's cache status. Three parameters, the proportionate distance of the router from the client (pd), the router congestion (rc), and the cache status (cs), are contemplated to select a router for caching contents. The new caching approach is evaluated in a simulated environment using ndnSIM‐2.0. Three state‐of‐the‐art approaches, Leave Copy Everywhere (LCE), centrality measures‐based algorithm (CMBA), and a probability‐based caching (probCache), are considered for comparison. The proposed caching method shows better performance than the other three protocols used in the comparison. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

146. Avoiding adverse autonomous agent actions.

Author: Hancock, P.A.
Subjects: *INDUSTRIAL safety, *ZIPF'S law, *AUTONOMOUS robots, *WATSON (Computer), *MIXED reality
Abstract: The potential threats of autonomy: One of the obvious threats of autonomy, which lies in its form as, prospectively, one of the most powerful expressions of technology, is its influence upon evolution; both human and technical. Any real-world challenges that can present the need for an exact and deterministic expression of intent will already be an obvious candidate for precise automation and so probably not a candidate for exploratory forms of autonomy (and see Hancock, [41], [42]). The "isles of autonomy" metaphor initially casts humans in the littoral role of the beaches and riparian shorelines that surround such emerging islands (i.e., the outer boundary layer of emerging autonomies, as they "rise" above the ocean of extant automation). Challenges to formal methods assessments: It would be both desirable and actually rather gratifying then if, in counter to each of these prospective weaknesses, we could specify a variety of provably effective formal methods which would test and indemnify us against any untoward outcomes of a singular or interactive group of autonomous systems. The potential promises of autonomy: The vista of autonomy's promise is only circumscribed by the limits of the advocates' imagination that such envisaged autonomous systems can underwrite (Arkin, [4]), see Figure 3. [Extracted from the article]
Published: 2022
Full Text: View/download PDF

147. Optimal Coding and the Origins of Zipfian Laws.

Author: Ferrer-i-Cancho, Ramon, Bentz, Christian, and Seguin, Caio
Subjects: *ZIPF'S law, *MAXIMUM entropy method, *INFORMATION theory, *NATURAL languages, *COST control, *VIDEO compression
Abstract: The problem of compression in standard information theory consists of assigning codes as short as possible to numbers. Here we consider the problem of optimal coding – under an arbitrary coding scheme – and show that it predicts Zipf's law of abbreviation, namely a tendency in natural languages for more frequent words to be shorter. We apply this result to investigate optimal coding also under so-called non-singular coding, a scheme where unique segmentation is not warranted but codes stand for a distinct number. Optimal non-singular coding predicts that the length of a word should grow approximately as the logarithm of its frequency rank, which is again consistent with Zipf's law of abbreviation. Optimal non-singular coding in combination with the maximum entropy principle also predicts Zipf's rank-frequency distribution. Furthermore, our findings on optimal non-singular coding challenge common beliefs about random typing. It turns out that random typing is in fact an optimal coding process, in stark contrast with the common assumption that it is detached from cost cutting considerations. Finally, we discuss the implications of optimal coding for the construction of a compact theory of Zipfian laws more generally as well as other linguistic laws. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF
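
Illustrative note (not from the article): the abstract above states that under optimal non-singular coding, word length grows roughly as the logarithm of frequency rank. A minimal sketch of that idea, assuming a two-letter alphabet and the simple rule "give the shortest unused string to the next most frequent item", is shown below. The function name `shortest_first_codes` and the alphabet "ab" are hypothetical choices for this example.

```python
from itertools import count, product

def shortest_first_codes(alphabet, how_many):
    """Yield the `how_many` shortest distinct non-empty strings over `alphabet`."""
    produced = 0
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            if produced == how_many:
                return
            yield "".join(letters)
            produced += 1

# Rank 1 (most frequent) gets the first code, rank 2 the second, and so on.
codes = list(shortest_first_codes("ab", 14))
lengths = [len(code) for code in codes]
for rank, code in enumerate(codes, start=1):
    print(rank, code, len(code))
```

With two letters there are 2 codes of length 1, 4 of length 2, and 8 of length 3, so the code length steps up each time the rank roughly doubles, which is the logarithmic growth the abstract describes.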

148. Characteristics of Malay translated hadith corpus.

Author: Sazali, Siti Syakirah, Rahman, Nurazzah Abdul, and Bakar, Zainab Abu
Subjects: ZIPF'S law, HADITH, MALAY language, CORPORA
Abstract: An annotated corpus can greatly assist in the natural language processing field. For example, computers can understand more of the document context, and indexing and clustering in information retrieval can be done precisely with little or no ambiguity of words. However, there are only a few annotated corpora in the Malay language, and they are not publicly shared. In this paper, we delve into analysing and annotating Malay translated hadith documents in terms of tagging and entities. There are three phases: manual filtering and cleaning, analysing the corpus, and creating the benchmark. As a result, an analysis and benchmark of the Malay translated hadith corpus were produced in terms of part-of-speech and named entity tags that follow the Zipf's law distribution. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

149. The Gravity Equation in International Trade: A Note.

Author: Dewitte, Ruben
Subjects: INTERNATIONAL trade, GRAVITY, ZIPF'S law, DISTANCE education, ELASTICITY (Economics), CUMULATIVE distribution function
Abstract: Applying the reported cutoff of 2,000 km results in a distance elasticity of $_{long} = 1.185$ ($SE = 0.250$). This range is further restricted to cutoff values between 1.91 and 4.52 million FF if one requires the point estimate ($+ 1$) to be significantly larger than . Meanwhile, the power exponent of the specified firm size-distance relation increased by one ($+ 1$) takes values between 1.03 and 1.25. [Extracted from the article]
Published: 2022
Full Text: View/download PDF

150. Cascade events in geographical space.

Author: Ordoñez, Dylan Marcus T. and Batac, Rene C.
Subjects: *ZIPF'S law, *DISTRIBUTION (Probability theory), *SPACE, *CENTROID
Published: 2022
Full Text: View/download PDF
