1,950 results on '"data synthesis"'
Search Results
52. Drivers of biodiversity change in the Anthropocene
- Author
-
Daskalova, Gergana Nikolaeva, Myers-Smith, Isla, Bjorkman, Anne, and Dornelas, Maria
- Subjects
biodiversity, conservation, global change, ecology, data science, data synthesis, time-series, forest loss, global change drivers, rarity, species traits, biodiversity change, species richness, community composition
- Abstract
Across the globe, the populations of species and the biodiversity of ecological communities are changing, including declines, gains and stable trends over time. Against a backdrop of accelerating global change, a critical research challenge is to disentangle the sources of the heterogeneous patterns of population and biodiversity change over time. In this thesis, I linked population and biodiversity change with species traits like rarity and commonness, and with global change drivers like forest loss. I synthesised global biodiversity databases with gridded driver datasets to quantify how species' populations and biodiversity are being impacted by human activities in the Anthropocene. The rise of open-access data in ecology has produced databases with millions of records which have launched large-scale syntheses of how Earth's biota is changing over time and space. However, our knowledge of biodiversity change is limited by the available data and their biases. In Chapter 1, I tested the representation of three worldwide biodiversity databases (Living Planet, BioTIME and PREDICTS) across geographic and temporal variation in global change over land and sea and across the tree of life. I found that variation in global change drivers is better captured over space than over time and in the marine realm versus on land. I provided recommendations on how to improve the use of existing data, better target future ecological monitoring and capture different combinations of global change. In Chapter 2, I tested whether vertebrate species from specific biomes, taxa or with certain species traits are more likely to increase or decrease in a time of accelerating global change. I analysed nearly 10 000 population abundance time series from over 2000 vertebrate species part of the Living Planet Database. I integrated abundance data with information on geographic range, habitat preference, taxonomic and phylogenetic relationships, and IUCN Red List Categories and threats. 
I found that 15% of populations declined, 18% increased, and 67% showed no net changes over time. Amphibians were the only taxa that experienced net declines in the analysed data, while birds, mammals and reptiles experienced net increases. Despite this variation among broad taxonomic groups, surprisingly I did not detect phylogenetic patterns in which species were more likely to decline versus increase. Population trends were poorly explained by species' rarity and global-scale threats. I found that incorporating the full spectrum of population change, including declines, gains and stable trends, will improve conservation efforts to protect global biodiversity. In Chapter 3, I explored land-use change to fill the gap in empirical evidence of how habitat transformations such as forest loss and gain are reshaping biodiversity over time. I quantified how change in forest cover has influenced temporal shifts in populations and ecological assemblages from over 6000 globally distributed time series across six taxonomic groups. I found that local-scale increases and decreases in abundance, species richness, and temporal species replacement (turnover) were intensified by as much as 48% after forest loss. Larger amounts of forest loss did not always correlate with higher population and biodiversity change across sites, highlighting the mediating effects of local context and historical baselines. Temporal lags in population- and assemblage-level shifts after forest loss extended up to 50 years and increased with species' generation time. My findings indicate that forest loss amplified population and biodiversity change, with effects on both short and long temporal scales. A mix of immediate and lagged biodiversity change following land-use change emphasises the need for temporally explicit biodiversity scenarios to accurately estimate progress towards conservation goals. 
Together, my thesis findings demonstrate the wide spectrum of population and biodiversity change happening across varying amounts of global change and different realms, taxa and species traits. These heterogeneous impacts of global change on population and biodiversity spanned temporal scales from immediate effects in a couple of years to lagged responses decades after disturbance. The links between global change drivers and shifts in species' abundance, species richness and compositional turnover depended on historical context and species' characteristics like generation time. I documented both immediate and temporally delayed effects of global change drivers on species' population abundance and the biodiversity of ecological assemblages, which highlights the importance of long-term ecological monitoring. The main implications of my thesis findings are that first, any inferences drawn from biodiversity syntheses reflect the types of species and places represented by the data and the global change that is experienced. To create accurate scenarios, we need biodiversity data that span not only different taxa and locations, but also the spectrum of global change variation around the world. Second, biodiversity predictions should incorporate both positive and negative impacts of global change drivers as well as lagged responses. Finally, ecosystems and the species within them are usually simultaneously exposed to a suite of global change drivers and a key future research step is to test the synergy and/or antagonism in the effects and interactions among multiple types of environmental change on populations and biodiversity. Overall, my thesis research demonstrates that the drivers of biodiversity change in the Anthropocene have both immediate and temporally delayed effects which depend on species' traits and the sites' historical context.
My findings suggest that by incorporating the full spectrum of biodiversity change and the nuance around interacting global change drivers we can improve projections of future ecological shifts and enhance local and international conservation policies.
- Published
- 2021
53. Efficient Wheat Head Segmentation with Minimal Annotation: A Generative Approach
- Author
-
Jaden Myers, Keyhan Najafian, Farhad Maleki, and Katie Ovens
- Subjects
deep learning, segmentation, generative adversarial networks, data synthesis, Photography, TR1-1050, Computer applications to medicine. Medical informatics, R858-859.7, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Deep learning models have been used for a variety of image processing tasks. However, most of these models are developed through supervised learning approaches, which rely heavily on the availability of large-scale annotated datasets. Developing such datasets is tedious and expensive. In the absence of an annotated dataset, synthetic data can be used for model development; however, due to the substantial differences between simulated and real data, a phenomenon referred to as domain gap, the resulting models often underperform when applied to real data. In this research, we aim to address this challenge by first computationally simulating a large-scale annotated dataset and then using a generative adversarial network (GAN) to fill the gap between simulated and real images. This approach results in a synthetic dataset that can be effectively utilized to train a deep-learning model. Using this approach, we developed a realistic annotated synthetic dataset for wheat head segmentation. This dataset was then used to develop a deep-learning model for semantic segmentation. The resulting model achieved a Dice score of 83.4% on an internal dataset and Dice scores of 79.6% and 83.6% on two external datasets from the Global Wheat Head Detection datasets. While we proposed this approach in the context of wheat head segmentation, it can be generalized to other crop types or, more broadly, to images with dense, repeated patterns such as those found in cellular imagery.
- Published
- 2024
54. Enhanced Pet Behavior Prediction via S2GAN-Based Heterogeneous Data Synthesis
- Author
-
Jinah Kim and Nammee Moon
- Subjects
behavior prediction, behavior monitoring, heterogeneous data, data synthesis, generative adversarial network, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Heterogeneous data have been used to enhance behavior prediction performance; however, their use involves issues such as missing data, which need to be addressed. This paper proposes enhanced pet behavior prediction via Sensor to Skeleton Generative Adversarial Networks (S2GAN)-based heterogeneous data synthesis. The S2GAN model synthesizes the key features of video skeletons based on collected nine-axis sensor data and replaces missing data, thereby enhancing the accuracy of behavior prediction. In this study, data collected from 10 pets in a real-life-like environment were used to conduct recognition experiments on 9 commonly occurring types of indoor behavior. Experimental results confirmed that the proposed S2GAN-based synthesis method effectively resolves possible missing data issues in real environments and significantly improves the performance of the pet behavior prediction model. Additionally, by utilizing data collected under conditions similar to the real environment, the method enables more accurate and reliable behavior prediction. This research demonstrates the importance and utility of synthesizing heterogeneous data in behavior prediction, laying the groundwork for applications in various fields such as abnormal behavior detection and monitoring.
- Published
- 2024
55. CTAB-GAN+: enhancing tabular data synthesis
- Author
-
Zilong Zhao, Aditya Kunar, Robert Birke, Hiek Van der Scheer, and Lydia Y. Chen
- Subjects
GAN, data synthesis, tabular data, differential privacy, imbalanced distribution, Information technology, T58.5-58.64
- Abstract
The use of synthetic data is gaining momentum, in part because original data are often unavailable owing to privacy and legal considerations and in part because of its utility as an augmentation to the authentic data. Generative adversarial networks (GANs), a paragon of generative models, initially for images and subsequently for tabular data, have contributed many of the state-of-the-art synthesizers. As GANs improve, the synthesized data increasingly resemble the real data, risking privacy leakage. Differential privacy (DP) provides theoretical guarantees on privacy loss but degrades data utility. Striking the best trade-off remains a challenging research question. In this study, we propose CTAB-GAN+, a novel conditional tabular GAN. CTAB-GAN+ improves upon the state of the art by (i) adding downstream losses to conditional GAN for higher utility synthetic data in both classification and regression domains; (ii) using Wasserstein loss with gradient penalty for better training convergence; (iii) introducing novel encoders targeting mixed continuous-categorical variables and variables with unbalanced or skewed data; and (iv) training with DP stochastic gradient descent to impose strict privacy guarantees. We extensively evaluate CTAB-GAN+ on statistical similarity and machine learning utility against state-of-the-art tabular GANs. The results show that CTAB-GAN+ synthesizes privacy-preserving data with at least 21.9% higher machine learning utility (i.e., F1-Score) across multiple datasets and learning tasks under a given privacy budget.
- Published
- 2024
56. Risky business: human-related data is lacking from Lyme disease risk models
- Author
-
Erica Fellin, Mathieu Varin, and Virginie Millien
- Subjects
blacklegged ticks, data synthesis, human-related, Lyme disease, risk assessment, risk map, Public aspects of medicine, RA1-1270
- Abstract
Used as a communicative tool for risk management, risk maps provide a service to the public, conveying information that can raise risk awareness and encourage mitigation. Several studies have utilized risk maps to determine risks associated with the distribution of Borrelia burgdorferi, the causal agent of Lyme disease in North America and Europe, as this zoonotic disease can lead to severe symptoms. This literature review focused on the use of risk maps to model distributions of B. burgdorferi and its vector, the blacklegged tick (Ixodes scapularis), in North America to compare variables used to predict these spatial models. Data were compiled from the existing literature to determine which ecological, environmental, and anthropic (i.e., human focused) variables past research has considered influential to the risk level for Lyme disease. The frequency of these variables was examined and analyzed via a non-metric multidimensional scaling analysis to compare different map elements that may categorize the risk models performed. Environmental variables were found to be the most frequently used in risk spatial models, particularly temperature. The distribution of variables used differed significantly across studies within three map elements: Map Type, Map Distributions, and Map Scale. Within these map elements, few anthropic variables were considered, particularly in studies that modeled future risk, despite the objective of these models directly or indirectly focusing on public health intervention. Without including human-related factors within risk map models, it is difficult to determine how reliable these risk maps truly are. Future researchers may be persuaded to improve disease risk models by taking this into consideration.
- Published
- 2023
57. SeedArc, a global archive of primary seed germination data.
- Author
-
Fernández‐Pascual, Eduardo, Carta, Angelino, Rosbakh, Sergey, Guja, Lydia, Phartyal, Shyam S., Silveira, Fernando A. O., Chen, Si‐Chong, Larson, Julie E., and Jiménez‐Alfaro, Borja
- Subjects
*GERMINATION, *BOTANY, *BIOTIC communities, *SEED size, *PLANT reproduction, *BIOMES, *PLANT ecology
- Abstract
Keywords: data synthesis; database; germination; open science; plant reproduction; repository; seed; trait. Data availability: The data and code used to produce this article are available at https://github.com/efernandezpascual/seedarcms. The need for a global archive of primary seed germination data: The seed ecology community has recently recognized the need to synthesize knowledge, setting the research agenda for functional seed ecology (Saatkamp et al., [34]). SeedArc compiles primary seed germination data to synthesize the seed germination spectrum at a global scale. The theory underlying the seed germination spectrum has been laid out by decades of work on seed ecology (Baskin & Baskin, [1]), but empirical studies testing major ecological hypotheses at both global and local scales remain elusive without a standardized seed germination database. [Extracted from the article]
- Published
- 2023
58. Distribution and trends of mercury in aquatic and terrestrial biota of New York, USA: a synthesis of 50 years of research and monitoring.
- Author
-
Adams, Evan M., Gulka, Julia E., Yang, Yang, Burton, Mark E. H., Burns, Douglas A., Buxton, Valerie, Cleckner, Lisa, DeSorbo, Christopher R., Driscoll, Charles T., Evers, David C., Fisher, Nicholas, Lane, Oksana, Mao, Huiting, Riva-Murray, Karen, Millard, Geoffrey, Razavi, N. Roxanna, Richter, Wayne, Sauer, Amy K., and Schoch, Nina
- Subjects
AQUATIC organisms, MERCURY, AQUATIC habitats, LAND cover, MERCURY vapor, RISK exposure, METHYLMERCURY
- Abstract
Mercury (Hg) inputs have particularly impacted the northeastern United States due to its proximity to anthropogenic emissions sources and abundant habitats that efficiently convert inorganic Hg into methylmercury. Intensive research and monitoring efforts over the past 50 years in New York State, USA, have informed the assessment of the extent and impacts of Hg exposure on fishes and wildlife. By synthesizing Hg data statewide, this study quantified temporal trends of Hg exposure, spatiotemporal patterns of risk, the role that habitat and Hg deposition play in producing spatial patterns of Hg exposure in fish and other wildlife, and the effectiveness of current monitoring approaches in describing Hg trends. Most temporal trends were stable, but we found significant declines in Hg exposure over time in some long-sampled fish. The Adirondack Mountains and Long Island showed the greatest number of aquatic and terrestrial species with elevated Hg concentrations, reflecting an unequal distribution of exposure risk to fauna across the state. Persistent hotspots were detected for aquatic species in central New York and the Adirondack Mountains. Elevated Hg concentrations were associated with open water, forests, and rural, developed habitats for aquatic species, and open water and forested habitats for terrestrial species. Areas of consistently elevated Hg were found in areas driven by atmospheric and local Hg inputs, and habitat played a significant role in translating those inputs into biotic exposure. Continued long-term monitoring will be important in evaluating how these patterns continue to change in the face of changing land cover, climate, and Hg emissions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
59. Of causes and symptoms: using monitoring data and expert knowledge to diagnose the causes of stream degradation.
- Author
-
Rettig, Katharina, Semmler-Elpers, Renate, Brettschneider, Denise, Hering, Daniel, and Feld, Christian K.
- Subjects
WATER management, BAYESIAN analysis, ECOLOGICAL assessment, WATER use, LAND use, FECAL contamination
- Abstract
Ecological status assessment under the European Water Framework Directive (WFD) often integrates the impact of multiple stressors into a single index value. This hampers the identification of the individual stressors responsible for status deterioration. As a consequence, management measures are often disconnected from assessment results. To close this gap and to support river basin managers in the diagnosis of stressors, we linked numerous macroinvertebrate assessment metrics and one diatom index with potential causes of ecological deterioration through Bayesian belief networks (BBNs). The BBNs were informed by WFD monitoring data as well as regular consultation with experts and allow the probabilities of individual degradation causes to be estimated from a selection of biological metrics. Macroinvertebrate metrics were shown to be more strongly linked to hydromorphological conditions and land use than to water quality-related parameters (e.g., thermal and nutrient pollution). The modeled probabilities also allow the potential causes of degradation to be ordered hierarchically. The comparison of assessment metrics showed that compositional and trait-based community metrics performed equally well in the diagnosis. The testing of the BBNs by experts resulted in an agreement between model output and expert opinion of 17–92% for individual stressors. Overall, the expert-based validation confirmed a good diagnostic potential of the BBNs; on average 80% of the diagnosed causes were in agreement with expert judgement. We conclude that diagnostic BBNs can assist the identification of causes of stream and river degradation and thereby inform the derivation of appropriate management decisions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
60. ASIDS: A Robust Data Synthesis Method for Generating Optimal Synthetic Samples.
- Author
-
Du, Yukun, Cai, Yitao, Jin, Xiao, Wang, Hongxia, Li, Yao, and Lu, Min
- Subjects
*SAMPLE size (Statistics), *INTERPOLATION
- Abstract
Most existing data synthesis methods are designed to tackle problems with dataset imbalance, data anonymization, and an insufficient sample size. There is a lack of effective synthesis methods in cases where the actual datasets have a limited number of data points but a large number of features and unknown noise. Thus, in this paper we propose a data synthesis method named Adaptive Subspace Interpolation for Data Synthesis (ASIDS). The idea is to divide the original data feature space into several subspaces with an equal number of data points, and then perform interpolation on the data points in the adjacent subspaces. This method can adaptively adjust the sample size of the synthetic dataset that contains unknown noise, and the generated sample data typically contain minimal errors. Moreover, it adjusts the feature composition of the data points, which can significantly reduce the proportion of the data points with large fitting errors. Furthermore, the hyperparameters of this method have an intuitive interpretation and usually require little calibration. Analysis results obtained using simulated original data and benchmark original datasets demonstrate that ASIDS is a robust and stable method for data synthesis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
61. Climate Evolution Through the Onset and Intensification of Northern Hemisphere Glaciation.
- Author
-
McClymont, E. L., Ho, S. L., Ford, H. L., Bailey, I., Berke, M. A., Bolton, C. T., De Schepper, S., Grant, G. R., Groeneveld, J., Inglis, G. N., Karas, C., Patterson, M. O., Swann, G. E. A., Thirumalai, K., White, S. M., Alonso‐Garcia, M., Anand, P., Hoogakker, B. A. A., Littler, K., and Petrick, B. F.
- Subjects
*PLIOCENE-Pleistocene boundary, *GLACIATION, *ICE sheets, *ATMOSPHERIC carbon dioxide, *PLIOCENE Epoch, *OCEAN circulation
- Abstract
The Pliocene Epoch (∼5.3–2.6 million years ago, Ma) was characterized by a warmer than present climate with smaller Northern Hemisphere ice sheets, and offers an example of a climate system in long‐term equilibrium with current or predicted near‐future atmospheric CO2 concentrations (pCO2). A long‐term trend of ice‐sheet expansion led to more pronounced glacial (cold) stages by the end of the Pliocene (∼2.6 Ma), known as the "intensification of Northern Hemisphere Glaciation" (iNHG). We assessed the spatial and temporal variability of ocean temperatures and ice‐volume indicators through the late Pliocene and early Pleistocene (from 3.3 to 2.4 Ma) to determine the character of this climate transition. We identified asynchronous shifts in long‐term means and the pacing and amplitude of shorter‐term climate variability, between regions and between climate proxies. Early changes in Antarctic glaciation and Southern Hemisphere ocean properties occurred even during the mid‐Piacenzian warm period (∼3.264–3.025 Ma) which has been used as an analog for future warming. Increased climate variability subsequently developed alongside signatures of larger Northern Hemisphere ice sheets (iNHG). Yet, some regions of the ocean felt no impact of iNHG, particularly in lower latitudes. Our analysis has demonstrated the complex, non‐uniform and globally asynchronous nature of climate changes associated with the iNHG. Shifting ocean gateways and ocean circulation changes may have pre‐conditioned the later evolution of ice sheets with falling atmospheric pCO2. Further development of high‐resolution, multi‐proxy reconstructions of climate is required so that the full potential of the rich and detailed geological records can be realized. Plain Language Summary: Warm climates of the geological past provide windows into future environmental responses to elevated atmospheric CO2 concentrations, and past climate transitions identify important or sensitive regions and processes. 
We assessed the patterns of average ocean temperatures and indicators of ice sheet size over hundreds of thousands of years, and compared to shorter‐term variability (tens of thousands of years) during a recent transition from late Pliocene warmth (when CO2 was similar to present) to the onset of the large and repeated advances of northern hemisphere ice sheets referred to as the "ice ages." We show that different regions of the climate system changed at different times, with some changing before the ice sheets expanded. The development of larger ice sheets in the Northern Hemisphere then impacted ocean temperatures and circulation, but there were many regions where no impacts were felt. Our analysis highlights regional differences in the timing and amplitudes of change within a globally‐significant climate transition as well as in response to the current atmospheric CO2 concentrations in our climate system. Key Points: The "stable" warm late Pliocene ∼3.3–3.1 million years ago was a time of climate transition, especially in the southern hemisphere. Ocean temperatures and ice sheets evolved asynchronously 3.3–2.4 Ma during the onset and intensification of Northern Hemisphere Glaciation. Climate variability evolved in complex, non‐uniform ways, most strongly expressed in northern mid‐latitude sea‐surface temperature records. [ABSTRACT FROM AUTHOR]
- Published
- 2023
62. In silico simulation of hepatic arteries: An open‐source algorithm for efficient synthetic data generation.
- Author
-
Whitehead, Joseph F., Laeseke, Paul F., Periyasamy, Sarvesh, Speidel, Michael A., and Wagner, Martin G.
- Subjects
*MACHINE learning, *IMAGE reconstruction algorithms, *COST functions, *HEPATIC artery, *DEEP learning, *ALGORITHMS
- Abstract
Background: In silico testing of novel image reconstruction and quantitative algorithms designed for interventional imaging requires realistic high‐resolution modeling of arterial trees with contrast dynamics. Furthermore, data synthesis for training of deep learning algorithms requires that an arterial tree generation algorithm be computationally efficient and sufficiently random. Purpose: The purpose of this paper is to provide a method for anatomically and physiologically motivated, computationally efficient, random hepatic arterial tree generation. Methods: The vessel generation algorithm uses a constrained constructive optimization approach with a volume minimization‐based cost function. The optimization is constrained by the Couinaud liver classification system to assure a main feeding artery to each Couinaud segment. An intersection check is included to guarantee non‐intersecting vasculature and cubic polynomial fits are used to optimize bifurcation angles and to generate smoothly curved segments. Furthermore, an approach to simulate contrast dynamics and respiratory and cardiac motion is also presented. Results: The proposed algorithm can generate a synthetic hepatic arterial tree with 40 000 branches in 11 s. The high‐resolution arterial trees have realistic morphological features such as branching angles (MAD with Murray's law = 1.2 ± 1.2°), radii (median Murray deviation = 0.08), and smoothly curved, non‐intersecting vessels. Furthermore, the algorithm assures a main feeding artery to each Couinaud segment and is random (variability = 0.98 ± 0.01). Conclusions: This method facilitates the generation of large datasets of high‐resolution, unique hepatic angiograms for the training of deep learning algorithms and initial testing of novel 3D reconstruction and quantitative algorithms designed for interventional imaging. [ABSTRACT FROM AUTHOR]
- Published
- 2023
63. Assessing the exposure of UK habitats to 20th‐ and 21st‐century climate change, and its representation in ecological monitoring schemes.
- Author
-
Wilson, Oliver J. and Pescott, Oliver L.
- Subjects
*ENVIRONMENTAL monitoring, *WEATHER, *MEDITERRANEAN climate, *HABITATS, *LAND cover, *CLIMATE change
- Abstract
Climate change is a significant driver of contemporary biodiversity change. Ecological monitoring schemes can be crucial in highlighting its consequences, but connecting and interpreting observed climatic and ecological changes demands an understanding of monitored locations' exposure to climate change. Generalising from trends in monitored sites to habitats also requires an assessment of how closely sampled locations' climate change trajectories mirror those of wider ecosystems. Such assessments are rare but vital for drawing robust ecological conclusions. Focusing on the UK, we generated a metric of climate change exposure by quantifying the change in observed historical (1901–2019) and predicted future (2021–2080, pessimistic emissions scenario) conditions. We then assessed habitat‐specific climate change exposure by overlaying the resulting data with maps of contemporary (2019) land cover. Finally, we compared patterns of climate change exposure in locations sampled by ecological monitoring schemes to random samples from wider habitats. The UK's climate changed significantly between the early 20th century and the last decade, and is predicted to undergo even greater changes (including the development of Iberian/Mediterranean climate types in places) into the 21st century. Climate change exposure is unevenly distributed: regionally, it falls more in southern, central and eastern England; locally, it is greater at higher‐elevation locations than nearby areas at lower elevations. Areas with contemporary arable and horticulture, urban, calcareous grassland and suburban land cover are predicted to experience the greatest overall climatic change, though other habitats experienced relatively greater change than these in the first half of the 20th century. The extent to which locations sampled by ecological monitoring schemes represent broader habitat‐level gradients of climate change exposure varies.
Monitored sites' coverage of wider trends is heterogeneous across habitats, time periods and schemes. Policy implications. UK ecological monitoring schemes can effectively, though variably, capture the effects of climate change on habitats. To improve their performance, climate change could be explicitly included in the design of such programmes. Additionally, our findings on how effectively different datasets represent wider patterns of climate change are crucial for informing syntheses of ecological change connected to shifting atmospheric conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
64. Data for Digital Forensics: Why a Discussion on "How Realistic is Synthetic Data" is Dispensable.
- Author
-
Göbel, Thomas, Baier, Harald, and Breitinger, Frank
- Subjects
DIGITAL forensics, DATA libraries, FORENSIC sciences, RESEARCH personnel
- Abstract
Digital forensics depends on data sets for various purposes like concept evaluation, educational training, and tool validation. Researchers have gathered such data sets into repositories and created data simulation frameworks for producing large amounts of data. Synthetic data often face skepticism due to its perceived deviation from real-world data, raising doubts about its realism. This paper addresses this concern, arguing that there is no definitive answer. We focus on four common digital forensic use cases that rely on data. Through these, we elucidate the specifications and prerequisites of data sets within their respective contexts. Our discourse uncovers that both real-world and synthetic data are indispensable for advancing digital forensic science, software, tools, and the competence of practitioners. Additionally, we provide an overview of available data set repositories and data generation frameworks, contributing to the ongoing dialogue on digital forensic data sets' utility. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
65. Emotion recognition using facial expressions in an immersive virtual reality application.
- Author
-
Chen, Xinrun and Chen, Hengxin
- Subjects
EMOTION recognition ,FACIAL expression ,VIRTUAL reality ,HEAD-mounted displays ,INFRARED cameras ,EMOTIONS ,LIGHT sources - Abstract
Facial expression recognition (FER) is an important method to study and distinguish human emotions. In the virtual reality (VR) context, people's emotions are instantly and naturally triggered and mobilized due to the high immersion and realism of VR. However, when people are wearing head mounted display (HMD) VR equipment, the eye regions will be covered. The FER accuracy will be reduced if the eye region information is discarded. Therefore, it is necessary to obtain the information of eye regions using other methods. The main difficulty in FER in an immersive VR context is that the conventional FER methods depend on public databases. The image facial information in the public databases is complete, so these methods are difficult to directly apply to the VR context. To solve this problem, this paper designs and implements a solution for FER in the VR context as follows. A real facial expression database collection scheme in the VR context is implemented by adding an infrared camera and infrared light source to the HMD. A virtual database construction method is presented for FER in the VR context, which can improve the generalization of models. A deep network named the multi-region facial expression recognition model is designed for FER in the VR context. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
66. On Evaluating IoT Data Trust via Machine Learning.
- Author
-
Tadj, Timothy, Arablouei, Reza, and Dedeoglu, Volkan
- Subjects
TRUST ,SUPERVISED learning ,MACHINE learning ,INTERNET of things ,PYTHON programming language ,TAGS (Metadata) ,SECURE Sockets Layer (Computer network protocol) ,RANDOM walks ,CLUSTER analysis (Statistics) - Abstract
Data trust in IoT is crucial for safeguarding privacy, security, reliable decision-making, user acceptance, and complying with regulations. Various approaches based on supervised or unsupervised machine learning (ML) have recently been proposed for evaluating IoT data trust. However, assessing their real-world efficacy is hard mainly due to the lack of related publicly available datasets that can be used for benchmarking. Since obtaining such datasets is challenging, we propose a data synthesis method, called random walk infilling (RWI), to augment IoT time-series datasets by synthesizing untrustworthy data from existing trustworthy data. Thus, RWI enables us to create labeled datasets that can be used to develop and validate ML models for IoT data trust evaluation. We also extract new features from IoT time-series sensor data that effectively capture its autocorrelation as well as its cross-correlation with the data of the neighboring (peer) sensors. These features can be used to learn ML models for recognizing the trustworthiness of IoT sensor data. Equipped with our synthesized ground-truth-labeled datasets and informative correlation-based features, we conduct extensive experiments to critically examine various approaches to evaluating IoT data trust via ML. The results reveal that commonly used ML-based approaches to IoT data trust evaluation, which rely on unsupervised cluster analysis to assign trust labels to unlabeled data, perform poorly. This poor performance is due to the underlying assumption that clustering provides reliable labels for data trust, which is found to be untenable. The results also indicate that ML models, when trained on datasets augmented via RWI and using the proposed features, generalize well to unseen data and surpass existing related approaches. 
Moreover, we observe that a semi-supervised ML approach that requires only about 10% of the data labeled offers competitive performance while being practically more appealing compared to the fully supervised approaches. The related Python code and data are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
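Note on record 66: the abstract mentions features that capture a sensor's autocorrelation and its cross-correlation with peer sensors, but does not define them. The following is a minimal sketch of one plausible reading, not the authors' feature set. The function names, the window length, and the lag are all assumptions made for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def lag_autocorrelation(series, lag=1):
    """Correlation between a series and a copy of itself shifted by `lag` steps."""
    return pearson(series[:-lag], series[lag:])


def correlation_features(sensor, peer, window=8, lag=1):
    """Hypothetical per-window features: (autocorrelation, cross-correlation with a peer).

    Windows are non-overlapping; a trailing partial window is ignored.
    """
    features = []
    for start in range(0, len(sensor) - window + 1, window):
        s = sensor[start:start + window]
        p = peer[start:start + window]
        features.append((lag_autocorrelation(s, lag), pearson(s, p)))
    return features
```

Each tuple could then be fed to a classifier as one feature row. How windows are chosen and how many peers are combined are design decisions the abstract leaves open.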
67. Genetic diversity and IUCN Red List status.
- Author
-
Schmidt, Chloé, Hoban, Sean, Hunter, Margaret, Paz‐Vinas, Ivan, and Garroway, Colin J.
- Subjects
*GENETIC variation , *GENETIC drift , *BIOLOGICAL extinction , *INBREEDING , *GENETIC correlations , *ENDANGERED species - Abstract
The International Union for Conservation of Nature (IUCN) Red List is an important and widely used tool for conservation assessment. The IUCN uses information about a species' range, population size, habitat quality and fragmentation levels, and trends in abundance to assess extinction risk. Genetic diversity is not considered, although it affects extinction risk. Declining populations are more strongly affected by genetic drift and higher rates of inbreeding, which can reduce the efficiency of selection, lead to fitness declines, and hinder species' capacities to adapt to environmental change. Given the importance of conserving genetic diversity, attempts have been made to find relationships between red‐list status and genetic diversity. Yet, there is still no consensus on whether genetic diversity is captured by the current IUCN Red List categories in a way that is informative for conservation. To assess the predictive power of correlations between genetic diversity and IUCN Red List status in vertebrates, we synthesized previous work and reanalyzed data sets based on 3 types of genetic data: mitochondrial DNA, microsatellites, and whole genomes. Consistent with previous work, species with higher extinction risk status tended to have lower genetic diversity for all marker types, but these relationships were weak and varied across taxa. Regardless of marker type, genetic diversity did not accurately identify threatened species for any taxonomic group. Our results indicate that red‐list status is not a useful metric for informing species‐specific decisions about the protection of genetic diversity and that genetic data cannot be used to identify threat status in the absence of demographic data. Thus, there is a need to develop and assess metrics specifically designed to assess genetic diversity and inform conservation policy, including policies recently adopted by the UN's Convention on Biological Diversity Kunming‐Montreal Global Biodiversity Framework. 
[ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
68. Lipid profiles and production performance responses of laying hens to dietary Moringa oleifera leaf meal: systematic review and meta-analysis.
- Author
-
Ogbuewu, Ifeanyichukwu P. and Mbajiorgu, Christian A.
- Abstract
The inclusion of Moringa oleifera leaf meal (MLM) in chicken diets, especially in developing countries, is on the increase due to scarcity of traditional feedstuffs. Therefore, this investigation aimed to explore the effects of MLM on lipid profiles and production characteristics of laying hens. Twenty-three publications retrieved from Web of Science, PubMed, Scopus and Google Scholar search engines were used for the analysis. Data from the 23 studies were analysed using a random-effects model in OpenMEE software. Results were presented as standardised mean difference (SMD) at a 95% confidence interval. The results show significant improvement in feed conversion ratio (SMD = −0.49; p < .001), egg mass (SMD = 0.35; p = .003), Haugh unit (SMD = 0.39; p < .001), eggshell thickness (SMD = 0.63; p < .001) and eggshell weight (SMD = 0.45; p < .001) at a reduced feed intake. On the other hand, egg weight, hen-day egg production and blood high-density lipoprotein cholesterol were not statistically different from controls. Results reveal that dietary MLM enhanced blood cholesterol, low-density lipoprotein (LDL) cholesterol, triglycerides and yolk cholesterol concentrations in laying hens. Significant heterogeneity was present, and meta-regression revealed that study country, number of hens, housing system, hen age, inclusion level and layer strains were predictors of the treatment effect. In conclusion, the results of this meta-analysis suggest that inclusion of MLM in the diet of laying hens improved feed conversion ratio, aspects of egg quality and blood/yolk cholesterol concentrations in laying hens at a reduced feed intake. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
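Note on record 68: the abstract reports pooled standardised mean differences from a random-effects model and mentions significant heterogeneity, without naming the estimator. A minimal sketch of one common choice, the DerSimonian-Laird estimator with an I² heterogeneity statistic, is given below. It is an assumption that this matches what OpenMEE was configured to do; the function name and the returned keys are hypothetical.

```python
def random_effects_pool(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    effects   -- per-study effect sizes (e.g. SMDs)
    variances -- per-study sampling variances of those effects
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                       # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0        # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = (1.0 / sum(w_re)) ** 0.5
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # I² in percent
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Made-up inputs for illustration only; these are not data from the record above.
summary = random_effects_pool([0.2, 0.5, 0.9], [0.04, 0.05, 0.06])
```

The 1.96 multiplier gives a normal-approximation 95% interval. Other between-study variance estimators, such as restricted maximum likelihood, would give somewhat different pooled values.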
69. Trends in Published Comparative and International Education Research, 2014–2020, with a Focus on Global South and Non-academic Authors
- Author
-
Wiseman, Alexander W.
- Published
- 2022
- Full Text
- View/download PDF
70. Genetic algorithms and their applications to synthetic data generation
- Author
-
Chen, Yingrui, Elliot, Mark, and Smith, Duncan
- Subjects
Machine Learning ,Data Privacy ,Genetic Algorithms ,Data Synthesis - Abstract
Data synthesis is a statistical disclosure control technique that prevents the leakage of personal information from survey data. Rubin, who originally proposed this technique, treated the confidential data within a dataset as missing and then replaced those data using multiple imputation [103]. Most methods in data synthesis were then developed based on this principle. However, data synthesis is a multi-objective problem that aims to maximise information utility while minimising disclosure risks, and these methods have no explicit mechanism for balancing the objectives. This issue is the basis for the line of enquiry embodied in this thesis. The need to optimise competing objectives suggests the possible use of iterative machine learning techniques for data synthesis, but, to date, investigations of this possibility have been limited. In the thesis, a new synthesis method using Genetic Algorithms (GAs) is introduced. GAs are evolutionary computational methods that simulate natural evolution. They allow candidates (which in this thesis are datasets) to compete, reproduce and mate in a pre-determined environment until one or more of them perfectly fits the environment (which is defined by a set of objectives). GAs were first used on binary strings, and they now have variants that deal with different problems and data forms. In this thesis, a GA data synthesiser whose candidates are matrix and real-coded data is designed, and most of its parameters and hyper-parameters are tested. A new information utility function to measure the overall divergence from synthetic data to the original data is used. The results of running the synthesiser on a real dataset are presented; they show that the GA approach successfully produced plausible synthetic data using a single utility objective and that GAs were able to seek a trade-off between information utility and disclosure risks during the process of synthesising.
The overall conclusion is that GAs represent a significant opportunity for the practice of data synthesis.
- Published
- 2020
71. Generative neural data synthesis for autonomous systems
- Author
-
Jegorova, Marija, Hospedales, Timothy, Mistry, Michael, and Ramamoorthy, Subramanian
- Subjects
GANs ,data synthesis ,data augmentation - Abstract
A significant number of Machine Learning methods for automation currently rely on data-hungry training techniques. The lack of accessible training data often represents an insurmountable obstacle, especially in the fields of robotics and automation, where acquiring new data can be far from trivial. Additional data acquisition is not only often expensive and time-consuming, but occasionally is not even an option. Furthermore, the real world applications sometimes have commercial sensitivity issues associated with the distribution of the raw data. This doctoral thesis explores bypassing the aforementioned difficulties by synthesising new realistic and diverse datasets using the Generative Adversarial Network (GAN). The success of this approach is demonstrated empirically through solving a variety of case-specific data-hungry problems, via application of novel GAN-based techniques and architectures. Specifically, it starts with exploring the use of GANs for the realistic simulation of the extremely high-dimensional underwater acoustic imagery for the purpose of training both teleoperators and autonomous target recognition systems. We have developed a method capable of generating realistic sonar data of any chosen dimension by image-translation GANs with Markov principle. Following this, we apply GAN-based models to robot behavioural repertoire generation, that enables a robot manipulator to successfully overcome unforeseen impedances, such as unknown sets of obstacles and random broken joints scenarios. Finally, we consider dynamical system identification for articulated robot arms. We show how using diversity-driven GAN models to generate exploratory trajectories can allow dynamic parameters to be identified more efficiently and accurately than with conventional optimisation approaches. Together, these results show that GANs have the potential to benefit a variety of robotics learning problems where training data is currently a bottleneck.
- Published
- 2020
- Full Text
- View/download PDF
72. Data Synthesis for Alfalfa Biomass Yield Estimation
- Author
-
Jonathan Vance, Khaled Rasheed, Ali Missaoui, and Frederick W. Maier
- Subjects
machine learning ,data synthesis ,generative models ,alfalfa ,biomass ,precision agriculture ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Alfalfa is critical to global food security, and its data is abundant in the U.S. nationally, but often scarce locally, limiting the potential performance of machine learning (ML) models in predicting alfalfa biomass yields. Training ML models on local-only data results in very low estimation accuracy when the datasets are very small. Therefore, we explore synthesizing non-local data to estimate biomass yields labeled as high, medium, or low. One option to remedy scarce local data is to train models using non-local data; however, this only works about as well as using local data. Therefore, we propose a novel pipeline that trains models using data synthesized from non-local data to estimate local crop yields. Our pipeline, synthesized non-local training (SNLT pronounced like sunlight), achieves a gain of 42.9% accuracy over the best results from regular non-local and local training on our very small target dataset. This pipeline produced the highest accuracy of 85.7% with a decision tree classifier. From these results, we conclude that SNLT can be a useful tool in helping to estimate crop yields with ML. Furthermore, we propose a software application called Predict Your CropS (PYCS pronounced like Pisces) designed to help farmers and researchers estimate and predict crop yields based on pretrained models.
- Published
- 2022
- Full Text
- View/download PDF
73. Effects and parameters of community-based exercise on motor symptoms in Parkinson’s disease: a meta-analysis
- Author
-
Chun-Lan Yang, Jia-Peng Huang, Ting-Ting Wang, Ying-Chao Tan, Yin Chen, Zi-Qi Zhao, Chao-Hua Qu, and Yun Qu
- Subjects
Data synthesis ,Exercise ,Movement ,Parkinson’s disease ,Prescription ,Review ,Neurology. Diseases of the nervous system ,RC346-429 - Abstract
Background: Community-based exercise is a continuation and complement to inpatient rehabilitation for Parkinson's disease and does not require a professional physical therapist or equipment. The effects, parameters, and forms of each exercise are diverse, and the effect is affected by many factors. A meta-analysis was conducted to determine the effect and the best parameters for improving motor symptoms and to explore the possible factors affecting the effect of community-based exercise. Methods: We conducted a comprehensive search of six databases: PEDro, PubMed/Medline, CENTRAL, Scopus, Embase, and WOS. Studies that compared community-based exercise with usual care were included. The intervention mainly included dance, Chinese martial arts, Nordic walking, and home-based exercise. The primary outcome measure was the Unified Parkinson’s Disease Rating Scale part III (UPDRS-III) score. The mean difference (95% CI) was used to calculate the treatment outcomes of continuous outcome variables, and the I² statistic was used to estimate the heterogeneity of the statistical analysis. We conducted subgroup analysis and meta-regression analysis to determine the optimal parameters and the most important influencing factors of the exercise effect. Results: Twenty-two studies that enrolled a total of 809 subjects were included in the analysis. Exercise had a positive effect on the UPDRS-III (MD = -5.83; 95% CI, -8.29 to -3.37), Timed Up and Go test (MD = -2.22; 95% CI, -3.02 to -1.42), UPDRS (MD = -7.80; 95% CI, -10.98 to -6.42), 6-Minute Walk Test (MD = 68.81; 95% CI, 32.14 to 105.48), and Berg Balance Scale (MD = 4.52; 95% CI, 2.72 to 5.78) scores. However, the heterogeneity of each included study was obvious. Weekly frequency, age, and duration of treatment were all factors that potentially influenced the effect. Conclusions: This meta-analysis suggests that community-based exercise may benefit motor function in patients with PD.
The most commonly used modalities of exercise were tango and tai chi, and the most common prescription was 60 min twice a week. Future studies should consider the influence of age, duration of treatment, and weekly frequency on the effect of exercise. PROSPERO trial registration number CRD42022327162.
- Published
- 2022
- Full Text
- View/download PDF
74. A systematic review and future research agenda on detection of polycystic ovary syndrome (PCOS) with computer-aided techniques
- Author
-
Sayma Alam Suha and Muhammad Nazrul Islam
- Subjects
Polycystic ovary syndrome (PCOS) ,Computer-assisted methods ,Systematic literature review (SLR) ,Data synthesis ,Future research scopes ,Science (General) ,Q1-390 ,Social sciences (General) ,H1-99 - Abstract
Polycystic Ovary Syndrome (PCOS) is among the most prevalent endocrinological abnormalities seen in reproductive female bodies posing serious health hazards. The correctness of interpreting this condition depends heavily on the wide spectrum of associated symptoms and the doctor's expertise, making real-time clinical detection quite challenging. Thus, investigations on computer-aided PCOS detection systems have recently been explored by several researchers worldwide as a potential replacement for manual assessment. This review study's objective is to analyze the relevant research works on computer-assisted methods for automatically identifying PCOS through a systematic literature review (SLR) methodology as well as investigate the research limitations and explore potential future research scopes in this domain. 28 articles have been selected using the PRISMA approach based on a set of inclusion-exclusion criteria for conducting the review. The data synthesis of the selected articles has been conducted using six data exploration themes. As outcomes, the SLR explored the topical association between the studies; their research profiles; objectives; data size, type, and sources; methodologies applied for the detection of PCOS; and lastly the research outcomes along with their evaluation measures and performances. The study also highlights areas for future research directions examining the study gaps to enhance the current efforts for autonomous PCOS identification; such as integrating advanced techniques with the current methods; developing interactive software systems; exploring deep learning and unsupervised machine learning techniques; enhancing datasets and country context; and investigating more unknown factors behind PCOS. Thus, this SLR provides a state-of-the-art paradigm of autonomous PCOS detection which will support significantly efficient clinical assessment, diagnosis and treatment of PCOS.
- Published
- 2023
- Full Text
- View/download PDF
75. Pesticide effects on soil fauna communities—A meta‐analysis.
- Author
-
Beaumelle, Léa, Tison, Léa, Eisenhauer, Nico, Hines, Jes, Malladi, Sandhya, Pelosi, Céline, Thouvenot, Lise, and Phillips, Helen R. P.
- Subjects
*SOIL animals , *PESTICIDES , *AGRICULTURAL pests , *INVERTEBRATE communities , *GROWING season , *ECOSYSTEM health - Abstract
Soil invertebrate communities represent a significant fraction of global biodiversity and play crucial roles in ecosystems. A number of human activities threaten soil communities, in particular intensive agricultural practices such as pesticide use. However, there is currently no quantitative synthesis of the impacts of pesticides on soil fauna communities. Here, using a meta‐analysis of 54 studies and 294 observations, we quantify pesticide effects on the abundance, biomass, richness and diversity of natural soil fauna communities across a wide range of environmental contexts. We also identify scenarios with the most detrimental effects on soil fauna communities by analysing the effects of different pesticides (herbicides, fungicides, insecticides, broad‐spectrum substances and multiple substances), different application rates and temporal extents (short‐ or long‐term), as well as the response of different functional groups of soil animals (body size categories, presence of exoskeleton). Pesticides overall decreased the abundance and diversity of soil fauna communities across studies (grand mean effect size (Hedges' g) = −0.30 ± 0.16) and had stronger effects on soil fauna diversity than abundance. The most detrimental scenarios involved multiple substances, broad‐spectrum substances and insecticides, which significantly decreased soil fauna diversity even at recommended rates. We found no evidence that pesticide effects dampen over time, as short‐term and long‐term studies exhibited similar mean effect sizes. Policy implications: Our study highlights that pesticide use has significant detrimental non‐target effects on soil biodiversity, eroding a substantial part of global biodiversity and threatening ecosystem health. This provides crucial evidence supporting recent policies, such as the European Green Deal, that aim to reduce pesticide use in agriculture to conserve biodiversity.
The detrimental effects of multiple substances revealed here are particularly concerning because realistic pesticide use often combines several substances targeting different pests and diseases over the crop season. We suggest that future guidelines for pesticide registration, restrictions and banning should rely on data able to fully capture the long‐term consequences of multiple substances for multiple non‐target species in realistic conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
76. Laparoscopic versus ultrasound-guided transversus abdominis plane block for postoperative pain management in minimally invasive colorectal surgery: a meta-analysis protocol.
- Author
-
Wenming Yang, Tao Yuan, Zhaolun Cai, Qin Ma, Xueting Liu, Hang Zhou, Siyuan Qiu, and Lie Yang
- Subjects
POSTOPERATIVE pain treatment ,TRANSVERSUS abdominis muscle ,MINIMALLY invasive procedures ,INFLAMMATORY bowel diseases ,SURGICAL site infections - Abstract
Introduction: Transversus abdominis plane block (TAPB) is now commonly administered for postoperative pain control and reduced opioid consumption in patients undergoing major colorectal surgeries, such as colorectal cancer, diverticular disease, and inflammatory bowel disease resection. However, there remain several controversies about the effectiveness and safety of laparoscopic TAPB compared to ultrasound-guided TAPB. Therefore, the aim of this study is to integrate both direct and indirect comparisons to identify a more effective and safer TAPB approach. Materials and methods: Systematic electronic literature surveillance will be performed in the PubMed, Embase, Cochrane Central Register of Controlled Trials (CENTRAL), and ClinicalTrials.gov databases for eligible studies through July 31, 2023. The Cochrane Risk of Bias version 2 (RoB 2) and Risk of Bias in Non-randomized Studies of Interventions (ROBINS-I) tools will be applied to scrutinize the methodological quality of the selected studies. The primary outcomes will include (1) opioid consumption at 24 hours postoperatively and (2) pain scores at 24 hours postoperatively both at rest and at coughing and movement according to the numerical rating scale (NRS). Additionally, the probability of TAPB-related adverse events, overall postoperative 30-day complications, postoperative 30-day ileus, postoperative 30-day surgical site infection, postoperative 7-day nausea and vomiting, and length of stay will be analyzed as secondary outcome measures. The findings will be assessed for robustness through subgroup analyses and sensitivity analyses. Data analyses will be performed using RevMan 5.4.1 and Stata 17.0. P value of less than 0.05 will be defined as statistically significant. The certainty of evidence will be examined via the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) working group approach. 
Ethics and dissemination: Owing to the nature of the secondary analysis of existing data, no ethical approval will be required. Our meta-analysis will summarize all the available evidence for the effectiveness and safety of TAPB approaches for minimally invasive colorectal surgery. High-quality peer-reviewed publications and presentations at international conferences will facilitate disseminating the results of this study, which are expected to inform future clinical trials and help anesthesiologists and surgeons determine the optimal tailored clinical practice for perioperative pain management. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
77. Social Science and Consensus in Estimates of the US Jewish Population: Response to Sasson and DellaPergola.
- Author
-
Saxe, Leonard, Tighe, Elizabeth, Magidin de Kramer, Raquel, Nussbaum, Daniel, and Parmer, Daniel
- Subjects
*JUDAISM , *JEWISH children , *JEWISH communities , *JEWISH studies , *JEWISH identity , *CONSENSUS (Social sciences) - Abstract
In response to Isaac Sasson and Sergio DellaPergola's commentaries on our assessment of the validity of the Pew Research Center's 2020 estimate of 7.5 million US Jewish adults and children (Tighe et al. 2022), we address key points of agreement and contention in the validity of the estimate; in particular, how the Jewish population is identified and defined. We argue that Pew's definition of the Jewish population is consistent with major studies of American Jewry, from NJPS 1990 to recent local Jewish community studies. Applying a consistent definition that includes the growing group of "Jews of no religion" with one Jewish parent, as Pew Research Center does, allows for a faithful comparison across national and local studies and a more accurate understanding of levels of Jewish engagement and expressions of Jewish identity. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
78. According to their Numbers: Assessing the Pew Research Center's Estimate of 7.5 Million Jewish Americans.
- Author
-
Tighe, Elizabeth, Saxe, Leonard, Parmer, Daniel, Nussbaum, Daniel, and de Kramer, Raquel Magidin
- Subjects
*AMERICAN Jews , *INTERMARRIAGE , *JEWISH children , *RELIGIOUS groups , *JEWISH identity , *RESEARCH institutes - Abstract
The Pew Research Center's survey, Jewish Americans in 2020, was designed to provide estimates of the size of the US Jewish population, sociodemographic data on issues such as intermarriage, child-rearing, engagement in Jewish communal life, and a description of American Jewish attitudes. A sophisticated sample design was employed to ensure accurate and generalizable assessments of the population. Because Jews are a small sub-group and the US government does not collect census data on religious groups, creating estimates is a non-trivial task. The focus of this paper is on the validity of Pew's estimate of 7.5 million US Jewish adults and children, 2.4% of the overall US population. The estimate is an important standalone indicator and is the basis for assessments of current Jewish attitudes and behavior. This paper considers the underlying construct of Jewish identity and its operationalization by Pew and evaluates the convergent validity of Pew's findings. Efforts to define "who is a Jew" in sociodemographic surveys are described, and a set of methodological challenges to creating estimates is considered. The results of this review indicate that Pew's criteria for inclusion in the population estimate comport with long-standing views of how to assess the Jewish population. Furthermore, Pew's estimate of 7.5 million Jewish Americans is consistent with other recent demographic studies of the population. Their conclusions about a growing US Jewish population suggest a new narrative of American Jewish life that reflects the diversity of ways in which Jewish identity is expressed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
79. Test–retest reliability of the EUROFIT test battery: a review.
- Author
-
Grgic, Jozo
- Subjects
*RELIABILITY in engineering , *TEST reliability , *STATISTICAL reliability , *EQUILIBRIUM testing , *PHYSICAL fitness testing - Abstract
Purpose: While several studies have examined the reliability of the EUROFIT test battery, the findings are conflicting. Therefore, this paper aimed to conduct a review of studies that explored the reliability of the EUROFIT test battery. Methods: Seven databases were searched to find studies that investigated the reliability of the EUROFIT test battery. From all included studies, intra-class correlation coefficients for the nine tests used in EUROFIT were extracted. The COSMIN checklist was used to evaluate the methodological quality of the studies. Results: Six excellent quality studies were included in the review. The following findings were observed in the included studies: (a) the flamingo balance test has moderate-to-good reliability; (b) plate tapping, handgrip strength, sit-ups, bent-arm hang, 10 × 5-m agility shuttle run, and the 20-m multistage shuttle run have moderate-to-excellent reliability; and (c) the sit-and-reach and standing broad jump tests have good-to-excellent reliability. Conclusion: Overall, the findings of this review suggest that the EUROFIT can be used as a reliable battery of tests to assess physical fitness in research and practice. Still, as there were only six included studies, more research in different populations is needed. Future studies are also required to explore the influence of variables (e.g., familiarization with the exercise tests) that may impact the reliability of the EUROFIT test battery. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
80. P‐97: Applying a generative model to improve TFT measurement capacity performance.
- Author
-
Park, Kyongtae and Khim, Taeyoung
- Subjects
LANGUAGE models ,NOISE measurement ,MASS production - Abstract
AMOLED is based on current driving, so the current characteristics of TFT IV (current per voltage) have much influence. Now, as the technology of AMOLED is expanded to HOP (hybrid of Oxide and Polysilicon TFT) and UPC (Under Panel Camera), polysilicon and oxide TFT are used in combination, and TFTs of various sizes must be used at the same time. So, first, it is attempted to reduce the IV measurement points and improve them with the known interpolation method. However, there is a limit to accurate prediction due to the non‐linear characteristics of TFT and noise measurement in the off region of TFT. In this paper, we applied an asymmetric neural network structure to overcome this limitation of reduction decoding. To do this, we developed an asymmetric autoencoder to decode or reconstruct TFT IV data from small sampled IV measurements (14–35%). Then, to overcome the error estimation of generating many errors in the surrounding interpolation because noise is included in the TFT off region, a function of partially removing noise only in the off region is introduced. In addition, generative models were applied to overcome the performance decrease caused by the minimal amount of abnormal data, which should be accurately predicted in a situation close to abnormal. The IV information to be generated was encoded using a pretrained large language model to reduce the dimensionality and converted into text, which was then trained on the distillate GPT‐2 model and used as a generative model. In the general interpolation method, the performance decreases as the number of measurement samples is reduced; the method proposed in this paper maintains its performance even if the sampling is under 20%. This means that it is adequate to restore only a small portion of information using the latent space information that has learned the TFT IV saturation and linear mode characteristics.
This is applied to mass production by increasing the measurement capability without additional equipment investment. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
81. Exploring the utility of synthetic data to extract more value from sensitive health data assets: A focused example in perinatal epidemiology.
- Author
-
Braddon, Amy Elise, Robinson, Suzanne, Alati, Rosa, and Betts, Kim S.
- Subjects
- *
PRENATAL exposure , *DATA privacy , *ELECTRONIC health records , *AFFECTIVE disorders , *BIRTH weight , *LOGISTIC regression analysis - Abstract
Background: Privacy, access and security concerns can hinder the availability of health data for research. The use of synthesised data in place of de‐identified electronic health records (EHRs) presents an opportunity to conduct research while minimising privacy concerns. Objectives: To examine whether synthesised data can replicate two prenatal epidemiological associations: between prenatal smoking and lower birthweight, and between prenatal mood disorders and lower birthweight, using data synthesised from de‐identified health administrative data collections. Methods: We generated two synthetic datasets, using parametric and non‐parametric data generating methods, and examined the synthetic data for evidence of privacy concerns. Next, univariable and multivariable logistic regression was utilised to estimate the associations in both synthetic datasets, with results then compared to the real data. Results: Both synthesised datasets performed well in identifying the reduction in birthweight associated with prenatal smoking, while the non‐parametric data underestimated the reduction in birthweight associated with prenatal mood disorders. Improbable relationships between some variables were identified in the parametric synthesised data, however, these can be addressed with simple rules during data synthesis. No duplicate rows (i.e., exact copies of de‐identified data) were found in the parametric data, while only 0.6% of the rows in the non‐parametric data were duplicated. Conclusions: Both synthesised datasets performed well in replicating the statistical properties of the original data while addressing privacy issues. Data synthesis methods provide an opportunity for researchers to utilise health data while managing privacy and security concerns. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
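The duplicate-row check mentioned in the abstract above (synthetic rows that are exact copies of de-identified rows) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `duplicate_fraction` and the toy rows are hypothetical, and rows are assumed to be plain tuples of hashable values.

```python
def duplicate_fraction(real_rows, synthetic_rows):
    """Return the share of synthetic rows that exactly match some real row.

    Hypothetical helper: rows are compared as tuples, so a match means
    every column value is identical.
    """
    if not synthetic_rows:
        raise ValueError("synthetic_rows must not be empty")
    real_set = {tuple(row) for row in real_rows}
    duplicates = sum(1 for row in synthetic_rows if tuple(row) in real_set)
    return duplicates / len(synthetic_rows)


# Toy columns (assumed for illustration): smoker flag, mood-disorder flag, birthweight in grams.
real = [(1, 0, 3200), (0, 1, 2900)]
synthetic = [(1, 0, 3200), (1, 1, 3000), (0, 0, 3100), (0, 1, 2950)]
share = duplicate_fraction(real, synthetic)
```

A non-zero share flags synthetic records that reproduce a real record verbatim, which is the kind of privacy concern the abstract reports checking for.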
82. A Meta-analysis of Responses of Broiler Chickens to Dietary Zinc Supplementation: Feed Intake, Feed Conversion Ratio and Average Daily Gain.
- Author
-
Ogbuewu, I. P., Modisaojang-Mojanaga, M. M. C., Mokolopi, B. G., and Mbajiorgu, C. A.
- Abstract
The importance of zinc (Zn) in broiler chicken nutrition is gaining attention due to the realization of its role in several enzymes and metabolic functions. This meta-analysis, therefore, aimed to synthesize pooled evidence on the effectiveness of Zn supplementation on enhancing feed intake, feed conversion ratio (FCR) and average daily gain (ADG) in broiler chickens. Thirty-seven peer-reviewed studies out of 436 identified from the search carried out in Scopus, Google Scholar and PubMed databases met the criteria for inclusion in this meta-analysis. Data were pooled and then disaggregated for moderators: broiler strains, sources of Zn, duration of Zn supplementation and Zn supplementation levels. All the analyses were conducted in Open Meta-analyst for Ecology and Evolution (OpenMEE) software. Pooled results indicate that Zn supplementation increased feed intake [standardised mean difference (SMD) = 0.34 g/bird/day; 95% confidence interval (CI) 0.27, 0.42] and ADG (SMD = 0.43 g/bird/day; 95% CI 0.35, 0.50) in broiler chickens in comparison with the controls. Dietary Zn supplementation improves FCR (SMD = − 0.16 g/g; 95% CI − 0.20, − 0.11), taking heterogeneity and publication biases into account. Restricted subanalysis showed that studied moderators influenced the outcomes of the meta-analysis. Meta-regression revealed that moderators explain about 38% of the sources of variations in the present study. This meta-analysis suggests that dietary zinc supplementation had a positive effect on growth performance indices in broiler chickens. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
83. Research on the Simulation Method of HTTP Traffic Based on GAN
- Author
-
Chenglin Yang, Dongliang Xu, and Xiao Ma
- Subjects
GAN ,HTTP stream ,traffic feature mimicry ,data synthesis ,network data ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Due to the increasing severity of network security issues, training corresponding detection models requires large datasets. In this work, we propose a novel method based on generative adversarial networks to synthesize network data traffic. We introduced a network traffic data normalization method based on Gaussian mixture models (GMM), and for the first time, incorporated a generator based on the Swin Transformer structure into the field of network traffic generation. To further enhance the robustness of the model, we mapped real data through an AE (autoencoder) module and optimized the training results in the form of evolutionary algorithms. We validated the training results on four different datasets and introduced four additional models for comparative experiments in the experimental evaluation section. Our proposed SEGAN outperformed other state-of-the-art network traffic emulation methods.
- Published
- 2024
- Full Text
- View/download PDF
84. Element Information Enhancement for Diagram Question Answering with Synthetic Data
- Author
-
Zhang, Yadong, Chen, Yang, Ren, Yupei, Lan, Man, Chen, Yuefeng, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Zhang, Ningyu, editor, Wang, Meng, editor, Wu, Tianxing, editor, Hu, Wei, editor, and Deng, Shumin, editor
- Published
- 2022
- Full Text
- View/download PDF
85. Towards Real-World HDRTV Reconstruction: A Data Synthesis-Based Approach
- Author
-
Cheng, Zhen, Wang, Tao, Li, Yong, Song, Fenglong, Chen, Chang, Xiong, Zhiwei, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
86. BézierPalm: A Free Lunch for Palmprint Recognition
- Author
-
Zhao, Kai, Shen, Lei, Zhang, Yingyi, Zhou, Chuhan, Wang, Tao, Zhang, Ruixin, Ding, Shouhong, Jia, Wei, Shen, Wei, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
87. Data Synthesis and Iterative Refinement for Neural Semantic Parsing without Annotated Logical Forms
- Author
-
Wu, Shan, Chen, Bo, Han, Xianpei, Sun, Le, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Sun, Maosong, editor, Liu, Yang, editor, Che, Wanxiang, editor, Feng, Yang, editor, Qiu, Xipeng, editor, Rao, Gaoqi, editor, and Chen, Yubo, editor
- Published
- 2022
- Full Text
- View/download PDF
88. Comparing the Utility and Disclosure Risk of Synthetic Data with Samples of Microdata
- Author
-
Little, Claire, Elliot, Mark, Allmendinger, Richard, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Domingo-Ferrer, Josep, editor, and Laurent, Maryline, editor
- Published
- 2022
- Full Text
- View/download PDF
89. 3D Reconstruction of Medical Image Based on Improved Ray Casting Algorithm
- Author
-
Yu, Wang, Ning, Gong, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, El Yacoubi, Mounîm, editor, Granger, Eric, editor, Yuen, Pong Chi, editor, Pal, Umapada, editor, and Vincent, Nicole, editor
- Published
- 2022
- Full Text
- View/download PDF
90. Systematic Review and Evidence-Based Research in Dentistry
- Author
-
Tabatabaei, Fahimeh and Tayebi, Lobat
- Published
- 2022
- Full Text
- View/download PDF
91. Software Application Profile: The Anchored Multiplier calculator—a Bayesian tool to synthesize population size estimates
- Author
-
Wesson, Paul D, McFarland, Willi, Qin, Cong Charlie, and Mirzazadeh, Ali
- Subjects
Mathematical Sciences ,Statistics ,Bayes Theorem ,Female ,HIV Infections ,Humans ,Iran ,Models ,Statistical ,Population Density ,Population Surveillance ,Sex Workers ,Software ,Bayesian modelling ,population size estimation ,key populations ,data synthesis ,Public Health and Health Services ,Epidemiology ,Public health - Abstract
Estimating the number of people in hidden populations is needed for public health research, yet available methods produce highly variable and uncertain results. The Anchored Multiplier calculator uses a Bayesian framework to synthesize multiple population size estimates to generate a consensus estimate. Users submit point estimates and lower/upper bounds which are converted to beta probability distributions and combined to form a single posterior probability distribution. The Anchored Multiplier calculator is available as a web browser-based application. The software allows for unlimited empirical population size estimates to be submitted and combined according to Bayes Theorem to form a single estimate. The software returns output as a forest plot (to visually compare data inputs and the final Anchored Multiplier estimate) and a table that displays results as population percentages and counts. The web application 'Anchored Multiplier Calculator' is free software and is available at [http://globalhealthsciences.ucsf.edu/resources/tools] or directly at [http://anchoredmultiplier.ucsf.edu/].
- Published
- 2019
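The combination step described in the abstract above (estimates with bounds converted to beta distributions, then merged into one posterior) can be sketched as follows. This is an illustration under stated assumptions, not the calculator's actual code: the method-of-moments conversion, the treatment of the bounds as a 95% interval, and the function names `beta_from_estimate` and `combine_betas` are all assumptions made for this example.

```python
def beta_from_estimate(point, lower, upper):
    """Convert a proportion estimate with bounds into Beta(a, b) parameters.

    Assumption: the bounds are a symmetric 95% interval, so the standard
    deviation is (upper - lower) / (2 * 1.96); parameters follow from the
    method of moments.
    """
    sd = (upper - lower) / (2 * 1.96)
    variance = sd ** 2
    common = point * (1 - point) / variance - 1
    if common <= 0:
        raise ValueError("interval too wide for a beta distribution")
    return point * common, (1 - point) * common


def combine_betas(params):
    """Multiply beta densities: the product is proportional to another beta.

    For n densities, x^(a_i - 1) terms multiply to x^(sum(a) - n), which is
    the kernel of Beta(sum(a) - (n - 1), sum(b) - (n - 1)).
    """
    n = len(params)
    a = sum(p[0] for p in params) - (n - 1)
    b = sum(p[1] for p in params) - (n - 1)
    if a <= 0 or b <= 0:
        raise ValueError("combined parameters must be positive")
    return a, b


# Two hypothetical estimates of the same population proportion.
first = beta_from_estimate(0.5, 0.5 - 1.96 * 0.1, 0.5 + 1.96 * 0.1)
second = beta_from_estimate(0.5, 0.5 - 1.96 * 0.1, 0.5 + 1.96 * 0.1)
a_post, b_post = combine_betas([first, second])
consensus_mean = a_post / (a_post + b_post)
```

Multiplying the consensus proportion by the size of the reference population would give a count, which matches the abstract's description of results shown as both percentages and counts.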
92. Towards automated molecular detection through simulated generation of CMOS-based rotational spectroscopy
- Author
-
Yasamin Fozouni, Eric C. Larson, and Bruce Gnade
- Subjects
Rotational spectroscopy ,Molecular detection ,Data synthesis ,Science (General) ,Q1-390 ,Social sciences (General) ,H1-99 - Abstract
The use of CMOS sensors for rotational spectroscopy is a promising, but challenging avenue for low-cost gas sensing and molecular identification. A main challenge in this approach is that practical CMOS spectroscopy samples contain various different noise sources that reduce the effectiveness of matching techniques for molecular identification with rotational spectroscopy. To help solve this challenge, we develop a software application tool that can demonstrate the feasibility and reliability of detection with CMOS sensor samples. Specifically, the tool characterizes the types of noise in CMOS sample collection and synthesizes spectroscopy files based upon existing databases of rotational spectroscopy samples gathered from other sensors. We use the software to create a large database of plausible CMOS-generated sample files of gases. This dataset is used to help evaluate spectral matching algorithms used in gas sensing and molecular identification applications. We evaluate these traditional methods on the synthesized dataset and discuss how peak finding and spectral matching algorithms can be altered to accommodate the noise sources present in CMOS sample collection.
- Published
- 2023
- Full Text
- View/download PDF
93. Laparoscopic versus ultrasound-guided transversus abdominis plane block for postoperative pain management in minimally invasive colorectal surgery: a meta-analysis protocol
- Author
-
Wenming Yang, Tao Yuan, Zhaolun Cai, Qin Ma, Xueting Liu, Hang Zhou, Siyuan Qiu, and Lie Yang
- Subjects
transversus abdominis plane block ,postoperative pain management ,minimally invasive ,colorectal surgery ,data synthesis ,Neoplasms. Tumors. Oncology. Including cancer and carcinogens ,RC254-282 - Abstract
Introduction: Transversus abdominis plane block (TAPB) is now commonly administered for postoperative pain control and reduced opioid consumption in patients undergoing major colorectal surgeries, such as colorectal cancer, diverticular disease, and inflammatory bowel disease resection. However, there remain several controversies about the effectiveness and safety of laparoscopic TAPB compared to ultrasound-guided TAPB. Therefore, the aim of this study is to integrate both direct and indirect comparisons to identify a more effective and safer TAPB approach. Materials and methods: Systematic electronic literature surveillance will be performed in the PubMed, Embase, Cochrane Central Register of Controlled Trials (CENTRAL), and ClinicalTrials.gov databases for eligible studies through July 31, 2023. The Cochrane Risk of Bias version 2 (RoB 2) and Risk of Bias in Non-randomized Studies of Interventions (ROBINS-I) tools will be applied to scrutinize the methodological quality of the selected studies. The primary outcomes will include (1) opioid consumption at 24 hours postoperatively and (2) pain scores at 24 hours postoperatively both at rest and at coughing and movement according to the numerical rating scale (NRS). Additionally, the probability of TAPB-related adverse events, overall postoperative 30-day complications, postoperative 30-day ileus, postoperative 30-day surgical site infection, postoperative 7-day nausea and vomiting, and length of stay will be analyzed as secondary outcome measures. The findings will be assessed for robustness through subgroup analyses and sensitivity analyses. Data analyses will be performed using RevMan 5.4.1 and Stata 17.0. P value of less than 0.05 will be defined as statistically significant. 
The certainty of evidence will be examined via the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) working group approach. Ethics and dissemination: Owing to the nature of the secondary analysis of existing data, no ethical approval will be required. Our meta-analysis will summarize all the available evidence for the effectiveness and safety of TAPB approaches for minimally invasive colorectal surgery. High-quality peer-reviewed publications and presentations at international conferences will facilitate disseminating the results of this study, which are expected to inform future clinical trials and help anesthesiologists and surgeons determine the optimal tailored clinical practice for perioperative pain management. Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=281720, identifier (CRD42021281720).
- Published
- 2023
- Full Text
- View/download PDF
94. Tracking the global application of conservation translocation and social attraction to reverse seabird declines.
- Author
-
Spatz, Dena R., Young, Lindsay C., Holmes, Nick D., Jones, Holly P., VanderWerf, Eric A., Lyons, Donald E., Kress, Stephen, Miskelly, Colin M., and Taylor, Graeme A.
- Subjects
- *
COASTAL zone management , *DATABASES , *ECOLOGICAL resilience , *ENVIRONMENTAL degradation , *CHARADRIIFORMES - Abstract
The global loss of biodiversity has inspired actions to restore nature across the planet. Translocation and social attraction actions deliberately move or lure a target species to a restoration site to reintroduce or augment populations and enhance biodiversity and ecosystem resilience. Given limited conservation funding and rapidly accelerating extinction trajectories, tracking progress of these interventions can inform best practices and advance management outcomes. Seabirds are globally threatened and commonly targeted for translocation and social attraction (“active seabird restoration”), yet no framework exists for tracking these efforts nor informing best practices. This study addresses this gap for conservation decision makers responsible for seabirds and coastal management. We systematically reviewed active seabird restoration projects worldwide and collated results into a publicly accessible Seabird Restoration Database. We describe global restoration trends, apply a systematic process to measure success rates and response times since implementation, and examine global factors influencing outcomes. The database contains 851 active restoration events in 551 locations targeting 138 seabird species; 16% of events targeted globally threatened taxa. Visitation occurred in 80% of events and breeding occurred in 76%, on average 2 y after implementation began (SD = 3.2 y). Outcomes varied by taxonomy, with the highest and quickest breeding response rates for Charadriiformes (terns, gulls, and auks), primarily with social attraction. Given delayed and variable response times to active restoration, 5 y is appropriate before evaluating outcomes. The database and results serve as a model for tracking and evaluating restoration outcomes, and is applicable to measuring conservation interventions for additional threatened taxa. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
95. Data Synthesis for Alfalfa Biomass Yield Estimation.
- Author
-
Vance, Jonathan, Rasheed, Khaled, Missaoui, Ali, and Maier, Frederick W.
- Subjects
- *
BIOMASS estimation , *MACHINE learning , *ALFALFA , *CROP yields , *MACHINE performance , *DECISION trees - Abstract
Alfalfa is critical to global food security, and its data is abundant in the U.S. nationally, but often scarce locally, limiting the potential performance of machine learning (ML) models in predicting alfalfa biomass yields. Training ML models on local-only data results in very low estimation accuracy when the datasets are very small. Therefore, we explore synthesizing non-local data to estimate biomass yields labeled as high, medium, or low. One option to remedy scarce local data is to train models using non-local data; however, this only works about as well as using local data. Therefore, we propose a novel pipeline that trains models using data synthesized from non-local data to estimate local crop yields. Our pipeline, synthesized non-local training (SNLT pronounced like sunlight), achieves a gain of 42.9% accuracy over the best results from regular non-local and local training on our very small target dataset. This pipeline produced the highest accuracy of 85.7% with a decision tree classifier. From these results, we conclude that SNLT can be a useful tool in helping to estimate crop yields with ML. Furthermore, we propose a software application called Predict Your CropS (PYCS pronounced like Pisces) designed to help farmers and researchers estimate and predict crop yields based on pretrained models. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
96. BioNoMo: the Biodiversity Network of Mozambique.
- Author
-
Malatesta, Luca, Alves, Tereza, Attorre, Fabio, Brito, Denise, Cianciullo, Silvio, Datizua, Castigo, De Abreu, Daniela, De Felici, Stefano, De Sousa, Camila, Langa, Clayton, Mate, Boavida, Matimele, Hermenegildo, Nicosia, Enrico, Odorico, Delcio, Raiva, Raquel, Sandramo, Domingos, Santana Afonso, Paula, Sardinha, Celso, Souane, Joelma, and Timane, Renato
- Abstract
Mozambique's biodiversity richness plays a pivotal role in achieving the sustainable development of the country. However, Mozambique's flora and fauna diversity still remains broadly unknown and poorly documented. To properly address this issue, one of the strategic needs expressed by the Mozambican institutions was the development of a national biodiversity data repository to aggregate, manage and make data available online. Thus, a sustainable infrastructure for the standardisation, aggregation, organisation and sharing of primary biodiversity data was developed. Named the "Biodiversity Network of Mozambique" (BioNoMo), such a tool serves as a national repository of biodiversity data and aggregates occurrence records of plants and animals in the country obtained from floristic and faunistic observations and from specimens of biological collections. In this paper, the authors present the structure and data of BioNoMo, including software details, the process of data gathering and aggregation, the taxonomic coverage and the WebGIS development. Currently, aggregating a total of 273,172 records, including 85,092 occurrence records of plants and 188,080 occurrence records of animals (41.2% terrestrial, 58.8% aquatic), BioNoMo represents the largest aggregator of primary biodiversity data in Mozambique and it is planned to grow further by aggregating new datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
97. Computationally efficient data synthesis for AC-OPF: Integrating Physics-Informed Neural Network solvers and active learning.
- Author
-
Zhang, Jiahao, Peng, Ruo, Lu, Chenbei, and Wu, Chenye
- Subjects
- *
DATA privacy , *BILEVEL programming , *ELECTRICAL load , *DATA release , *TEST systems - Abstract
This study addresses the challenges of privacy, utility, and efficiency in releasing privacy-preserving operational data for AC Optimal Power Flow (AC-OPF) research. Traditional methods, injecting noise into operational data (i.e., demand data and dispatch profiles) within the Differential Privacy (DP) framework, often violate physical constraints within the data, leading to unrealistic and infeasible outcomes that diminish data utility. While AC-OPF-solver-based bi-level post-processing optimizations can enforce physical feasibility, the objective divergence between post-processing and AC-OPF leads to discrepancies, compromising data utility. Additionally, their non-convex and adversarial nature makes computation prohibitively expensive, further preventing efficient data release. To overcome these challenges, our research introduces a DP approach that combines strategic noise injection for demand data with the computation of corresponding dispatch profiles, ensuring the privacy-preserving data satisfy AC-OPF's physical constraints. To accelerate data release, we employ Physics-Informed Neural Networks (PINNs). This ensures solutions' physical feasibility while enhancing computational efficiency. Furthermore, we incorporate active learning to target the most informative data samples, enhancing PINN training and optimizing efficiency while maintaining solution accuracy. Comprehensive experiments on IEEE test systems reveal our approach's improved performance and accelerated computation speed over traditional methods, highlighting its efficiency in maintaining data privacy and utility and decreasing computational burden amidst diverse privacy considerations. Highlights: • Realistic, feasible, and fast data synthesis via AC-OPF-tailored PINNs. • Quantifying trade-off between demand privacy and dispatch profile accuracy. • Sampling efficient PINN training set via active learning. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
98. Enhancing link prediction in graph data augmentation through graphon mixup.
- Author
-
Sultana, Tangina, Hossain, Md. Delowar, Morshed, Md. Golam, and Lee, Young-Koo
- Abstract
Link prediction in complex networks is a fundamental problem with applications in diverse domains, from social networks to biological systems. Traditional approaches often struggle to capture intricate relationships in graphs, leading to suboptimal predictions. To address this, we introduce a novel method called graphon mixup (GM), which leverages the power of graphons to enhance link prediction. The augmentation strategy involves generating a synthetic graph by combining the original graph with a graphon-based synthetic graph. This process, expressed as a weighted combination of adjacency matrices, strategically blends real and synthetic information, enriching the training dataset. GM formulates link prediction as a joint optimization problem, aligning the characteristics of the synthetic graph with the true underlying structure. The objective is to minimize cross-entropy loss between predicted and true edge probabilities. A detailed computational complexity analysis evaluates the time and space requirements, aiding in understanding the efficiency and scalability of GM across different datasets and network sizes. Empirical validation on benchmark datasets demonstrates GM’s effectiveness in consistently improving average precision across diverse network types. The proposed method enhances the generalization capabilities of link prediction models, providing a more robust framework capable of accurate predictions even in the presence of noise or unseen patterns. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
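The augmentation step described in the abstract above, a weighted combination of the original adjacency matrix with a graphon-based synthetic one, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the graphon `W(x, y) = x * y`, the mixing weight `lam`, and the function names are assumptions chosen for the example, and the joint optimization described in the abstract is not shown.

```python
import numpy as np


def sample_graphon_graph(graphon, n, rng):
    """Sample an undirected simple graph on n nodes from a graphon W(x, y).

    Each node gets a latent position u_i ~ Uniform(0, 1); edge (i, j) is
    drawn with probability W(u_i, u_j).
    """
    u = rng.uniform(size=n)
    probs = graphon(u[:, None], u[None, :])
    draws = rng.uniform(size=(n, n)) < probs
    upper = np.triu(draws, k=1)  # keep i < j only, so there are no self-loops
    return np.logical_or(upper, upper.T).astype(float)


def mixup_adjacency(a_real, a_synth, lam):
    """Blend two adjacency matrices: lam * A_real + (1 - lam) * A_synth."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * a_real + (1.0 - lam) * a_synth


rng = np.random.default_rng(0)
# Hypothetical observed graph: a 4-node path.
a_real = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
a_synth = sample_graphon_graph(lambda x, y: x * y, n=4, rng=rng)
a_mixed = mixup_adjacency(a_real, a_synth, lam=0.7)
```

The blended matrix has fractional entries, which can be read as soft edge weights for training a link predictor; how the paper actually consumes the mixed graph is not stated in the abstract.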
99. Data Synthesis
- Author
-
Zhang, Ting, Schintler, Laurie A., editor, and McNeely, Connie L., editor
- Published
- 2022
- Full Text
- View/download PDF
100. Ad-RuLer: A Novel Rule-Driven Data Synthesis Technique for Imbalanced Classification
- Author
-
Xiao Zhang, Iván Paz, Àngela Nebot, Francisco Mugica, and Enrique Romero
- Subjects
rule-based approach ,oversampling ,data synthesis ,imbalanced data ,classification ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
When classifiers face imbalanced class distributions, they often misclassify minority class samples, consequently diminishing the predictive performance of machine learning models. Existing oversampling techniques predominantly rely on the selection of neighboring data via interpolation, with less emphasis on uncovering the intrinsic patterns and relationships within the data. In this research, we present the usefulness of an algorithm named RuLer to deal with the problem of classification with imbalanced data. RuLer is a learning algorithm initially designed to recognize new sound patterns within the context of the performative artistic practice known as live coding. This paper demonstrates that this algorithm, once adapted (Ad-RuLer), has great potential to address the problem of oversampling imbalanced data. An extensive comparison with other mainstream oversampling algorithms (SMOTE, ADASYN, Tomek-links, Borderline-SMOTE, and KmeansSMOTE), using different classifiers (logistic regression, random forest, and XGBoost) is performed on several real-world datasets with different degrees of data imbalance. The experiment results indicate that Ad-RuLer serves as an effective oversampling technique with extensive applicability.
- Published
- 2023
- Full Text
- View/download PDF