134 results on '"Exploratory data analysis (EDA)"'
Search Results
2. Using Sentiment Analysis to Construe Tweets and the Controversy During the 2020 American Presidential Elections
- Author
Venkata Sai Sathvik, T, Sharma, Jaideep, Agarwal, Nihal, Shaan, Shaik Abdul, Gupta, Arpita, Das, Kakali, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Kumar, Sandeep, editor, Hiranwal, Saroj, editor, Garg, Ritu, editor, and Purohit, S.D., editor
- Published
- 2025
3. Netflix Data Analysis Using EDA
- Author
Jadhav, Aakanksha Ramesh, Jadhav, Ramesh D., Jadhav, Aditya, Singh, Chandrani, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Lin, Frank, editor, Pastor, David, editor, Kesswani, Nishtha, editor, Patel, Ashok, editor, Bordoloi, Sushanta, editor, and Koley, Chaitali, editor
- Published
- 2025
4. Build a Dataset of ResearchGate Members
- Author
Zaid Mundher and Yasir Ali
- Subjects
web scraping, researchgate, exploratory data analysis (eda), k-means, anova, Education, Science (General), Q1-390
- Abstract
In today's academic environment, social media platforms, ResearchGate in particular, play an essential role in facilitating collaboration and communication among researchers. Even with the wide usage of this site, there remains a significant gap in the availability of structured datasets that focus on researchers and their academic outputs. This lack of accessible data obstructs comprehensive analysis and evaluation of research trends and impact. This study seeks to address this gap by employing web scraping techniques to construct a dataset derived from ResearchGate, a leading platform for academic professionals. The introduced dataset consists of eight key features, including metrics related to publications, citations, and areas of research specialization. The availability of such a dataset not only provides a valuable resource for future research but also enables scholars to analyze research performance, identify collaboration opportunities, and uncover trends in academic productivity. Beyond dataset construction, this paper also details the exploratory data analysis (EDA) conducted to derive insights from the collected data. The K-means clustering algorithm was applied to categorize researchers according to their publication and citation patterns, offering a clearer understanding of academic achievements. The main contribution of this work is the establishment of a researcher’s dataset that can be used in future studies to analyze and examine the efforts and trends of researchers in their scientific work.
- Published
- 2025
5. IMPLEMENTATION OF BUSINESS INTELLIGENCE FOR ANALYSIS DATA OF DRUG SALES USING EXPLORATORY DATA ANALYSIS (EDA) AND VISUALIZATION DATA USING LOOKER STUDIO.
- Author
Oktafialfa, Geraldyo and Wahyu, Ari Purno
- Subjects
*BUSINESS intelligence, *DATA analysis, *DRUG prices, *QUANTITATIVE research, *DATA visualization
- Abstract
This study aims to analyze pharmaceutical sales performance by leveraging big data and advanced information technology tools, providing insights for improved business strategies. Using the Exploratory Data Analysis (EDA) method, the study processes raw sales data through spreadsheet applications to identify key patterns and trends. The findings are then visualized with Looker Studio on an intelligent dashboard tailored to the needs of the marketing team. The dashboard enables quick, data-driven decision-making by displaying performance metrics and trends relevant to sales and marketing strategies. The results reveal critical insights into sales behavior, including high-performing products, regional sales variations, and temporal sales trends. The analysis equips the marketing team with actionable data to refine their strategies, prioritize product focus, and adjust marketing efforts in response to identified patterns. In conclusion, the use of EDA and data visualization provides a structured approach to big data interpretation, allowing the company to optimize business and marketing performance. The study underscores the potential of data-driven dashboards to enhance strategic planning, suggesting broader applications in various data-intensive business environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. The Interplay Between Economic Growth and Environmental Sustainability: A Multifaceted Approach to Assessing Global Trends and National Policies.
- Author
Kankanam Pathiranage, Heshan Sameera
- Subjects
SUSTAINABLE development, SUSTAINABILITY, CARBON emissions, GROSS domestic product, ENVIRONMENTAL economics
- Abstract
This study explores the complex relationship between economic growth and environmental sustainability through a multifaceted analytical approach. Utilizing a robust dataset that includes key economic and sustainability metrics from various countries, the research employs exploratory data analysis, regression analysis, time-series analysis, and clustering techniques to unveil these relationships. The findings of the research illustrate a significant correlation between gross domestic product and carbon dioxide emissions, supporting the Environmental Kuznets Curve hypothesis, which proposes that initial economic growth worsens environmental degradation until a certain level of wealth is reached. Furthermore, the research emphasizes that while renewable energy consumption and electricity access are crucial, their current impact is insufficient to offset the environmental costs of economic growth. Through clustering analysis, the study identifies diverse national profiles, highlighting the necessity of tailored strategies to effectively address sustainability challenges. These insights emphasize the importance of integrated policies that promote both economic and environmental goals, supporting energy efficiency, cleaner technologies, and stringent environmental regulations. This comprehensive study lays the groundwork for informed policymaking aimed at achieving sustainable development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Analysis of Effects of COVID-19 Pandemic on Small- and Medium-Sized Enterprises (SMEs) in Rwanda Using Wood Firm-Level Data.
- Author
Munyemana, Emmanuel, Mung'atu, Joseph, and Ruranga, Charles
- Subjects
COVID-19 pandemic, SMALL business, TAX cuts, ECONOMIC indicators, ECONOMIC impact
- Abstract
This study assesses and quantifies the economic and financial impacts of the COVID-19 pandemic during the period of business operation restrictions countrywide (lockdown measures). We examine the strategies adopted by small and medium-sized enterprises (SMEs) to reopen their business operations after lockdown measures had been relaxed or lifted. Data were collected in Rwanda from nearly 244 SMEs across the country, providing firsthand and reliable information on the effects of the pandemic on business performance, with a particular emphasis on wood-based enterprises. We used Exploratory Data Analysis (EDA) and multivariate linear regression methods to measure the pandemic's effects on employment, sales, and tax payments among SMEs. The findings reveal that firms downsized employment by 36%, with significant deviations within different SME sizes. Small businesses were particularly affected by reduced sales levels due to the pandemic. Although there was an overall reduction in tax payments during the crisis, medium-sized enterprises experienced a more significant decrease in taxes paid to the government by 74.6%. Additionally, regression findings affirm that the COVID-19 effects on SMEs were manifested in reduced sales across all categories of SMEs, reduced employment, and a reduced amount of taxes paid to the government, which further translate to reduced economic performance during COVID-19 period. Furthermore, SME owners utilised various coping mechanisms during the reopening phase, including a reliance on savings and selling assets. The analysis recommends establishing medium-term financing mechanisms and providing technical support for SMEs to ensure a steady and sustainable recovery from the pandemic's effects, as well as enhancing their resilience to future socio-economic shocks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
8. Exploring Analytics-Related Occupations: A Data Mining Perspective on US Labor Market Insights for Analytics Students and Graduates
- Author
Riauwindu, Putranegara, Zlatev, Vladimir, Rajagopal, Series Editor, Goncalves, Marcus, editor, and Zlatev, Vladimir, editor
- Published
- 2024
9. Lung Cancer Prognosis: A Machine Learning Approach to Symptom-Based Prediction and Early Detection
- Author
Darda, Shivaan, Lu, Sophia, Jain, Reetu, Kacprzyk, Janusz, Series Editor, Novikov, Dmitry A., Editorial Board Member, Shi, Peng, Editorial Board Member, Cao, Jinde, Editorial Board Member, Polycarpou, Marios, Editorial Board Member, Pedrycz, Witold, Editorial Board Member, Alareeni, Bahaaeddin, editor, and Elgedawy, Islam, editor
- Published
- 2024
10. Bearing Fault Diagnosis Using Machine Learning Models
- Author
Chandrvanshi, Shagun, Sharma, Shivam, Singh, Mohini Preetam, Singh, Rahul, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Sharma, Devendra Kumar, editor, Peng, Sheng-Lung, editor, Sharma, Rohit, editor, and Jeon, Gwanggil, editor
- Published
- 2024
11. Extracting the Recommended Features from the Elementary School Student Dataset through Exploration Data Analysis (EDA)
- Author
Sartika, Devi, Elfaladonna, Febie, Isa, Indra Griha Tofik, Putra, Andre Mariza, Chan, Albert P. C., Series Editor, Hong, Wei-Chiang, Series Editor, Mellal, Mohamed Arezki, Series Editor, Narayanan, Ramadas, Series Editor, Nguyen, Quang Ngoc, Series Editor, Ong, Hwai Chyuan, Series Editor, Sachsenmeier, Peter, Series Editor, Sun, Zaicheng, Series Editor, Ullah, Sharif, Series Editor, Wu, Junwei, Series Editor, Zhang, Wei, Series Editor, Husni, Nyayu Latifah, editor, Caesarendra, Wahyu, editor, Aznury, Martha, editor, Novianti, Leni, editor, and Stiawan, Deris, editor
- Published
- 2024
12. Novel Framework for Image Classification Based on Patch-Based CNN Model
- Author
Gour, Ayush, Bhanodia, Praveen Kumar, Sethi, Kamal K., Rajput, Shivashankar, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Polkowski, Zdzislaw, editor, Correia, Sérgio Duarte, editor, and Virdee, Bal, editor
- Published
- 2024
13. Towards High-Performance Exploratory Data Analysis (EDA) via Stable Equilibrium Point
- Author
Song, Yuxuan, Wang, Yongyu, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
- Published
- 2024
14. Empowering students through active learning in educational big data analytics
- Author
Yun-Cheng Tsai
- Subjects
Active learning, Inquiry-based learning, ChatGPT Python APIs, Exploratory Data Analysis (EDA), Text mining, Latent Dirichlet Allocation (LDA), Special aspects of education, LC8-6691
- Abstract
Purpose: This paper explores how Educational Big Data Analytics can enhance student learning. It investigates the role of active learning in improving students’ data analysis skills and critical thinking. By actively engaging students in data analysis assessments, the aim is to equip them with the skills to navigate the data-rich educational landscape. Methods: The study uses a teaching strategy that combines structured and unstructured data analysis using Python tools and ChatGPT APIs. It presents five assignments, each highlighting data analysis skills and encouraging critical thinking. Results: The paper offers insights into how the teaching strategy effectively enhances students’ data analysis and critical thinking skills. It investigates the specific impact of active learning on students’ engagement with educational data. The study reveals that all students can complete a comprehensive project, integrating the skills they have learned in the five assignments related to educational big data while incorporating the educational implications from their respective disciplines. Conclusion: The key lies in instructors being able to design individual assignments that link practical experiences, enabling each teaching session’s effectiveness to accumulate in students’ personal experiences and practical skills, ultimately empowering them with the abilities necessary to work effectively with Educational Big Data Analytics. The findings of this study make a valuable contribution to the ongoing conversation about enhancing the educational experience for students in this data-rich era.
- Published
- 2024
15. Empowering students through active learning in educational big data analytics.
- Author
Tsai, Yun-Cheng
- Subjects
ACTIVE learning, BIG data, SELF-efficacy, CHATGPT, CRITICAL thinking, DATA analysis
- Abstract
Purpose: This paper explores how Educational Big Data Analytics can enhance student learning. It investigates the role of active learning in improving students' data analysis skills and critical thinking. By actively engaging students in data analysis assessments, the aim is to equip them with the skills to navigate the data-rich educational landscape. Methods: The study uses a teaching strategy that combines structured and unstructured data analysis using Python tools and ChatGPT APIs. It presents five assignments, each highlighting data analysis skills and encouraging critical thinking. Results: The paper offers insights into how the teaching strategy effectively enhances students' data analysis and critical thinking skills. It investigates the specific impact of active learning on students' engagement with educational data. The study reveals that all students can complete a comprehensive project, integrating the skills they have learned in the five assignments related to educational big data while incorporating the educational implications from their respective disciplines. Conclusion: The key lies in instructors being able to design individual assignments that link practical experiences, enabling each teaching session's effectiveness to accumulate in students' personal experiences and practical skills, ultimately empowering them with the abilities necessary to work effectively with Educational Big Data Analytics. The findings of this study make a valuable contribution to the ongoing conversation about enhancing the educational experience for students in this data-rich era. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Marble Powder as a Soil Stabilizer: An Experimental Investigation of the Geotechnical Properties and Unconfined Compressive Strength Analysis.
- Author
Umar, Ibrahim Haruna and Lin, Hang
- Subjects
*SOIL conditioners, *COMPRESSIVE strength, *POWDERS, *SOIL stabilization, *SOIL mechanics, *MARBLE
- Abstract
Fine-grained soils present engineering challenges. Stabilization with marble powder has shown promise for improving engineering properties. Understanding the temporal evolution of Unconfined Compressive Strength (UCS) and geotechnical properties in stabilized soils could aid strength assessment. This study investigates the stabilization of fine-grained clayey soils using waste marble powder as an alternative binder. Laboratory experiments were conducted to evaluate the geotechnical properties of soil–marble powder mixtures, including Atterberg's limits, compaction characteristics, California Bearing Ratio (CBR), Indirect Tensile Strength (ITS), and Unconfined Compressive Strength (UCS). The effects of various factors, such as curing time, molding water content, and composition ratios, on UCS, were analyzed using Exploratory Data Analysis (EDA) techniques, including histograms, box plots, and statistical modeling. The results show that the CBR increased from 10.43 to 22.94% for unsoaked and 4.68 to 12.46% for soaked conditions with 60% marble powder, ITS rose from 100 to 208 kN/m² with 60–75% marble powder, and UCS rose from 170 to 661 kN/m² after 28 days of curing, molding water content (optimum at 22.5%), and composition ratios (optimum at 60% marble powder). Complex modeling yielded R² (0.954) and RMSE (29.82 kN/m²) between predicted and experimental values. This study demonstrates the potential of utilizing waste marble powder as a sustainable and cost-effective binder for soil stabilization, transforming weak soils into viable construction materials. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. CRIME STATUS PREDICTION USING ENSEMBLE LEARNING.
- Author
JAIN, SANJAY and SINGH, PRASHANT
- Subjects
LOCATION data, CRIME, RANDOM forest algorithms, DECISION trees, CRIME statistics, DATA analysis, FORECASTING
- Abstract
This paper focuses on crime status prediction through an ensemble methodology applied to extensive datasets obtained from catalog.data.gov, specifically targeting Los Angeles crime incidents since 2020. The research methodology comprises meticulous data collection, rigorous preprocessing, exploratory data analysis, model selection, and comprehensive model evaluation. Initial challenges included data inaccuracies and privacy-preserving measures in location data, necessitating thorough cleaning and transformation processes. Exploratory data analysis revealed crucial insights, including the 'Status' attribute's limited correlation, crime code distributions, areawise crime counts, and temporal patterns. To address class imbalance within 'Status', the Synthetic Minority Oversampling Technique (SMOTE) was applied to balance the dataset. Model evaluation highlighted the superiority of random forest models employing 10 and 20 decision trees, alongside KNN, which demonstrated consistent high accuracy, balanced precision-recall trade-offs, and notable F1 scores in crime status prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Understanding and Attaining an Investment Grade Rating in the Age of Explainable AI
- Author
Makwana, Ravi, Bhatt, Dhruvil, Delwadia, Kirtan, Shah, Agam, and Chaudhury, Bhaskar
- Published
- 2024
19. Statistical Inference
- Author
Emmert-Streib, Frank, Moutari, Salissou, Dehmer, Matthias, Emmert-Streib, Frank, Moutari, Salissou, and Dehmer, Matthias
- Published
- 2023
20. Cryptocurrency Price Prediction Using Deep Learning
- Author
Tharun, S. V., Saranya, G., Tamilvizhi, T., Surendran, R., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kadry, Seifedine, editor, and Prasath, Rajendra, editor
- Published
- 2023
21. XGBoost-Based Prediction and Evaluation Model for Enchanting Subscribers in Industrial Sector
- Author
Pradeep, S., Kishore, M., Oviya, G., Poorani, S., Anitha, R., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Tanwar, Sudeep, editor, Wierzchon, Slawomir T., editor, Singh, Pradeep Kumar, editor, Ganzha, Maria, editor, and Epiphaniou, Gregory, editor
- Published
- 2023
22. Finding Recommended Feature on Student Enrolment Dataset of University XYZ Using Exploratory Data Analysis (EDA)
- Author
Zulkarnaini, Zulkarnaini, Isa, Indra Griha Tofik, Novianti, Leni, Elfaladonna, Febie, Agustri, Suzan, Zheng, Zheng, Editor-in-Chief, Xi, Zhiyu, Associate Editor, Gong, Siqian, Series Editor, Hong, Wei-Chiang, Series Editor, Mellal, Mohamed Arezki, Series Editor, Narayanan, Ramadas, Series Editor, Nguyen, Quang Ngoc, Series Editor, Ong, Hwai Chyuan, Series Editor, Sun, Zaicheng, Series Editor, Ullah, Sharif, Series Editor, Wu, Junwei, Series Editor, Zhang, Baochang, Series Editor, Zhang, Wei, Series Editor, Zhu, Quanxin, Series Editor, Zheng, Wei, Series Editor, Husni, Nyayu Latifah, editor, Caesarendra, Wahyu, editor, Aznury, Martha, editor, Novianti, Leni, editor, and Stiawan, Deris, editor
- Published
- 2023
23. Analysis of Effects of COVID-19 Pandemic on Small- and Medium-Sized Enterprises (SMEs) in Rwanda Using Wood Firm-Level Data
- Author
Emmanuel Munyemana, Joseph Mung’atu, and Charles Ruranga
- Subjects
firm, COVID-19, SMEs, wood sector, exploratory data analysis (EDA), regression models, Economics as a science, HB71-74
- Abstract
This study assesses and quantifies the economic and financial impacts of the COVID-19 pandemic during the period of business operation restrictions countrywide (lockdown measures). We examine the strategies adopted by small and medium-sized enterprises (SMEs) to reopen their business operations after lockdown measures had been relaxed or lifted. Data were collected in Rwanda from nearly 244 SMEs across the country, providing firsthand and reliable information on the effects of the pandemic on business performance, with a particular emphasis on wood-based enterprises. We used Exploratory Data Analysis (EDA) and multivariate linear regression methods to measure the pandemic’s effects on employment, sales, and tax payments among SMEs. The findings reveal that firms downsized employment by 36%, with significant deviations within different SME sizes. Small businesses were particularly affected by reduced sales levels due to the pandemic. Although there was an overall reduction in tax payments during the crisis, medium-sized enterprises experienced a more significant decrease in taxes paid to the government by 74.6%. Additionally, regression findings affirm that the COVID-19 effects on SMEs were manifested in reduced sales across all categories of SMEs, reduced employment, and a reduced amount of taxes paid to the government, which further translate to reduced economic performance during COVID-19 period. Furthermore, SME owners utilised various coping mechanisms during the reopening phase, including a reliance on savings and selling assets. The analysis recommends establishing medium-term financing mechanisms and providing technical support for SMEs to ensure a steady and sustainable recovery from the pandemic’s effects, as well as enhancing their resilience to future socio-economic shocks.
- Published
- 2024
24. Evaluation and prediction of design-time product structural analysis assistance using XGBoost and Grey Wolf Optimizer
- Author
Ali, Mohamad and Hussein, Mohammad
- Published
- 2024
25. Biology-inspired data-driven quality control for scientific discovery in single-cell transcriptomics
- Author
Ayshwarya Subramanian, Mikhail Alperovich, Yiming Yang, and Bo Li
- Subjects
scRNA-seq, Quality control (QC), Data-driven, Single cell, Adaptive QC, Exploratory data analysis (EDA), Biology (General), QH301-705.5, Genetics, QH426-470
- Abstract
Background: Quality control (QC) of cells, a critical first step in single-cell RNA sequencing data analysis, has largely relied on arbitrarily fixed data-agnostic thresholds applied to QC metrics such as gene complexity and fraction of reads mapping to mitochondrial genes. The few existing data-driven approaches perform QC at the level of samples or studies without accounting for biological variation. Results: We first demonstrate that QC metrics vary with both tissue and cell types across technologies, study conditions, and species. We then propose data-driven QC (ddqc), an unsupervised adaptive QC framework to perform flexible and data-driven QC at the level of cell types while retaining critical biological insights and improved power for downstream analysis. ddqc applies an adaptive threshold based on the median absolute deviation on four QC metrics (gene and UMI complexity, fraction of reads mapping to mitochondrial and ribosomal genes). ddqc retains over a third more cells when compared to conventional data-agnostic QC filters. Finally, we show that ddqc recovers biologically meaningful trends in gradation of gene complexity among cell types that can help answer questions of biological interest such as which cell types express the least and most number of transcripts overall, and ribosomal transcripts specifically. Conclusions: ddqc retains cell types such as metabolically active parenchymal cells and specialized cells such as neutrophils which are often lost by conventional QC. Taken together, our work proposes a revised paradigm to quality filtering best practices—iterative QC, providing a data-driven QC framework compatible with observed biological diversity.
- Published
- 2022
26. A Comparative Study on Machine Learning Based Type 2 Diabetes Mellitus Prediction
- Author
Zhan, Weiyi, Fournier-Viger, Philippe, Series Editor, Wu, Haocun, editor, Mishra, Tapas, editor, and Erokhin, Vasilii, editor
- Published
- 2022
27. Text Mining Amazon Mobile Phone Reviews: Insight from an Exploratory Data Analysis
- Author
Suhasini, S., Krishnamurthy, Vallidevi, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Goedicke, Michael, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Tröltzsch, Fredi, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Reis, Ricardo, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Kalinathan, Lekshmi, editor, R., Priyadharsini, editor, Kanmani, Madheswari, editor, and S., Manisha, editor
- Published
- 2022
28. Exploring Rural Shrink Smart Through Guided Discovery Dashboards.
- Author
BRADFORD, DENISE and VANDERPLAS, SUSAN
- Subjects
*RURAL geography, *DATA visualization, *DECISION making
- Abstract
Many small and rural places are shrinking. Interactive dashboards are the most common use cases for data visualization and context for exploratory data tools. In our paper, we will use Iowa data to explore the specific scope of how dashboards are used in small and rural areas to empower novice analysts to make data-driven decisions. Our framework will suggest a number of research directions to better support small and rural places from shrinking using an interactive dashboard design, implementation and use for the everyday analyst. [ABSTRACT FROM AUTHOR]
- Published
- 2023
29. Black Lives Matter: Exploratory Analysis of Social Media Disclosures.
- Author
Nanayakkara, A. C. and Thennakoon, G. A. D. M.
- Subjects
BLACK Lives Matter movement, DISCLOSURE, PYTHON programming language, USER-generated content, SENTIMENT analysis, SOCIAL media, PARTS of speech
- Abstract
Social media has become a contemporary platform for vox populi, enriched with the facilities of almost unrestricted access and versatility in terms of time and location of the users. The behaviors of social media users are creating a plethora of data, and is a fertile ground for Exploratory Data Analysis (EDA), which enables users to discover the veiled story within the datasets often using visual methods. This study focuses on the tragic incident of George Floyd’s death that took place on May 25, 2020, in Minneapolis, Minnesota, US, in terms of the social media responses by analyzing the corpus of comments for a selected YouTube video. Python programming language has been used to implement the EDA process using the methods of Text statistics analysis, Sentiment analysis, Ngram exploration, Topic modeling, Parts of Speech (POS) tagging, Word cloud formation, Named Entity Recognition (NER) and Text complexity analysis. By exploring the video disclosure with relevant tools, the study provides insights on the netizens, their behavior and their influence on society. This endeavor will help in preventing the manipulation of public opinion. [ABSTRACT FROM AUTHOR]
- Published
- 2022
30. Interdependence in Artificial Intelligence to Empower Worldwide COVID-19 Sensitivity
- Author
Laxmi Lydia, E., Moses Gummadi, Jose, Ranjan Pattanaik, Chinmaya, Krishna Mohan, A., Jaya Suma, G., Daniel, Ravuri, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zhang, Junjie James, Series Editor, Bindhu, V., editor, Tavares, João Manuel R. S., editor, Boulogeorgos, Alexandros-Apostolos A., editor, and Vuppalapati, Chandrasekar, editor
- Published
- 2021
31. Evaluating Machine Learning Algorithms for Marketing Data Analysis: Predicting Grocery Store Sales
- Author
Gopagoni, Deepa Rani, Lakshmi, P. V., Chaudhary, Ankur, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Satapathy, Suresh Chandra, editor, Bhateja, Vikrant, editor, Ramakrishna Murty, M., editor, Gia Nhu, Nguyen, editor, and Jayasri Kotti, editor
- Published
- 2021
32. Working with Large and Complex Datasets
- Author
MacFarland, Thomas W., Yates, Jan M., MacFarland, Thomas W., and Yates, Jan M.
- Published
- 2021
33. Artificial Intelligence and Exploratory-Data-Analysis-Based Initial Public Offering Gain Prediction for Public Investors.
- Author
-
Munshi, Manushi, Patel, Manan, Alqahtani, Fayez, Tolba, Amr, Gupta, Rajesh, Jadav, Nilesh Kumar, Tanwar, Sudeep, Neagu, Bogdan-Constantin, and Dragomir, Alin
- Abstract
An initial public offering (IPO) refers to a process by which private corporations offer their shares in a public stock market for investment by public investors. This listing of private corporations in the stock market leads to the easy generation and exchange of capital between private corporations and public investors. Investing in a company's shares is accompanied by careful consideration and study of the company's public image, financial policies, and position in the financial market. The stock market is highly volatile and susceptible to changes in the political and socioeconomic environment. Therefore, the prediction of a company's IPO performance in the stock market is an important study area for researchers. However, there are several challenges in this path, such as the fragile nature of the stock market, the irregularity of data, and the influence of external factors on the IPO performance. Researchers over the years have proposed various artificial intelligence (AI)-based solutions for predicting IPO performance. However, they have some lacunae in terms of the inadequate data size, data irregularity, and lower prediction accuracy. Motivated by the aforementioned issues, we proposed an analytical model for predicting IPO gains or losses by incorporating regression-based AI models. We also performed a detailed exploratory data analysis (EDA) on a standard IPO dataset to identify useful inferences and trends. The XGBoost Regressor showed the maximum prediction accuracy for the current IPO gains, i.e., 91.95%. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
34. Exploratory Data Analysis of Genomic Sequence of Variants of SARS-CoV-2 Reveals Sequence Divergence and Mutational Localization.
- Author
-
Sangeet, Satyam and Khan, Arshad
- Subjects
*GENOMICS , *SARS-CoV-2 , *PHENOMENOLOGICAL biology , *DATA analysis , *WHOLE genome sequencing , *NUCLEOTIDE sequencing - Abstract
Whole genome sequencing has rapidly progressed in recent years, with the sequencing of SARS-CoV-2 genomes making it a more reliable clinical tool for public health surveillance. This development has resulted in the production of a large amount of genomic data used for various types of genomic exploration. However, without a proper standard protocol, the usage of genomic data for analyzing various biological phenomena, such as mutation and evolution, may result in a propagating risk of using an unvalidated data set. This process could lead to irregular data being generated along with a high risk of altered analysis. Thus, the current study lays out the foundation for a preprocessing pipeline using data analysis to check the genomic data set for its accuracy. We have used the recent example of SARS-CoV-2 to demonstrate the process overflow that can be utilized for various kinds of biological exploration such as understanding mutational events, evolutionary divergence, and speciation. Our analysis reveals a significant amount of sequence divergence in the gamma variant as compared with the reference genome, thereby making the variant less infective and deadly. Moreover, we found regions in the genomic sequence that are more prone to mutational localization, thereby altering the structural integrity of the virus and resulting in a more reliable molecular viral mechanism. We believe that the current work will help with an initial check of the genomic data followed by the biological assessment of the process overflow, which will be beneficial for the variant analysis and mutational uprising. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
35. Préparer et analyser les données de 'Demandes de valeurs foncières' en open data : proposition d’une méthodologie reproductible
- Author
-
Boris Mericskay and Florent Demoraes
- Subjects
cartography ,spatial analysis ,exploratory data analysis (EDA) ,open data ,real estate market ,Geography (General) ,G1-922 - Abstract
The data concerning land and real estate transactions, which for a long time were stored in various complex and sparsely accessible databases, are now more easily available since the opening of the DVF database ("Demandes de Valeurs Foncières" in French, "Real Estate Transaction Datafiles" in English) in 2019. However, the use of these information resources for analytical purposes requires time for acculturation and significant preparation work that should not be underestimated. This article presents a workflow dedicated to the analysis of open data DVF that we formalized in the R environment and documented in order to ensure replicability. This workflow illustrates the potential of these data to understand the spatial dynamics of residential real estate markets at different scales. Through a case study of the Brittany region and the Rennes urban/metropolitan area, the objective is to propose and discuss some methodological avenues for the preparation, analysis and (carto)graphic representation of these "new data" as well as the challenges and issues associated with their use.
- Published
- 2022
- Full Text
- View/download PDF
36. Anomaly Detection of User Behavioural Events in E-commerce Electronics Stores using SVMs
- Author
-
Bollu, Sriya Sai
- Abstract
Background: In electronic commerce, reliable anomaly detection systems are essential for maintaining security and improving user experiences, especially in the electronics industry. With the goal of filling in the gaps in the current anomaly detection methods, this study examines the efficacy of SVM as a fundamental algorithmic framework for anomaly detection in online electronics retail. The goal of the research is to better understand user interfaces and security protocols on e-commerce platforms by spotting anomalies within user behavioral events. Objectives: To evaluate the effectiveness of SVM in identifying anomalies in user activity patterns, a rigorous experimental design comprising feature extraction, preprocessing, and model evaluation is used. Methods: The study establishes the foundation for analysis and model creation by utilizing data from the REES46 platform, which records a broad range of user interactions over an extended period of time. Utilizing this extensive dataset, the study explores the subtle aspects of user behavior and offers insights into SVM algorithm-based anomaly detection methods. The methodology's rigorous data preprocessing and feature extraction ensured the dataset's integrity, contributing to the model's effectiveness. Metrics including precision, recall, and F1-score were used to train and assess the SVM model after a thorough normalization of the dataset using Standard Scaler. The model's accuracy was further confirmed by a low Mean Squared Error (MSE), prediction scatter plots, and other visualizations. Results: The findings highlight the considerable potential of SVM-based anomaly detection systems to improve user experiences and strengthen security protocols in online retail settings, with high scores for the classification metrics. The model performed well, obtaining an F1 score of 0.93, a precision of 99%, and a recall of 1.00, d
- Published
- 2024
37. Zootechnical data analysis in a breeding animal facility: tracing the patterns of mouse production
- Author
-
Eloiza K. G. D. Ferreira, Giovanny A. C. A. Mazzarotto, and Guilherme F. Silveira
- Subjects
Data science ,Exploratory data analysis (EDA) ,Python ,Laboratory animals science (LAS) ,Medicine (General) ,R5-920 ,Biology (General) ,QH301-705.5 - Abstract
Background: With the enactment of the Brazilian Arouca Law 11,794/2008 and Decree 6.899/2009, there has been an urgent need for changes in the processes related to animal experimentation in Brazil; in particular, there is a need for improvements in lab animal management. To improve the management capacity of the lab animal facility of the Carlos Chagas Institute's Laboratory Animals Science (LAS), BioterC software was developed and implemented in 2014 for tracking mouse laboratory colonies. Five years after the implementation of this software, we sought to analyze the information in the database originated from BioterC using the Exploratory Data Analysis (EDA) methodology. This article aims to identify animal breeding patterns using a data mining tool (Data Science) with the Python programming language. Results: The results show that from September 2014 to June 2019, under the license IACUC number LW- 6/17, 15,106 animals were produced. The C57BL/6, BALB/c and Swiss strains were the most frequently produced strains. The distribution of births due to crosses between these strains showed a median of 6 to 10 animals, depending on the genetic homozygosis and heterozygosis of the animal. The median number of days of mating was 35 days. In the sexing period, the records reported a median of 19 days. A total of 393 requests for animals from internal and external laboratories were registered. It was noted that approximately half of the animals produced to meet the demand for orders were discarded. Of the 15,106 animals, 38% were requested for animal experimentation, 58% were discarded and 4% did not have an outcome recorded in the data. Conclusions: This volume of data provides an initial view of the information retrieval capabilities contained in BioterC, allowing for unique breeding knowledge by installing laboratory animals.
- Published
- 2021
- Full Text
- View/download PDF
38. Comparative Study of the Dynamics of Cosmic Rays for the Pakistan and China Atmospheric Regions
- Author
-
Faisal Nawaz, Bulbul Jan, Faisal Ahmed Khan Afridi, M. Ayub Khan Yousufzai, and Faraz Mehmood
- Subjects
cosmic rays ,Exploratory Data Analysis (EDA) ,fluctuation ,interpolation ,Kriging ,Science ,Science (General) ,Q1-390 - Abstract
This paper presents an analysis of cosmic ray intensity in Pakistan air space using spatial interpolation, comparing it with Chinese cosmic ray records from 1984 to 1993. The Exploratory Data Analysis (EDA) approach was applied to compare the cosmic ray fluctuations in both countries. The time series plot of the monthly cosmic rays showed relatively flatter counts in Pakistan than in China. The cosmic ray data for the years 1984 to 1993 fell within Solar Cycle 22, which lasted from 1986 to 1996, with its maximum phase in 1989 to 1991. The cosmic radiation varies between the atmospheric regions of Pakistan and China due to modulations in intensity that are accessible accordingly. It can be explained by purely astrophysical phenomena: (1) the source of emission of cosmic radiation may be different, (2) the rate at which emanation takes place depends on bursts of deep space dynamical objects from their sources that may be affected by solar wind and other solar radiations. Therefore, modulations in intensity are not only due to different geophysical locations. This study will help government organizations to predict and forecast cosmic ray values.
- Published
- 2020
- Full Text
- View/download PDF
39. Political Alignment Identification: a Study with Documents of Argentinian Journalists
- Author
-
Viviana Mercado, Andrea Villagra, and Marcelo Errecalde
- Subjects
author profiling ,exploratory data analysis (eda) ,journalist political alignment ,liwc ,text mining ,Computer engineering. Computer hardware ,TK7885-7895 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Political alignment identification is an author profiling task that aims at identifying political bias/orientation in people's writings. As usual in any automatic text analysis, a critical aspect here is having available adequate data sets so that the data mining and machine learning approaches can obtain reliable and informative results. This article makes a contribution in this regard by presenting a new corpus for the study of political alignment in documents of Argentinian journalists. The study also includes several kinds of analysis of documents of pro-government and opposition journalists such as the relevance of terms in each journalist class, sentiment analysis, topic modelling and the analysis of psycholinguistic indicators obtained from the Linguistic Inquiry and Word Count (LIWC) system. From the experimental results, interesting patterns could be observed such as the topics both types of journalists write about, how the sentiment polarities are distributed and how the writings of pro-government and opposition journalists differ in the distinct LIWC categories.
- Published
- 2020
- Full Text
- View/download PDF
40. Basic Multivariate Statistical Methods for Environmental Monitoring Data Mining: Introductory Course for Master Students.
- Author
-
Simeonov, Vasil
- Subjects
*ENVIRONMENTAL monitoring , *DATA mining , *PRINCIPAL components analysis , *ENVIRONMENTAL risk assessment , *SELF-organizing maps - Abstract
The present introductory course of lectures summarizes the principles and algorithms of several widely used multivariate statistical methods: cluster analysis, principal components analysis, principal components regression, N-way principal components analysis, partial least squares regression and self-organizing maps, with respect to their possible application in intelligent analysis, classification, modelling and interpretation of environmental monitoring data. The target group of possible users is master's program students (environmental chemistry, analytical chemistry, environmental modelling and risk assessment etc.). [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
41. Comparative Study of the Dynamics of Cosmic Rays for the Pakistan and China Atmospheric Regions.
- Author
-
Nawaz, Faisal, Jan, Bulbul, Afridi, Faisal Ahmed Khan, Yousufzai, M. Ayub Khan, and Mehmood, Faraz
- Subjects
SOLAR radiation ,SOLAR wind ,COMPARATIVE studies ,TIME series analysis ,SOLAR cycle - Abstract
This paper presents an analysis of cosmic ray intensity in Pakistan air space using spatial interpolation, comparing it with Chinese cosmic ray records from 1984 to 1993. The Exploratory Data Analysis (EDA) approach was applied to compare the cosmic ray fluctuations in both countries. The time series plot of the monthly cosmic rays showed relatively flatter counts in Pakistan than in China. The cosmic ray data for the years 1984 to 1993 fell within Solar Cycle 22, which lasted from 1986 to 1996, with its maximum phase in 1989 to 1991. The cosmic radiation varies between the atmospheric regions of Pakistan and China due to modulations in intensity that are accessible accordingly. It can be explained by purely astrophysical phenomena: (1) the source of emission of cosmic radiation may be different, (2) the rate at which emanation takes place depends on bursts of deep space dynamical objects from their sources that may be affected by solar wind and other solar radiations. Therefore, modulations in intensity are not only due to different geophysical locations. This study will help government organizations to predict and forecast cosmic ray values. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
42. Political Alignment Identification: a Study with Documents of Argentinian Journalists.
- Author
-
Mercado, Viviana, Villagra, Andrea, and Errecalde, Marcelo
- Subjects
IDENTIFICATION documents ,JOURNALISTS ,WORD frequency ,SENTIMENT analysis ,DATA mining
- Published
- 2020
- Full Text
- View/download PDF
43. Improved Classification Models to Distinguish Natural from Anthropic Oil Slicks in the Gulf of Mexico: Seasonality and Radarsat-2 Beam Mode Effects under a Machine Learning Approach
- Author
-
Ítalo de Oliveira Matias, Patrícia Carneiro Genovez, Sarah Barrón Torres, Francisco Fábio de Araújo Ponte, Anderson José Silva de Oliveira, Fernando Pellon de Miranda, and Gil Márcio Avellino
- Subjects
synthetic aperture radar (SAR) ,machine learning (ML) ,exploratory data analysis (EDA) ,classification model (CM) ,oil slicks source (OSS) ,oil seeps ,Science - Abstract
Distinguishing between natural and anthropic oil slicks is a challenging task, especially in the Gulf of Mexico, where these events can be simultaneously observed and recognized as seeps or spills. In this study, a powerful data analysis provided by machine learning (ML) methods was employed to develop, test, and implement a classification model (CM) to distinguish an oil slick source (OSS) as natural or anthropic. A robust database containing 4916 validated oil samples, detected using synthetic aperture radar (SAR), was employed for this task. Six ML algorithms were evaluated, including artificial neural networks (ANN), random forest (RF), decision trees (DT), naive Bayes (NB), linear discriminant analysis (LDA), and logistic regression (LR). Using RF, the global CM achieved a maximum accuracy value of 73.15. An innovative approach evaluated how external factors, such as seasonality, satellite configurations, and the synergy between them, limit or improve OSS predictions. To accomplish this, specific classification models (SCMs) were derived from the global ones (CMs), tuning the best algorithms and parameters according to different scenarios. Median accuracies revealed winter and spring to be the best seasons and ScanSAR Narrow B (SCNB) as the best beam mode. The maximum median accuracy to distinguish seeps from spills was achieved in winter using SCNB (83.05). Among the tested algorithms, RF was the most robust, with a better performance in 81% of the investigated scenarios. The accuracy increment provided by the well-fitted models may minimize the confusion between seeps and spills. This represents a concrete contribution to reducing economic and geologic risks derived from exploration activities in offshore areas. Additionally, from an operational standpoint, specific models support specialists to select the best SAR products and seasons for new acquisitions, as well as to optimize performances according to the available data.
- Published
- 2021
- Full Text
- View/download PDF
44. Potentiels et limites des traces (géo)numériques dans l’analyse des mobilités : l’exemple des données de la plateforme de covoiturage BlaBlaCar
- Author
-
Boris Mericskay
- Subjects
GIS ,spatial analysis ,mobility ,exploratory data analysis (EDA) ,data ,Geography (General) ,G1-922 - Abstract
Big data is a field of investigation rich in promises but still complex in renewing spatial mobility analysis. In fact, thinking about mobility through the prism of digital footprints raises many questions regarding the nature of the data, the accessibility procedures, and the methods and techniques of treatment. This paper aims to explore the potential of digital footprints in the analysis of mobility through the example of data from the carpooling platform BlaBlaCar. By analyzing (temporally and spatially) trips to and from the city of Rennes over the course of 5 months, the goal is both to draw a portrait of carpooling in the capital of Brittany and to give an overview of the advantages and limitations of these data in understanding this new form of mobility.
- Published
- 2019
- Full Text
- View/download PDF
45. Advances in Public Transport Platform for the Development of Sustainability Cities.
- Author
-
Corchado, Juan M., Chamoso, Pablo, De la Prieta, Fernando, and Larriba-Pey, Josep L.
- Subjects
Environmental science, engineering & technology ,History of engineering & technology ,Technology: general issues ,Barcelona underground ,Big Data analytics ,CPS ,Fintech ,GLPK ,HTM ,IoT ,artificial intelligence ,artificial neural network ,attention ,big-data applications ,carsharing ,centrality measures ,clustering analysis ,collaborative filtering ,complex network analysis ,content-based ,critical infrastructure ,cyber-attack detection ,data analysis ,data envelopment analysis (DEA) ,data extraction ,data fusion ,deep learning ,deep neural networks ,delays ,demand ,demand prediction ,dynamic bus travel time prediction ,energy consumption ,energy trading ,exploratory data analysis (EDA) ,forecasting systems ,integer programming ,intelligent transportation ,intelligent transportation systems ,intelligent transportation systems (ITS) ,learning object ,learning recommender system ,learning videos ,machine intelligence ,mapping application ,multi-objective optimization ,n/a ,natural language processing ,network robustness ,optimization models ,passenger flow ,passenger traffic ,passenger waiting time ,public transit ,railway ,recommender system ,recurrent neural network ,regression ,regression analysis ,reputation algorithm ,ride-hailing ,ridership patterns ,safety ,search and rescue ,security ,software application ,sustainable cities ,sustainable transport systems ,taxi ,taxi recommendation ,time series forecasting ,timetable ,transfer learning ,transport ,trust ,trusted negotiations ,unmanned aerial vehicles (UAVs) ,urban rail transit (URT) ,users' profiling ,users' reputation ,variable neighborhood search ,vehicle occupancy ratio ,vehicle social network ,wastewater treatment plants ,wide and deep - Abstract
Summary: Modern societies demand high and varied mobility, which in turn requires a complex transport system adapted to social needs that guarantees the movement of people and goods in an economically efficient and safe way, but all are subject to a new environmental rationality and the new logic of the paradigm of sustainability. From this perspective, an efficient and flexible transport system that provides intelligent and sustainable mobility patterns is essential to our economy and our quality of life. The current transport system poses growing and significant challenges for the environment, human health, and sustainability, while current mobility schemes have focused much more on the private vehicle that has conditioned both the lifestyles of citizens and cities, as well as urban and territorial sustainability. Transport has a very considerable weight in the framework of sustainable development due to environmental pressures, associated social and economic effects, and interrelations with other sectors. The continuous growth that this sector has experienced over the last few years and its foreseeable increase, even considering the change in trends due to the current situation of generalized crisis, make the challenge of sustainable transport a strategic priority at local, national, European, and global levels. This Special Issue will pay attention to all those research approaches focused on the relationship between evolution in the area of transport with a high incidence in the environment from the perspective of efficiency.
46. Digitalization-based process improvement and decision-making in offsite construction.
- Author
-
Barkokebas, Beda, Martinez, Pablo, Bouferguene, Ahmed, Hamzeh, Farook, and Al-Hussein, Mohamed
- Subjects
*MACHINE learning , *TACIT knowledge , *BUILDING information modeling , *DECISION making , *RADIO frequency identification systems , *STATISTICS - Abstract
The evaluation of process improvement measures in offsite construction shop floors often relies on experts' opinions, with limited use of empirical data gathered by sensors in real time. To address this issue, there is a need for methods that integrate experts' tacit knowledge with robust data analysis techniques. This paper describes the application of exploratory data analysis techniques to evaluate improvement suggestions proposed by experts, supported by data collected by sensors on the shop floor and building information models. The presented method involves a quantitative and qualitative digitalization-based approach where improvement suggestions are modelled and validated through machine learning algorithms and hypothesis testing. The contribution of this study is a method that combines real-time data, building information models, and knowledge modeling from experts to evaluate process improvement on offsite construction shop floors. • A method to assess improvements based on experts' input and real-time data. • Machine learning is applied to analyze data from RFID sensors and BIM models. • The automation in workstations is rated based on production balance and efficiency. • Strategies to increase production flexibility are rated using statistical analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Exploratory Data Analysis of Synthetic Aperture Radar (SAR) Measurements to Distinguish the Sea Surface Expressions of Naturally-Occurring Oil Seeps from Human-Related Oil Spills in Campeche Bay (Gulf of Mexico).
- Author
-
de Araújo Carvalho, Gustavo, Minnett, Peter J., de Miranda, Fernando Pellon, Landau, Luiz, and Paes, Eduardo Tavares
- Subjects
*SYNTHETIC aperture radar , *IMAGING systems - Abstract
An Exploratory Data Analysis (EDA) aims to use Synthetic Aperture Radar (SAR) measurements for discriminating between two oil slick types observed on the sea surface: naturally-occurring oil seeps versus human-related oil spills. The use of satellite sensors for this task is poorly documented in the scientific literature. A long-term RADARSAT dataset (2008-2012) is exploited to investigate oil slicks in Campeche Bay (Gulf of Mexico). Simple Classification Algorithms to distinguish the oil slick type are designed based on standard multivariate data analysis techniques. Various attributes of geometry, shape, and dimension that describe the oil slick Size Information are combined with SAR-derived backscatter coefficients: sigma-naught (σ⁰), beta-naught (β⁰), and gamma-naught (γ⁰). The combination of several of these characteristics is capable of distinguishing the oil slick type with ~70% overall accuracy; however, the sole and simple use of two specific Size Information attributes of the oil slick (i.e., area and perimeter) is equally capable of distinguishing seeps from spills. The data mining exercise of our EDA promotes a novel idea bridging petroleum pollution and remote sensing research, thus paving the way to further investigate the satellite synoptic view to express geophysical differences between seeped and spilled oil observed on the sea surface for systematic use. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
48. Incidence of the artisanal fishing organizations of the San Silvestre swamp in Barrancabermeja on the sustainable livelihoods of their associates
- Author
-
Gómez Puentes, Raúl Eduardo, Restrepo Calle, Sebastián, Ramos Barón, Pablo Andrés, and Álvarez Rodríguez, Juan Fernando
- Subjects
Análisis de varianza ,Artisanal fisheries ,Pesca artesanal - Barrancabermeja (Santander, Colombia) ,Medios de vida sostenible ,Pesca artesanal ,Análisis Exploratorio de Datos (AED) ,Analysis of Variance (ANOVA) ,Exploratory Data Analysis (EDA) ,Sostenibilidad - Barrancabermeja (Santander, Colombia) ,Análisis de varianza (ANOVA) ,Sustainable livelihoods ,Maestría en desarrollo rural - Tesis y disertaciones académicas - Abstract
The objective of this research is to analyze the influence of the artisanal fishing organizations of the San Silvestre swamp on the livelihoods of their members in the district of Barrancabermeja (Santander), taking as a reference the Sustainable Livelihoods analytical framework of the United Kingdom's Department for International Development (DFID). To achieve this purpose, an analysis of the vulnerability context of artisanal fishing was first carried out in this body of water, which is of strategic importance for the District. Subsequently, the artisanal fishing organizations were characterized through the Organizational Competence Index (ICO, by its Spanish acronym), and the changes in the livelihoods of artisanal fishermen were evaluated through the Sustainable Livelihoods Index (IMVS, by its Spanish acronym), which was built according to the frame of reference of the research. The results of the two measurement exercises (organizational competencies and the livelihoods index) were analyzed through the statistical techniques of exploratory data analysis (EDA) and analysis of variance (ANOVA), and the results allowed us to conclude that: 1) belonging to an organization is a factor that positively affects the fishermen's livelihood outcomes, and 2) the characteristics of the organization are not decisive in these outcomes. Master in Rural Development (Magíster en Desarrollo Rural), master's degree https://scienti.minciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0000050352
- Published
- 2021
49. Expansion de l'agglomération de Lima et différenciation de l’espace résidentiel : analyse exploratoire d’un corpus de données diversifié
- Author
-
Marie Piron, Évelyne Mesclier, and Bernard Lortic
- Subjects
social integration ,exploratory data analysis (EDA) ,Lima ,socio-spatial structure ,urban peripheries ,Geography (General) ,G1-922 - Abstract
The forms of socio-spatial differentiation found in Lima are for the most part in line with the model of Andean metropolises. There is, for example, an opposition between central areas occupied by the middle and upper classes and the more recent and lower-class areas in the peripheries. There is also a directional expansion of the wealthier areas, in this case, towards the east of the central areas. This model is still very pronounced but seems to have changed since the end of the last century with internal diversification of previously homogeneous units. Our objective is to understand the current spatial organization of social differentiation in Lima. On the basis of data of the last national census of population in 2007 and of a set of aerial photos and satellite images taken since the middle of the 20th century, we build on, at the level of the blocks, various cartographical and statistical indicators concerning the successive layers of the expansion of Lima and its current socio-spatial structure. We conclude from this first, that the urban area is organized according to a scale of social and urban integration which is registered spatially on a gradient, from the center towards the peripheries, in correlation with the time the blocks were built. In a second phase, a series of multidimensional statistical analyses which visually reintroduce each block allows us to moderate this conclusion by the means of the study of the variability inside each layer of expansion. It is thus possible to understand the current territorial dynamics and to formulate the hypothesis of a change in the structure contrasting the center and the peripheries.
- Published
- 2015
50. Evaluation of Classification and Ensemble Algorithms for Bank Customer Marketing Response Prediction.
- Author
-
Apampa, Olatunji
- Subjects
BANKING industry customer services ,BANK marketing ,DIRECT marketing ,DATA mining ,DECISION trees ,LOGISTIC regression analysis ,PRINCIPAL components analysis - Abstract
This article attempts to improve the performance of classification algorithms used to predict bank customer marketing responses for an unnamed Portuguese bank, using the Random Forest ensemble. A thorough exploratory data analysis (EDA) was conducted on the data in order to ascertain the presence of anomalies such as outliers and extreme values. The EDA revealed that the bank data had 45,211 instances and 17 features, with 11.7% positive responses, and it also detected outliers and extreme values. The classification algorithms used for modelling the bank dataset include Logistic Regression, Decision Tree, Naïve Bayes and the Random Forest ensemble. These algorithms were applied to both the balanced and original bank data using the Orange 3.2 data mining application, following the Cross-Industry Standard Process for Data Mining (CRISP-DM) and the ten-fold cross-validation method. Results from the experiments revealed that the performance of the Random Forest ensemble improved when the data was balanced. Results also showed that duration, poutcome, contact, month and housing were the most important features contributing to the success of the bank customer marketing campaign for deposit subscription. The study also revealed that the duration of calls to clients, response to past promotions, and the use of cell phones contribute positively to the success of the campaign, while the months of September, November, March and April recorded higher subscription rates. Those in management cadre and technicians were found to have responded more positively to the campaign than those in other job categories. [ABSTRACT FROM AUTHOR]
- Published
- 2016