55 results on '"Beltrán, Beatriz"'
Search Results
2. Recent advances in language & knowledge engineering.
- Author
- Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *ARTIFICIAL intelligence, *PATTERN recognition systems, *ARTIFICIAL languages, *SENTIMENT analysis, *ENGINEERING
- Abstract
Language & Knowledge Engineering is essential for the successful development of artificial intelligence. The technologies proposed in international forums are meant to improve all areas of our daily life, whether related to production industries, social communities, government, education, or something else. We consider it very important to present the recent advances in Intelligent and Fuzzy Systems applied to Language & Knowledge Engineering because they are the basis for the society of tomorrow. Thus, the aim of this special issue of the Journal of Intelligent and Fuzzy Systems is to present a collection of papers that cover recent research results on two broad topics: language and knowledge engineering. Although the special issue is structured into these two general topics, we have covered specific themes such as the following: Natural Language Processing, Knowledge Engineering, Pattern Recognition, Artificial Intelligence and Language, Information Processing, Machine Learning Applied to Text Processing, Image and Text Classification, Multimodal Data Analysis, Sentiment Analysis, etc. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
3. K-means based method for overlapping document clustering.
- Author
- Beltrán, Beatriz, Vilariño, Darnes, Martínez-Trinidad, José Fco., Carrasco-Ochoa, J.A., Pinto, David, Singh, Vivek, and Perez, Fernando
- Subjects
- *DOCUMENT clustering
- Abstract
Overlapping clustering algorithms have been shown to be effective for clustering documents. However, current overlapping document clustering algorithms produce a large number of clusters, which makes them of little use to the user. Therefore, in this paper, we propose a k-means based method for overlapping document clustering that allows the user to specify the number of groups to be built. Our experiments with different corpora show that our proposal obtains better results in terms of FBcubed than other recent works on overlapping document clustering reported in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
4. Evaluation of the psychometric properties of the Spanish version of the Denver Developmental Screening Test II.
- Author
- De-Andrés-Beltrán, Beatriz, Rodríguez-Fernández, Ángel, Güeita-Rodríguez, Javier, and Lambeck, Johan
- Subjects
- *DENVER Developmental Screening Test, *PSYCHOMETRICS, *CHILDREN with developmental disabilities, *INTER-observer reliability, *CHILD development testing
- Abstract
The objective of this study was to examine the psychometric properties of the Spanish version of the Denver Developmental Screening Test II in a population of Spanish children. Two hundred children ranging from 9 months to 6 years of age were grouped into two samples (healthy/with psychomotor delay) and screened to check whether they suffered from psychomotor delay. Children from three Early Intervention Centres and three schools participated in this study. Criterion validity was calculated by the method of extreme groups, comparing healthy children to those with developmental delay. Interobserver and intraobserver reliability were calculated using Cohen's kappa coefficient, and internal consistency was calculated via the Kuder-Richardson coefficient. The scale demonstrated 89% sensitivity, 92% specificity, a positive predictive value of 91% and a negative predictive value of 89%, whereas the positive and negative likelihood ratios were 11.12 and 0.12, respectively. Intraobserver reliability ranged from 0.662 to 1, and interobserver reliability ranged from 0.886 to 1. The Kuder-Richardson coefficient values ranged from 87.5 to 97.6%. Conclusion: The Spanish version of the Denver Developmental Screening Test II was found to have good criterion validity, reliability and internal consistency and is a suitable screening test for use in a population of Spanish children. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
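The likelihood ratios reported in the Denver II abstract above can be related to its sensitivity and specificity with the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below is only an illustration of that arithmetic; the function name is an assumption, and because it starts from the rounded percentages in the abstract, its output agrees with the published values only approximately.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary screening test.

    Both arguments are proportions in (0, 1). The function name is
    illustrative and does not come from the study.
    """
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return lr_pos, lr_neg


# Rounded values taken from the abstract: 89% sensitivity, 92% specificity.
lr_pos, lr_neg = likelihood_ratios(0.89, 0.92)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these inputs the ratios come out close to the reported 11.12 and 0.12; small differences are expected because the abstract presumably used unrounded proportions.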
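5. Base de datos de carotenoides para valoración de la ingesta dietética de carotenos, xantofilas y de vitamina A; utilización en un estudio comparativo del estado nutricional en vitamina A de adultos jóvenes. [Carotenoid database for assessing the dietary intake of carotenes, xanthophylls and vitamin A; use in a comparative study of the vitamin A nutritional status of young adults.]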
- Author
- Beltrán, Beatriz, Estévez, Rocío, Cuadrado, Carmen, Jiménez, Susana, and Olmedilla Alonso, Begoña
- Subjects
- *CAROTENOIDS, *VITAMIN A, *CAROTENES, *HEALTH of young adults, *XANTHOPHYLLS, *NUTRITIONAL status, *NUTRITION
- Abstract
Objectives: 1) To develop a database of carotenoids (BD-carotenoids) in foods widely consumed in Spain. 2) To assess the vitamin A nutritional status (expressed as retinol equivalents [RE] and retinol activity equivalents [RAE]) in young adults. Methods: The BD-carotenoids includes data on carotenes (β-carotene, α-carotene and lycopene) and xanthophylls (β-cryptoxanthin, lutein and zeaxanthin) generated by HPLC. Vitamin A intake was assessed by a 3-day food record in 54 adults (20-35 years of age, not obese and with serum retinol > 30 µg/dl), using the BD-carotenoids and a Food Composition Table widely used in Spain. Results: The BD-carotenoids includes data on 89 foods (9 raw or boiled and 14 processed). The intake of provitamin-A carotenoids is 2.5 mg/p/d, that of RE 682 µg/p/d and that of RAE 499 µg/p/d. The vitamin A intake expressed as RAE is 27% lower than that expressed as RE. Seventy-six percent of the intake meets the daily intake recommendations and 63% meets the reference daily intakes of vitamin A. Conclusions: Data on individual carotenoids ensure greater accuracy in studies on diet and health, and provide easier assessment of the vitamin A intake, expressed as RE, RAE, or any other future forms. The vitamin A intake expressed as RAE represents a substantial reduction in the carotenoid contribution to vitamin A intake, which enhances the detection of inadequacies of that intake. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
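The 27% figure in the carotenoid abstract above can be checked from the two reported intakes (682 µg RE and 499 µg RAE per person per day). The sketch below also shows why RAE is lower. It assumes the commonly used conversion factors, which are not stated in the abstract: 6 and 12 µg per µg retinol for β-carotene and other provitamin-A carotenoids under RE, and 12 and 24 under RAE. All function names are illustrative.

```python
def retinol_equivalents(retinol_ug, beta_carotene_ug, other_provit_a_ug):
    """Vitamin A as RE, assuming the conventional 6:1 and 12:1 factors."""
    return retinol_ug + beta_carotene_ug / 6.0 + other_provit_a_ug / 12.0


def retinol_activity_equivalents(retinol_ug, beta_carotene_ug, other_provit_a_ug):
    """Vitamin A as RAE, assuming the conventional 12:1 and 24:1 factors."""
    return retinol_ug + beta_carotene_ug / 12.0 + other_provit_a_ug / 24.0


def percent_reduction(re_ug, rae_ug):
    """Percentage by which the RAE value falls below the RE value."""
    return 100.0 * (re_ug - rae_ug) / re_ug


# Intakes reported in the abstract (µg per person per day).
reduction = percent_reduction(682, 499)
print(f"RAE is {reduction:.1f}% lower than RE")
```

Because RAE halves the contribution of every provitamin-A carotenoid while leaving preformed retinol unchanged, the size of the reduction depends on how much of the intake comes from carotenoids.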
6. Influence of the flanges width and thickness on the shear strength of reinforced concrete beams with T-shaped cross section.
- Author
- Ayensa, Alberto, Oller, Eva, Beltrán, Beatriz, Ibarz, Elena, Marí, Antonio, and Gracia, Luis
- Subjects
- *CONCRETE beams, *SHEAR strength, *FLANGES, *SHEARING force
- Abstract
Highlights:
• Only the web contribution to shear is considered in codes for the design of T-section concrete beams.
• Experimental testing shows that the contribution of flanges to shear strength is not negligible.
• This effect may lead to considerable cost savings in new structures.
• This effect may become decisive when evaluating the shear capacity of existing structures.
• An experimental and parametric numerical study of T-beams failing in shear is presented.
Abstract: Shear design of reinforced concrete beams with T section considers only the contribution of the web, mainly provided by aggregate interlock. However, as the load increases and large web crack openings take place, aggregate interlock reduces and shear stresses tend to concentrate near the neutral axis, usually located in the flanges of T beams, whose contribution to shear strength may not be negligible, as has been experimentally observed. Thus, the contribution of flanges may lead to considerable cost savings in new structures and may become decisive when evaluating the shear capacity of existing structures. To quantify this contribution, a nonlinear 3D-FEA model has been developed and calibrated with the results of shear tests performed on RC T-beams by the authors. Once adjusted, the model has been used to analyze the shear response of beams with different geometries and longitudinal reinforcement ratios that are usual in practice. It has been found that, up to certain limits, the contribution of the flanges to the shear strength increases as the amount of longitudinal reinforcement decreases, as the flange width increases and as the flange thickness increases. The maximum contribution of flanges found in the present study is 31.3% of the total shear resisted. Furthermore, the numerical model has been used to visualize and quantify aspects that are not easy to obtain experimentally, such as the distribution of the shear stresses between the web and the flanges. The present study will contribute to deriving a design expression for the effective shear width of the flanges of T beams. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
7. A Model to Develop Chatbots for Assisting the Teaching and Learning Process.
- Author
- Mendoza, Sonia, Sánchez-Adame, Luis Martín, Urquiza-Yllescas, José Fidel, González-Beltrán, Beatriz A., and Decouchant, Dominique
- Subjects
- *LEARNING, *CHATBOTS, *SOCIAL workers, *TEACHERS, *MIDDLE schools, *CONSUMERS
- Abstract
Recently, in the commercial and entertainment sectors, we have seen increasing interest in incorporating chatbots into websites and apps, in order to assist customers and clients. In the academic area, chatbots are useful to provide some guidance and information about courses, admission processes and procedures, study programs, and scholarly services. However, these virtual assistants have limited mechanisms to suitably help the teaching and learning process, considering that these mechanisms should be advantageous for all the people involved. In this article, we design a model for developing a chatbot that serves as an extra-school tool to carry out academic and administrative tasks and facilitate communication between middle-school students and academic staff (e.g., teachers, social workers, psychologists, and pedagogues). Our approach is designed to help less tech-savvy people by offering them a familiar environment, using a conversational agent to ease and guide their interactions. The proposed model has been validated by implementing a multi-platform chatbot that provides both textual-based and voice-based communications and uses state-of-the-art technology. The chatbot has been tested with the help of students and teachers from a Mexican middle school, and the evaluation results show that our prototype obtained positive usability and user experience endorsements from such end-users. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
8. Identifying the polarity of a text given the emotion of its author.
- Author
- Sánchez, Belém Priego, Cabrera, Rafael Guzman, Carrillo, Michel Velazquez, Castro, Wendy Morales, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *TELECOMMUNICATION systems, *SENTIMENT analysis, *EMOTIONS, *CLASSIFICATION algorithms, *DIGITAL communications
- Abstract
The rise of digital communication systems provides an almost infinite source of information that can be used to feed classification algorithms. This work uses an already categorized collection of opinions from the social network Twitter to train a short-text classification model that aims to categorize the emotional tone found in an author's Spanish-language digital text. In addition, linguistic, lexicographic and opinion mining computational tools are used to implement a series of methods that automatically find coincidences or orientations that determine the polarity of sentences and categorize them as positive, negative or neutral considering their lemmas. The results obtained from the analysis of emotions and polarity on the test phrases show a direct relationship between the categorized emotional tone and its positive, negative or neutral classification, which provides additional information about the intention the author had when writing the sentence. Determining these characteristics can provide consistent information that can be leveraged by sectors where the prevalence of a product or service depends on user opinion, product ratings or satisfaction metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
9. A new and efficient algorithm to look for periodic patterns on spatio-temporal databases.
- Author
- Gutiérrez-Soto, Claudio, Gutiérrez-Bunster, Tatiana, Fuentes, Guillermo, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *PROCESS capability, *ALGORITHMS, *GEOGRAPHIC information systems, *TEMPORAL databases, *DATABASES, *BIG data
- Abstract
Big Data is a generic term that involves the storing and processing of a large amount of data. This large amount of data has been promoted by technologies such as mobile applications, Internet of Things (IoT), and Geographic Information Systems (GIS). An example of GIS is a Spatio-Temporal Database (STDB). A complex problem to address in terms of processing time is pattern searching on STDB. Nowadays, high information processing capacity is available everywhere. Nevertheless, the pattern searching problem on STDB using traditional Data Mining techniques is complex because the data incorporate the temporal aspect. Traditional techniques of pattern searching, such as time series, do not incorporate the spatial aspect. For this reason, traditional algorithms based on association rules must be adapted to find these patterns. Most of the algorithms take exponential processing times. In this paper, a new efficient algorithm (named Minus-F1) to look for periodic patterns on STDB is presented. Our algorithm is compared with the Apriori, Max-Subpattern, and PPA algorithms on synthetic and real STDB. Additionally, the computational complexities of each algorithm in the worst case are presented. Empirical results show that Minus-F1 is not only more efficient than Apriori, Max-Subpattern, and PPA, but also exhibits polynomial behavior. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
10. Anbar: Collection and analysis of a large scale Urdu language Twitter corpus.
- Author
- Tahir, Bilal, Mehmood, Muhammad Amir, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *URDU language, *MICROBLOGS, *COMPUTATIONAL linguistics, *NATURAL languages, *HIGH performance computing, *CORPORA, *CUTTING tools
- Abstract
The confluence of high performance computing algorithms and large scale high-quality data has led to the availability of cutting edge tools in computational linguistics. However, these state-of-the-art tools are available only for the major languages of the world. The preparation of large scale high-quality corpora for low-resource languages such as Urdu is a challenging task, as it requires huge computational and human resources. In this paper, we build and analyze Anbar, a large scale Urdu language Twitter corpus. For this purpose, we collect 106.9 million Urdu tweets posted by 1.69 million users during one year (September 2018-August 2019). Our corpus consists of tweets with a rich vocabulary of 3.8 million unique tokens along with 58K hashtags and 62K URLs. Moreover, it contains 75.9 million (71.0%) retweets and 847K geotagged tweets. Furthermore, we examine Anbar using a variety of metrics such as temporal frequency of tweets, vocabulary size, geo-location, user characteristics, and entity distribution. To the best of our knowledge, this is the largest repository of Urdu language tweets for the NLP research community, and it can be used for Natural Language Understanding (NLU), social analytics, and fake news detection. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
11. Creating a corpus of historical documents for emotions identification.
- Author
- Vázquez-González, Stephanie, Somodevilla-García, María, López, Rosalva Loreto, Gómez-Adorno, Helena, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *HISTORICAL source material, *DEEP learning, *IDENTIFICATION documents, *CORPORA, *MACHINE learning, *HISPANIC Americans
- Abstract
The aim of this article is to contextualize and describe the gathering and annotation of a corpus of conventual Hispanic and Novo-Hispanic texts for emotion identification. This corpus will be the dataset for an emotion identification model based on machine learning/deep learning techniques. Furthermore, this document describes several exploratory experiments carried out on the corpus. These experiments show how the corpus is also used to obtain a lexicon mapped to polarities and emotions, and how some of the documents are hand-labeled by experts for the evaluation of the machine learning/deep learning-based emotion classification model. Finally, future uses of and experiments with the corpus are described. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
12. A method for counting models on grid Boolean formulas.
- Author
- López-Medina, Marco A., Marcial-Romero, J. Raymundo, De Ita Luna, Guillermo, Hernández, José A., Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *COUNTING, *BOOLEAN functions, *COMPLEXITY (Philosophy)
- Abstract
We present a novel algorithm based on combinatorial operations on lists for computing the number of models on two conjunctive normal form Boolean formulas whose restricted graph is represented by a grid graph $G_{m,n}$. We show that our algorithm is correct and that its time complexity is $O(t \cdot 1.618^{t+2} + t \cdot 1.618^{2t+4})$, where $t = n \cdot m$ is the total number of vertices in the graph. For this class of formulas, we show that our proposal improves the asymptotic behavior of the time complexity with respect to the current leading algorithm for counting models on two conjunctive form formulas of this kind. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
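To make the counting problem in the grid-formula abstract above concrete, the following sketch counts the models of a small two-literal-per-clause formula by brute force. It is not the authors' list-based algorithm. It assumes, purely for illustration, one monotone clause (x_u OR x_v) per edge of the grid graph, and it enumerates all 2^t assignments, so it is usable only for very small grids. All names are hypothetical.

```python
from itertools import product


def grid_edges(m, n):
    """Edges of the m x n grid graph; vertices are (row, col) pairs."""
    edges = []
    for r in range(m):
        for c in range(n):
            if c + 1 < n:
                edges.append(((r, c), (r, c + 1)))  # horizontal neighbour
            if r + 1 < m:
                edges.append(((r, c), (r + 1, c)))  # vertical neighbour
    return edges


def count_models(m, n):
    """Count assignments satisfying one clause (x_u OR x_v) per grid edge.

    Exhaustive enumeration over 2**(m*n) assignments: a reference
    implementation for tiny cases, not an efficient algorithm.
    """
    vertices = [(r, c) for r in range(m) for c in range(n)]
    edges = grid_edges(m, n)
    total = 0
    for values in product([False, True], repeat=len(vertices)):
        assignment = dict(zip(vertices, values))
        if all(assignment[u] or assignment[v] for u, v in edges):
            total += 1
    return total


print(count_models(1, 3))  # path with three variables
print(count_models(2, 2))  # four variables arranged in a cycle
```

A brute-force counter like this is mainly useful as a correctness check for a faster method on small inputs.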
13. YouTube based religious hate speech and extremism detection dataset with machine learning baselines.
- Author
- Ashraf, Noman, Rafiq, Abid, Butt, Sabur, Shehzad, Hafiz Muhammad Faisal, Sidorov, Grigori, Gelbukh, Alexander, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *HATE speech, *ONLINE social networks, *MACHINE learning, *SUPERVISED learning, *SUPPORT vector machines
- Abstract
On YouTube, billions of videos are watched online and millions of short messages are posted each day. YouTube, along with other social networking sites, is used by individuals and extremist groups for spreading hatred among users. In this paper, we consider religion as the most targeted domain for spreading hate speech among people of different religions. We present a methodology for the detection of religion-based hate videos on YouTube. Messages posted on YouTube videos generally express the opinions of users related to that video. We provide a novel dataset for religious hate speech detection on YouTube comments. The proposed methodology applies data mining techniques to comments extracted from religious videos in order to filter religion-oriented messages and detect those videos that are used for spreading hate. The supervised learning algorithms Support Vector Machine (SVM), Logistic Regression (LR), and k-Nearest Neighbor (k-NN) are used for baseline results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
14. Improved neural machine translation for low-resource English–Assamese pair.
- Author
- Laskar, Sahinur Rahman, Khilji, Abdullah Faiz Ur Rahman, Pakray, Partha, Bandyopadhyay, Sivaji, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *MACHINE translating, *COMMUNICATION barriers, *TRANSLATING & interpreting
- Abstract
Language translation is essential to bring the world closer and plays a significant part in building a community among people of different linguistic backgrounds. Machine translation dramatically helps in removing the language barrier and allows easier communication among linguistically diverse communities. Due to the unavailability of resources, major languages of the world are regarded as low-resource languages. This makes automating translation among such languages, to benefit indigenous speakers, a challenging task. This article investigates neural machine translation for the English–Assamese resource-poor language pair by tackling insufficient data and out-of-vocabulary problems. We also propose a data augmentation-based NMT approach, which exploits synthetic parallel data, shows significantly improved translation accuracy for English-to-Assamese and Assamese-to-English translation, and obtains state-of-the-art results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
15. Deep active reinforcement learning for privacy preserve data mining in 5G environments.
- Author
- Ahmed, Usman, Lin, Jerry Chun-Wei, Srivastava, Gautam, Chen, Hsing-Chung, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *DATA mining, *REINFORCEMENT learning, *PARTICLE swarm optimization, *DEEP learning, *ACTIVE learning, *DATA privacy
- Abstract
Frequent pattern mining (FIM) identifies the most important patterns in data sets. However, due to the huge and high-dimensional nature of transactional data, classical pattern mining techniques suffer from the limitations of dimensions and data annotations. Recently, data mining while preserving privacy has come to be considered an important research area. Information privacy is a tradeoff that must be considered when using data. For many years, privacy-preserving data mining (PPDM) has relied on methods that are mostly based on heuristics, using the deletion operation to hide sensitive information. In this study, we use deep active learning to protect private and sensitive information. This paper combines entropy-based active learning with an attention-based approach to effectively hide sensitive patterns. The constructed models are then validated using high-dimensional transactional data with attention-based and active learning methods in a reinforcement environment. The results show that the proposed model can support and improve the effectiveness of decision-making by increasing the number of training instances through the use of a pooling technique and an entropy uncertainty measure. The proposed paradigm can achieve data sanitization by hiding the sensitive items while avoiding hiding the non-sensitive items. The model outperforms greedy, genetic, and particle swarm optimization approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
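The privacy-preserving data mining abstract above mentions pool-based selection with an entropy uncertainty measure. A generic version of that idea, not the authors' model, can be sketched as follows: given predicted class probabilities for each unlabeled instance, pick the instances whose predictive distribution has the highest Shannon entropy. The pool contents and function names are made up for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def most_uncertain(pool, k):
    """Return the ids of the k pool instances with the highest entropy.

    `pool` is a list of (instance_id, class_probabilities) pairs; the
    probabilities would normally come from the current model.
    """
    ranked = sorted(pool, key=lambda item: entropy(item[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


# Hypothetical pool of four unlabeled instances with binary predictions.
pool = [
    ("a", [0.5, 0.5]),
    ("b", [0.9, 0.1]),
    ("c", [0.99, 0.01]),
    ("d", [0.6, 0.4]),
]
print(most_uncertain(pool, 2))
```

Instances the model is least sure about are the ones sent for labeling, which is how such a scheme increases the number of useful training instances per labeling step.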
16. Neurodegenerative diseases categorization by applying the automatic model selection and hyperparameter optimization method.
- Author
- Fuentes-Ramos, Mirta, Sánchez-DelaCruz, Eddy, Meza-Ruiz, Iván-Vladimir, Loeza-Mejía, Cecilia-Irene, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *NEURODEGENERATION, *HUNTINGTON disease, *PARKINSON'S disease, *GAIT in humans
- Abstract
Neurodegenerative diseases affect a large part of the population worldwide, including in Mexico, gradually deteriorating patients' quality of life. Therefore, it is important to diagnose them with a high degree of reliability. To this end, various computational methods have been applied to the analysis of biomarkers of human gait. In this study, we propose employing the automatic model selection and hyperparameter optimization method, which has not been applied to this problem before. Our results showed highly competitive percentages of correctly classified instances when discriminating binary and multiclass sets of neurodegenerative diseases: Parkinson's disease, Huntington's disease, and Spinocerebellar ataxias. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
17. Deep symbolic processing of human-performed musical sequences.
- Author
- Rangel, Nahum, Godoy-Calderon, Salvador, Calvo, Hiram, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *MUSICAL perception, *MUSICALS, *INTELLIGENT tutoring systems, *TUTORS & tutoring
- Abstract
Artificial music tutors are needed to assist performers during their practice time whenever a human tutor is not available. But for these artificial tutors to be intelligent and fulfill the role of a music tutor, they have to be able to identify errors made by the performer while playing a musical sequence. This task is not a trivial one, since all musical activities are considered open-ended domains. Therefore, not only is there no unique correct way of performing a musical sequence, but the analysis made by the tutor also has to consider the development level of the performer, the difficulty level of the performed musical sequence, and many other variables. This paper describes ongoing research that uses cascading connected layers of symbolic processing as the core of a module for identifying and characterizing errors in human performances, able to overcome the complexity of the studied open-ended domain. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
18. Modeling a multi-layered blockchain framework for digital services that governments can implement.
- Author
- Rebollar, Fernando, Aldeco-Perez, Rocio, Ramos, Marco A., Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *BLOCKCHAINS, *DATA integrity, *INFORMATION technology security, *GOVERNMENT property, *PANDEMICS
- Abstract
The general population increasingly uses digital services, meaning services that are delivered over the internet or an electronic network, and events such as pandemics have accelerated the need for new digital services. Governments have also increased their number of digital services; however, these digital services still lack sufficient information security, particularly integrity. Blockchain uses cryptographic techniques that allow decentralization and increase the integrity of the information it handles, but it still has disadvantages in terms of efficiency, making it incapable of implementing some digital services where a high rate of transactions is required. In order to increase its efficiency, a multi-layer proposal based on blockchain is presented. It has four layers, where each layer specializes in a different type of information and uses properties of public and private blockchains. A statistical analysis is performed and the proposal is modeled, showing that it maintains and even increases the integrity of the information while preserving the efficiency of transactions. In addition, the proposal is flexible and can adapt to different types of digital services. It also allows voluntary nodes to participate in the decentralization of information, making it more secure, verifiable, transparent and reliable. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
19. Prior latent distribution comparison for the RNN Variational Autoencoder in low-resource language modeling.
- Author
- Kostiuk, Yevhen, Lukashchuk, Mykola, Gelbukh, Alexander, Sidorov, Grigori, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *NATURAL language processing, *CENTRAL limit theorem, *LATENT variables, *CONTINUOUS distributions, *GAUSSIAN distribution
- Abstract
Probabilistic Bayesian methods are widely used in the machine learning domain. The Variational Autoencoder (VAE) is a common architecture for solving the Language Modeling task in a self-supervised way. The VAE relies on the concept of latent variables inside the model. Latent variables are described as random variables that are fit to the data. Up to now, in the majority of cases, latent variables have been considered normally distributed. The normal distribution is a well-known distribution that can be easily included in any pipeline. Moreover, the normal distribution is a good choice when the Central Limit Theorem (CLT) holds, which makes it effective when one is working with i.i.d. (independent and identically distributed) random variables. However, the conditions of the CLT are not easy to check in Natural Language Processing, so the choice of distribution family is unclear in this domain. This paper studies the impact of prior selection among continuous distributions in the Low-Resource Language Modeling task with VAE. The experiment shows that there is a statistical difference between the different priors in the encoder-decoder architecture. We show that the distribution family is an important hyperparameter in the Low-Resource Language Modeling task and should be considered in model training. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
20. A Microlearning path recommendation approach based on ant colony optimization.
- Author
- Rodriguez-Medina, Alma Eloisa, Dominguez-Isidro, Saul, Ramirez-Martinell, Alberto, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *ANT algorithms, *MICROLEARNING, *TRAVELING salesman problem, *ANTS, *LEARNING
- Abstract
This paper presents the technical proposal of a novel approach based on Ant Colony Optimization (ACO) to recommend personalized microlearning paths considering the learning needs of the learner. In this study, the information of the learner was considered from a disciplinary ICT perspective, since the characteristics of our learner correspond to those of a professor with variable characteristics, such as level of knowledge and learning status. The recommendation problem is approached as an instance of the Traveling Salesman Problem (TSP): the educational pills represent the cities, the paths are the relationships between educational pills, and the cost of going from one pill to another can be estimated from their degree of difficulty as well as the performance of the learner during the individual test. The results demonstrate the capacity of the proposed approach to recommend personalized microlearning paths according to the different levels of knowledge of the learners. The higher the number of learners, the more stable the behavior of the algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
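The microlearning abstract above frames path recommendation as a Traveling Salesman Problem solved with Ant Colony Optimization, with educational pills as cities and difficulty-based costs on the links. The sketch below is a generic, minimal ACO for an open path through all items, not the authors' implementation. The cost matrix, parameter values, and function name are assumptions chosen only to show the mechanics of pheromone-guided construction, evaporation, and reinforcement.

```python
import random


def ant_colony_path(cost, n_ants=20, n_iters=50, alpha=1.0, beta=2.0,
                    evaporation=0.5, seed=0):
    """Search for a low-cost ordering of all items (an open path).

    cost[i][j] must be positive for i != j. alpha weights pheromone,
    beta weights the heuristic 1/cost. All defaults are illustrative.
    """
    rng = random.Random(seed)
    n = len(cost)
    pheromone = [[1.0] * n for _ in range(n)]
    best_path, best_cost = None, float("inf")

    def path_cost(path):
        return sum(cost[a][b] for a, b in zip(path, path[1:]))

    for _ in range(n_iters):
        paths = []
        for _ in range(n_ants):
            current = rng.randrange(n)
            path, unvisited = [current], set(range(n)) - {current}
            while unvisited:
                candidates = sorted(unvisited)
                weights = [pheromone[current][j] ** alpha
                           * (1.0 / cost[current][j]) ** beta
                           for j in candidates]
                current = rng.choices(candidates, weights=weights)[0]
                path.append(current)
                unvisited.remove(current)
            paths.append(path)
        # Evaporate old pheromone, then reinforce the links the ants used.
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= 1.0 - evaporation
        for path in paths:
            c = path_cost(path)
            if c < best_cost:
                best_path, best_cost = path, c
            for a, b in zip(path, path[1:]):
                pheromone[a][b] += 1.0 / c
    return best_path, best_cost


# Hypothetical difficulty-based costs between four educational pills.
cost = [
    [0, 1, 4, 6],
    [1, 0, 2, 5],
    [4, 2, 0, 3],
    [6, 5, 3, 0],
]
best_path, best_cost = ant_colony_path(cost)
print(best_path, best_cost)
```

In a personalized setting, the cost matrix would presumably be recomputed per learner from pill difficulty and test performance, so the same search yields different recommended paths for different learners.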
21. Detection of the level of attention in children with ADHD through brain waves and corporal posture.
- Author
- García, Alfredo, González, Juan M., Palomino, Amparo D., Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *BRAIN waves, *ATTENTION-deficit hyperactivity disorder, *HEART beat, *ATTENTION, *MATHEMATICAL analysis, *SITTING position
- Abstract
In today's world, the need for instantaneous information that helps people know their current physical and intellectual condition has become paramount, and new systems that provide information to the user in real time are continually being incorporated into portable devices. This information indicates different health parameters of the user and can be obtained from physiological variables such as number of steps, heart rate, and blood oxygenation level, among others. One of the intellectual conditions users most often want to know is the level of attention reached while executing a task. This work describes a methodology and experiments for determining people's level of attention through a color identification test. It also presents the development and application of a system (hardware and software) that measures the level of attention using two input signals: corporal posture and brain waves. The mathematical analysis used to find the correlation between corporal posture and the level of attention is also presented. The results obtained indicate that corporal posture directly influences people's level of attention. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
22. Marked and unmarked speed bump detection for autonomous vehicles using stereo vision.
- Author
-
Ballinas-Hernández, Ana Luisa, Olmos-Pineda, Ivan, Olvera-López, José Arturo, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
SPEED bumps , *AUTONOMOUS vehicles , *PAVEMENTS , *POINT cloud , *VISION , DEVELOPING countries - Abstract
A current challenge for autonomous vehicles is the detection of irregularities on road surfaces in order to prevent accidents; in particular, speed bump detection is an important task for safe and comfortable autonomous navigation. Some techniques have achieved acceptable speed bump detection under optimal road surface conditions, especially when signs are well marked. However, in developing countries unmarked speed bumps are very common, and existing techniques fail on them. In this paper, a methodology to detect both marked and unmarked speed bumps is proposed. For clearly painted speed bumps, we apply the local binary patterns technique to extract features from an image dataset. For unmarked speed bump detection, we apply stereo vision, where point clouds obtained by 3D reconstruction are converted to triangular meshes by applying Delaunay triangulation. The most relevant features are then selected and extracted to identify speed bump elevations on the surface meshes. The results improve on some existing techniques, since the reconstruction of three-dimensional meshes provides relevant information for detecting speed bumps by their elevation on the surface even when they are not marked. [ABSTRACT FROM AUTHOR]
- Published
- 2022
23. Motion estimation in vehicular environments based on Bayesian dynamic networks.
- Author
-
Reyes-Cocoletzi, Lauro, Olmos-Pineda, Ivan, Olvera-López, J. Arturo, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
BAYESIAN analysis , *RISK-taking behavior - Abstract
The cornerstone of autonomous ground driving with the lowest possible risk of collision in real traffic environments is estimating the movement of obstacles. Predicting the trajectories of multiple obstacles in dynamic traffic scenarios is a major challenge, especially when different types of obstacles, such as vehicles and pedestrians, are involved. To address these issues, this work proposes a novel method based on Bayesian dynamic networks to infer the paths of objects of interest (IO). Environmental information is obtained through stereo video, the direction vectors of multiple obstacles are computed, and the trajectories with the highest probability of occurrence and the possibility of collision are highlighted. The proposed approach was evaluated using test environments with different road layouts and multiple obstacles in real-world traffic scenarios. The results are compared against the ground truth of the paths taken by each detected IO. According to the experimental results, the proposed method obtains a prediction rate of 75% for changes of direction, taking the risk of collision into consideration. The importance of the proposal is that, in contrast with related work, it does not overlook the risk of collision. [ABSTRACT FROM AUTHOR]
- Published
- 2022
24. Electro-impedance mammograms for automatic breast cancer screening: First insights on Mexican patients.
- Author
-
Romero-Coripuna, Rosario Lissiet, Hernández-Farías, Delia Irazú, Murillo-Ortiz, Blanca, Córdova-Fraga, Teodoro, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
MEXICANS , *MAMMOGRAMS , *EARLY detection of cancer , *ELECTRICAL impedance tomography , *EARLY diagnosis , *SUPERVISED learning , *BREAST - Abstract
Breast cancer is a very important health concern around the world. Early detection of such a disease increases the chances of survival. Among the available screening tools, there is the Electro-Impedance Mammography (EIM), which is a novel and less invasive method that captures the potential difference stored in breast tissues under the assumption that electrical properties among normal and pathologically altered tissues are different. In this paper, we address breast cancer detection as a multi-class problem aiming to determine the corresponding label in terms of the Breast Imaging Electrical Impedance classification system, the standard used by physicians for interpreting an EIM mammogram. For experimental purposes, for the first time in the literature, we took advantage of a dataset comprising EIM of Mexican patients. Aiming to establish a baseline for this task, traditional supervised learning methods were used together with two different feature extraction techniques: raw pixel data and transfer learning. Besides, data augmentation was exploited for compensating data imbalance. Different experimental settings were evaluated reaching classification rates over 0.85 in F-score. KNN emerges as a very promising classifier for addressing this task. The obtained results allow us to validate the usefulness of traditional methods for classifying electro-impedance mammograms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
25. Improved approach to wave potential estimation using bivariate distributions.
- Author
-
Guzmán-Cabrera, Rafael, Hernández-Robles, Iván A., González-Ramírez, Xiomara, Guzmán-Sepúlveda, José Rafael, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
OCEAN waves , *ELECTRIC power production , *EXTREME value theory , *ELECTRIC generators , *WAVE energy , *POTENTIAL energy - Abstract
Probabilistic approaches are frequently used to describe irregular activity data to assist the design and development of devices. Unfortunately, useful estimations are not always feasible due to the large noise in the data modeled, as it occurs when estimating the sea waves potential for electricity generation. In this work we propose a simple methodology based on the use of joint probability models that allow discriminating extreme values, collected from measurements as pairs of independent points, while allowing the preservation of the essential statistics of the measurements. The outcome of the proposed methodology is an equivalent data series where large-amplitude fluctuations are suppressed and, therefore, can be used for design purposes. For the evaluation of the proposed method, we used year-long databases of hourly-collected measurements of the wave's height and period, performed at maritime buoys located in the Gulf of Mexico. These measurements are used to obtain a fluctuations-reduced representation of the energy potential of the waves that can be useful, for instance, for the design of electric generators. [ABSTRACT FROM AUTHOR]
- Published
- 2022
26. Query-focused multi-document text summarization using fuzzy inference.
- Author
-
Agarwal, Raksha, Chatterjee, Niladri, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
FUZZY logic , *LINEAR programming , *FUZZY systems , *TEXT recognition , *INTEGER programming - Abstract
The present paper proposes a fuzzy inference system for query-focused multi-document text summarization (MTS). The overall scheme is based on the Mamdani inferencing scheme, which helps in designing a fuzzy rule base for inferring the decision variable from a set of antecedent variables. The antecedent variables chosen for the task come from linguistic and positional heuristics and from the similarity of the documents to the user-defined query. The decision variable is the rank of the sentences as decided by the rules. The final summary is generated by solving an Integer Linear Programming problem. For abstraction, coreference resolution is applied to the input sentences in the pre-processing step. Although the system is designed on the basis of a small set of antecedent variables, the results are very promising. [ABSTRACT FROM AUTHOR]
- Published
- 2022
27. Fake News detection using n-grams for PAN@CLEF competition.
- Author
-
Damian, Sergio, Calvo, Hiram, Gelbukh, Alexander, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
FAKE news , *FEATURE selection , *ARTIFICIAL intelligence , *DISINFORMATION - Abstract
The paper presents a classifier for detecting fake news spreaders in social media. Detecting fake news spreaders is an important task because this kind of disinformation aims to change the reader's opinion about a topic relevant to society. This work presents a classifier that can compete with state-of-the-art ones. In addition, it applies Explainable Artificial Intelligence (XAI) methods in order to understand the corpora used and how the model estimates results. The work focuses on the corpora developed by members of the PAN@CLEF 2020 competition. The score obtained surpasses the state of the art with a mean accuracy of 0.7825. The solution uses XAI methods for the feature selection process, since they give a more stable selection than most traditional feature selection methods. The work also concludes that the detection performed by the proposed solution is generally based on the topic of the text. [ABSTRACT FROM AUTHOR]
- Published
- 2022
28. Classification and enhancement of invasive ductal carcinoma samples using convolutional neural networks.
- Author
-
Sierra-Enriquez, Edgar E., Valdez-Rodríguez, José E., Felipe-Riveró, Edgardo M., Calvo, Hiram, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
CONVOLUTIONAL neural networks , *DUCTAL carcinoma , *BREAST , *DIGITAL images - Abstract
In the medical area, invasive ductal carcinoma is the most common sub-type of breast cancer; about 80% of all breast cancers are invasive ductal carcinomas. Detecting this type of cancer is a great challenge for specialist doctors, since the digital images of the sample must be analyzed by sections because their spatial dimensions are above 50k × 50k pixels; doing this manually takes a long time to determine whether the patient has this type of cancer. Time is essential for the patient because this cancer can quickly invade other parts of the body, as the term "invasive" in its name indicates. To address this task, we propose an automatic methodology that improves the performance of a convolutional neural network that classifies images containing invasive ductal carcinoma cells by highlighting cancer cells with several preprocessing methods, such as histogram stretching and contrast enhancement. In this way, characteristics of the sub-images are extracted from the panoramic sample, and the network can learn to classify them better. [ABSTRACT FROM AUTHOR]
- Published
- 2022
29. Question type and answer related keywords aware question generation.
- Author
-
Zhang, Jianfei, Rong, Wenge, Chen, Dali, Xiong, Zhang, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
MEASUREMENT - Abstract
Traditional end-to-end Neural Question Generation (NQG) models tend to generate generic and bland questions, as there are two obscure points: 1) the modifications of the answer in the context can be used as clues to the answer mentioned in the question, but they are generally not unique and can be used independently to generate diverse questions; 2) the same question content can also be asked in diverse ways, which in practice depends on personal preference. These two points are in fact two variables that drive question generation, but they are not annotated in the original dataset and are thus ignored by traditional end-to-end models. In this paper we propose a framework that clarifies those two points through two sub-modules to better conduct question generation. We conduct experiments based on the GPT-2 model and the SQuAD dataset, and show that our framework can improve the performance measured by similarity metrics while also providing appropriate alternatives for controllable diversity enhancement. [ABSTRACT FROM AUTHOR]
- Published
- 2022
30. A content spectral-based text representation.
- Author
-
Crespo-Sanchez, Melesio, Lopez-Arevalo, Ivan, Aldana-Bobadilla, Edwin, Molina-Villegas, Alejandro, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
MACHINE learning , *SPAM email , *MACHINE translating , *ALGORITHMS - Abstract
In the last few years, text analysis has become a keystone in several domains for solving many real-world problems, such as machine translation, spam detection, and question answering, to mention a few. Many of these tasks can be approached by means of machine learning algorithms. Most of these algorithms take as input a transformation of the text in the form of feature vectors containing an abstraction of the content. Most recent vector representations focus on the semantic component of text; however, we consider that also taking into account the lexical and syntactic components in the abstraction of content could benefit learning tasks. In this work, we propose a content spectral-based text representation applicable to machine learning algorithms for text analysis. This representation integrates the spectra from the lexical, syntactic, and semantic components of text, producing an abstract image that can be treated by both text and image learning algorithms. These components come from feature vectors of the text. To demonstrate the merit of our proposal, we tested it on text classification and reading complexity score prediction tasks, obtaining promising results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
31. A unified deep neuro-fuzzy approach for COVID-19 twitter sentiment classification.
- Author
-
Bahuguna, Aman, Yadav, Deepak, Senapati, Apurbalal, Saha, Baidya Nath, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
FUZZY integrals , *COVID-19 , *CLASSIFICATION , *SENTIMENT analysis , *FUZZY neural networks , *SEMANTICS - Abstract
Covid-19 has brought about a serious mental health crisis across the world. Since a vast majority of the population uses social media platforms such as Twitter to exchange information, rapidly collecting and analyzing social media data to understand personal well-being, and subsequently adopting adequate measures, could avoid severe socio-economic damage. Sentiment analysis of Twitter data is very useful for understanding and identifying mental health issues. In this research, we propose a unified deep neuro-fuzzy approach for Covid-19 Twitter sentiment classification. Fuzzy logic has been a very powerful tool for Twitter data analysis, where approximate semantic and syntactic analysis is more relevant because correcting spelling and grammar in tweets is merely obnoxious. We conducted the experiment on three challenging COVID-19 Twitter sentiment datasets. Experimental results demonstrate that ensemble classifiers based on the fuzzy Sugeno integral outperform the individual base classifiers. [ABSTRACT FROM AUTHOR]
- Published
- 2022
32. A novel methodology of parametric identification for robots based on a CNN.
- Author
-
Carreón-Díaz de León, Carlos Leopoldo, Vergara-Limon, Sergio, Vargas-Treviño, María Aurora D., González-Calleros, Juan Manuel, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
CONVOLUTIONAL neural networks , *ROBOTS , *IDENTIFICATION , *ROBOT programming - Abstract
This paper presents a novel methodology to identify the dynamic parameters of a real robot with a convolutional neural network (CNN). Conventional identification methodologies use continuous motion signals. However, these signals are quantized in their amplitude and are discrete in time. Therefore, the time required to identify the parameters of a robot with a limited measurement system is related to an optimized motion trajectory performed by the robot. The proposed methodology consists of an algorithm that uses a trained CNN with the data created by the dynamical model of the case study robot. A processing technique is proposed to transform the position, velocity, acceleration, and torque robot signals into an image whose characteristics are extracted by the CNN to determine their dynamic parameters. The proposed algorithm does not require any optimal trajectory to find the dynamic parameters. A proposed time-spectral evaluation metric is used to validate the robot data and the identification data. The validation results show that the proposed methodology identifies the parameters of a Cartesian robot in less than 1 second, exceeding 90% of the proposed evaluation metric and 98% for the simulation results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
33. Automatic generation of a large dictionary with concreteness/abstractness ratings based on a small human dictionary.
- Author
-
Ivanov, Vladimir, Solovyev, Valery, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *PSYCHOLOGICAL research , *EXTRAPOLATION - Abstract
Concrete/abstract words are used in a growing number of psychological and neurophysiological studies. For a few languages, large dictionaries have been created manually. This is a very time-consuming and costly process. To generate large, high-quality dictionaries of concrete/abstract words automatically, one needs to extrapolate the expert assessments obtained on smaller samples. The research question that arises is how small such samples can be while still allowing a good enough extrapolation. In this paper, we present a method for automatically ranking the concreteness of words and propose an approach that significantly decreases the amount of expert assessment required. The method has been evaluated on a large test set for English. The quality of the constructed dictionaries is comparable to that of the expert ones. The correlation between predicted and expert ratings is higher than that of state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
34. Latent semantic analysis for tagging activation states and identifiability in northwestern Mexican news outlets.
- Author
-
Sánchez-Fernández, Manuel-Alejandro, Medina-Urrea, Alfonso, Torres-Moreno, Juan-Manuel, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
LATENT semantic analysis , *PEARSON correlation (Statistics) , *NOUN phrases (Grammar) , *INFLECTION (Grammar) , *RANDOM forest algorithms - Abstract
The present work aims to study the relationship between measures, obtained from Latent Semantic Analysis (LSA) and a variant known as SPAN, and activation and identifiability states (Informative States) of referents in noun phrases present in journalistic notes from Northwestern Mexican news outlets written in Spanish. The aim and challenge is to find a strategy to achieve labelling of new / given information in the discourse rooted in a theoretically linguistic stance. The new / given distinction can be defined from different perspectives in which it varies what linguistic forms are taken into account. Thus, the focus in this work is to work with full referential devices (n = 2 388). Pearson's R correlation tests, analysis of variance, graphical exploration of the clustering of labels, and a classification experiment with random forests are performed. For the experiment, two groups were used: noun phrases labeled with all 10 tags of informative states and a binary labelling, as well as the use of two bags-of-words for each noun phrase: the interior and the exterior. It was found that using LSA in conjunction with the inner bag of words can be used to classify certain informational states. This same measure showed good results for the binary division, detecting which sentences introduce new referents in discourse. In previous work using a similar method in noun phrases in English, 80% accuracy (n = 478) was reached in their classification exercise. Our best test for Spanish reached 79%. No work on Spanish using this method has been done before and this kind of experiment is important because Spanish exhibits a more complex inflectional morphology. [ABSTRACT FROM AUTHOR]
- Published
- 2022
35. Onto4AIR2: An ontology to manage theses from open repositories.
- Author
-
Medina Nieto, María Auxilio, de la Calleja Mora, Jorge, Zepeda Cortés, Claudia, López Domínguez, Eduardo, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
ONTOLOGIES (Information retrieval) , *ONTOLOGY , *INSTITUTIONAL repositories , *SPANISH language , *DATA libraries , *EDUCATION associations - Abstract
This paper describes Onto4AIR2, an ontology to manage theses from open repositories. It fosters unique and formal definitions of concepts from the Mexican repositories domain in English and Spanish. Its goal is to support the construction of machine-readable, semantically labeled datasets for later consultation in educational organizations. The ontology instances are sample data of theses from the National Repository of Mexico, an initiative promoted by the National Council of Science and Technology. The paper describes the advantages derived from the formalisms of the ontology and an assessment technique in which the participants are developers and potential users. Developers followed a competency questions-based approach and determined that the ontology represents questions and answers using its terminology, whereas potential users participated in a satisfaction survey whose results showed a positive perception. At present, the ontology is at the proof-of-concept level. [ABSTRACT FROM AUTHOR]
- Published
- 2022
36. Improved argumentative paragraphs detection in academic theses supported with unit segmentation.
- Author
-
García-Gorrostieta, Jesús Miguel, López-López, Aurelio, González-López, Samuel, López-Monroy, Adrián Pastor, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
PARAGRAPHS , *ACADEMIC discourse , *DECISION trees , *NATURAL language processing - Abstract
Writing an academic thesis is a complex task that requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this is a skill that takes time to master. In this paper, we present an exploration of lexical features used to model the automatic detection of argumentative paragraphs using machine learning techniques. We present a novel proposal that combines the information in the complete paragraph with the detection of argumentative segments in order to improve the detection of argumentative paragraphs. We propose two approaches: a more descriptive one, which uses a decision tree classifier with indicators and lexical features, and a more efficient one, which uses an SVM classifier with lexical features and a Document Occurrence Representation (DOR). Both approaches consider the detection of argumentative segments to ensure that a paragraph detected as argumentative does contain segments with argumentation. We achieved encouraging results for both approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2022
37. Hyperparameter tuning for multi-label classification of feedbacks in online courses.
- Author
-
Ruiz Alonso, Dorian, Zepeda Cortés, Claudia, Castillo Zacatelco, Hilda, Carballido Carranza, José Luis, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
ONLINE education , *PSYCHOLOGICAL feedback , *NAIVE Bayes classification , *SUPPORT vector machines , *K-nearest neighbor classification , *KEY performance indicators (Management) , *RANDOM forest algorithms - Abstract
In this work, we propose an extension of a methodology for the multi-label classification of feedback according to the Hattie and Timperley feedback model, incorporating a hyperparameter tuning stage. The classifiers automatically assign the feedback that a teacher gives on activities submitted by students in online courses on the Blackboard platform to the task, process, regulation, praise, and other levels proposed in that model. We analyze whether adding the hyperparameter tuning stage before running the support vector machine, random forest, and multi-label k-nearest neighbors algorithms improves the performance metrics of these multi-label classifiers. The grid search strategy is used to refine the hyperparameters of each algorithm. The results show that adjusting the hyperparameters improves the performance metrics for the data set used. [ABSTRACT FROM AUTHOR]
- Published
- 2022
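The grid search strategy named in the abstract above can be sketched in a few lines. This is a simplified, single-label illustration with a hand-written k-nearest-neighbors classifier, not the multi-label pipeline of the paper; the data, the grid values, and the names `knn_predict`, `loo_accuracy`, and `grid_search` are assumptions.

```python
from collections import Counter
from itertools import product


def knn_predict(train, query, k, p):
    """Predict a label by majority vote of the k nearest training points.

    train is a list of (features, label); p selects the Minkowski distance
    (p=1 Manhattan, p=2 Euclidean).
    """
    dists = sorted(
        (sum(abs(a - b) ** p for a, b in zip(x, query)) ** (1.0 / p), label)
        for x, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k, p):
    """Leave-one-out accuracy of the k-NN classifier on data."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k, p) == y
    return hits / len(data)


def grid_search(data, grid):
    """Try every combination in grid and keep the best-scoring one."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = loo_accuracy(data, **params)
        if score > best_score:  # ties keep the earlier combination
            best_params, best_score = params, score
    return best_params, best_score


# Made-up 2-D feature vectors labeled with two feedback levels.
feedback = [
    ((0, 0), "task"), ((0, 1), "task"), ((1, 0), "task"),
    ((5, 5), "praise"), ((5, 6), "praise"), ((6, 5), "praise"),
]
print(grid_search(feedback, {"k": [1, 3, 5], "p": [1, 2]}))
```

Exhaustive search is only practical for small grids, since the number of combinations grows multiplicatively with each added hyperparameter.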
38. An autoencoder-based representation for noise reduction in distant supervision of relation extraction.
- Author
-
García-Mendoza, Juan-Luis, Villaseñor-Pineda, Luis, Orihuela-Espina, Felipe, Bustio-Martínez, Lázaro, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
SUPERVISION - Abstract
Distant Supervision is an approach that allows automatic labeling of instances. This approach has been used in Relation Extraction. Still, the main challenge of this task is handling instances with noisy labels (e.g., when two entities in a sentence are automatically labeled with an invalid relation). The approaches reported in the literature addressed this problem by employing noise-tolerant classifiers. However, introducing a noise reduction stage before the classification step increases the macro precision values. This paper proposes an Adversarial Autoencoders-based approach for obtaining a new representation that allows noise reduction in Distant Supervision. The representation obtained using Adversarial Autoencoders minimizes the intra-cluster distance compared with pre-trained embeddings and classic Autoencoders. Experiments demonstrated that, with the same classifier, the noise-reduced datasets yield macro precision values similar to those obtained over the original dataset while using fewer instances. For example, in one of the noise-reduced datasets, the macro precision improved by approximately 2.32% using 77% of the original instances. This suggests the validity of using Adversarial Autoencoders to obtain well-suited representations for noise reduction. The proposed approach also maintains the macro precision values of the original dataset while reducing the total number of instances needed for classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
39. A case study in authorship attribution: The Mondrigo.
- Author
-
Sierra, Gerardo, Hernández-García, Tonatiuh, Gómez-Adorno, Helena, Bel-Enguix, Gemma, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
ATTRIBUTION of authorship , *STUDENT strikes , *K-means clustering , *STUDENT government , *AUTHORSHIP - Abstract
In this paper, we present authorship attribution methods applied to ¡El Mondrigo! (1968), a controversial text supposedly created by order of the Mexican Government to defame a student strike. Up to now, although the authorship of the book has been attributed to several journalists and writers, it could not be demonstrated and remains an open problem. The work aims at establishing which one of the most commonly attributed writers is the real author. To do that, we implement methods based on stylometric features using textual distance, supervised, and unsupervised learning. The distance-based methods implemented in this work are Kilgarriff and Delta of Burrows, an SVM algorithm is used as the supervised method, and the k-means algorithm as the unsupervised algorithm. The applied methods were consistent by pointing out a single author as the most likely one. [ABSTRACT FROM AUTHOR]
- Published
- 2022
40. Automatic generation of learning outcomes based on long short–term memory artificial neural network.
- Author
-
Suárez-Cansino, Joel, López-Morales, Virgilio, Ramos-Fernández, Julio César, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
EDUCATIONAL outcomes , *INSTRUCTIONAL systems design - Abstract
Building a good instructional design requires sound organizational management to schedule and articulate several tasks based on, for instance, time availability, process follow-up, and the social and educational context. Furthermore, learning outcomes are the basis of every educational activity. Thus, based on a predefined ontology that includes the instructional educative model and its characteristics, we propose the use of a Long Short–Term Memory Artificial Neural Network (LSTM) to organize the structure and automate the generation of learning outcomes for a focused instructional design. We present encouraging results in this direction using an LSTM trained on a small set of learning outcomes predefined by the user and focused on the characteristics of a previously defined educative model. [ABSTRACT FROM AUTHOR]
- Published
- 2022
41. Fake news spreaders profiling using N-grams of various types and SHAP-based feature selection.
- Author
-
Balouchzahi, Fazlourrahman, Sidorov, Grigori, Shashirekha, Hosahalli Lakshmaiah, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
FEATURE selection , *FAKE news , *NATURAL language processing , *RADIAL basis functions , *SUPPORT vector machines , *DEEP learning - Abstract
Complex learning approaches along with complicated and expensive features are not always the best or the only solution for Natural Language Processing (NLP) tasks. Despite huge progress in learning approaches such as Deep Learning (DL) and Transfer Learning (TL), there are many NLP tasks, such as Text Classification (TC), for which basic Machine Learning (ML) classifiers outperform DL or TL approaches. In addition, an efficient feature engineering step can significantly improve the performance of ML-based systems. To check the efficacy of ML-based systems and feature engineering on TC, this paper explores char, character sequences, syllables, word n-grams, and syntactic n-grams as features, and uses SHapley Additive exPlanations (SHAP) values to select the important features from the collection of extracted features. Voting Classifiers (VC) with soft and hard voting over four ML classifiers, namely Support Vector Machine (SVM) with Linear and Radial Basis Function (RBF) kernels, Logistic Regression (LR), and Random Forest (RF), were trained and evaluated on the Fake News Spreaders Profiling (FNSP) shared task dataset of PAN 2020. This shared task consists of profiling fake news spreaders in English and Spanish. The proposed models achieved an average accuracy of 0.785 across both languages in this shared task and outperformed the best models submitted to it. [ABSTRACT FROM AUTHOR]
- Published
- 2022
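The difference between the hard and soft voting mentioned in the abstract above can be shown with a small sketch. The probabilities below are made up and the classifier outputs are hypothetical; only the two combination rules are illustrated, not the paper's trained models.

```python
from collections import Counter


def hard_vote(prob_rows):
    """Each classifier votes for its top label; the majority label wins."""
    votes = Counter(max(row, key=row.get) for row in prob_rows)
    return votes.most_common(1)[0][0]


def soft_vote(prob_rows):
    """Average the class probabilities and pick the highest average."""
    labels = list(prob_rows[0])
    averages = {
        label: sum(row[label] for row in prob_rows) / len(prob_rows)
        for label in labels
    }
    return max(averages, key=averages.get)


# Hypothetical outputs of four classifiers for one user profile.
outputs = [
    {"spreader": 0.55, "non-spreader": 0.45},
    {"spreader": 0.60, "non-spreader": 0.40},
    {"spreader": 0.52, "non-spreader": 0.48},
    {"spreader": 0.02, "non-spreader": 0.98},
]
print(hard_vote(outputs), soft_vote(outputs))
```

Here three weakly confident classifiers carry the hard vote, while the one highly confident classifier tips the averaged probabilities the other way, so the two rules can disagree on the same inputs.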
42. Multi-label text classification with an ensemble feature space.
- Author
-
Tandon, Kushagri, Chatterjee, Niladri, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *
FEATURE extraction , *CLASSIFICATION , *MACHINE learning , *NAIVE Bayes classification - Abstract
Multi-label text classification aims at assigning more than one class to a given text document, which makes the task more ambiguous and challenging at the same time. The ambiguities come from the fact that often several labels in the prescribed label set are semantically close to each other, making clear demarcation between them difficult. As a consequence, any Machine Learning based approach for developing multi-label classification scheme needs to define its feature space by choosing features beyond linguistic or semi-linguistic features, so that the semantic closeness between the labels is also taken into account. The present work describes a scheme of feature extraction where the training document set and the prescribed label set are intertwined in a novel way to capture the ambiguity in a meaningful way. In particular, experiments were conducted using Topic Modeling and Fuzzy C-Means clustering which aim at measuring the underlying uncertainty using probability and membership based measures, respectively. Several Nonparametric hypothesis tests establish the effectiveness of the features obtained through Fuzzy C-Means clustering in multi-label classification. A new algorithm has been proposed for training the system for multi-label classification using the above set of features. [ABSTRACT FROM AUTHOR]
- Published
- 2022
43. Searching for memory-lighter architectures for OCR-augmented image captioning.
- Author
- Gallardo-García, Rafael, Beltrán-Martínez, Beatriz, Hernández-Gracidas, Carlos, Vilariño-Ayala, Darnes, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *OPTICAL character recognition, *COMPUTER performance, *IMAGING systems, *READING comprehension, *TEST scoring
- Abstract
Current state-of-the-art image captioning systems that can read text and integrate it into the generated descriptions need high processing power and memory usage, which limits the sustainability and usability of the models (as they require expensive and very specialized hardware). The present work introduces two alternative versions (L-M4C and L-CNMT) of top architectures on the TextCaps challenge, adapted to achieve near-state-of-the-art performance while being memory-lighter than the original architectures. This is mainly achieved by using distilled or smaller pre-trained models in the text- and OCR-embedding modules. On the one hand, a distilled version of BERT was used to reduce the size of the text-embedding module (the distilled model has 59% fewer parameters). On the other hand, the OCR context processor in both architectures uses Global Vectors (GloVe) instead of FastText pre-trained vectors, which can reduce the memory used by the OCR-embedding module by up to 94%. Two of the three models presented in this work surpassed the challenge baseline (M4C-Captioner) on the evaluation and test sets. Our best lighter architecture reached a CIDEr score of 88.24 on the test set, which is 7.25 points above the baseline model. [ABSTRACT FROM AUTHOR]
- Published
- 2022
44. Handling temporality in human activity reasoning.
- Author
- Morveli-Espinoza, Mariela, Nieves, Juan Carlos, Tacla, Cesar Augusto, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *LATENT variables, *COMMON sense, *HUMAN beings, *KNOWLEDGE base, *GENERAL semantics
- Abstract
Human-aware Artificial Intelligence systems are goal-directed autonomous systems that are capable of interacting, collaborating, and teaming with humans. Activity reasoning is a formal reasoning approach that aims to provide common sense reasoning capabilities to these interactive and intelligent systems. This reasoning can be done by considering evidence (which may be conflicting) related to the activities a human performs. In this context, it is important to consider the temporality of such evidence in order to distinguish activities and to analyse the relations between activities. Our approach is based on formal argumentation reasoning, specifically Timed Argumentation Frameworks (TAF), which is an appropriate technique for dealing with inconsistencies in knowledge bases. Our approach involves two steps: local selection and global selection. In the local selection, a model of the world and of the human's mind is constructed in the form of hypothetical fragments of activities (pieces of evidence) by considering a set of observations. These hypothetical fragments have two kinds of relations: a conflict relation and a temporal relation. Based on these relations, the argumentation attack notion is defined. We define two forms of attack, namely the strong attack and the weak attack. The former has the same characteristics as attacks in TAF, whereas for the latter the TAF approach has to be extended. To determine consistent sets of hypothetical fragments that are part of an activity or of a set of non-conflicting activities, extension-based argumentation semantics are applied. In the global selection, the degrees of fulfillment of activities are determined. We study some properties of our approach and apply it to a scenario where a human performs activities with different temporal relations. [ABSTRACT FROM AUTHOR]
- Published
- 2022
45. Soft Rough Set based span for unsupervised keyword extraction.
- Author
- Chatterjee, Niladri, Roy, Aayush Singha, Yadav, Nidhika, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *ROUGH sets, *SOFT sets, *GREEDY algorithms, *TEXT recognition
- Abstract
The present work proposes an application of Soft Rough Set and its span for unsupervised keyword extraction. In recent times Soft Rough Sets have been applied in various domains, though none of these applications is in the area of keyword extraction. On the other hand, the concept of Rough Set based span has been developed for improved efficiency in the domain of extractive text summarization. In this work we amalgamate these two techniques into Soft Rough Set based Span (SRS) to provide an effective solution for keyword extraction from texts. The universe for Soft Rough Set is taken to be a collection of words from the input texts. SRS provides an ideal platform for identifying the set of keywords from the input text, which cannot always be defined clearly and unambiguously. The proposed technique uses a greedy algorithm for computing spanning sets. The experimental results suggest that extraction of keywords using the proposed scheme gives consistent results across different domains. It has also been found to be more efficient than several existing unsupervised techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2022
46. A statistical evaluation of the oral vaccine S3pvac papaya against cysticercosis of taenia psiformis.
- Author
- Loranca, Maria Beatriz Bernábe, Rosales, José Espinosa, Orea, Mirna Huerta, Cardiff, John, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *ORAL vaccines, *CYSTICERCOSIS, *PAPAYA, *TAENIA, *SALINE solutions
- Abstract
The objective of this paper is to compare and statistically evaluate the behavior of two vaccines against cysticercus in a sample of female rabbits. The two vaccines under discussion are 1) S3Pvac-Papaya 12 mg and 2) Wild Type (WT) or S3P Wild, with 3) Saline Solution as a third treatment for comparison. The challenge is to show that the developed vaccine, S3Pvac-Papaya, produces more antibodies and with better stability than the other vaccine and saline solution. To test this conjecture, an analysis of variance (ANOVA) and multiple Fisher comparisons at 95% confidence were performed. In the box plot at T2, the vaccine of interest, S3Pvac-Papaya, showed high antibody development with little dispersion, which implies that S3Pvac-Papaya is statistically efficient in the production of antibodies. Finally, the mathematical contribution centers on highlighting the low use of inferential statistical techniques for comparing the means of antibodies generated by a set of vaccines in order to determine which one is more efficient and reliable. Implicitly, a methodology both statistical and procedural has been proposed throughout this work, to be applied when contrasting other kinds of vaccines in both animals and humans for diverse conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
47. Unsupervised authorship attribution using feature selection and weighted cosine similarity.
- Author
- Martín-del-Campo-Rodríguez, Carolina, Sidorov, Grigori, Batyrshin, Ildar, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *ATTRIBUTION of authorship, *FEATURE extraction, *MACHINE learning, *FEATURE selection
- Abstract
This paper presents a computational model for the unsupervised authorship attribution task based on a traditional machine learning scheme. An improvement over the state of the art is achieved by comparing different feature selection methods on the PAN17 author clustering dataset. To achieve this improvement, specific pre-processing and feature extraction methods were proposed, such as a method to separate tokens by type so that each is assigned to only one category. Similarly, special characters are used as part of the punctuation marks to improve the result obtained when applying typed character n-grams. The weighted cosine similarity measure is applied to improve the B3 F-score by reducing the vector values where attributes are exclusive. This measure is used to define distances between documents, which are later used by the clustering algorithm to perform authorship attribution. [ABSTRACT FROM AUTHOR]
- Published
- 2022
48. Algorithmic music generation by harmony recombination with genetic algorithm.
- Author
- Lopez-Rincon, Omar, Starostenko, Oleg, Lopez-Rincon, Alejandro, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *GENETIC recombination, *GENETIC algorithms, *HARMONY in music, *MUSICAL composition, *FEATURE extraction, *DESCRIPTOR systems
- Abstract
Algorithmic music composition has recently become a prestigious research area, with projects such as Google's Magenta, Aiva, and Sony's CSL Lab aiming to expand composers' tools for creativity. There are advances in systems for music feature extraction and generation of harmonies with short-time and long-time patterns of music style, genre, and motif. However, there are still challenges in the creation of poly-instrumental and polyphonic music: pieces become repetitive, and sometimes these systems copy the original files. The main contribution of this paper is an improvement in generating new, non-plagiarized harmonic developments constructed from the symbolic abstraction of non-labeled MIDI music data, with controlled selection of rhythmic features based on evolutionary techniques. In particular, a novel approach is proposed for generating new music compositions by replacing existing harmony descriptors in a MIDI file with new harmonic features from another MIDI file selected by a genetic algorithm. This allows combining newly created harmony with the rhythm of another composition, guaranteeing the adjustment of a new music piece to a distinctive genre with regularity and consistency. The performance of the proposed approach has been assessed using artificial intelligence computational tests, which assure the goodness of the extracted features and show its quality and competitiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2022
49. Multi-label classification of feedbacks.
- Author
- Ruiz Alonso, Dorian, Zepeda Cortés, Claudia, Castillo Zacatelco, Hilda, Carballido Carranza, José Luis, García Cué, José Luis, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *RANDOM forest algorithms, *PSYCHOLOGICAL feedback, *NATURAL language processing, *TEXT mining, *ONLINE education, *COLLEGE curriculum
- Abstract
This work deals with educational text mining, a field of natural language processing applied to education. The objective is to classify the feedback that teachers in online courses give on the activities submitted by students, according to the model of Hattie and Timperley (2007), considering that feedback may be at the levels task, process, regulation, praise, and other. Four multi-label classification methods of the data transformation approach (binary relevance, classifier chains, label powerset, and RAkELd) are compared using the base algorithms SVM, Random Forest, Logistic Regression, and Naive Bayes. The methodology was applied to a case study in which 11013 feedback messages written in Spanish, from 121 online courses of the Law degree at a public university in Mexico, were collected from the Blackboard learning management system. The results show that the random forest and support vector machine algorithms perform best when using the binary relevance and classifier chains transformation methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
50. Wavelets as activation functions in Neural Networks.
- Author
- Herrera, Oscar, Priego, Belém, Pinto, David, Beltrán, Beatriz, and Singh, Vivek
- Subjects
- *DEEP learning
- Abstract
Traditionally, a few activation functions have been considered in neural networks, including bounded functions such as threshold, sigmoidal, and hyperbolic-tangent, as well as unbounded functions such as ReLU, GELU, and Soft-plus, among others used in deep learning, but the search for new activation functions remains an open research area. In this paper, wavelets are reconsidered as activation functions in neural networks, and the performance of Gaussian family wavelets (first, second, and third derivatives) is studied together with other functions available in Keras-Tensorflow. Experimental results show how the combination of these activation functions can improve performance and support the idea of extending the list of activation functions to wavelets, which can be made available in high performance platforms. [ABSTRACT FROM AUTHOR]
- Published
- 2022