337 results for "sequence mining"
Search Results
2. Sequence Analysis in Education: Principles, Technique, and Tutorial with R
- Author
-
Saqr, Mohammed, López-Pernas, Sonsoles, Helske, Satu, Durand, Marion, Murphy, Keefe, Studer, Matthias, Ritschard, Gilbert, Saqr, Mohammed, editor, and López-Pernas, Sonsoles, editor
- Published
- 2024
3. Dropout is not always a failure! Exploration on the prior knowledge and learning behaviors of MOOC learners
- Author
-
Matcha, Wannisa, Natthaphatwirata, Rusada, Uzir, Nora’ayu Ahmad, and Gašević, Dragan
- Published
- 2024
4. AirPollutionViz: visual analytics for understanding the spatio-temporal evolution of air pollution.
- Author
-
Yue, Xiaoqi, Feng, Dan, Sun, Desheng, Liu, Chao, Qin, Hongxing, and Hu, Haibo
- Abstract
Spatio-temporal evolution analysis has been a critical topic of air pollution research. However, there are still several difficulties caused by the large scale and dimensionality of the data. First, traditional methods deal with such data by simplifying and abstracting, resulting in information loss. Second, most existing visualizations, generally focusing on overall evolution, ignore the exploration of multiple time scales and pattern transitions between subsequences. This paper presents AirPollutionViz, a visual analytics system that supports analysis of spatio-temporal evolution in two ways: sequence mining and clustering analysis. Concretely, we propose sequence merging to shorten the sequence length and construct a weighted directed graph structure, which promotes efficient querying of sequence patterns in combination with dynamic time warping. We design a novel summary view to display the overview of pollution level changes, together with an improved node-link chart, to support the analysis of air pollution spatio-temporal evolution patterns. We also apply K-means clustering to pollutants, and a scatter plot and map reflect the spatial distribution aggregation. The system supports users' free exploration across multiple time scales with rich interactions. Case studies with three domain experts and a user study with ten users demonstrate the usefulness and effectiveness of AirPollutionViz. [ABSTRACT FROM AUTHOR]
- Published
- 2024
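The abstract for AirPollutionViz mentions sequence merging and dynamic time warping without giving details. The following is a minimal sketch of one plausible reading, not the authors' implementation: `merge_runs` (a hypothetical name) collapses consecutive repeats of the same pollution level, and `dtw_distance` (also hypothetical) is the textbook dynamic time warping recurrence with an absolute-difference cost.

```python
def merge_runs(levels):
    """Collapse consecutive repeats into (level, run_length) pairs.

    Assumption: "sequence merging" is read here as run-length merging.
    """
    merged = []
    for level in levels:
        if merged and merged[-1][0] == level:
            merged[-1] = (level, merged[-1][1] + 1)
        else:
            merged.append((level, 1))
    return merged


def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    # prev[j] holds the best cost of aligning the rows seen so far with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: insertion, deletion, or match.
            curr.append(cost + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]
```

Under this reading, a merged sequence keeps its shape while becoming shorter, and DTW still treats it as identical to the unmerged one because repeated values can be warped onto a single point.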
5. Sky-signatures: detecting and characterizing recurrent behavior in sequential data.
- Author
-
Gautrais, Clément, Cellier, Peggy, Guyet, Thomas, Quiniou, René, and Termier, Alexandre
- Subjects
NATURAL language processing, DATA mining, POLITICAL oratory, RECURRENT neural networks
- Abstract
This paper proposes the sky-signature model, an extension of the signature model of Gautrais et al. (in: Proceedings of the Pacific-Asia conference on knowledge discovery and data mining (PAKDD), Springer, 2017b) to multi-objective optimization. The signature approach considers a sequence of itemsets, and given a number k it returns a segmentation of the sequence into k segments such that the number of items occurring in all segments is maximized. The limitation of this approach is that it requires k to be set manually, and thus fixes the temporal granularity at which the data is analyzed. The sky-signature model proposed in this paper removes this requirement and allows the results to be examined at multiple levels of granularity, while keeping a compact output. This paper also proposes efficient algorithms to mine sky-signatures, as well as an experimental validation on real data from both the retail domain and natural language processing (political speeches). [ABSTRACT FROM AUTHOR]
- Published
- 2024
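The signature definition in the abstract (split a sequence of itemsets into k contiguous segments so that as many items as possible occur in every segment) can be illustrated with a brute-force sketch. This is an assumption-laden toy, exponential in k, and not the efficient algorithm the paper proposes; the function name `best_signature` is hypothetical.

```python
from itertools import combinations


def best_signature(sequence, k):
    """Return (items, cuts) for the best split of `sequence` into k segments.

    `sequence` is a list of sets (itemsets). `cuts` are the indices at which
    a new segment starts. Ties keep the first segmentation found.
    """
    n = len(sequence)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(sequence)")
    best_items, best_cuts = frozenset(), None
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        # Items seen anywhere inside each segment.
        segment_items = [
            set().union(*sequence[a:b]) for a, b in zip(bounds, bounds[1:])
        ]
        # Items that occur in every segment form the candidate signature.
        common = frozenset(set.intersection(*segment_items))
        if best_cuts is None or len(common) > len(best_items):
            best_items, best_cuts = common, cuts
    return best_items, best_cuts
```

Running this for several values of k on the same data shows the trade-off the abstract describes: a larger k gives finer temporal granularity but usually a smaller set of recurring items.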
6. Efficient Frequent Chronicle Mining Algorithms: Application to Sleep Disorder
- Author
-
Hareth Zmezm, Jose Maria Luna, Eduardo Almeda, and Sebastian Ventura
- Subjects
Frequent event graphs, chronicle mining, sequence mining, temporal data mining, sleep disorder, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Sequential pattern mining is a dynamic and thriving research field that aims to extract recurring sequences of events from complex datasets. Traditionally, focusing solely on the order of events often falls short of providing precise insights. Consequently, incorporating the temporal intervals between events has emerged as a vital necessity across various domains, e.g. medicine. Analyzing temporal event sequences within patients’ clinical histories, drug prescriptions, and monitoring alarms exemplifies this critical need. This paper presents innovative and efficient methodologies for mining frequent chronicles from temporal data. The mined graphs offer a significantly more expressive representation than mere event sequences, capturing intricate details of a series of events in a factual manner. The experimental stage includes a series of analyses of diverse databases with distinct characteristics. The proposed approaches were also applied to real-world data comprising information about subjects suffering from sleep disorders. Alluring frequent complete event graphs were obtained on patients who were under the effect of sleep medication.
- Published
- 2024
7. Capturing temporal pathways of collaborative roles: A multilayered analytical approach using community of inquiry
- Author
-
Elmoazen, Ramy, Saqr, Mohammed, Hirsto, Laura, and Tedre, Matti
- Published
- 2024
8. A learning analytics perspective on educational escape rooms.
- Author
-
López-Pernas, Sonsoles, Saqr, Mohammed, Gordillo, Aldo, and Barra, Enrique
- Subjects
- *DATA encryption, *LEARNING, *KNOWLEDGE acquisition (Expert systems), *DATA analysis, *PSYCHOLOGY of students
- Abstract
Learning analytics methods have proven useful in providing insights from the increasingly available digital data about students in a variety of learning environments, including serious games. However, such methods have not been applied to the specific context of educational escape rooms and therefore little is known about students' behavior while playing. The present work aims to fill the gap in the existing literature by showcasing the power of learning analytics methods to reveal and represent students' behavior when participating in a computer-supported educational escape room. Specifically, we make use of sequence mining methods to analyze the temporal and sequential aspects of the activities carried out by students during these novel educational games. We further use clustering to identify different player profiles according to the sequential unfolding of students' actions and analyze how these profiles relate to knowledge acquisition. Our results show that students' behavior differed significantly in their use of hints in the escape room and resulted in differences in their knowledge acquisition levels. [ABSTRACT FROM AUTHOR]
- Published
- 2023
9. Exploration of Latent Structure in Test Revision and Review Log Data.
- Author
-
Zhang, Susu, Li, Anqi, and Wang, Shiyu
- Subjects
- *DATA logging, *STATISTICAL learning, *COMPUTER adaptive testing, *ACQUISITION of data
- Abstract
In computer‐based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable‐length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test‐taking behavior, which can inform test development and instructions. In the current study, we used recently proposed statistical learning methods for sequence data to provide an exploratory analysis of item‐level revision and review log data. Based on the revision log data collected from computer‐based classroom assessments, common prototypes of revisit and review behavior were identified. The relationship between revision behavior and various item, test, and individual covariates was further explored under a Bayesian multivariate generalized linear mixed model. [ABSTRACT FROM AUTHOR]
- Published
- 2023
10. Correlation Analysis of Stock Index Data Features Using Sequential Rule Mining Algorithms
- Author
-
Mazumdar, Nayanjyoti, Sarma, Pankaj Kumar Deva, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Das, Nibaran, editor, Binong, Juwesh, editor, Krejcar, Ondrej, editor, and Bhattacharjee, Debotosh, editor
- Published
- 2023
11. Natural Exponent Inertia Weight-based Particle Swarm Optimization for Mining Serial Episode Rules from Event Sequences.
- Author
-
Poongodi, K. and Kumar, Dhananjay
- Subjects
- *PARTICLE swarm optimization, *DATA structures, *EXPONENTS
- Abstract
Episode rule mining extracts useful and important patterns, or episodes, from large event sequences; an episode rule represents the temporal implication associating the antecedent and consequent episodes. The existing technique for mining precise-positioning episode rules from event sequences mines serial episodes, resulting in enormous memory consumption. To resolve this issue, the proposed work ensures the generation of fixed-gap episodes and parameter settings through the use of a Particle Swarm Optimization mechanism. Fixed-gap episodes are generated using the Natural Exponent Inertia Weight-based Particle Swarm Optimization algorithm. In this paper, a new technique called Mining Serial Episode Rules (MSER) is proposed, which utilizes the correlation between episodes and the generation of parameter selection where the occurrence time of an event is specified in the consequent. Further, a trie-based data structure to mine MSER along with a pruning technique is incorporated in the proposed methodology to improve the performance. The efficiency of the proposed algorithm MSER is evaluated on three benchmark data sets (Retail, Kosarak, and MSNBC), where it outperforms the existing methods with respect to memory usage and execution time. [ABSTRACT FROM AUTHOR]
- Published
- 2023
12. The application of advertising logo color design for big data and visual communication technology.
- Author
-
Tian, Huan
- Subjects
- *TELECOMMUNICATION, *BIG data, *DATA transmission systems, *VISUAL communication, *LOGO design, *COLOR in design
- Abstract
Color is one of the three major elements of print advertising, and different color combinations can trigger different emotional experiences of human beings. At present, the application of color in advertising in China is relatively mature, but it is limited to the traditional application method and has not been combined with big data technology. From the perspective of business needs, this research analyzes the process of visual creativity from the perspective of business value-added, and analyzes the role of big data in it. Then it introduces the semantics of common colors and how to incorporate color semantics into advertising design. And a sequence mining-based advertising click-through rate prediction model is proposed. The Criteo dataset is used as the training set. The AUC value of the model is 0.702 and the loss value is 0.415. Compared with other models, AUC values increased by 10.16%, 4.70%, 2.69% and 2.30%, respectively. Losses decreased by 10.17%, 9.19%, 6.11% and 7.57%, respectively. Finally, the online shopping data of 20 consumers was used as the test set to predict their color preferences, and the prediction accuracy was about 70%. Among them, the prediction accuracy of the group with stable shopping habits was 72.76%, and that of the group who liked to try new things was 70.60%, both meeting the expectation. Through experiments, it is concluded that the model has good performance and stability, and can more accurately judge consumers' consumption preferences. [ABSTRACT FROM AUTHOR]
- Published
- 2023
13. Mining Frequent Serial Positioning Episode Rules with Forward and Backward Search Technique from Event Sequences.
- Author
-
K, Poongodi and Kumar, Dhananjay
- Subjects
- *SEARCH algorithms, *RANK correlation (Statistics), *DATABASES, *STATISTICAL correlation
- Abstract
A large event sequence can generate episode rules, which are patterns that help to identify the possible dependencies existing among event types. Frequent episodes occurring in a simple sequence of events are commonly used for mining the episodes from a sequential database. Mining serial positioning episode rules (MSPER) using a fixed-gap episode occurrence suffers from unsatisfactory scalability on complex sequences when testing whether an episode occurs in a sequence. A large number of redundant nodes is generated in the MSPER-trie-based data structure. In this paper, a forward and backward search algorithm (FBSA) is proposed to detect minimal occurrences of frequent peak episodes. An extensive correlation of parameter settings and the generating procedure of fixed-gap episodes are carried out. To generate a fixed-gap episode and estimate the variance that decides the parameter selection in event sequences, Spearman's correlation coefficient is used for verifying the sequence of occurrences of the episodes. MFSPER with FBSA is developed to eliminate the frequent sequence scans and redundant event sets. The MFSPER–FBSA stores the minimal occurrences of frequent peak episodes from the event sequences. The experimental evaluation on benchmark datasets shows that the proposed technique outperforms the existing methods with respect to memory, execution time, recall and precision. [ABSTRACT FROM AUTHOR]
- Published
- 2023
14. Robust IoT Malware Detection and Classification Using Opcode Category Features on Machine Learning
- Author
-
Hyunjong Lee, Sooin Kim, Dongheon Baek, Donghoon Kim, and Doosung Hwang
- Subjects
IoT malware, machine learning, opcode category, sequence mining, visualization, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Technology advancements have led to the use of millions of IoT devices. However, IoT devices are being exploited as an entry point due to security flaws caused by resource constraints. IoT malware is being discovered in a variety of types. The purpose of this study is to investigate whether IoT malware can be distinguished from benign samples and whether various malware family types can be classified. We propose fixed-length and low-dimensional features using opcode category information on ML models. The binary IoT dataset for this study is converted into opcodes to create features. The opcodes are categorized into 6 or 11 categories according to their functionality. Features are created using a sequence of opcode categories and the entropy values of opcode categories. These features can be visualized by using a 2D image in order to observe patterns. We evaluate our proposed features on various ML models (5-NN, SVM, Decision Tree, and Random Forest) and MLP with various performance metrics, such as Accuracy, Precision, Recall, F1-score, MCC, AUC-ROC, and AUC-PR. The performance results for malware detection and classification have an accuracy over 98.0%. The experiments have demonstrated that the features we’ve proposed are effective and robust for identifying different types of IoT malware and benign samples.
- Published
- 2023
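To make the idea of opcode-category entropy features concrete, here is a small sketch. The category table `CATEGORY_OF` is invented for illustration (the abstract does not list the paper's 6- or 11-category scheme), and the function names are hypothetical; only the Shannon entropy formula itself is standard.

```python
import math
from collections import Counter

# Hypothetical opcode-to-category table, for illustration only. The paper's
# actual 6- or 11-category scheme is not given in the abstract.
CATEGORY_OF = {
    "mov": "data", "push": "data", "pop": "data",
    "add": "arith", "sub": "arith",
    "jmp": "control", "call": "control", "ret": "control",
}


def to_categories(opcodes):
    """Map each opcode to its category; unknown opcodes fall into 'other'."""
    return [CATEGORY_OF.get(op, "other") for op in opcodes]


def category_entropy(categories):
    """Shannon entropy (in bits) of the category distribution."""
    total = len(categories)
    if total == 0:
        return 0.0
    counts = Counter(categories)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A single number like this is independent of program length, which is one way a variable-length opcode stream could be turned into a fixed-length, low-dimensional feature.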
15. Analyzing User Behavior in a Self-regulated Learning Environment
- Author
-
Frank, Sarah, Nussbaumer, Alexander, Gütl, Christian, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Guralnick, David, editor, Auer, Michael E., editor, and Poce, Antonella, editor
- Published
- 2022
16. Transferring effective learning strategies across learning contexts matters: A study in problem-based learning.
- Author
-
Saqr, Mohammed, Matcha, Wannisa, Uzir, Nora'ayu Ahmad, Jovanović, Jelena, Gašević, Dragan, and López-Pernas, Sonsoles
- Abstract
Learning strategies are important catalysts of students' learning. Research has shown that students with effective learning strategies are more likely to have better academic achievement. This study aimed to investigate students' adoption of learning strategies in different course implementations, the transfer of learning strategies between courses and relationship to performance. We took advantage of recent advances in learning analytics methods, namely sequence and process mining as well as statistical methods and visualisations to study how students regulate their online learning through learning strategies. The study included 81,739 log traces of students' learning related activities from two different problem-based learning medical courses. The results revealed that students who applied deep learning strategies were more likely to score high grades, and students who applied surface learning strategies were more likely to score lower grades in either course. More importantly, students who were able to transfer deep learning strategies or continue to use effective strategies between courses obtained higher scores, and were less likely to adopt surface strategies in the subsequent course. These results highlight the need for supporting the development of effective learning strategies in problem-based learning curricula so that students adopt and transfer effective strategies as they advance through the programme.
Implications for practice or policy:
• Teachers need to help students develop and transfer deep learning as they are directly related to success.
• Students who continue to use light strategies are more at risk of low achievement and need to be supported.
• Technology-supported problem-based learning requires more active scaffolding and teachers' support beyond "guide on the side" as in face-to-face.
[ABSTRACT FROM AUTHOR]
- Published
- 2023
17. The temporal dynamics of online problem-based learning: Why and when sequence matters.
- Author
-
Saqr, Mohammed and López-Pernas, Sonsoles
- Subjects
PROBLEM-based learning, ONLINE education, GROUP dynamics, VIRTUAL communities, SOCIAL groups, SOCIAL interaction
- Abstract
Early research on online PBL explored student satisfaction, effectiveness, and design. The temporal aspect of online PBL has rarely been addressed. Thus, a gap exists in our knowledge regarding how online PBL unfolds: when and for how long a group engages in collaborative discussions. Similarly, little is known about whether and what sequence of interactions could predict higher achievement. This study aims to bridge such a gap by implementing the latest advances in temporal learning analytics to analyze the sequential and temporal aspects of online PBL across a large sample (n = 204 students) of qualitatively coded interactions (8,009 interactions). We analyzed interactions at the group level to understand the group dynamics across whole problem discussions, and at the student level to understand the students' contribution dynamics across different episodes. We followed such analyses by examining the association of interaction types and the sequences thereof with students' performance using multilevel linear regression models. The analysis of the interactions reflected that the scripted PBL process followed a logical sequence, yet often lacked enough depth. When cognitive interactions (e.g., arguments, questions, and evaluations) occurred, they kindled high cognitive interactions; when low cognitive and social interactions dominated, they kindled low cognitive interactions. The order and sequence of interactions were more predictive of performance, with a higher explanatory power, than frequencies. Starting or initiating interactions (even with low cognitive content) showed the highest association with performance, pointing to the importance of initiative and sequencing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
18. An Event-Level Clustering Framework for Process Mining Using Common Sequential Rules
- Author
-
Tariq, Zeeshan, Charles, Darryl, McClean, Sally, McChesney, Ian, Taylor, Paul, Akan, Ozgur, Editorial Board Member, Bellavista, Paolo, Editorial Board Member, Cao, Jiannong, Editorial Board Member, Coulson, Geoffrey, Editorial Board Member, Dressler, Falko, Editorial Board Member, Ferrari, Domenico, Editorial Board Member, Gerla, Mario, Editorial Board Member, Kobayashi, Hisashi, Editorial Board Member, Palazzo, Sergio, Editorial Board Member, Sahni, Sartaj, Editorial Board Member, Shen, Xuemin (Sherman), Editorial Board Member, Stan, Mircea, Editorial Board Member, Jia, Xiaohua, Editorial Board Member, Zomaya, Albert Y., Editorial Board Member, Miraz, Mahdi H., editor, Southall, Garfield, editor, Ali, Maaruf, editor, Ware, Andrew, editor, and Soomro, Safeeullah, editor
- Published
- 2021
19. GrowHON: A Scalable Algorithm for Growing Higher-order Networks of Sequences
- Author
-
Krieg, Steven J., Kogge, Peter M., Chawla, Nitesh V., Kacprzyk, Janusz, Series Editor, Benito, Rosa M., editor, Cherifi, Chantal, editor, Cherifi, Hocine, editor, Moro, Esteban, editor, Rocha, Luis Mateus, editor, and Sales-Pardo, Marta, editor
- Published
- 2021
20. DiSCS: A New Sequence Segmentation Method for Open-Ended Learning Environments
- Author
-
Bywater, James P., Floryan, Mark, Chiu, Jennifer L., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Roll, Ido, editor, McNamara, Danielle, editor, Sosnovsky, Sergey, editor, Luckin, Rose, editor, and Dimitrova, Vania, editor
- Published
- 2021
21. Can We Replace Reads by Numeric Signatures? Lyndon Fingerprints as Representations of Sequencing Reads for Machine Learning
- Author
-
Bonizzoni, Paola, De Felice, Clelia, Petescia, Alessia, Pirola, Yuri, Rizzi, Raffaella, Stoye, Jens, Zaccagnino, Rocco, Zizza, Rosalba, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Martín-Vide, Carlos, editor, Vega-Rodríguez, Miguel A., editor, and Wheeler, Travis, editor
- Published
- 2021
22. Comparison of Machine Learning Methods for Life Trajectory Analysis in Demography
- Author
-
Muratova, Anna, Mitrofanova, Ekaterina, Islam, Robiul, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nguyen, Ngoc Thanh, editor, Chittayasothorn, Suphamit, editor, Niyato, Dusit, editor, and Trawiński, Bogdan, editor
- Published
- 2021
23. Analyzing Events and Alarms in Control Systems
- Author
-
Dagnino, Aldo and Dagnino, Aldo
- Published
- 2021
24. A Sequence Mining-Based Novel Architecture for Detecting Fraudulent Transactions in Healthcare Systems
- Author
-
Irum Matloob, Shoab Ahmed Khan, Rukaiya Rukaiya, Muazzam A. Khan Khattak, and Arslan Munir
- Subjects
Fraudsters, health insurance, healthcare, medical benefits, premium, sequence mining, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
With the exponential rise in government and private health-supported schemes, the number of fraudulent billing cases is also increasing. Detection of fraudulent transactions in healthcare systems is an exigent task due to intricate relationships among dynamic elements, including doctors, patients, and services. Hence, to introduce transparency in health support programs, there is a need to develop intelligent fraud detection models for tracing the loopholes in existing procedures, so that the fraudulent medical billing cases can be accurately identified. Moreover, there is also a need to optimize both the cost burden for the service provider and medical benefits for the client. This paper presents a novel process-based fraud detection methodology to detect insurance claim-related frauds in the healthcare system using sequence mining concepts. Recent literature focuses on the amount-based analysis or medication versus disease sequential analysis rather than detecting frauds using sequence generation of services within each specialty. The proposed methodology generates frequent sequences with different pattern lengths. The confidence values and confidence level are computed for each sequence. The sequence rule engine generates frequent sequences along with confidence values for each hospital’s specialty and compares them with the actual patient values. This identifies anomalies as both sequences would not be compliant with the rule engine’s sequences. The process-based fraud detection methodology is validated using the last five years of a local hospital’s transactional data, which includes many reported cases of fraudulent activities.
- Published
- 2022
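The abstract speaks of confidence values attached to frequent service sequences. One common way to define such a value, assumed here rather than taken from the paper, is the share of historical sequences containing an antecedent that also continue with the consequent. The names `contains` and `confidence` are hypothetical.

```python
def contains(sequence, pattern):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def confidence(history, antecedent, consequent):
    """Estimate P(consequent follows | antecedent) over `history`.

    `history` is a list of service sequences; `antecedent` and `consequent`
    are tuples. Returns 0.0 when the antecedent never occurs.
    """
    with_antecedent = [s for s in history if contains(s, antecedent)]
    if not with_antecedent:
        return 0.0
    full = tuple(antecedent) + tuple(consequent)
    hits = sum(contains(s, full) for s in with_antecedent)
    return hits / len(with_antecedent)
```

In a sketch like this, a claim whose service sequence matches only low-confidence patterns for its specialty would be a candidate for manual review; the threshold for "low" is a policy choice the abstract does not specify.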
25. Sequence Mining and Property Verification for Fault-Localization in Simulink Models
- Author
-
Aloui Dkhil, Safa, Bennani, Mohamed Taha, Tekaya, Manel, Ben Attia Sethom, Houda, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Zamojski, Wojciech, editor, Mazurkiewicz, Jacek, editor, Sugier, Jarosław, editor, and Walkowiak, Tomasz, editor
- Published
- 2020
26. Using Sequence Constraints for Modelling Network Interactions
- Author
-
De Smedt, Johannes, Mori, Junichiro, Ochi, Masanao, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Ohsawa, Yukio, editor, Yada, Katsutoshi, editor, Ito, Takayuki, editor, Takama, Yasufumi, editor, Sato-Shimokawara, Eri, editor, Abe, Akinori, editor, Mori, Junichiro, editor, and Matsumura, Naohiro, editor
- Published
- 2020
27. nTreeClus: A tree-based sequence encoder for clustering categorical series.
- Author
-
Jahanshahi, Hadi and Baydogan, Mustafa Gokce
- Subjects
- *TIME series analysis, *AUTOREGRESSIVE models, *AMINO acid sequence, *DECISION trees, *CHANNEL coding, *NOMOGRAPHY (Mathematics)
- Abstract
• A novel model-based clustering approach for sequential data, nTreeClus, is proposed.
• nTreeClus introduces a Decision Tree Path encoder in an autoregressive manner.
• The method's robustness to its only parameter (window size) has been examined.
• nTreeClus shows competitive results compared to existing methods in sequence mining.
The overwhelming presence of categorical/sequential data in diverse domains emphasizes the importance of sequence mining. The challenging nature of sequences proves the need for continuing research to find a more accurate and faster approach providing a better understanding of their (dis)similarities. This paper proposes a new model-based approach for clustering sequence data, namely nTreeClus. The proposed method deploys tree-based learners, k-mers, and autoregressive models for categorical time series, culminating in a novel numerical representation of the categorical sequences. Adopting this new representation, we cluster sequences, considering the inherent patterns in categorical time series. Accordingly, the model showed robustness to its parameter. Under different simulated scenarios, nTreeClus improved on the baseline methods for various internal and external cluster validation metrics by up to 10.7% and 2.7%, respectively. The empirical evaluation using synthetic and real datasets, protein sequences, and categorical time series showed that nTreeClus is competitive or superior to most state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
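The nTreeClus abstract combines k-mers with an autoregressive view of categorical sequences, governed by a single window-size parameter. A minimal sketch of that preprocessing idea, assuming each window of symbols is paired with the symbol that follows it, is shown below. The function name `windowed_pairs` is hypothetical, and the tree-based encoding and clustering steps are deliberately left out.

```python
def windowed_pairs(sequence, window):
    """Slide a window over `sequence` and pair each k-mer with the next symbol.

    Returns a list of (k_mer, next_symbol) tuples, the kind of
    (features, target) rows an autoregressive classifier could be trained on.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        (tuple(sequence[i:i + window]), sequence[i + window])
        for i in range(len(sequence) - window)
    ]
```

For example, with a window of 2 the sequence "ABCAB" yields three rows, and a sequence no longer than the window yields none.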
28. A High-Level Representation of the Navigation Behavior of Website Visitors.
- Author
-
Huidobro, Alicia, Monroy, Raúl, and Cervantes, Bárbara
- Subjects
TASK analysis, NAVIGATION, BLOGS, WEB analytics
- Abstract
Knowing how visitors navigate a website can lead to different applications, for example, providing a personalized navigation experience or identifying website failures. In this paper, we present a method for representing the navigation behavior of an entire class of website visitors in a moderately small graph, aiming to ease the task of web analysis, especially in marketing areas. Current solutions are mainly oriented to a detailed page-by-page analysis. Thus, obtaining a high-level abstraction of an entire class of visitors may involve the analysis of large amounts of data and become an overwhelming task. Our approach extracts the navigation behavior that is common among a certain class of visitors to create a graph that summarizes class navigation behavior and enables a contrast of classes. The method works by representing website sessions as the sequence of visited pages. Sub-sequences of visited pages of common occurrence are identified as "rules". Then, we replace those rules with a symbol that is given a representative name and use it to obtain a shrunken representation of a session. Finally, this shrunken representation is used to create a graph of the navigation behavior of a visitor class (group of visitors relevant to the desired analysis). Our results show that a few rules are enough to capture a visitor class. Since each class is associated with a conversion, a marketing expert can easily find out what makes classes different. [ABSTRACT FROM AUTHOR]
- Published
- 2022
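The rule-substitution step described in the abstract (replace common sub-sequences of visited pages with a named symbol) can be sketched as follows. The greedy, longest-rule-first matching and the name `compress_session` are assumptions made for illustration; the abstract does not say how overlapping rules are resolved.

```python
def compress_session(session, rules):
    """Replace known page sub-sequences in `session` with their rule names.

    `rules` maps a tuple of pages to a symbol name. Matching is greedy,
    left to right, trying longer rules first (an assumption of this sketch).
    """
    ordered = sorted(
        ((pattern, name) for pattern, name in rules.items() if pattern),
        key=lambda item: len(item[0]),
        reverse=True,
    )
    compressed, i = [], 0
    while i < len(session):
        for pattern, name in ordered:
            if tuple(session[i:i + len(pattern)]) == pattern:
                compressed.append(name)
                i += len(pattern)
                break
        else:
            # No rule starts here, so keep the page as it is.
            compressed.append(session[i])
            i += 1
    return compressed
```

The compressed sessions of one visitor class could then be turned into a graph by counting transitions between consecutive symbols.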
29. Students' active cognitive engagement with instructional videos predicts STEM learning.
- Author
-
Kuhlmann, Shelbi L., Plumley, Robert, Evans, Zoe, Bernacki, Matthew L., Greene, Jeffrey A., Hogan, Kelly A., Berro, Michael, Gates, Kathleen, and Panter, Abigail
- Abstract
The efficacy of well-designed instructional videos for STEM learning is largely reliant on how actively students cognitively engage with them. Students' ability to actively engage with videos likely depends upon individual characteristics like their prior knowledge. In this study, we investigated how digital trace data could be used as indicators of students' cognitive engagement with instructional videos, how such engagement predicted learning, and how prior knowledge moderated that relationship. One hundred twenty-eight biology undergraduate students learned with a series of instructional videos and took a biology unit exam one week later. We conducted sequence mining on the digital events of students' video-watching behaviors to capture the most commonly occurring sequences. Twenty-six sequences emerged and were aggregated into four groups indicative of cognitive engagement: repeated scrubbing, speed watching, extended scrubbing, and rewinding. Results indicated more active engagement via speed watching and rewinding behaviors positively predicted unit exam scores, but only for students with lower prior knowledge. These findings suggest that the ways students cognitively engage with videos predict how they will learn from them, that these relations are dependent upon their prior knowledge, and that researchers can measure students' cognitive engagement with instructional videos via mining digital log data. This research emphasizes the importance of active cognitive engagement with video interface tools and the need for students to accurately calibrate their learning behaviors in relation to their prior knowledge when learning from videos.
• Log data was mined for behavioral sequences reflective of cognitive engagement.
• Engagement categories: speed watching, rewinding, frequent and extended scrubbing.
• Benefit of speed watching on learning decreases for higher prior knowledge students.
• Benefit of rewinding on learning increases for lower prior knowledge students.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Extracting Workflows from Natural Language Documents: A First Step
- Author
-
Shing, Leslie, Wollaber, Allan, Chikkagoudar, Satish, Yuen, Joseph, Alvino, Paul, Chambers, Alexander, Allard, Tony, van der Aalst, Wil, Series Editor, Mylopoulos, John, Series Editor, Rosemann, Michael, Series Editor, Shaw, Michael J., Series Editor, Szyperski, Clemens, Series Editor, Daniel, Florian, editor, Sheng, Quan Z., editor, and Motahari, Hamid, editor
- Published
- 2019
- Full Text
- View/download PDF
31. Learning Interpretable Prefix-Based Patterns from Demographic Sequences
- Author
-
Gizdatullin, Danil, Baixeries, Jaume, Ignatov, Dmitry I., Mitrofanova, Ekaterina, Muratova, Anna, Espy, Thomas H., Barbosa, Simone Diniz Junqueira, Editorial Board Member, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Kotenko, Igor, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Strijov, Vadim V., editor, Ignatov, Dmitry I., editor, and Vorontsov, Konstantin V., editor
- Published
- 2019
- Full Text
- View/download PDF
32. Rich Representations for Analyzing Learning Trajectories: Systematic Review on Sequential Data Analytics in Game-Based Learning Research
- Author
-
Moon, Jewoong, Liu, Zhichun, Kinshuk, Series Editor, Huang, Ronghuai, Series Editor, Dede, Chris, Series Editor, Tlili, Ahmed, editor, and Chang, Maiga, editor
- Published
- 2019
- Full Text
- View/download PDF
33. KnowBots: Discovering Relevant Patterns in Chatbot Dialogues
- Author
-
Rivolli, Adriano, Amaral, Catarina, Guardão, Luís, de Sá, Cláudio Rebelo, Soares, Carlos, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Kralj Novak, Petra, editor, Šmuc, Tomislav, editor, and Džeroski, Sašo, editor
- Published
- 2019
- Full Text
- View/download PDF
34. Sequence and Network Mining of Touristic Routes Based on Flickr Geotagged Photos
- Author
-
Silva, Ana, Campos, Pedro, Ferreira, Carlos, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Moura Oliveira, Paulo, editor, Novais, Paulo, editor, and Reis, Luís Paulo, editor
- Published
- 2019
- Full Text
- View/download PDF
35. Mining Periodic Patterns with a MDL Criterion
- Author
-
Galbrun, Esther, Cellier, Peggy, Tatti, Nikolaj, Termier, Alexandre, Crémilleux, Bruno, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Berlingerio, Michele, editor, Bonchi, Francesco, editor, Gärtner, Thomas, editor, Hurley, Neil, editor, and Ifrim, Georgiana, editor
- Published
- 2019
- Full Text
- View/download PDF
36. On the Need for Data-Based Model-Driven Engineering
- Author
-
Mazak, Alexandra, Wolny, Sabine, Wimmer, Manuel, Biffl, Stefan, editor, Eckhart, Matthias, editor, Lüder, Arndt, editor, and Weippl, Edgar, editor
- Published
- 2019
- Full Text
- View/download PDF
37. Bringing Synchrony and Clarity to Complex Multi-Channel Data: A Learning Analytics Study in Programming Education
- Author
-
Sonsoles Lopez-Pernas and Mohammed Saqr
- Subjects
Learning analytics, programming, computer science education, sequence mining, Hidden Markov Models, automated assessment, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Supporting teaching and learning programming with learning analytics is an active area of inquiry. Most data used for learning analytics research comes from learning management systems. However, such systems were not developed to support learning programming. Therefore, educators have to resort to other systems that support the programming process, which can pose a challenge when it comes to understanding students’ learning since it takes place in different contexts. Methods that support the combination of different data sources are needed. Such methods would ideally account for the time-ordered sequence of students’ learning actions. In this article, we use a novel method (multi-channel sequence mining with Hidden Markov Models, HMMs) that allows the combination of multiple data sources, accounts for the temporal nature of students’ learning actions, and maps the transitions between different learning tactics. Our study included 291 students enrolled in a higher education programming course. Students’ trace-log data were collected from the learning management system and from a programming automated assessment tool. Data were analyzed using multi-channel sequence mining and HMM. The results reveal different patterns of students’ approaches to learning programming. High achievers start earlier to work on the programming assignments, use more independent strategies and consume learning resources more frequently, while the low achievers procrastinate early in the course and rely on help forums. Our findings demonstrate the potentials of multi-channel sequence mining and how this method can be analyzed using HMM. Furthermore, the results obtained can be of use for educators to understand students’ strategies when learning programming.
- Published
- 2021
- Full Text
- View/download PDF
38. Using Sequence Mining Techniques for Understanding Incorrect Behavioral Patterns on Interactive Tasks.
- Author
-
Ulitzsch, Esther, He, Qiwei, and Pohl, Steffi
- Subjects
TASK analysis, TASKS, MINES & mineral resources, EDUCATIONAL evaluation
- Abstract
Interactive tasks designed to elicit real-life problem-solving behavior are rapidly becoming more widely used in educational assessment. Incorrect responses to such tasks can occur for a variety of different reasons such as low proficiency levels, low metacognitive strategies, or motivational issues. We demonstrate how behavioral patterns associated with incorrect responses can, in part, be understood, supporting insights into the different sources of failure on a task. To this end, we make use of sequence mining techniques that leverage the information contained in time-stamped action sequences commonly logged in assessments with interactive tasks for (a) investigating what distinguishes incorrect behavioral patterns from correct ones and (b) identifying subgroups of examinees with similar incorrect behavioral patterns. Analyzing a task from the Programme for the International Assessment of Adult Competencies 2012 assessment, we find incorrect behavioral patterns to be more heterogeneous than correct ones. We identify multiple subgroups of incorrect behavioral patterns, which point toward different levels of effort and lack of different subskills needed for solving the task. Albeit focusing on a single task, meaningful patterns of major differences in how examinees approach a given task that generalize across multiple tasks are uncovered. Implications for the construction and analysis of interactive tasks as well as the design of interventions for complex problem-solving skills are derived. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
39. Weighted frequent sequential pattern mining.
- Author
-
Islam, Md Ashraful, Rafi, Mahfuzur Rahman, Azad, Al-amin, and Ovi, Jesan Ahammed
- Subjects
SEQUENTIAL pattern mining, DATA mining, MINES & mineral resources, PATTERNS (Mathematics)
- Abstract
Trillions of bytes of data are generated every day in different forms, and extracting useful information from that massive amount of data is the study of data mining. Sequential pattern mining is a major branch of data mining that deals with mining frequent sequential patterns from sequence databases. Due to items having different importance in real-life scenarios, they cannot be treated uniformly. With today's datasets, the use of weights in sequential pattern mining is much more feasible. In most cases, as in real-life datasets, pushing weights will give a better understanding of the dataset, as it will also measure the importance of an item inside a pattern rather than treating all the items equally. Many techniques have been introduced to mine weighted sequential patterns, but typically these algorithms generate a massive number of candidate patterns and take a long time to execute. This work aims to introduce a new pruning technique and a complete framework that takes much less time and generates a small number of candidate sequences without compromising with completeness. Performance evaluation on real-life datasets shows that our proposed approach can mine weighted patterns substantially faster than other existing approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
40. Sequence Mining and Prediction-Based Healthcare Fraud Detection Methodology
- Author
-
Irum Matloob, Shoab Ahmed Khan, and Habib Ur Rahman
- Subjects
Anomaly, fraudsters, sequence mining, sequence prediction, probability, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
This article presents a novel methodology to detect insurance claim related frauds in the healthcare system using concepts of sequence mining and sequence prediction. Fraud detection in healthcare is a non-trivial task due to the heterogeneous nature of healthcare records. Fraudsters behave as normal patients and with the passage of time keep on changing their way of planting frauds; hence, there is a need to develop fraud detection models. The sequence generation is not the part of previous researches which mostly focus on amount based analysis or medication versus diseases sequential analysis. The proposed methodology is able to generate sequences of services availed or prescribed by each specialty and analyse via two cascaded checks for the detection of insurance claim related frauds. The methodology addresses these challenges and self learns from historical medical records. It is based on two modules namely “Sequence rule engine and Prediction based engine”. The sequence rule engine generates frequent sequences and probabilities of rare sequences for each specialty of the hospital. The comparison of such sequences with the actual patient sequences leads to the identification of anomalies as both sequences are not compliant to the sequences of the rule engine. The system performs further in detail analysis on all non-compliant sequences in the prediction based engine. The proposed methodology is validated by generating patient sequences from last five years transactional data of a local hospital and identifies patterns of service procedures administered to patients using Prefixspan algorithm and Compact prediction tree. Various experiments have been performed to validate the applicability of the developed methodology and the results demonstrate that the methodology is pertinent to detect healthcare frauds and provides on average 85% of accuracy. Thus can help in preventing fraudulent claims and provides better insight into how to improve patient management and treatment procedures.
- Published
- 2020
- Full Text
- View/download PDF
41. Automated gadget discovery in the quantum domain
- Author
-
Lea M Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, and Hans J Briegel
- Subjects
reinforcement learning, machine learning, sequence mining, quantum optics, quantum information, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
- Abstract
In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent’s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: First, we use an RL agent to generate data, then, we employ a mining algorithm to extract gadgets and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent and environment agnostic and can yield interesting insights into any agent’s policy.
- Published
- 2023
- Full Text
- View/download PDF
42. Comparative analysis of real-world data of frequent treatment sequences in metastatic prostate cancer.
- Author
-
Jaipuria J, Kaur I, Doja MN, Ahmad T, Singh A, Rawal SK, Talwar V, and Sharma G
- Abstract
Background: The incidence of prostate cancer is increasing worldwide. A significant proportion of patients develop metastatic disease and are initially prescribed androgen deprivation therapy (ADT). However, subsequent sequences of treatments in real-world settings that may improve overall survival remain an area of active investigation. Materials and Methods: Data were collected from 384 patients presenting with de novo metastatic prostate cancer from 2011 to 2015 at a tertiary cancer center. Patients were categorized into surviving (n = 232) and deceased (n = 152) groups at the end of 3 years. Modified sequence pattern mining techniques (Generalized Sequential Pattern Mining and Sequential Pattern Discovery using Equivalence Classes) were applied to determine the exact order of the most frequent sets of treatments in each group. Results: Degarelix, as the initial form of ADT, was uniquely in the surviving group. The sequence of ADT followed by abiraterone and docetaxel was uniquely associated with a higher 3-year overall survival. Orchiectomy followed by fosfestrol was found to have a unique niche among surviving patients with a long duration of response to the initial ADT. Patients who received chemotherapy followed by radiotherapy and those who received radiotherapy followed by chemotherapy were found more frequently in the deceased group. Conclusions: We identified unique treatment sequences among surviving and deceased patients at the end of 3 years. Degarelix should be the preferred form of ADT. Patients who received ADT followed by abiraterone and chemotherapy showed better results. Patients requiring palliative radiation and chemotherapy in any sequence were significantly more frequent in the deceased group, identifying the need to offer such patients the most efficacious agents and to target them in clinical trial design. Competing Interests: No conflict of interest has been declared by the author. (Copyright © 2023 The Authors. Published by Wolters Kluwer Health, Inc.)
- Published
- 2024
- Full Text
- View/download PDF
43. The relational, co-temporal, contemporaneous, and longitudinal dynamics of self-regulation for academic writing.
- Author
-
Saqr, Mohammed, Peeters, Ward, and Viberg, Olga
- Subjects
ACADEMIC discourse, SELF regulation, ENGLISH as a foreign language, HYPERSONIC aerodynamics, WRITING processes
- Abstract
Writing in an academic context often requires students in higher education to acquire a new set of skills while familiarising themselves with the goals, objectives and requirements of the new learning environment. Students' ability to continuously self-regulate their writing process, therefore, is seen as a determining factor in their learning success. In order to study students' self-regulated learning (SRL) behaviour, research has increasingly been tapping into learning analytics (LA) methods in recent years, making use of multimodal trace data that can be obtained from students writing and working online. Nevertheless, little is still known about the ways students apply and govern SRL processes for academic writing online, and about how their SRL behaviour might change over time. To provide new perspectives on the use of LA approaches to examine SRL, this study applied a range of methods to investigate what they could tell us about the evolution of SRL tactics and strategies on a relational, co-temporal, contemporaneous and longitudinal level. The data originates from a case study in which a private Facebook group served as an online collaboration space in a first-year academic writing course for foreign language majors of English. The findings show that learners use a range of SRL tactics to manage their writing tasks and that different tactics can take up key positions in this process over time. Several shifts could be observed in students' behaviour, from mainly addressing content-specific topics to more form-specific and social ones. Our results have also demonstrated that different methods can be used to study the relational, co-temporal, contemporaneous, and longitudinal dynamics of self-regulation in this regard, demonstrating the wealth of insights LA methods can bring to the table. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
44. Abnormal Event Correlation and Detection Based on Network Big Data Analysis.
- Author
-
Zhichao Hu, Xiangzhan Yu, Jiantao Shi, and Lin Ye
- Subjects
DATA analysis, DATA mining, SECURITY systems, BIG data, CRYPTOCURRENCY mining, ALARMS
- Abstract
With the continuous development of network technology, various large-scale cyber-attacks continue to emerge. These attacks pose a severe threat to the security of systems, networks, and data. Therefore, how to mine attack patterns from massive data and detect attacks are urgent problems. In this paper, an approach for attack mining and detection is proposed that performs tasks of alarm correlation, false-positive elimination, attack mining, and attack prediction. Based on the idea of CluStream, the proposed approach implements a flow clustering method and a two-step algorithm that guarantees efficient streaming and clustering. The context of an alarm in the attack chain is analyzed and the LightGBM method is used to perform false-positive recognition with high accuracy. To accelerate the search for the filtered alarm sequence data to mine attack patterns, the PrefixSpan algorithm is also updated in the store strategy. The updated PrefixSpan increases the processing efficiency and achieves a better result than the original one in experiments. With Bayesian theory, the transition probability for the sequence pattern string is calculated and the alarm transition probability table constructed to draw the attack graph. Finally, a long short-term memory network and embedding word-vector method are used to perform online prediction. Results of numerical experiments show that the method proposed in this paper has a strong practical value for attack detection and prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
45. Binary Classification of Sequences Possessing Unilateral Common Factor with AMS and APR
- Author
-
Tang, Yujin, Yonekawa, Kei, Kurokawa, Mori, Wada, Shinya, Yoshihara, Kiyohito, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Phung, Dinh, editor, Tseng, Vincent S., editor, Webb, Geoffrey I., editor, Ho, Bao, editor, Ganji, Mohadeseh, editor, and Rashidi, Lida, editor
- Published
- 2018
- Full Text
- View/download PDF
46. Curriculum Pacing: A New Approach to Discover Instructional Practices in Classrooms
- Author
-
Patel, Nirmal, Sharma, Aditya, Sellman, Collin, Lomas, Derek, Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Nkambou, Roger, editor, Azevedo, Roger, editor, and Vassileva, Julita, editor
- Published
- 2018
- Full Text
- View/download PDF
47. Free-Rider Episode Screening via Dual Partition Model
- Author
-
Ao, Xiang, Liu, Yang, Huang, Zhen, Zuo, Luo, He, Qing, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Pandu Rangan, C., Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pei, Jian, editor, Manolopoulos, Yannis, editor, Sadiq, Shazia, editor, and Li, Jianxin, editor
- Published
- 2018
- Full Text
- View/download PDF
48. Applying Sequence Mining for Outlier Detection in Process Mining
- Author
-
Fani Sani, Mohammadreza, van Zelst, Sebastiaan J., van der Aalst, Wil M. P., Hutchison, David, Series Editor, Kanade, Takeo, Series Editor, Kittler, Josef, Series Editor, Kleinberg, Jon M., Series Editor, Mattern, Friedemann, Series Editor, Mitchell, John C., Series Editor, Naor, Moni, Series Editor, Pandu Rangan, C., Series Editor, Steffen, Bernhard, Series Editor, Terzopoulos, Demetri, Series Editor, Tygar, Doug, Series Editor, Weikum, Gerhard, Series Editor, Panetto, Hervé, editor, Debruyne, Christophe, editor, Proper, Henderik A., editor, Ardagna, Claudio Agostino, editor, Roman, Dumitru, editor, and Meersman, Robert, editor
- Published
- 2018
- Full Text
- View/download PDF
49. A High-Level Representation of the Navigation Behavior of Website Visitors
- Author
-
Alicia Huidobro, Raúl Monroy, and Bárbara Cervantes
- Subjects
web analytics, web log mining, clickstream analysis, sequence mining, sequitur, graph techniques, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Knowing how visitors navigate a website can lead to different applications. For example, providing a personalized navigation experience or identifying website failures. In this paper, we present a method for representing the navigation behavior of an entire class of website visitors in a moderately small graph, aiming to ease the task of web analysis, especially in marketing areas. Current solutions are mainly oriented to a detailed page-by-page analysis. Thus, obtaining a high-level abstraction of an entire class of visitors may involve the analysis of large amounts of data and become an overwhelming task. Our approach extracts the navigation behavior that is common among a certain class of visitors to create a graph that summarizes class navigation behavior and enables a contrast of classes. The method works by representing website sessions as the sequence of visited pages. Sub-sequences of visited pages of common occurrence are identified as “rules”. Then, we replace those rules with a symbol that is given a representative name and use it to obtain a shrinked representation of a session. Finally, this shrinked representation is used to create a graph of the navigation behavior of a visitor class (group of visitors relevant to the desired analysis). Our results show that a few rules are enough to capture a visitor class. Since each class is associated with a conversion, a marketing expert can easily find out what makes classes different.
- Published
- 2022
- Full Text
- View/download PDF
50. Adapting the User Path Through Trajectory Data Mining
- Author
-
Ramos, João, César, Analide, Neves, José, Novais, Paulo, Kacprzyk, Janusz, Series editor, Pal, Nikhil R., Advisory editor, Bello Perez, Rafael, Advisory editor, Corchado, Emilio S., Advisory editor, Hagras, Hani, Advisory editor, Kóczy, László T., Advisory editor, Kreinovich, Vladik, Advisory editor, Lin, Chin-Teng, Advisory editor, Lu, Jie, Advisory editor, Melin, Patricia, Advisory editor, Nedjah, Nadia, Advisory editor, Nguyen, Ngoc Thanh, Advisory editor, Wang, Jun, Advisory editor, De Paz, Juan F., editor, Julián, Vicente, editor, Villarrubia, Gabriel, editor, Marreiros, Goreti, editor, and Novais, Paulo, editor
- Published
- 2017
- Full Text
- View/download PDF