Author: "Xindong Wu" / Publication Year Range: Last 50 years - Searchworks@Jio Institute Digital Library Search Results

1. Detection of ChatGPT fake science with the xFakeSci learning algorithm

Author: Ahmed Abdeen Hamed and Xindong Wu
Subjects: ChatGPT, Generative AI, Fake publication, Human-generated publications, ML Algorithm, Fake science, Medicine, Science
Abstract: Generative AI tools exemplified by ChatGPT are becoming a new reality. This study is motivated by the premise that “AI generated content may exhibit a distinctive behavior that can be separated from scientific articles”. In this study, we show how articles can be generated using means of prompt engineering for various diseases and conditions. We then show how we tested this premise in two phases and prove its validity. Subsequently, we introduce xFakeSci, a novel learning algorithm, that is capable of distinguishing ChatGPT-generated articles from publications produced by scientists. The algorithm is trained using network models driven from both sources. To mitigate overfitting issues, we incorporated a calibration step that is built upon data-driven heuristics, including proximity and ratios. Specifically, from a total of 3952 fake articles for three different medical conditions, the algorithm was trained using only 100 articles, but calibrated using folds of 100 articles. As for the classification step, it was performed using 300 articles per condition. The actual label steps took place against an equal mix of 50 generated articles and 50 authentic PubMed abstracts. The testing also spanned publication periods from 2010 to 2024 and encompassed research on three distinct diseases: cancer, depression, and Alzheimer’s. Further, we evaluated the accuracy of the xFakeSci algorithm against some of the classical data mining algorithms (e.g., Support Vector Machines, Regression, and Naive Bayes). The xFakeSci algorithm achieved F1 scores ranging from 80 to 94%, outperforming common data mining algorithms, which scored F1 values between 38 and 52%. We attribute the noticeable difference to the introduction of calibration and a proximity distance heuristic, which underscores this promising performance. Indeed, the prediction of fake science generated by ChatGPT presents a considerable challenge. Nonetheless, the introduction of the xFakeSci algorithm is a significant step on the way to combating fake science.
Published: 2024
Full Text: View/download PDF
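
Note on the metric: the abstract above reports results as F1 scores. As a reminder of what that number measures, the following is a minimal sketch of the standard F1 formula (the harmonic mean of precision and recall). The function name `f1_score` and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a balanced test set of 50 generated and
# 50 authentic abstracts (illustrative only, not reported results).
example = f1_score(true_pos=40, false_pos=10, false_neg=10)
print(f"F1 = {example:.2f}")
```

With these made-up counts, precision and recall are both 40/50, so the F1 value is also 0.8.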

2. Data central-platform: architecture and practice

Author: Xindong WU, Zeyu YING, Shaojing SHENG, Tingting JIANG, Chenyang BU, and Zan ZHANG
Subjects: data central-platform, data asset, data governance, digital transformation, Electronic computers. Computer science, QA75.5-76.95
Abstract: A data central platform positions the data of an entity (be it a corporate entity, institutional body, or governmental department) as a pivotal strategic asset. It is a management mechanism that spans from data collection to processing and application, aiming to improve data quality, achieve extensive data sharing, and ultimately maximize the value of the data. A definition for data central-platforms was provided, and a generic architecture was presented along with the core technologies and functions of physical management, logical management, data asset management, data services and information security management. Finally, taking the construction of the Huapu system as an example, a realization of the data central platform, the Huapu Central-Platform, which is geared towards genealogical big data and integrated with the HAO intelligence model, was introduced.
Published: 2023
Full Text: View/download PDF

3. Safeguarding authenticity for mitigating the harms of generative AI: Issues, research agenda, and policies for detection, fact-checking, and ethical AI

Author: Ahmed Abdeen Hamed, Malgorzata Zachara-Szymanska, and Xindong Wu
Subjects: Biocomputational method, Bioinformatics, Biological sciences, Computational bioinformatics, Natural sciences, Neural networks, Science
Abstract: Summary: As the influence of transformer-based approaches in general and generative artificial intelligence (AI) in particular continues to expand across various domains, concerns regarding authenticity and explainability are on the rise. Here, we share our perspective on the necessity of implementing effective detection, verification, and explainability mechanisms to counteract the potential harms arising from the proliferation of AI-generated inauthentic content and science. We recognize the transformative potential of generative AI, exemplified by ChatGPT, in the scientific landscape. However, we also emphasize the urgency of addressing associated challenges, particularly in light of the risks posed by disinformation, misinformation, and unreproducible science. This perspective serves as a response to the call for concerted efforts to safeguard the authenticity of information in the age of AI. By prioritizing detection, fact-checking, and explainability policies, we aim to foster a climate of trust, uphold ethical standards, and harness the full potential of AI for the betterment of science and society.
Published: 2024
Full Text: View/download PDF

4. Integrating Symbol Similarities with Knowledge Graph Embedding for Entity Alignment: An Unsupervised Framework

Author: Tingting Jiang, Chenyang Bu, Yi Zhu, and Xindong Wu
Subjects: Electronic computers. Computer science, QA75.5-76.95
Abstract: Entity alignment refers to discovering identical entity pairs in 2 knowledge graphs, which is a significant task in knowledge fusion. Early automated entity alignment techniques are based mainly on similarity calculation and comparing symbolic features, i.e., entity names, between entities. Nevertheless, the performance of such methods degrades significantly when the difference between knowledge graphs is large, because they rely on predefined comparison rules. Recently, embedding-based methods calculate the similarity between entity pairs through vector embeddings and thus can deal with different knowledge graphs. However, embedding-based methods mostly require humans to annotate data, which is laborious. Therefore, we combine the strengths of both approaches to propose an unsupervised entity alignment framework in this work, which can generate initial alignment seeds automatically by considering symbolic similarities. It can effectively avoid the waste of human resources and is suitable for handling multiple types of knowledge graphs. In addition, we investigate improving the quality and quantity of initial alignment by integrating multiple symbolic similarity features of entities and by better handling missing entity information. Experimental results on 3 real datasets demonstrate its state-of-the-art performance.
Published: 2023
Full Text: View/download PDF

5. Hybrid Collaborative Recommendation via Dual-Autoencoder

Author: Bingbing Dong, Yi Zhu, Lei Li, and Xindong Wu
Subjects: Recommendation system, matrix factorization, semi-autoencoder, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: With the rapid increase of internet information, personalized recommendation systems are an effective way to alleviate the information overload problem, which has attracted extensive attention in recent years. The traditional collaborative filtering utilizes matrix factorization methods to learn hidden feature representations of users and/or items. With deep learning achieving good performance in representation learning, the autoencoder model is widely applied in recommendation systems for its advantages of fast convergence and no label requirement. However, the previous recommendation systems may take the reconstruction output of an autoencoder as the prediction of missing values directly, which may deteriorate their performance and cause unsatisfactory recommendation results. In addition, the parameters of an autoencoder need to be pre-trained, which greatly increases the time complexity. To address these problems, in this paper, we propose a Hybrid Collaborative Recommendation method via Dual-Autoencoder (HCRDa). More specifically, firstly, a novel dual-autoencoder is utilized to simultaneously learn the feature representations of users and items in our HCRDa, which clearly reduces time complexity. Secondly, embedding matrix factorization into the training process of the autoencoder further improves the quality of hidden features for users and items. Finally, additional attributes of users and items are utilized to alleviate the cold start problem and to make hybrid recommendations. Comprehensive experiments on several real-world data sets demonstrate the effectiveness of our proposed method in comparison with several state-of-the-art methods.
Published: 2020
Full Text: View/download PDF

6. Comprehensive Heading Error Processing Technique Using Image Denoising and Tilt-Induced Error Compensation for Polarization Compass

Author: Chong Shen, Xindong Wu, Donghua Zhao, Shan Li, Huiliang Cao, Huijun Zhao, Jun Tang, Jun Liu, and Chenguang Wang
Subjects: Polarization compass, error processing, de-noising, error modeling, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Bionic polarization navigation has a broad variety of applications in diverse fields owing to its high reliability and strong robustness to interference, fundamental to which is the use of a polarization compass based on polarized light cues. Nevertheless, the dramatic reduction of orientation accuracy resulting from noise in the measured angle of polarization (AoP) and from the tilt angles of a polarization compass during operation strongly affects navigation precision. Herein, we investigate how to improve the navigation accuracy effectively by the proposed comprehensive heading error processing technique for a polarization compass, where a novel denoising scheme is designed to eliminate the noise in AoP images directly by integrating the strength of iterative variance-stabilizing transformation (IVST) and adaptive soft interval thresholding (SIT) so as to compensate the following tilt-induced error accurately. Subsequently, a promising compensation approach inspired by efficient extreme learning machine (EELM) is introduced to correct the tilt-induced error caused by realistic execution. The AoP image denoising advance and the tilt-induced error modeling advance combine to produce remarkable performance gains on the heading error. Experimental results and comparisons with prior art reveal that the proposed comprehensive heading error processing technique is highly appealing in terms of improving the orientation accuracy for a polarization compass with superiority to state-of-the-art alternatives.
Published: 2020
Full Text: View/download PDF

7. A Framework for Subgraph Detection in Interdependent Networks via Graph Block-Structured Optimization

Author: Fei Jie, Chunpai Wang, Feng Chen, Lei Li, and Xindong Wu
Subjects: Subgraph detection, sparse optimization, interdependent networks, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: As the backbone of many real-world complex systems, networks interact with others in nontrivial ways from time to time. It is a challenging problem to detect subgraphs that have dependencies on each other across multiple networks. Instead of devising a method for a specific scenario, we propose a generic framework to discover subgraphs in multiple interdependent networks, which generalizes the classical subgraph detection problem in a single network and can be applied to more practical applications. Specifically, we propose the Graph Block-structured Gradient Hard Thresholding Pursuit (GB-GHTP) framework to optimize interdependent networks with block-structured constraints, which enjoys 1) a theoretical guarantee and 2) a nearly linear time complexity on the network size. It is demonstrated how our framework can be applied to three practical applications: 1) evolving anomalous subgraph detection in dynamic networks, 2) anomalous subgraph detection in networks of networks, and 3) connected dense subgraph detection in dual networks. We evaluate our framework on large-scale datasets with comprehensive experiments, which validate our framework's effectiveness and efficiency.
Published: 2020
Full Text: View/download PDF

8. Unsupervised Domain Adaptation via Stacked Convolutional Autoencoder

Author: Yi Zhu, Xinke Zhou, and Xindong Wu
Subjects: domain adaptation, convolutional autoencoder, sparse autoencoder, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Unsupervised domain adaptation involves knowledge transfer from a labeled source to unlabeled target domains to assist target learning tasks. A critical aspect of unsupervised domain adaptation is the learning of more transferable and distinct feature representations from different domains. Although previous investigations, using, for example, CNN-based and auto-encoder-based methods, have produced remarkable results in domain adaptation, there are still two main problems that occur with these methods. The first is a training problem for deep neural networks; some optimization methods are ineffective when applied to unsupervised deep networks for domain adaptation tasks. The second problem that arises is that redundancy of image data results in performance degradation in feature learning for domain adaptation. To address these problems, in this paper, we propose an unsupervised domain adaptation method with a stacked convolutional sparse autoencoder, which is based on performing layer projection from the original data to obtain higher-level representations for unsupervised domain adaptation. More specifically, in a convolutional neural network, lower layers generate more discriminative features whose kernels are learned via a sparse autoencoder. A reconstruction independent component analysis optimization algorithm was introduced to perform individual component analysis on the input data. Experiments undertaken demonstrated superior classification performance of up to 89.3% in terms of accuracy compared to several state-of-the-art domain adaptation methods, such as SSRLDA and TLMRA.
Published: 2022
Full Text: View/download PDF

9. Online Feature Selection for Streaming Features Using Self-Adaption Sliding-Window Sampling

Author: Dianlong You, Xindong Wu, Limin Shen, Song Deng, Zhen Chen, Chuan Ma, and Qiusheng Lian
Subjects: Feature selection, markov blanket, online learning, sliding-window, streaming feature, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: In recent years, online feature selection has been a research topic on streaming feature mining, as it can reduce the dimensionality of the streaming features by removing the irrelevant and redundant features in real time. There are many representative research efforts on the online feature selection with streaming features, e.g., alpha-investing, online streaming feature selection (OSFS), and scalable and accurate online approach (SAOLA) for feature selection. In these studies, alpha-investing has limited prediction accuracy and a large number of selected features. SAOLA sometimes offers outstanding efficiency in running time and prediction accuracy but possesses a large number of selected features. OSFS offers high prediction accuracy in many datasets, but its running time increases exponentially with an increasing number of features with low redundancy and high relevance. To address the limitations of the above-mentioned works, we propose an online learning algorithm named OSFAS, which samples streaming features in real-time by a self-adaption sliding-window and discards the irrelevant and redundant features by conditional independence. The OSFAS obtains an approximate Markov blanket with high prediction accuracy, meanwhile reducing the number of selected features. The efficiency of the proposed OSFASW algorithm was validated in a performance test on widely used datasets, e.g., NIPS2003 and causality workbench. Through the extensive experimental results, we demonstrate that OSFAS significantly improves the prediction accuracy and requires a smaller number of selected features than alpha-investing, OSFS, and SAOLA.
Published: 2019
Full Text: View/download PDF

10. Document Specific Supervised Keyphrase Extraction With Strong Semantic Relations

Author: Huiting Liu, Lili Wang, Peng Zhao, and Xindong Wu
Subjects: Keyphrase extraction, sequential pattern mining, general gap constraints, classification, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Keyphrase extraction is the task of automatically extracting descriptive phrases or concepts that represent the main topics in a document. Finding good keyphrases in a document can quickly summarize knowledge for information retrieval and decision making. Existing keyphrase extraction methods cannot be customized to each specific document, and cannot capture flexible semantic relations. In this paper, a keyphrase extraction algorithm using maximum sequential pattern mining with one-off and general gaps condition, called Ke_MSMING, is presented. Ke_MSMING first searches all keyphrase candidates from a document using sequential pattern mining and the topic model, and then adopts supervised machine learning to classify each keyphrase candidate as a keyphrase or not. Finally, Ke_MSMING selects the top-N keyphrases as the final keyphrases. Ke_MSMING not only uses baseline features and pattern features but also uses centrality features obtained from the cooccurrence semantic network, and the cooccurrence networks can yield powerful semantic relations for keyphrase extraction. Experimental results on two datasets demonstrate that Ke_MSMING has better performance than other state-of-the-art keyphrase extraction approaches.
Published: 2019
Full Text: View/download PDF

11. NETASPNO: Approximate Strict Pattern Matching Under Nonoverlapping Condition

Author: Youxi Wu, Shasha Li, Jingyu Liu, Lei Guo, and Xindong Wu
Subjects: Approximate pattern matching, wildcard, gap constraint, sequence, occurrence, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: In pattern matching, a gap constraint is a more flexible wildcard than the traditional wildcards “?” and “*”. Pattern matching with gap constraints is more difficult to handle but fulfills users' enquiries more easily. Pattern matching with gap constraints has therefore been used in numerous research works, such as music information retrieval, searching protein sites, and sequence pattern mining. Strict pattern matching under a nonoverlapping condition, as a type of pattern matching with gap constraints, is a key issue of sequence pattern mining with gap constraints since it can be used to compute the frequency of a pattern. Exact matching limits the flexibility of the match to some extent since it requires each character to be matched exactly. We therefore address approximate strict pattern matching under the nonoverlapping constraints (ASPNO) and propose an effective algorithm, named NETtree for ASPNO (NETASPNO), which first transforms the problem into a Nettree data structure, an extensive tree structure. To find the nonoverlapping occurrences effectively, we propose the concept of number of roots paths with distance constraints (NRPDC), which indicates the number of paths from a node to the roots with distance d and can be used to delete useless parent-child relationships and useless nodes. We iteratively recalculate the NRPDCs of each node on the subnettree with the rightmost root. Then we can get a path from the rightmost leaf to its rightmost root without using the backtracking strategy. NETASPNO therefore iteratively gets the rightmost root-leaf-path and prunes the path on the Nettree. Extensive experimental results demonstrate that NETASPNO has better performance than the other competitive algorithms.
Published: 2018
Full Text: View/download PDF
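
Note on gap constraints: the abstract above builds on pattern matching with gap constraints, where a bounded number of arbitrary characters may sit between consecutive pattern characters. The sketch below illustrates only that basic notion by counting every exact occurrence of a pattern under a single gap range. It is not NETASPNO: it ignores the nonoverlapping condition, does no approximate matching, and does not build a Nettree. The function name and parameters are assumptions made for this illustration.

```python
def count_gap_occurrences(pattern: str, sequence: str,
                          min_gap: int, max_gap: int) -> int:
    """Count exact occurrences of `pattern` in `sequence` where the number
    of skipped characters between consecutive matched characters lies in
    [min_gap, max_gap]. Overlapping occurrences are all counted."""
    if not pattern:
        return 0
    # ways[i]: number of ways to match the pattern prefix processed so far
    # with its last character placed at sequence index i.
    ways = [1 if ch == pattern[0] else 0 for ch in sequence]
    for k in range(1, len(pattern)):
        nxt = [0] * len(sequence)
        for i, ch in enumerate(sequence):
            if ch != pattern[k]:
                continue
            # The previous character must sit at an index j such that
            # min_gap <= i - j - 1 <= max_gap.
            lo = max(i - max_gap - 1, 0)
            hi = i - min_gap - 1
            for j in range(lo, hi + 1):
                nxt[i] += ways[j]
        ways = nxt
    return sum(ways)


# "aba" in "ababa" with 0 to 2 skipped characters between matches.
print(count_gap_occurrences("aba", "ababa", 0, 2))
```

Counting all occurrences this way is a simple dynamic program. Restricting the count to nonoverlapping occurrences, as the paper does, is a different and harder problem.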

12. BiTTM: A Core Biterms-Based Topic Model for Targeted Analysis

Author: Jiamiao Wang, Ling Chen, Lei Li, and Xindong Wu
Subjects: AI, text analysis, topic model, biterm, content analysis, targeted modeling, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: While most of the existing topic models perform a full analysis on a set of documents to discover all topics, it has been noticed recently that in many situations users are interested only in fine-grained topics related to some specific aspects. As a result, targeted analysis (or focused analysis) has been proposed to address this problem. Given a corpus of documents from a broad area, targeted analysis discovers only topics related to user-interested aspects that are expressed by a set of user-provided query keywords. Existing approaches for targeted analysis suffer from problems such as topic loss and topic suppression because of their inherent assumptions and strategies. Moreover, existing approaches are not designed to address computation efficiency, while targeted analysis is supposed to provide responses to user queries as soon as possible. In this paper, we propose a core BiTerms-based Topic Model (BiTTM). By modelling topics from core biterms that are potentially relevant to the target query, on one hand, BiTTM captures the context information across documents to alleviate the problem of topic loss or suppression; on the other hand, our proposed model enables the efficient modelling of topics related to specific aspects. Our experiments on nine real-world datasets demonstrate that BiTTM outperforms existing approaches in terms of both effectiveness and efficiency.
Published: 2021
Full Text: View/download PDF

13. Fighting the COVID-19 Infodemic in News Articles and False Publications: The NeoNet Text Classifier, a Supervised Machine Learning Algorithm

Author: Mohammad A. R. Abdeen, Ahmed Abdeen Hamed, and Xindong Wu
Subjects: COVID-19 infodemic, text classification, TF-IDF features, network training modes, supervised learning, misinformation, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: The spread of the Coronavirus pandemic has been accompanied by an infodemic. The false information that is embedded in the infodemic affects people’s ability to have access to safety information and follow proper procedures to mitigate the risks. This research aims to target the falsehood part of the infodemic, which prominently proliferates in news articles and false medical publications. Here, we present NeoNet, a novel supervised machine learning algorithm that analyzes the content of a document (a news article or a medical publication) and assigns a label to it. The algorithm was trained by Term Frequency Inverse Document Frequency (TF-IDF) bigram features, which contribute a network training model. The algorithm was tested on two different real-world datasets from the CBC news network and COVID-19 publications. In five different fold comparisons, the algorithm predicted a label of an article with a precision of 97–99%. When compared with prominent algorithms such as Neural Networks, SVM, and Random Forests, NeoNet surpassed them. The analysis highlighted the promise of NeoNet in detecting disputed online contents, which may contribute negatively to the COVID-19 pandemic.
Published: 2021
Full Text: View/download PDF
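
Note on the features: the abstract above mentions TF-IDF bigram features. The following is a minimal sketch of one common TF-IDF variant applied to word bigrams (term frequency normalized by document length, inverse document frequency as log(N/df)). It is not the NeoNet algorithm, and the exact weighting used in the paper is not stated in the abstract. The function names and the two sample sentences are hypothetical.

```python
import math
from collections import Counter


def bigrams(text: str) -> list[str]:
    """Split on whitespace and return adjacent word pairs."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf_bigrams(docs: list[str]) -> list[dict[str, float]]:
    """Return one {bigram: weight} mapping per document."""
    per_doc = [Counter(bigrams(doc)) for doc in docs]
    # Document frequency: in how many documents each bigram appears.
    doc_freq = Counter()
    for counts in per_doc:
        doc_freq.update(counts.keys())
    n_docs = len(docs)
    weights = []
    for counts in per_doc:
        total = sum(counts.values())
        weights.append({
            bg: (count / total) * math.log(n_docs / doc_freq[bg])
            for bg, count in counts.items()
        })
    return weights


# Two made-up sentences, for illustration only.
sample = ["covid vaccine is safe", "covid vaccine causes harm"]
for doc_weights in tfidf_bigrams(sample):
    print(doc_weights)
```

A bigram shared by every document (here "covid vaccine") gets weight zero under this variant, while bigrams unique to one document receive positive weight, which is what makes them useful as discriminating features.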

14. Knowledge Engineering With Big Data (BigKE): A 54-Month, 45-Million RMB, 15-Institution National Grand Project

Author: Xindong Wu, Huanhuan Chen, Jun Liu, Gongqing Wu, Ruqian Lu, and Nanning Zheng
Subjects: Knowledge engineering, data mining, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Starting in July 2016, the Ministry of Science and Technology of China, along with several other national agencies, sponsors a 54-month 45-million RMB (Chinese Yuan) project on knowledge engineering with Big Data (www.bigke.org) for 15 top research and development institutions to study the fundamental theory and the applications of BigKE, a big-data knowledge engineering framework that handles fragmented knowledge modeling and online learning from multiple information sources, nonlinear fusion on fragmented knowledge, and automated demand-driven knowledge navigation. The project seeks to provide petabyte-scale data and knowledge services in identified application domains. In this paper, we discuss our BigKE framework, and present a novel application scenario for BigKE services.
Published: 2017
Full Text: View/download PDF

15. A degree-based block model and a local expansion optimization algorithm for anti-community detection in networks.

Author: Jiajing Zhu, Yongguo Liu, Changhong Yang, Wen Yang, Zhi Chen, Yun Zhang, Shangming Yang, and Xindong Wu
Subjects: Medicine, Science
Abstract: Anti-community detection in networks can discover negative relations among objects. However, few studies pay attention to detecting anti-community structure; those that do ignore the node degree, and most of them incur a high computational cost. Block models are promising methods for exploring modular regularities, but their results are highly dependent on the observed structure. In this paper, we first propose a Degree-based Block Model (DBM) for anti-community structure. DBM takes the node degree into consideration and evolves a new objective function Q(C) for evaluation. Then, a Local Expansion Optimization Algorithm (LEOA), which preferentially considers the nodes with high degree, is proposed for anti-community detection. LEOA consists of three stages: structural center detection, local anti-community expansion and group membership adjustment. Based on the formulation of DBM, we develop a synthetic benchmark DBM-Net for evaluating comparison algorithms in detecting known anti-community structures. Experiments on DBM-Net with up to 100000 nodes and 17 real-world networks demonstrate the effectiveness and efficiency of LEOA for anti-community detection in networks.
Published: 2018
Full Text: View/download PDF

16. Expanding dictionary for robust face recognition: pixel is not necessary while sparsity is

Author: Zhong‐Qiu Zhao, Yiu‐ming Cheung, Haibo Hu, and Xindong Wu
Subjects: robust face recognition, robust sparse representation model, RSR model, pixel space, identity matrix, face image occlusion reconstruction, Computer applications to medicine. Medical informatics, R858-859.7, Computer software, QA76.75-76.765
Abstract: Since sparse representation (SR) was first introduced into robust face recognition, the argument has lasted for several years about whether sparsity can improve robust face recognition or not. Some work argued that the robust sparse representation (RSR) model has a similar recognition rate to the non‐sparse solution, while it needs a much higher computational cost due to the larger feature dimensionality in the pixel space. In this study, the authors reveal that the standard RSR model, which expands the dictionary with the identity matrix to reconstruct corruption or occlusion in face images, is essentially a non‐sparse solution with a relatively large residual. The reason why the RSR model underperforms may be its inappropriately expanded bases rather than the sparsity itself. Thereby, this study proposes to design a dictionary with an expanded noise bases set which can precisely reconstruct any corruption or occlusion in face images in a subspace. Experimental results show that the algorithm can greatly improve recognition rates for robust face recognition. In addition, the algorithm can be simply performed in a subspace with a small feature dimensionality, and is thus efficient enough for real systems. This study leads to the conclusion that solving the approximation problem in raw pixel space is not necessary for robust face recognition, while solving in a subspace with a much smaller feature dimensionality is enough when the dictionary is well expanded. Finally, this study also confirms that sparsity plays an important role in SR based classification.
Published: 2015
Full Text: View/download PDF

17. Online Streaming Feature Selection via Conditional Independence

Author: Dianlong You, Xindong Wu, Limin Shen, Yi He, Xu Yuan, Zhen Chen, Song Deng, and Chuan Ma
Subjects: streaming feature, feature selection, conditional independence, markov blanket, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Online feature selection is a challenging topic in data mining. It aims to reduce the dimensionality of streaming features by removing irrelevant and redundant features in real time. Existing works, such as Alpha-investing and Online Streaming Feature Selection (OSFS), have been proposed to serve this purpose, but they have drawbacks, including low prediction accuracy and high running time if the streaming features exhibit characteristics such as low redundancy and high relevance. In this paper, we propose a novel algorithm for online streaming feature selection, named ConInd, that uses a three-layer filtering strategy to process streaming features with the aim of overcoming such drawbacks. Through three-layer filtering, i.e., null-conditional independence, single-conditional independence, and multi-conditional independence, we can obtain an approximate Markov blanket with high accuracy and low running time. To validate the efficiency, we implemented the proposed algorithm and tested its performance on prevalent datasets, i.e., NIPS 2003 and Causality Workbench. Through extensive experimental results, we demonstrated that ConInd offers significant performance improvements in prediction accuracy and running time compared to Alpha-investing and OSFS. ConInd offers 5.62% higher average prediction accuracy than Alpha-investing, with a 53.56% lower average running time compared to that for OSFS when the dataset is lowly redundant and highly relevant. In addition, the ratio of the average number of features for ConInd is 242% less than that for Alpha-investing.
Published: 2018
Full Text: View/download PDF

18. Context-Aware Reviewer Assignment for Trust Enhanced Peer Review.

Author: Lei Li, Yan Wang, Guanfeng Liu, Meng Wang, and Xindong Wu
Subjects: Medicine, Science
Abstract: Reviewer assignment is critical to peer review systems, such as peer-reviewed research conferences or peer-reviewed funding applications, and its effectiveness is a deep concern of all academics. However, there are some problems in existing peer review systems during reviewer assignment. For example, some of the reviewers are much more stringent than others, leading to an unfair final decision, i.e., some submissions (i.e., papers or applications) with better quality are rejected. In this paper, we propose a context-aware reviewer assignment for trust enhanced peer review. More specifically, in our approach, we first consider the research area specific expertise of reviewers, and the institution relevance and co-authorship between reviewers and authors, so that reviewers with the right expertise are assigned to the corresponding submissions without potential conflict of interest. In addition, we propose a novel cross-assignment paradigm, and reviewers are cross-assigned in order to avoid assigning a group of stringent reviewers or a group of lenient reviewers to the same submission. More importantly, on top of them, we propose an academic CONtext-aware expertise relevanCe oriEnted Reviewer cross-assignmenT approach (CONCERT), which aims to effectively estimate the "true" ratings of submissions based on the ratings from all reviewers, even though no prior knowledge exists about the distribution of stringent reviewers and lenient reviewers. The experiments illustrate that, compared with existing approaches, our proposed CONCERT approach is less likely to assign more than one stringent reviewer or more than one lenient reviewer to the same submission, and significantly reduces the influence of ratings from stringent reviewers and lenient reviewers, leading to trust enhanced peer review and selection regardless of how stringent and lenient reviewers are distributed.
Published: 2015
Full Text: View/download PDF

19. Towards Faster Deep Graph Clustering via Efficient Graph Auto-Encoder.

Author: Shifei Ding, Benyu Wu, Ling Ding 0001, Xiao Xu 0006, Lili Guo, Hongmei Liao, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

20. Concept Evolution Detecting over Feature Streams.

Author: Peng Zhou 0008, Yufeng Guo, Haoran Yu, Yuanting Yan, Yanping Zhang, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

21. Heterogeneous Meta-Path Graph Learning for Higher-Order Social Recommendation.

Author: Munan Li, Kai Liu, Hongbo Liu 0001, Zheng Zhao, Tomas E. Ward, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

22. Online Learning for Data Streams With Incomplete Features and Labels.

Author: Dianlong You, Huigui Yan, Jiawei Xiao, Zhen Chen 0007, Di Wu 0056, Limin Shen, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

23. RNP-Miner: Repetitive Nonoverlapping Sequential Pattern Mining.

Author: Meng Geng, Youxi Wu, Yan Li 0087, Jing Liu, Philippe Fournier-Viger, Xingquan Zhu 0001, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

24. Diverse Structure-Aware Relation Representation in Cross-Lingual Entity Alignment.

Author: Yuhong Zhang 0002, Jianqing Wu, Kui Yu, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

25. DeepCPR: Deep Path Reasoning Using Sequence of User-Preferred Attributes for Conversational Recommendation.

Author: Huiting Liu, Yu Zhang, Peipei Li 0001, Cheng Qian, Peng Zhao 0010, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

26. FHCPL: An Intelligent Fixed-Horizon Constrained Policy Learning System for Risk-Sensitive Industrial Scenario.

Author: Ke Lin 0001, Duantengchuan Li, Yanjie Li, Shiyu Chen, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

27. Multiaperture Visual Velocity Measurement Method Based on Biomimetic Compound-Eye for UAVs.

Author: Chong Shen 0001, Xin Zhao, Xindong Wu 0003, Huiliang Cao, Chenguang Wang 0007, Jun Tang, and Jun Liu 0005
Published: 2024
Full Text: View/download PDF

28. An effective algorithm for genealogical graph partitioning.

Author: Shaojing Sheng, Zan Zhang, Peng Zhou 0006, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

29. LocalTGEP: A Lightweight Edge Partitioner for Time-Varying Graph.

Author: Shengwei Ji, Chenyang Bu, Lei Li 0002, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

30. Online Heterogeneous Streaming Feature Selection Without Feature Type Information.

Author: Peng Zhou 0008, Yunyun Zhang, Zhaolong Ling, Yuanting Yan, Shu Zhao, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

31. Self-Adaptive Deep Asymmetric Network for Imbalanced Recommendation.

Author: Yi Zhu 0006, Yishuai Geng, Yun Li 0010, Jipeng Qiang, Yunhao Yuan, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

32. Fuzzy Ranking-Based Preference Completion via Graph Pattern Matching and Rematching.

Author: Lei Li 0002, Pan Liu, Chenyang Bu, Zan Zhang, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

33. Geometric-Contextual Mutual Infomax Path Aggregation for Relation Reasoning on Knowledge Graph.

Author: Xingrui Zhuo, Gongqing Wu, Zan Zhang, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

34. Unifying Large Language Models and Knowledge Graphs: A Roadmap.

Author: Shirui Pan, Linhao Luo, Yufei Wang 0003, Chen Chen 0115, Jiapu Wang, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

35. Give us the Facts: Enhancing Large Language Models With Knowledge Graphs for Fact-Aware Language Modeling.

Author: Linyao Yang, Hongyang Chen, Zhao Li 0007, Xiao Ding, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

36. COPP-Miner: Top-k Contrast Order-Preserving Pattern Mining for Time Series Classification.

Author: Youxi Wu, Yufei Meng, Yan Li 0087, Lei Guo 0015, Xingquan Zhu 0001, Philippe Fournier-Viger, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

37. A Comprehensive Survey on Automatic Knowledge Graph Construction.

Author: Lingfeng Zhong, Jia Wu 0001, Qian Li, Hao Peng 0001, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

38. Shared Manifold Regularized Joint Feature Selection for Joint Classification and Regression in Alzheimer's Disease Diagnosis.

Author: Zhi Chen 0014, Yongguo Liu, Yun Zhang 0019, Jiajing Zhu, Qiaoqin Li, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

39. Intrusion Detection Models Based on Data Mining

Author: Guojun Mao, Xindong Wu, and Xuxian Jiang
Subjects: Intrusion detection, data mining, frequency pattern, tree pattern, Electronic computers. Computer science, QA75.5-76.95
Abstract: Computer intrusions are taking place everywhere and have become a major concern for information security. Most intrusions into a computer system may result from illegitimate or irregular calls to the operating system, so analyzing system-call sequences is an important and fundamental technique for detecting potential intrusions. This paper proposes two models for intrusion detection based on data mining technology, called frequency patterns and tree patterns. The frequency-patterns model employs a typical method of sequential mining based on frequency analysis and uses a short-sequence model to quickly find frequent sequential patterns in the training system-call sequences. The tree-patterns model makes use of tree pattern mining and can derive a quality profile from the training system-call sequences of a given system. Experimental results show that the frequency-patterns model performs well in training and in detecting intrusions from short system-call sequences, and that the tree-patterns model can achieve high detection precision when handling long sequences.
Published: 2012
Full Text: View/download PDF

40. Intent-guided Heterogeneous Graph Contrastive Learning for Recommendation.

Author: Lei Sang, Yu Wang, Yi Zhang 0103, Yiwen Zhang 0001, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

41. Transformer-based Graph Neural Networks for Battery Range Prediction in AIoT Battery-Swap Services.

Author: Zhao Li 0007, Yang Liu 0320, Chuan Zhou 0001, Xuanwu Liu, Xuming Pan, Buqing Cao, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

42. High-degree penalty based global statistical network embedding for name disambiguation in anonymized graph.

Author: Shengxing Bai, Chenyang Bu, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

43. Synergistic Deep Graph Clustering Network.

Author: Benyu Wu, Shifei Ding, Xiao Xu 0006, Lili Guo, Ling Ding 0001, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

44. Co-occurrence order-preserving pattern mining.

Author: Youxi Wu, Zhen Wang, Yan Li 0087, Yingchun Guo, He Jiang 0001, Xingquan Zhu 0001, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

45. ConsistencyDet: A Robust Object Detector with a Denoising Paradigm of Consistency Model.

Author: Lifan Jiang, Zhihui Wang 0003, Changmiao Wang, Ming Li 0065, Jiaxu Leng, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF

46. Representation learning: serial-autoencoder for personalized recommendation.

Author: Yi Zhu 0006, Yishuai Geng, Yun Li 0010, Jipeng Qiang, and Xindong Wu 0001
Published: 2024
Full Text: View/download PDF


1,597 results on '"Xindong Wu"'
