3,001 results on '"Xu, Yao"'
Search Results
2. Xu, Yao
- Published
- 2022
3. StylePrompter: Enhancing Domain Generalization with Test-Time Style Priors
- Author
- Zhang, Jiao, Xu, Jian, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In real-world applications, the sample distribution at the inference stage often differs from the one at the training stage, causing performance degradation of trained deep models. The research on domain generalization (DG) aims to develop robust algorithms that can improve the generalized performance in unseen domains by training on a few domains. However, the domain-agnostic vision model, trained on a limited number of domains using traditional domain generalization methods, cannot guarantee its effectiveness in dealing with unseen domains. The introduction of language can break the closed cognition space of the vision model, providing additional semantic information that cannot be inferred from vision-only datasets. In this paper, we propose to overcome the challenge in previous DG methods by introducing the style prompt in the language modality to adapt the trained model dynamically. In particular, we train a style prompter to extract style information of the current image into an embedding in the token embedding space and place it in front of the candidate category words as prior knowledge to prompt the model. Our open space partition of the style token embedding space and the hand-crafted style regularization enable the trained style prompter to handle data from unknown domains effectively. Extensive experiments verify the effectiveness of our method and demonstrate state-of-the-art performances on multiple public datasets. Codes will be available after the acceptance of this paper.
- Published
- 2024
4. Enabling Practical Transparent Checkpointing for MPI: A Topological Sort Approach
- Author
- Xu, Yao and Cooperman, Gene
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing; D.1.3
- Abstract
MPI is the de facto standard for parallel computing on a cluster of computers. Checkpointing is an important component in any strategy for software resilience and for long-running jobs that must be executed by chaining together time-bounded resource allocations. This work solves an old problem: a practical and general algorithm for transparent checkpointing of MPI that is both efficient and compatible with most of the latest network software. Transparent checkpointing is attractive due to its generality and ease of use for most MPI application developers. Earlier efforts at transparent checkpointing for MPI, one decade ago, had two difficult problems: (i) reliance on a specific MPI implementation tied to a specific network technology; and (ii) failure to demonstrate sufficiently low runtime overhead. Problem (i) (network dependence) was already solved in 2019 by MANA's introduction of split processes. Problem (ii) (efficient runtime overhead) is solved in this work. This paper introduces an approach that avoids these limitations, employing a novel topological sort to algorithmically determine a safe future synchronization point. The algorithm is valid for both blocking and non-blocking collective communication in MPI. We demonstrate the efficacy and scalability of our approach through both micro-benchmarks and a set of five real-world MPI applications, notably including the widely used VASP (Vienna Ab Initio Simulation Package), which is responsible for 11% of the workload on the Perlmutter supercomputer at Lawrence Berkeley National Laboratory. VASP was previously cited as a special challenge for checkpointing, in part due to its multi-algorithm codes.
- Comment
22 pages, 9 figures and 1 table, accepted to IEEE Cluster'24
- Published
- 2024
5. PASS++: A Dual Bias Reduction Framework for Non-Exemplar Class-Incremental Learning
- Author
- Zhu, Fei, Zhang, Xu-Yao, Cheng, Zhen, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract
Class-incremental learning (CIL) aims to recognize new classes incrementally while maintaining the discriminability of old classes. Most existing CIL methods are exemplar-based, i.e., storing a part of old data for retraining. Without relearning old data, those methods suffer from catastrophic forgetting. In this paper, we figure out two inherent problems in CIL, i.e., representation bias and classifier bias, that cause catastrophic forgetting of old knowledge. To address these two biases, we present a simple and novel dual bias reduction framework that employs self-supervised transformation (SST) in input space and prototype augmentation (protoAug) in deep feature space. On the one hand, SST alleviates the representation bias by learning generic and diverse representations that can transfer across different tasks. On the other hand, protoAug overcomes the classifier bias by explicitly or implicitly augmenting prototypes of old classes in the deep feature space, which poses tighter constraints to maintain previously learned decision boundaries. We further propose hardness-aware prototype augmentation and multi-view ensemble strategies, leading to significant improvements. The proposed framework can be easily integrated with pre-trained models. Without storing any samples of old classes, our method can perform comparably with state-of-the-art exemplar-based approaches which store plenty of old data. We hope to draw the attention of researchers back to non-exemplar CIL by rethinking the necessity of storing old samples in CIL.
- Published
- 2024
6. From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
- Author
- Liao, Huanxuan, Xu, Yao, He, Shizhu, Zhang, Yuanzhe, Hao, Yanchao, Liu, Shengping, Liu, Kang, and Zhao, Jun
- Subjects
Computer Science - Computation and Language
- Abstract
Large language models (LLMs) have acquired the ability to solve general tasks by utilizing instruction finetuning (IFT). However, IFT still relies heavily on instance training of extensive task data, which greatly limits the adaptability of LLMs to real-world scenarios where labeled task instances are scarce and broader task generalization becomes paramount. Contrary to LLMs, humans acquire skills and complete tasks not merely through repeated practice but also by understanding and following instructional guidelines. This paper is dedicated to simulating human learning to address the shortcomings of instance training, focusing on instruction learning to enhance cross-task generalization. Within this context, we introduce Task Adapters Generation from Instructions (TAGI), which automatically constructs the task-specific model in a parameter generation manner based on the given task instructions without retraining for unseen tasks. Specifically, we utilize knowledge distillation to enhance the consistency between TAGI developed through Learning with Instruction and task-specific models developed through Training with Instance, by aligning the labels, output logits, and adapter parameters between them. TAGI is endowed with cross-task generalization capabilities through a two-stage training process that includes hypernetwork pretraining and finetuning. We evaluate TAGI on the Super-Natural Instructions and P3 datasets. The experimental results demonstrate that TAGI can match or even outperform traditional meta-trained models and other hypernetwork models, while significantly reducing computational requirements.
- Published
- 2024
7. Differentiable Proximal Graph Matching
- Author
- Tan, Haoru, Wang, Chuang, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Graph matching is a fundamental tool in computer vision and pattern recognition. In this paper, we introduce an algorithm for graph matching based on the proximal operator, referred to as differentiable proximal graph matching (DPGM). Specifically, we relax and decompose the quadratic assignment problem for the graph matching into a sequence of convex optimization problems. The whole algorithm can be considered as a differentiable map from the graph affinity matrix to the prediction of node correspondence. Therefore, the proposed method can be organically integrated into an end-to-end deep learning framework to jointly learn both the deep feature representation and the graph affinity matrix. In addition, we provide a theoretical guarantee to ensure the proposed method converges to a stable point with a reasonable number of iterations. Numerical experiments show that DPGM outperforms existing graph matching algorithms on diverse datasets such as synthetic data and CMU House. Meanwhile, DPGM can fully harness the capability of deep feature extractors and achieve state-of-the-art performance on PASCAL VOC keypoints.
- Published
- 2024
8. Mapping dissolved carbon in space and time: An experimental technique for the measurement of pH and total carbon concentration in density driven convection of CO$_2$ dissolved in water
- Author
- Birggison, Hilmar Yngvi, Xu, Yao, Moura, Marcel, Flekkøy, Eirik Grude, and Måløy, Knut Jørgen
- Subjects
Physics - Fluid Dynamics
- Abstract
We present an experimental technique for determining the pH and the total carbon concentration when CO$_2$ diffuses and flows in water. The technique employs three different pH indicators, which, when combined with an image analysis technique, provide a dynamic range in pH from 4.0 to 9.5. In contrast to usual techniques in which a single pH indicator is used, the methodology presented makes it possible not only to produce a binary classification (pH larger or smaller than a given threshold) but also to access a much more complete continuous spatial distribution of pH and concentration levels in the system. We calibrate the method against benchmark solutions and further demonstrate its potential by measuring the pH and total carbon concentration in a density driven convection (DDC) of carbon-enriched water. The motivation for testing the method in this particular experiment comes from the fact that DDC plays a pivotal role in the efficiency of engineered carbon storage processes. The application of the technique presented here provided a direct window for the analysis of the spatial distribution of captured carbon in the DDC flow.
- Comment
Supplementary Material containing videos of spatiotemporal pH and carbon concentration can be found in Zenodo via the link: https://doi.org/10.5281/zenodo.11148678
- Published
- 2024
9. Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering
- Author
- Xu, Yao, He, Shizhu, Chen, Jiabei, Wang, Zihao, Song, Yangqiu, Tong, Hanghang, Liu, Kang, and Zhao, Jun
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract
To address the issue of insufficient knowledge and the tendency to generate hallucinations in Large Language Models (LLMs), numerous studies have endeavored to integrate LLMs with Knowledge Graphs (KGs). However, all these methods are evaluated on conventional Knowledge Graph Question Answering (KGQA) with complete KGs, where the factual triples involved in each question are entirely covered by the given KG. In this situation, the LLM mainly acts as an agent to find answer entities by exploring the KG, rather than effectively integrating internal and external knowledge sources. However, in real-world scenarios, KGs are often too incomplete to cover all the knowledge required to answer questions. To simulate real-world scenarios and evaluate the ability of LLMs to integrate internal and external knowledge, in this paper, we propose leveraging LLMs for QA under Incomplete Knowledge Graph (IKGQA), where the given KG doesn't include all the factual triples involved in each question. To handle IKGQA, we propose a training-free method called Generate-on-Graph (GoG) that can generate new factual triples while exploring on KGs. Specifically, we propose a selecting-generating-answering framework, which not only treats the LLM as an agent to explore on KGs, but also treats it as a KG to generate new facts based on the explored subgraph and its inherent knowledge. Experimental results on two datasets demonstrate that our GoG can solve IKGQA to a certain extent, while almost all previous methods cannot perform well on IKGQA.
- Published
- 2024
10. Unified Entropy Optimization for Open-Set Test-Time Adaptation
- Author
- Gao, Zhengqing, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Test-time adaptation (TTA) aims at adapting a model pre-trained on the labeled source domain to the unlabeled target domain. Existing methods usually focus on improving TTA performance under covariate shifts, while neglecting semantic shifts. In this paper, we delve into a realistic open-set TTA setting where the target domain may contain samples from unknown classes. Many state-of-the-art closed-set TTA methods perform poorly when applied to open-set scenarios, which can be attributed to the inaccurate estimation of data distribution and model confidence. To address these issues, we propose a simple but effective framework called unified entropy optimization (UniEnt), which is capable of simultaneously adapting to covariate-shifted in-distribution (csID) data and detecting covariate-shifted out-of-distribution (csOOD) data. Specifically, UniEnt first mines pseudo-csID and pseudo-csOOD samples from test data, followed by entropy minimization on the pseudo-csID data and entropy maximization on the pseudo-csOOD data. Furthermore, we introduce UniEnt+ to alleviate the noise caused by hard data partition leveraging sample-level confidence. Extensive experiments on CIFAR benchmarks and Tiny-ImageNet-C show the superiority of our framework. The code is available at https://github.com/gaozhengqing/UniEnt
- Comment
CVPR 2024
- Published
- 2024
11. Quantum gravity of the Heisenberg algebra
- Author
- Almheiri, Ahmed, Goel, Akash, and Hu, Xu-Yao
- Subjects
High Energy Physics - Theory; Condensed Matter - Strongly Correlated Electrons; General Relativity and Quantum Cosmology
- Abstract
We consider a simplified model of double scaled SYK (DSSYK) in which the Hamiltonian is the position operator of the Harmonic oscillator. This model captures the high temperature limit of DSSYK but could also be defined as a quantum theory in its own right. We study properties of the emergent geometry including its dynamics in response to inserting matter particles. In particular, we find that the model displays de Sitter-like properties such as that infalling matter reduces the rate of growth of geodesic slices between the two boundaries. The simplicity of the model allows us to compute the full generating functional for correlation functions of the length mode or any number of matter operators. We provide evidence that the effective action of the geodesic length between boundary points is non-local. Furthermore, we use the on-shell solution for the geodesic lengths between any two boundary points to reconstruct an effective bulk metric and reverse engineer the dilaton gravity theory that generates this metric as a solution.
- Comment
30 pages + appendices; v2: typos corrected, references added
- Published
- 2024
12. Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models
- Author
- Liao, Huanxuan, He, Shizhu, Xu, Yao, Zhang, Yuanzhe, Liu, Kang, Liu, Shengping, and Zhao, Jun
- Subjects
Computer Science - Computation and Language
- Abstract
Retrieval-Augmented-Generation and Generation-Augmented-Generation have been proposed to enhance the knowledge required for question answering over Large Language Models (LLMs). However, the former relies on external resources, and both require incorporating explicit documents into the context, which increases execution costs and susceptibility to noise data. Recent works indicate that LLMs have modeled rich knowledge, albeit not effectively triggered or awakened. Inspired by this, we propose a novel knowledge-augmented framework, Imagination-Augmented-Generation (IAG), which simulates the human capacity to compensate for knowledge deficits while answering questions solely through imagination, thereby awakening relevant knowledge in LLMs without relying on external resources. Guided by IAG, we propose an imagine richer context method for question answering (IMcQA). IMcQA consists of two modules: explicit imagination, which generates a short dummy document by learning from long context compression, and implicit imagination, which creates flexible adapters by distilling from a teacher model with a long context. Experimental results on three datasets demonstrate that IMcQA exhibits significant advantages in both open-domain and closed-book settings, as well as in out-of-distribution generalization. Our code will be available at https://github.com/Xnhyacinth/IAG.
- Published
- 2024
13. Ensemble Quadratic Assignment Network for Graph Matching
- Author
- Tan, Haoru, Wang, Chuang, Wu, Sitong, Zhang, Xu-Yao, Yin, Fei, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Graph matching is a commonly used technique in computer vision and pattern recognition. Recent data-driven approaches have improved the graph matching accuracy remarkably, whereas some traditional algorithm-based methods are more robust to feature noises, outlier nodes, and global transformation (e.g., rotation). In this paper, we propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods. In the GNN framework, we transform traditional graph-matching solvers into single-channel GNNs on the association graph and extend the single-channel architecture to the multi-channel network. The proposed model can be seen as an ensemble method that fuses multiple algorithms at every iteration. Instead of averaging the estimates at the end of the ensemble, in our approach, the independent iterations of the ensembled algorithms exchange their information after each iteration via a 1x1 channel-wise convolution layer. Experiments show that our model improves the performance of traditional algorithms significantly. In addition, we propose a random sampling strategy to reduce the computational complexity and GPU memory usage, so the model applies to matching graphs with thousands of nodes. We evaluate the performance of our method on three tasks: geometric graph matching, semantic feature matching, and few-shot 3D shape classification. The proposed model performs comparably to or outperforms the best existing GNN-based methods.
- Comment
Accepted by IJCV in 2024
- Published
- 2024
14. Active Generalized Category Discovery
- Author
- Ma, Shijie, Zhu, Fei, Zhong, Zhun, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Generalized Category Discovery (GCD) is a pragmatic and challenging open-world task, which endeavors to cluster unlabeled samples from both novel and old classes, leveraging some labeled data of old classes. Given that knowledge learned from old classes is not fully transferable to new classes, and that novel categories are fully unlabeled, GCD inherently faces intractable problems, including imbalanced classification performance and inconsistent confidence between old and new classes, especially in the low-labeling regime. Hence, some annotations of new classes are deemed necessary. However, labeling new classes is extremely costly. To address this issue, we take the spirit of active learning and propose a new setting called Active Generalized Category Discovery (AGCD). The goal is to improve the performance of GCD by actively selecting a limited amount of valuable samples for labeling from the oracle. To solve this problem, we devise an adaptive sampling strategy, which jointly considers novelty, informativeness and diversity to adaptively select novel samples with proper uncertainty. However, owing to the varied orderings of label indices caused by the clustering of novel classes, the queried labels are not directly applicable to subsequent training. To overcome this issue, we further propose a stable label mapping algorithm that transforms ground truth labels to the label space of the classifier, thereby ensuring consistent training across different active selection stages. Our method achieves state-of-the-art performance on both generic and fine-grained datasets. Our code is available at https://github.com/mashijie1028/ActiveGCD
- Comment
Accepted to CVPR 2024
- Published
- 2024
15. Revisiting Confidence Estimation: Towards Reliable Failure Prediction
- Author
- Zhu, Fei, Zhang, Xu-Yao, Cheng, Zhen, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract
Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications. However, modern deep neural networks are often overconfident for their incorrect predictions, i.e., misclassified samples from known classes, and out-of-distribution (OOD) samples from unknown classes. In recent years, many confidence calibration and OOD detection methods have been developed. In this paper, we find a general, widely existing but actually-neglected phenomenon that most confidence estimation methods are harmful for detecting misclassification errors. We investigate this problem and reveal that popular calibration and OOD detection methods often lead to worse confidence separation between correctly classified and misclassified examples, making it difficult to decide whether to trust a prediction or not. Finally, we propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance under various settings including balanced, long-tailed, and covariate-shift classification scenarios. Our study not only provides a strong baseline for reliable confidence estimation but also acts as a bridge between understanding calibration, OOD detection, and failure prediction. The code is available at https://github.com/Impression2805/FMFP
- Comment
Accepted by IEEE TPAMI. arXiv admin note: text overlap with arXiv:2303.02970; text overlap with arXiv:2007.01458 by other authors
- Published
- 2024
16. Open-world Machine Learning: A Review and New Outlooks
- Author
- Zhu, Fei, Ma, Shijie, Cheng, Zhen, Zhang, Xu-Yao, Zhang, Zhaoxiang, and Liu, Cheng-Lin
- Subjects
Computer Science - Machine Learning; Computer Science - Computer Vision and Pattern Recognition
- Abstract
Machine learning has achieved remarkable success in many applications. However, existing studies are largely based on the closed-world assumption, which assumes that the environment is stationary, and the model is fixed once deployed. In many real-world applications, this fundamental and rather naive assumption may not hold because an open environment is complex, dynamic, and full of unknowns. In such cases, rejecting unknowns, discovering novelties, and then incrementally learning them, could enable models to be safe and evolve continually as biological systems do. This paper provides a holistic view of open-world machine learning by investigating unknown rejection, novel class discovery, and class-incremental learning in a unified paradigm. The challenges, principles, and limitations of current methodologies are discussed in detail. Finally, we discuss several potential directions for future research. This paper aims to provide a comprehensive introduction to the emerging open-world machine learning paradigm, to help researchers build more powerful AI systems in their respective fields, and to promote the development of artificial general intelligence.
- Published
- 2024
17. PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning
- Author
- Guo, Haiyang, Zhu, Fei, Liu, Wenzhuo, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Existing federated learning methods have effectively dealt with decentralized learning in scenarios involving data privacy and non-IID data. However, in real-world situations, each client dynamically learns new classes, requiring the global model to classify all seen classes. To effectively mitigate catastrophic forgetting and data heterogeneity under low communication costs, we propose a simple and effective method named PILoRA. On the one hand, we adopt prototype learning to learn better feature representations and leverage the heuristic information between prototypes and class features to design a prototype re-weight module to solve the classifier bias caused by data heterogeneity without retraining the classifier. On the other hand, we view incremental learning as the process of learning distinct task vectors and encoding them within different LoRA parameters. Accordingly, we propose Incremental LoRA to mitigate catastrophic forgetting. Experimental results on standard datasets indicate that our method outperforms the state-of-the-art approaches significantly. More importantly, our method exhibits strong robustness and superiority in different settings and degrees of data heterogeneity. The code is available at https://github.com/Ghy0501/PILoRA
- Comment
ECCV 2024
- Published
- 2024
18. Unified Classification and Rejection: A One-versus-All Framework
- Author
- Cheng, Zhen, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract
Classifying patterns of known classes and rejecting ambiguous and novel (also called out-of-distribution (OOD)) inputs are involved in open world pattern recognition. Deep neural network models usually excel in closed-set classification but perform poorly in rejecting OOD inputs. To tackle this problem, numerous methods have been designed to perform open set recognition (OSR) or OOD rejection/detection tasks. Previous methods mostly take post-training score transformation or hybrid models to ensure low scores on OOD inputs while separating known classes. In this paper, we attempt to build a unified framework for building open set classifiers for both classification and OOD rejection. We formulate the open set recognition of $K$-known-class as a $(K+1)$-class classification problem with a model trained on known-class samples only. By decomposing the $K$-class problem into $K$ one-versus-all (OVA) binary classification tasks and binding some parameters, we show that combining the scores of OVA classifiers can give $(K+1)$-class posterior probabilities, which enables classification and OOD rejection in a unified framework. To maintain the closed-set classification accuracy of the OVA trained classifier, we propose a hybrid training strategy combining OVA loss and multi-class cross-entropy loss. We implement the OVA framework and hybrid training strategy on the recently proposed convolutional prototype network and prototype classifier on vision transformer (ViT) backbone. Experiments on popular OSR and OOD detection datasets demonstrate that the proposed framework, using a single multi-class classifier, yields competitive performance in closed-set classification, OOD detection, and misclassification detection.
- Comment
Published in Machine Intelligence Research (https://link.springer.com/article/10.1007/s11633-024-1514-4)
- Published
- 2023
19. Query2Triple: Unified Query Encoding for Answering Diverse Complex Queries over Knowledge Graphs
- Author
- Xu, Yao, He, Shizhu, Wang, Cunguang, Cai, Li, Liu, Kang, and Zhao, Jun
- Subjects
Computer Science - Artificial Intelligence
- Abstract
Complex Query Answering (CQA) is a challenging task on Knowledge Graphs (KGs). Due to the incompleteness of KGs, query embedding (QE) methods have been proposed to encode queries and entities into the same embedding space, and treat logical operators as neural set operators to obtain answers. However, these methods train KG embeddings and neural set operators concurrently on both simple (one-hop) and complex (multi-hop and logical) queries, which causes performance degradation on simple queries and low training efficiency. In this paper, we propose Query to Triple (Q2T), a novel approach that decouples the training for simple and complex queries. Q2T divides the training into two stages: (1) Pre-training a neural link predictor on simple queries to predict tail entities based on the head entity and relation. (2) Training a query encoder on complex queries to encode diverse complex queries into a unified triple form that can be efficiently solved by the pretrained neural link predictor. Our proposed Q2T is not only efficient to train, but also modular, thus easily adaptable to various neural link predictors that have been studied well. Extensive experiments demonstrate that, even without explicit modeling for neural set operators, Q2T still achieves state-of-the-art performance on diverse complex queries over three public benchmarks.
- Comment
Accepted by EMNLP 2023 findings
- Published
- 2023
20. Implementation-Oblivious Transparent Checkpoint-Restart for MPI
- Author
- Xu, Yao, Belyaev, Leonid, Jain, Twinkle, Schafer, Derek, Skjellum, Anthony, and Cooperman, Gene
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
This work presents experience with traditional use cases of checkpointing on a novel platform. A single codebase (MANA) transparently checkpoints production workloads for major available MPI implementations: "develop once, run everywhere". The new platform enables application developers to compile their application against any of the available standards-compliant MPI implementations, and test each MPI implementation according to performance or other features.
- Comment
17 pages, 4 figures
- Published
- 2023
21. Towards Reliable Domain Generalization: A New Dataset and Evaluations
- Author
- Zhang, Jiao, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract
There are ubiquitous distribution shifts in the real world. However, deep neural networks (DNNs) are easily biased towards the training set, which causes severe performance degradation when they receive out-of-distribution data. Many methods are studied to train models that generalize under various distribution shifts in the literature of domain generalization (DG). However, the recent DomainBed and WILDS benchmarks challenged the effectiveness of these methods. Aiming at the problems in the existing research, we propose a new domain generalization task for handwritten Chinese character recognition (HCCR) to enrich the application scenarios of DG method research. We evaluate eighteen DG methods on the proposed PaHCC (Printed and Handwritten Chinese Characters) dataset and show that the performance of existing methods on this dataset is still unsatisfactory. Besides, under a designed dynamic DG setting, we reveal more properties of DG methods and argue that only the leave-one-domain-out protocol is unreliable. We advocate that researchers in the DG community refer to dynamic performance of methods for more comprehensive and reliable evaluation. Our dataset and evaluations bring new perspectives to the community for more substantial progress. We will make our dataset public with the article published to facilitate the study of domain generalization.
- Published
- 2023
22. Finite-dimensionality of attractors for wave equations with degenerate nonlocal damping
- Author
- Tang, Zhijun, Yan, Senlin, Xu, Yao, and Zhong, Chengkui
- Subjects
Mathematics - Analysis of PDEs; Mathematics - Dynamical Systems; 37L30, 35B41, 35B40
- Abstract
In this paper we study the fractal dimension of global attractors for a class of wave equations with (single-point) degenerate nonlocal damping. Both the equation and its linearization degenerate into linear wave equations at the degenerate point and the usual approaches to bound the dimension of the entirety of attractors do not work directly. Instead, we develop a new process concerning the dimension near the degenerate point individually and show the finite dimensionality of the attractor., Comment: 33 pages
- Published
- 2023
23. A sensitive fluorescence biosensor based on ligation-transcription and CRISPR/Cas13a-assisted cascade amplification strategies to detect the H1N1 virus
- Author
-
Xue, Lulu, Bu, Shengjun, Xu, Mengyao, Wei, Jiaqi, Zhou, Hongyu, Xu, Yao, Hao, Zhuo, Li, Zehong, and Wan, Jiayu
- Published
- 2024
24. Towards Trustworthy Dataset Distillation
- Author
-
Ma, Shijie, Zhu, Fei, Cheng, Zhen, and Zhang, Xu-Yao
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Efficiency and trustworthiness are two eternal pursuits when applying deep learning in real-world applications. With regard to efficiency, dataset distillation (DD) endeavors to reduce training costs by distilling the large dataset into a tiny synthetic dataset. However, existing methods merely concentrate on in-distribution (InD) classification in a closed-world setting, disregarding out-of-distribution (OOD) samples. On the other hand, OOD detection aims to enhance models' trustworthiness, which is always inefficiently achieved in full-data settings. For the first time, we simultaneously consider both issues and propose a novel paradigm called Trustworthy Dataset Distillation (TrustDD). By distilling both InD samples and outliers, the condensed datasets are capable of training models competent in both InD classification and OOD detection. To alleviate the requirement of real outlier data, we further propose to corrupt InD samples to generate pseudo-outliers, namely Pseudo-Outlier Exposure (POE). Comprehensive experiments on various settings demonstrate the effectiveness of TrustDD, and POE surpasses the state-of-the-art method Outlier Exposure (OE). Compared with the preceding DD, TrustDD is more trustworthy and applicable to open-world scenarios. Our code is available at https://github.com/mashijie1028/TrustDD, Comment: Accepted to Pattern Recognition 2024
- Published
- 2023
25. TensorGPT: Efficient Compression of the Embedding Layer in LLMs based on the Tensor-Train Decomposition
- Author
-
Xu, Mingxue, Xu, Yao Lei, and Mandic, Danilo P.
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning ,Computer Science - Neural and Evolutionary Computing ,Mathematics - Numerical Analysis - Abstract
High-dimensional token embeddings underpin Large Language Models (LLMs), as they can capture subtle semantic information and significantly enhance the modelling of complex language patterns. However, the associated high dimensionality also introduces a considerable number of model parameters and prohibitively high model storage. To address this issue, this work proposes an approach based on the Tensor-Train Decomposition (TTD), where each token embedding is treated as a Matrix Product State (MPS) that can be efficiently computed in a distributed manner. The experimental results on GPT-2 demonstrate that, through our approach, the embedding layer can be compressed by a factor of up to 38.40, and that, at a compression factor of 3.31, the compressed model even outperforms the original GPT-2 model.
- Published
- 2023
26. Quantum gravity of the Heisenberg algebra
- Author
-
Ahmed Almheiri, Akash Goel, and Xu-Yao Hu
- Subjects
2D Gravity ,Field Theories in Lower Dimensions ,Models of Quantum Gravity ,Gauge-Gravity Correspondence ,Nuclear and particle physics. Atomic energy. Radioactivity ,QC770-798 - Abstract
We consider a simplified model of double scaled SYK (DSSYK) in which the Hamiltonian is the position operator of the Harmonic oscillator. This model captures the high temperature limit of DSSYK but could also be defined as a quantum theory in its own right. We study properties of the emergent geometry including its dynamics in response to inserting matter particles. In particular, we find that the model displays de Sitter-like properties such as that infalling matter reduces the rate of growth of geodesic slices between the two boundaries. The simplicity of the model allows us to compute the full generating functional for correlation functions of the length mode or any number of matter operators. We provide evidence that the effective action of the geodesic length between boundary points is non-local. Furthermore, we use the on-shell solution for the geodesic lengths between any two boundary points to reconstruct an effective bulk metric and reverse engineer the dilaton gravity theory that generates this metric as a solution.
- Published
- 2024
27. 5α-Epoxyalantolactone from Inula macrophylla attenuates cognitive deficits in scopolamine-induced Alzheimer’s disease mice model
- Author
-
Rui Ma, Xu-Yao Feng, Jiang-Jiang Tang, Wei Ha, and Yan-Ping Shi
- Subjects
Alzheimer’s disease ,5α-Epoxyalantolactone (5α-EAL) ,Anti-neuroinflammation ,Attenuates cognitive deficits ,Botany ,QK1-989 - Abstract
Alzheimer’s disease (AD) is a complex neurodegenerative condition. 5α-epoxyalantolactone (5α-EAL), a eudesmane-type sesquiterpene isolated from the herb of Inula macrophylla, has various pharmacological effects. This work aimed to investigate the ameliorative effect of 5α-EAL on cognitive impairment. 5α-EAL inhibited the generation of nitric oxide (NO) in BV-2 cells stimulated with lipopolysaccharide (LPS) with an EC50 of 6.2 μM. 5α-EAL significantly reduced the production of prostaglandin E2 (PGE2) and tumor necrosis factor-α (TNF-α), while also inhibiting the production of cyclooxygenase-2 (COX-2) and inducible nitric oxide synthase (iNOS) proteins. The ability of 5α-EAL to penetrate the blood–brain barrier (BBB) was confirmed via a parallel artificial membrane permeation assay. A scopolamine (SCOP)-induced AD mouse model was employed to assess the ameliorative effect of 5α-EAL on cognitive impairment in vivo. After the mice were pretreated with 5α-EAL (10 and 30 mg/kg per day, i.p.) for 21 days, the behavioral experiments indicated that administration of 5α-EAL could alleviate the cognitive and memory impairments. 5α-EAL significantly reduced the AChE activity in the brain of SCOP-induced AD mice. In summary, these findings highlight the beneficial effects of the natural product 5α-EAL as a potential bioactive compound for attenuating cognitive deficits in AD due to its pharmacological profile.
- Published
- 2024
28. Interpretation of the U.S. Preventive Clinical Services Guidelines Workgroup's Healthy Diet and Physical Activity for Cardiovascular Disease Prevention in Adults without Cardiovascular Disease Risk Factors: Behavioral Counseling Interventions
- Author
-
YANG Xu, YAO Mi
- Subjects
cardiovascular diseases ,risk factors ,adult ,behavioral counseling interventions ,guideline interpretation ,u.s. preventive services task force ,Medicine - Abstract
In 2022, the U.S. Preventive Services Task Force (USPSTF) updated its recommendations, reviewing the evidence of the benefits and harms of behavioral counseling interventions aimed at promoting healthy behaviors in adults without cardiovascular disease risk factors. The conclusions of this review align with the 2017 guidelines. Behavioral counseling interventions in adults without cardiovascular disease risk factors result in minimal net benefits. Therefore, it is recommended that clinicians make individualized decisions on whether to provide or recommend behavioral counseling interventions to adults without cardiovascular disease risk factors to promote a healthy diet and physical activity (Grade C). This article provides a comprehensive interpretation of the guidelines in the context of the current status of cardiovascular disease prevention in China, offering valuable insights into cardiovascular disease prevention practices among Chinese adults.
- Published
- 2024
29. GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark
- Author
-
Li, Dongyang, Ding, Ruixue, Zhang, Qiang, Li, Zheng, Chen, Boli, Xie, Pengjun, Xu, Yao, Li, Xin, Guo, Ning, Huang, Fei, and He, Xiaofeng
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
With the fast pace of development of geographic applications, automated and intelligent models are essential for handling the large volume of information. However, few researchers focus on geographic natural language processing, and there has never been a benchmark to build a unified standard. In this work, we propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE. We collect data from open-released geographic resources and introduce six natural language understanding tasks, including geographic textual similarity on recall, geographic textual similarity on rerank, geographic elements tagging, geographic composition analysis, geographic where what cut, and geographic entity alignment. We also provide evaluation experiments and analysis of general baselines, indicating the effectiveness and significance of the GeoGLUE benchmark.
- Published
- 2023
30. OpenMix: Exploring Outlier Samples for Misclassification Detection
- Author
-
Zhu, Fei, Cheng, Zhen, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Reliable confidence estimation for deep neural classifiers is a challenging yet fundamental requirement in high-stakes applications. Unfortunately, modern deep neural networks are often overconfident for their erroneous predictions. In this work, we exploit the easily available outlier samples, i.e., unlabeled samples coming from non-target classes, for helping detect misclassification errors. Particularly, we find that the well-known Outlier Exposure, which is powerful in detecting out-of-distribution (OOD) samples from unknown classes, does not provide any gain in identifying misclassification errors. Based on these observations, we propose a novel method called OpenMix, which incorporates open-world knowledge by learning to reject uncertain pseudo-samples generated via outlier transformation. OpenMix significantly improves confidence reliability under various scenarios, establishing a strong and unified framework for detecting both misclassified samples from known classes and OOD samples from unknown classes. The code is publicly available at https://github.com/Impression2805/OpenMix., Comment: Accepted by CVPR 2023 (Highlight)
- Published
- 2023
31. Graph Tensor Networks: An Intuitive Framework for Designing Large-Scale Neural Learning Systems on Multiple Domains
- Author
-
Xu, Yao Lei, Konstantinidis, Kriton, and Mandic, Danilo P.
- Subjects
Computer Science - Machine Learning - Abstract
Despite the omnipresence of tensors and tensor operations in modern deep learning, the use of tensor mathematics to formally design and describe neural networks is still under-explored within the deep learning community. To this end, we introduce the Graph Tensor Network (GTN) framework, an intuitive yet rigorous graphical framework for systematically designing and implementing large-scale neural learning systems on both regular and irregular domains. The proposed framework is shown to be general enough to include many popular architectures as special cases, and flexible enough to handle data on any and many data domains. The power and flexibility of the proposed framework are demonstrated through real-data experiments, resulting in improved performance at drastically lower complexity costs, by virtue of tensor algebra.
- Published
- 2023
32. Dynamics-Aware Loss for Learning with Label Noise
- Author
-
Li, Xiu-Chuan, Xia, Xiaobo, Zhu, Fei, Liu, Tongliang, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Machine Learning - Abstract
Label noise poses a serious threat to deep neural networks (DNNs). Employing robust loss functions which reconcile fitting ability with robustness is a simple but effective strategy to handle this problem. However, the widely-used static trade-off between these two factors contradicts the dynamics of DNNs learning with label noise, leading to inferior performance. Therefore, we propose a dynamics-aware loss (DAL) to solve this problem. Considering that DNNs tend to first learn beneficial patterns, then gradually overfit harmful label noise, DAL strengthens the fitting ability initially, then gradually improves robustness. Moreover, at the later stage, to further reduce the negative impact of label noise and combat underfitting simultaneously, we let DNNs put more emphasis on easy examples than hard ones and introduce a bootstrapping term. Both the detailed theoretical analyses and extensive experimental results demonstrate the superiority of our method. Our source code can be found in https://github.com/XiuchuanLi/DAL., Comment: accepted by Pattern Recognition Journal
- Published
- 2023
33. Random coupling model of turbulence as a classical Sachdev-Ye-Kitaev model
- Author
-
Hu, Xu-Yao and Rosenhaus, Vladimir
- Subjects
High Energy Physics - Theory ,Condensed Matter - Strongly Correlated Electrons ,Nonlinear Sciences - Chaotic Dynamics ,Physics - Fluid Dynamics - Abstract
We point out that a classical analog of the Sachdev-Ye-Kitaev model -- a solvable model of quantum many-body chaos, was studied long ago in the turbulence literature. Motivated by the Navier-Stokes equation in the turbulent regime and the nonlinear Schr\"odinger equation describing plasma turbulence, in which there is mixing between many different modes, the random coupling model has a Gaussian-random coupling between any four of a large number $N$ of modes. The model was solved in the 1960s, before the introduction of large $N$ path integral techniques, using a method referred to as the direct interaction approximation. We use the path integral to derive the effective action for the model. The large-$N$ saddle gives an integral equation for the two-point function, which is very similar to the corresponding equation in the SYK model. The connection between the SYK model and the random coupling model may, on the one hand, provide new physical contexts in which to realize the SYK model and, on the other hand, suggest new models of turbulence and techniques for studying them., Comment: 16 pages, v2
- Published
- 2023
34. Rethinking Confidence Calibration for Failure Prediction
- Author
-
Zhu, Fei, Cheng, Zhen, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Reliable confidence estimation for the predictions is important in many safety-critical applications. However, modern deep neural networks are often overconfident for their incorrect predictions. Recently, many calibration methods have been proposed to alleviate the overconfidence problem. With calibrated confidence, a primary and practical purpose is to detect misclassification errors by filtering out low-confidence predictions (known as failure prediction). In this paper, we find a general, widely-existed but actually-neglected phenomenon that most confidence calibration methods are useless or harmful for failure prediction. We investigate this problem and reveal that popular confidence calibration methods often lead to worse confidence separation between correct and incorrect samples, making it more difficult to decide whether to trust a prediction or not. Finally, inspired by the natural connection between flat minima and confidence separation, we propose a simple hypothesis: flat minima is beneficial for failure prediction. We verify this hypothesis via extensive experiments and further boost the performance by combining two different flat minima techniques. Our code is available at https://github.com/Impression2805/FMFP, Comment: Accepted to ECCV 2022. Code is available at https://github.com/Impression2805/FMFP
- Published
- 2023
35. Performance of OTFS-NOMA Scheme for Coordinated Direct and Relay Transmission Networks in High-Mobility Scenarios
- Author
-
Xu, Yao, Du, Zhen, Yuan, Weijie, Jia, Shaobo, and Leung, Victor C. M.
- Subjects
Computer Science - Information Theory ,Electrical Engineering and Systems Science - Signal Processing - Abstract
In this letter, an orthogonal time frequency space (OTFS) based non-orthogonal multiple access (NOMA) scheme is investigated for the coordinated direct and relay transmission system, where a source directly communicates with a near user with high mobile speed, and it needs the relaying assistance to serve the far user also having high mobility. Due to the coexistence of signal superposition coding and multi-domain transformation, the performance of OTFS-based NOMA is usually challenging to be measured from a theoretical perspective. To accurately evaluate the system performance of the proposed scheme, we derive the closed-form expressions for the outage probability and the outage sum rate by using the Inversion formula and characteristic function. Numerical results verify the performance superiority and the effectiveness of the proposed scheme.
- Published
- 2023
36. Average of Pruning: Improving Performance and Stability of Out-of-Distribution Detection
- Author
-
Cheng, Zhen, Zhu, Fei, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Detecting out-of-distribution (OOD) inputs has been a critical issue for neural networks in the open world. However, the unstable behavior of OOD detection along the optimization trajectory during training has not been explored clearly. In this paper, we first find that the performance of OOD detection suffers from overfitting and instability during training: 1) the performance could decrease when the training error is near zero, and 2) the performance would vary sharply in the final stage of training. Based on our findings, we propose Average of Pruning (AoP), consisting of model averaging and pruning, to mitigate the unstable behaviors. Specifically, model averaging can help achieve a stable performance by smoothing the landscape, and pruning is certified to eliminate the overfitting by eliminating redundant features. Comprehensive experiments on various datasets and architectures are conducted to verify the effectiveness of our method.
- Published
- 2023
37. Research progress on the mechanism of the Hedgehog signaling pathway during mandibular development
- Author
-
XU Yao, LI Wenjin
- Subjects
hedgehog signaling pathway ,mandible ,condyle ,intramembranous ossification ,sonic hedgehog ,indian hedgehog ,mechanism ,abnormal development of the mandible ,temporomandibular osteoarthritis ,Medicine - Abstract
The source and process of mandible development are significantly different from those of other bones in the body, and abnormal development can lead to various bone-related diseases, seriously affecting the quality of life of patients. In recent years, the role of the Hedgehog signaling pathway in bone development has received increasing attention. The Hedgehog gene includes three subtypes: Sonic Hedgehog (Shh), Indian Hedgehog (Ihh), and Desert Hedgehog (Dhh). Shh and Ihh can participate in bone metabolism regulation through various pathways, with Shh primarily involved in limb development and Ihh playing a key role in endochondral osteogenesis. The Hedgehog signaling pathway includes Hedgehog signaling protein ligands, Patched (Ptch) receptors, Smoothened (Smo) receptors, nuclear transcription factors, glioma-associated oncogene homologues (Gli), and downstream target genes. The activation of typical Hedgehog signaling pathways requires the involvement of Gli, whereas atypical Hedgehog signaling is mainly regulated by Ptch, Smo, and others. Shh regulates various biological behaviors during early vertebrate embryogenesis, such as organ differentiation, neural stem formation, stem cell differentiation and proliferation, limb bone development, and tooth germ development. During the process of bone cell differentiation, Shh, Ptch1, and Gli1 are expressed in osteoblasts, further promoting the differentiation of bone marrow mesenchymal stem cells into osteoblasts and chondrocytes. Ihh plays an indispensable functional role in bone growth, development, and homeostasis and participates in the formation of intramembranous bone collars and in the proliferation and maturation of chondrocytes. Ihh is expressed in mature skull osteoblasts and can act as a promoter of bone factor regulation of Ptch and bone morphogenetic protein (BMP) expression to induce intramembranous ossification.
Brain and muscle ARNT-like protein 1 (BMAL1) can regulate the Hedgehog signaling pathway by binding to Ptch1 and Ihh, playing a crucial role in cartilage formation and endochondral osteogenesis in the temporomandibular joint. Hedgehog signal activators can improve the reduction in mandibular bone mass caused by BMAL1 deficiency. Hedgehog signaling imbalance can have a significant impact on bone development and lead to a series of bone diseases, such as abnormal bone development, fractures, osteoporosis, and osteoarthritis. The mechanism of the Hedgehog signaling pathway in relation to mandibular diseases has not been fully elucidated, and future research should seek to further explore Hedgehog signaling as a potential target for treating mandibular developmental-related diseases.
- Published
- 2024
38. MGeo: Multi-Modal Geographic Pre-Training Method
- Author
-
Ding, Ruixue, Chen, Boli, Xie, Pengjun, Huang, Fei, Li, Xin, Zhang, Qiang, and Xu, Yao
- Subjects
Computer Science - Computation and Language - Abstract
As a core task in location-based services (LBS) (e.g., navigation maps), query and point of interest (POI) matching connects users' intent with real-world geographic information. Recently, pre-trained models (PTMs) have made advancements in many natural language processing (NLP) tasks. Generic text-based PTMs do not have enough geographic knowledge for query-POI matching. To overcome this limitation, related literature attempts to employ domain-adaptive pre-training based on geo-related corpus. However, a query generally contains mentions of multiple geographic objects, such as nearby roads and regions of interest (ROIs). The geographic context (GC), i.e., these diverse geographic objects and their relationships, is therefore pivotal to retrieving the most relevant POI. Single-modal PTMs can barely make use of the important GC and therefore have limited performance. In this work, we propose a novel query-POI matching method Multi-modal Geographic language model (MGeo), which comprises a geographic encoder and a multi-modal interaction module. MGeo represents GC as a new modality and is able to fully extract multi-modal correlations for accurate query-POI matching. Besides, there is no publicly available benchmark for this topic. In order to facilitate further research, we build a new open-source large-scale benchmark Geographic TExtual Similarity (GeoTES). The POIs come from an open-source geographic information system (GIS). The queries are manually generated by annotators to prevent privacy issues. Compared with several strong baselines, the extensive experiment results and detailed ablation analyses on GeoTES demonstrate that our proposed multi-modal pre-training method can significantly improve the query-POI matching capability of generic PTMs, even when the queries' GC is not provided. Our code and dataset are publicly available at https://github.com/PhantomGrapes/MGeo., Comment: 10 pages, 5 figures
- Published
- 2023
39. Collective Vector Clocks: Low-Overhead Transparent Checkpointing for MPI
- Author
-
Xu, Yao and Cooperman, Gene
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,D.1.3 - Abstract
Taking snapshots of the state of a distributed computation is useful for off-line analysis of the computational state, for later restarting from the saved snapshot, for cloning a copy of the computation, and for migration to a new cluster. The problem is made more difficult when supporting collective operations across processes, such as barrier, reduce operations, scatter and gather, etc. Some processes may have reached the barrier or other collective operation, while other processes wait a long time to reach that same barrier or collective operation. At least two solutions are well-known in the literature: (i) draining in-flight network messages and then freezing the network at checkpoint time; and (ii) adding a barrier prior to the collective operation, and either completing the operation or aborting the barrier if not all processes are present. Both solutions suffer important drawbacks. The code in the first solution must be updated whenever one ports to a newer network. The second solution implies additional barrier-related network traffic prior to each collective operation. This work presents a third solution that avoids both drawbacks. There is no additional barrier-related traffic, and the solution is implemented entirely above the network layer. The work is demonstrated in the context of transparent checkpointing of MPI libraries for parallel computation, where each of the first two solutions has already been used in prior systems, and then abandoned due to the aforementioned drawbacks. Experiments demonstrate the low runtime overhead of this new, network-agnostic approach. The approach is also extended to non-blocking, collective operations in order to handle overlapping of computation and communication., Comment: 16 pages, 6 figures
- Published
- 2022
40. Complexity-based Financial Stress Evaluation
- Author
-
Xiao, Hongjian, Xu, Yao Lei, and Mandic, Danilo P.
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
Financial markets typically exhibit dynamically complex properties as they undergo continuous interactions with economic and environmental factors. The Efficient Market Hypothesis indicates a rich difference in the structural complexity of security prices between normal (stable markets) and abnormal (financial crises) situations. Considering the analogy between market undulation of price time series and physical stress of bio-signals, we investigate whether stress indices in bio-systems can be adopted and modified so as to measure 'standard stress' in financial markets. This is achieved by employing structural complexity analysis, based on variants of univariate and multivariate sample entropy, to estimate the stress level of both financial markets on the whole and the performance of the individual financial indices. Further, we propose a novel graphical framework to establish the sensitivity of individual assets and stock markets to financial crises. This is achieved through Catastrophe Theory and entropy-based stress evaluations indicating the unique performance of each index/individual stock in response to different crises. Four major indices and four individual equities with gold prices are considered over the past 32 years from 1991-2021. Our findings based on nonlinear analyses and the proposed framework support the Efficient Market Hypothesis and reveal the relations among economic indices and within each price time series.
- Published
- 2022
41. Serine/Arginine-Rich Splicing Factor 7 Knockdown Inhibits Aerobic Glycolysis and Growth in HepG2 Cells by Regulating PKM2 Expression
- Author
-
Weiye Shi, Xu Yao, Xueyu Cao, Yu Fu, and Yingze Wang
- Subjects
SRSF7 ,aerobic glycolysis ,PKM2 ,HepG2 ,hepatocellular carcinoma ,Biology (General) ,QH301-705.5 - Abstract
Serine/arginine-rich splicing factors (SRSFs), part of the serine/arginine-rich (SR) protein family, play a crucial role in precursor RNA splicing. Abnormal expression of SRSFs in tumors can disrupt normal RNA splicing, contributing to tumor progression. Notably, SRSF7 has been found to be upregulated in hepatocellular carcinoma (HCC), yet its specific role and molecular mechanisms in HCC pathogenesis are not fully understood. We investigated the expression and prognostic significance of SRSF7 in HCC using bioinformatics database analysis. In HepG2 cells, the expressions of SRSF7 and glycolytic enzymes were analyzed using qRT-PCR, and Western blot. Glucose uptake and lactate production were quantified using relevant reagent kits. Additionally, cell proliferation, clonogenicity, invasion, and apoptosis were evaluated using MTS assay, clonal formation assay, Transwell assay, and mitochondrial membrane potential assay, respectively. This study demonstrated significant overexpression of SRSF7 in HCC tissue, correlating with poor prognosis. Knockdown of SRSF7 in HepG2 cells resulted in inhibited proliferation, clonogenicity, and invasion, while apoptosis was enhanced. This knockdown also decreased glucose uptake and lactate production, along with a reduction in the expression of glucose transporter 1 (GLUT1) and lactate dehydrogenase A (LDHA). Furthermore, SRSF7 downregulation increased the pyruvate kinase muscle 1 (PKM1)/PKM2 ratio. The glycolytic boost due to PKM2 overexpression partially counteracted the effects of SRSF7 silencing on HepG2 cell growth. The knockdown of SRSF7 impairs aerobic glycolysis and growth in HepG2 cells by downregulating PKM2 expression.
- Published
- 2024
42. Hyper-GST: Predict Metro Passenger Flow Incorporating GraphSAGE, Hypergraph, Social-meaningful Edge Weights and Temporal Exploitation
- Author
-
Miao, Yuyang, Xu, Yao, and Mandic, Danilo
- Subjects
Computer Science - Machine Learning ,Electrical Engineering and Systems Science - Signal Processing - Abstract
Predicting metro passenger flow precisely is of great importance for dynamic traffic planning. Deep learning algorithms have been widely applied due to their robust performance in modelling non-linear systems. However, traditional deep learning algorithms completely discard the inherent graph structure within the metro system. Graph-based deep learning algorithms could utilise the graph structure but raise a few challenges, such as how to determine the weights of the edges and the shallow receptive field caused by the over-smoothing issue. To address these challenges, this study proposes a model based on GraphSAGE with an edge weights learner applied. The edge weights learner utilises socially meaningful features to generate edge weights. Hypergraph and temporal exploitation modules are also constructed as add-ons for better performance. A comparison study is conducted on the proposed algorithm and other state-of-the-art graph neural networks, where the proposed algorithm could improve the performance.
- Published
- 2022
43. Graph-Regularized Tensor Regression: A Domain-Aware Framework for Interpretable Multi-Way Financial Modelling
- Author
-
Xu, Yao Lei, Konstantinidis, Kriton, and Mandic, Danilo P.
- Subjects
Quantitative Finance - Computational Finance ,Computer Science - Machine Learning - Abstract
Analytics of financial data is inherently a Big Data paradigm, as such data are collected over many assets, asset classes, countries, and time periods. This represents a challenge for modern machine learning models, as the number of model parameters needed to process such data grows exponentially with the data dimensions; an effect known as the Curse-of-Dimensionality. Recently, Tensor Decomposition (TD) techniques have shown promising results in reducing the computational costs associated with large-dimensional financial models while achieving comparable performance. However, tensor models are often unable to incorporate the underlying economic domain knowledge. To this end, we develop a novel Graph-Regularized Tensor Regression (GRTR) framework, whereby knowledge about cross-asset relations is incorporated into the model in the form of a graph Laplacian matrix. This is then used as a regularization tool to promote an economically meaningful structure within the model parameters. By virtue of tensor algebra, the proposed framework is shown to be fully interpretable, both coefficient-wise and dimension-wise. The GRTR model is validated in a multi-way financial forecasting setting and compared against competing models, and is shown to achieve improved performance at reduced computational costs. Detailed visualizations are provided to help the reader gain an intuitive understanding of the employed tensor operations.
- Published
- 2022
44. de-Broglie Wavelength Enhanced Weak Equivalence Principle Test for Atoms in Different Hyperfine States
- Author
-
Xu, Yao-Yao, Deng, Xiao-Bing, Duan, Xiao-Chun, Cao, Lu-Shuai, Zhou, Min-Kang, Shao, Cheng-Gang, and Hu, Zhong-Kun
- Subjects
Physics - Atomic Physics - Abstract
We report a hyperfine-states related weak equivalence principle (WEP) test which searches for a possible WEP violation signal in a single atom interferometer. With the ground hyperfine states $\left|F=1\right\rangle$ and $\left|F=2\right\rangle$ of $^{87}$Rb atoms simultaneously scanned over different paths in a Raman Mach-Zehnder interferometer (MZI), the difference of the free fall accelerations for the atom in the two hyperfine states is encoded into the phase shift of the MZI, contributing a WEP test signal. The test signal can be extracted by reversing the direction of the effective wave vector of the Raman laser to suppress direction-dependent disturbances. More importantly, the de Broglie wavelength of cold atoms can be utilized to enhance the test signal in our scheme, which helps to improve the upper bound of the WEP test for atoms in different hyperfine states to $2.9\times10^{-11}$, about one order of magnitude lower than the previous record.
- Published
- 2022
45. Scalable colored sub-ambient radiative coolers based on a polymer-Tamm photonic structure
- Author
-
Huang, Tianzhe, Chen, Qixiang, Huang, Jinhua, Lu, Yuehui, Xu, Hua, Zhao, Meng, Xu, Yao, and Song, Weijie
- Subjects
Physics - Optics ,Physics - Applied Physics - Abstract
Daytime radiative coolers cool objects below the air temperature without any electricity input, while most of them are limited by a silvery or whitish appearance. Colored daytime radiative coolers (CDRCs) with diverse colors, scalable manufacture, and sub-ambient cooling have not been achieved. We introduce a polymer-Tamm photonic structure to enable a high infrared emittance and an engineered absorbed solar irradiance, governed by the quality factor (Q-factor). We theoretically determine the thresholds for sub-ambient cooling through yellow, magenta, and cyan CDRCs. We fabricate the CDRCs and experimentally observe a temperature drop of 2.6-8.8 degrees Celsius on average during daytime and 4.0-4.4 degrees Celsius during nighttime. Furthermore, we demonstrate a scalable-manufactured magenta CDRC with a width of 60 cm and a length of 500 cm by a roll-to-roll deposition technique. This work provides guidelines for large-scale CDRCs and offers unprecedented opportunities for potential applications with energy-saving, aesthetic, and visual comfort demands.
- Published
- 2022
46. Reliability Index Calculation and Reserve Capacity Optimization Considering Multiple Uncertainties
- Author
-
YE Lun, OUYANG Xu, YAO Jiangang, YANG Shengjie, YIN Jungang
- Subjects
renewable energy ,spinning reserve ,reliability index ,security-constrained unit commitment ,cost-benefit analysis ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Chemical engineering ,TP155-156 ,Naval architecture. Shipbuilding. Marine engineering ,VM1-989 - Abstract
In power systems with a high proportion of renewable energy, achieving coordinated optimal scheduling of source and load under multiple uncertainties is an important issue in power system operation. Therefore, a probabilistic spinning reserve optimization model based on multiple scenarios is constructed. Multiple uncertain factors are considered in the model, such as wind power and solar power forecast errors, load forecast error and unscheduled generator outage. Renewable energy curtailment and load shedding are used as special reserve resources in the day-ahead security-constrained unit commitment (SCUC) to improve the economic operation efficiency. The calculations of reliability indexes, expected energy not served and expected energy curtailment, are simplified, and the inequality constraints related to these two indexes are reduced, which improves the computational performance of the model. The model optimizes the total expected cost considering multiple uncertainties. Case studies based on the IEEE-RTS demonstrate the effectiveness of the proposed model. The numerical results show that the improved calculation method of reliability indexes can effectively reduce the solution time of the SCUC model. The reserve optimization model can realize the dynamic allocation of the spinning reserve capacity of the system and improve economic operation of the system.
- Published
- 2024
47. An efficient polynomial-time approximation scheme for parallel multi-stage open shops
- Author
-
Dong, Jianming, Jin, Ruyan, Lin, Guohui, Su, Bing, Tong, Weitian, and Xu, Yao
- Subjects
Computer Science - Data Structures and Algorithms - Abstract
Various new scheduling problems have been arising from practical production processes and spawning new research areas in the scheduling field. We study the parallel multi-stage open shops problem, which generalizes the classic open shop scheduling and parallel machine scheduling problems. Given m identical k-stage open shops and a set of n jobs, we aim to process all jobs on these open shops with the minimum makespan, i.e., the completion time of the last job, under the constraint that job preemption is not allowed. We present an efficient polynomial-time approximation scheme (EPTAS) for the case when both m and k are constant. The main idea for our EPTAS is the combination of several categorization, scaling, and linear programming rounding techniques. Jobs and/or operations are first scaled and then categorized carefully into multiple types so that different types of jobs and/or operations are scheduled appropriately without increasing the makespan too much.
- Published
- 2022
48. Correlation functions in linear chaotic maps
- Author
-
Hu, Xu-Yao and Rosenhaus, Vladimir
- Subjects
Nonlinear Sciences - Chaotic Dynamics ,Condensed Matter - Statistical Mechanics ,High Energy Physics - Theory ,Mathematics - Dynamical Systems - Abstract
The simplest examples of chaotic maps are linear, area-preserving maps on the circle, torus, or product of tori; respectively known as the Bernoulli map, the cat map, and the recently introduced "spatiotemporal" cat map. We study correlation functions in these maps. For the Bernoulli map, we compute the correlation functions in a variety of ways: by direct computation of the integral, through Fourier series, through symbolic dynamics, and through periodic orbits. In relation to the more standard treatment in terms of eigenfunctions of the Perron-Frobenius operator, some of these methods are simpler and also extend to multipoint correlation functions. For the cat map, we compute correlation functions through a Fourier expansion, review and expand on a prior treatment of two-point functions by Crawford and Cary, and discuss the limitations of shadowing. Finally, for the spatiotemporal cat map -- intended to be a model of many-body chaos -- we show that connected correlation functions of local operators vanish. Comment: 25 pages
- Published
- 2022
49. A Survey of Robust Adversarial Training in Pattern Recognition: Fundamental, Theory, and Methodologies
- Author
-
Qian, Zhuang, Huang, Kaizhu, Wang, Qiu-Feng, and Zhang, Xu-Yao
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
In the last few decades, deep neural networks have achieved remarkable success in machine learning, computer vision, and pattern recognition. Recent studies, however, show that neural networks (both shallow and deep) may be easily fooled by certain imperceptibly perturbed input samples called adversarial examples. Such security vulnerability has resulted in a large body of research in recent years because real-world threats could be introduced due to vast applications of neural networks. To address the robustness issue to adversarial examples, particularly in pattern recognition, robust adversarial training has become a mainstream approach. Various ideas, methods, and applications have boomed in the field. Yet, a deep understanding of adversarial training, including characteristics, interpretations, theories, and connections among different models, has remained elusive. In this paper, we present a comprehensive survey that offers a systematic and structured investigation of robust adversarial training in pattern recognition. We start with fundamentals including definition, notations, and properties of adversarial examples. We then introduce a unified theoretical framework for defending against adversarial samples - robust adversarial training with visualizations and interpretations on why adversarial training can lead to model robustness. Connections are also established between adversarial training and other traditional learning theories. After that, we summarize, review, and discuss various methodologies with adversarial attack and defense/training algorithms in a structured way. Finally, we present analysis, outlook, and remarks of adversarial training.
- Published
- 2022
50. Document Dewarping with Control Points
- Author
-
Xie, Guo-Wang, Yin, Fei, Zhang, Xu-Yao, and Liu, Cheng-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Document images are now widely captured by handheld devices such as mobile phones. The OCR performance on these images is largely affected by geometric distortion of the document paper, diverse camera positions and complex backgrounds. In this paper, we propose a simple yet effective approach to rectify distorted document images by estimating control points and reference points. After that, we use an interpolation method between control points and reference points to convert sparse mappings to a backward mapping, and remap the original distorted document image to the rectified image. Furthermore, control points are controllable to facilitate interaction or subsequent adjustment. We can flexibly select post-processing methods and the number of vertices according to different application scenarios. Experiments show that our approach can rectify document images with various distortion types, and yield state-of-the-art performance on a real-world dataset. This paper also provides a training dataset based on control points for document dewarping. Both the code and the dataset are released at https://github.com/gwxie/Document-Dewarping-with-Control-Points. Comment: International Conference on Document Analysis and Recognition, ICDAR 2021, Oral
- Published
- 2022
Discovery Service for Jio Institute Digital Library