11,670 results on "Wei, Ying"
Search Results
2. Effect of an OwlSpace Programming Course on the Computational Thinking of Elementary School Students
- Author
-
Wei-Ying Li and Tzu-Chuen Lu
- Abstract
This study investigates the effect of programming courses on the computational thinking (CT) skills of elementary school students and the learning effectiveness of students from different backgrounds who are studying programming. We designed an OwlSpace programming course for an elementary school curriculum. Students in the fourth and fifth grades were taught the fundamentals of programming. We measured and analyzed the effectiveness of their CT skills and self-efficacy in CT. The researchers analyzed the changes in CT across students of different genders, grades, and levels of past programming experience, and then made specific recommendations for information technology teachers and related units. The results demonstrate that students learned and improved their CT skills by taking the OwlSpace programming course. Additionally, gender, grade, and past experience were found to have no impact on students' learning, which means the course can improve students' abilities regardless of these characteristics.
- Published
- 2024
3. Pre-training with Fractional Denoising to Enhance Molecular Property Prediction
- Author
-
Ni, Yuyan, Feng, Shikun, Hong, Xin, Sun, Yuancheng, Ma, Wei-Ying, Ma, Zhi-Ming, Ye, Qiwei, and Lan, Yanyan
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Physics - Chemical Physics
- Abstract
Deep learning methods have been considered promising for accelerating molecular screening in drug discovery and material design. Due to the limited availability of labelled data, various self-supervised molecular pre-training methods have been presented. While many existing methods utilize common pre-training tasks in computer vision (CV) and natural language processing (NLP), they often overlook the fundamental physical principles governing molecules. In contrast, applying denoising in pre-training can be interpreted as an equivalent force learning, but the limited noise distribution introduces bias into the molecular distribution. To address this issue, we introduce a molecular pre-training framework called fractional denoising (Frad), which decouples noise design from the constraints imposed by force learning equivalence. In this way, the noise becomes customizable, allowing for incorporating chemical priors to significantly improve molecular distribution modeling. Experiments demonstrate that our framework consistently outperforms existing methods, establishing state-of-the-art results across force prediction, quantum chemical properties, and binding affinity tasks. The refined noise design enhances force accuracy and sampling coverage, which contribute to the creation of physically consistent molecular representations, ultimately leading to superior predictive performance.
- Published
- 2024
4. Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector
- Author
-
Jiang, Gangwei, Jiang, Caigao, Li, Zhaoyi, Xue, Siqiao, Zhou, Jun, Song, Linqi, Lian, Defu, and Wei, Ying
- Subjects
Computer Science - Artificial Intelligence
- Abstract
Fine-tuning large language models (LLMs) can cause them to lose their general capabilities. However, the intrinsic mechanisms behind such forgetting remain unexplored. In this paper, we begin by examining this phenomenon, focusing on knowledge understanding and instruction following, with the latter identified as the main contributor to forgetting during fine-tuning. Consequently, we propose the Instruction Vector (IV) framework to capture model representations highly related to specific instruction-following capabilities, thereby making it possible to understand model-intrinsic forgetting. Through the analysis of IV dynamics before and after training, we suggest that fine-tuning mostly adds specialized reasoning patterns instead of erasing previous skills, which may appear as forgetting. Building on this insight, we develop IV-guided training, which aims to preserve the original computation graph, thereby mitigating catastrophic forgetting. Empirical tests on three benchmarks confirm the efficacy of this new approach, supporting the relationship between IVs and forgetting. Our code will be made available soon.
- Published
- 2024
5. From Theory to Therapy: Reframing SBDD Model Evaluation via Practical Metrics
- Author
-
Gao, Bowen, Tan, Haichuan, Huang, Yanwen, Ren, Minsi, Huang, Xiao, Ma, Wei-Ying, Zhang, Ya-Qin, and Lan, Yanyan
- Subjects
Quantitative Biology - Biomolecules, Computer Science - Machine Learning
- Abstract
Recent advancements in structure-based drug design (SBDD) have significantly enhanced the efficiency and precision of drug discovery by generating molecules tailored to bind specific protein pockets. Despite these technological strides, their practical application in real-world drug development remains challenging due to the complexities of synthesizing and testing these molecules. The reliability of the Vina docking score, the current standard for assessing binding abilities, is increasingly questioned due to its susceptibility to overfitting. To address these limitations, we propose a comprehensive evaluation framework that includes assessing the similarity of generated molecules to known active compounds, introducing a virtual screening-based metric for practical deployment capabilities, and re-evaluating binding affinity more rigorously. Our experiments reveal that while current SBDD models achieve high Vina scores, they fall short in practical usability metrics, highlighting a significant gap between theoretical predictions and real-world applicability. Our proposed metrics and dataset aim to bridge this gap, enhancing the practical applicability of future SBDD models and aligning them more closely with the needs of pharmaceutical research and development.
- Published
- 2024
6. SIU: A Million-Scale Structural Small Molecule-Protein Interaction Dataset for Unbiased Bioactivity Prediction
- Author
-
Huang, Yanwen, Gao, Bowen, Jia, Yinjun, Ma, Hongbo, Ma, Wei-Ying, Zhang, Ya-Qin, and Lan, Yanyan
- Subjects
Quantitative Biology - Biomolecules, Computer Science - Machine Learning
- Abstract
Small molecules play a pivotal role in modern medicine, and scrutinizing their interactions with protein targets is essential for the discovery and development of novel, life-saving therapeutics. The term "bioactivity" encompasses various biological effects resulting from these interactions, including both binding and functional responses. The magnitude of bioactivity dictates the therapeutic or toxic pharmacological outcomes of small molecules, rendering accurate bioactivity prediction crucial for the development of safe and effective drugs. However, existing structural datasets of small molecule-protein interactions are often limited in scale and lack systematically organized bioactivity labels, thereby impeding our understanding of these interactions and precise bioactivity prediction. In this study, we introduce a comprehensive dataset of small molecule-protein interactions, consisting of over a million binding structures, each annotated with real biological activity labels. This dataset is designed to facilitate unbiased bioactivity prediction. We evaluated several classical models on this dataset, and the results demonstrate that the task of unbiased bioactivity prediction is challenging yet essential.
- Published
- 2024
7. MoleculeCLA: Rethinking Molecular Benchmark via Computational Ligand-Target Binding Analysis
- Author
-
Feng, Shikun, Zheng, Jiaxin, Jia, Yinjun, Huang, Yanwen, Zhou, Fengfeng, Ma, Wei-Ying, and Lan, Yanyan
- Subjects
Physics - Chemical Physics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Molecular representation learning is pivotal for various molecular property prediction tasks related to drug discovery. Robust and accurate benchmarks are essential for refining and validating current methods. Existing molecular property benchmarks derived from wet experiments, however, face limitations such as data volume constraints, unbalanced label distribution, and noisy labels. To address these issues, we construct a large-scale and precise molecular representation dataset of approximately 140,000 small molecules, meticulously designed to capture an extensive array of chemical, physical, and biological properties, derived through a robust computational ligand-target binding analysis pipeline. We conduct extensive experiments on various deep learning models, demonstrating that our dataset offers significant physicochemical interpretability to guide model development and design. Notably, the dataset's properties are linked to binding affinity metrics, providing additional insights into model performance in drug-target interaction tasks. We believe this dataset will serve as a more accurate and reliable benchmark for molecular representation learning, thereby expediting progress in the field of artificial intelligence-driven drug discovery.
- Published
- 2024
8. Rotation and Permutation for Advanced Outlier Management and Efficient Quantization of LLMs
- Author
-
Lin, Haokun, Xu, Haobo, Wu, Yichen, Cui, Jingzhi, Zhang, Yingtao, Mou, Linzhan, Song, Linqi, Sun, Zhenan, and Wei, Ying
- Subjects
Computer Science - Computation and Language
- Abstract
Quantizing large language models (LLMs) presents significant challenges, primarily due to outlier activations that compromise the efficiency of low-bit representation. Traditional approaches mainly focus on solving Normal Outliers: activations with consistently high magnitudes across all tokens. However, these techniques falter when dealing with Massive Outliers, which are significantly higher in value and often cause substantial performance losses during low-bit quantization. In this study, we propose DuQuant, an innovative quantization strategy employing rotation and permutation transformations to more effectively eliminate both types of outliers. Initially, DuQuant constructs rotation matrices informed by specific outlier dimensions, redistributing these outliers across adjacent channels within different rotation blocks. Subsequently, a zigzag permutation is applied to ensure a balanced distribution of outliers among blocks, minimizing block-wise variance. An additional rotation further enhances the smoothness of the activation landscape, thereby improving model performance. DuQuant streamlines the quantization process and demonstrates superior outlier management, achieving top-tier results in multiple tasks with various LLM architectures even under 4-bit weight-activation quantization. Our code is available at https://github.com/Hsu1023/DuQuant.
Comment: 26 pages, 13 figures
- Published
- 2024
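The rotation step's effect on a massive outlier can be sketched with a toy orthogonal transform. This is a minimal sketch of the redistribution principle only, assuming a fixed 4x4 Hadamard rotation; DuQuant's actual rotation matrices are constructed from the identified outlier dimensions and applied block-wise.

```python
def hadamard4():
    # 4x4 Hadamard matrix scaled by 1/2 so that R @ R.T = I (orthogonal)
    h = [[1, 1, 1, 1],
         [1, -1, 1, -1],
         [1, 1, -1, -1],
         [1, -1, -1, 1]]
    return [[x / 2 for x in row] for row in h]

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

R = hadamard4()
acts = [0.3, -0.2, 40.0, 0.1]              # one massive outlier dominates the range
rotated = matvec(R, acts)                  # outlier energy is spread across channels
restored = matvec(transpose(R), rotated)   # exact inverse, since R is orthogonal

print(max(abs(x) for x in acts))           # 40.0
print(max(abs(x) for x in rotated))        # ~20.2: a much flatter range to quantize
```

A flatter post-rotation range means a smaller quantization scale; that is the lever the rotation blocks provide before the zigzag permutation balances outliers across blocks.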
9. Adaptive and Efficient Learning with Blockwise Missing and Semi-Supervised Data
- Author
-
Li, Yiming, Yang, Xuehan, Wei, Ying, and Liu, Molei
- Subjects
Statistics - Methodology
- Abstract
Data fusion is an important way to realize powerful and generalizable analyses across multiple sources. However, differing data-collection capabilities across sources have become a prominent issue in practice. This can result in blockwise missingness (BM) of covariates, which is troublesome for integration. Meanwhile, the high cost of obtaining gold-standard labels can cause the missingness of the response on a large proportion of samples, known as the semi-supervised (SS) problem. In this paper, we consider a challenging scenario confronting both the BM and SS issues, and propose a novel Data-adaptive projecting Estimation approach for data FUsion in the SEmi-supervised setting (DEFUSE). Starting with a complete-data-only estimator, it involves two successive projection steps to reduce its variance without incurring bias. Compared to existing approaches, DEFUSE achieves a two-fold improvement. First, it leverages the BM labeled sample more efficiently through a novel data-adaptive projection approach robust to model misspecification on the missing covariates, leading to better variance reduction. Second, our method further incorporates the large unlabeled sample to enhance estimation efficiency through imputation and projection. Compared to the previous SS setting with complete covariates, our work reveals a more essential role of the unlabeled sample in the BM setting. These advantages are justified in asymptotic and simulation studies. We also apply DEFUSE to the risk modeling and inference of heart diseases with the MIMIC-III electronic medical record (EMR) data.
- Published
- 2024
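The variance-reduction role of the unlabeled sample can be previewed with a generic control-variate (projection) sketch for a mean. This is the textbook idea behind such semi-supervised estimators, not DEFUSE's data-adaptive projection; the function name and data below are illustrative.

```python
def ss_mean(y_labeled, x_labeled, x_all):
    """Semi-supervised mean of Y: adjust the labeled-only average of Y
    using a covariate X that is observed on every sample (labeled or not)."""
    n = len(y_labeled)
    ybar = sum(y_labeled) / n
    xbar = sum(x_labeled) / n
    # projection coefficient = Cov(Y, X) / Var(X), estimated on the labeled sample
    cov = sum((x - xbar) * (y - ybar) for x, y in zip(x_labeled, y_labeled)) / (n - 1)
    var = sum((x - xbar) ** 2 for x in x_labeled) / (n - 1)
    c = cov / var
    # shift the estimate by how far the labeled X's drift from the full-sample X's
    return ybar - c * (xbar - sum(x_all) / len(x_all))

# the labeled pairs under-sample large X; the full X sample corrects for it
y_lab, x_lab = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]
x_all = [1.0, 2.0, 3.0, 4.0]
print(ss_mean(y_lab, x_lab, x_all))  # 2.5 (the labeled-only mean would be 2.0)
```

The projection leaves the estimator unbiased when X is mean-centered correctly, while shrinking its variance whenever X and Y are correlated, which is the mechanism the abstract's "imputation and projection" step generalizes.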
10. ZIKQ: An innovative centile chart method for utilizing natural history data in rare disease clinical development
- Author
-
Wang, Tianying, Zhang, Wenfei, and Wei, Ying
- Subjects
Statistics - Methodology, Statistics - Applications
- Abstract
Utilizing natural history data as an external control plays an important role in the clinical development of rare diseases, since placebo groups in double-blind randomized trials may not be available for ethical reasons and due to low disease prevalence. This article proposes an innovative approach for utilizing natural history data to support rare disease clinical development by constructing reference centile charts. Due to the degenerative nature of certain rare diseases, the distributions of clinical endpoints can be age-dependent and have an absorbing state of zero, which can result in censored natural history data. Existing reference centile chart methods cannot be directly applied to censored natural history data. Therefore, we propose a new calibrated zero-inflated kernel quantile (ZIKQ) estimation to construct reference centile charts from censored natural history data. Using the application to Duchenne Muscular Dystrophy drug development, we demonstrate that reference centile charts using the ZIKQ method can be implemented to evaluate treatment efficacy and facilitate more targeted patient enrollment in rare disease clinical development.
- Published
- 2024
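The zero-inflation logic behind ZIKQ can be sketched as a two-part empirical quantile rule. This omits the kernel smoothing and calibration that the proposed method adds, so it is intuition only; the sample values are invented.

```python
def zi_quantile(samples, tau):
    """Quantile of an endpoint with an absorbing state at zero.

    Toy empirical version: quantiles at or below the zero-mass
    probability are exactly 0; higher quantiles come from the positive
    part of the sample at a rescaled level.
    """
    p0 = sum(1 for x in samples if x == 0) / len(samples)
    if tau <= p0:
        return 0.0
    positives = sorted(x for x in samples if x > 0)
    adj = (tau - p0) / (1 - p0)          # rescale tau into the positive part
    k = min(int(adj * len(positives)), len(positives) - 1)
    return positives[k]

# half the cohort has already hit the absorbing zero state
scores = [0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
print(zi_quantile(scores, 0.25))   # 0.0 -> below the zero mass
print(zi_quantile(scores, 0.75))   # 3
```

A naive quantile estimator applied to the same data would smear the point mass at zero across several centiles, which is exactly the failure the zero-inflated construction avoids.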
11. UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning
- Author
-
Feng, Shikun, Ni, Yuyan, Li, Minghao, Huang, Yanwen, Ma, Zhi-Ming, Ma, Wei-Ying, and Lan, Yanyan
- Subjects
Quantitative Biology - Biomolecules, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Recently, a noticeable trend has emerged in developing pre-trained foundation models in the domains of CV and NLP. However, for molecular pre-training, a universal model capable of effectively applying to various categories of molecular tasks is still lacking, since existing prevalent pre-training methods exhibit effectiveness only for specific types of downstream tasks. Furthermore, the lack of profound understanding of existing pre-training methods, including 2D graph masking, 2D-3D contrastive learning, and 3D denoising, hampers the advancement of molecular foundation models. In this work, we provide a unified comprehension of existing pre-training methods through the lens of contrastive learning. In this view, their distinctions lie in clustering different views of molecules, which is shown to be beneficial to specific downstream tasks. To achieve a complete and general-purpose molecular representation, we propose a novel pre-training framework, named UniCorn, that inherits the merits of the three methods, depicting molecular views at three different levels. State-of-the-art performance across quantum, physicochemical, and biological tasks, along with a comprehensive ablation study, validates the universality and effectiveness of UniCorn.
- Published
- 2024
12. MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space
- Author
-
Qu, Yanru, Qiu, Keyue, Song, Yuxuan, Gong, Jingjing, Han, Jiawei, Zheng, Mingyue, Zhou, Hao, and Ma, Wei-Ying
- Subjects
Quantitative Biology - Biomolecules, Computer Science - Machine Learning
- Abstract
Generative models for structure-based drug design (SBDD) have shown promising results in recent years. Existing works mainly focus on how to generate molecules with higher binding affinity, ignoring the feasibility prerequisites for generated 3D poses and resulting in false positives. We conduct thorough studies on key factors behind ill-conformational problems when applying autoregressive methods and diffusion to SBDD, including mode collapse and the hybrid continuous-discrete space. In this paper, we introduce MolCRAFT, the first SBDD model that operates in the continuous parameter space, together with a novel noise-reduced sampling strategy. Empirical results show that our model consistently achieves superior performance in binding affinity with more stable 3D structures, demonstrating our ability to accurately model interatomic interactions. To the best of our knowledge, MolCRAFT is the first to achieve reference-level Vina Scores (-6.59 kcal/mol) with comparable molecular size, outperforming other strong baselines by a wide margin (-0.84 kcal/mol). Code is available at https://github.com/AlgoMole/MolCRAFT.
Comment: Accepted to ICML 2024
- Published
- 2024
13. Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation
- Author
-
Zhong, Tianqi, Li, Zhaoyi, Wang, Quan, Song, Linqi, Wei, Ying, Lian, Defu, and Mao, Zhendong
- Subjects
Computer Science - Computation and Language
- Abstract
Compositional generalization, representing the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods. Nonetheless, a comprehensive compositional generalization evaluation benchmark for MCTG is still lacking. We propose CompMCTG, a benchmark encompassing diverse multi-aspect labeled datasets and a crafted three-dimensional evaluation protocol, to holistically evaluate the compositional generalization of MCTG approaches. We observe that existing MCTG works generally confront a noticeable performance drop in compositional testing. To mitigate this issue, we introduce Meta-MCTG, a training framework incorporating meta-learning, where we enable models to learn how to generalize by simulating compositional generalization scenarios in the training phase. We demonstrate the effectiveness of Meta-MCTG by achieving clear improvements in compositional testing performance (by up to 3.64%) in 94.4% of cases.
Comment: Accepted to ACL 2024 (Main); 32 pages
- Published
- 2024
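The "new attribute combinations" setting can be made concrete with a small split helper. This is illustrative only; CompMCTG's actual protocol is a three-dimensional evaluation over its own labeled datasets, and the attribute names below are made up.

```python
from itertools import product

def compositional_split(sentiments, topics, held_out):
    """Split attribute combinations so some combos never appear in training,
    even though every single attribute does."""
    train, test = [], []
    for combo in product(sentiments, topics):
        (test if combo in held_out else train).append(combo)
    return train, test

train, test = compositional_split(
    ["positive", "negative"],
    ["sports", "politics"],
    held_out={("positive", "sports")},
)
print(train)  # "positive" and "sports" each still appear in training...
print(test)   # ...but their combination is only ever seen at test time
```

A model that merely memorizes seen combinations fails on the held-out pair; one that composes single attributes generalizes, which is what the benchmark measures.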
14. Unified Generative Modeling of 3D Molecules via Bayesian Flow Networks
- Author
-
Song, Yuxuan, Gong, Jingjing, Qu, Yanru, Zhou, Hao, Zheng, Mingyue, Liu, Jingjing, and Ma, Wei-Ying
- Subjects
Physics - Chemical Physics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Quantitative Biology - Biomolecules
- Abstract
Advanced generative models (e.g., diffusion models), derived from simplified continuity assumptions about data distributions, have shown promising progress but have been difficult to apply directly to geometry generation applications due to the multi-modality and noise-sensitive nature of molecular geometry. This work introduces Geometric Bayesian Flow Networks (GeoBFN), which naturally fit molecular geometry by modeling diverse modalities in the differentiable parameter space of distributions. GeoBFN maintains the SE(3)-invariant density modeling property by incorporating equivariant inter-dependency modeling on parameters of distributions and unifying the probabilistic modeling of different modalities. Through optimized training and sampling techniques, we demonstrate that GeoBFN achieves state-of-the-art performance on multiple 3D molecule generation benchmarks in terms of generation quality (90.87% molecule stability on QM9 and 85.6% atom stability on GEOM-DRUG). GeoBFN can also conduct sampling with any number of steps to reach an optimal trade-off between efficiency and quality (e.g., a 20-times speedup without sacrificing performance).
Comment: ICLR 2024
- Published
- 2024
15. Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts
- Author
-
Chen, Shengzhuang, Tack, Jihoon, Yang, Yunqiao, Teh, Yee Whye, Schwarz, Jonathan Richard, and Wei, Ying
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
Recent successes suggest that parameter-efficient fine-tuning of foundation models has become the state-of-the-art method for transfer learning in vision, replacing the rich literature of alternatives such as meta-learning. In trying to harness the best of both worlds, meta-tuning introduces a subsequent optimization stage of foundation models but has so far only shown limited success and crucially tends to underperform on out-of-distribution (OOD) tasks. In this paper, we introduce Sparse MetA-Tuning (SMAT), a method inspired by sparse mixture-of-experts approaches and trained to isolate subsets of pre-trained parameters automatically for meta-tuning on each task. SMAT successfully overcomes OOD sensitivity and delivers on the promise of enhancing the transfer abilities of vision foundation models beyond parameter-efficient fine-tuning. We establish new state-of-the-art results on a challenging combination of Meta-Dataset augmented with additional OOD tasks in both zero-shot and gradient-based adaptation settings. In addition, we provide a thorough analysis of the superiority of learned over hand-designed sparsity patterns for sparse expert methods and the pivotal importance of the sparsity level in balancing between in-distribution and out-of-distribution generalization. Our code is publicly available.
Comment: The Forty-first International Conference on Machine Learning, 2024
- Published
- 2024
16. MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
- Author
-
Lin, Haokun, Bai, Haoli, Liu, Zhili, Hou, Lu, Sun, Muyi, Song, Linqi, Wei, Ying, and Sun, Zhenan
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
- Abstract
Vision-language pre-trained models have achieved impressive performance on various downstream tasks. However, their large model sizes hinder their utilization on platforms with limited computational resources. We find that directly using smaller pre-trained models and applying magnitude-based pruning on CLIP models leads to inflexibility and inferior performance. Recent efforts for VLP compression either adopt uni-modal compression metrics resulting in limited performance or involve costly mask-search processes with learnable masks. In this paper, we first propose the Module-wise Pruning Error (MoPE) metric, accurately assessing CLIP module importance by performance decline on cross-modal tasks. Using the MoPE metric, we introduce a unified pruning framework applicable to both pre-training and task-specific fine-tuning compression stages. For pre-training, MoPE-CLIP effectively leverages knowledge from the teacher model, significantly reducing pre-training costs while maintaining strong zero-shot capabilities. For fine-tuning, consecutive pruning from width to depth yields highly competitive task-specific models. Extensive experiments in two stages demonstrate the effectiveness of the MoPE metric, and MoPE-CLIP outperforms previous state-of-the-art VLP compression methods.
Comment: 18 pages, 8 figures, Published in CVPR2024
- Published
- 2024
17. ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling
- Author
-
Zheng, Kangjie, Long, Siyu, Lu, Tianyu, Yang, Junwei, Dai, Xinyu, Zhang, Ming, Nie, Zaiqing, Ma, Wei-Ying, and Zhou, Hao
- Subjects
Quantitative Biology - Biomolecules, Computer Science - Computational Engineering, Finance, and Science, Computer Science - Machine Learning
- Abstract
Protein language models have demonstrated significant potential in the field of protein engineering. However, current protein language models primarily operate at the residue scale, which limits their ability to provide information at the atom level. This limitation prevents us from fully exploiting the capabilities of protein language models for applications involving both proteins and small molecules. In this paper, we propose ESM-AA (ESM All-Atom), a novel approach that enables atom-scale and residue-scale unified molecular modeling. ESM-AA achieves this by pre-training on multi-scale code-switch protein sequences and utilizing a multi-scale position encoding to capture relationships among residues and atoms. Experimental results indicate that ESM-AA surpasses previous methods in protein-molecule tasks, demonstrating the full utilization of protein language models. Further investigations reveal that through unified molecular modeling, ESM-AA not only gains molecular knowledge but also retains its understanding of proteins. The source codes of ESM-AA are publicly released at https://github.com/zhengkangjie/ESM-AA.
Comment: ICML2024 camera-ready, update some experimental results, add github url, fix some typos
- Published
- 2024
18. Rethinking Specificity in SBDD: Leveraging Delta Score and Energy-Guided Diffusion
- Author
-
Gao, Bowen, Ren, Minsi, Ni, Yuyan, Huang, Yanwen, Qiang, Bo, Ma, Zhi-Ming, Ma, Wei-Ying, and Lan, Yanyan
- Subjects
Quantitative Biology - Biomolecules ,Computer Science - Machine Learning - Abstract
In the field of Structure-based Drug Design (SBDD), deep learning-based generative models have achieved outstanding performance in terms of docking score. However, further study shows that existing molecular generative methods and docking scores both lack consideration of specificity, meaning that generated molecules bind to almost every protein pocket with high affinity. To address this, we introduce the Delta Score, a new metric for evaluating the specificity of molecular binding. To further incorporate this insight into generation, we develop an innovative energy-guided approach using contrastive learning, with active compounds as decoys, to direct generative models toward creating molecules with high specificity. Our empirical results show that this method not only enhances the delta score but also maintains or improves traditional docking scores, successfully bridging the gap between SBDD and real-world needs.
- Published
- 2024
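The specificity idea reduces to a score difference, sketched below under an assumed docking-score convention (more negative = stronger binding); the paper's exact Delta Score definition may be normalized differently, and all numbers here are invented.

```python
def delta_score(score_on_target, scores_on_other_pockets):
    """Specificity of a ligand: how much better (more negative, in
    docking-score convention) it binds its intended pocket than
    unrelated pockets. A near-zero delta signals promiscuous binding."""
    mean_other = sum(scores_on_other_pockets) / len(scores_on_other_pockets)
    return score_on_target - mean_other

# a specific binder: strong on its target, weak elsewhere
print(delta_score(-9.5, [-4.0, -5.0, -4.5]))   # -5.0
# a promiscuous molecule: strong everywhere, so no specificity
print(delta_score(-9.5, [-9.0, -9.2, -9.4]))   # close to 0
```

Both molecules have the same raw docking score on the target (-9.5), which is exactly why a raw-score metric cannot tell them apart while a delta-style metric can.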
19. Understanding and Patching Compositional Reasoning in LLMs
- Author
-
Li, Zhaoyi, Jiang, Gangwei, Xie, Hong, Song, Linqi, Lian, Defu, and Wei, Ying
- Subjects
Computer Science - Computation and Language - Abstract
LLMs have marked a revolutionary shift, yet they falter when faced with compositional reasoning tasks. Our research embarks on a quest to uncover the root causes of compositional reasoning failures of LLMs, uncovering that most of them stem from improperly generated or leveraged implicit reasoning results. Inspired by our empirical findings, we resort to Logit Lens and an intervention experiment to dissect the inner hidden states of LLMs. This deep dive reveals that implicit reasoning results indeed surface within middle layers and play a causative role in shaping the final explicit reasoning results. Our exploration further locates multi-head self-attention (MHSA) modules within these layers, which emerge as the linchpins in accurate generation and leveraging of implicit reasoning results. Grounded in the above findings, we develop CREME, a lightweight method to patch errors in compositional reasoning via editing the located MHSA modules. Our empirical evidence stands testament to CREME's effectiveness, paving the way for autonomously and continuously enhancing compositional reasoning capabilities in language models.
Comment: Accepted by ACL'2024 Findings
- Published
- 2024
20. Contextual Molecule Representation Learning from Chemical Reaction Knowledge
- Author
-
Tang, Han, Feng, Shikun, Lin, Bicheng, Ni, Yuyan, Liu, Jingjing, Ma, Wei-Ying, and Lan, Yanyan
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Quantitative Biology - Biomolecules
- Abstract
In recent years, self-supervised learning has emerged as a powerful tool to harness abundant unlabelled data for representation learning and has been broadly adopted in diverse areas. However, when applied to molecular representation learning (MRL), prevailing techniques such as masked sub-unit reconstruction often fall short, due to the high degree of freedom in the possible combinations of atoms within molecules, which brings insurmountable complexity to the masking-reconstruction paradigm. To tackle this challenge, we introduce REMO, a self-supervised learning framework that takes advantage of well-defined atom-combination rules in common chemistry. Specifically, REMO pre-trains graph/Transformer encoders on 1.7 million known chemical reactions in the literature. We propose two pre-training objectives: Masked Reaction Centre Reconstruction (MRCR) and Reaction Centre Identification (RCI). REMO offers a novel solution to MRL by exploiting the underlying shared patterns in chemical reactions as context for pre-training, which effectively infers meaningful representations of common chemistry knowledge. Such contextual representations can then be utilized to support diverse downstream molecular tasks with minimal fine-tuning, such as affinity prediction and drug-drug interaction prediction. Extensive experimental results on MoleculeACE, ACNet, drug-drug interaction (DDI), and reaction type classification show that across all tested downstream tasks, REMO outperforms the standard baseline of single-molecule masked modeling used in current MRL. Remarkably, REMO is the first deep learning model to surpass fingerprint-based methods on activity-cliff benchmarks.
Comment: Preprint. Under Review
- Published
- 2024
21. From Words to Molecules: A Survey of Large Language Models in Chemistry
- Author
-
Liao, Chang, Yu, Yemin, Mei, Yu, and Wei, Ying
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Quantitative Biology - Biomolecules, Quantitative Biology - Quantitative Methods
- Abstract
In recent years, Large Language Models (LLMs) have achieved significant success in natural language processing (NLP) and various interdisciplinary areas. However, applying LLMs to chemistry is a complex task that requires specialized domain knowledge. This paper provides a thorough exploration of the nuanced methodologies employed in integrating LLMs into the field of chemistry, delving into the complexities and innovations at this interdisciplinary juncture. Specifically, our analysis begins with examining how molecular information is fed into LLMs through various representation and tokenization methods. We then categorize chemical LLMs into three distinct groups based on the domain and modality of their input data, and discuss approaches for integrating these inputs for LLMs. Furthermore, this paper delves into the pretraining objectives with adaptations to chemical LLMs. After that, we explore the diverse applications of LLMs in chemistry, including novel paradigms for their application in chemistry tasks. Finally, we identify promising research directions, including further integration with chemical knowledge, advancements in continual learning, and improvements in model interpretability, paving the way for groundbreaking developments in the field.
Comment: Submitted to IJCAI 2024 survey track
- Published
- 2024
22. RetroOOD: Understanding Out-of-Distribution Generalization in Retrosynthesis Prediction
- Author
-
Yu, Yemin, Yuan, Luotian, Wei, Ying, Gao, Hanyu, Ye, Xinhai, Wang, Zhihua, and Wu, Fei
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Quantitative Biology - Quantitative Methods
- Abstract
Machine learning-assisted retrosynthesis prediction models have been gaining widespread adoption, though their performance often degrades significantly when deployed in real-world applications involving out-of-distribution (OOD) molecules or reactions. Despite steady progress on standard benchmarks, our understanding of existing retrosynthesis prediction models under distribution shifts remains stagnant. To this end, we first formally sort out two types of distribution shifts in retrosynthesis prediction and construct two groups of benchmark datasets. Next, through comprehensive experiments, we systematically compare state-of-the-art retrosynthesis prediction models on the two groups of benchmarks, revealing the limitations of previous in-distribution evaluation and re-examining the advantages of each model. More remarkably, motivated by the above empirical insights, we propose two model-agnostic techniques that can improve the OOD generalization of arbitrary off-the-shelf retrosynthesis prediction algorithms. Our preliminary experiments show their high potential, with an average performance improvement of 4.6%, and the established benchmarks serve as a foothold for further retrosynthesis prediction research towards OOD generalization.
- Published
- 2023
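The OOD benchmark construction described in the abstract above can be illustrated with a minimal, hypothetical sketch: partition molecules on a covariate so that train and test distributions differ. The function name `smiles_length_split` and the use of SMILES string length as the shifting covariate are illustrative assumptions, not the paper's actual split criteria.

```python
def smiles_length_split(smiles_list, threshold):
    """Toy covariate-shift split: short SMILES form the in-distribution
    set, long SMILES the out-of-distribution set. SMILES length is only
    a stand-in covariate; real benchmarks shift reaction or molecule
    distributions in more principled ways."""
    in_dist = [s for s in smiles_list if len(s) <= threshold]
    out_dist = [s for s in smiles_list if len(s) > threshold]
    return in_dist, out_dist

mols = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O",
        "CC(C)Cc1ccc(cc1)C(C)C(=O)O"]
train, test_ood = smiles_length_split(mols, threshold=20)
```

Evaluating a model trained only on `train` against `test_ood` then measures its behavior under this (toy) distribution shift.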
23. Understanding the Multi-modal Prompts of the Pre-trained Vision-Language Model
- Author
-
Ma, Shuailei, Xie, Chen-Wei, Wei, Ying, Sun, Siyang, Fan, Jiaqi, Bao, Xiaoyi, Guo, Yuxin, and Zheng, Yun
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Prompt learning has emerged as an efficient alternative for fine-tuning foundational models, such as CLIP, for various downstream tasks. However, no prior work provides a comprehensive explanation of the working mechanism of multi-modal prompts. In this paper, we conduct a direct analysis of the multi-modal prompts by asking the following questions: (i) How do the learned multi-modal prompts improve the recognition performance? (ii) What do the multi-modal prompts learn? To answer these questions, we begin by isolating the component of the formula where the prompt influences the calculation of self-attention at each layer, in two distinct ways: (1) introducing prompt embeddings makes the [cls] token focus on foreground objects, and (2) the prompts learn a bias term during the update of token embeddings, allowing the model to adapt to the target domain. Subsequently, we conduct extensive visualization and statistical experiments on eleven diverse downstream recognition datasets. From the experiments, we reveal that the learned prompts improve performance mainly through the second way, acting as a dataset bias that improves the recognition performance of the pre-trained model on the corresponding dataset. Meanwhile, we propose bias tuning to validate our finding. With a deeper understanding of the multi-modal prompt, we hope our work can inspire new and solid research in this direction., Comment: We find that the statistical information in Figure 2 neglects the statistics for tSOS, so we make corrections. Additionally, we change the statistical samples to those where CLIP misidentifies but prompt tuning identifies correctly. At the same time, we also revise some of the descriptions. The changes to the supplementary materials will be updated shortly. arXiv admin note: text overlap with arXiv:2307.06948 by other authors
- Published
- 2023
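The "prompts learn a bias term" mechanism in the abstract above has a simple algebraic core that can be checked numerically: appending one prompt key/value pair to self-attention makes each token's output a convex combination of its prompt-free output and the prompt value. The sketch below is a minimal single-head, unscaled-attention illustration; all array names are assumptions for this example.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n, d = 3, 4
Q = rng.normal(size=(n, d))   # token queries
K = rng.normal(size=(n, d))   # token keys
V = rng.normal(size=(n, d))   # token values
k_p = rng.normal(size=(d,))   # one learned prompt key
v_p = rng.normal(size=(d,))   # one learned prompt value

# attention output without the prompt
out_base = softmax(Q @ K.T) @ V

# attention output with the prompt token appended to keys/values
A = softmax(Q @ np.vstack([K, k_p]).T)
out_prompt = A @ np.vstack([V, v_p])

# identity: out_prompt = (1 - a_p) * out_base + a_p * v_p, i.e. the
# prompt injects a per-token bias toward v_p
a_p = A[:, -1:]               # attention mass each token puts on the prompt
reconstructed = (1 - a_p) * out_base + a_p * v_p
```

The identity holds because the original logits are unchanged, so the non-prompt attention weights are simply rescaled by `1 - a_p`.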
24. SKDF: A Simple Knowledge Distillation Framework for Distilling Open-Vocabulary Knowledge to Open-world Object Detector
- Author
-
Ma, Shuailei, Wang, Yuefeng, Wei, Ying, Fan, Jiaqi, Zhang, Enming, Sun, Xinyu, and Chen, Peihao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
In this paper, we attempt to specialize the VLM for OWOD tasks by distilling its open-world knowledge into a language-agnostic detector. Surprisingly, we observe that the combination of a simple knowledge distillation approach and the automatic pseudo-labeling mechanism in OWOD can achieve better performance for unknown object detection, even with a small amount of data. Unfortunately, knowledge distillation for unknown objects severely affects the learning of detectors with conventional structures for known objects, leading to catastrophic forgetting. To alleviate these problems, we propose a down-weight loss function for knowledge distillation from vision-language to single vision modality. Meanwhile, we propose a cascade decoupled decoding structure that decouples the learning of localization and recognition to reduce the impact of category interactions of known and unknown objects on the localization learning process. Ablation experiments demonstrate that both are effective in mitigating the impact of open-world knowledge distillation on the learning of known objects. Additionally, to alleviate the current lack of comprehensive benchmarks for evaluating the ability of open-world detectors to detect unknown objects in the open world, we propose two benchmarks, named "StandardSet♥" and "IntensiveSet♠" respectively, based on the complexity of their testing scenarios. Comprehensive experiments performed on OWOD, MS-COCO, and our proposed benchmarks demonstrate the effectiveness of our methods. The code and proposed dataset are available at https://github.com/xiaomabufei/SKDF., Comment: arXiv admin note: substantial text overlap with arXiv:2303.11623
- Published
- 2023
25. Equivariant Flow Matching with Hybrid Probability Transport
- Author
-
Song, Yuxuan, Gong, Jingjing, Xu, Minkai, Cao, Ziyao, Lan, Yanyan, Ermon, Stefano, Zhou, Hao, and Ma, Wei-Ying
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
The generation of 3D molecules requires simultaneously deciding the categorical features (atom types) and continuous features (atom coordinates). Deep generative models, especially Diffusion Models (DMs), have demonstrated effectiveness in generating feature-rich geometries. However, existing DMs typically suffer from unstable probability dynamics with inefficient sampling speed. In this paper, we introduce geometric flow matching, which enjoys the advantages of both equivariant modeling and stabilized probability dynamics. More specifically, we propose a hybrid probability path where the coordinate probability path is regularized by an equivariant optimal transport, and the information between different modalities is aligned. Experimentally, the proposed method consistently achieves better performance on multiple molecule generation benchmarks with a 4.75× average sampling speed-up., Comment: NeurIPS 2023
- Published
- 2023
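The flow-matching recipe summarized in the abstract above — pair noise and data samples, interpolate along a probability path, and regress a velocity field — can be sketched in a few lines. This is a minimal sketch under stated assumptions: the greedy nearest-neighbour pairing below only approximates the equivariant optimal-transport plan the paper uses, and no network is trained here, only the regression targets are built.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 8, 3
x0 = rng.normal(size=(n, d))        # noise samples
x1 = rng.normal(size=(n, d)) + 5.0  # "data" samples

# mini-batch pairing to straighten transport paths (greedy
# nearest-neighbour stand-in for a proper OT coupling)
pairs, free = [], list(range(n))
for i in range(n):
    j = min(free, key=lambda j: np.linalg.norm(x0[i] - x1[j]))
    free.remove(j)
    pairs.append(j)
x1 = x1[pairs]

# linear probability path and its target velocity field
t = rng.uniform(size=(n, 1))
x_t = (1 - t) * x0 + t * x1         # interpolated training input
v_target = x1 - x0                  # regression target for v(x_t, t)
```

A flow-matching model would then minimize the mean squared error between its predicted velocity at `(x_t, t)` and `v_target`.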
26. Altered cerebellar-cerebral dynamic functional connectivity in patients with pontine stroke: a resting-state fMRI study
- Author
-
Wang, Xin, Wang, Caihong, Liu, Jingchun, Guo, Jun, Miao, Peifang, Wei, Ying, Wang, Yingying, Li, Zhen, Wang, Kaiyu, Zhang, Yong, Cheng, Jingliang, and Ren, Cuiping
- Published
- 2024
- Full Text
- View/download PDF
27. Screening mammography frequency following dense breast notification among a predominantly Hispanic/Latina screening cohort
- Author
-
Lee Argov, Erica J., Rodriguez, Carmen B., Agovino, Mariangela, Schmitt, Karen M., Desperito, Elise, Karr, Anita G., Wei, Ying, Terry, Mary Beth, and Tehranifar, Parisa
- Published
- 2024
- Full Text
- View/download PDF
28. Spatial regression with multiplicative errors, and its application with LiDAR measurements
- Author
-
You, Hojun, Wu, Wei-Ying, Lim, Chae Young, Yoon, Kyubaek, and Choi, Jongeun
- Published
- 2024
- Full Text
- View/download PDF
29. Catalytic Reduction of N2O by NH3 over Fe-zeolite Catalysts with Different Topologies
- Author
-
Du, Shaohua, Kang, Bin, Guo, Xiaonan, Wei, Ying, Jia, Jingbo, and Zhang, Runduo
- Published
- 2024
- Full Text
- View/download PDF
30. Deep learning for malignancy risk estimation of incidental sub-centimeter pulmonary nodules on CT images
- Author
-
Zhang, Rui, Wei, Ying, Wang, Denian, Chen, Bojiang, Sun, Huaiqiang, Lei, Yi, Zhou, Qing, Luo, Zhuang, Jiang, Li, Qiu, Rong, Shi, Feng, and Li, Weimin
- Published
- 2024
- Full Text
- View/download PDF
31. Epstein–Barr virus positive gastric cancer: the pathological basis of CT findings and radiomics models prediction
- Author
-
Sun, Shuangshuang, Li, Lin, Xu, Mengying, Wei, Ying, Shi, Feng, and Liu, Song
- Published
- 2024
- Full Text
- View/download PDF
32. Regularized nonlinear regression with dependent errors and its application to a biomechanical model
- Author
-
You, Hojun, Yoon, Kyubaek, Wu, Wei-Ying, Choi, Jongeun, and Lim, Chae Young
- Published
- 2024
- Full Text
- View/download PDF
33. Effects of Course Structure on Student Engagement and Learning Performance in an Electronics Course
- Author
-
Wei-Ying Cheng, Jennifer Wen-Shya Lee, and Shi-Wei Chu
- Abstract
Many educational strategies have been proposed to improve students' learning motivation and outcomes. This paper reports the student learning outcome results of a three-year study centered on the Electronics course at the Department of Physics of National Taiwan University. In the first year, peer instruction (PI) with in-class lectures was implemented. In the second year, in-class lectures were replaced with online lectures in a flipped classroom (FC) approach, and PI in class was maintained. In the third year, PI-based conceptual questions (CQs) were scored as part of in-class homework to enhance motivation for online lecture preview. Learning performance was evaluated based on cumulative percentage of correct answers to CQs and summative assessment. The results revealed improved student performance on summative assessment with PI and FC approaches combined. Furthermore, when CQs were scored, overall learning outcomes were significantly enhanced. In addition, an advantage of using a PI plus FC approach over using PI alone is that more course materials can be covered in online videos, which prevents a loss of lecture content to the time-consuming, in-class discussions involved in PI. Our study indicates that when students' motivations to prepare before class are reinforced using graded CQs, the learning outcome enhancement of PI plus FC is even more significant.
- Published
- 2023
34. Concept-wise Fine-tuning Matters in Preventing Negative Transfer
- Author
-
Yang, Yunqiao, Huang, Long-Kai, and Wei, Ying
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
A multitude of prevalent pre-trained models mark a major milestone in the development of artificial intelligence, while fine-tuning has been a common practice that enables pre-trained models to figure prominently in a wide array of target datasets. Our empirical results reveal that off-the-shelf fine-tuning techniques are far from adequate to mitigate negative transfer caused by two types of underperforming features in a pre-trained model: rare features and spuriously correlated features. Rooted in structural causal models of predictions after fine-tuning, we propose a Concept-wise fine-tuning (Concept-Tuning) approach which refines feature representations at the level of patches, with each patch encoding a concept. Concept-Tuning minimizes the negative impacts of rare features and spuriously correlated features by (1) maximizing the mutual information between examples in the same category with regard to a slice of rare features (a patch) and (2) applying front-door adjustment via attention neural networks over channels and feature slices (patches). The proposed Concept-Tuning consistently and significantly (by up to 4.76%) improves prior state-of-the-art fine-tuning methods on eleven datasets, diverse pre-training strategies (supervised and self-supervised), various network architectures, and sample sizes in a target dataset.
- Published
- 2023
35. High-fidelity 3D Reconstruction of Plants using Neural Radiance Field
- Author
-
Hu, Kewei, Wei, Ying, Pan, Yaoqiang, Kang, Hanwen, and Chen, Chao
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Robotics - Abstract
Accurate reconstruction of plant phenotypes plays a key role in optimising sustainable farming practices in the field of Precision Agriculture (PA). Currently, optical sensor-based approaches dominate the field, but the need for high-fidelity 3D reconstruction of crops and plants in unstructured agricultural environments remains challenging. Recently, a promising development has emerged in the form of Neural Radiance Field (NeRF), a novel method that utilises neural density fields. This technique has shown impressive performance in various novel-view synthesis tasks, but has remained relatively unexplored in the agricultural context. In our study, we focus on two fundamental tasks within plant phenotyping: (1) the synthesis of 2D novel-view images and (2) the 3D reconstruction of crop and plant models. We explore the world of neural radiance fields, in particular two SOTA methods: Instant-NGP, which excels in generating high-quality images with impressive training and inference speed, and Instant-NSR, which improves the reconstructed geometry by incorporating the Signed Distance Function (SDF) during training. In particular, we present a novel plant phenotype dataset comprising real plant images from production environments. This dataset is a first-of-its-kind initiative aimed at comprehensively exploring the advantages and limitations of NeRF in agricultural contexts. Our experimental results show that NeRF demonstrates commendable performance in the synthesis of novel-view images and is able to achieve reconstruction results that are competitive with Reality Capture, a leading commercial software for 3D Multi-View Stereo (MVS)-based reconstruction. However, our study also highlights certain drawbacks of NeRF, including relatively slow training speeds, performance limitations in cases of insufficient sampling, and challenges in obtaining geometry quality in complex setups.
- Published
- 2023
- Full Text
- View/download PDF
36. Sliced Denoising: A Physics-Informed Molecular Pre-Training Method
- Author
-
Ni, Yuyan, Feng, Shikun, Ma, Wei-Ying, Ma, Zhi-Ming, and Lan, Yanyan
- Subjects
Quantitative Biology - Biomolecules ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
While molecular pre-training has shown great potential in enhancing drug discovery, the lack of a solid physical interpretation in current methods raises concerns about whether the learned representation truly captures the underlying explanatory factors in observed data, ultimately resulting in limited generalization and robustness. Although denoising methods offer a physical interpretation, their accuracy is often compromised by ad-hoc noise design, leading to inaccurate learned force fields. To address this limitation, this paper proposes a new method for molecular pre-training, called sliced denoising (SliDe), which is based on the classical mechanical intramolecular potential theory. SliDe utilizes a novel noise strategy that perturbs bond lengths, angles, and torsion angles to achieve better sampling over conformations. Additionally, it introduces a random slicing approach that circumvents the computationally expensive calculation of the Jacobian matrix, which is otherwise essential for estimating the force field. By aligning with physical principles, SliDe shows a 42% improvement in the accuracy of estimated force fields compared to current state-of-the-art denoising methods, and thus outperforms traditional baselines on various molecular property prediction tasks.
- Published
- 2023
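The core idea in the abstract above — inject noise in internal coordinates (bond lengths, angles, torsions) rather than directly in Cartesian coordinates — can be illustrated with a toy three-atom chain in 2D. The helper `chain_coords` and the water-like geometry are illustrative assumptions; SliDe itself operates on full 3D conformations and adds the random-slicing estimator, which is not shown here.

```python
import numpy as np

def chain_coords(r1, r2, theta):
    """Place a 3-atom chain in 2D from internal coordinates:
    bond lengths r1, r2 and bond angle theta at the middle atom."""
    a = np.array([0.0, 0.0])
    b = np.array([r1, 0.0])
    c = b + r2 * np.array([np.cos(np.pi - theta), np.sin(np.pi - theta)])
    return np.stack([a, b, c])

rng = np.random.default_rng(2)
r1, r2, theta = 1.0, 1.5, np.deg2rad(104.5)   # water-like geometry
coords = chain_coords(r1, r2, theta)

# noise is injected in internal coordinates, then mapped back to
# Cartesian positions -- better conformation sampling than isotropic
# Cartesian noise
noisy = chain_coords(r1 + 0.01 * rng.normal(),
                     r2 + 0.01 * rng.normal(),
                     theta + 0.05 * rng.normal())
```

Perturbing `theta` by a few degrees moves atom `c` along a physically plausible bending mode, which isotropic Cartesian noise cannot do.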
37. On the Opportunities of Green Computing: A Survey
- Author
-
Zhou, You, Lin, Xiujing, Zhang, Xiang, Wang, Maolin, Jiang, Gangwei, Lu, Huakang, Wu, Yupeng, Zhang, Kai, Yang, Zhe, Wang, Kehang, Sui, Yongduo, Jia, Fengwei, Tang, Zuoli, Zhao, Yao, Zhang, Hongxuan, Yang, Tiannuo, Chen, Weibo, Mao, Yunong, Li, Yi, Bao, De, Li, Yu, Liao, Hongrui, Liu, Ting, Liu, Jingwen, Guo, Jinchi, Zhao, Xiangyu, Wei, Ying, Qian, Hong, Liu, Qi, Wang, Xiang, Chan, Wai Kin, Li, Chenliang, Li, Yusen, Yang, Shiyu, Yan, Jining, Mou, Chao, Han, Shuai, Jin, Wuxia, Zhang, Guannan, and Zeng, Xiaodong
- Subjects
Computer Science - Artificial Intelligence - Abstract
Artificial Intelligence (AI) has achieved significant advancements in technology and research with development over several decades, and is widely used in many areas including computer vision, natural language processing, time-series analysis, speech synthesis, etc. In the age of deep learning, especially with the rise of Large Language Models, a large majority of researchers' attention is paid to pursuing new state-of-the-art (SOTA) results, resulting in ever-increasing model size and computational complexity. The need for high computing power brings higher carbon emissions and undermines research fairness by preventing small and medium-sized research institutions and companies with limited funding from participating in research. To tackle the challenges of computing resources and the environmental impact of AI, Green Computing has become a hot research topic. In this survey, we give a systematic overview of the technologies used in Green Computing. We propose a framework of Green Computing and divide it into four key components: (1) Measures of Greenness, (2) Energy-Efficient AI, (3) Energy-Efficient Computing Systems and (4) AI Use Cases for Sustainability. For each component, we discuss the research progress made and the commonly used techniques to optimize AI efficiency. We conclude that this new research direction has the potential to address the conflicts between resource constraints and AI development. We encourage more researchers to pay attention to this direction and make AI more environmentally friendly., Comment: 113 pages, 18 figures
- Published
- 2023
38. Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompt
- Author
-
Jiang, Gangwei, Jiang, Caigao, Xue, Siqiao, Zhang, James Y., Zhou, Jun, Lian, Defu, and Wei, Ying
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Continual pre-training has become essential for adapting a pre-trained model to a multitude of domains and tasks in the fast-evolving world. In practice, a continually pre-trained model is expected to demonstrate not only greater capacity when fine-tuned on pre-trained domains but also non-decreasing performance on unseen ones. In this work, we first investigate the anytime fine-tuning effectiveness of existing continual pre-training approaches, finding uniformly decreased performance on unseen domains. To this end, we propose a prompt-guided continual pre-training method, in which we train a hypernetwork to generate domain-specific prompts using both agreement and disagreement losses. The agreement loss maximally preserves the generalization of a pre-trained model to new domains, and the disagreement loss guards the exclusiveness of the generated hidden states for each domain. Remarkably, prompts produced by the hypernetwork alleviate the need for domain identity when fine-tuning and promote knowledge transfer across domains. Our method achieves improvements of 3.57% and 3.4% on two real-world datasets (covering domain shift and temporal shift, respectively), demonstrating its efficacy.
- Published
- 2023
39. Spatial Regression With Multiplicative Errors, and Its Application With Lidar Measurements
- Author
-
You, Hojun, Wu, Wei-Ying, Lim, Chae Young, Yoon, Kyubaek, and Choi, Jongeun
- Subjects
Statistics - Methodology - Abstract
Multiplicative errors in addition to spatially referenced observations often arise in geodetic applications, particularly in surface estimation with light detection and ranging (LiDAR) measurements. However, spatial regression involving multiplicative errors remains relatively unexplored in such applications. In this regard, we present a penalized modified least squares estimator to handle the complexities of a multiplicative error structure while identifying significant variables in spatially dependent observations for surface estimation. The proposed estimator can also be applied to classical additive-error spatial regression. By establishing asymptotic properties of the proposed estimator under increasing domain asymptotics with stochastic sampling design, we provide a rigorous foundation for its effectiveness. A comprehensive simulation study confirms the superior performance of our proposed estimator in accurately estimating and selecting parameters, outperforming existing approaches. To demonstrate its real-world applicability, we employ our proposed method, along with other alternative techniques, to estimate a rotational landslide surface using LiDAR measurements. The results highlight the efficacy and potential of our approach in tackling complex spatial regression problems involving multiplicative errors.
- Published
- 2023
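A multiplicative error structure, as in the entry above, has a classical reduction worth spelling out: taking logarithms turns y = f(x)·ε into log y = log f(x) + log ε, an ordinary additive-error regression. The sketch below shows only that reduction on independent data; the paper's contribution (a penalized modified least squares estimator for spatially dependent errors) is not reproduced here, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.uniform(0.5, 2.0, size=n)
beta0, beta1 = 1.2, 0.8

# multiplicative error model: y = exp(b0 + b1*x) * eps with eps > 0
eps = np.exp(0.1 * rng.normal(size=n))
y = np.exp(beta0 + beta1 * x) * eps

# log transform: log y = b0 + b1*x + log eps  (additive errors)
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
```

With spatially correlated `eps`, ordinary least squares on the log scale loses efficiency, which motivates the modified estimator the paper proposes.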
40. EFTNet: an efficient fine-tuning method for few-shot segmentation
- Author
-
Li, Jiaguang, Wang, Yubo, Gao, Zihan, and Wei, Ying
- Published
- 2024
- Full Text
- View/download PDF
41. Effect of Gentianella acuta (Michx.) Hulten against the arsenic-induced development hindrance of mouse oocytes
- Author
-
Wang, Chunyu, Wang, Biao, Wei, Ying, Li, Shubin, Ren, Jingyu, Dai, Yanfeng, and Liu, Gang
- Published
- 2024
- Full Text
- View/download PDF
42. Serious gaming: user acceptance of the simulated motorbike license training system
- Author
-
Li, Wei-Ying, Lu, Tzu-Chuen, and Wei, Hui-Xun
- Published
- 2024
- Full Text
- View/download PDF
43. Effect of heat treatment on microstructural evolution, mechanical properties and tribological properties of H13 steel prepared using selective laser melting
- Author
-
Han, Li-xiong, Wang, Yan, Liu, Shi-feng, Zhang, Zhao-hui, Liu, Wei, Yang, Xin, Ma, Dang-shen, Zhou, Jian, and Wei, Ying-kang
- Published
- 2024
- Full Text
- View/download PDF
44. Efficacy and Safety of Fillers for the Treatment of Nasolabial Folds: A Network meta-Analysis of Randomized Controlled Trials
- Author
-
Li, Man-Yun, Chien, Wei-Ying, Kang, Yi-No, and Chen, Chiehfeng
- Published
- 2024
- Full Text
- View/download PDF
45. Influence of powder oxidation on powder properties and formability in H13 hot-work steels processed by electron beam melting
- Author
-
Liu, Wei, Wang, Yan, Han, Li-xiong, Wei, Ying-kang, Tang, Hui-ping, and Liu, Shi-feng
- Published
- 2024
- Full Text
- View/download PDF
46. Tracheal computed tomography radiomics model for prediction of the Omicron variant of severe acute respiratory syndrome coronavirus 2
- Author
-
Fang, Xu, Shi, Feng, Liu, Fang, Wei, Ying, Li, Jing, Wu, Jiaojiao, Wang, Tiegong, Lu, Jianping, Shao, Chengwei, and Bian, Yun
- Published
- 2024
- Full Text
- View/download PDF
47. Vaccination effect on patients with Delta variant of COVID-19 pneumonia: a study of longitudinal dynamic chest CTs using artificial intelligence model
- Author
-
Xin, Xiaoyan, Hu, Jun, Wei, Ying, Dai, Jinghong, Li, Jie, Yi, Changhua, Peng, Xin, Zhang, Xin, Qing, Zhao, Wang, Zhengge, Han, Xiaowei, Long, Cong, Yi, Yongxiang, Gao, Yaozong, Shi, Feng, Du, Chao, and Zhang, Bing
- Published
- 2024
- Full Text
- View/download PDF
48. Fractional Denoising for 3D Molecular Pre-training
- Author
-
Feng, Shikun, Ni, Yuyan, Lan, Yanyan, Ma, Zhi-Ming, and Ma, Wei-Ying
- Subjects
Quantitative Biology - Quantitative Methods ,Computer Science - Machine Learning ,Physics - Chemical Physics - Abstract
Coordinate denoising is a promising 3D molecular pre-training method, which has achieved remarkable performance in various downstream drug discovery tasks. Theoretically, the objective is equivalent to learning the force field, which is revealed to be helpful for downstream tasks. Nevertheless, there are two challenges for coordinate denoising to learn an effective force field, i.e., low-coverage samples and an isotropic force field. The underlying reason is that the molecular distributions assumed by existing denoising methods fail to capture the anisotropic characteristic of molecules. To tackle these challenges, we propose a novel hybrid noise strategy, including noise on both dihedral angles and coordinates. However, denoising such hybrid noise in the traditional way is no longer equivalent to learning the force field. Through theoretical deductions, we find that the problem is caused by the dependence of the noise covariance on the input conformation. To this end, we propose to decouple the two types of noise and design a novel fractional denoising method (Frad), which only denoises the latter coordinate part. In this way, Frad enjoys both the merit of sampling more low-energy structures and the force field equivalence. Extensive experiments show the effectiveness of Frad in molecular representation, with a new state-of-the-art on 9 out of 12 tasks of QM9 and on 7 out of 8 targets of MD17.
- Published
- 2023
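The two-stage noise in the abstract above can be caricatured on a toy conformation: first a structural perturbation that is *not* a denoising target, then small Cartesian noise that *is*. In this minimal sketch, a rigid rotation of the "tail" atoms about an axis through one atom stands in for the paper's dihedral-angle perturbation; the atom count, `sigma`, and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
coords = rng.normal(size=(5, 3))          # toy 5-atom conformation

# stage 1: chemistry-aware noise -- rotate atoms 2..4 about the z-axis
# through atom 1 (a crude stand-in for a torsion update); this changes
# the conformation but the model is never asked to undo it
phi = 0.3
Rz = np.array([[np.cos(phi), -np.sin(phi), 0.0],
               [np.sin(phi),  np.cos(phi), 0.0],
               [0.0,          0.0,         1.0]])
conf = coords.copy()
conf[2:] = coords[1] + (coords[2:] - coords[1]) @ Rz.T

# stage 2: small Cartesian noise -- only this fraction of the total
# perturbation is the denoising target, hence "fractional" denoising
sigma = 0.04
noise = sigma * rng.normal(size=conf.shape)
model_input = conf + noise
target = noise                            # what the network regresses
```

Because stage 1 is excluded from the target, the regression stays equivalent to force learning while the sampling still reaches more low-energy structures.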
49. Learning to Substitute Spans towards Improving Compositional Generalization
- Author
-
Li, Zhaoyi, Wei, Ying, and Lian, Defu
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Despite the rising prevalence of neural sequence models, recent empirical evidence suggests their deficiency in compositional generalization. One of the current de-facto solutions to this problem is compositional data augmentation, which aims to introduce additional compositional inductive bias. Nonetheless, the improvement offered by existing handcrafted augmentation strategies is limited when successful systematic generalization of neural sequence models requires multi-grained compositional bias (i.e., not limited to either lexical or structural biases alone) or differentiation of training sequences in an imbalanced difficulty distribution. To address these two challenges, we first propose a novel compositional augmentation strategy dubbed Span Substitution (SpanSub) that enables multi-grained composition of substantial substructures across the whole training set. Over and above that, we introduce the Learning to Substitute Span (L2S2) framework, which learns the span substitution probabilities in SpanSub in an end-to-end manner by maximizing the loss of neural sequence models, so as to outweigh those challenging compositions with elusive concepts and novel surroundings. Our empirical results on three standard compositional generalization benchmarks, including SCAN, COGS and GeoQuery (with improvements of at most 66.5%, 10.3%, and 1.2%, respectively), demonstrate the superiority of SpanSub, the learning framework L2S2, and their combination., Comment: accepted by ACL 2023
- Published
- 2023
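Span-level compositional augmentation, as described in the entry above, can be reduced to a toy operation: splice a span from one training sequence into another to create a novel composition. The function below swaps *arbitrary* fixed-length spans, whereas SpanSub exchanges aligned, exchangeable substructures and L2S2 learns which substitutions to prefer; the function name and SCAN-style example commands are illustrative assumptions.

```python
import random

def span_substitute(seq_a, seq_b, span_len=2, seed=0):
    """Replace a random span of seq_a with a random span of seq_b --
    a toy version of span-level compositional augmentation."""
    rng = random.Random(seed)
    i = rng.randrange(len(seq_a) - span_len + 1)
    j = rng.randrange(len(seq_b) - span_len + 1)
    return seq_a[:i] + seq_b[j:j + span_len] + seq_a[i + span_len:]

a = "jump twice and walk left".split()
b = "look around and run right".split()
augmented = span_substitute(a, b)
```

Training on such recombined sequences exposes the model to compositions absent from the original data, which is the inductive bias the augmentation is after.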
50. Recombination of variable and host range regions of glycoprotein gp85 in different avian leukosis virus subgroup K isolates
- Author
-
Li, Xiyue, Wang, Yajun, Li, Jiufeng, Yu, Zuhua, Wei, Ying, Chen, Songbiao, He, Lei, Ding, Ke, and Chen, Jian
- Published
- 2024
- Full Text
- View/download PDF