45 results for "Feng, Zunlei"
Search Results
2. Life regression based patch slimming for vision transformers
- Author
- Chen, Jiawei, Chen, Lin, Yang, Jiang, Shi, Tianqi, Cheng, Lechao, Feng, Zunlei, and Song, Mingli
- Published
- 2024
- Full Text
- View/download PDF
3. NRD-Net: a noise-resistant distillation network for accurate diagnosis of prostate cancer with bi-parametric MRI images
- Author
- Du, Xiangtong, Shen, Ao, Wang, Ximing, Feng, Zunlei, and Deng, Hai
- Published
- 2024
- Full Text
- View/download PDF
4. Graph Neural Networks-based hybrid framework for predicting particle crushing strength
- Author
- Zheng, Tongya, Zhang, Tianli, Guan, Qingzheng, Huang, Wenjie, Feng, Zunlei, Song, Mingli, and Chen, Chun
- Published
- 2024
- Full Text
- View/download PDF
5. PatchDetector: Pluggable and non-intrusive patch for small object detection
- Author
- Zhou, Linyun, Zhang, Shengxuming, Qiu, Tian, Xu, Wenxiang, Feng, Zunlei, and Song, Mingli
- Published
- 2024
- Full Text
- View/download PDF
6. DataMap: Dataset transferability map for medical image classification
- Author
- Du, Xiangtong, Liu, Zhidong, Feng, Zunlei, and Deng, Hai
- Published
- 2024
- Full Text
- View/download PDF
7. Noise is the fatal poison: A Noise-aware Network for noisy dataset classification
- Author
- Yu, Xiaotian, Zhang, Shengxuming, Jia, Lingxiang, Wang, Yuexuan, Song, Mingli, and Feng, Zunlei
- Published
- 2024
- Full Text
- View/download PDF
8. Disassembling Convolutional Segmentation Network
- Author
- Hu, Kaiwen, Gao, Jing, Mao, Fangyuan, Song, Xinhui, Cheng, Lechao, Feng, Zunlei, and Song, Mingli
- Published
- 2023
- Full Text
- View/download PDF
9. Federated selective aggregation for on-device knowledge amalgamation
- Author
- Xie, Donglin, Yu, Ruonan, Fang, Gongfan, Han, Jiaqi, Song, Jie, Feng, Zunlei, Sun, Li, and Song, Mingli
- Published
- 2023
- Full Text
- View/download PDF
10. DCAM: Disturbed class activation maps for weakly supervised semantic segmentation
- Author
- Lei, Jie, Yang, Guoyu, Wang, Shuaiwei, Feng, Zunlei, and Liang, Ronghua
- Published
- 2023
- Full Text
- View/download PDF
11. Category-aware feature attribution for Self-Optimizing medical image classification
- Author
- Lei, Jie, Yang, Guoyu, Wang, Shuaiwei, Feng, Zunlei, and Liang, Ronghua
- Published
- 2023
- Full Text
- View/download PDF
12. Reinforcement learning based web crawler detection for diversity and dynamics
- Author
- Gao, Yang, Feng, Zunlei, Wang, Xiaoyang, Song, Mingli, Wang, Xingen, Wang, Xinyu, and Chen, Chun
- Published
- 2023
- Full Text
- View/download PDF
13. Deep learning based diagnosis for cysts and tumors of jaw with massive healthy samples
- Author
- Yu, Dan, Hu, Jiacong, Feng, Zunlei, Song, Mingli, and Zhu, Huiyong
- Published
- 2022
- Full Text
- View/download PDF
14. Disassembling object representations without labels
- Author
- Feng, Zunlei, He, Yongming, Yuan, Yike, Sun, Li, Wang, Huiqiong, and Song, Mingli
- Published
- 2021
- Full Text
- View/download PDF
15. Deep learning‐based accurate diagnosis and quantitative evaluation of microvascular invasion in hepatocellular carcinoma on whole‐slide histopathology images.
- Author
- Zhang, Xiuming, Yu, Xiaotian, Liang, Wenjie, Zhang, Zhongliang, Zhang, Shengxuming, Xu, Linjie, Zhang, Han, Feng, Zunlei, Song, Mingli, Zhang, Jing, and Feng, Shi
- Subjects
- RECEIVER operating characteristic curves, HISTOPATHOLOGY, DEEP learning, ARTIFICIAL intelligence, HEPATOCELLULAR carcinoma
- Abstract
Background: Microvascular invasion (MVI) is an independent prognostic factor that is associated with early recurrence and poor survival after resection of hepatocellular carcinoma (HCC). However, the traditional pathology approach is relatively subjective, time‐consuming, and heterogeneous in the diagnosis of MVI. The aim of this study was to develop a deep‐learning model that could significantly improve the efficiency and accuracy of MVI diagnosis. Materials and Methods: We collected H&E‐stained slides from 753 patients with HCC at the First Affiliated Hospital of Zhejiang University. An external validation set with 358 patients was selected from The Cancer Genome Atlas database. The deep‐learning model was trained by simulating the method used by pathologists to diagnose MVI. Model performance was evaluated with accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic curve. Results: We successfully developed an MVI artificial intelligence diagnostic model (MVI‐AIDM), which achieved an accuracy of 94.25% in the independent external validation set. The MVI positive detection rate of MVI‐AIDM was significantly higher than that of pathologists. Visualization results demonstrated the recognition of micro MVIs that were difficult to differentiate with traditional pathology. Additionally, the model provided automatic quantification of the number of cancer cells and spatial information regarding MVI. Conclusions: We developed a deep learning diagnostic model, which performed well and improved the efficiency and accuracy of MVI diagnosis. The model provided spatial information of MVI that was essential to accurately predict HCC recurrence after surgery. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. CU-Net: Component Unmixing Network for Textile Fiber Identification
- Author
- Feng, Zunlei, Liang, Weixin, Tao, Daocheng, Sun, Li, Zeng, Anxiang, and Song, Mingli
- Published
- 2019
- Full Text
- View/download PDF
17. Recent advances in deep learning for retrosynthesis.
- Author
- Zhong, Zipeng, Song, Jie, Feng, Zunlei, Liu, Tiantao, Jia, Lingxiang, Yao, Shaolun, Hou, Tingjun, and Song, Mingli
- Subjects
- ARTIFICIAL intelligence, DEEP learning, COMPUTER algorithms, LABOR costs, DRUG factories, ORGANIC chemistry
- Abstract
Retrosynthesis is the cornerstone of organic chemistry, providing chemists in material and drug manufacturing access to poorly available and brand‐new molecules. Conventional rule‐based or expert‐based computer‐aided synthesis has obvious limitations, such as high labor costs and limited search space. In recent years, dramatic breakthroughs driven by deep learning have revolutionized retrosynthesis. Here we aim to present a comprehensive review of recent advances in AI‐based retrosynthesis. For both single‐step and multi‐step retrosynthesis, we first introduce their goals and provide a thorough taxonomy of existing methods. Afterwards, we analyze these methods in terms of their mechanism and performance and introduce popular evaluation metrics, including a detailed comparison among representative methods on several public datasets. In the next part, we introduce popular databases and established platforms for retrosynthesis. Finally, this review concludes with a discussion about promising research directions in this field. This article is categorized under: Data Science > Artificial Intelligence/Machine Learning; Data Science > Computer Algorithms and Programming; Data Science > Chemoinformatics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. AFPN: Asymptotic Feature Pyramid Network for Object Detection
- Author
- Yang, Guoyu, Lei, Jie, Zhu, Zhikuan, Cheng, Siyu, Feng, Zunlei, and Liang, Ronghua
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Multi-scale features are of great importance in encoding objects with scale variance in object detection tasks. A common strategy for multi-scale feature extraction is adopting the classic top-down and bottom-up feature pyramid networks. However, these approaches suffer from the loss or degradation of feature information, impairing the fusion effect of non-adjacent levels. This paper proposes an asymptotic feature pyramid network (AFPN) to support direct interaction at non-adjacent levels. AFPN is initiated by fusing two adjacent low-level features and asymptotically incorporates higher-level features into the fusion process. In this way, the larger semantic gap between non-adjacent levels can be avoided. Given the potential for multi-object information conflicts to arise during feature fusion at each spatial location, an adaptive spatial fusion operation is further utilized to mitigate these inconsistencies. We incorporate the proposed AFPN into both two-stage and one-stage object detection frameworks and evaluate it on the MS-COCO 2017 validation and test datasets. Experimental evaluation shows that our method achieves more competitive results than other state-of-the-art feature pyramid networks. The code is available at https://github.com/gyyang23/AFPN.
- Published
- 2023
19. Improving Expressivity of GNNs with Subgraph-specific Factor Embedded Normalization
- Author
- Chen, Kaixuan, Liu, Shunyu, Zhu, Tongtian, Zheng, Tongya, Zhang, Haofei, Feng, Zunlei, Ye, Jingwen, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
- Abstract
Graph Neural Networks (GNNs) have emerged as a powerful category of learning architecture for handling graph-structured data. However, existing GNNs typically ignore crucial structural characteristics in node-induced subgraphs, which thus limits their expressiveness for various downstream tasks. In this paper, we strive to strengthen the representative capabilities of GNNs by devising a dedicated plug-and-play normalization scheme, termed SUbgraph-sPEcific FactoR Embedded Normalization (SuperNorm), that explicitly considers the intra-connection information within each node-induced subgraph. To this end, we embed the subgraph-specific factor at the beginning and the end of the standard BatchNorm, as well as incorporate graph instance-specific statistics for improved distinguishable capabilities. In the meantime, we provide theoretical analysis to support that, with the elaborated SuperNorm, an arbitrary GNN is at least as powerful as the 1-WL test in distinguishing non-isomorphic graphs. Furthermore, the proposed SuperNorm scheme is also demonstrated to alleviate the over-smoothing phenomenon. Experimental results related to predictions of graph, node, and link properties on eight popular datasets demonstrate the effectiveness of the proposed method. The code is available at https://github.com/chenchkx/SuperNorm. (13 pages, 7 figures)
- Published
- 2023
20. Improving Knowledge Distillation via Regularizing Feature Norm and Direction
- Author
- Wang, Yuzhu, Cheng, Lechao, Duan, Manni, Wang, Yongheng, Feng, Zunlei, and Kong, Shu
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Knowledge distillation (KD) exploits a large well-trained model (i.e., the teacher) to train a small student model on the same dataset for the same task. Treating teacher features as knowledge, prevailing methods of knowledge distillation train the student by aligning its features with the teacher's, e.g., by minimizing the KL-divergence between their logits or the L2 distance between their intermediate features. While it is natural to believe that better alignment of student features to the teacher better distills teacher knowledge, simply forcing this alignment does not directly contribute to the student's performance, e.g., classification accuracy. In this work, we propose to align student features with the class-means of teacher features, where a class-mean naturally serves as a strong classifier. To this end, we explore baseline techniques such as adopting a cosine-distance-based loss to encourage similarity between student features and their corresponding class-means of the teacher. Moreover, we train the student to produce large-norm features, inspired by other lines of work (e.g., model pruning and domain adaptation), which find large-norm features to be more significant. Finally, we propose a rather simple loss term (dubbed ND loss) to simultaneously (1) encourage the student to produce large-norm features, and (2) align the direction of student features and teacher class-means. Experiments on standard benchmarks demonstrate that our explored techniques help existing KD methods achieve better performance, i.e., higher classification accuracy on the ImageNet and CIFAR100 datasets, and higher detection precision on the COCO dataset. Importantly, our proposed ND loss helps the most, leading to state-of-the-art performance on these benchmarks. The source code is available at https://github.com/WangYZ1608/Knowledge-Distillation-via-ND. (16 pages, 8 figures, 6 tables)
- Published
- 2023
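The norm-and-direction idea described in the abstract above can be illustrated with a small toy sketch. This is not the paper's implementation; the function name, the weighting scheme, and the plain-list data layout are assumptions made for illustration only:

```python
import math

def nd_loss(student_feats, labels, class_means, norm_weight=0.1):
    """Toy norm-and-direction style objective: align each student feature's
    direction with its teacher class-mean (cosine term) while encouraging
    large feature norms. Illustrative sketch only, not the paper's code."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def norm(a):
        return math.sqrt(dot(a, a))

    direction_term, norm_term = 0.0, 0.0
    for feat, label in zip(student_feats, labels):
        mu = class_means[label]
        # 1 - cosine similarity: zero when student direction matches class-mean
        direction_term += 1.0 - dot(feat, mu) / (norm(feat) * norm(mu) + 1e-8)
        norm_term += norm(feat)  # rewarded (subtracted) to encourage large norms
    n = len(student_feats)
    return direction_term / n - norm_weight * norm_term / n
```

A student whose features point along the teacher class-means incurs only the (negative) norm reward, so the loss is minimized by features that are both well-aligned and large in magnitude.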
21. Life Regression based Patch Slimming for Vision Transformers
- Author
- Chen, Jiawei, Chen, Lin, Yang, Jiang, Shi, Tianqi, Cheng, Lechao, Feng, Zunlei, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Vision transformers have achieved remarkable success in computer vision tasks by using multi-head self-attention modules to capture long-range dependencies within images. However, the high inference computation cost poses a new challenge. Several methods have been proposed to address this problem, mainly by slimming patches. In the inference stage, these methods classify patches into two classes, one to keep and the other to discard, in multiple layers. This approach results in additional computation at every layer where patches are discarded, which hinders inference acceleration. In this study, we tackle the patch slimming problem from a different perspective by proposing a life regression module that determines the lifespan of each image patch in one go. During inference, a patch is discarded once the current layer index exceeds its life. Our proposed method avoids additional computation and parameters in multiple layers to enhance inference speed while maintaining competitive performance. Additionally, our approach requires fewer training epochs than other patch slimming methods. (8 pages, 4 figures)
- Published
- 2023
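The discard rule in the life-regression abstract above reduces to a single comparison per patch per layer. The sketch below assumes the per-patch lifespans have already been predicted once (the paper's regression head is not reproduced here); names and data layout are illustrative assumptions:

```python
def slim_patches(patch_tokens, lives, layer_idx):
    """Keep only patches whose predicted lifespan exceeds the current layer
    index. Toy sketch of the one-shot life-regression idea: `lives` is
    assumed to come from a small regression head run once before inference."""
    return [tok for tok, life in zip(patch_tokens, lives) if life > layer_idx]
```

Because the lifespans are fixed up front, each layer only filters by an integer comparison, with no extra keep/discard classifier per layer, which is the source of the claimed inference speedup.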
22. Team DETR: Guide Queries as a Professional Team in Detection Transformers
- Author
- Qiu, Tian, Zhou, Linyun, Xu, Wenxiang, Cheng, Lechao, Feng, Zunlei, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recently proposed DETR variants have made tremendous progress in various scenarios due to their streamlined processes and remarkable performance. However, the learned queries usually explore the global context to generate the final set prediction, resulting in redundant burdens and unfaithful results. More specifically, a query is commonly responsible for objects of different scales and positions, which is a challenge for the query itself and causes spatial resource competition among queries. To alleviate this issue, we propose Team DETR, which leverages query collaboration and position constraints to embrace objects of interest more precisely. We also dynamically cater to each query member's prediction preference, offering the query better scale and spatial priors. In addition, the proposed Team DETR is flexible enough to be adapted to other existing DETR variants without increasing parameters and calculations. Extensive experiments on the COCO dataset show that Team DETR achieves remarkable gains, especially for small and large objects. Code is available at https://github.com/horrible-dong/TeamDETR.
- Published
- 2023
23. Recent advances in artificial intelligence for retrosynthesis
- Author
- Zhong, Zipeng, Song, Jie, Feng, Zunlei, Liu, Tiantao, Jia, Lingxiang, Yao, Shaolun, Hou, Tingjun, and Song, Mingli
- Subjects
- Chemical Physics (physics.chem-ph), FOS: Computer and information sciences, Computer Science - Machine Learning, Quantitative Biology - Biomolecules, FOS: Biological sciences, Physics - Chemical Physics, FOS: Physical sciences, Biomolecules (q-bio.BM), Machine Learning (cs.LG)
- Abstract
Retrosynthesis is the cornerstone of organic chemistry, providing chemists in material and drug manufacturing access to poorly available and brand-new molecules. Conventional rule-based or expert-based computer-aided synthesis has obvious limitations, such as high labor costs and limited search space. In recent years, dramatic breakthroughs driven by artificial intelligence have revolutionized retrosynthesis. Here we aim to present a comprehensive review of recent advances in AI-based retrosynthesis. For both single-step and multi-step retrosynthesis, we first list their goals and provide a thorough taxonomy of existing methods. Afterwards, we analyze these methods in terms of their mechanism and performance, and introduce popular evaluation metrics for them, including a detailed comparison among representative methods on several public datasets. In the next part, we introduce popular databases and established platforms for retrosynthesis. Finally, this review concludes with a discussion about promising research directions in this field. (27 pages, 6 figures, 4 tables)
- Published
- 2023
24. Transferability Estimation Based On Principal Gradient Expectation
- Author
- Qi, Huiyan, Cheng, Lechao, Chen, Jingjing, Yu, Yue, Song, Xue, Feng, Zunlei, and Jiang, Yu-Gang
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Transfer learning aims to improve the performance of target tasks by transferring knowledge acquired in source tasks. The standard approach is pre-training followed by fine-tuning or linear probing. In particular, selecting a proper source domain for a specific target domain under predefined tasks is crucial for improving efficiency and effectiveness. It is conventional to solve this problem via estimating transferability. However, existing methods cannot achieve a trade-off between performance and cost. To comprehensively evaluate estimation methods, we summarize three properties: stability, reliability, and efficiency. Building upon them, we propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability. Specifically, we calculate the gradient over each weight unit multiple times with a restart scheme, and then we compute the expectation of all gradients. Finally, the transferability between the source and target is estimated by computing the gap of normalized principal gradients. Extensive experiments show that the proposed metric is superior to state-of-the-art methods on all properties. (11 pages, 2 figures, 7 tables)
- Published
- 2022
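The two-stage recipe in the abstract above (average gradients over restarts, then score transferability by the gap of normalized principal gradients) can be sketched as follows. This is a toy illustration, not the authors' implementation: `grad_fn` is a hypothetical callable standing in for one gradient evaluation per restart:

```python
import math

def principal_gradient(grad_fn, num_restarts=5):
    """Expectation of gradient vectors over several restarts, L2-normalized.
    `grad_fn()` is assumed to return one gradient vector per (re)start."""
    grads = [grad_fn() for _ in range(num_restarts)]
    dim = len(grads[0])
    mean = [sum(g[i] for g in grads) / num_restarts for i in range(dim)]
    scale = math.sqrt(sum(x * x for x in mean)) + 1e-8
    return [x / scale for x in mean]

def pge_transferability(src_grad_fn, tgt_grad_fn, num_restarts=5):
    """Score transferability as the negative gap between the normalized
    principal gradients of source and target (sketch of the PGE idea)."""
    g_src = principal_gradient(src_grad_fn, num_restarts)
    g_tgt = principal_gradient(tgt_grad_fn, num_restarts)
    gap = math.sqrt(sum((a - b) ** 2 for a, b in zip(g_src, g_tgt)))
    return -gap  # larger (closer to zero) means more transferable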
25. Federated Selective Aggregation for Knowledge Amalgamation
- Author
- Xie, Donglin, Yu, Ruonan, Fang, Gongfan, Song, Jie, Feng, Zunlei, Wang, Xinchao, Sun, Li, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
In this paper, we explore a new knowledge-amalgamation problem, termed Federated Selective Aggregation (FedSA). The goal of FedSA is to train a student model for a new task with the help of several decentralized teachers, whose pre-training tasks and data are different and agnostic. Our motivation for investigating such a problem setup stems from a recent dilemma of model sharing. Many researchers and institutes have spent enormous resources on training large and competent networks. Due to privacy, security, or intellectual property issues, they are, however, not able to share their own pre-trained models, even if they wish to contribute to the community. The proposed FedSA offers a solution to this dilemma and goes one step further since the learned student may specialize in a new task different from those of all the teachers. To this end, we propose a dedicated strategy for handling FedSA. Specifically, our student-training process is driven by a novel saliency-based approach that adaptively selects teachers as participants and integrates their representative capabilities into the student. To evaluate the effectiveness of FedSA, we conduct experiments on both single-task and multi-task settings. Experimental results demonstrate that FedSA effectively amalgamates knowledge from decentralized models and achieves competitive performance relative to centralized baselines. (18 pages, 4 figures)
- Published
- 2022
26. Mid-level Representation Enhancement and Graph Embedded Uncertainty Suppressing for Facial Expression Recognition
- Author
- Lei, Jie, Liu, Zhao, Zou, Zeyu, Li, Tong, Juan, Xu, Wang, Shuaiwei, Yang, Guoyu, and Feng, Zunlei
- Subjects
- FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Facial expression is an essential factor in conveying human emotional states and intentions. Although remarkable advancement has been made in facial expression recognition (FER), challenges due to large variations of expression patterns and unavoidable data uncertainties remain. In this paper, we propose mid-level representation enhancement (MRE) and graph embedded uncertainty suppressing (GUS) to address these issues. On one hand, MRE is introduced to avoid expression representation learning being dominated by a limited number of highly discriminative patterns. On the other hand, GUS is introduced to suppress feature ambiguity in the representation space. The proposed method not only has stronger generalization capability to handle different variations of expression patterns but also more robustness in capturing expression representations. Experimental evaluation on Aff-Wild2 has verified the effectiveness of the proposed method.
- Published
- 2022
27. Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework
- Author
- Liu, Shunyu, Chen, Kaixuan, Yu, Na, Song, Jie, Feng, Zunlei, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
- Abstract
Despite the promising results achieved, state-of-the-art interactive reinforcement learning schemes rely on passively receiving supervision signals from advisor experts, in the form of either continuous monitoring or pre-defined rules, which inevitably results in a cumbersome and expensive learning process. In this paper, we introduce a novel initiative advisor-in-the-loop actor-critic framework, termed Ask-AC, that replaces the unilateral advisor-guidance mechanism with a bidirectional learner-initiative one, and thereby enables a customized and efficacious message exchange between learner and advisor. At the heart of Ask-AC are two complementary components, namely an action requester and an adaptive state selector, that can be readily incorporated into various discrete actor-critic architectures. The former component allows the agent to initiatively seek advisor intervention in the presence of uncertain states, while the latter identifies the unstable states potentially missed by the former, especially when the environment changes, and then learns to promote the ask action on such states. Experimental results on both stationary and non-stationary environments and across different actor-critic backbones demonstrate that the proposed framework significantly improves the learning efficiency of the agent, and achieves performance on par with that obtained by continuous advisor monitoring.
- Published
- 2022
28. CNN LEGO: Disassembling and Assembling Convolutional Neural Network
- Author
- Hu, Jiacong, Gao, Jing, Feng, Zunlei, Cheng, Lechao, Lei, Jie, Bao, Hujun, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
The Convolutional Neural Network (CNN), which mimics the human visual perception mechanism, has been successfully used in many computer vision areas. Some psychophysical studies show that the visual perception mechanism synchronously processes form, color, movement, depth, etc., in the initial stage [7,20] and then integrates all information for final recognition [38]. Moreover, the human visual system [20] contains different subdivisions for different tasks. Inspired by the above visual perception mechanism, we investigate a new task, termed Model Disassembling and Assembling (MDA-Task), which can disassemble deep models into independent parts and assemble those parts into a new deep model without performance cost, like playing with LEGO toys. To this end, we propose a feature route attribution technique (FRAT) for disassembling CNN classifiers in this paper. In FRAT, the positive derivatives of the predicted class probability w.r.t. the feature maps are adopted to locate the critical features in each layer. Then, relevance analysis between the critical features and preceding/subsequent parameter layers is adopted to bridge the route between two adjacent parameter layers. In the assembling phase, class-wise components of each layer are assembled into a new deep model for a specific task. Extensive experiments demonstrate that the assembled CNN classifier can achieve accuracy close to the original classifier without any fine-tuning, and exceed the original performance with one epoch of fine-tuning. Furthermore, we conduct extensive experiments to verify the broad applications of MDA-Task in model decision route visualization, model compression, knowledge distillation, transfer learning, incremental learning, and so on.
- Published
- 2022
29. Imbalanced Sample Generation and Evaluation for Power System Transient Stability Using CTGAN
- Author
- Han, Gengshi, Liu, Shunyu, Chen, Kaixuan, Yu, Na, Feng, Zunlei, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
Although deep learning has achieved impressive advances in transient stability assessment of power systems, insufficient and imbalanced samples still limit the training effectiveness of data-driven methods. This paper proposes a controllable sample generation framework based on the Conditional Tabular Generative Adversarial Network (CTGAN) to generate specified transient stability samples. To fit the complex feature distribution of the transient stability samples, the proposed framework first models the samples as tabular data and uses Gaussian mixture models to normalize the tabular data. Then we transform multiple conditions into a single conditional vector to enable multi-conditional generation. Furthermore, this paper introduces three evaluation metrics to verify the quality of samples generated by the proposed framework. Experimental results on the IEEE 39-bus system show that the proposed framework effectively balances the transient stability samples and significantly improves the performance of transient stability assessment models.
- Published
- 2021
30. Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and Treating CNN Classifiers
- Author
- Feng, Zunlei, Hu, Jiacong, Wu, Sai, Yu, Xiaotian, Song, Jie, and Song, Mingli
- Subjects
- FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, General Medicine, Machine Learning (cs.LG)
- Abstract
Recently, the Convolutional Neural Network (CNN) has achieved excellent performance in classification tasks. However, CNNs are widely regarded as 'black boxes': their prediction mechanism is hard to understand and their wrong predictions are hard to debug. Some model debugging and explanation works have been developed to address these drawbacks. However, those methods focus on explaining and diagnosing possible causes of model predictions, based on which researchers handle the subsequent optimization of models manually. In this paper, we propose the first completely automatic model diagnosing and treating tool, termed Model Doctor. Based on two discoveries, namely that 1) each category is only correlated with sparse and specific convolution kernels, and 2) adversarial samples are isolated while normal samples are successive in the feature space, a simple aggregate gradient constraint is devised for effectively diagnosing and optimizing CNN classifiers. The aggregate gradient strategy is a versatile module for mainstream CNN classifiers. Extensive experiments demonstrate that the proposed Model Doctor applies to all existing CNN classifiers, and improves the accuracy of 16 mainstream CNN classifiers by 1%-5%. (Accepted by AAAI 2022)
- Published
- 2021
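The first discovery quoted in the abstract above (each category correlates with sparse, specific kernels) suggests a simple diagnosis step: aggregate per-sample gradients of a category's score with respect to each channel and look at which channels dominate. The sketch below is a hypothetical illustration of that step only; the shapes, the ranking rule, and the function name are assumptions, not the paper's code:

```python
def category_kernel_relevance(per_sample_grads, top_k=3):
    """Rank convolution channels by mean absolute gradient of one category's
    score, aggregated over samples. Toy illustration of a gradient-aggregation
    diagnosis; `per_sample_grads` is a list of per-channel gradient lists,
    one inner list per sample."""
    num_channels = len(per_sample_grads[0])
    agg = [sum(abs(g[c]) for g in per_sample_grads) / len(per_sample_grads)
           for c in range(num_channels)]
    ranked = sorted(range(num_channels), key=lambda c: agg[c], reverse=True)
    return ranked[:top_k], agg
```

If the sparsity discovery holds, `agg` should be dominated by a few channels for each category, and a constraint can then concentrate that category's gradient on those channels.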
31. Explainable Fragment‐Based Molecular Property Attribution.
- Author
- Jia, Lingxiang, Feng, Zunlei, Zhang, Haotian, Song, Jie, Zhong, Zipeng, Yao, Shaolun, and Song, Mingli
- Subjects
- ATTRIBUTION (Social psychology), DEEP learning, DRUG discovery, DRUG development
- Abstract
The "AI & Drug Discovery" paradigm has significantly promoted drug development and achieved excellent performance, especially with the rapid development of deep learning, making remarkable contributions to protecting human physiological health. However, due to the "black‐box" characteristic of deep learning models, the decision routes and predicted results in different research stages assisted by deep models are usually unexplainable, limiting their application in practice and more in‐depth research in drug discovery. Focusing on drug molecules, an explainable fragment‐based molecular property attribution technique is proposed for analyzing the influence of particular molecule fragments on properties and the relationships among molecular properties. Quantitative experiments on 42 benchmark property tasks demonstrate that 325 attribution fragments, which account for 90% of the overall attribution results obtained by the proposed method, have positive relevance to the corresponding property tasks. More impressively, most of the randomly selected attribution results are consistent with existing mechanistic explanations. The discovery mentioned above provides a reference standard for assisting researchers in developing more specific and practical drug molecule studies, such as synthesizing molecules with a targeted property using a fragment obtained from the attribution method. An interactive preprint version of the article can be found at: https://www.authorea.com/doi/full/10.22541/au.165279262.29589148. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
32. Root-aligned SMILES: a tight representation for chemical reaction prediction.
- Author
- Zhong, Zipeng, Song, Jie, Feng, Zunlei, Liu, Tiantao, Jia, Lingxiang, Yao, Shaolun, Wu, Min, Hou, Tingjun, and Song, Mingli
- Published
- 2022
- Full Text
- View/download PDF
33. CoEvo-Net: Coevolution Network for Video Highlight Detection.
- Author
- Chen, Jiawei, Wang, Jian, Wang, Xinchao, Wang, Xingen, Feng, Zunlei, Liu, Ruitao, and Song, Mingli
- Subjects
- COEVOLUTION, VIDEOS, VIDEO coding
- Abstract
Video highlight detection (VHD) has emerged as a pressing task due to the unprecedented growth in the amount of video data, such as that from e-commerce live-broadcasting platforms. Many approaches focus on exploiting text data, in the form of video descriptions or time-sync comments, to facilitate the VHD task. Despite the promising results, they have largely overlooked the noise inherent in the text data and have mostly relied on isolated features of text and video. In this paper, we introduce a novel model to handle VHD, termed Coevolution Network (CoEvo-Net), that allows us to account for joint learning of the language and video features explicitly via a coevolution paradigm, in which features from the two data modalities progressively refine each other. This is achieved by a dedicated CoEvo-Cell that takes language and video together as inputs, extracts cross-modality features, and filters out the undesired parts of the input, such as noisy words in a sentence. Furthermore, we release a large-scale e-commerce dataset for VHD, in which each video is coupled with a descriptive sentence, to benchmark sentence-based VHD approaches. Extensive experiments on the released dataset demonstrate that CoEvo-Net achieves state-of-the-art performance. Our dataset and code will be made publicly available. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
34. Unsupervised Facial Action Unit Intensity Estimation via Differentiable Optimization
- Author
-
Song, Xinhui, Shi, Tianyang, Shao, Tianjia, Yuan, Yi, Feng, Zunlei, and Fan, Changjie
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Computer Vision and Pattern Recognition - Abstract
The automatic intensity estimation of facial action units (AUs) from a single image plays a vital role in facial analysis systems. One big challenge for data-driven AU intensity estimation is the lack of sufficient AU label data. Because AU annotation requires strong domain expertise, it is expensive to construct an extensive database from which to learn deep models. The limited number of labeled AUs, as well as identity differences and pose variations, further increases the estimation difficulty. Considering all these difficulties, we propose an unsupervised framework, GE-Net, for facial AU intensity estimation from a single image, without requiring any annotated AU data. Our framework performs differentiable optimization, iteratively updating the facial parameters (i.e., head pose, AU parameters, and identity parameters) to match the input image. GE-Net consists of two modules: a generator and a feature extractor. The generator learns to "render" a face image from a set of facial parameters in a differentiable way, and the feature extractor extracts deep features for measuring the similarity between the rendered image and the input real image. After the two modules are trained and fixed, the framework searches for the optimal facial parameters by minimizing the difference between the features extracted from the rendered image and those from the input image. Experimental results demonstrate that our method achieves state-of-the-art results compared with existing methods.
- Published
- 2020
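The analysis-by-synthesis loop GE-Net describes (render from parameters, extract features, minimize the feature distance over facial parameters) can be sketched generically. The sketch below is an illustrative stand-in, not the paper's method: all function names are assumptions, and it uses finite differences where GE-Net relies on a trained differentiable generator.

```python
import numpy as np

def fit_parameters(render, extract, target_image, theta0,
                   lr=0.1, steps=200, eps=1e-4):
    """Analysis-by-synthesis loop in the spirit of GE-Net: iteratively adjust
    parameters theta so that features of render(theta) match features of the
    target image. Uses finite differences instead of autodiff to stay
    dependency-free; names here are illustrative."""
    theta = np.asarray(theta0, dtype=float)
    target_feat = extract(target_image)

    def loss(t):
        return np.sum((extract(render(t)) - target_feat) ** 2)

    for _ in range(steps):
        grad = np.zeros_like(theta)
        for i in range(len(theta)):
            # central finite-difference estimate of d(loss)/d(theta_i)
            d = np.zeros_like(theta)
            d[i] = eps
            grad[i] = (loss(theta + d) - loss(theta - d)) / (2 * eps)
        theta -= lr * grad
    return theta
```

With a toy identity "renderer" the loop recovers the target parameters exactly, which is the degenerate case of the matching objective above.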
35. Semantic Regularization: Improve Few-shot Image Classification by Reducing Meta Shift
- Author
-
Chen, Da, Yang, Yongliang, Feng, Zunlei, Wu, Xiang, Song, Mingli, Li, Wenbin, He, Yuan, Xue, Hui, and Mao, Feng
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Few-shot image classification requires a classifier to cope robustly with unseen classes even when only a few samples are available for each class. Recent advances benefit from the meta-learning process, where episodic tasks are formed to train a model that can adapt to class changes. However, these tasks are independent of each other, and existing works mainly rely on the limited samples of each task's individual support set. This strategy leads to severe meta-shift issues across multiple tasks, meaning the learned prototypes or class descriptors are not stable, as each task involves only its own support set. To avoid this problem, we propose a concise Semantic Regularization Network to learn a common semantic space under the meta-learning framework. In this space, all class descriptors can be regularized by the learned semantic basis, which effectively solves the meta-shift problem. The key is to train a class encoder and decoder structure that encodes the sample embedding features into the semantic domain with the trained semantic basis, and generates a more stable and general class descriptor from the decoder. We evaluate our work through extensive comparisons with previous methods on three benchmark datasets (MiniImageNet, TieredImageNet, and CUB). The results show that the semantic regularization module improves performance by 4%-7% over the baseline method and achieves competitive results against current state-of-the-art models.
- Published
- 2019
36. Development of a Deep Learning Model to Assist With Diagnosis of Hepatocellular Carcinoma.
- Author
-
Feng, Shi, Yu, Xiaotian, Liang, Wenjie, Li, Xuejie, Zhong, Weixiang, Hu, Wanwan, Zhang, Han, Feng, Zunlei, Song, Mingli, Zhang, Jing, and Zhang, Xiuming
- Subjects
DEEP learning ,HEPATOCELLULAR carcinoma ,CONVOLUTIONAL neural networks ,NOISE control ,DIAGNOSIS ,HISTOPATHOLOGY - Abstract
Background: An accurate pathological diagnosis of hepatocellular carcinoma (HCC), one of the malignant tumors with the highest mortality rates, is time-consuming and heavily reliant on the experience of a pathologist. In this report, we propose a deep learning model for HCC diagnosis and classification that requires minimal noise reduction or manual annotation by an experienced pathologist. Methods: We collected whole-slide images of hematoxylin and eosin-stained pathological slides from 592 HCC patients at the First Affiliated Hospital, College of Medicine, Zhejiang University between 2015 and 2020, and propose a noise-specific deep learning model. The model was trained initially with 137 cases cropped into multi-scale datasets. Patch screening and dynamic label smoothing strategies were adopted to handle histopathological liver images with noisy annotations from the perspectives of both input and output. The model was then tested on an independent cohort of 455 cases with comparable tumor types and differentiations. Results: Exhaustive experiments demonstrated that our two-step method achieved 87.81% pixel-level accuracy and 98.77% slide-level accuracy on the test dataset. Furthermore, the generalization performance of our model was verified on The Cancer Genome Atlas dataset, which contains 157 HCC pathological slides, achieving an accuracy of 87.90%. Conclusions: The noise-specific histopathological classification model of HCC based on deep learning is effective for datasets with noisy annotations, and it significantly improves pixel-level accuracy over a regular convolutional neural network (CNN) model. Moreover, the model has an advantage in detecting well-differentiated HCC and microvascular invasion. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
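The "dynamic label smoothing" this abstract mentions is not spelled out; the sketch below shows one plausible reading of the idea, in which the smoothing factor for each patch grows as the model's probability for the annotated class shrinks, softening supervision on likely-noisy labels. All names and the confidence-scaled rule are assumptions, not the paper's exact formulation.

```python
import numpy as np

def smoothed_targets(labels, num_classes, epsilon):
    """Standard label smoothing: one-hot targets mixed with a uniform prior."""
    one_hot = np.eye(num_classes)[labels]
    return (1.0 - epsilon) * one_hot + epsilon / num_classes

def dynamic_label_smoothing(labels, probs, num_classes, eps_max=0.3):
    """Hypothetical dynamic variant: smooth more when the model disagrees
    with the (possibly noisy) annotation, i.e. when p(annotated class) is low.
    probs: (N, C) softmax outputs of the model on the same patches."""
    p_annotated = probs[np.arange(len(labels)), labels]
    epsilon = eps_max * (1.0 - p_annotated)          # per-sample smoothing factor
    one_hot = np.eye(num_classes)[labels]
    uniform = np.full((len(labels), num_classes), 1.0 / num_classes)
    return (1.0 - epsilon)[:, None] * one_hot + epsilon[:, None] * uniform
```

A patch the model agrees with keeps an almost one-hot target, while a suspicious annotation is pulled toward the uniform prior.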
37. Neural Style Transfer: A Review.
- Author
-
Jing, Yongcheng, Yang, Yezhou, Feng, Zunlei, Ye, Jingwen, Yu, Yizhou, and Song, Mingli
- Subjects
CONVOLUTIONAL neural networks ,ALGORITHMS - Abstract
The seminal work of Gatys et al. demonstrated the power of Convolutional Neural Networks (CNNs) in creating artistic imagery by separating and recombining image content and style. This process of using CNNs to render a content image in different styles is referred to as Neural Style Transfer (NST). Since then, NST has become a trending topic both in academic literature and industrial applications. It is receiving increasing attention and a variety of approaches are proposed to either improve or extend the original NST algorithm. In this paper, we aim to provide a comprehensive overview of the current progress towards NST. We first propose a taxonomy of current algorithms in the field of NST. Then, we present several evaluation methods and compare different NST algorithms both qualitatively and quantitatively. The review concludes with a discussion of various applications of NST and open problems for future research. A list of papers discussed in this review, corresponding codes, pre-trained models and more comparison results are publicly available at: https://osf.io/f8tu4/. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
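As a concrete reminder of the formulation this review surveys, the Gatys-style objective fits in a few lines of numpy: content is matched by a feature-map MSE, and style by matching Gram matrices of the feature maps. This single-layer sketch is a simplification; actual NST sums both losses over several VGG layers and optimizes the generated image by gradient descent.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a C x H x W feature map: channel-wise correlations
    that capture style statistics, as in Gatys et al."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)   # normalized so the loss scale is size-independent

def nst_loss(content_feat, style_feat, generated_feat, alpha=1.0, beta=1e3):
    """Weighted sum of content loss (feature MSE) and style loss (Gram MSE)
    for one layer's feature maps."""
    content_loss = np.mean((generated_feat - content_feat) ** 2)
    style_loss = np.mean((gram_matrix(generated_feat) - gram_matrix(style_feat)) ** 2)
    return alpha * content_loss + beta * style_loss
```

The alpha/beta weighting is the familiar content-versus-style trade-off knob discussed throughout the NST literature.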
38. Finding intrinsic color themes in images with human visual perception.
- Author
-
Feng, Zunlei, Yuan, Wolong, Fu, Chunli, Lei, Jie, and Song, Mingli
- Subjects
- *
VISUAL perception , *IMAGE color analysis , *COMPUTER algorithms , *ITERATIVE methods (Mathematics) , *LINEAR statistical models - Abstract
Extracting color themes from an image means obtaining a color palette consisting of the image's dominant colors. In this article, we construct a color network to build the intrinsic connections among the color information of pixels. By applying an improved simple linear iterative clustering (SLIC) algorithm [1,2], we obtain initial color themes. In the following stage, by learning from human-extracted color themes, we obtain the final sorted color theme result. Experimental results demonstrate that our model outperforms previous approaches in terms of the number of themes, span, and accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
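The paper's pipeline (a color network over SLIC superpixels, then learning-based sorting) is more involved than a catalog entry can show. As a point of comparison, a minimal palette extractor can be written with plain k-means over pixels, sorting themes by cluster size; everything here, including the sort-by-dominance rule, is an illustrative baseline rather than the paper's method.

```python
import numpy as np

def extract_palette(pixels, k=5, iters=20, seed=0):
    """Baseline palette extraction: Lloyd's k-means over (N, 3) RGB pixels.
    Returns cluster centers (the palette) and pixel counts, both sorted by
    dominance. The paper instead clusters SLIC superpixels via a color network."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), k, replace=False)].astype(float)
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(iters):
        # assign each pixel to its nearest center, then recompute centers
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = pixels[mask].mean(axis=0)
    counts = np.bincount(labels, minlength=k)
    order = np.argsort(-counts)          # most dominant theme first
    return centers[order], counts[order]
```

On an image whose pixels cluster around a few colors, the returned centers approximate those dominant colors.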
39. Which face is more attractive?
- Author
-
Lei, Jie, Feng, Zunlei, Song, Mingli, and Tao, Dacheng
- Published
- 2016
- Full Text
- View/download PDF
40. Scale insensitive and focus driven mobile screen defect detection in industry.
- Author
-
Lei, Jie, Gao, Xin, Feng, Zunlei, Qiu, Huamou, and Song, Mingli
- Subjects
- *
CELL phones , *COMPUTER input-output equipment , *RECURRENT neural networks , *END-to-end delay , *STATISTICAL accuracy - Abstract
With the widespread adoption of smartphones, the mobile phone screen has become an important I/O device in HCI, and its quality matters greatly in interaction. Traditional defect detection involves heavy labor costs or relies on unstable low-level features, and it suffers from both scale- and model-sensitivity problems. Screen defects vary in size, shape, and intensity and are hard to describe, so an efficient and accurate detection system remains an urgent need in mobile phone screen manufacturing. In this paper, we propose an end-to-end screen defect detection framework. We first design a defect detection network with merging and splitting strategies (MSDDN) to deal with the many size and shape variations of defect image patches. After training, the feature maps of the last layer before the output of MSDDN can be regarded as good representations of a screen image patch; these feature maps are concatenated into a unified feature vector. We then train a recurrent neural network (SCN) that takes these feature vectors as input and decides which screen image patch in a sequence, where the patches are cropped from the same image, is the most likely to contain defects. As SCN emphasizes the comparison of image patches from one image, it is less sensitive to different screen batches. The patch with the highest probability of containing defects, called the focus area, is further processed with a sliding window and fed into MSDDN to produce the final results. Finally, to improve the efficiency of the calculation process and fulfill real industrial demands, we perform both filter selection and weight quantization on the weights of MSDDN to build a low-precision version of the network without a great loss in accuracy (MSDDN-l). Experimental results show that MSDDN handles defect variations better than traditional models and general-purpose convolutional neural networks, SCN accurately predicts the focus area, and MSDDN-l greatly improves efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
41. HairStyle Editing via Parametric Controllable Strokes.
- Author
-
Song X, Liu C, Zheng Y, Feng Z, Li L, Zhou K, and Yu X
- Abstract
In this work, we propose a stroke-based hairstyle editing network, dubbed HairstyleNet, allowing users to conveniently change the hairstyles of an image in an interactive fashion. Different from previous works, we simplify the hairstyle editing process where users can manipulate local or entire hairstyles by adjusting the parameterized hair regions. Our HairstyleNet consists of two stages: a stroke parameterization stage and a stroke-to-hair generation stage. In the stroke parameterization stage, we first introduce parametric strokes to approximate the hair wisps, where the stroke shape is controlled by a quadratic Bézier curve and a thickness parameter. Since rendering strokes with thickness to an image is not differentiable, we opt to leverage a neural renderer to construct the mapping from stroke parameters to a stroke image. Thus, the stroke parameters can be directly estimated from hair regions in a differentiable way, enabling us to flexibly edit the hairstyles of input images. In the stroke-to-hair generation stage, we design a hairstyle refinement network that first encodes coarsely composed images of hair strokes, face, and background into latent representations and then generates high-fidelity face images with desirable new hairstyles from the latent codes. Extensive experiments demonstrate that our HairstyleNet achieves state-of-the-art performance and allows flexible hairstyle manipulation.
- Published
- 2024
- Full Text
- View/download PDF
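The parametric stroke at the heart of HairstyleNet, a quadratic Bézier curve plus a thickness parameter, is easy to make concrete. The sketch below only evaluates the curve; as the abstract notes, rasterizing a thick stroke is not differentiable, which is why the paper maps stroke parameters to an image with a neural renderer instead.

```python
import numpy as np

def quad_bezier(p0, p1, p2, t):
    """Point(s) on a quadratic Bézier curve with control points p0, p1, p2
    at parameter t in [0, 1]."""
    t = np.asarray(t, dtype=float)[..., None]
    return (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2

def stroke_centerline(p0, p1, p2, n=32):
    """Sample the centerline of a hair stroke as an (n, 2) polyline; the
    paper pairs such a curve with a thickness parameter before rendering."""
    return quad_bezier(p0, p1, p2, np.linspace(0.0, 1.0, n))
```

Adjusting the three control points bends the wisp, which is exactly the kind of local manipulation the editing interface exposes.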
42. Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning.
- Author
-
Liu S, Song J, Zhou Y, Yu N, Chen K, Feng Z, and Song M
- Abstract
Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method to disentangle the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities. OPT facilitates filtering the noisy interactions between irrelevant entities and thus significantly improves generalizability as well as interpretability. Specifically, OPT introduces a sparse disagreement mechanism to encourage sparsity and diversity among discovered interaction prototypes. Then the model selectively restructures these prototypes into a compact interaction pattern by an aggregator with learnable weights. To alleviate the training instability issue caused by partial observability, we propose to maximize the mutual information between the aggregation weights and the historical behaviors of each agent. Experiments on single-task, multi-task and zero-shot benchmarks demonstrate that the proposed method yields results superior to the state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/OPT.
- Published
- 2024
- Full Text
- View/download PDF
43. Transition Propagation Graph Neural Networks for Temporal Networks.
- Author
-
Zheng T, Feng Z, Zhang T, Hao Y, Song M, Wang X, Wang X, Zhao J, and Chen C
- Abstract
Researchers of temporal networks (e.g., social networks and transaction networks) have been interested in mining the dynamic patterns of nodes from their diverse interactions. Inspired by recently powerful graph mining methods like skip-gram models and graph neural networks (GNNs), existing approaches focus on generating temporal node embeddings sequentially from nodes' sequential interactions. However, the sequential modeling of previous approaches cannot handle the transition structure between nodes' neighbors with limited memorization capacity. An effective method for transition structures is required to both model nodes' personalized patterns adaptively and capture node dynamics accordingly. In this article, we propose transition propagation graph neural networks (TIP-GNN) to tackle the challenge of encoding nodes' transition structures. The proposed TIP-GNN focuses on the bilevel graph structure in temporal networks: besides the explicit interaction graph, a node's sequential interactions can also be constructed as a transition graph. Based on the bilevel graph, TIP-GNN further encodes transition structures by multistep transition propagation and distills information from neighborhoods by a bilevel graph convolution. Experimental results over various temporal networks reveal the efficiency of our TIP-GNN, with improvements of up to 7.2% in accuracy on temporal link prediction. Extensive ablation studies further verify the effectiveness and limitations of the transition propagation module. Our code is available at https://github.com/doujiang-zheng/TIP-GNN.
- Published
- 2024
- Full Text
- View/download PDF
44. Conservative-Progressive Collaborative Learning for Semi-Supervised Semantic Segmentation.
- Author
-
Fan S, Zhu F, Feng Z, Lv Y, Song M, and Wang FY
- Abstract
Pseudo supervision is regarded as the core idea in semi-supervised learning for semantic segmentation, and there is always a tradeoff between using only the high-quality pseudo labels and leveraging all of them. To address this, we propose a novel learning approach called Conservative-Progressive Collaborative Learning (CPCL), in which two predictive networks are trained in parallel and pseudo supervision is implemented based on both the agreement and the disagreement of the two predictions. One network seeks common ground via intersection supervision and is supervised by the high-quality labels to ensure more reliable supervision, while the other network reserves differences via union supervision and is supervised by all the pseudo labels to keep exploring with curiosity. The collaboration of conservative evolution and progressive exploration can thus be achieved. To reduce the influence of suspicious pseudo labels, the loss is dynamically re-weighted according to prediction confidence. Extensive experiments demonstrate that CPCL achieves state-of-the-art performance for semi-supervised semantic segmentation.
- Published
- 2023
- Full Text
- View/download PDF
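The intersection/union supervision and confidence re-weighting described in the abstract can be sketched roughly as follows. The function and variable names, the thresholding rule, and the tie-break for the union label are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def cpcl_pseudo_supervision(probs_a, probs_b, tau=0.8):
    """Sketch of CPCL-style pseudo supervision on unlabeled data.
    probs_*: (N, C) softmax outputs of the two networks.
    Returns pseudo labels plus confidence-based loss weights for the
    conservative (intersection) and progressive (union) branches."""
    pred_a, pred_b = probs_a.argmax(1), probs_b.argmax(1)
    conf = np.maximum(probs_a.max(1), probs_b.max(1))
    agree = pred_a == pred_b
    # Conservative branch: supervise only where the networks agree confidently.
    w_conservative = np.where(agree & (conf > tau), conf, 0.0)
    # Progressive branch: keep every pseudo label, down-weighted by confidence.
    w_progressive = conf
    # Union label: take the more confident network's prediction (a tie-break
    # chosen here for illustration).
    pseudo = np.where(probs_a.max(1) >= probs_b.max(1), pred_a, pred_b)
    return pseudo, w_conservative, w_progressive
```

Zeroed conservative weights on disagreement pixels realize the "intersection" idea, while the progressive branch still sees every pixel, matching the conservative/progressive split the abstract describes.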
45. Knowledge Amalgamation for Object Detection With Transformers.
- Author
-
Zhang H, Mao F, Xue M, Fang G, Feng Z, Song J, and Song M
- Abstract
Knowledge amalgamation (KA) is a novel deep model reusing task that aims to transfer knowledge from several well-trained teachers to a multi-talented and compact student. Currently, most such approaches are tailored for convolutional neural networks (CNNs). However, Transformers, with a completely different architecture, are starting to challenge the dominance of CNNs in many computer vision tasks, and directly applying previous KA methods to Transformers leads to severe performance degradation. In this work, we explore a more effective KA scheme for Transformer-based object detection models. Specifically, considering the architectural characteristics of Transformers, we propose to dissolve the KA into two aspects: sequence-level amalgamation (SA) and task-level amalgamation (TA). In particular, within sequence-level amalgamation, a hint is generated by concatenating teacher sequences instead of redundantly aggregating them into a fixed-size one, as previous KA approaches do. Besides, the student efficiently learns heterogeneous detection tasks through soft targets in task-level amalgamation. Extensive experiments on PASCAL VOC and COCO have shown that sequence-level amalgamation significantly boosts the performance of students, whereas previous methods impair them. Moreover, Transformer-based students excel at learning amalgamated knowledge: they master heterogeneous detection tasks rapidly and achieve performance superior or at least comparable to that of the teachers in their specializations.
- Published
- 2023
- Full Text
- View/download PDF