Author: "Li, Baopu" / Topic: fos: computer and information sciences - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Li, Baopu"' showing total 21 results

1. Multi-view Vision-Prompt Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning?

Author: Peng, Haoyang, Li, Baopu, Zhang, Bo, Chen, Xin, Chen, Tao, and Zhu, Hongyuan
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Point cloud based 3D deep models have wide applications in areas such as autonomous driving, household robots, and so on. Inspired by the recent prompt learning in natural language processing, this work proposes a novel Multi-view Vision-Prompt Fusion Network (MvNet) for few-shot 3D point cloud classification. MvNet investigates the possibility of leveraging the off-the-shelf 2D pre-trained models to achieve the few-shot classification, which can alleviate the over-dependence issue of the existing baseline models towards the large-scale annotated 3D point cloud data. Specifically, MvNet first encodes a 3D point cloud into multi-view image features for a number of different views. Then, a novel multi-view prompt fusion module is developed to effectively fuse information from different views to bridge the gap between 3D point cloud data and 2D pre-trained models. A set of 2D image prompts can then be derived to better describe the suitable prior knowledge for a large-scale pre-trained image model for few-shot 3D point cloud classification. Extensive experiments on ModelNet, ScanObjectNN, and ShapeNet datasets demonstrate that MvNet achieves new state-of-the-art performance for 3D few-shot point cloud image classification. The source code of this work will be available soon., 10 pages, 5 figures
Published: 2023

2. ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design

Author: You, Haoran, Sun, Zhanyi, Shi, Huihong, Yu, Zhongzhi, Zhao, Yang, Zhang, Yongan, Li, Chaojian, Li, Baopu, and Lin, Yingyan
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Hardware Architecture (cs.AR), Computer Science - Computer Vision and Pattern Recognition, Computer Science - Hardware Architecture, Machine Learning (cs.LG)
Abstract: Vision Transformers (ViTs) have achieved state-of-the-art performance on various vision tasks. However, ViTs' self-attention module is still arguably a major bottleneck, limiting their achievable hardware efficiency. Meanwhile, existing accelerators dedicated to NLP Transformers are not optimal for ViTs. This is because there is a large difference between ViTs and NLP Transformers: ViTs have a relatively fixed number of input tokens, whose attention maps can be pruned by up to 90% even with fixed sparse patterns; while NLP Transformers need to handle input sequences of varying numbers of tokens and rely on on-the-fly predictions of dynamic sparse attention patterns for each input to achieve a decent sparsity (e.g., >=50%). To this end, we propose a dedicated algorithm and accelerator co-design framework dubbed ViTCoD for accelerating ViTs. Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps to have either denser or sparser fixed patterns for regularizing two levels of workloads without hurting the accuracy, largely reducing the attention computations while leaving room for alleviating the remaining dominant data movements; on top of that, we further integrate a lightweight and learnable auto-encoder module to enable trading the dominant high-cost data movements for lower-cost computations. On the hardware level, we develop a dedicated accelerator to simultaneously coordinate the enforced denser/sparser workloads and encoder/decoder engines for boosted hardware utilization. Extensive experiments and ablation studies validate that ViTCoD largely reduces the dominant data movement costs, achieving speedups of up to 235.3x, 142.9x, 86.0x, 10.1x, and 6.8x over general computing platforms CPUs, EdgeGPUs, GPUs, and prior-art Transformer accelerators SpAtten and Sanger under an attention sparsity of 90%, respectively., Accepted to HPCA 2023
Published: 2023

3. Effective Invertible Arbitrary Image Rescaling

Author: Pan, Zhihong, Li, Baopu, He, Dongliang, Wu, Wenhao, and Ding, Errui
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Great successes have been achieved using deep learning techniques for image super-resolution (SR) with fixed scales. To increase its real world applicability, numerous models have also been proposed to restore SR images with arbitrary scale factors, including asymmetric ones where images are resized to different scales along horizontal and vertical directions. Though most models are only optimized for the unidirectional upscaling task while assuming a predefined downscaling kernel for low-resolution (LR) inputs, recent models based on Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly. However, limited by the INN architecture, it is constrained to fixed integer scale factors and requires one model for each scale. Without increasing model complexity, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model in this work. Using innovative components like position-aware scale encoding and preemptive channel splitting, the network is optimized to convert the non-invertible rescaling cycle to an effectively invertible process. It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs. It is also demonstrated to perform well on tests with asymmetric scales using the same network architecture.
Published: 2023

4. $β$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search

Author: Ye, Peng, He, Tong, Li, Baopu, Chen, Tao, Bai, Lei, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Machine Learning (cs.LG)
Abstract: Neural Architecture Search has attracted increasing attention in recent years. Among them, differentiable NAS approaches such as DARTS have gained popularity for their search efficiency. However, they still suffer from three main issues, namely, the weak stability due to the performance collapse, the poor generalization ability of the searched architectures, and the inferior robustness to different kinds of proxies. To solve the stability and generalization problems, a simple-but-effective regularization method, termed as Beta-Decay, is proposed to regularize the DARTS-based NAS searching process (i.e., $β$-DARTS). Specifically, Beta-Decay regularization can impose constraints to keep the value and variance of activated architecture parameters from being too large, thereby ensuring fair competition among architecture parameters and making the supernet less sensitive to the impact of input on the operation set. In-depth theoretical analyses on how it works and why it works are provided. Comprehensive experiments validate that Beta-Decay regularization can help to stabilize the searching process and make the searched network more transferable across different datasets. To address the robustness problem, we first benchmark different NAS methods under a wide range of proxy data, proxy channels, proxy layers and proxy epochs, since the robustness of NAS under different kinds of proxies has not been explored before. We then draw some interesting findings and find that $β$-DARTS always achieves the best result among all compared NAS methods under almost all proxies. We further introduce the novel flooding regularization to the weight optimization of $β$-DARTS (i.e., Bi-level regularization), and experimentally and theoretically verify its effectiveness for improving the proxy robustness of differentiable NAS., arXiv admin note: substantial text overlap with arXiv:2203.01665
Published: 2023

5. Stimulative Training of Residual Networks: A Social Psychology Perspective of Loafing

Author: Ye, Peng, Tang, Shengji, Li, Baopu, Chen, Tao, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Residual networks have shown great success and become indispensable in today's deep models. In this work, we aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing, and further propose a new training strategy to strengthen the performance of residual networks. As residual networks can be viewed as ensembles of relatively shallow networks (i.e., the unraveled view) in prior works, we also start from such a view and consider that the final performance of a residual network is co-determined by a group of sub-networks. Inspired by the social loafing problem of social psychology, we find that residual networks invariably suffer from a similar problem, where sub-networks in a residual network are prone to exert less effort when working as part of the group compared to working alone. We define this previously overlooked problem as network loafing. As social loafing will ultimately cause low individual productivity and reduced overall performance, network loafing will also hinder the performance of a given residual network and its sub-networks. Referring to the solutions of social psychology, we propose stimulative training, which randomly samples a residual sub-network and calculates the KL-divergence loss between the sampled sub-network and the given residual network, to act as extra supervision for sub-networks and make the overall goal consistent. Comprehensive empirical results and theoretical analyses verify that stimulative training can well handle the loafing problem, and improve the performance of a residual network by improving the performance of its sub-networks. The code is available at https://github.com/Sunshine-Ye/NIPS22-ST., NIPS2022 accept
Published: 2022

6. SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning

Author: You, Haoran, Li, Baopu, Sun, Zhanyi, Ouyang, Xu, and Lin, Yingyan
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)
Abstract: Neural architecture search (NAS) has demonstrated amazing success in searching for efficient deep neural networks (DNNs) from a given supernet. In parallel, the lottery ticket hypothesis has shown that DNNs contain small subnetworks that can be trained from scratch to achieve a comparable or higher accuracy than original DNNs. As such, it is currently a common practice to develop efficient DNNs via a pipeline of first search and then prune. Nevertheless, doing so often requires a search-train-prune-retrain process and thus prohibitive computational cost. In this paper, we discover for the first time that both efficient DNNs and their lottery subnetworks (i.e., lottery tickets) can be directly identified from a supernet, which we term as SuperTickets, via a two-in-one training scheme with jointly architecture searching and parameter pruning. Moreover, we develop a progressive and unified SuperTickets identification strategy that allows the connectivity of subnetworks to change during supernet training, achieving better accuracy and efficiency trade-offs than conventional sparse training. Finally, we evaluate whether such identified SuperTickets drawn from one task can transfer well to other tasks, validating their potential of handling multiple tasks simultaneously. Extensive experiments and ablation studies on three tasks and four benchmark datasets validate that our proposed SuperTickets achieve boosted accuracy and efficiency trade-offs than both typical NAS and pruning pipelines, regardless of having retraining or not. Codes and pretrained models are available at https://github.com/RICE-EIC/SuperTickets., Accepted by ECCV 2022
Published: 2022

7. ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks

Author: You, Haoran, Li, Baopu, Shi, Huihong, Fu, Yonggan, and Lin, Yingyan
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: Neural networks (NNs) with intensive multiplications (e.g., convolutions and transformers) are capable yet power hungry, impeding their more extensive deployment into resource-constrained devices. As such, multiplication-free networks, which follow a common practice in energy-efficient hardware implementation to parameterize NNs with more efficient operators (e.g., bitwise shifts and additions), have gained growing attention. However, multiplication-free networks usually under-perform their vanilla counterparts in terms of the achieved accuracy. To this end, this work advocates hybrid NNs that consist of both powerful yet costly multiplications and efficient yet less powerful operators for marrying the best of both worlds, and proposes ShiftAddNAS, which can automatically search for more accurate and more efficient NNs. Our ShiftAddNAS highlights two enablers. Specifically, it integrates (1) the first hybrid search space that incorporates both multiplication-based and multiplication-free operators for facilitating the development of both accurate and efficient hybrid NNs; and (2) a novel weight sharing strategy that enables effective weight sharing among different operators that follow heterogeneous distributions (e.g., Gaussian for convolutions vs. Laplacian for add operators) and simultaneously leads to a largely reduced supernet size and much better searched networks. Extensive experiments and ablation studies on various models, datasets, and tasks consistently validate the efficacy of ShiftAddNAS, e.g., achieving up to a +7.7% higher accuracy or a +4.9 better BLEU score compared to state-of-the-art NN, while leading to up to 93% or 69% energy and latency savings, respectively. Codes and pretrained models are available at https://github.com/RICE-EIC/ShiftAddNAS., Accepted by ICML 2022
Published: 2022

8. $β$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search

Author: Ye, Peng, Li, Baopu, Li, Yikang, Chen, Tao, Fan, Jiayuan, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Machine Learning (cs.LG)
Abstract: Neural Architecture Search (NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural networks automatically. Among them, differentiable NAS approaches such as DARTS have gained popularity for their search efficiency. However, they suffer from two main issues, the weak robustness to the performance collapse and the poor generalization ability of the searched architectures. To solve these two problems, a simple-but-efficient regularization method, termed as Beta-Decay, is proposed to regularize the DARTS-based NAS searching process. Specifically, Beta-Decay regularization can impose constraints to keep the value and variance of activated architecture parameters from being too large. Furthermore, we provide in-depth theoretical analysis on how it works and why it works. Experimental results on NAS-Bench-201 show that our proposed method can help to stabilize the searching process and make the searched network more transferable across different datasets. In addition, our search scheme shows an outstanding property of being less dependent on training time and data. Comprehensive experiments on a variety of search spaces and datasets validate the effectiveness of the proposed method., CVPR2022
Published: 2022

9. Instance-aware Model Ensemble With Distillation For Unsupervised Domain Adaptation

Author: Wu, Weimin, Fan, Jiayuan, Chen, Tao, Ye, Hancheng, Zhang, Bo, and Li, Baopu
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: The linear ensemble based strategy, i.e., averaging ensemble, has been proposed to improve the performance in unsupervised domain adaptation (UDA) tasks. However, a typical UDA task is usually challenged by dynamically changing factors, such as variable weather, views, and background in the unlabeled target domain. Most previous ensemble strategies ignore UDA's dynamic and uncontrollable challenge, facing limited feature representations and performance bottlenecks. To enhance the model adaptability between domains and reduce the computational cost when deploying the ensemble model, we propose a novel framework, namely Instance aware Model Ensemble With Distillation, IMED, which fuses multiple UDA component models adaptively according to different instances and distills these components into a small model. The core idea of IMED is a dynamic instance aware ensemble strategy, where for each instance, a nonlinear fusion subnetwork is learned that fuses the extracted features and predicted labels of multiple component models. The nonlinear fusion method can help the ensemble model handle dynamically changing factors. After learning a large capacity ensemble model with good adaptability to different changing factors, we leverage the ensemble teacher model to guide the learning of a compact student model by knowledge distillation. Furthermore, we provide the theoretical analysis of the validity of IMED for UDA. Extensive experiments conducted on various UDA benchmark datasets, e.g., Office 31, Office Home, and VisDA 2017, show the superiority of the model based on IMED to the state of the art methods under the comparable computation cost., Comment: 12 pages
Published: 2022

10. GLiT: Neural Architecture Search for Global and Local Image Transformer

Author: Chen, Boyu, Li, Peixia, Li, Chuming, Li, Baopu, Bai, Lei, Lin, Chen, Sun, Ming, Yan, Junjie, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: We introduce the first Neural Architecture Search (NAS) method to find a better transformer architecture for image recognition. Recently, transformers without CNN-based backbones are found to achieve impressive performance for image recognition. However, the transformer is designed for NLP tasks and thus could be sub-optimal when directly used for image recognition. In order to improve the visual representation ability for transformers, we propose a new search space and searching algorithm. Specifically, we introduce a locality module that models the local correlations in images explicitly with lower computational cost. With the locality module, our search space is defined to let the search algorithm freely trade off between global and local information as well as optimize the low-level design choice in each module. To tackle the problem caused by the huge search space, a hierarchical neural architecture search method is proposed to search the optimal vision transformer from two levels separately with the evolutionary algorithm. Extensive experiments on the ImageNet dataset demonstrate that our method can find more discriminative and efficient transformer variants than the ResNet family (e.g., ResNet101) and the baseline ViT for image classification., Accepted by ICCV 2021
Published: 2021

11. BN-NAS: Neural Architecture Search with Batch Normalization

Author: Chen, Boyu, Li, Peixia, Li, Baopu, Lin, Chen, Li, Chuming, Sun, Ming, Yan, Junjie, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: We present BN-NAS, neural architecture search with Batch Normalization (BN-NAS), to accelerate neural architecture search (NAS). BN-NAS can significantly reduce the time required by model training and evaluation in NAS. Specifically, for fast evaluation, we propose a BN-based indicator for predicting subnet performance at a very early training stage. The BN-based indicator further facilitates us to improve the training efficiency by only training the BN parameters during the supernet training. This is based on our observation that training the whole supernet is not necessary while training only BN parameters accelerates network convergence for network architecture search. Extensive experiments show that our method can significantly shorten the time of training supernet by more than 10 times and shorten the time of evaluating subnets by more than 600,000 times without losing accuracy., ICCV 2021
Published: 2021

12. PSViT: Better Vision Transformer via Token Pooling and Attention Sharing

Author: Chen, Boyu, Li, Peixia, Li, Baopu, Li, Chuming, Bai, Lei, Lin, Chen, Sun, Ming, Yan, Junjie, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we observe two levels of redundancies when applying vision transformers (ViT) for image recognition. First, fixing the number of tokens through the whole network produces redundant features at the spatial level. Second, the attention maps among different transformer layers are redundant. Based on the observations above, we propose a PSViT: a ViT with token Pooling and attention Sharing to reduce the redundancy, effectively enhancing the feature representation ability, and achieving a better speed-accuracy trade-off. Specifically, in our PSViT, token pooling can be defined as the operation that decreases the number of tokens at the spatial level. Besides, attention sharing will be built between the neighboring transformer layers for reusing the attention maps having a strong correlation among adjacent layers. Then, a compact set of the possible combinations for different token pooling and attention sharing mechanisms are constructed. Based on the proposed compact set, the number of tokens in each layer and the choices of layers sharing attention can be treated as hyper-parameters that are learned from data automatically. Experimental results show that the proposed scheme can achieve up to 6.6% accuracy improvement in ImageNet classification compared with the DeiT.
Published: 2021

13. AutoSampling: Search for Effective Data Sampling Schedules

Author: Sun, Ming, Dou, Haoxuan, Li, Baopu, Cui, Lei, Yan, Junjie, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Data sampling plays a pivotal role in training deep learning models. However, an effective sampling schedule is difficult to learn due to the inherently high dimension of parameters in learning the sampling schedule. In this paper, we propose an AutoSampling method to automatically learn sampling schedules for model training, which consists of the multi-exploitation step aiming for optimal local sampling schedules and the exploration step for the ideal sampling distribution. More specifically, we achieve sampling schedule search with a shortened exploitation cycle to provide enough supervision. In addition, we periodically estimate the sampling distribution from the learned sampling schedules and perturb it to search in the distribution space. The combination of the two searches allows us to learn a robust sampling schedule. We apply our AutoSampling method to a variety of image classification tasks, illustrating the effectiveness of the proposed method., AutoML for sampling firstly without any assumption
Published: 2021

14. A Unified Joint Maximum Mean Discrepancy for Domain Adaptation

Author: Wang, Wei, Li, Baopu, Yang, Shuhui, Sun, Jing, Ding, Zhengming, Chen, Junyang, Dong, Xiao, Wang, Zhihui, and Li, Haojie
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
Abstract: Domain adaptation has received a lot of attention in recent years, and many algorithms have been proposed with impressive progress. However, the joint probability distribution (P(X, Y)) distance for this problem is still not fully explored, since its empirical estimation derived from the maximum mean discrepancy (joint maximum mean discrepancy, JMMD) involves a complex tensor-product operator that is hard to manipulate. To solve this issue, this paper theoretically derives a unified form of JMMD that is easy to optimize, and proves that the marginal, class conditional and weighted class conditional probability distribution distances are its special cases with different label kernels, among which the weighted class conditional one not only can realize feature alignment across domains at the category level, but also deal with imbalanced datasets using the class prior probabilities. From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence (discriminability) that benefits classification, and it is sensitive to the label distribution shift when the label kernel is the weighted class conditional one. Therefore, we leverage the Hilbert Schmidt independence criterion and propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift. Finally, we conduct extensive experiments on several cross-domain datasets to demonstrate the validity and effectiveness of the revealed theoretical results.
Published: 2021

15. Learning-based Fast Path Planning in Complex Environments

Author: Liu, Jianbang, Li, Baopu, Li, Tingguang, Chi, Wenzheng, Wang, Jiankun, and Meng, Max Q. -H.
Subjects: FOS: Computer and information sciences, Computer Science - Robotics, Robotics (cs.RO)
Abstract: In this paper, we present a novel path planning algorithm to achieve fast path planning in complex environments. Most existing path planning algorithms struggle to quickly find a feasible path in complex environments or even fail. However, our proposed framework can overcome this difficulty by using a learning-based prediction module and a sampling-based path planning module. The prediction module utilizes an auto-encoder-decoder-like convolutional neural network (CNN) to output a promising region where the feasible path probably lies. In this process, the environment is treated as an RGB image to feed into our designed CNN module, and the output is also an RGB image. No extra computation is required so that we can maintain a high processing speed of 60 frames-per-second (FPS). Incorporated with a sampling-based path planner, we can extract a feasible path from the output image so that the robot can track it from start to goal. To demonstrate the advantage of the proposed algorithm, we compare it with conventional path planning algorithms in a series of simulation experiments. The results reveal that the proposed algorithm can achieve much better performance in terms of planning time, success rate, and path length., Comment: Accepted by ROBIO2021
Published: 2021

16. AutoPruning for Deep Neural Network with Dynamic Channel Masking

Author: Li, Baopu, Fan, Yanwen, Pan, Zhihong, and Zhang, Gang
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Modern deep neural network models are large and computationally intensive. One typical solution to this issue is model pruning. However, most current pruning algorithms depend on hand crafted rules or domain expertise. To overcome this problem, we propose a learning based auto pruning algorithm for deep neural networks, which is inspired by recent automatic machine learning (AutoML). A two-objective problem that aims for the weights and the best channels for each layer is first formulated. An alternative optimization approach is then proposed to derive the optimal channel numbers and weights simultaneously. In the process of pruning, we utilize a searchable hyperparameter, remaining ratio, to denote the number of channels in each convolution layer, and then a dynamic masking process is proposed to describe the corresponding channel evolution. To control the trade-off between the accuracy of a model and the pruning ratio of floating point operations, a novel loss function is further introduced. Preliminary experimental results on benchmark datasets demonstrate that our scheme achieves competitive results for neural network pruning., 9 pages
Published: 2020

17. AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results

Author: Wei, Pengxu, Lu, Hannan, Timofte, Radu, Lin, Liang, Zuo, Wangmeng, Pan, Zhihong, Li, Baopu, Xi, Teng, Fan, Yanwen, Zhang, Gang, Liu, Jingtuo, Han, Junyu, Ding, Errui, Xie, Tangxin, Cao, Liang, Zou, Yan, Shen, Yi, Zhang, Jialiang, Jia, Yu, Cheng, Kaihua, Wu, Chenhuan, Lin, Yue, Liu, Cen, Peng, Yunbo, Zou, Xueyi, Luo, Zhipeng, Yao, Yuehan, Xu, Zhenyu, Zamir, Syed Waqas, Arora, Aditya, Khan, Salman, Hayat, Munawar, Khan, Fahad Shahbaz, Ahn, Keon-Hee, Kim, Jun-Hyuk, Choi, Jun-Ho, Lee, Jong-Seok, Zhao, Tongtong, Zhao, Shanshan, Han, Yoseob, Kim, Byung-Hoon, Baek, JaeHyun, Wu, Haoning, Xu, Dejia, Zhou, Bo, Guan, Wei, Li, Xiaobo, Ye, Chen, Li, Hao, Zhong, Haoyu, Shi, Yukai, Yang, Zhijing, Yang, Xiaojun, Li, Xin, Jin, Xin, Wu, Yaojun, Pang, Yingxue, Liu, Sen, Liu, Zhi-Song, Wang, Li-Wen, Li, Chu-Tak, Cani, Marie-Paule, Siu, Wan-Chi, Zhou, Yuanbo, Umer, Rao Muhammad, Micheloni, Christian, Cong, Xiaofeng, Gupta, Rajat, Almasri, Feras, Vandamme, Thomas, and Debeir, Olivier
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper introduces the real image Super-Resolution (SR) challenge that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2020. This challenge involves three tracks to super-resolve an input image for $\times$2, $\times$3 and $\times$4 scaling factors, respectively. The goal is to attract more attention to realistic image degradation for the SR task, which is much more complicated and challenging, and contributes to real-world image super-resolution applications. 452 participants were registered for three tracks in total, and 24 teams submitted their results. They gauge the state-of-the-art approaches for real image SR in terms of PSNR and SSIM.
Published: 2020

18. SAMOT: Switcher-Aware Multi-Object Tracking and Still Another MOT Measure

Author: Feng, Weitao, Hu, Zhihao, Li, Baopu, Gan, Weihao, Wu, Wei, and Ouyang, Wanli
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-Object Tracking (MOT) is a popular topic in computer vision. However, the identity issue, i.e., an object being wrongly associated with another object of a different identity, remains a challenging problem. To address it, switchers, i.e., confusing targets that may cause identity issues, should be focused on. Based on this motivation, this paper proposes a novel switcher-aware framework for multi-object tracking, which consists of a Spatial Conflict Graph model (SCG) and Switcher-Aware Association (SAA). The SCG eliminates spatial switchers within one frame by building a conflict graph and working out the optimal subgraph. The SAA utilizes additional information from potential temporal switchers across frames, enabling more accurate data association. Besides, we propose a new MOT evaluation measure, Still Another IDF score (SAIDF), aiming to focus more on identity issues. This new measure may overcome some problems of the previous measures and provide better insight into identity issues in MOT. Finally, the proposed framework is tested under both the traditional measures and the new measure we proposed. Extensive experiments show that our method achieves competitive results on all measures.
Published: 2020

19. UDC 2020 Challenge on Image Restoration of Under-Display Camera: Methods and Results

Author: Zhou, Yuqian, Kwan, Michael, Tolentino, Kyle, Emerton, Neil, Lim, Sehoon, Large, Tim, Fu, Lijiang, Pan, Zhihong, Li, Baopu, Yang, Qirui, Liu, Yihao, Tang, Jigang, Ku, Tao, Ma, Shibin, Hu, Bingnan, Wang, Jiarong, Puthussery, Densen, S, Hrishikesh P, Kuriakose, Melvin, C, Jiji, Sundar, Varun, Hegde, Sumanth, Kothandaraman, Divya, Mitra, Kaushik, Jassal, Akashdeep, Shah, Nisarg A., Nathan, Sabari, Rahel, Nagat Abdalla Esiad, Chen, Dafan, Nie, Shichao, Yin, Shuting, Ma, Chengconghui, Wang, Haoran, Zhao, Tongtong, Zhao, Shanshan, Rego, Joshua, Chen, Huaijin, Li, Shuai, Hu, Zhenhua, Lau, Kin Wai, Po, Lai-Man, Yu, Dahai, Rehman, Yasar Abbas Ur, Li, Yiqun, and Xing, Lianping
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: This paper is the report of the first Under-Display Camera (UDC) image restoration challenge, held in conjunction with the RLQ workshop at ECCV 2020. The challenge is based on a newly collected Under-Display Camera database. The challenge tracks correspond to two types of display: a 4k Transparent OLED (T-OLED) and a phone Pentile OLED (P-OLED). About 150 teams registered for the challenge, and eight and nine teams submitted results during the testing phase for the two tracks, respectively. The results in the paper represent state-of-the-art performance in Under-Display Camera restoration. Datasets and paper are available at https://yzhouas.github.io/projects/UDC/udc.html. 15 pages.
Published: 2020

20. Real Image Super Resolution Via Heterogeneous Model Ensemble using GP-NAS

Author: Pan, Zhihong, Li, Baopu, Xi, Teng, Fan, Yanwen, Zhang, Gang, Liu, Jingtuo, Han, Junyu, and Ding, Errui
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Image and Video Processing, Machine Learning (cs.LG)
Abstract: With advances in deep neural networks (DNNs), recent state-of-the-art (SOTA) image super-resolution (SR) methods have achieved impressive performance using deep residual networks with dense skip connections. While these models perform well on benchmark datasets where low-resolution (LR) images are constructed from high-resolution (HR) references with a known blur kernel, real image SR is more challenging when both images in the LR-HR pair are collected from real cameras. Based on existing dense residual networks, a Gaussian process based neural architecture search (GP-NAS) scheme is utilized to find candidate network architectures in a large search space by varying the number of dense residual blocks, the block size and the number of features. A suite of heterogeneous models with diverse network structures and hyperparameters is selected for model ensembling to achieve outstanding performance in real image SR. The proposed method won first place in all three tracks of the AIM 2020 Real Image Super-Resolution Challenge. Comment: This is a manuscript related to our algorithm that won the ECCV AIM 2020 Real Image Super-Resolution Challenge
Published: 2020

21. AIM 2020 Challenge on Learned Image Signal Processing Pipeline

Author: Ignatov, Andrey, Timofte, Radu, Zhang, Zhilu, Liu, Ming, Wang, Haolin, Zuo, Wangmeng, Zhang, Jiawei, Zhang, Ruimao, Peng, Zhanglin, Ren, Sijie, Dai, Linhui, Liu, Xiaohong, Li, Chengqi, Chen, Jun, Ito, Yuichi, Vasudeva, Bhavya, Deora, Puneesh, Pal, Umapada, Guo, Zhenyu, Zhu, Yu, Liang, Tian, Li, Chenghua, Leng, Cong, Pan, Zhihong, Li, Baopu, Kim, Byung-Hoon, Song, Joonyoung, Ye, Jong Chul, Baek, JaeHyun, Zhussip, Magauiya, Koishekenov, Yeskendir, Ye, Hwechul Cho, Liu, Xin, Hu, Xueying, Jiang, Jun, Gu, Jinwei, Li, Kai, Tan, Pengliang, and Hou, Bingxin
Subjects: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: This paper reviews the second AIM learned ISP challenge and provides a description of the proposed solutions and results. The participating teams were solving a real-world RAW-to-RGB mapping problem, where the goal was to map the original low-quality RAW images captured by the Huawei P20 device to the same photos obtained with the Canon 5D DSLR camera. The considered task embraced a number of complex computer vision subtasks, such as image demosaicing, denoising, white balancing, color and contrast correction, demoireing, etc. The target metric used in this challenge combined fidelity scores (PSNR and SSIM) with the solutions' perceptual results measured in a user study. The proposed solutions significantly improved the baseline results, defining the state-of-the-art for practical image signal processing pipeline modeling. Comment: Published in ECCV 2020 Workshops (Advances in Image Manipulation), https://data.vision.ee.ethz.ch/cvl/aim20/
Published: 2020
