116 results for "Huanqiang Zeng"
Search Results
2. Self-Supervised Video-Based Action Recognition With Disturbances
- Author
- Wei Lin, Xinghao Ding, Yue Huang, and Huanqiang Zeng
- Subjects
- Computer Graphics and Computer-Aided Design, Software
- Published
- 2023
- Full Text
- View/download PDF
3. Deep Cross-modal Hashing Based on Semantic Consistent Ranking
- Author
- Xiaoqing Liu, Huanqiang Zeng, Yifan Shi, Jianqing Zhu, Chih-Hsien Hsia, and Kai-Kuang Ma
- Subjects
- Signal Processing, Media Technology, Electrical and Electronic Engineering, Computer Science Applications
- Published
- 2023
- Full Text
- View/download PDF
4. Stacked One-Class Broad Learning System for Intrusion Detection in Industry 4.0
- Author
- Kaixiang Yang, Yifan Shi, Zhiwen Yu, Qinmin Yang, Arun Kumar Sangaiah, and Huanqiang Zeng
- Subjects
- Control and Systems Engineering, Electrical and Electronic Engineering, Computer Science Applications, Information Systems
- Published
- 2023
- Full Text
- View/download PDF
5. GiT: Graph Interactive Transformer for Vehicle Re-Identification
- Author
- Fei Shen, Yi Xie, Jianqing Zhu, Xiaobin Zhu, and Huanqiang Zeng
- Subjects
- Computer Vision and Pattern Recognition (cs.CV), Computer Graphics and Computer-Aided Design, Software
- Abstract
Transformers are increasingly popular in computer vision; they treat an image as a sequence of patches and learn robust global features from that sequence. However, pure transformers are not entirely suitable for vehicle re-identification, which requires both robust global features and discriminative local features. To this end, a graph interactive transformer (GiT) is proposed in this paper. In the macro view, a stack of GiT blocks builds the vehicle re-identification model, in which graphs extract discriminative local features within patches and transformers extract robust global features among patches. In the micro view, graphs and transformers interact, bringing effective cooperation between local and global features. Specifically, the current graph is embedded after the former level's graph and transformer, while the current transformer is embedded after the current graph and the former level's transformer. In addition to this interaction, the graph is a newly-designed local correction graph, which learns discriminative local features within a patch by exploring the relationships among nodes. Extensive experiments on three large-scale vehicle re-identification datasets demonstrate that our GiT method is superior to state-of-the-art vehicle re-identification approaches. (Comment: Accepted in IEEE TIP 2023)
- Published
- 2023
- Full Text
- View/download PDF
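The block interleaving described in the GiT abstract above can be sketched with placeholder arithmetic standing in for the real graph and transformer layers (all function names and update rules here are illustrative, not the authors' implementation):

```python
# Toy sketch of the GiT macro/micro structure: graphs and transformers are
# stacked and interleaved. `graph_layer` and `transformer_layer` are
# hypothetical stand-ins for the paper's modules.

def graph_layer(local_prev, global_prev):
    # Stand-in for the local correction graph: mixes each patch's local
    # feature with the corresponding global feature.
    return [(a + b) / 2 for a, b in zip(local_prev, global_prev)]

def transformer_layer(local_cur, global_prev):
    # Stand-in for self-attention: mixes each local feature with the mean
    # of the previous level's global features.
    m = sum(global_prev) / len(global_prev)
    return [(a + m) / 2 for a in local_cur]

def git_stack(patches, num_blocks=3):
    local_feat = list(patches)   # discriminative local features (within patches)
    global_feat = list(patches)  # robust global features (among patches)
    for _ in range(num_blocks):
        # The current graph follows the former level's graph AND transformer...
        local_feat = graph_layer(local_feat, global_feat)
        # ...while the current transformer follows the current graph and the
        # former level's transformer.
        global_feat = transformer_layer(local_feat, global_feat)
    return local_feat, global_feat
```

The point of the sketch is purely the dataflow: each update consumes both the local and the global stream, which is the "interactive status" the abstract describes.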
6. DCAM-Net: A Rapid Detection Network for Strip Steel Surface Defects Based on Deformable Convolution and Attention Mechanism
- Author
- Haixin Chen, Yongzhao Du, Yuqing Fu, Jianqing Zhu, and Huanqiang Zeng
- Subjects
- Electrical and Electronic Engineering, Instrumentation
- Published
- 2023
- Full Text
- View/download PDF
7. DeflickerCycleGAN: Learning to Detect and Remove Flickers in a Single Image
- Author
- Xiaodan Lin, Yangfu Li, Jianqing Zhu, and Huanqiang Zeng
- Subjects
- Computer Graphics and Computer-Aided Design, Software
- Published
- 2023
- Full Text
- View/download PDF
8. Adaptive Ensemble Clustering With Boosting BLS-Based Autoencoder
- Author
- Yifan Shi, Kaixiang Yang, Zhiwen Yu, C. L. Philip Chen, and Huanqiang Zeng
- Subjects
- Computational Theory and Mathematics, Computer Science Applications, Information Systems
- Published
- 2023
- Full Text
- View/download PDF
9. Clustering-Guided Pairwise Metric Triplet Loss for Person Reidentification
- Author
- Weiyu Zeng, Tianlei Wang, Jiuwen Cao, Jianzhong Wang, and Huanqiang Zeng
- Subjects
- Computer Networks and Communications, Hardware and Architecture, Signal Processing, Computer Science Applications, Information Systems
- Published
- 2022
- Full Text
- View/download PDF
10. A sample‐proxy dual triplet loss function for object re‐identification
- Author
- Hanxiao Wu, Fei Shen, Jianqing Zhu, Huanqiang Zeng, Xiaobin Zhu, and Zhen Lei
- Subjects
- Signal Processing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Software
- Published
- 2022
- Full Text
- View/download PDF
11. An Efficient Multiresolution Network for Vehicle Reidentification
- Author
- Canhui Cai, Jianqing Zhu, Zhen Lei, Fei Shen, Jingchang Huang, Xiaobin Zhu, and Huanqiang Zeng
- Subjects
- Computer Networks and Communications, Hardware and Architecture, Signal Processing, Computer Science Applications, Information Systems
- Abstract
In general, vehicle images have varying resolutions due to vehicles’ movements and different camera settings. However, most existing vehicle re-identification models are single-resolution deep networks trained with pre-uniformly resizing vehicle images, which underestimates adverse effects of varying resolutions and leads to unsatisfactory performance. A straightforward solution for dealing with varying resolutions is to train multiple vehicle re-identification models. Each model is independently trained with images of a specific resolution. However, this straightforward solution requires significant overhead and ignores intrinsic associations among different resolution images. For that, an efficient multi-resolution network (EMRN) is proposed for vehicle re-identification in this paper. Firstly, EMRN embeds a newly-designed multi-resolution feature dimension uniform module (MR-FDUM) behind a traditional backbone network (i.e., ResNet-50). As a result, the whole model can extract fixed dimensional features from different resolution images so that it can be trained with one loss function of fixed dimensional parameters rather than training multiple models. Secondly, a multi-resolution image randomly feeding strategy is designed to train EMRN, making each mini-batch data of a random resolution during the training process. Consequently, EMRN can implicitly learn collaborative multi-resolution features via only a unitary deep network. Experiments on three large scale datasets, i.e., VeRi776, VehicleID, and VRIC, demonstrate that EMRN is superior to state-of-the-art vehicle re-identification methods.
- Published
- 2022
- Full Text
- View/download PDF
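The two ideas in the EMRN abstract above, fixed-dimensional features from any input resolution plus randomly chosen per-batch resolutions, can be sketched on a 1-D stand-in for images (all names here are illustrative; the real MR-FDUM operates on convolutional feature maps):

```python
import random

# Sketch of EMRN's training recipe: a pooling step that maps a
# variable-length input to a fixed-length feature (playing the role of the
# MR-FDUM), and a loop that feeds each mini-batch at a random resolution.

def adaptive_avg_pool(signal, out_len=4):
    """Pool a variable-length signal down to a fixed length."""
    n = len(signal)
    out = []
    for i in range(out_len):
        lo = i * n // out_len
        hi = max(lo + 1, (i + 1) * n // out_len)
        chunk = signal[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out

random.seed(0)
resolutions = [8, 16, 32]
for step in range(3):  # three mini-batches, each at a random resolution
    res = random.choice(resolutions)
    batch = [[float(i) for i in range(res)]]          # dummy one-image batch
    feats = [adaptive_avg_pool(img) for img in batch]
    # The feature dimension stays fixed, so one loss head with fixed
    # dimensional parameters suffices instead of one model per resolution.
    assert all(len(f) == 4 for f in feats)
```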
12. Deep Coarse-to-Fine Dense Light Field Reconstruction With Flexible Sampling and Geometry-Aware Fusion
- Author
- Jie Chen, Sam Kwong, Jing Jin, Jingyi Yu, Junhui Hou, and Huanqiang Zeng
- Subjects
- Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), Applied Mathematics, Computational Theory and Mathematics, Artificial Intelligence, Software, Light field
- Abstract
A densely-sampled light field (LF) is highly desirable in various applications, such as 3-D reconstruction, post-capture refocusing and virtual reality. However, it is costly to acquire such data. Although many computational methods have been proposed to reconstruct a densely-sampled LF from a sparsely-sampled one, they still suffer from either low reconstruction quality, low computational efficiency, or the restriction on the regularity of the sampling pattern. To this end, we propose a novel learning-based method, which accepts sparsely-sampled LFs with irregular structures, and produces densely-sampled LFs with arbitrary angular resolution accurately and efficiently. We also propose a simple yet effective method for optimizing the sampling pattern. Our proposed method, an end-to-end trainable network, reconstructs a densely-sampled LF in a coarse-to-fine manner. Specifically, the coarse sub-aperture image (SAI) synthesis module first explores the scene geometry from an unstructured sparsely-sampled LF and leverages it to independently synthesize novel SAIs, in which a confidence-based blending strategy is proposed to fuse the information from different input SAIs, giving an intermediate densely-sampled LF. Then, the efficient LF refinement module learns the angular relationship within the intermediate result to recover the LF parallax structure. Comprehensive experimental evaluations demonstrate the superiority of our method on both real-world and synthetic LF images when compared with state-of-the-art methods. In addition, we illustrate the benefits and advantages of the proposed approach when applied in various LF-based applications, including image-based rendering and depth estimation enhancement., 17 pages, 11 figures, 10 tables
- Published
- 2022
- Full Text
- View/download PDF
13. Robust Maximum Mixture Correntropy Criterion Based One-Class Classification Algorithm
- Author
- Huanqiang Zeng, Baiying Lei, Jiuwen Cao, Tianlei Wang, and Haozhen Dai
- Subjects
- Artificial Intelligence, Computer Networks and Communications, One-class classification, Anomaly detection, Similarity measure, Extreme learning machine
- Abstract
One-class classification achieves anomaly/outlier detection by exploiting the characteristics of the target data. As a local similarity measure defined in kernel space, correntropy is generally more robust than the mean square error (MSE) based criterion in dealing with large outliers. In this paper, the maximum mixture correntropy criterion (MMCC) with multiple kernels is applied to the shallow and hierarchical one-class extreme learning machine (OC-ELM) to enhance model robustness and learning speed. Experiments on benchmark UCI classification datasets, an urban acoustic classification (UAC) dataset, and four synthetic datasets show the effectiveness of the proposed algorithms, and comparisons with several state-of-the-art methods demonstrate their superiority.
- Published
- 2022
- Full Text
- View/download PDF
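The robustness claim in the abstract above, correntropy tolerating large outliers better than MSE, is easy to see numerically. A minimal sketch with a two-kernel mixture (the kernel widths and mixture weights below are assumptions, not the paper's settings):

```python
import math

# Compare a mixture correntropy similarity against the MSE criterion under
# a gross outlier. The Gaussian kernels saturate for large errors, so one
# outlier barely moves the correntropy score, while MSE explodes.

def mixture_correntropy(errors, sigmas=(1.0, 4.0), weights=(0.5, 0.5)):
    total = 0.0
    for e in errors:
        for w, s in zip(weights, sigmas):
            total += w * math.exp(-(e * e) / (2 * s * s))
    return total / len(errors)  # in (0, 1]; higher means more similar

def mse(errors):
    return sum(e * e for e in errors) / len(errors)

inliers = [0.1, -0.2, 0.05, 0.15]
with_outlier = inliers + [100.0]  # one gross outlier
```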
14. A Hybrid Compression Framework for Color Attributes of Static 3D Point Clouds
- Author
- Sam Kwong, Hao Liu, Hui Yuan, Huanqiang Zeng, Junhui Hou, and Qi Liu
- Subjects
- Media Technology, Electrical and Electronic Engineering, Rate-distortion optimization, Discrete cosine transform, Point cloud, Sparse approximation
- Abstract
The emergence of 3D point clouds (3DPCs) is promoting the rapid development of immersive communication, autonomous driving, and so on. Due to the huge data volume, the compression of 3DPCs is becoming more and more attractive. We propose a novel and efficient color attribute compression method for static 3DPCs. First, a 3DPC is partitioned into several sub-point clouds by color distribution analysis. Each sub-point cloud is then decomposed into a lot of 3D blocks by an improved k-d tree-based decomposition algorithm. Afterwards, a novel virtual adaptive sampling-based sparse representation strategy is proposed for each 3D block to remove the redundancy among points, in which the bases of the graph transform (GT) and the discrete cosine transform (DCT) are used as candidates of the complete dictionary. Experimental results over 10 common 3DPCs demonstrate that the proposed method can achieve superior or comparable coding performance when compared with the current state-of-the-art methods.
- Published
- 2022
- Full Text
- View/download PDF
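The block-decomposition step in the abstract above can be sketched with a simple k-d style recursive split (widest-axis, median split; the paper's improved decomposition algorithm differs in its split criteria, and this toy keeps only the partitioning idea):

```python
# Sketch of a k-d tree decomposition of a point cloud into small 3D blocks,
# as done before per-block transform coding. Splits on the axis with the
# largest extent, at the median, until each block is small enough.

def kd_decompose(points, max_points=2):
    if len(points) <= max_points:
        return [points]
    dims = len(points[0])
    # choose the axis with the largest spatial extent
    axis = max(range(dims),
               key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return kd_decompose(pts[:mid], max_points) + kd_decompose(pts[mid:], max_points)

cloud = [(0, 0, 0), (1, 0, 0), (9, 0, 0), (10, 0, 0), (9, 5, 0)]
blocks = kd_decompose(cloud)  # every point lands in exactly one small block
```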
15. A Spatial and Geometry Feature-Based Quality Assessment Model for the Light Field Images
- Author
- Hailiang Huang, Huanqiang Zeng, Junhui Hou, Jing Chen, Jianqing Zhu, and Kai-Kuang Ma
- Subjects
- Computer Graphics and Computer-Aided Design, Software
- Abstract
This paper proposes a new full-reference image quality assessment (IQA) model for performing perceptual quality evaluation on light field (LF) images, called the spatial and geometry feature-based model (SGFM). Considering that an LF image describes both spatial and geometry information of the scene, the spatial features are extracted over the sub-aperture images (SAIs) by using the contourlet transform and exploited to reflect the spatial quality degradation of the LF images, while the geometry features are extracted across adjacent SAIs based on a 3D-Gabor filter and explored to describe the viewing consistency loss of the LF images. These schemes are motivated by the fact that the human eyes are more interested in scale, direction, and contour from the spatial perspective, and in viewing-angle variations from the geometry perspective. These operations are applied to the reference and distorted LF images independently. The degree of similarity is then computed from the above-measured quantities to jointly arrive at the final IQA score of the distorted LF image. Experimental results on three commonly-used LF IQA datasets show that the proposed SGFM is more in line with the quality assessment of LF images perceived by the human visual system (HVS), compared with multiple classical and state-of-the-art IQA models.
- Published
- 2022
- Full Text
- View/download PDF
16. Point Cloud Quality Assessment via 3D Edge Similarity Measurement
- Author
- Zian Lu, Hailiang Huang, Huanqiang Zeng, Junhui Hou, and Kai-Kuang Ma
- Subjects
- Applied Mathematics, Signal Processing, Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
17. 3D-Gradient Guided Rate Control Model for Screen Content Video Coding
- Author
- Jing Chen, Linlin Chen, Huanqiang Zeng, Chih-Hsien Hsia, Tianlei Wang, and Kai-Kuang Ma
- Subjects
- Signal Processing, Media Technology, Electrical and Electronic Engineering, Computer Science Applications
- Published
- 2022
- Full Text
- View/download PDF
18. Ensemble Clustering Based on Manifold Broad Learning System
- Author
- Yifan Shi, Dexin Chen, Longtao Chen, Kaixiang Yang, and Huanqiang Zeng
- Published
- 2022
- Full Text
- View/download PDF
19. Viewpoint robust knowledge distillation for accelerating vehicle re-identification
- Author
- Fei Shen, Yi Xie, Jianqing Zhu, and Huanqiang Zeng
- Subjects
- Vehicle re-identification, Knowledge distillation, Posterior probability, Divergence (statistics), Telecommunication, Electronics
- Abstract
Vehicle re-identification is a challenging task that matches vehicle images captured by different cameras. Recent vehicle re-identification approaches exploit complex deep networks to learn viewpoint robust features for obtaining accurate re-identification results, which incurs heavy computation in the testing phase and restricts the re-identification speed. In this paper, we propose a viewpoint robust knowledge distillation (VRKD) method for accelerating vehicle re-identification. The VRKD method consists of a complex teacher network and a simple student network. Specifically, the teacher network uses quadruple directional deep networks to learn viewpoint robust features, while the student network contains only a shallow backbone sub-network and a global average pooling layer. The student network distills viewpoint robust knowledge from the teacher network by minimizing the Kullback-Leibler divergence between the posterior probability distributions produced by the student and teacher networks. As a result, the vehicle re-identification speed is significantly accelerated, since only the student network with its small testing computation is required. Experiments on the VeRi776 and VehicleID datasets show that the proposed VRKD method outperforms many state-of-the-art vehicle re-identification approaches in both accuracy and speed.
- Published
- 2021
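The distillation objective named in the abstract above, KL divergence between teacher and student posteriors, can be sketched with toy logits (the logit values and class count are illustrative):

```python
import math

# Sketch of the distillation step: the student is trained to minimize the
# KL divergence between the teacher's softmax posterior and its own.

def softmax(logits):
    exps = [math.exp(z) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p_teacher, q_student):
    # D_KL(teacher || student): the quantity the student minimizes
    return sum(p * math.log(p / q) for p, q in zip(p_teacher, q_student))

teacher_logits = [4.0, 1.0, 0.5]
good_student_logits = [3.8, 1.1, 0.4]   # close to the teacher
bad_student_logits = [0.5, 4.0, 1.0]    # disagrees with the teacher

p = softmax(teacher_logits)
# A student whose posterior tracks the teacher's has a much smaller loss.
```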
20. A Conv-Attention Network for Detecting the Presence of ENF Signal in Short-Duration Audio
- Author
- Yangfu Li, Xiaodan Lin, Yingqiang Qiu, and Huanqiang Zeng
- Published
- 2022
- Full Text
- View/download PDF
21. Screen Content Video Quality Assessment Model Using Hybrid Spatiotemporal Features
- Author
- Huanqiang Zeng, Hailiang Huang, Junhui Hou, Jiuwen Cao, Yongtao Wang, and Kai-Kuang Ma
- Subjects
- Databases (Factual), Video Recording, Humans, Computer Graphics and Computer-Aided Design, Software, Algorithms
- Abstract
In this paper, a full-reference video quality assessment (VQA) model is designed for the perceptual quality assessment of screen content videos (SCVs), called the hybrid spatiotemporal feature-based model (HSFM). SCVs have a hybrid structure that includes both screen and natural scenes, which are perceived by the human visual system (HVS) with different visual effects. With this consideration, the three-dimensional Laplacian of Gaussian (3D-LOG) filter and three-dimensional natural scene statistics (3D-NSS) are exploited to extract the screen and natural spatiotemporal features from the reference and distorted SCV sequences separately. The similarities of these extracted features are then computed independently, generating quality scores for the screen and natural scenes. After that, an adaptive fusion scheme driven by local video activity is developed to combine the screen and natural quality scores into the final VQA score of the distorted SCV under evaluation. Experimental results on the Screen Content Video Database (SCVD) and Compressed Screen Content Video Quality (CSCVQ) databases show that the proposed HSFM is more in line with the perceptual quality assessment of SCVs perceived by the HVS, compared with a variety of classic and recent IQA/VQA models.
- Published
- 2022
22. Deep Posterior Distribution-Based Embedding for Hyperspectral Image Super-Resolution
- Author
- Jinhui Hou, Zhiyu Zhu, Junhui Hou, Huanqiang Zeng, Jinjian Wu, and Jiantao Zhou
- Subjects
- Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), Computer Graphics and Computer-Aided Design, Software
- Abstract
In this paper, we investigate the problem of hyperspectral (HS) image spatial super-resolution via deep learning. Particularly, we focus on how to embed the high-dimensional spatial-spectral information of HS images efficiently and effectively. Specifically, in contrast to existing methods adopting empirically-designed network modules, we formulate HS embedding as an approximation of the posterior distribution of a set of carefully-defined HS embedding events, including layer-wise spatial-spectral feature extraction and network-level feature aggregation. Then, we incorporate the proposed feature embedding scheme into a source-consistent super-resolution framework that is physically-interpretable, producing lightweight PDE-Net, in which high-resolution (HR) HS images are iteratively refined from the residuals between input low-resolution (LR) HS images and pseudo-LR-HS images degenerated from reconstructed HR-HS images via probability-inspired HS embedding. Extensive experiments over three common benchmark datasets demonstrate that PDE-Net achieves superior performance over state-of-the-art methods. Besides, the probabilistic characteristic of this kind of networks can provide the epistemic uncertainty of the network outputs, which may bring additional benefits when used for other HS image-based applications. The code will be publicly available at https://github.com/jinnh/PDE-Net., Accepted by IEEE Transactions on Image Processing
- Published
- 2022
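The refinement loop in the PDE-Net abstract above, iteratively correcting the HR estimate with the residual between the input LR image and a pseudo-LR image degenerated from the current HR estimate, can be sketched in 1-D. Box downsampling and nearest-neighbor upsampling stand in for the paper's learned, probability-inspired operators:

```python
# Toy 1-D sketch of source-consistent iterative refinement.

def downsample(x, factor=2):
    # box average over non-overlapping windows (degeneration operator)
    return [sum(x[i:i + factor]) / len(x[i:i + factor])
            for i in range(0, len(x), factor)]

def upsample(x, factor=2):
    # nearest-neighbor replication
    return [v for v in x for _ in range(factor)]

def refine(lr, iters=5):
    hr = [0.0] * (2 * len(lr))  # deliberately poor initial HR estimate
    for _ in range(iters):
        pseudo_lr = downsample(hr)                          # degrade HR back to LR
        residual = [a - b for a, b in zip(lr, pseudo_lr)]   # source-consistency error
        hr = [h + c for h, c in zip(hr, upsample(residual))]
    return hr

hr = refine([1.0, 3.0])  # after refinement, downsampling hr recovers the input
```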
23. A Light Field Image Quality Assessment Model Based on Symmetry and Depth Features
- Author
- Huanqiang Zeng, Yu Tian, Kai-Kuang Ma, Jing Chen, Junhui Hou, and Jianqing Zhu
- Subjects
- Media Technology, Electrical and Electronic Engineering, Image quality, Feature extraction, Luminance, Symmetry (geometry), Light field
- Abstract
This paper presents a new full-reference image quality assessment (IQA) method for conducting the perceptual quality evaluation of the light field (LF) images, called the symmetry and depth feature-based model (SDFM). Specifically, the radial symmetry transform is first employed on the luminance components of the reference and distorted LF images to extract their symmetry features for capturing the spatial quality of each view of an LF image. Second, the depth feature extraction scheme is designed to explore the geometry information inherited in an LF image for modeling its LF structural consistency across views. The similarity measurements are subsequently conducted on the comparison of their symmetry and depth features separately, which are further combined to achieve the quality score for the distorted LF image. Note that the proposed SDFM that explores the symmetry and depth features is conformable to the human vision system, which identifies the objects by sensing their structures and geometries. Extensive simulation results on the dense light fields dataset have clearly shown that the proposed SDFM outperforms multiple classical and recently developed IQA algorithms on quality evaluation of the LF images.
- Published
- 2021
- Full Text
- View/download PDF
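The final steps of the SDFM abstract above, measuring the similarity of features extracted from reference and distorted images and combining the two feature channels, can be sketched with the standard similarity formula (2ab + c) / (a² + b² + c). The equal weighting below is an assumption, not the paper's fusion rule:

```python
# Sketch of feature-similarity scoring: element-wise similarity between
# reference and distorted feature maps, averaged, then the symmetry and
# depth channels are combined into one quality score.

def feature_similarity(ref, dist, c=1e-6):
    sims = [(2 * a * b + c) / (a * a + b * b + c) for a, b in zip(ref, dist)]
    return sum(sims) / len(sims)  # 1.0 means identical features

def sdfm_score(sym_ref, sym_dist, depth_ref, depth_dist, w=0.5):
    # w weights the symmetry channel against the depth channel (assumed 0.5)
    return (w * feature_similarity(sym_ref, sym_dist)
            + (1 - w) * feature_similarity(depth_ref, depth_dist))
```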
24. Cascading Scene and Viewpoint Feature Learning for Pedestrian Gender Recognition
- Author
- Kai-Kuang Ma, Jiuwen Cao, Yongtao Wang, Jianqing Zhu, Huanqiang Zeng, and Lei Cai
- Subjects
- Computer Networks and Communications, Hardware and Architecture, Signal Processing, Computer Science Applications, Information Systems, Feature extraction, Segmentation, Feature learning
- Abstract
Pedestrian gender recognition plays an important role in smart cities. To effectively improve pedestrian gender recognition performance, a new method, called cascading scene and viewpoint feature learning (CSVFL), is proposed in this article. The novelty of the proposed CSVFL lies in the joint consideration of two crucial challenges in pedestrian gender recognition, namely, scene and viewpoint variation. For that, the proposed CSVFL starts with the scene transfer (ST) scheme, followed by the viewpoint adaptation (VA) scheme in a cascading manner. Specifically, the ST scheme exploits the key pedestrian segmentation network to extract key pedestrian masks for the subsequent key pedestrian transfer generative adversarial network, with the goal of encouraging the input pedestrian image to adopt the style of the target scene while preserving the image details of the key pedestrian as much as possible. Afterward, the scene-transferred pedestrian images are fed to train the deep feature learning network with the VA scheme, in which each neuron is enabled or disabled for different viewpoints depending on whether it contributes to the corresponding viewpoint. Extensive experiments conducted on commonly used pedestrian attribute data sets demonstrate that the proposed CSVFL approach outperforms multiple recently reported pedestrian gender recognition methods.
- Published
- 2021
- Full Text
- View/download PDF
25. Deep Rank Cross-Modal Hashing with Semantic Consistent for Image-Text Retrieval
- Author
- Xiaoqing Liu, Huanqiang Zeng, Yifan Shi, Jianqing Zhu, and Kai-Kuang Ma
- Published
- 2022
- Full Text
- View/download PDF
26. Joint Pyramid Feature Representation Network for Vehicle Re-identification
- Author
- Jinhui Hou, Jianqing Zhu, Jing Chen, Huanqiang Zeng, Jiuwen Cao, and Lin Xiangwei
- Subjects
- Computer Networks and Communications, Hardware and Architecture, Software, Information Systems, Vehicle re-identification, Intelligent transportation system, Pyramid (image processing)
- Abstract
Vehicle re-identification (Re-ID) technology plays an important role in the intelligent transportation system for smart cities. Due to various uncertain factors in real-world scenarios (e.g., resolution variation, viewpoint variation, illumination changes, and occlusion), vehicle Re-ID is a very challenging task. To resist the adverse effect of resolution variation, a joint pyramid feature representation network (JPFRN) for vehicle Re-ID is proposed in this paper. Based on the consideration that convolution blocks with different depths hold different resolutions and semantic information of the vehicle image, the proposed JPFRN first employs a base network to obtain multi-resolution vehicle features. Then, a pyramid feature representation scheme is developed to reconstruct and integrate the obtained multi-resolution vehicle features. Finally, these pyramid features are jointly represented for learning a more discriminative feature under the supervision of a joint triplet and softmax loss. Extensive experimental results on two commonly-used vehicle databases (i.e., VehicleID and VeRi) show that the proposed JPFRN is superior to multiple recently-developed vehicle Re-ID methods.
- Published
- 2020
- Full Text
- View/download PDF
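The joint supervision named in the JPFRN abstract above, triplet loss plus softmax loss, can be sketched on toy embeddings and logits (the margin and loss weighting below are assumptions):

```python
import math

# Sketch of joint triplet + softmax supervision: the triplet term pulls an
# anchor toward a same-identity positive and away from a negative, while
# the softmax cross-entropy term supervises identity classification.

def triplet_loss(anchor, positive, negative, margin=0.3):
    d = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    return max(0.0, d(anchor, positive) - d(anchor, negative) + margin)

def softmax_ce(logits, label):
    # numerically stable cross-entropy: log-sum-exp minus the true logit
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]

def joint_loss(anchor, positive, negative, logits, label, lam=1.0):
    return triplet_loss(anchor, positive, negative) + lam * softmax_ce(logits, label)
```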
27. Object Reidentification via Joint Quadruple Decorrelation Directional Deep Networks in Smart Transportation
- Author
- Jingchang Huang, Xiaoqing Ye, Baoqing Li, Zhen Lei, Lixin Zheng, Jianqing Zhu, and Huanqiang Zeng
- Subjects
- Computer Networks and Communications, Hardware and Architecture, Signal Processing, Computer Science Applications, Information Systems, Deep learning, Decorrelation, Feature learning
- Abstract
Object reidentification with the goal of matching pedestrian or vehicle images captured from different camera viewpoints is of considerable significance to public security. Quadruple directional deep learning features (QD-DLFs) can comprehensively describe object images. However, the correlation among QD-DLFs is an unavoidable problem, since QD-DLFs are learned with quadruple independent directional deep networks (QIDDNs) driven with the same training data, and each network holds the same basic deep feature learning architecture (BDFLA). The correlation among QD-DLFs is harmful to the complementarity of QD-DLFs, restricting the object reidentification performance. For that, we propose joint quadruple decorrelation directional deep networks (JQD3Ns) to reduce the correlation among the learned QD-DLFs. In order to jointly train JQD3Ns, besides the softmax loss functions, a parameter correlation cost function is proposed to indirectly reduce the correlation among QD-DLFs by enlarging the dissimilarity among the parameters of JQD3Ns. Extensive experiments on three publicly available large-scale data sets demonstrate that the proposed JQD3Ns approach is superior to multiple state-of-the-art object reidentification methods.
- Published
- 2020
- Full Text
- View/download PDF
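The parameter correlation cost in the JQD3Ns abstract above, enlarging the dissimilarity among the four branch networks' parameters, can be illustrated with a squared-cosine penalty between weight vectors (the exact cost function in the paper may differ):

```python
# Sketch of a parameter correlation cost: penalize pairwise cosine
# similarity among the weight vectors of parallel directional branches so
# that their learned features decorrelate.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def parameter_correlation_cost(branch_weights):
    cost, pairs = 0.0, 0
    for i in range(len(branch_weights)):
        for j in range(i + 1, len(branch_weights)):
            cost += cosine(branch_weights[i], branch_weights[j]) ** 2
            pairs += 1
    return cost / pairs  # 0 when all branches are mutually orthogonal

identical = [[1.0, 0.0], [1.0, 0.0]]    # fully correlated branches
orthogonal = [[1.0, 0.0], [0.0, 1.0]]   # fully decorrelated branches
```

Minimizing this term alongside the softmax losses pushes the branches apart in parameter space, which is the mechanism the abstract describes for improving feature complementarity.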
28. Body Symmetry and Part-Locality-Guided Direct Nonparametric Deep Feature Enhancement for Person Reidentification
- Author
- Jingchang Huang, Canhui Cai, Zhen Lei, Lixin Zheng, Xiaobin Zhu, Jianqing Zhu, and Huanqiang Zeng
- Subjects
- Computer Networks and Communications, Hardware and Architecture, Signal Processing, Computer Science Applications, Information Systems, Deep learning, Convolutional neural network, Feature learning, Similarity learning
- Abstract
In recent years, deep learning (DL) has been successfully and widely applied in the person reidentification (Re-ID). However, the DL-based person Re-ID methods face a bottleneck that the scales of most existing person Re-ID databases are not large enough for training very deep models. To address this problem, a body symmetry and part-locality-guided direct nonparametric deep feature enhancement (DNDFE) method is proposed in this article. Based on the observation that the body symmetry and part locality are two important appearance properties inherited in the upright walking persons, the proposed method designs two nonparametric layers, namely, the body symmetry average pooling and local normalization layers, to construct a DNDFE module to well explore the body symmetry and part locality properties. The proposed DNDFE module could be directly embedded between the traditional deep feature learning module and similarity learning module to enhance the DL features so as to improve the person Re-ID performance. The experimental results have shown that the proposed DNDFE method is superior to multiple state-of-the-art person Re-ID methods in terms of accuracy and efficiency.
- Published
- 2020
- Full Text
- View/download PDF
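The two nonparametric layers named in the DNDFE abstract above can be sketched on a 1-D stand-in for a feature row (the exact formulations in the paper may differ; the function names are illustrative):

```python
# Sketch of the two parameter-free DNDFE layers: body symmetry average
# pooling and local normalization.

def body_symmetry_avg_pool(feature_row):
    # Average each feature with its horizontally mirrored counterpart,
    # exploiting the left-right symmetry of an upright pedestrian.
    mirrored = feature_row[::-1]
    return [(a + b) / 2 for a, b in zip(feature_row, mirrored)]

def local_normalization(features, window=3, eps=1e-6):
    # Normalize each feature by the mean magnitude of its local
    # neighborhood, emphasizing part-local contrast.
    out = []
    for i, v in enumerate(features):
        lo, hi = max(0, i - window // 2), min(len(features), i + window // 2 + 1)
        local_mean = sum(abs(x) for x in features[lo:hi]) / (hi - lo)
        out.append(v / (local_mean + eps))
    return out
```

Being parameter-free, layers like these add no trainable weights, which is why they can be inserted into a pipeline without enlarging the training set requirements.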
29. 3D Point Cloud Attribute Compression Using Geometry-Guided Sparse Representation
- Author
-
Kai-Kuang Ma, Huanqiang Zeng, Hui Yuan, Junhui Hou, Shuai Gu, and School of Electrical and Electronic Engineering
- Subjects
Optimization problem ,Sparse Representation ,Computer science ,Point cloud ,3D Point Cloud ,02 engineering and technology ,Sparse approximation ,Computer Graphics and Computer-Aided Design ,Redundancy (information theory) ,Compression (functional analysis) ,Electrical and electronic engineering [Engineering] ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Entropy encoding ,Algorithm ,Software ,Block (data storage) ,Data compression - Abstract
3D point clouds with associated attributes are considered a promising paradigm for immersive communication. However, the corresponding compression schemes for this media are still in their infancy. Moreover, in contrast to conventional image/video compression, compressing 3D point cloud data is more challenging owing to its irregular structure. In this paper, we propose a novel and effective compression scheme for the attributes of voxelized 3D point clouds. In the first stage, an input voxelized 3D point cloud is divided into blocks of equal size. Then, to deal with the irregular structure of 3D point clouds, a geometry-guided sparse representation (GSR) is proposed to eliminate the redundancy within each block, formulated as an ℓ0-norm regularized optimization problem. An inter-block prediction scheme is also applied to remove the redundancy between blocks. Finally, by quantitatively analyzing the characteristics of the transform coefficients produced by GSR, an effective entropy coding strategy tailored to GSR is developed to generate the bitstream. Experimental results over various benchmark datasets show that the proposed compression scheme achieves better rate-distortion performance and visual quality compared with state-of-the-art methods.
This work was supported in part by the National Natural Science Foundation of China under Grant 61871434, Grant 61871342, and Grant 61571274, in part by the Natural Science Foundation for Outstanding Young Scholars of Fujian Province under Grant 2019J06017, in part by the Hong Kong RGC Early Career Scheme Funds under Grant 9048123, in part by the Shandong Provincial Key Research and Development Plan under Grant 2017CXGC150, in part by the Fujian-100 Talented People Program, in part by the High-level Talent Innovation Program of Quanzhou City under Grant 2017G027, in part by the Promotion Program for Young and Middle-aged Teacher in Science and Technology Research of Huaqiao University under Grant ZQN-YX403, and in part by the High-Level Talent Project Foundation of Huaqiao University under Grant 14BS201 and Grant 14BS204. Part of this article was presented at the IEEE ICASSP 2017.
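The block-wise ℓ0-regularized sparse coding at the heart of GSR is typically solved greedily. The sketch below uses generic orthogonal matching pursuit over an arbitrary dictionary `D`; the paper's geometry-guided dictionary construction is not reproduced, and all names are illustrative.

```python
import numpy as np

def omp(D, y, k):
    """Greedy orthogonal matching pursuit: approximate the l0-regularized
    sparse code x with at most k nonzeros such that D @ x ~ y."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(k):
        # pick the dictionary atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # least-squares refit on the selected atoms
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x = np.zeros(D.shape[1])
        x[support] = coef
        residual = y - D @ x
    return x
```

With an orthonormal dictionary, a truly k-sparse attribute block is recovered exactly; in the paper's setting the dictionary is additionally shaped by the block geometry.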
- Published
- 2020
- Full Text
- View/download PDF
30. 3D Point Cloud Attribute Compression via Graph Prediction
- Author
-
Hui Yuan, Shuai Gu, Huanqiang Zeng, and Junhui Hou
- Subjects
Computer science ,Applied Mathematics ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Point cloud ,Entropy (information theory) ,020206 networking & telecommunications ,02 engineering and technology ,Electrical and Electronic Engineering ,External Data Representation ,Algorithm ,Graph - Abstract
3D point clouds with associated attributes are considered a promising data representation for immersive communication. The large amount of data, however, poses great challenges to the subsequent transmission and storage processes. In this letter, we propose a new compression scheme for the color attribute of static voxelized 3D point clouds. Specifically, we first partition the colors of a 3D point cloud into clusters by applying a k-d tree to the geometry information; the clusters are then successively encoded. To eliminate the redundancy, we propose a novel prediction module, namely graph prediction, in which a small number of representative points selected from previously encoded clusters are used to predict the points to be encoded by exploring the underlying graph structure constructed from the geometry information. Furthermore, the prediction residuals are transformed with the graph transform, and the resulting transform coefficients are finally uniformly quantized and entropy encoded. Experimental results show that the proposed compression scheme achieves better rate-distortion performance at a lower computational cost compared with state-of-the-art methods.
- Published
- 2020
- Full Text
- View/download PDF
31. H-ViT: Hybrid Vision Transformer for Multi-modal Vehicle Re-identification
- Author
-
Wenjie Pan, Hanxiao Wu, Jianqing Zhu, Huanqiang Zeng, and Xiaobin Zhu
- Published
- 2022
- Full Text
- View/download PDF
32. Multi-scale Attentive Image De-raining Networks via Neural Architecture Search
- Author
-
Lei Cai, Yuli Fu, Wanliang Huo, Youjun Xiang, Tao Zhu, Ying Zhang, Huanqiang Zeng, and Delu Zeng
- Subjects
FOS: Computer and information sciences ,Artificial Intelligence (cs.AI) ,Computer Science - Artificial Intelligence ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,Media Technology ,Electrical and Electronic Engineering - Abstract
Multi-scale architectures and attention modules have shown their effectiveness in many deep learning-based image de-raining methods. However, manually designing and integrating these two components into a neural network requires substantial labor and extensive expertise. In this article, a high-performance multi-scale attentive neural architecture search (MANAS) framework is developed for image de-raining. The proposed method formulates a new multi-scale attention search space with multiple flexible modules well suited to the image de-raining task. Under this search space, multi-scale attentive cells are built and further used to construct a powerful image de-raining network. The internal multi-scale attentive architecture of the de-raining network is searched automatically through a gradient-based search algorithm, which largely avoids the daunting procedure of manual design. Moreover, to obtain a robust image de-raining model, a practical and effective multi-to-one training strategy is presented to allow the de-raining network to gather sufficient background information from multiple rainy images with the same background scene; meanwhile, multiple loss functions, including external loss, internal loss, architecture regularization loss, and model complexity loss, are jointly optimized to achieve robust de-raining performance and controllable model complexity. Extensive experimental results on both synthetic and realistic rainy images, as well as downstream vision applications (i.e., object detection and segmentation), consistently demonstrate the superiority of the proposed method. The code is publicly available at https://github.com/lcai-gz/MANAS.
- Published
- 2022
- Full Text
- View/download PDF
33. Content-aware Warping for View Synthesis
- Author
-
Mantang Guo, Junhui Hou, Jing Jin, Hui Liu, Huanqiang Zeng, and Jiwen Lu
- Subjects
FOS: Computer and information sciences ,Computational Theory and Mathematics ,Artificial Intelligence ,Applied Mathematics ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,Computer Vision and Pattern Recognition ,Software - Abstract
Existing image-based rendering methods usually adopt a depth-based image warping operation to synthesize novel views. In this paper, we argue that the essential limitations of the traditional warping operation are its limited neighborhood and its purely distance-based interpolation weights. To this end, we propose content-aware warping, which adaptively learns the interpolation weights for pixels of a relatively large neighborhood from their contextual information via a lightweight neural network. Based on this learnable warping module, we propose a new end-to-end learning-based framework for novel view synthesis from a set of input source views, in which two additional modules, namely confidence-based blending and feature-assistant spatial refinement, are naturally introduced to handle the occlusion issue and capture the spatial correlation among pixels of the synthesized view, respectively. Besides, we also propose a weight-smoothness loss term to regularize the network. Experimental results on light field datasets with wide baselines and multi-view datasets show that the proposed method significantly outperforms state-of-the-art methods both quantitatively and visually. The source code will be publicly available at https://github.com/MantangGuo/CW4VS., Comment: arXiv admin note: text overlap with arXiv:2108.07408
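The core idea, blending a relatively large neighborhood with learned rather than fixed distance-based weights, can be sketched for a single target pixel as follows. The weight logits are assumed to come from the lightweight network described in the abstract, which is not modeled here.

```python
import numpy as np

def warp_pixel(source_patch, weight_logits):
    """Content-aware interpolation for one target pixel: softmax-normalize
    content-predicted logits over the neighborhood and blend the patch,
    instead of using fixed bilinear (distance-based) weights."""
    w = np.exp(weight_logits - weight_logits.max())  # stable softmax
    w = w / w.sum()
    return float((w * source_patch).sum())
```

Uniform logits reduce to plain averaging; a sharply peaked logit map reproduces nearest-neighbor sampling, so the learned weights interpolate between these extremes per pixel.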
- Published
- 2022
- Full Text
- View/download PDF
34. CLSR: Cross-layer Interaction Pyramid Super-Resolution Network
- Author
-
Detian Huang, Xiancheng Zhu, Xiaorui Li, and Huanqiang Zeng
- Subjects
Media Technology ,Electrical and Electronic Engineering - Published
- 2023
- Full Text
- View/download PDF
35. Dual Modal Meta Metric Learning for Attribute-Image Person Re-identification
- Author
-
Rongxian Xu, Fei Shen, Hanxiao Wu, Jianqing Zhu, and Huanqiang Zeng
- Published
- 2021
- Full Text
- View/download PDF
36. Learning Spatial-angular Fusion for Compressive Light Field Imaging in a Cycle-consistent Framework
- Author
-
Jing Jin, Junhui Hou, Xianqiang Lyu, Mantang Guo, Zhiyu Zhu, and Huanqiang Zeng
- Subjects
Fusion ,Computer science ,business.industry ,Deep learning ,Feature extraction ,Posterior probability ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Inverse problem ,Set (abstract data type) ,Artificial intelligence ,Coded aperture ,business ,Algorithm ,Light field - Abstract
This paper investigates 4-D light field (LF) reconstruction from 2-D measurements captured by a coded aperture camera. To tackle this ill-posed inverse problem, we propose a cycle-consistent reconstruction network (CR-Net). Specifically, based on the intrinsic linear imaging model of the coded aperture, CR-Net reconstructs an LF by progressively eliminating the residuals between the measurements projected from the reconstructed LF and the input measurements. Moreover, to address the crucial issue of efficiently and effectively extracting representative features from high-dimensional LF data, we formulate the problem in a probability space and propose to approximate a posterior distribution over a set of carefully defined LF processing events, including both layer-wise spatial-angular feature extraction and network-level feature aggregation. Through drop-path on a densely connected template network, we derive an adaptively learned spatial-angular fusion strategy, in sharp contrast to existing methods that combine spatial and angular features empirically. Extensive experiments on both simulated measurements and measurements from a real coded aperture camera demonstrate the significant advantage of our method over state-of-the-art ones, i.e., our method improves the reconstruction quality by 4.5 dB.
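The progressive residual elimination described above can be illustrated with a linear stand-in for the learned refinement: a plain Landweber iteration on the coded-aperture imaging model `A`. In CR-Net a learned network replaces the simple gradient step used here; this sketch only shows the measure-project-correct cycle.

```python
import numpy as np

def reconstruct(A, y, steps=200, step_size=None):
    """Landweber-style iteration: re-project the current estimate through
    the linear imaging model A, form the measurement-domain residual, and
    feed it back until the residual is eliminated."""
    if step_size is None:
        step_size = 1.0 / np.linalg.norm(A, 2) ** 2  # stable step for A^T A
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        residual = y - A @ x        # mismatch in the measurement domain
        x = x + step_size * (A.T @ residual)
    return x
```

For a well-posed toy system this converges to the least-squares solution; the learned correction in the paper is what makes the genuinely underdetermined coded-aperture case tractable.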
- Published
- 2021
- Full Text
- View/download PDF
37. Semantic-embedded Unsupervised Spectral Reconstruction from Single RGB Images in the Wild
- Author
-
Zhiyu Zhu, Hui Liu, Junhui Hou, Huanqiang Zeng, and Qingfu Zhang
- Subjects
FOS: Computer and information sciences ,Computer Vision and Pattern Recognition (cs.CV) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer Science - Computer Vision and Pattern Recognition - Abstract
This paper investigates the problem of reconstructing hyperspectral (HS) images from single RGB images captured by commercial cameras, without using paired HS and RGB images during training. To tackle this challenge, we propose a new lightweight and end-to-end learning-based framework. Specifically, on the basis of the intrinsic imaging degradation model of RGB images from HS images, we progressively spread the differences between input RGB images and RGB images re-projected from recovered HS images via effective unsupervised camera spectral response function estimation. To enable learning without paired ground-truth HS images as supervision, we adopt adversarial learning and boost it with a simple yet effective L1 gradient clipping scheme. Besides, we embed the semantic information of input RGB images to locally regularize the unsupervised learning, which is expected to promote pixels with identical semantics to have consistent spectral signatures. In addition to conducting quantitative experiments over two widely used datasets for HS image reconstruction from synthetic RGB images, we also evaluate our method by applying the HS images recovered from real RGB images to HS-based visual tracking. Extensive results show that our method significantly outperforms state-of-the-art unsupervised methods and even exceeds the latest supervised method under some settings. The source code is publicly available at https://github.com/zbzhzhy/Unsupervised-Spectral-Reconstruction., Comment: Accepted to ICCV 2021
- Published
- 2021
38. Light Field Image Quality Assessment Using Contourlet Transform
- Author
-
Jing Chen, Huanqiang Zeng, Canhui Cai, Kai-Kuang Ma, and Hailiang Huang
- Subjects
business.industry ,Computer science ,Image quality ,Feature extraction ,Human visual system model ,Metric (mathematics) ,Line (geometry) ,Pattern recognition ,Artificial intelligence ,business ,Contourlet ,Light field ,Image (mathematics) - Abstract
In this paper, a new full-reference image quality assessment (IQA) method for perceptual quality evaluation of light field (LF) images is proposed, called the contourlet transform-based model (CTM). An LF image consists of a set of sub-aperture images (SAIs) that share similar content but exhibit small angular deviations due to different viewing angles; hence, abundant image details can be extracted from the SAIs. To this end, the contourlet transform is used to extract multi-scale spatial features from the reference and distorted SAIs, respectively. The degree of similarity between these measured quantities is then computed to arrive at the final IQA score of the distorted LF image under evaluation. Experimental results on dense light field datasets clearly show that the proposed CTM algorithm agrees more closely with human visual system (HVS) perception of LF image quality than other state-of-the-art IQA algorithms.
- Published
- 2021
- Full Text
- View/download PDF
39. UHD Video Coding: A Light-Weight Learning-Based Fast Super-Block Approach
- Author
-
King Ngi Ngan, Xiandong Meng, Miaohui Wang, Huanqiang Zeng, and Wuyuan Xie
- Subjects
Gamut ,Speedup ,business.industry ,Computer science ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,020201 artificial intelligence & image processing ,02 engineering and technology ,Electrical and Electronic Engineering ,Frame rate ,business ,Computer hardware ,Coding (social sciences) - Abstract
The ultra high-definition (UHD) video format, which has recently become popular, provides high spatial resolution, high temporal frame rate, high sample bit-depth, and a wide pixel color gamut. Despite the continued growth of global network capacity, delivering UHD video services inevitably increases bandwidth cost. To address this challenge, this paper presents an improved super coding unit (SCU) method for UHD video coding in High Efficiency Video Coding (HEVC). Initially, the medium coding unit (MCU) is proposed to avoid unnecessary brute-force coding unit (CU) partitioning of the SCU. Furthermore, the SCU is encoded with two modes, Direct-MCU and SCU-to-MCU: the Direct-MCU mode is intended to better adapt to texture-rich regions, guaranteeing compression efficiency by avoiding extra-size CU partitioning; the SCU-to-MCU mode is designed for homogeneous regions of UHD content, saving encoding time by skipping the fine-grained CU partition search. Moreover, a learning-based fast SCU decision approach is proposed to speed up the determination between Direct-MCU and SCU-to-MCU, where three representative handcrafted features are extracted. Experimental results show that our method achieves affordable complexity and excellent coding efficiency (up to 7.30% Bjontegaard Delta rate savings) in UHD video coding compared with recent HEVC reference software.
- Published
- 2019
- Full Text
- View/download PDF
40. Deep Quadruplet Appearance Learning for Vehicle Re-Identification
- Author
-
Jing Chen, Junhui Hou, Kai-Kuang Ma, Jinhui Hou, Jianqing Zhu, and Huanqiang Zeng
- Subjects
Computer Networks and Communications ,Computer science ,business.industry ,Deep learning ,Aerospace Engineering ,020302 automobile design & engineering ,Pattern recognition ,02 engineering and technology ,Object (computer science) ,Convolutional neural network ,0203 mechanical engineering ,Discriminative model ,Automotive Engineering ,Softmax function ,Feature (machine learning) ,Artificial intelligence ,Electrical and Electronic Engineering ,business - Abstract
Vehicle re-identification (Re-ID) plays an important role in intelligent transportation systems. It usually suffers from various challenges encountered in real-life environments, such as viewpoint variations, illumination changes, object occlusions, and other complicated scenarios. To effectively improve vehicle Re-ID performance, a new method, called deep quadruplet appearance learning (DQAL), is proposed in this paper. The novelty of the proposed DQAL lies in its consideration of a special difficulty in vehicle Re-ID: vehicles with the same model and color but different identities (IDs) are highly similar to each other. To that end, DQAL introduces the concept of a quadruplet and forms quadruplets as input, where each quadruplet is composed of the anchor (or target), positive, negative, and the specially considered high-similar (i.e., same model and color but different ID with respect to the anchor) vehicle samples. Then, a quadruplet network incorporating the proposed quadruplet loss and the softmax loss is developed to learn a more discriminative feature for vehicle Re-ID, especially for discerning those difficult high-similar cases. Extensive experiments conducted on two commonly used datasets, VeRi-776 and VehicleID, have demonstrated that the proposed DQAL approach outperforms multiple recently reported vehicle Re-ID methods.
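A quadruplet loss of the kind described above can be sketched as a double hinge over squared distances: the anchor-positive distance must undercut both the ordinary negative and the hard "same model/color, different ID" sample. The margins `m1` and `m2` are illustrative assumptions, not the paper's values.

```python
import numpy as np

def quadruplet_loss(anchor, positive, negative, high_similar, m1=0.5, m2=0.3):
    """Hinge-style quadruplet loss over embedding vectors (a sketch of the
    idea, not the paper's exact formulation). high_similar is the sample
    with the same model and color but a different identity."""
    d = lambda a, b: float(np.sum((a - b) ** 2))  # squared Euclidean distance
    ap = d(anchor, positive)
    return max(0.0, ap - d(anchor, negative) + m1) \
         + max(0.0, ap - d(anchor, high_similar) + m2)
```

The loss is zero once both margins are satisfied, so training focuses on quadruplets whose hard high-similar sample still sits too close to the anchor.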
- Published
- 2019
- Full Text
- View/download PDF
41. Statistical Early Termination and Early Skip Models for Fast Mode Decision in HEVC INTRA Coding
- Author
-
Huanqiang Zeng, Na Li, Gangyi Jiang, Yun Zhang, and Sam Kwong
- Subjects
Computer Networks and Communications ,Hardware and Architecture ,Computer science ,Bit rate ,0202 electrical engineering, electronic engineering, information engineering ,Mode (statistics) ,020206 networking & telecommunications ,020201 artificial intelligence & image processing ,Statistical model ,02 engineering and technology ,Algorithm ,Fast mode ,Coding (social sciences) - Abstract
In this article, statistical Early Termination (ET) and Early Skip (ES) models are proposed for fast Coding Unit (CU) and prediction mode decision in HEVC INTRA coding, comprising three categories of ET and ES sub-algorithms. First, the CU depth ranges of the current CU are recursively predicted based on the texture and CU depths of spatially neighboring CUs. Second, statistical-model-based ET and ES schemes are proposed and applied to optimize the CU and INTRA prediction mode decisions, in which the coding complexities over different decision layers are jointly minimized subject to acceptable rate-distortion degradation. Third, the correlations among INTRA prediction modes are exploited to early terminate the full rate-distortion optimization in each CU decision layer. Extensive experiments are performed to evaluate the coding performance of each sub-algorithm and of the overall algorithm. Experimental results reveal that the overall proposed algorithm achieves complexity reductions of 45.47% to 74.77% (58.09% on average), while the overall Bjøntegaard delta bit rate increase and Bjøntegaard delta peak signal-to-noise ratio degradation are 2.29% and −0.11 dB, respectively.
- Published
- 2019
- Full Text
- View/download PDF
42. Multi-label learning with multi-label smoothing regularization for vehicle re-identification
- Author
-
Lei Cai, Jinhui Hou, Jing Chen, Kai-Kuang Ma, Huanqiang Zeng, Jianqing Zhu, and School of Electrical and Electronic Engineering
- Subjects
0209 industrial biotechnology ,Cognitive Neuroscience ,Library science ,Convolutional Neural Network ,Multi label learning ,02 engineering and technology ,Re identification ,Computer Science Applications ,020901 industrial engineering & automation ,Vehicle Re-identification ,Artificial Intelligence ,Electrical and electronic engineering [Engineering] ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Sociology ,China - Abstract
Vehicle re-identification (re-ID) is a vital technique for urban intelligent video surveillance systems and smart cities. Given a query vehicle image, vehicle re-ID aims to search for and retrieve images of the same vehicle captured by different surveillance cameras with various viewing angles. Based on the observation that essential vehicle attributes, such as color and type (e.g., sedan, bus, truck, and so on), can serve as important traits for recognizing vehicles, an effective multi-label learning (MLL) method is proposed in this paper that simultaneously learns three labels: the vehicle's ID, type, and color. With these three labels, a multi-label smoothing regularization (MLSR) is further proposed, which allocates a uniform label distribution to the multi-labeled training images to regularize the MLL model and improve vehicle re-ID performance. Extensive experiments conducted on the VeRi and VehicleID datasets have demonstrated that the proposed MLL with MLSR approach can effectively improve the performance delivered by the baseline and also outperform multiple state-of-the-art vehicle re-ID methods. This work was supported in part by the National Natural Science Foundation of China under the grants 61871434, 61602191, and 61802136, in part by the Natural Science Foundation of Fujian Province under the grants 2019J06017, 2016J01308 and 2017J05103, in part by the Fujian-100 Talented People Program, in part by the High-level Talent Innovation Program of Quanzhou City under the grant 2017G027, in part by the Promotion Program for Young and Middle-aged Teacher in Science and Technology Research of Huaqiao University under the grants ZQN-YX403 and ZQN-PY418, in part by the High-Level Talent Project Foundation of Huaqiao University under the grants 14BS201, 14BS204 and 16BS108, and in part by the Graduate Student Scientific Research Innovation Project Foundation of Huaqiao University.
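The "uniform label distribution" that MLSR allocates can be sketched per label head (ID, type, or color) as standard uniform label smoothing; the smoothing strength `eps` is a hypothetical value, not one taken from the paper.

```python
import numpy as np

def smoothed_targets(label_index, num_classes, eps=0.1):
    """Build a smoothed target distribution for one label head: spread
    eps of the probability mass uniformly over all classes and keep the
    remaining 1 - eps on the ground-truth class."""
    t = np.full(num_classes, eps / num_classes)
    t[label_index] += 1.0 - eps
    return t
```

Training each of the three heads against such soft targets, instead of hard one-hot labels, is what regularizes the multi-label model.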
- Published
- 2019
- Full Text
- View/download PDF
43. Visual Attention Guided Pixel-Wise Just Noticeable Difference Model
- Author
-
Zhipeng Zeng, Jing Chen, Kai-Kuang Ma, Jianqing Zhu, Yun Zhang, Huanqiang Zeng, and School of Electrical and Electronic Engineering
- Subjects
Masking (art) ,Visual perception ,General Computer Science ,orientation complexity ,Computer science ,Just-noticeable difference ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Texture (music) ,Luminance ,Just Noticeable Difference ,Distortion ,Contrast (vision) ,General Materials Science ,Computer vision ,Just noticeable difference ,media_common ,ComputingMethodologies_COMPUTERGRAPHICS ,Pixel ,business.industry ,Orientation (computer vision) ,General Engineering ,Real image ,visual attention ,Electrical and electronic engineering [Engineering] ,Orientation Complexity ,Artificial intelligence ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,business ,lcsh:TK1-9971 - Abstract
Pixel-domain just noticeable difference (JND) models are generally composed of luminance adaptation (LA) and contrast masking (CM), where CM takes edge masking (EM) and texture masking (TM) into consideration. However, existing pixel-wise JND models do not evaluate CM appropriately, since they overestimate the masking effect of regularly oriented texture regions and neglect the visual attention characteristic of human eyes on real images. In this work, a novel pixel-domain JND model is proposed, in which orderly texture masking (OTM) for regular texture areas (also called orderly texture regions) and disorderly texture masking (DTM) for complex texture areas (also called disorderly texture regions) are derived based on orientation complexity. Meanwhile, visual saliency is used as a weighting factor and incorporated into the CM evaluation to refine the JND thresholds. Experimental results indicate that, compared with existing JND profiles, the proposed JND model tolerates more distortion at the same perceptual quality and yields better visual quality at the same level of injected JND-noise energy.
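A common way to combine LA and CM in pixel-domain JND models is the classic nonlinear additivity model; the sketch below uses that combination and folds in a saliency weight in the spirit of the abstract. The exact combination and weighting form in the paper may differ, so treat both the `gain` and the saliency modulation as assumptions.

```python
import numpy as np

def jnd_map(luminance_adapt, contrast_mask, saliency, gain=0.3):
    """Per-pixel JND threshold: nonlinear-additivity combination of LA and
    CM, then modulated by a saliency map in [0, 1] so that visually
    attended (salient) pixels get lower, i.e. stricter, thresholds."""
    base = luminance_adapt + contrast_mask \
         - gain * np.minimum(luminance_adapt, contrast_mask)
    return base * (1.0 - 0.5 * saliency)
```

The modulation halves the threshold where saliency is 1 and leaves it untouched where saliency is 0, matching the intuition that distortions are more noticeable in attended regions.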
- Published
- 2019
44. Bi-Layer Texture Discriminant Fast Depth Intra Coding for 3D-HEVC
- Author
-
Jing Chen, Jiabao Zuo, Canhui Cai, Kai-Kuang Ma, Huanqiang Zeng, and School of Electrical and Electronic Engineering
- Subjects
General Computer Science ,Computer science ,General Engineering ,3D-HEVC ,020206 networking & telecommunications ,02 engineering and technology ,Bi layer ,Depth Intra Coding ,Discriminant ,fast mode decision ,Engineering::Electrical and electronic engineering [DRNTU] ,Bit rate ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,General Materials Science ,CU size decision ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Depth intra coding ,lcsh:TK1-9971 ,Algorithm ,Coding (social sciences) - Abstract
3D-HEVC introduces several new intra-frame depth coding tools to improve coding efficiency, but at the price of high computational complexity, which hinders the practical application of 3D-HEVC. Therefore, a bi-layer texture discriminant fast depth intra coding algorithm for 3D-HEVC is proposed in this paper. With a sum of gradient matrix (SGM), the texture complexity (TC) of the current depth coding unit (CU) and of its sub-blocks is calculated at once. The TC values indicate whether the depth CU should be split further and whether unnecessary depth modeling mode (DMM) checks can be skipped. Experimental results show that, compared with the original 3D-HEVC, the proposed algorithm reduces the depth encoding time by 44.8% while increasing the average bit rate of the synthesized viewpoint by only 0.38%, outperforming state-of-the-art fast 3D-HEVC algorithms.
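The gradient-based texture-complexity test can be sketched as follows; the gradient measure and the split threshold are illustrative stand-ins for the paper's SGM and its tuned decision rule.

```python
import numpy as np

def texture_complexity(block):
    """Texture complexity of a depth block: mean absolute horizontal plus
    vertical pixel differences (a simple sum-of-gradients measure)."""
    block = block.astype(float)
    gx = np.abs(np.diff(block, axis=1)).sum()
    gy = np.abs(np.diff(block, axis=0)).sum()
    return (gx + gy) / block.size

def should_split(block, threshold=2.0):
    """Decide whether to keep splitting the CU: flat depth blocks are left
    unsplit (and fine-grained mode checks are skipped). The threshold is a
    hypothetical tuning value."""
    return texture_complexity(block) > threshold
```

Because depth maps are piecewise smooth, most blocks are flat and exit early, which is where the 44.8% encoding-time saving comes from.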
- Published
- 2019
- Full Text
- View/download PDF
45. Spatial-Frequency Hevc Multiple Description Video Coding with Adaptive Perceptual Redundancy Allocation
- Author
-
Feifeng Wang, Jing Chen, Huanqiang Zeng, and Canhui Cai
- Subjects
History ,Polymers and Plastics ,Signal Processing ,Media Technology ,Computer Vision and Pattern Recognition ,Electrical and Electronic Engineering ,Business and International Management ,Industrial and Manufacturing Engineering - Published
- 2021
- Full Text
- View/download PDF
46. High-Capacity Framework for Reversible Data Hiding in Encrypted Image Using Pixel Predictions and Entropy Encoding
- Author
-
Yingqiang Qiu, Qichao Ying, Yuyan Yang, Huanqiang Zeng, Sheng Li, and Zhenxing Qian
- Subjects
FOS: Computer and information sciences ,Media Technology ,Electrical and Electronic Engineering ,Computer Science - Multimedia ,Multimedia (cs.MM) - Abstract
While existing vacating-room-before-encryption (VRBE) based schemes can achieve decent embedding rates, the payloads of existing vacating-room-after-encryption (VRAE) based schemes are relatively low. To address this issue, this paper proposes a generalized framework for high-capacity reversible data hiding in encrypted images (RDHEI) covering both the VRBE and VRAE cases. First, an efficient embedding room generation algorithm (ERGA) is designed to produce a large embedding room using pixel prediction and entropy encoding. Then, we propose two RDHEI schemes, one for VRBE and one for VRAE. In the VRBE scenario, the image owner generates the embedding room with ERGA and encrypts the preprocessed image using a stream cipher with two encryption keys; the data hider then locates the embedding room and embeds the encrypted additional data. In the VRAE scenario, the cover image is encrypted by an improved block modulation and permutation encryption algorithm, which largely preserves the spatial redundancy of the plain-text image; the data hider then applies ERGA to the encrypted image to generate the embedding room and performs data embedding. For both schemes, receivers with different authentication keys can respectively conduct error-free data extraction and/or error-free image recovery. Experimental results show that the two proposed schemes outperform many state-of-the-art RDHEI schemes. Besides, both schemes ensure a high security level: the original image can hardly be recovered from the encrypted version, before or after data hiding, by an unauthorized user.
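The idea behind ERGA, predict pixels, entropy-code the small residuals, and treat the saved bits as embedding room, can be roughed out as below. This is a deliberately crude sketch: a left-neighbor predictor and an empirical-entropy estimate stand in for the paper's actual predictor and entropy coder.

```python
import numpy as np

def embedding_room_estimate(img):
    """Estimate vacated embedding room in bits: predict each pixel from its
    left neighbor, then compare the empirical entropy of the residuals
    against raw 8-bit storage. A real ERGA uses a stronger predictor and
    an actual entropy coder."""
    img = img.astype(int)                      # avoid uint8 wrap-around
    resid = np.diff(img, axis=1).ravel()       # left-neighbor prediction errors
    _, counts = np.unique(resid, return_counts=True)
    p = counts / counts.sum()
    h = -(p * np.log2(p)).sum()                # bits per residual sample
    return max(0.0, 8.0 - h) * resid.size
```

Smooth images yield near-zero-entropy residuals and therefore large room, which is exactly the spatial redundancy the VRAE scheme's encryption is designed to preserve.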
- Published
- 2021
- Full Text
- View/download PDF
47. A spatial structural similarity triplet loss for auxiliary vehicle re-identification
- Author
-
Xiaobin Zhu, Liu Liu, Jianqing Zhu, and Huanqiang Zeng
- Subjects
General Computer Science ,Triplet loss ,Computer science ,Structural similarity ,Biological system ,Re identification - Published
- 2020
- Full Text
- View/download PDF
48. Learning Matching Behavior Differences for Compressing Vehicle Re-identification Models
- Author
-
Yi Xie, Canhui Cai, Huanqiang Zeng, Lixin Zheng, and Jianqing Zhu
- Subjects
Matching (statistics) ,Computer science ,Computation ,02 engineering and technology ,Function (mathematics) ,010501 environmental sciences ,Complex network ,computer.software_genre ,01 natural sciences ,Field (computer science) ,Matrix (mathematics) ,0202 electrical engineering, electronic engineering, information engineering ,Trajectory ,020201 artificial intelligence & image processing ,Data mining ,computer ,0105 earth and related environmental sciences - Abstract
Vehicle re-identification, which matches vehicles captured by different cameras, has great potential in the field of public security. However, recent vehicle re-identification approaches exploit complex networks, incurring heavy computation in the testing phase. In this paper, we propose a matching behavior difference learning (MBDL) method to compress vehicle re-identification models and thereby save testing computation. To represent the evolution of matching behavior across two different layers of a deep network, a matching behavior difference (MBD) matrix is designed. Our MBDL method then minimizes an L1 loss function between the MBD matrices of a small student network and a complex teacher network, ensuring the student network uses less computation to mimic the teacher network's matching behavior. During the testing phase, only the small student network is used, so testing computation is significantly reduced. Experiments on the VeRi776 and VehicleID datasets show that MBDL outperforms many state-of-the-art approaches in terms of accuracy and testing time.
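One plausible reading of the MBD matrix is the change in the pairwise matching (similarity) matrix between two layers; the distillation then L1-matches this change between student and teacher. The exact construction in the paper may differ, so the similarity choice below (cosine) is an assumption.

```python
import numpy as np

def mbd_matrix(feat_a, feat_b):
    """Matching behavior difference across two layers: the change in the
    batch-wise cosine-similarity matrix from layer a to layer b. Feature
    dimensions may differ per layer; the matrix is always batch x batch."""
    def sim(f):
        f = f / np.linalg.norm(f, axis=1, keepdims=True)
        return f @ f.T
    return sim(feat_b) - sim(feat_a)

def mbdl_loss(student_feats, teacher_feats):
    """L1 loss between student and teacher MBD matrices, as minimized
    during the distillation described in the abstract."""
    return np.abs(mbd_matrix(*student_feats) - mbd_matrix(*teacher_feats)).mean()
```

Because only similarity matrices are compared, the student's layers can be much thinner than the teacher's while still reproducing how matches evolve through the network.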
- Published
- 2020
- Full Text
- View/download PDF
49. Maximum Correntropy Criterion-Based Hierarchical One-Class Classification
- Author
-
Huanqiang Zeng, Chun Yin, Haozhen Dai, Jiuwen Cao, Anton Kummert, and Baiying Lei
- Subjects
Computer Networks and Communications ,business.industry ,Computer science ,Gaussian ,Anomaly (natural sciences) ,Pattern recognition ,02 engineering and technology ,Computer Science Applications ,Support vector machine ,Kernel (linear algebra) ,symbols.namesake ,Artificial Intelligence ,Outlier ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,One-class classification ,020201 artificial intelligence & image processing ,Anomaly detection ,Artificial intelligence ,business ,Software ,Extreme learning machine - Abstract
Owing to their effectiveness in anomaly/outlier detection, one-class algorithms have been extensively studied in the past. Representatives include shallow-structure methods and deep networks, such as the one-class support vector machine (OC-SVM), one-class extreme learning machine (OC-ELM), deep support vector data description (Deep SVDD), and multilayer OC-ELM (ML-OCELM/MK-OCELM). However, existing algorithms are generally built on the minimum mean-square-error (MSE) criterion, which is robust to Gaussian noise but less effective in dealing with large outliers. To alleviate this deficiency, a robust maximum correntropy criterion (MCC)-based OC-ELM (MC-OCELM) is first proposed and then extended to a hierarchical network (named HC-OCELM) to enhance its capability to characterize complex, large-scale data. The output weights are optimized by gradient derivation combined with a fixed-point iterative update scheme. Experiments on many benchmark datasets validate the effectiveness of the proposed methods, and comparisons with many state-of-the-art approaches demonstrate their superiority.
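The fixed-point scheme the abstract mentions can be sketched as correntropy-weighted regularized least squares: each iteration reweights samples by a Gaussian kernel of their residuals (so large outliers get small weight) and re-solves for the output weights. This is a generic numpy sketch of an MCC-weighted one-class ELM, not the paper's exact formulation; `n_hidden`, `C`, and `sigma` are illustrative hyperparameters.

```python
import numpy as np

def mc_ocelm_fit(X, n_hidden=32, C=1.0, sigma=1.0, n_iter=20, seed=0):
    """One-class ELM output weights under a maximum correntropy criterion,
    solved by fixed-point iteration (sketch under stated assumptions)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer outputs
    t = np.ones(len(X))                           # one-class target
    beta = np.zeros(n_hidden)
    for _ in range(n_iter):
        e = t - H @ beta
        lam = np.exp(-e**2 / (2 * sigma**2))      # correntropy weights
        A = H.T @ (lam[:, None] * H) + np.eye(n_hidden) / C
        beta = np.linalg.solve(A, H.T @ (lam * t))
    return W, b, beta
```

Replacing the weights `lam` with all-ones recovers the plain MSE-based OC-ELM solution, which shows where the robustness to outliers enters.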
- Published
- 2020
50. Light Field Image Quality Assessment: An Overview
- Author
-
Jing Chen, Yu Tian, Huanqiang Zeng, Kai-Kuang Ma, Hailiang Huang, and Jianqing Zhu
- Subjects
Computer graphics ,Range (mathematics) ,Computer science ,Image quality ,Human visual system model ,Computer vision ,Artificial intelligence ,Virtual reality ,Light field - Abstract
Light field (LF) images have received significant attention from both academia and industry, as they enable a wide range of applications in computer vision and computer graphics, such as three-dimensional reconstruction and virtual reality. Considering that the human eye is the final receiver of LF images and that LF images inevitably suffer from various distortions, image quality assessment (IQA) becomes an important issue for LF images, with the goal of evaluating their quality in accordance with the human visual system. This paper presents an up-to-date overview of IQA specifically for LF images, focusing on the formation of LF images, LF IQA databases, and existing LF IQA models.
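To make the full-reference setting concrete, a naive baseline for LF IQA is to average a 2D metric over the sub-aperture images of the 4D light field (angular dimensions u, v; spatial dimensions height, width). The sketch below uses mean PSNR purely as an illustration; it is not one of the surveyed LF IQA models.

```python
import numpy as np

def lf_mean_psnr(ref, dist, peak=255.0):
    """Mean PSNR over the sub-aperture images of a 4D light field
    shaped (u, v, height, width). A naive full-reference baseline."""
    assert ref.shape == dist.shape and ref.ndim == 4
    psnrs = []
    for i in range(ref.shape[0]):
        for j in range(ref.shape[1]):
            mse = np.mean((ref[i, j].astype(float)
                           - dist[i, j].astype(float)) ** 2)
            psnrs.append(np.inf if mse == 0
                         else 10.0 * np.log10(peak**2 / mse))
    return float(np.mean(psnrs))
```

Dedicated LF IQA models go further by exploiting angular consistency between views, which per-view 2D metrics like this one ignore.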
- Published
- 2020
- Full Text
- View/download PDF