Author: "Lam, Kin-Man" / Database: Supplemental Index - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Lam, Kin-Man"' showing total 41 results

Start Over Author "Lam, Kin-Man" Database Supplemental Index

41 results on '"Lam, Kin-Man"'

1. Distilling Privileged Knowledge for Anomalous Event Detection From Weakly Labeled Videos

Author: Liu, Tianshan, Lam, Kin-Man, and Kong, Jun
Abstract: Weakly supervised video anomaly detection (WS-VAD) aims to identify the snippets involving anomalous events in long untrimmed videos, with solely text video-level binary labels. A typical paradigm among the existing text WS-VAD methods is to employ multiple modalities as inputs, e.g., RGB, optical flow, and audio, as they can provide sufficient discriminative clues that are robust to the diverse, complicated real-world scenes. However, such a pipeline has high reliance on the availability of multiple modalities and is computationally expensive and storage demanding in processing long sequences, which limits its use in some applications. To address this dilemma, we propose a privileged knowledge distillation (KD) framework dedicated to the WS-VAD task, which can maintain the benefits of exploiting additional modalities, while avoiding the need for using multimodal data in the inference phase. We argue that the performance of the privileged KD framework mainly depends on two factors: 1) the effectiveness of the multimodal teacher network and 2) the completeness of the useful information transfer. To obtain a reliable teacher network, we propose a text cross-modal interactive learning strategy and an anomaly normal discrimination loss, which target learning task-specific cross-modal features and encourage the separability of anomalous and normal representations, respectively. Furthermore, we design both representation- and text logits-level distillation loss functions, which force the unimodal student network to distill abundant privileged knowledge from the text well-trained multimodal teacher network, in a snippet-to-video fashion. Extensive experimental results on three public benchmarks demonstrate that the proposed privileged KD framework can train a lightweight yet effective detector, for localizing anomaly events under the supervision of video-level annotations.
Published: 2024
Full Text: View/download PDF

2. GR-PSN: Learning to Estimate Surface Normal and Reconstruct Photometric Stereo Images

Author: Ju, Yakun, Shi, Boxin, Chen, Yang, Zhou, Huiyu, Dong, Junyu, and Lam, Kin-Man
Abstract: In this paper, we propose a novel method, namely GR-PSN, which learns surface normals from photometric stereo images and generates the photometric images under distant illumination from different lighting directions and surface materials. The framework is composed of two subnetworks, named GeometryNet and ReconstructNet, which are cascaded to perform shape reconstruction and image rendering in an end-to-end manner. ReconstructNet introduces additional supervision for surface-normal recovery, forming a closed-loop structure with GeometryNet. We also encode lighting and surface reflectance in ReconstructNet, to achieve arbitrary rendering. In training, we set up a parallel framework to simultaneously learn two arbitrary materials for an object, providing an additional transform loss. Therefore, our method is trained based on the supervision by three different loss functions, namely the surface-normal loss, reconstruction loss, and transform loss. We alternately input the predicted surface-normal map and the ground-truth into ReconstructNet, to achieve stable training for ReconstructNet. Experiments show that our method can accurately recover the surface normals of an object with an arbitrary number of inputs, and can re-render images of the object with arbitrary surface materials. Extensive experimental results show that our proposed method outperforms those methods based on a single surface recovery network and shows realistic rendering results on 100 different materials.
Published: 2024
Full Text: View/download PDF

3. SMART: stratified matching and recurrent transformer for optical flow estimation

Author: Nakajima, Masayuki, Lau, Phooi Yee, Kim, Jae-Gon, Kubo, Hiroyuki, Chang, Chuan-Yu, Kemao, Qian, Chan, Kin-Chung, and Lam, Kin-Man
Published: 2024
Full Text: View/download PDF

4. Dynamic spatial aggregation network for joint denoising and HDR imaging

Author: Nakajima, Masayuki, Lau, Phooi Yee, Kim, Jae-Gon, Kubo, Hiroyuki, Chang, Chuan-Yu, Kemao, Qian, Xiao, Jun, Yang, Cuixin, Zhang, Cong, Ju, Yakun, and Lam, Kin-Man
Published: 2024
Full Text: View/download PDF

5. Semi-supervised learning for compound facial expression recognition with basic expression data

Author: Nakajima, Masayuki, Lau, Phooi Yee, Kim, Jae-Gon, Kubo, Hiroyuki, Chang, Chuan-Yu, Kemao, Qian, Dong, Rongkang, and Lam, Kin-Man
Published: 2024
Full Text: View/download PDF

6. Aligning localization and classification for anchor-free object detection in aerial imagery

Author: Nakajima, Masayuki, Lau, Phooi Yee, Kim, Jae-Gon, Kubo, Hiroyuki, Chang, Chuan-Yu, Kemao, Qian, Zhang, Cong, Ju, Yakun, Xiao, Jun, Yang, Yuting, and Lam, Kin-Man
Published: 2024
Full Text: View/download PDF

7. IACC: Cross-Illumination Awareness and Color Correction for Underwater Images Under Mixed Natural and Artificial Lighting

Author: Zhou, Jingchun, Gai, Qilin, Zhang, Dehuan, Lam, Kin-Man, Zhang, Weishi, and Fu, Xianping
Abstract: Enhancing underwater images captured under mixed artificial and natural lighting conditions presents two critical challenges. Existing methods lack a unified luminance feature extraction paradigm for mixed lighting scenes, leading to imbalance in luminance features, and consequent local overexposure or underexposure. Additionally, some color correction methods, through the fusion of features across multiple color spaces neglect the information loss due to the absence of feature alignment in cross-space fusion. To address these challenges, we propose a specialized method, namely IACC, which unifies the luminance features of underwater images under mixed lighting and guides consistent enhancement across similar luminance regions. Furthermore, complementary colors are introduced to globally guide the correction of color discrepancies, preserving the structural consistency and mitigating potential structural information loss during the original image feature extraction. Extensive experiments on various underwater datasets demonstrate the superiority of our method, which outperforms state-of-the-art methods in both machine and human visual perception. Our code is available at https://github.com/zhoujingchun03/IACC.
Published: 2024
Full Text: View/download PDF

8. Bi-Center Loss for Compound Facial Expression Recognition

Author: Dong, Rongkang and Lam, Kin-Man
Abstract: Compound facial expressions involve combinations of basic emotions, posing challenges to automatic facial expression recognition research. The focus of existing studies in facial expression recognition remains primarily on classifying basic or single expressions, which limits its application to compound facial expression recognition. Moreover, some compound facial expression recognition methods rely on laboratory-controlled expression data, narrowing their generalizability to real-world scenarios. In this work, we carry out the task of compound facial expression recognition in unconstrained environments. To this end, we devise a new loss function, specifically for compound facial expression recognition, called bi-center loss, which is built upon center loss. Unlike center loss that considers all categories, bi-center loss enables deep neural networks to learn compound emotion features by leveraging basic emotion centers. Additionally, we introduce a basic-center regularization term, based on the variance among the basic centers, to ensure appropriate discriminative capabilities of the learned features. Experiments conducted on unconstrained compound expression datasets demonstrate the effectiveness of the proposed method over the baselines, achieving state-of-the-art performance in compound facial expression recognition.
Published: 2024
Full Text: View/download PDF

9. Landmark Localization From Medical Images With Generative Distribution Prior

Author: Huang, Zixun, Zhao, Rui, Leung, Frank H. F., Banerjee, Sunetra, Lam, Kin-Man, Zheng, Yong-Ping, and Ling, Sai Ho
Abstract: In medical image analysis, anatomical landmarks usually contain strong prior knowledge of their structural information. In this paper, we propose to promote medical landmark localization by modeling the underlying landmark distribution via normalizing flows. Specifically, we introduce the flow-based landmark distribution prior as a learnable objective function into a regression-based landmark localization framework. Moreover, we employ an integral operation to make the mapping from heatmaps to coordinates differentiable to further enhance heatmap-based localization with the learned distribution prior. Our proposed Normalizing Flow-based Distribution Prior (NFDP) employs a straightforward backbone and non-problem-tailored architecture (i.e., ResNet18), which delivers high-fidelity outputs across three X-ray-based landmark localization datasets. Remarkably, the proposed NFDP can do the job with minimal additional computational burden as the normalizing flows module is detached from the framework on inferencing. As compared to existing techniques, our proposed NFDP provides a superior balance between prediction accuracy and inference speed, making it a highly efficient and effective approach. The source code of this paper is available at https://github.com/jacksonhzx95/NFDP.
Published: 2024
Full Text: View/download PDF

10. Deep Learning Approach for No-Reference Screen Content Video Quality Assessment

Author: Kwong, Ngai-Wing, Chan, Yui-Lam, Tsang, Sik-Ho, Huang, Ziyin, and Lam, Kin-Man
Abstract: Screen content video (SCV) has drawn much more attention than ever during the COVID-19 period and has evolved from a niche to a mainstream due to the recent proliferation of remote offices, online meetings, shared-screen collaboration, and gaming live streaming. Therefore, quality assessments for screen content media are highly demanded to maintain service quality recently. Although many practical natural scene video quality assessment methods have been proposed and achieved promising results, these methods cannot be applied to the screen content video quality assessment (SCVQA) task directly since the content characteristics of SCV are substantially different from natural scene video. Besides, only one no-reference SCVQA (NR-SCVQA) method, which requires handcrafted features, has been proposed in the literature. Therefore, we propose the first deep learning approach explicitly designed for NR-SCVQA. First, a multi-channel convolutional neural network (CNN) model is used to extract spatial quality features of pictorial and textual regions separately. Since there is no human annotated quality for each screen content frame (SCF), the CNN model is pre-trained in a multi-task self-supervised fashion to extract spatial quality feature representation of SCF. Second, we propose a time-distributed CNN transformer model (TCNNT) to further process all SCF spatial quality feature representations of an SCV and learn spatial and temporal features simultaneously so that high-level spatiotemporal features of SCV can be extracted and used to assess the whole SCV quality. Experimental results demonstrate the robustness and validity of our model, which is closely related to human perception.
Published: 2024
Full Text: View/download PDF

11. Structured Adversarial Self-Supervised Learning for Robust Object Detection in Remote Sensing Images

Author: Zhang, Cong, Lam, Kin-Man, Liu, Tianshan, Chan, Yui-Lam, and Wang, Qi
Abstract: Object detection plays a crucial role in scene understanding and has extensive practical applications. In the field of remote sensing object detection, both detection accuracy and robustness are of significant concern. Existing methods heavily rely on sophisticated adversarial training strategies that tend to improve robustness at the expense of accuracy. However, detection robustness is not always indicative of improved accuracy. Therefore, in this article, we research how to enhance robustness, while still preserving high accuracy, or even improve both simultaneously, with simple vanilla adversarial training or even in the absence thereof. In pursuit of a solution, we first conduct an exploratory investigation by shifting our attention from adversarial training, referred to as adversarial fine-tuning, to adversarial pretraining. Specifically, we propose a novel pretraining paradigm, namely, structured adversarial self-supervised (SASS) pretraining, to strengthen both clean accuracy and adversarial robustness for object detection in remote sensing images. At a high level, SASS pretraining aims to unify adversarial learning and self-supervised learning into pretraining and encode structured knowledge into pretrained representations for powerful transferability to downstream detection. Moreover, to fully explore the inherent robustness of vision Transformers and facilitate their pretraining efficiency, by leveraging the recent masked image modeling (MIM) as the pretext task, we further instantiate SASS pretraining into a concise end-to-end framework, named structured adversarial MIM (SA-MIM). SA-MIM consists of two pivotal components: structured adversarial attack and structured MIM (S-MIM). The former establishes structured adversaries for the context of adversarial pretraining, while the latter introduces a structured local-sampling global-masking strategy to adapt to hierarchical encoder architectures. Comprehensive experiments on three different datasets have demonstrated the significant superiority of the proposed pretraining paradigm over previous counterparts for remote sensing object detection. More importantly, regardless of with or without adversarial fine-tuning, it enables simultaneous improvements in detection accuracy and robustness as expected, promisingly alleviating the dependence on complicated adversarial fine-tuning.
Published: 2024
Full Text: View/download PDF

12. CoF-Net: A Progressive Coarse-to-Fine Framework for Object Detection in Remote-Sensing Imagery

Author: Zhang, Cong, Lam, Kin-Man, and Wang, Qi
Abstract: Object detection in remote-sensing images is a crucial task in the fields of Earth observation and computer vision. Despite impressive progress in modern remote-sensing object detectors, there are still three challenges to overcome: 1) complex background interference; 2) dense and cluttered arrangement of instances; and 3) large-scale variations. These challenges lead to two key deficiencies, namely, coarse features and coarse samples, which limit the performance of existing object detectors. To address these issues, in this article, a novel coarse-to-fine framework (CoF-Net) is proposed for object detection in remote-sensing imagery. CoF-Net mainly consists of two parallel branches, namely, coarse-to-fine feature adaptation (CoF-FA) and coarse-to-fine sample assignment (CoF-SA), which aim to progressively enhance feature representation and select stronger training samples, respectively. Specifically, CoF-FA smoothly refines the original coarse features into multispectral nonlocal fine features with discriminative spatial–spectral details and semantic relations. Meanwhile, CoF-SA dynamically considers samples from coarse to fine by progressively introducing geometric and classification constraints for sample assignment during training. Comprehensive experiments on three public datasets demonstrate the effectiveness and superiority of the proposed method.
Published: 2023
Full Text: View/download PDF

13. Efficient Inductive Vision Transformer for Oriented Object Detection in Remote Sensing Imagery

Author: Zhang, Cong, Su, Jingran, Ju, Yakun, Lam, Kin-Man, and Wang, Qi
Abstract: Object detection is a fundamental task in remote sensing image analysis and scene understanding. Previous remote sensing object detectors are typically based on convolutional neural networks (CNNs), whose performance is significantly limited by the intrinsic locality of convolution operations. The emergence of vision Transformers brings potential solutions to this problem, which has the capability to be a solid alternative to CNNs. However, three crucial obstacles hinder the application and performance of Transformers in the task of remote sensing object detection, that is: 1) high computational complexity, especially for high-resolution remote sensing images; 2) training and sample inefficiency caused by lack of inductive bias; and 3) difficulty in learning arbitrary orientation knowledge of geospatial objects. To address these issues, in this article, a novel efficient inductive vision Transformer framework is proposed for oriented object detection in remote sensing imagery. This framework follows the hierarchical feature pyramid structure and makes threefold contributions as follows: 1) spatial redundancy in remote sensing images is fully explored and an adaptive multigrained routing mechanism is proposed to facilitate token sparsity in Transformers, which can dramatically reduce the computational cost without comprising the accuracy. 2) A compact dual-path encoding architecture, where both global long-range dependencies and local semantic relations are jointly and complementarily captured, is proposed to enhance inductive bias in Transformers. 3) An angle tokenization technique is proposed to promote the encoding, embedding, and learning of direction knowledge for oriented objects in remote sensing scenarios. In this work, the above-mentioned three contributions are instantiated in an advanced Transformer-based object detector, namely, EIA-pyramid vision Transformer (PVT). Comprehensive experiments on two publicly available datasets have demonstrated its effectiveness and superiority for oriented object detection in remote sensing images.
Published: 2023
Full Text: View/download PDF

14. Point Cloud Registration Using Multiattention Mechanism and Deep Hybrid Features

Author: Zhang, Yu-Xin, Sun, Zhan-Li, Zeng, Zhi-Gang, and Lam, Kin-Man
Abstract: Due to some unfavorable factors, how to accurately register point clouds is still a challenging task. In this article, an effective point cloud registration network is proposed with multiple attention mechanism and deep hybrid features. For the features obtained with a graph neural network, three attention modules, namely the spatial attention module, channel attention module, and self-geometric attention module, are utilized to mine various areas of regional information. An attention-based feature fusion module, which consists of three consecutive residual blocks, is devised to fuse the features from the three attention modules. Moreover, the capability of the network for correctly matching point clouds is enhanced, by using deep hybrid features to guide the correspondence search and the calculation of matching confidence. Experimental results on several widely used datasets demonstrate the effectiveness of the proposed point cloud registration network.
Published: 2023
Full Text: View/download PDF

15. Boosting Object Detectors via Strong-Classification Weak-Localization Pretraining in Remote Sensing Imagery

Author: Zhang, Cong, Liu, Tianshan, Xiao, Jun, Lam, Kin-Man, and Wang, Qi
Abstract: Deep learning-based object detectors in remote sensing (RS) scenarios typically follow the paradigm of pretraining and fine-tuning to alleviate the limitation of insufficient downstream data. Despite the improved performance, existing pretraining paradigms are suboptimal due to three deficiencies: 1) inconsistent domains, i.e., pretraining on natural scenes and fine-tuning for RS scenes; 2) mismatched task objectives, i.e., classification-oriented pretraining while detection-oriented fine-tuning; and 3) misaligned architectures, i.e., pretraining only one bare backbone yet neglecting other vital detection components. Against these issues, this article proposes a novel pretraining paradigm specifically for the task of RS object detection, namely, RS strong-classification weak-localization (SCWL) pretraining. Unlike conventional classification pretraining, such as the widely used ImageNet pretraining, our pretraining strategy can adaptively perform bounding box generation on a reconstructed large-scale RS classification-style dataset. These pseudobounding boxes are integrated with the original accurate class labels as location- and category-related supervisions, respectively, to pretrain the entire RS detectors. The proposed RS SCWL pretraining paradigm is able to significantly improve downstream detection performance and outperforms classification pretraining methods, including ImageNet pretraining. Extensive experiments on different object detection datasets demonstrate its effectiveness and superiority in boosting various RS detectors.
Published: 2023
Full Text: View/download PDF

16. Attentive Boundary-Aware Fusion for Defect Semantic Segmentation Using Transformer

Author: Yeung, Ching-Chi and Lam, Kin-Man
Abstract: Defect semantic segmentation is a pixel-level inspection technique to guarantee the quality of various products. It can obtain the precise location of defects by assigning a class label to each image pixel. Due to the confusing appearance of various defects, most existing defect semantic segmentation methods still suffer from the problem of intraclass difference and interclass indiscrimination. To tackle these challenges, we propose an attentive boundary-aware transformer framework, namely ABFormer, for segmenting different types of defects. Specifically, we propose a split-attention boundary-aware fusion (SABF) to split and integrate the boundary and context features with two different attention modules. It can enrich and fuse the feature maps more efficiently. Moreover, we propose a boundary-aware spatial attention module (BSAM) to capture the spatial interdependencies between the positions of the boundary features and the context features. This module can enhance the consistency of the defect features of the same class for solving the intraclass difference problem. Furthermore, we propose a boundary-aware channel attention module (BCAM) to model the semantic relationship between the channels of the boundary features and the context features. This module can reinforce the discrimination between the defect features of different classes for handling the interclass indiscrimination problem. Experimental results on three defect semantic segmentation datasets, namely NEU-Seg, MT-Defect, and MSD, demonstrate that our proposed method outperforms the state-of-the-art methods.
Published: 2023
Full Text: View/download PDF

17. Spatial-frequency fusion for arbitrary-scale ultra-high-definition image super-resolution

Author: Nakajima, Masayuki, Lau, Phooi Yee, Kim, Jae-Gon, Kubo, Hiroyuki, Chang, Chuan-Yu, Kemao, Qian, Yang, Cuixin, Xiao, Jun, and Lam, Kin-Man
Published: 2024
Full Text: View/download PDF

18. An image segmentation algorithm for isolating ocean fronts of interest

Author: Nakajima, Masayuki, Lau, Phooi Yee, Kim, Jae-Gon, Kubo, Hiroyuki, Chang, Chuan-Yu, Kemao, Qian, Yang, Yuting, Zhang, Cong, Yang, Cuixin, and Lam, Kin-Man
Published: 2024
Full Text: View/download PDF

19. Application of GoogLeNet for ocean-front tracking

Author: Nakajima, Masayuki, Muramatsu, Shogo, Kim, Jae-Gon, Guo, Jing-Ming, Kemao, Qian, Yang, Yuting, Lam, Kin-Man, Rigall, Eric, Dong, Junyu, Sun, Xin, and Jian, Muwei
Published: 2022
Full Text: View/download PDF

20. Deep skip connection and multi-deconvolution network for single image super-resolution

Author: Lau, Phooi Yee, Shobri, Mohammad, Hu, Shiyu, Jian, Muwei, Wang, Guodong, Wang, Yanjie, Pan, Zhenkuan, and Lam, Kin-Man
Published: 2020
Full Text: View/download PDF

21. Self-supervised depth completion with attention-based loss

Author: Lau, Phooi Yee, Shobri, Mohammad, Wang, Yingyu, Ju, Yakun, Jian, Muwei, Lam, Kin-Man, Qia, Lin, and Dong, Junyu
Published: 2020
Full Text: View/download PDF

22. A local deviation constraint based non-rigid structure from motion approach

Author: Chen, Xia, Sun, Zhan-Li, Lam, Kin-Man, and Zeng, Zhigang
Abstract: In many traditional non-rigid structure from motion &#x0028 NRSFM &#x0029 approaches, the estimation results of part feature points may significantly deviate from their true values because only the overall estimation error is considered in their models. Aimed at solving this issue, a local deviation-constrained-based column-space-fitting approach is proposed in this paper to alleviate estimation deviation. In our work, an effective model is first constructed with two terms: the overall estimation error, which is computed by a linear subspace representation, and a constraint term, which is based on the variance of the reconstruction error for each frame. Furthermore, an augmented Lagrange multipliers &#x0028 ALM &#x0029 iterative algorithm is presented to optimize the proposed model. Moreover, a convergence analysis is performed with three steps for the optimization process. As both the overall estimation error and the local deviation are utilized, the proposed method can achieve a good estimation performance and a relatively uniform estimation error distribution for different feature points. Experimental results on several widely used synthetic sequences and real sequences demonstrate the effectiveness and feasibility of the proposed algorithm.
Published: 2020
Full Text: View/download PDF

23. The extended marine underwater environment database and baseline evaluations.

Author: Jian, Muwei, Qi, Qiang, Yu, Hui, Dong, Junyu, Cui, Chaoran, Nie, Xiushan, Zhang, Huaxiang, Yin, Yilong, and Lam, Kin-Man
Abstract: Images captured in underwater environments usually exhibit complex illuminations, severe turbidity of water, and often display objects with large varieties in pose and spatial location, etc., which cause challenges to underwater vision research. In this paper, an extended underwater image database for salient-object detection or saliency detection is introduced. This database is called the Marine Underwater Environment Database (MUED), which contains 8600 underwater images of 430 individual groups of conspicuous objects with complex backgrounds, multiple salient objects, and complicated variations in pose, spatial location, illumination, turbidity of water, etc. The publicly available MUED provides researchers in relevant industrial and academic fields with underwater images under different types of variations. Manually labeled ground-truth information is also included in the database, so as to facilitate the research on more applicable and robust methods for both underwater image processing and underwater computer vision. The scale, accuracy, diversity, and background structure of MUED cannot only be widely used to assess and evaluate the performance of the state-of-the-art salient-object detection and saliency-detection algorithms for general images, but also particularly benefit the development of underwater vision technology and offer unparalleled opportunities to researchers in the underwater vision community and beyond. • A diverse underwater image database is constructed and presented. • This released benchmark can identify the strengths and weaknesses of the existing algorithms for underwater images. • This database can offer unparalleled opportunities to researchers in the underwater vision and beyond. • This benchmark will benefit the development of underwater vision technology in the future. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

24. Spectrum-irrelevant fine-grained representation for visible–infrared person re-identification.

Author: Gong, Jiahao, Zhao, Sanyuan, Lam, Kin-Man, Gao, Xin, and Shen, Jianbing
Subjects: DATA augmentation, VISIBLE spectra, REPRESENTATIONS of graphs, INFRARED spectra, INFRARED imaging
Abstract: Visible–infrared person re-identification (VI-ReID) is an important and practical task for full-time intelligent surveillance systems. Compared to visible person re-identification, it is more challenging due to the large cross-modal discrepancy. Existing VI-ReID methods suffer from heterogeneous structures and the different spectra of visible and infrared images. In this work, we propose the Spectrum-Insensitive Data Augmentation (SIDA) strategy, which effectively alleviates the disturbance in the visible and infrared spectra and forces the network to learn spectrum-irrelevant features. The network also compares samples with both global and local features. We devise a Feature Relation Reasoning (FRR) module to learn discriminative fine-grained representations according to the graph reasoning principle. Compared to the most commonly used uniform partition, our FRR better adopts to the case of VI-ReID, in which human bodies are difficult to align. Furthermore, we design the dual center loss for learning the global feature in order to maintain the intra-modality relations, while learning the cross-modal similarities. Our method achieves better convergence in training. Extensive experiments demonstrate that our method achieves state-of-the-art performance on two visible–infrared cross-modal Re-ID datasets. • Analyzing the cross-modality discrepancy and studying the data augmentation on spectra information, we propose a Spectrum-Insensitive Data Augmentation (SIDA) strategy. • We develop a Feature Relation Reasoning (FRR) module based on the graph reasoning principle, for extraction and alignment of the fine-grained representation. Through further transferring information among cross-modality samples on the part-level, FRR learns discriminative feature representations. • We present an effective solution for VI-ReID. The experiments demonstrate that our method achieves the state-of-the-art performance on two popular benchmarks of VI-ReID datasets. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

25. Computer-aided diagnosis for preoperative invasion depth of gastric cancer with dual-energy spectral CT imaging.

Author: Li, Chao, Shi, Cen, Zhang, Huan, Hui, Chun, Lam, Kin Man, and Zhang, Su
Abstract: Rationale and Objectives: This study evaluates the accuracy of dual-energy spectral computed tomography (DEsCT) imaging with the aid of computer-aided diagnosis (CAD) system in assessing serosal invasion in patients with gastric cancer.Materials and Methods: Thirty patients with gastric cancer were enrolled in this study. Two types of features (information) were collected with the use of DEsCT imaging: conventional features including patient's clinical information (eg, age, gender) and descriptive characteristics on the CT images (eg, location of the lesion, wall thickness at the gastric cardia) and additional spectral CT features extracted from monochromatic images (eg, 60 keV) and material-decomposition images (eg, iodine- and water-density images). The classification results of the CAD system were compared to pathologic findings. Important features can be found out using support vector machine classification method in combination with feature-selection technique thereby helping the radiologists diagnose better.Results: Statistical analysis showed that for the collected cases, the feature "long axis" was significantly different between group A (serosa negative) and group B (serosa positive) (P < .05). By adding quantitative spectral features from several regions of interest (ROIs), the total classification accuracy was improved from 83.33% to 90.00%. Two feature ranking algorithms were used in the CAD scheme to derive the top-ranked features. The results demonstrated that low single-energy (approximately 60 keV) CT values, tumor size (long axis and short axis), iodine (water) density, and Effective-Z values of ROIs were important for classification. These findings concurred with the experience of the radiologist.Conclusions: The CAD system designed using machine-learning algorithms may be used to improve the identification accuracy in the assessment of serosal invasion in patients of gastric cancer with DEsCT imaging and provide some indicators which may be useful in predicting prognosis. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF

26. Image Synthesis and Face Recognition Based on 3D Face Model and Illumination Model.

Author: Wang, Lipo, Chen, Ke, Ong, Yew, Liu, Dang-hui, Shen, Lan-sun, and Lam, Kin-man
Abstract: The performance of human face recognition algorithms is seriously affected by two important factors: head pose and lighting condition. The effective processing of the pose and illumination variations is a vital key for improving the recognition rate. This paper proposes a novel method that can synthesize images with different head poses and lighting conditions by using a modified 3D CANDIDE model, linear vertex interpolation and NURBS curve surface fitting method, as well as a mixed illumination model. A specific Eigenface method is also proposed to perform face recognition based on a pre-estimated head pose method. Experimental results show that the quality of the synthesized images and the recognition performance are good. [ABSTRACT FROM AUTHOR]
Published: 2005
Full Text: View/download PDF

27. Computer-Aided Diagnosis for Distinguishing Pancreatic Mucinous Cystic Neoplasms From Serous Oligocystic Adenomas in Spectral CT Images

Author: Li, Chao, Lin, Xiaozhu, Hui, Chun, Lam, Kin Man, and Zhang, Su
Abstract: Objective: This preliminary study aims to verify the effectiveness of the additional information provided by spectral computed tomography (CT) with the proposed computer-aided diagnosis (CAD) scheme to differentiate pancreatic serous oligocystic adenomas (SOAs) from mucinous cystic neoplasms of pancreas cystic lesions.Materials and Methods: This study was conducted from January 2010 to October 2013. Twenty-three patients (5 men and 18 women; mean age, 43.96 years old) with SOA and 19 patients (3 men and 16 women; mean age, 41.74 years old) with MCN were included in this retrospective study. Two types of features were collected by dual-energy spectral CT imaging as follows: conventional and additional quantitative spectral CT features. Classification results of the CAD scheme were compared using the conventional features and full feature data set. Important features were selected using support vector machine classification method combined with feature-selection technique. The optimal cutoff values of selected features were determined through receiver–operating characteristic curve analyses.Results: Combining conventional features with additional spectral CT features improved the overall accuracy from 88.37% to 93.02%. The selected features of the proposed CAD scheme were tumor size, contour, location, and low-energy CT values (43 keV). Iodine–water basis material pair densities in both arterial phase (AP) and portal venous phase (PP) were important factors for differential diagnosis of SOA and MCN. The optimal cutoff values of long axis, short axis, 40 keV monochromatic CT value in AP, iodine (water) density in AP, 43 keV monochromatic CT value in PP, and iodine (water) density in PP were 3.4 mm, 3.1 mm, 35.7 Hu, 0.32533 mg/mL, 39.4 Hu, and 0.348 mg/mL, respectively.Conclusion: The combination of conventional features and additional information provided by dual-energy spectral CT shows a high accuracy in the CAD scheme. The quantitative information of spectral CT may prove useful in the diagnosis and classification of SOAs and MCNs with machine learning algorithms.
Published: 2016
Full Text: View/download PDF

28. Are There Any Differences in Urodynamic Studies between Overactive Bladder Patients with Benign Prostatic Hyperplasia and Those without Benign Prostatic Hyperplasia?

Author: Mahawong, Phitsanu, Yu Cheong, Lam Kin Man, Chu, Peggy Sau-kwan, and Man Chi-wai
Abstract: Introduction: Overactive bladder and benign prostatic hyperplasia are quite common in middle age men. Some patients may have both of these conditions in the same time. Urodynamic study is still the most useful diagnostic test in this group of the patients because their lower urinary tract symptoms are unreliable. Objective: To compare results of urodynamic studies between overactive bladder patients with benign prostatic hyperplasia and overactive bladder patients without benign prostatic hyperplasia. Materials and Methods: From November 2007 to October 2009, 75 urodynamic studies were performed for male overactive bladder patients who were 40 years or older. The patients were retrospectively divided into two groups. Group 1 consisted of overactive bladder patients who also had been clinically diagnosed with benign prostatic hyperplasia, and Group II consisted of overactive bladder patients without benign prostatic hyperplasia. There were 41 patients in group I and 34 patients in group II. The results of the two groups were analyzed and compared. Results: There was a statistical difference only in terms of median post-void residual urine volume. Detrusor overactivities were demonstrated in 21/41 (51.2%) patients of Group I and in 19/34 (55.9%) patients of Group II (P-value -- 0.687). Nine patients in Group I (9/41; 21.9%) and eight patients in Group II (8/34; 23.5% ) represented bladder outlet obstruction (P-value = 0.871 ). Impaired detrusor contractilities were found in only five patients of Group I (P = 0.060). Conclusions: The urodynamic results indicate that only the median post-void residual urine volume was statistically different between two groups of patients. [ABSTRACT FROM AUTHOR]
Published: 2010

29. Multi-level unsupervised domain adaption for privacy-protected in-bed pose estimation

Author: Nakajima, Masayuki, Muramatsu, Shogo, Kim, Jae-Gon, Guo, Jing-Ming, Kemao, Qian, Chi, Ziheng, Wang, Shaozhi, Li, Xinyue, Chang, Chun-Tzu, Islam, Md, Holkar, Akshay, Pronger, Samantha, Liu, Tianshan, Lam, Kin-Man, and He, Xiangjian
Published: 2022
Full Text: View/download PDF

30. Multi-level feature aggregation network for high dynamic range imaging

Author: Nakajima, Masayuki, Muramatsu, Shogo, Kim, Jae-Gon, Guo, Jing-Ming, Kemao, Qian, Xiao, Jun, and Lam, Kin-Man
Published: 2022
Full Text: View/download PDF

31. Human Face Image Recognition: An Evidence Aggregation Approach

Author: Mirhosseini, Ali Reza, Yan, Hong, Lam, Kin-Man, and Pham, Tuan
Abstract: In this paper a novel analytically-based face recognition system is presented which allows incorporation of the importance of individual facial components in the recognition task. An image gallery of 40 people was used and the images searched to locate the face area and the head boundary. In this system the eyes are detected using learning graph templates, the mouth is detected using deformable templates, and the location of the nose is found by using integral projections based on the mouth and eye locations. Using a 3D model of a head, the facial rotations are estimated in order for the system to compensate for the rotation. The effect of the facial convexity is examined by using an overall recognition index, and an optimum value is used for the rest of the experiments. Each facial feature provides evidence for a classifier, with varying degrees of reliability. Furthermore, a fuzzy information fusion technique is applied to combine the decisions of individual classifiers with all possible combinations of classifiers. The reliability of each classifier is evaluated by an expert using fuzzy density measures in a training phase. An overall classification is derived using a fuzzy evidence aggregation method. The performance of the system is evaluated for various degrees of facial rotation using a cumulative score. The cumulative score provides the normal recognition rate as well as the rank of the next best matches.
Published: 1998
Full Text: View/download PDF

32. Adaptive deformable model for mouth boundary detection

Author: Mirhosseini, Ali Reza, Yan, Hong, and Lam, Kin-Man
Abstract: A new generalized algorithm is proposed to automatically extract a mouth boundary model from human face images. Such an algorithm can contribute to human face recognition and lip-reading-assisted speech recognition systems, in particular, and multimodal human computer interaction systems, in general. The new model is an iterative algorithm based on a hierarchical model adaptation scheme using deformable templates, as a generalization of some of the previous works. The role of prior knowledge is essential for perceptual organization in the algorithm. The prior knowledge about the mouth shape is used to define and initialize a primary deformable model. Each primary boundary curve of a mouth is formed on three control points, including two mouth corners, whose locations are optimized using a primary energy functional. This energy functional essentially captures the knowledge of the mouth shape to perceptually organize image information. The primary model is finely tuned in the second stage of optimization algorithm using a generalized secondary energy functional. Basically each boundary curve is finely tuned using more control points. The primary model is replaced by an adapted model if there is an increase in the secondary energy functional. The results indicate that the new model adaptation technique satisfactorily generalizes the mouth boundary model extraction in an automated fashion. © 1998 Society of Photo-Optical Instrumentation Engineers.
Published: 1998
Full Text: View/download PDF

33. Super-resolution on remote sensing images

Author: Nakajima, Masayuki, Kim, Jae-Gon, Lie, Wen-Nung, Kemao, Qian, Yang, Yuting, Lam, Kin-Man, Dong, Junyu, Sun, Xin, and Jian, Muwei
Published: 2021
Full Text: View/download PDF

34. Attention-based cross-modality interaction for multispectral pedestrian detection

Author: Nakajima, Masayuki, Kim, Jae-Gon, Lie, Wen-Nung, Kemao, Qian, Liu, Tianshan, Zhao, Rui, and Lam, Kin-Man
Published: 2021
Full Text: View/download PDF

35. 3D model retrieval based on deep learning approach with weighted three-view deep features

Author: Lau, Phooi Yee, Shobri, Mohammad, Jiang, Xuemei, Li, Yaqi, Hu, Jiwei, and Lam, Kin-Man
Published: 2020
Full Text: View/download PDF

36. Deep residual convolutional neural network with curriculum learning for source camera identification

Author: Lau, Phooi Yee, Shobri, Mohammad, Animasahun, I. O., and Lam, Kin-Man
Published: 2020
Full Text: View/download PDF

37. Elastic net with adaptive weight for image denoising

Author: Lau, Phooi Yee, Shobri, Mohammad, Xiao, Jun, Zhao, Rui, and Lam, Kin-Man
Published: 2020
Full Text: View/download PDF

38. Accurate car plate detection via car face landmark localization

Author: Falco, Charles M., Jiang, Xudong, Li, Hailiang, Chiu, Man-Yau, Wu, Kangheng, Lei, Zhibin, and Lam, Kin-Man
Published: 2017
Full Text: View/download PDF

39. Improved searching scheme for fractal image coding

Author: Lai, Cheung-Ming, Lam, Kin-Man, and Siu, Wan-Chi
Abstract: The authors propose using a non-symmetric window to search for the best matched domain block based on the local variances method. Experimental results show that the proposed scheme can achieve a speedup of 50% with only slight degradation in PSNR and compression ratio.
Published: 2002

40. PromoterExplorer: an effective promoter identification method based on the AdaBoost algorithm

Author: Xie, Xudong, Wu, Shuanhu, Lam, Kin-Man, and Yan, Hong
Abstract: Motivation: Promoter prediction is important for the analysis of gene regulations. Although a number of promoter prediction algorithms have been reported in literature, significant improvement in prediction accuracy remains a challenge. In this paper, an effective promoter identification algorithm, which is called PromoterExplorer, is proposed. In our approach, we analyze the different roles of various features, that is, local distribution of pentamers, positional CpG island features and digitized DNA sequence, and then combine them to build a high-dimensional input vector. A cascade AdaBoost-based learning procedure is adopted to select the most ‘informative’ or ‘discriminating’ features to build a sequence of weak classifiers, which are combined to form a strong classifier so as to achieve a better performance. The cascade structure used for identification can also reduce the false positive. Results: PromoterExplorer is tested based on large-scale DNA sequences from different databases, including the EPD, DBTSS, GenBank and human chromosome 22. Experimental results show that consistent and promising performance can be achieved. Contact: h.yan@cityu.edu.hk
Published: 2006
Full Text: View/download PDF

41. Efficient color face detection algorithm under different lighting conditions

Author: Chow, Tze-Yin, Lam, Kin-Man, and Wong, Kwok-Wai
Abstract: We present an efficient and reliable algorithm to detect human faces in an image under different lighting conditions. In our algorithm, skin-colored pixels are identified using a region-based approach, which can provide more reliable skin color segmentation under various lighting conditions. In addition, to compensate for extreme lighting conditions, a color compensation scheme is proposed, and the distributions of the skin-color components under various illuminations are modeled by means of the maximum-likelihood method. With the skin-color regions detected, a ratio method is proposed to determine the possible positions of the eyes in the image. Two eye candidates form a possible face region, which is then verified as a face or not by means of a two-stage procedure with an eigenmask. Finally, the face boundary region of a face candidate is further verified by a probabilistic approach to reduce the chance of false alarms. Experimental results based on the HHI MPEG-7 face database, the AR face database, and the CMU pose, illumination, and expression (PIE) database show that this face detection algorithm is efficient and reliable under different lighting conditions and facial expressions.
Published: 2006
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

41 results on '"Lam, Kin-Man"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources